Posted by: mmurphy in traffic, SEO, PR, live, google, alexa on
Sep 15, 2008
So it's been a while since we did a blog entry, mainly because we have been studying SEO and how to take full advantage of robots.txt to get better results.
It came to our attention that even though we had all the SEF options turned on, Google was still indexing non-SEF URLs.
We had 758 pages indexed. Now every day we are seeing pages vanish (that's our goal).
First off, if you didn't know: install Xmap. It generates a sitemap not only for your site, but also an XML version built on the fly that can be submitted to Google, Yahoo, Ask, MSN and several other sites.
You can also add this to your robots.txt
Sitemap: http://www.joomlamafia.com/index.php?option=com_xmap&sitemap=1&view=xml&no_html=1
What does this do? Every time a bot comes to your site, it looks at robots.txt to learn what to index and what not to index. The first thing it finds out is that this applies to all bots: here's my sitemap, here are all my pages! We know no unwanted links are in that sitemap, because it is the same one visitors to our site see when they click the sitemap link.
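If you want to double check that the Sitemap line is being picked up, here is a minimal sketch using Python's standard urllib.robotparser (the site_maps() call needs Python 3.8 or newer). The "User-agent: *" line is our assumption about what the rest of the file looks like; only the Sitemap line comes from above.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt text. The User-agent line is an assumption for
# illustration; the Sitemap line is the one quoted in this post.
ROBOTS_TXT = """\
User-agent: *
Sitemap: http://www.joomlamafia.com/index.php?option=com_xmap&sitemap=1&view=xml&no_html=1
"""


def list_sitemaps(robots_txt):
    """Return every Sitemap URL declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in list_sitemaps(ROBOTS_TXT):
        print(url)
```

This only confirms that a parser can read the line. It says nothing about when a given search engine will actually fetch the sitemap.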
Next thing we added was
Disallow: /index2.php
and then
Disallow: /component/content/article/*
Disallow: /component/search/*
Disallow: /component/*
This keeps any component pages from getting indexed, as well as any search pages, and the first keeps articles in non-SEF format from being indexed and causing duplicate content.
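To see which URLs rules like these would catch, here is a rough sketch of wildcard-aware matching in Python. It is a simplification under stated assumptions: it only handles Disallow rules (no Allow lines, no longest-match precedence), it treats every rule as a prefix match where * stands for any run of characters and a trailing $ anchors the end, and index2.php is written with a leading slash. The test paths are made up for illustration.

```python
import re

# The Disallow rules from this post, with index2.php given a leading slash.
DISALLOW_RULES = [
    "/index2.php",
    "/component/content/article/*",
    "/component/search/*",
    "/component/*",
]


def rule_to_regex(rule):
    """Translate one robots.txt path rule into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of
    the URL. Without '$' the rule is a plain prefix match.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, rules=DISALLOW_RULES):
    """Return True if any Disallow rule matches the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    # Hypothetical paths, not taken from a real crawl.
    for path in [
        "/index2.php?option=com_content&task=view&id=5",
        "/component/search/joomla",
        "/component/content/article/42-some-title",
        "/blog/seo-tips.html",
    ]:
        print(path, "->", "blocked" if is_blocked(path) else "allowed")
```

One thing this makes easy to check: because rules are prefix matches, /component/* on its own already covers the article and search rules, and the trailing * does not change what a rule matches.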
How Can I Apply This To My Site?
Regularly check what kind of pages Google is indexing on your site and look for patterns. If there are a lot of PDF pages, or dozens of useless links from a particular component, you can act quickly to block them out with robots.txt. Use the site:mydomain.com search function or a tool such as WebCEO.com.
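Looking for patterns is easier with a quick count. The sketch below assumes you have already copied the indexed URLs into a list (from a site: search or a tool export); the URLs shown are invented examples on mydomain.com. It groups them by first path segment and by file extension so clusters such as PDFs or one noisy component stand out.

```python
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical list of indexed URLs, for illustration only.
INDEXED_URLS = [
    "http://mydomain.com/component/search/joomla",
    "http://mydomain.com/component/content/article/42",
    "http://mydomain.com/docs/manual.pdf",
    "http://mydomain.com/docs/pricing.pdf",
    "http://mydomain.com/index2.php?option=com_content&id=7",
    "http://mydomain.com/blog/seo-tips.html",
]


def summarize(urls):
    """Count URLs by first path segment and by file extension."""
    sections = Counter()
    extensions = Counter()
    for url in urls:
        path = urlsplit(url).path
        parts = [p for p in path.split("/") if p]
        sections[parts[0] if parts else "/"] += 1
        extensions[PurePosixPath(path).suffix.lower() or "(none)"] += 1
    return sections, extensions


if __name__ == "__main__":
    sections, extensions = summarize(INDEXED_URLS)
    print("By section:", sections.most_common())
    print("By extension:", extensions.most_common())
```

Any section or extension with a large count and little useful content is a candidate for a Disallow rule.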
Among the most important things you can do is check your pages that are in Google's supplemental index. This is where you'll find lots of your low-quality pages, ripe for removal by robots.txt. If the pages don't contain useful information, dump them.
Originally the wildcard wasn't supported by robots.txt, but that has since changed. Both Google and Yahoo now recognize it.
Posted by: admin2 in twitter, traffic, technorati, stumbleupon, Social Bookmark, Social, slashdot, SEO, reddit, newsvine, myweb, myspace, Module, live, Joomla, google, gadgets, furl, favorites, fark, facebook, email, digg, delicious on
Jul 16, 2008
We reported in an older blog entry (our first) about a cool social bookmarking module. As time went by, we noticed how much space it took up, and some of you complained that when you went to digg something, digg.com would give an invalid URL error. We couldn't figure out the reason, nor did we even try to get help from Joomladigg (so we are in no way trying to say anything negative, other than that we had issues). We liked what AddThis had to offer, as well as the added features they provide, such as some stats info. So we coded an AddThis module for Joomla, the only module available using them for Joomla.
We are the first, and this is over a week old. We just haven't advertised it at all until now.