JoomlaMafia Blog


Robots.txt Update

Posted by: mmurphy in traffic, SEO, PR, live, google, alexa

It's been a while since our last blog entry, mainly because we have been studying SEO and learning how to take full advantage of robots.txt to get better results.

It came to our attention that even though we had all the SEF options turned on, Google was still indexing non-SEF URLs.

We had 758 pages indexed. Now every day we are seeing pages vanish from the index (that's our goal).

First off, if you didn't know: install Xmap. It generates a sitemap for your site's visitors, and it also gives you an XML version built on the fly that can be submitted to Google, Yahoo, Ask, MSN and several other sites.

 You can also add this to your robots.txt

Sitemap: http://www.joomlamafia.com/index.php?option=com_xmap&sitemap=1&view=xml&no_html=1

What does this do? Every time a bot comes to your site, it looks at robots.txt to learn what to index and what not to index. The first thing it now finds out is: this applies to all bots, here's my sitemap, here are all my pages! The sitemap lists only the links our visitors see when they click the sitemap link, so we know none of the unwanted URLs are in it.
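To double check that the Sitemap line is written in a form a parser can read, here is a minimal Python sketch using the standard library's `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or newer). The `User-agent: *` header and the helper name `sitemap_urls` are assumptions for illustration; real crawlers may parse the file differently.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content. The User-agent line is an assumed, typical
# header; the Sitemap line is the one from this post.
ROBOTS_TXT = """\
User-agent: *
Sitemap: http://www.joomlamafia.com/index.php?option=com_xmap&sitemap=1&view=xml&no_html=1
"""


def sitemap_urls(robots_txt: str) -> list[str]:
    """Return every Sitemap URL declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in sitemap_urls(ROBOTS_TXT):
        print(url)
```

If the list comes back empty, the Sitemap line is probably misspelled or missing its colon.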

 Next thing we added was

Disallow: /index2.php

 and then

Disallow: /component/content/article/*
Disallow: /component/search/*
Disallow: /component/*

This keeps any component pages from getting indexed, as well as any search pages, and the first rule keeps articles in non-SEF format from being indexed and causing duplicate content.
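Before uploading rules like these, it can help to test them against a few URLs from your own site. The sketch below is a simplified matcher, not any search engine's actual implementation: it assumes the common wildcard convention where `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match. It ignores Allow rules and rule precedence, and the function names and sample paths are made up for illustration.

```python
import re

# The Disallow patterns discussed in this post.
DISALLOW = [
    "/index2.php",
    "/component/content/article/*",
    "/component/search/*",
    "/component/*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate one Disallow pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path: str, patterns: list[str]) -> bool:
    """True if the path (including any query string) matches a pattern."""
    # An empty Disallow value blocks nothing, so skip it.
    return any(pattern_to_regex(p).match(path) for p in patterns if p)


if __name__ == "__main__":
    for path in ["/index2.php?option=com_content",
                 "/component/search/?searchword=seo",
                 "/blog/robots-txt-update.html"]:
        print(path, "->", "blocked" if is_blocked(path, DISALLOW) else "allowed")
```

Note that `/component/*` already covers the two more specific component rules, so in this simplified model those two lines are redundant but harmless.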

How Can I Apply This To My Site?

 

Regularly check what kind of pages Google is indexing on your site and look for patterns. If there are a lot of PDF pages, or dozens of useless links from a particular component, you can act quickly to block them out with robots.txt. Use the site:mydomain.com search function or a tool such as WebCEO.com. 
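One way to look for patterns is to paste the indexed URLs into a small script and count them by type. This is only a sketch: it assumes you have already collected the URLs by hand or from a tool's export, and the categories, the function names and the example.com URLs are all invented for illustration.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs


def classify(url: str) -> str:
    """Put an indexed URL into a rough bucket for pattern spotting."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    if parts.path.lower().endswith(".pdf"):
        return "pdf"
    if "option" in query:
        # Non-SEF Joomla URL such as index.php?option=com_content&...
        return "non-SEF " + query["option"][0]
    if parts.path.startswith("/component/"):
        # Group by the component name that follows /component/
        return "/component/" + parts.path.split("/")[2]
    return "other"


def summarize(urls: list[str]) -> Counter:
    """Count how many indexed URLs fall into each bucket."""
    return Counter(classify(u) for u in urls)


if __name__ == "__main__":
    indexed = [
        "http://example.com/index.php?option=com_content&view=article&id=5",
        "http://example.com/component/search/?searchword=seo",
        "http://example.com/docs/guide.pdf",
        "http://example.com/blog/robots.html",
    ]
    for bucket, count in summarize(indexed).most_common():
        print(count, bucket)
```

Any bucket with a large count and little useful content is a candidate for a Disallow rule.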
 
Among the most important things you can do is check your pages that are in Google's supplemental index. This is where you'll find lots of your low-quality pages, ripe for removal by robots.txt. If the pages don't contain useful information, dump them.
 
Originally the wildcard wasn't supported by robots.txt, but that has since changed. Both Google and Yahoo now recognize it.
 


In an older blog entry (our first) we reported on a cool social bookmarking module. As time went by, we noticed how much space it took up, and some of you complained that when you tried to digg something, digg.com returned an invalid URL error. We couldn't figure out the reason, nor did we try to get help from Joomladigg (so we are not trying to say anything negative, only that we had issues). We liked what AddThis offered, including added features such as stats. So we coded an AddThis module for Joomla, the only module available that uses AddThis for Joomla.

We are the first, and the module is already over a week old; we just haven't advertised it until now.


Copyright © 2009 . Joomla Mafia 1.5.x Tutorials Tips Tricks & more.