Saturday, March 10, 2007

If you don’t want Googlebot to index your page

There are times when you don’t want to eat, times when you don’t want to travel, and times when you don’t want to do something that you would normally want to do. Similarly, there are times when you don’t want Google to index one or more of the pages of your site. Let’s say you have created a temporary page within your site, or you have content on a page that you will be moving to some other pages. Then it is probably not the best idea to let Googlebot crawl or index that page. However, you do want Googlebot to index all your other regular pages.

Basically, you can prevent Googlebot from indexing a page by putting a “NOINDEX” robots meta tag on that page. This tag is appropriate when you want to keep some pages out of the index but not all: Googlebot can still follow the links on the page, but the page itself is not indexed. However, the “NOINDEX” tag can be inconvenient, because it has to be added to each page you want excluded and removed again when that changes, which can mean a lot of updates. Alternatively, you can put a “NOFOLLOW” tag on the parent page if you don’t want Googlebot to follow any of the links on it. Either tag goes inside the head tag, as below:

<META NAME="ROBOTS" CONTENT="NOINDEX">
<META NAME="ROBOTS" CONTENT="NOFOLLOW">
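To check which of these directives a page actually carries, a small script can read the robots meta tags out of its HTML. The following is only a rough sketch using Python’s standard html.parser module; the names RobotsMetaParser and robots_directives, and the sample page, are made up for this illustration and are not part of any Google tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" ...> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not attribute values.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute may hold several comma-separated directives.
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of lowercased robots directives found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Sample page invented for illustration.
SAMPLE_PAGE = """
<html><head>
<META NAME="ROBOTS" CONTENT="NOINDEX">
<META NAME="ROBOTS" CONTENT="NOFOLLOW">
</head><body>Temporary page</body></html>
"""

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_PAGE)))
```

Both tags can also be combined into a single one with CONTENT="NOINDEX, NOFOLLOW"; the sketch splits on commas, so it handles either form.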

Using the “NOFOLLOW” tag is quite simple. However, it is not the safest way to keep a page out of the index: if the page that you don’t want indexed is also linked from some other indexed page, Googlebot can still reach it through that link.

Therefore, you will be better off using the “NOINDEX” tag instead. Another way to control Googlebot is the ‘robots.txt’ file, which holds a set of instructions telling bots how to behave on your site.
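As a rough illustration of the robots.txt approach (the file is conventionally named robots.txt and sits at the root of the site), the sketch below uses Python’s standard urllib.robotparser module to test a set of rules before publishing them. The /temp/ path, the example.com URLs, and the is_allowed helper are assumptions made up for this example. Note that robots.txt controls crawling, so it complements the “NOINDEX” tag rather than replacing it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: keep Googlebot out of /temp/,
# and leave every other bot unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /temp/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt rules."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/temp/draft.html",
                "http://example.com/index.html"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Checking the rules this way before uploading the file helps catch a typo that would otherwise block the regular pages you do want crawled.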
