How to Write a Proper robots.txt File for Your Website

This post explains how to write a proper robots.txt file for your website.

What is robots.txt

robots.txt is a plain text file that you upload to the root directory of your website, so that it is reachable at an address such as http://www.example.com/robots.txt. It contains a set of rules for search engine spiders.
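
Because spiders look for the file only at the root, the robots.txt address can be worked out from any page URL on the site. Here is a minimal Python sketch of that idea. The helper name robots_url and the example.com address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the root-level robots.txt URL for the site serving page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Placeholder page URL, not a real post.
print(robots_url("http://www.example.com/blog/2010/some-post.html"))
# prints http://www.example.com/robots.txt
```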


Role of robots.txt

robots.txt is mainly used to tell web spiders not to crawl the paths it lists. Keep in mind that a robots.txt file cannot tell a spider to crawl and index a page, because crawling and indexing is what a spider does by default. In other words, no one can force spiders to crawl a website; that depends entirely on the spiders. What you can do is block spiders from accessing certain parts of your website, or even all of it.


How to Write robots.txt

To write a robots.txt file, follow the examples below.

1. Block spiders from crawling your entire website

To stop all spiders from crawling your website, use the following format. The * on the User-agent line means the rule applies to every spider.

User-agent: *
Disallow: /
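
One way to check a rule set before uploading it is Python's standard urllib.robotparser module. The sketch below parses the two lines above and asks whether a spider may fetch a page. The spider name ExampleBot and the example.com URL are placeholders, and real spiders may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(BLOCK_ALL.splitlines())

# "ExampleBot" is a made-up spider name; any name matches the * wildcard.
blocked = not parser.can_fetch("ExampleBot", "http://www.example.com/any/page.html")
print(blocked)  # True
```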


2. Allow spiders to crawl your entire website

To do the opposite and let all spiders crawl everything, use either


User-agent: *
Disallow:



Or


User-agent: *
Allow: /


Please note that an allow-all robots.txt does not change how your site is crawled, since spiders crawl everything by default. Its only benefit is that spiders requesting robots.txt receive a valid file instead of a 404 error.
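
To see that the two allow-all forms behave the same way, they can be compared with Python's standard urllib.robotparser module. This is only a sketch: the helper allows, the spider name ExampleBot and the example.com URL are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def allows(rules, url, agent="ExampleBot"):
    """Parse robots.txt text and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


EMPTY_DISALLOW = "User-agent: *\nDisallow:\n"
ALLOW_ROOT = "User-agent: *\nAllow: /\n"
url = "http://www.example.com/any/page.html"  # placeholder URL

print(allows(EMPTY_DISALLOW, url), allows(ALLOW_ROOT, url))  # True True
```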


3. Block spiders from accessing certain files or directories on your site


To block spiders from accessing certain parts of your website, list each path on its own Disallow line. The example below blocks three directories.

User-agent: *
Disallow: /cgi-bin/
Disallow: /wusage/
Disallow: /textures/
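
The effect of these three rules can be previewed with Python's standard urllib.robotparser module. In this sketch the file names, the spider name ExampleBot and the example.com domain are invented for illustration; only the three Disallow paths come from the example above.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wusage/
Disallow: /textures/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

site = "http://www.example.com"  # placeholder domain
for path in ["/cgi-bin/form.cgi", "/textures/wood.jpg", "/index.html"]:
    verdict = "allowed" if parser.can_fetch("ExampleBot", site + path) else "blocked"
    print(path, verdict)
```

Rules match by path prefix, so everything under a listed directory is blocked while the rest of the site stays open.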



4. Block certain spiders from accessing your website

To block a specific spider from accessing your website, put its name on the User-agent line. Replace spider-name below with the actual name of the spider.

User-agent: spider-name
Disallow: /



For example, to block Google's image spider:

User-agent: Googlebot-Image
Disallow: /
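
A per-spider rule leaves every other spider unaffected, which can be confirmed with Python's standard urllib.robotparser module. This is a rough sketch: ExampleBot is a made-up spider name and the example.com URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: Googlebot-Image
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

url = "http://www.example.com/photos/cat.jpg"  # placeholder URL
print(parser.can_fetch("Googlebot-Image", url))  # False
print(parser.can_fetch("ExampleBot", url))       # True
```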

