Thursday, June 21, 2012

What Is Robots.txt and How Does It Work?


A robots.txt file matters if you care about how search engines crawl your site, yet many websites do not provide one. It defines which paths are off limits for spiders to visit, and it helps keep out unwanted robots such as email retrievers and image strippers, as long as they choose to obey it. This is useful if you want to keep some pages out of search results. Bear in mind that the file itself is publicly readable, so it should not be your only protection for personal information or secret files.

What is Robots.txt

The robots.txt file is a special text file that is always located in your Web server's root directory. It contains restrictions for Web spiders, telling them where they have permission to search. In effect, it defines rules for search engine spiders (robots): what to follow and what not to. Note that Web robots are not required to respect robots.txt files, but most well-written Web spiders follow the rules you define.
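Because the file always sits in the root directory, its address can be derived from any page URL on the same host. The following is a minimal sketch of that idea; the function name robots_url and the example.com address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical page URL used only for illustration.
    print(robots_url("http://www.example.com/seo/articles/page.html?id=7"))
```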

How to Create Robots.txt

The robots.txt file has a specific format. It consists of records. Each record has two parts: a User-agent line and one or more Disallow lines. Every line has the format:
<Field> ":" <value>
The robots.txt file should be created with Unix line endings. Most good text editors have a Unix mode, or your FTP client should do the conversion for you. Do not attempt to create a robots.txt file with an HTML editor that does not specifically have a plain-text mode.
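If you would rather generate the file than type it, a short script can assemble the records and force Unix line endings. This is only a sketch under stated assumptions: build_robots_txt and save_with_unix_line_endings are hypothetical helper names, and the records shown are sample values.

```python
def build_robots_txt(records):
    """Build robots.txt text from (user_agent, [disallowed paths]) pairs."""
    blocks = []
    for user_agent, disallowed in records:
        lines = [f"User-agent: {user_agent}"]
        # Every record gets at least one Disallow line; a blank value
        # means nothing is off limits for that robot.
        for path in disallowed or [""]:
            lines.append(f"Disallow: {path}".rstrip())
        blocks.append("\n".join(lines))
    # Records are separated by one blank line.
    return "\n\n".join(blocks) + "\n"


def save_with_unix_line_endings(path, text):
    """Write text without letting the platform turn newlines into CR LF."""
    with open(path, "w", newline="\n", encoding="utf-8") as handle:
        handle.write(text)


if __name__ == "__main__":
    sample = build_robots_txt([("googlebot", ["/personal.htm"]), ("*", [])])
    print(sample, end="")
```

Passing newline="\n" to open() is what keeps the line endings in Unix form even when the script runs on Windows.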

User-agent

The User-agent line specifies the robot. For example:
User-agent: googlebot
You may also use the wildcard character "*" to specify all robots:
User-agent: *
You can find user agent names in your own logs by checking for requests to robots.txt. Most major search engines have short names for their spiders.
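As a rough illustration of the log check, the sketch below counts the user agent strings that requested robots.txt. It assumes your server writes the common "combined" access log format, which may not match your setup; the function name, the pattern, and the sample log lines are all invented for this example.

```python
import re
from collections import Counter

# Assumes the "combined" access log layout:
# ... "METHOD path HTTP/x" status bytes "referer" "user agent"
ROBOTS_REQUEST = re.compile(
    r'"(?:GET|HEAD) /robots\.txt(?:\?\S*)? HTTP/[\d.]+" '
    r'\d{3} \S+ "[^"]*" "([^"]*)"'
)


def robots_txt_agents(log_lines):
    """Count user agent strings seen on requests for /robots.txt."""
    counts = Counter()
    for line in log_lines:
        match = ROBOTS_REQUEST.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Made-up log lines; a real script would read your access log file.
    sample = [
        '192.0.2.1 - - [21/Jun/2012:10:00:00 +0000] '
        '"GET /robots.txt HTTP/1.1" 200 68 "-" "ExampleBot/1.0"',
        '192.0.2.7 - - [21/Jun/2012:10:00:05 +0000] '
        '"GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    ]
    for agent, hits in robots_txt_agents(sample).items():
        print(hits, agent)
```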

Disallow

The second part of a record consists of Disallow: directive lines. These lines specify files and/or directories, written as paths starting from the site root. For example, the following line instructs spiders that they cannot download contactinfo.htm:
Disallow: /contactinfo.htm
You may also specify directories:
Disallow: /cgi-bin/
This would block spiders from your cgi-bin directory.
The Disallow directive matches by prefix, which gives it a wildcard nature. The standard dictates that /bob would disallow both /bob.html and /bob/index.html (both the file bob and files in the bob directory will not be indexed).
If you leave the Disallow line blank, it indicates that ALL files may be retrieved. At least one disallow line must be present for each User-agent directive to be correct. A completely empty Robots.txt file is the same as if it were not present.
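The prefix behaviour described above can be sketched in a few lines. This is a simplified illustration, not a full parser: is_blocked is a hypothetical helper that takes the Disallow values of a single record.

```python
def is_blocked(path, disallow_values):
    """Return True if path starts with any non-blank Disallow value."""
    for value in disallow_values:
        if value == "":
            # A blank Disallow line blocks nothing.
            continue
        if path.startswith(value):
            return True
    return False


if __name__ == "__main__":
    for path in ("/bob.html", "/bob/index.html", "/alice.html"):
        print(path, is_blocked(path, ["/bob"]))
```

Note the effect of the trailing slash: the value /cgi-bin/ covers everything inside the directory, while /bob also covers any file whose name merely begins with bob.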

White Space & Comments

Any line in the robots.txt that begins with # is considered to be a comment only. The standard allows for comments at the end of directive lines, but this is really bad style:
Disallow: bob #comment
Some spiders will not interpret the above line correctly and will instead attempt to disallow "bob#comment". The moral is to place comments on lines by themselves.
White space at the beginning of a line is allowed, but not recommended:
    Disallow: bob #comment
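For comparison, here is a sketch of how a tolerant spider might read a single line: it drops anything after #, trims surrounding white space, and splits the field from the value. The helper name parse_line is hypothetical, and real spiders may well behave differently, which is exactly why the advice above still applies.

```python
def parse_line(line):
    """Return (field, value), or None for blank and comment-only lines."""
    # Drop a trailing comment, then trim leading and trailing white space.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    field, separator, value = line.partition(":")
    if not separator:
        # Not in the <Field> ":" <value> form, so ignore the line.
        return None
    return field.strip().lower(), value.strip()


if __name__ == "__main__":
    print(parse_line("Disallow: bob #comment"))
    print(parse_line("    Disallow: /cgi-bin/"))
    print(parse_line("# a comment on its own line"))
```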

Examples

The following allows all robots to visit all files because the wildcard "*" specifies all robots.
User-agent: *
Disallow:
This one keeps all robots out.
User-agent: *
Disallow: /
The next one bars all robots from the cgi-bin and images directories:
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
This one bans Roverdog from all files on the server:
User-agent: Roverdog
Disallow: /
This one keeps googlebot from getting at the personal.htm file:
User-agent: googlebot
Disallow: /personal.htm
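You can try rules like the ones in these examples before uploading them. The sketch below uses Python's standard urllib.robotparser module, which is only one parser's reading of the rules, so other spiders may differ. The helper name allowed and the www.example.com host are placeholders, and file paths are written with a leading slash because this parser compares rules against the URL path.

```python
from urllib.robotparser import RobotFileParser

BASE = "http://www.example.com"  # placeholder host for illustration


def allowed(robots_txt, agent, path):
    """Ask Python's robots.txt parser whether agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, BASE + path)


ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_DIRS = "User-agent: *\nDisallow: /cgi-bin/\nDisallow: /images/\n"
BLOCK_ROVERDOG = "User-agent: Roverdog\nDisallow: /\n"
BLOCK_PERSONAL = "User-agent: googlebot\nDisallow: /personal.htm\n"

if __name__ == "__main__":
    print(allowed(ALLOW_ALL, "googlebot", "/index.htm"))
    print(allowed(BLOCK_ALL, "googlebot", "/index.htm"))
    print(allowed(BLOCK_DIRS, "googlebot", "/images/logo.gif"))
    print(allowed(BLOCK_ROVERDOG, "Roverdog", "/index.htm"))
    print(allowed(BLOCK_PERSONAL, "googlebot", "/personal.htm"))
```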


For more information, please visit us: http://www.seomaterial.com/seo_articles/importance_of_robots_txt.html
