Robots Text Files - robots.txt

As the name suggests, a robots text file, or robots.txt, is a file that contains rules telling search bots (search engine crawlers) how to crawl a site. By default, search bots crawl all of a site's content without restriction. When a bot finds this file, which is written in a simple syntax that search engines understand, it reads the rules to see which files or directories (folders) it is not supposed to crawl and index. The file is recommended for webmasters who have files they would rather keep out of search results. Keep in mind that the rules are only a request that well-behaved bots follow, and the file itself can be read by anyone, so it should not be the only protection for security-related or otherwise sensitive information.

It is different from the .htaccess file, which holds rules for the server. A typical robots.txt file, when opened, looks like this:

User-agent: *
Disallow: /admin/
Disallow: /images/
Disallow: /lang/
 

Here, User-agent names the bot the rules apply to (the * matches all bots), and each Disallow line tells it not to crawl the listed directory or file, so the administrator's folder, for example, will not be crawled. If you have nothing you want kept out of search engines, you can leave this file out.
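To see how a bot might interpret the rules above, here is a minimal sketch using Python's standard urllib.robotparser module. The bot name "ExampleBot" and the example.com URLs are made up for illustration, and a real search engine's crawler may handle edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the sample file above.
RULES = """\
User-agent: *
Disallow: /admin/
Disallow: /images/
Disallow: /lang/
"""


def build_parser(rules_text: str) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them over HTTP."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


parser = build_parser(RULES)

# "ExampleBot" is a hypothetical crawler name; it matches the "*" group.
blocked = parser.can_fetch("ExampleBot", "https://example.com/admin/login.php")
allowed = parser.can_fetch("ExampleBot", "https://example.com/about.html")

print(blocked)  # False: /admin/ is disallowed
print(allowed)  # True: no rule covers /about.html
```

A polite crawler would run a check like this before requesting each page. Nothing forces a bot to do so, which is why the file works as a request rather than as access control.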