Simple. Affordable. Complete.

What is robots.txt?

The robots exclusion protocol (REP), implemented through a robots.txt file, is a text file webmasters create to instruct robots (typically search engine crawlers) which pages of their website may be crawled. A robots.txt file is publicly available, meaning anyone can see which sections a webmaster has blocked from search engines. Essentially, robots.txt tells Googlebot and other crawlers what is and is not allowed to be crawled, while the noindex tag tells Google Search what is and is not allowed to be indexed and displayed in search results.
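To make the distinction concrete, here is a minimal sketch of the two mechanisms side by side. The paths and domain are hypothetical examples, not recommendations for any particular site.

```
# robots.txt — placed at the root of the site (e.g. https://example.com/robots.txt)
# Controls CRAWLING, not indexing.
User-agent: *
Disallow: /private/

# noindex — placed in the <head> of an individual HTML page.
# Controls INDEXING, not crawling.
# <meta name="robots" content="noindex, follow">
```

Note that a page blocked in robots.txt can still appear in search results if other sites link to it; only noindex reliably keeps a page out of the index, and the crawler must be able to reach the page to see that tag.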

The Basics of robots.txt

  • A “Disallow:” rule in robots.txt restricts crawling; to restrict indexing, use a meta robots tag with “noindex, follow” on the page itself
  • However, malicious crawlers tend to ignore robots.txt, so it is not a reliable security measure
  • Each “Disallow:” line may contain only one URL path; additional paths go on separate lines
  • The filename is case sensitive: use “robots.txt”, not “Robots.txt”
  • Spaces cannot be used to separate multiple paths in one rule. For example, “Disallow: /category/ /product-page” would not be honored; each path needs its own “Disallow:” line
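The rules above can be checked programmatically. This is a small sketch using Python's standard-library `urllib.robotparser`; the rules and URLs are hypothetical examples.

```python
# Sketch: testing robots.txt rules offline with urllib.robotparser.
# The rule set below is a made-up example, not from any real site.
from urllib import robotparser

rules = [
    "User-agent: *",
    "Disallow: /private/",   # one path per Disallow line
    "Disallow: /tmp/",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)  # parse accepts an iterable of lines

# A blocked path is reported as not fetchable for any user agent.
print(rp.can_fetch("*", "https://example.com/private/page.html"))  # False
# Paths with no matching Disallow rule remain fetchable.
print(rp.can_fetch("*", "https://example.com/products/"))          # True
```

In practice you would point the parser at a live file with `rp.set_url("https://example.com/robots.txt")` followed by `rp.read()`, but parsing a list of lines as above lets you test rules before publishing them.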

Now What?

So now that you have a better understanding of robots.txt, you may be motivated to get your website in shape, but the next question is: how? You can either hire a webmaster to do this for you, or you can use Inbound Brew’s free robots.txt management system to help you along the way.

Ready to begin?

Download Inbound Brew