How do I test my robots.txt file?

Posted By Mardian Purwanto

You can use the robots.txt analysis tool in Google Webmaster Tools to:

  • Check specific URLs to see if your robots.txt file allows or blocks them.
  • See if Googlebot had trouble parsing any lines in your robots.txt file.
  • Test changes to your robots.txt file.

If you don’t currently use a robots.txt file, you can create one and then test it with the tool before you upload it to your site.

The robots.txt analysis tool displays the text of your cached robots.txt file. You can enter a list of URLs and confirm that the file restricts or allows access as you expect. You can also edit the displayed robots.txt text and then enter a list of URLs to see how those changes would affect Googlebot’s access to specific pages.
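
The same kind of check can be approximated offline. The sketch below uses Python's standard urllib.robotparser module, which is not Googlebot's parser and may disagree with it on edge cases. The robots.txt text and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft robots.txt; replace with your own text.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/draft.html
"""

def check_urls(robots_txt, urls, user_agent="Googlebot"):
    """Map each URL to True (allowed) or False (blocked) for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}

results = check_urls(ROBOTS_TXT, [
    "http://www.example.com/index.html",
    "http://www.example.com/private/report.html",
    "http://www.example.com/tmp/draft.html",
])

for url, allowed in results.items():
    print("Allowed" if allowed else "Blocked", url)
```

To try out a change before uploading it, edit ROBOTS_TXT and run the script again with the same list of URLs.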

Once you are happy with your revised robots.txt file, apply the changes to the version on your site. Note that it may take up to a day for Googlebot to retrieve the latest version. You can always check when Google last downloaded your robots.txt file at the top of the robots.txt analysis page.

What are “URLs restricted by robots.txt” errors?

Google was unable to crawl the URL due to a robots.txt restriction. This can happen for a number of reasons. For instance, your robots.txt file might prohibit Googlebot entirely; it might prohibit access to the directory in which this URL is located; or it might prohibit access to the URL specifically. Often, this is not an error. You may have deliberately set up a robots.txt file to prevent Google from crawling this URL. If that is the case, there’s no need to fix anything; Google will continue to respect robots.txt for this file.
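
To make the three causes concrete, here is a minimal sketch that tests one URL against three small robots.txt files, one per cause. The URL and rules are hypothetical, and the check again relies on Python's urllib.robotparser rather than on Googlebot itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical URL reported as restricted.
URL = "http://www.example.com/archive/2007/page.html"

# One illustrative robots.txt per cause named in the text.
CASES = {
    "Googlebot prohibited entirely": "User-agent: Googlebot\nDisallow: /\n",
    "directory prohibited": "User-agent: *\nDisallow: /archive/2007/\n",
    "specific URL prohibited": "User-agent: *\nDisallow: /archive/2007/page.html\n",
}

def is_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True when robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)

blocked = {name: is_blocked(text, URL) for name, text in CASES.items()}

for name, result in blocked.items():
    print(name, "->", "blocked" if result else "allowed")
```

All three files block the same URL, so it helps to check which rule is responsible before deciding whether the restriction is intentional.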

What do the robots.txt file analysis results mean?

When you test a URL against a robots.txt file, you will see one of the following results:

  • Allowed: Googlebot will crawl the URL.
  • Blocked: Googlebot will not crawl the URL.
  • Not in domain: This URL is not on the same domain as the robots.txt file, so this file cannot block it.
  • Syntax not understood: Googlebot does not recognize this as a valid URL.

Additionally, you may see the following message:

  • Detected as a directory; specific files may have different restrictions: Although this directory is blocked or allowed, there may be other, more specific rules in the file that block or allow URLs in the directory, so you will want to check those as well.
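
The “Not in domain” result can be anticipated with a simple host comparison. This sketch assumes that “same domain” means the same host name, so a subdomain counts as a different site. The function name and URLs are hypothetical.

```python
from urllib.parse import urlsplit

def in_domain(robots_url, page_url):
    """Return True when page_url is on the same host as the robots.txt file."""
    robots_host = urlsplit(robots_url).hostname
    page_host = urlsplit(page_url).hostname
    # Assumption: hosts must match exactly; subdomains are treated as other sites.
    return robots_host is not None and robots_host == page_host

ROBOTS_URL = "http://www.example.com/robots.txt"

checks = {
    url: in_domain(ROBOTS_URL, url)
    for url in [
        "http://www.example.com/page.html",
        "http://blog.example.com/page.html",
        "http://www.example.org/page.html",
    ]
}

for url, ok in checks.items():
    print("in domain" if ok else "Not in domain", url)
```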

If Googlebot has difficulty understanding parts of your robots.txt file, you will see one of the following parsing results, which you will want to fix:

  • Accepted, but should be Disallow: You misspelled “Disallow.”
  • Accepted, but should be user-agent: You misspelled “user-agent.”
  • Accepted, but correct syntax includes a colon (Rule: path): You forgot to put a colon between “Allow” or “Disallow” and the path.
  • Rule ignored by Googlebot: This is not a rule that Googlebot follows (for example, “Crawl-delay”).
  • No user-agent specified: You have rules that aren’t associated with a user-agent.
  • Syntax not understood: Googlebot does not understand this line.
  • robots.txt file does not appear to be valid: Googlebot doesn’t understand any part of this file and therefore doesn’t recognize it as a valid robots.txt file.
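
As a rough illustration of how such line-level problems could be spotted before uploading, here is a small lint sketch. It is not Google's parsing logic: the spelling check (difflib with a 0.8 cutoff), the directive lists, and the sample file are all assumptions made for this example, and it reuses the message names above only as labels.

```python
import difflib

# Directives this sketch recognizes, mapped to the spelling used in the messages.
KNOWN = {"user-agent": "user-agent", "disallow": "Disallow", "allow": "Allow"}
IGNORED = {"crawl-delay"}  # the example of an ignored rule given in the text

def lint(robots_txt):
    """Return (line_number, message) pairs for suspicious lines."""
    findings = []
    seen_user_agent = False
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        has_colon = ":" in line
        field = (line.split(":", 1)[0] if has_colon else line.split()[0]).strip().lower()
        if field in IGNORED:
            findings.append((number, "Rule ignored by Googlebot"))
            continue
        if field not in KNOWN:
            # Guess at a misspelling; the 0.8 cutoff is an arbitrary choice.
            guess = difflib.get_close_matches(field, list(KNOWN), n=1, cutoff=0.8)
            if not guess:
                findings.append((number, "Syntax not understood"))
                continue
            field = guess[0]
            findings.append((number, "Accepted, but should be " + KNOWN[field]))
        if not has_colon:
            findings.append((number, "Accepted, but correct syntax includes a colon (Rule: path)"))
        if field == "user-agent":
            seen_user_agent = True
        elif not seen_user_agent:
            findings.append((number, "No user-agent specified"))
    return findings

# Hypothetical file with one problem per flagged line.
SAMPLE = """\
Disallow: /early/
User-agent: *
Dissalow: /private/
Disallow /tmp/
Crawl-delay: 10
Hello world
"""

findings = lint(SAMPLE)
for number, message in findings:
    print(f"line {number}: {message}")
```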
Nov 17th, 2007
