Archive: This post was originally written in English and is part of my archive. Please note that some information may no longer be up-to-date.

Optimizing the robots.txt and the Google webmaster tools

February 27, 2010

One week ago, some URL paths on this site changed. Google crawl errors can be checked at: www.google.com/webmasters/tools/

The crawl errors included some paths that I do not use myself, but that are used by the hosting provider. My directory structure also contains some paths that should not be indexed. To stop the Google web crawler from reporting errors on them, it is good to add these paths to the robots.txt in the root of the domain directory.

The Drupal installation already ships with a robots.txt. The following lines were added to the directories section of that robots.txt:

# Directories
Disallow: /www/
Disallow: /drupal/
Disallow: /cgi-bin/
Disallow: /arjanwooning/
Disallow: /info4admins/
Disallow: /webstat/
Disallow: /wiki/
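To check that the new rules actually block what they should, the directives can be tested locally with Python's standard-library `urllib.robotparser` before uploading. This is just a sketch: the `User-agent: *` line is assumed from the group these directives belong to in Drupal's stock robots.txt, and the sample URLs are made up for illustration.

```python
import urllib.robotparser

# The directory rules from this post, under the "User-agent: *" group
# that Drupal's default robots.txt declares above them (assumption).
rules = """\
User-agent: *
Disallow: /www/
Disallow: /drupal/
Disallow: /cgi-bin/
Disallow: /arjanwooning/
Disallow: /info4admins/
Disallow: /webstat/
Disallow: /wiki/
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# Anything under a disallowed directory is blocked for all crawlers.
print(parser.can_fetch("*", "/wiki/SomePage"))    # False
print(parser.can_fetch("*", "/cgi-bin/test.cgi")) # False

# Paths outside the listed directories stay crawlable.
print(parser.can_fetch("*", "/node/123"))         # True
```

Each `Disallow` line is a path-prefix match, so `/wiki/` blocks every URL below that directory while leaving the rest of the site open to the crawler.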