File robots.txt

The robots.txt file explained and illustrated

Some important things first:

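1. A site can have only one robots.txt file, and its name must be exactly "robots.txt" (all lowercase).
2. The file must be located at the root of the website, for example https://www.example.com/robots.txt.
3. Comments start with a # character and run to the end of the line.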
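
Because the file always sits at the root, its address can be derived from any page URL on the same site. A minimal Python sketch of that rule follows; the function name robots_url and the example.com addresses are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); drop path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://www.example.com/blog/post.html?x=1"))
# prints https://www.example.com/robots.txt
```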


1. Allow all web crawlers access to all content (an empty Disallow blocks nothing):

User-agent: *
Disallow:

2. Block all web crawlers from all content:

User-agent: * 
Disallow: /

3. Block all web crawlers from a specific folder:

User-agent: * 
Disallow: /no-index/
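
The first three rule sets can be checked locally before they are published. The sketch below uses Python's standard urllib.robotparser module; the helper is_allowed, the crawler name ExampleBot and the example.com URLs are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_FOLDER = "User-agent: *\nDisallow: /no-index/\n"

page = "https://www.example.com/no-index/page.html"  # hypothetical URLs
home = "https://www.example.com/index.html"

print(is_allowed(ALLOW_ALL, page))     # True
print(is_allowed(BLOCK_ALL, home))     # False
print(is_allowed(BLOCK_FOLDER, page))  # False
print(is_allowed(BLOCK_FOLDER, home))  # True
```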

4. Block all web crawlers from files of a specific type (here PDF; * matches any sequence of characters, $ marks the end of the URL):

User-agent: *
Disallow: /*.pdf$
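
To see which paths such a pattern catches, the wildcard rules can be imitated with a regular expression. Python's urllib.robotparser compares paths as plain prefixes, so this sketch uses a small hand-written matcher instead; pattern_to_regex and is_blocked are hypothetical helpers, not part of any library, and real crawlers may differ in details.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' and trailing '$') into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern: str, path: str) -> bool:
    """True if the Disallow pattern matches the path from its first character."""
    return pattern_to_regex(pattern).match(path) is not None


print(is_blocked("/*.pdf$", "/docs/manual.pdf"))      # True
print(is_blocked("/*.pdf$", "/docs/manual.pdf?v=2"))  # False: does not end in .pdf
print(is_blocked("/*.pdf$", "/docs/manual.html"))     # False
```

Without the trailing $, the same pattern would also match /docs/manual.pdf?v=2, because a pattern then only has to match the beginning of the path.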

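5. Link to sitemap.xml (the URL must be absolute; example.com is a placeholder):

User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml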
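
A sitemap is declared with a Sitemap line that carries an absolute URL. As a rough check that a parser picks the line up, the sketch below reads it back with urllib.robotparser (site_maps() needs Python 3.8 or later); the example.com address is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content; the sitemap URL is a placeholder.
robots_txt = """\
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

sitemaps = parser.site_maps()  # list of sitemap URLs, or None if there are none
print(sitemaps)  # ['https://www.example.com/sitemap.xml']
```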

