robots.txt

robots.txt is a plain text file webmasters create to instruct web robots (typically search engine crawlers) how to crawl pages on their website. The file must be placed at the root of the site, e.g. http://website.com/robots.txt.

Web crawlers (also known as web robots or spiders) are programs that traverse the Web automatically. Search engines such as Google use them to index web content, spammers use them to harvest email addresses, and they have many other uses.

User-agent values are case-insensitive: Googlebot and googlebot identify the same crawler. Paths in Disallow rules, by contrast, are case-sensitive.

Crawler bots consume resources on the server where your website is hosted: each page a bot fetches costs roughly the same as a visitor opening that page in a browser. The Crawl-delay directive used in the examples below asks a bot to wait the given number of seconds between requests; note, however, that it is non-standard: Google ignores it, while crawlers such as Bing's respect it.
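As a rough sketch, this is how a polite crawler written in Python could honor these rules using the standard library's urllib.robotparser module (the domain and the bot name MyBot are placeholders):

import time
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("http://website.com/robots.txt")
rp.read()  # download and parse the live robots.txt

bot = "MyBot"
page = "http://website.com/some/page"

if rp.can_fetch(bot, page):
    # Honor Crawl-delay when the file sets one; fall back to 1 second.
    time.sleep(rp.crawl_delay(bot) or 1)
    # ... fetch the page here ...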

robots.txt example syntax:


Sitemap: http://website.com/sitemap.xml

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Crawl-delay: 10
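You can check how rules like these behave with the same urllib.robotparser module (site_maps() needs Python 3.8+; the bot name AnyBot is a placeholder):

from urllib.robotparser import RobotFileParser

rules = """\
Sitemap: http://website.com/sitemap.xml

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-content/plugins/
Crawl-delay: 10
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.site_maps())                                             # ['http://website.com/sitemap.xml']
print(rp.can_fetch("AnyBot", "http://website.com/wp-admin/x"))    # False
print(rp.can_fetch("AnyBot", "http://website.com/2024/hello/"))   # True
print(rp.crawl_delay("AnyBot"))                                   # 10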



You can also address individual crawlers by naming them in the User-agent line (Slurp is Yahoo's crawler):

User-agent: Mediapartners-Google
Disallow: /wp-admin/
Disallow: /wp-includes/
Crawl-delay: 10

User-agent: Googlebot
Disallow: /wp-admin/
Disallow: /wp-includes/
Crawl-delay: 10

User-agent: Adsbot-Google
Disallow: /wp-admin/
Disallow: /wp-includes/
Crawl-delay: 10

User-agent: msnbot
Disallow: /wp-admin/
Disallow: /wp-includes/
Crawl-delay: 10

User-agent: bingbot
Disallow: /wp-admin/
Disallow: /wp-includes/
Crawl-delay: 10

User-agent: Slurp
User-agent: Yahoo
Disallow: /wp-admin/
Disallow: /wp-includes/
Crawl-delay: 10
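Note that a crawler follows only the most specific group that matches it: a named group replaces the catch-all (*) rules for that bot rather than adding to them. A quick check with urllib.robotparser (bot names are illustrative):

from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /wp-content/plugins/

User-agent: Googlebot
Disallow: /wp-admin/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Googlebot obeys its own group, so the catch-all rule does not apply to it:
print(rp.can_fetch("Googlebot", "http://website.com/wp-content/plugins/a.js"))  # True
print(rp.can_fetch("Googlebot", "http://website.com/wp-admin/"))                # False

# Any other bot falls back to the catch-all group:
print(rp.can_fetch("OtherBot", "http://website.com/wp-content/plugins/a.js"))   # False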



To block a crawler from the whole site, disallow the root path. Keep in mind that robots.txt is purely advisory: well-behaved bots honor it, but nothing in the file actually prevents a bot from fetching pages.

# Block Google
User-agent: googlebot
Disallow: /

# Block Bing (bingbot and the older msnbot)
User-agent: bingbot
User-agent: msnbot
Disallow: /

# Block Yahoo
User-agent: slurp
User-agent: yahoo
Disallow: /

# Block Ask
User-agent: askjeeves
User-agent: jeeves
User-agent: teoma
Disallow: /

# Block Baidu
User-agent: baiduspider
Disallow: /

# Block Yandex
User-agent: yandex
Disallow: /
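As a final sanity check with urllib.robotparser, a Disallow: / group blocks every URL for the named bot while leaving unlisted bots unaffected:

from urllib.robotparser import RobotFileParser

rules = """\
# Block Yandex
User-agent: yandex
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("Yandex", "http://website.com/"))           # False
print(rp.can_fetch("Yandex", "http://website.com/any/page"))   # False
print(rp.can_fetch("OtherBot", "http://website.com/"))         # True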


