robots.txt file with more restrictive rules for certain user agents

Posted by Carson63000 on Server Fault
Published on 2010-06-08T06:04:57Z

Hi,

I'm a bit vague on the precise syntax of robots.txt, but what I'm trying to achieve is:

  • Tell all user agents not to crawl certain pages
  • Tell certain user agents not to crawl anything

(basically, some pages with enormous amounts of data should never be crawled; and some voracious but useless search engines, e.g. Cuil, should never crawl anything)

If I do something like this:

User-agent: *
Disallow: /path/page1.aspx
Disallow: /path/page2.aspx
Disallow: /path/page3.aspx

User-agent: twiceler
Disallow: /

...will it flow through as expected, with all other user agents matching the first block and skipping page1, page2 and page3, and twiceler matching the second block and skipping everything?
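
For what it's worth, one way to sanity-check rules like these locally is Python's standard urllib.robotparser. The sketch below just feeds it the example above; "Googlebot" and the extra path are only stand-ins for "any other crawler" and "any other page", not anything from the question itself:

# Sanity check of the robots.txt rules above using Python's standard library.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /path/page1.aspx
Disallow: /path/page2.aspx
Disallow: /path/page3.aspx

User-agent: twiceler
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Crawlers with no group of their own fall back to the "*" group:
# blocked only on the three listed pages.
print(parser.can_fetch("Googlebot", "/path/page1.aspx"))       # expect False
print(parser.can_fetch("Googlebot", "/some/other/page.aspx"))  # expect True

# twiceler matches its own group, which disallows everything.
print(parser.can_fetch("twiceler", "/some/other/page.aspx"))   # expect False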
