Disallow robots.txt from being accessed in a browser but still accessible by spiders?

Posted by Michael Irigoyen on Pro Webmasters
Published on 2011-02-14T22:06:08Z Indexed on 2011/02/15 23:35 UTC

We use a robots.txt file to prevent Google (and other search spiders) from crawling certain pages and directories on our domain. Some of these directories and files are secret, meaning they aren't linked from anywhere (except perhaps from other pages that robots.txt also covers). Others aren't secret; we just don't want them indexed.

If somebody browses directly to www.mydomain.com/robots.txt, they can see the contents of the robots.txt file. From a security standpoint, this is not something we want publicly available. Any directories that contain sensitive information sit behind authentication, but we still don't want them to be discoverable unless the user already knows about them.

Is there a way to provide a robots.txt file to spiders while hiding its presence from John Doe accessing it from his browser? Perhaps by using PHP to generate the document based on certain criteria? Perhaps something I'm not thinking of? We'd prefer to do this centrally (meaning a <meta> tag solution is less than ideal).
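
To make the "generate it based on certain criteria" idea concrete, here is a rough sketch of the kind of check we have in mind, written in Python for brevity (the same logic should port to PHP). The crawler tokens, the function name, and the /private/ rule are placeholders made up for illustration. The sketch also assumes the User-Agent header can be trusted, which it cannot: anyone can spoof it, so this would only hide the file from casual visitors.

```python
# Hypothetical sketch: choose a robots.txt response based on the User-Agent header.
# The token list and the rules below are illustrative assumptions, not a vetted list.
CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp")

# Placeholder rules; the real file would list the actual paths to exclude.
REAL_ROBOTS = "User-agent: *\nDisallow: /private/\n"


def robots_response(user_agent):
    """Return (status_code, body) for a request to /robots.txt."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in CRAWLER_TOKENS):
        # Looks like a known crawler: serve the real rules.
        return 200, REAL_ROBOTS
    # Anything else (including a missing header) gets a plain 404.
    return 404, ""
```

A request handler would call robots_response() with the incoming User-Agent value and send back the returned status and body.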

© Pro Webmasters or respective owner
