Googlebot repeatedly looks for files that aren't on my server

Posted by John at CashCommons on Server Fault

I'm hosting a site for a volunteer organization. I've moved it to WordPress, but it wasn't always that way; I suspect at one point the old site was badly hacked.

My Apache error log has grown to 122 kB in just the past 18 hours. The vast majority of the entries take this form, repeated hundreds of times today alone:

[Mon Nov 12 18:29:27 2012] [error] [client xx.xxx.xx.xxx] File does not exist: /home/*******/public_html/*******.org/calendar.php
[Mon Nov 12 18:29:27 2012] [error] [client xx.xxx.xx.xxx] File does not exist: /home/*******/public_html/*******.org/404.shtml

(I verified that xx.xxx.xx.xxx was a Google server.)
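(A reverse DNS lookup followed by a forward confirmation is one common way to check this; the commands below are only a sketch, and the crawl-*.googlebot.com hostname is illustrative, not taken from my logs.)

host xx.xxx.xx.xxx
# expected: a name under googlebot.com or google.com, e.g. crawl-xx-xxx-xx-xxx.googlebot.com
host crawl-xx-xxx-xx-xxx.googlebot.com
# expected: the name should resolve back to the same xx.xxx.xx.xxx address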

I suspect there was a security hole somewhere before, likely in calendar.php, that was exploited.

The files don't exist anymore, but there may be many backlinks still pointing to them, which is why Googlebot is so interested in crawling them.

How do I fix this gracefully? I'd still like Google to index the site; I just want to tell it somehow not to look for these files anymore.
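Something like the following is what I have in mind, if it makes sense (just a sketch, assuming these old URLs are permanently gone, with calendar.php as the example path): return 410 Gone for them via .htaccess using mod_alias, or block them in robots.txt.

# .htaccess -- tell crawlers the old script is permanently gone (requires mod_alias)
Redirect gone /calendar.php

# robots.txt -- or simply stop Googlebot from requesting the path at all
User-agent: Googlebot
Disallow: /calendar.php

My understanding is that a 410 should get the URL dropped from Google's index faster than a plain 404, while a robots.txt Disallow only stops the crawling itself.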

