Why do people crawl sites without downloading pictures?

Posted by Michael on Pro Webmasters
Published on 2011-06-21T07:32:50Z Indexed on 2011/06/21 8:31 UTC

Let me show you what I mean:

IP               Pages   Hits    Bandwidth
85.xx.xx.xxx     236     236     735.00 KB
195.xx.xxx.xx    164     164     533.74 KB
95.xxx.xxx.xxx   90      90      293.47 KB

It's very clear that these people are crawling my site with bots. There's no way that you could visit this many pages on my site and use less than 1 MB of bandwidth. You might say that they could be browsing the site with some browser or plug-in that does not download images, JS/CSS files, etc., but the simple fact of the matter is that there are not 90-236 pages linked from the home page (outside of WP files), even if you visited every page twice.
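To put numbers on that, here is a rough Python sketch that works out hits per page and average bandwidth per request from the three rows above. The function names are made up for illustration, and the "hits equal pages means no images/CSS/JS were fetched" rule is an assumption about how the stats tool counts pages and hits, not something it reports directly. For these rows it comes out to a little over 3 KB per request.

```python
# Rows copied from the stats table above: (masked IP, pages, hits, bandwidth in KB).
ROWS = [
    ("85.xx.xx.xxx", 236, 236, 735.00),
    ("195.xx.xxx.xx", 164, 164, 533.74),
    ("95.xxx.xxx.xxx", 90, 90, 293.47),
]


def summarize(pages, hits, kb):
    """Return (hits per page, average KB per hit) for one visitor."""
    return hits / pages, kb / hits


def looks_like_html_only(pages, hits):
    # Assumption: a normal browser also requests images, CSS and JS,
    # so it would generate more hits than pages.
    return hits == pages


for ip, pages, hits, kb in ROWS:
    ratio, kb_per_hit = summarize(pages, hits, kb)
    flag = looks_like_html_only(pages, hits)
    print(f"{ip:<16} hits/page={ratio:.2f}  avg={kb_per_hit:.2f} KB/hit  html_only={flag}")
```

This is only a heuristic: a text-only browser or an aggressive ad/script blocker could produce similar ratios, so it suggests automated fetching rather than proving it.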

I could understand if these people were crawling the site for pictures, but once again, the bandwidth indicates that this isn't what is happening. Why, then, would they crawl the site to simply view the HTML/txt/js/etc. files?

The only explanation I can come up with is that they are scanning for outdated versions of WordPress, SQL injection vulnerabilities, etc., which makes me inclined to ban the IPs outright. Still, I'm curious: is it possible that these visitors are legitimate users, or at the very least not intending to be harmful?

© Pro Webmasters or respective owner
