Apache logging issues

Posted by Dan on Server Fault See other posts from Server Fault or by Dan
Published on 2010-01-13T21:40:30Z Indexed on 2010/03/17 4:31 UTC
Read the original article Hit count: 422

Filed under:
|
|

I'm trying to parse apache log files, but I'm finding some strange results and I'm not sure what they mean. Hopefully someone can provide some insight. (all of the IP addresses were altered. none actually start with 192, I didn't figure the search engines mattered though.)

In the first example, multiple ip addresses are showing up in the host field:

192.249.71.25 - - [04/Aug/2009:04:21:44 -0500] "GET /publications/example.pdf HTTP/1.1" 200 2738
192.0.100.93, 192.20.31.86 - - [04/Aug/2009:04:21:22 -0500] "GET /docs/another.pdf HTTP/1.0" 206 371469

What causes this? Does it have to do with proxy servers? Is there a way to have Apache only log one?

In the second example, a bunch of information is just completely missing! What would cause this?

msnbot-65-55-207-50.search.msn.com - - [29/Dec/2009:15:45:16 -0600] "GET /publications/example.pdf HTTP/1.1" 200 3470073 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)" 266 3476792
- - - - "-" - - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1)" 285 594
- - - - "-" - - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1)" 285 4195
- - - - "-" - - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1)" 299 109218
crawl-17c.cuil.com - - [29/Dec/2009:15:45:46 -0600] "GET /publications/another.pdf HTTP/1.0" 200 101481 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)" 253 101704

My CustomLog configuration says:

LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\" %I %O" common

© Server Fault or respective owner

Related posts about linux

Related posts about apache