central apache log analysis of many hosts

Posted by Jason Antman on Server Fault See other posts from Server Fault or by Jason Antman
Published on 2012-06-08T15:22:37Z Indexed on 2012/06/08 16:42 UTC
Read the original article Hit count: 231

Filed under:
|
|

We have 30+ apache httpd servers, and are looking to perform analysis on the logs both for historical trending and near "real time" monitoring/alerting. I'm mainly interested in things like error rates (4xx/5xx), response time, overall request rate, etc. but it would also be very useful to pull out more compute-intensive statistics like unique client IPs and user agents per unit of time.

I'm leaning towards building this as a centralized collector/server/storage, and am also considering the possibility of storing non-apache logs (i.e. general syslog, firewall logs, etc.) in the same system.

Obviously a large part of this will probably have to be custom (at least the connection between pieces and the parsing/analysis we do), but I haven't been able to find much information on people who have done stuff like this, at least at shops smaller than Google/Facebook/etc. who can throw their log data into a hundred-node compute cluster and run Map/Reduce on it.

The main things I'm looking for are: - All open source - Some way of collecting logs from apache machines that isn't too resource-intensive, and transports them relatively quickly over the network - Some way of storing them (NoSQL? key-value store?) on the backend, for a given amount of time (and then rolling them up into historical averages) - In the middle of this, a way of graphing in near-real-time (probably also with some statistical analysis on it) and hopefully alerting off of those graphs.

Any suggestions/pointers/ideas, to either "products"/projects or descriptions of how other people do this would be greatly helpful. Unfortunately, we're not exactly a new-age-y devops shop, lots of old stuff, homogeneous infrastructure, and strained boxes.

© Server Fault or respective owner

Related posts about apache2

Related posts about logging