Best way to statistically detect anomalies in data

Posted by reinier on Stack Overflow
Published on 2009-08-20T15:01:53Z

Hi,

Our webapp collects a huge amount of data about user actions, network activity, database load, and so on.

All of this data is stored in warehouses, and we have quite a lot of interesting views on it.

If something odd happens, chances are it shows up somewhere in the data.

However, detecting that something out of the ordinary is going on currently means continually combing through this data by hand, looking for oddities.

My question: what is the best way to detect changes in dynamic data that can be considered 'out of the ordinary'?
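For a first pass, one common statistical approach is a rolling z-score: flag any value that sits more than a few standard deviations from a recent moving average. A minimal sketch (the window size, warm-up count, and threshold here are arbitrary assumptions, and the function name is made up for illustration):

```python
from collections import deque
import math

def make_zscore_detector(window=100, threshold=3.0):
    """Return a callable that flags values more than `threshold`
    standard deviations away from the rolling mean of the last
    `window` observations."""
    history = deque(maxlen=window)

    def check(value):
        if len(history) >= 10:  # wait for a minimal baseline
            mean = sum(history) / len(history)
            var = sum((x - mean) ** 2 for x in history) / len(history)
            std = math.sqrt(var)
            anomalous = std > 0 and abs(value - mean) > threshold * std
        else:
            anomalous = False  # not enough data to judge yet
        history.append(value)
        return anomalous

    return check
```

For example, after feeding it fifty load samples hovering around 10, a sudden sample of 100 would be flagged, while normal jitter would not. The weakness of a plain z-score is that it assumes the "normal" level is roughly stationary within the window, which leads to the daily-curve issue described below.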

Are Bayesian filters (I've seen these mentioned when reading about spam detection) the way to go?

Any pointers would be great!

EDIT: To clarify: the data shows, for example, a daily curve of database load. This curve typically looks similar to yesterday's curve, but over time it may change slowly.

It would be nice if a warning could go off when the curve changes from one day to the next by more than some tolerance.
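One way to formalize that (a sketch under assumptions, not a definitive method): keep a per-time-slot baseline curve as an exponentially weighted average of past days, so the slow drift mentioned above is absorbed into the baseline, and raise a warning for any slot where today's curve deviates from the baseline by more than a relative tolerance. The function names, the smoothing factor `alpha`, and the 25% tolerance are all illustrative choices:

```python
def update_baseline(baseline, today, alpha=0.1):
    """Blend today's per-slot curve into the baseline (EWMA).
    Slow day-to-day drift is absorbed; sudden shifts are not."""
    if baseline is None:
        return list(today)  # first day seeds the baseline
    return [(1 - alpha) * b + alpha * t for b, t in zip(baseline, today)]

def compare_curves(baseline, today, tolerance=0.25):
    """Return indices of time slots where today's value deviates
    from the baseline by more than `tolerance` (relative)."""
    alerts = []
    for i, (b, t) in enumerate(zip(baseline, today)):
        ref = max(abs(b), 1e-9)  # avoid division by zero
        if abs(t - b) / ref > tolerance:
            alerts.append(i)
    return alerts
```

Run `compare_curves` against today's curve first, then fold the day into the baseline with `update_baseline`; updating only on non-anomalous days keeps a sustained outage from polluting the baseline.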

R
