Server load spikes several times a day, load average for the past month is 5 times the load average all year

Posted by AMF on Server Fault See other posts from Server Fault or by AMF
Published on 2011-06-21T21:38:04Z Indexed on 2011/06/22 0:24 UTC
Read the original article Hit count: 475

Filed under:
|
|
|
|

My Munin notifications set up for our (Debian) LAMP cluster have been notifying me continuously that our load on our production machine has been at dangerous levels. While the average load all year typically runs between 2 and 8, the load in the past month and only the past month -- has been skyrocketing to 10, 18, and occasionally even 50-60. The spikes last only 5-10 minutes at a time and occur about every 2-3 hours. The spikes do not effect performance only because I have a script that sends traffic off our server to a mirror CDN when the load goes above 10. I've looked for cron jobs that correlate with this timeframe but there is nothing I can see that would cause this. Site traffic is also normal (we receive about 200K visits per day). I'm also trying to think of anything I've changed around the time this problem began, and I really cannot think of anything.

This is probably not much to go on. Maybe there is a clue in the top print-out (below) that I'm not seeing.

How do I proceed to find the cause?

-- Typical top when the load is NOT spiking:

top - 11:13:09 up 472 days, 25 min,  1 user,  load average: 6.08, 4.29, 3.80
Tasks: 105 total,   1 running, 104 sleeping,   0 stopped,   0 zombie
Cpu(s): 41.2%us,  5.8%sy,  0.0%ni, 49.5%id,  2.7%wa,  0.1%hi,  0.7%si,  0.0%st
Mem:   3369592k total,  2166980k used,  1202612k free,   559504k buffers
Swap:  2650684k total,     1892k used,  2648792k free,  1129116k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
32046 apache    15   0 36300  12m 9828 S   20  0.4   0:01.97 apache2
32679 apache    15   0 36568  13m  10m S   19  0.4   0:01.69 apache2
31441 apache    15   0 36616  13m  10m S   19  0.4   0:04.13 apache2
31477 apache    15   0 36596  13m 9.8m S   15  0.4   0:01.99 apache2
31993 apache    15   0 36876  16m  12m S   12  0.5   0:02.01 apache2
31782 apache    15   0 36836  14m  10m S    8  0.4   0:02.17 apache2
32198 apache    15   0 36536  13m  10m S    7  0.4   0:01.59 apache2
  880 apache    15   0 36508 9708 6236 S    7  0.3   0:00.42 apache2
31945 apache    17   0 36876  16m  13m S    5  0.5   0:03.17 apache2
32197 apache    16   0 36636  10m 7504 S    5  0.3   0:02.70 apache2
32326 apache    15   0 37024  11m 7632 S    5  0.3   0:02.15 apache2
32565 apache    15   0 37280  13m 9.8m S    5  0.4   0:03.75 apache2
32676 apache    15   0 36896  16m  12m S    4  0.5   0:00.95 apache2
32678 apache    15   0 36536  12m 9692 S    4  0.4   0:02.27 apache2
  974 apache    16   0 37064 9888 6016 D    4  0.3   0:00.13 apache2
32150 apache    16   0 36832  13m  10m S    3  0.4   0:01.74 apache2
31780 apache    16   0 36848  11m 7660 S    3  0.3   0:02.87 apache2

© Server Fault or respective owner

Related posts about apache

Related posts about php