task blocked for more than
- by Manuel Sopena Ballesteros
I have a webserver with the configuration below:
VMware ESXi environment
cPanel installed
CentOS release 6.5 (Final)
4 CPUs
2 GB RAM
2x VM disks, 100 GB each
LVM system
My issue is that I am getting kernel panics quite frequently. This is a list of some of the blocked processes I could see on the console:
mysqld
queueprocd
httpd
suphp
vmtoolsd
loop0
auditd
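Tasks reported as blocked like this are normally stuck in uninterruptible sleep (state D), so one thing that could be run periodically is a check for D-state processes while the machine is still responsive. The following is only a minimal sketch, assuming Python 3 is available on the box; the function names are made up for illustration, and it only looks at each process's main thread via /proc/<pid>/stat.

```python
import os

def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for processes currently in uninterruptible sleep."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError):
            continue  # process exited or stat was unreadable
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

Run from cron every minute and appended to a file, the output would at least show which processes were stuck shortly before a hang.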
These are my sar logs:
Linux 2.6.32-431.3.1.el6.x86_64 (test01)        08/22/2014      _x86_64_        (4 CPU)
12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all     26.86      0.01      0.98      0.57      0.00     71.57
12:20:01 AM     all      1.78      0.02      1.03      0.08      0.00     97.09
12:30:01 AM     all     26.34      0.02      0.85      0.05      0.00     72.74
12:40:01 AM     all     27.12      0.01      1.11      1.22      0.00     70.54
12:50:01 AM     all      1.59      0.02      0.94      0.13      0.00     97.32
01:00:01 AM     all     26.10      0.01      0.77      0.04      0.00     73.07
01:10:01 AM     all     27.51      0.01      1.16      0.14      0.00     71.18
01:20:01 AM     all      1.80      0.07      1.06      0.08      0.00     96.99
01:30:01 AM     all     26.19      0.01      0.78      0.05      0.00     72.96
01:40:01 AM     all     26.62      0.02      0.87      0.05      0.00     72.45
01:50:02 AM     all      1.35      0.01      0.87      0.02      0.00     97.75
02:00:01 AM     all     26.11      0.02      0.69      0.02      0.00     73.17
02:10:01 AM     all     26.73      0.02      0.89      0.14      0.00     72.21
02:20:01 AM     all      1.45      0.01      0.92      0.04      0.00     97.58
02:30:01 AM     all     26.59      0.01      1.06      0.03      0.00     72.31
02:40:01 AM     all     26.27      0.01      0.72      0.05      0.00     72.95
02:50:01 AM     all      0.86      0.01      0.50      0.09      0.00     98.53
03:00:01 AM     all     25.61      0.02      0.39      0.03      0.00     73.96
03:10:01 AM     all     26.30      0.08      0.66      0.14      0.00     72.82
03:20:01 AM     all      0.81      0.01      0.51      0.04      0.00     98.63
03:30:02 AM     all     26.15      0.02      0.53      0.07      0.00     73.24
03:40:01 AM     all     26.06      0.01      0.47      0.04      0.00     73.42
03:50:01 AM     all      0.96      0.02      0.51      0.03      0.00     98.48
Average:        all     17.69      0.02      0.79      0.14      0.00     81.36
06:58:14 AM       LINUX RESTART
07:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
07:10:01 AM     all      1.04      0.02      0.57      0.95      0.00     97.42
07:20:02 AM     all      0.66      0.01      0.39      0.06      0.00     98.87
07:30:01 AM     all     25.71      0.01      0.45      0.16      0.00     73.67
07:40:01 AM     all     25.88      0.01      0.35      0.08      0.00     73.68
As you can see, the server became unresponsive shortly after 03:50 AM (the last sar sample before the gap) and I had to reset the VM at 06:58 AM to fix it.
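To pin down such hangs without reading the whole report by eye, the gaps between sar samples can be detected automatically. This is a rough sketch, assuming the 12-hour timestamp format and 10-minute interval shown in the sar output above and a report that covers a single day; `find_gaps` and its parameters are hypothetical names, not part of sysstat.

```python
from datetime import datetime, timedelta

def find_gaps(sar_lines, expected=timedelta(minutes=10),
              slack=timedelta(minutes=2)):
    """Return (last_sample, next_event) pairs where sampling stalled."""
    gaps = []
    previous = None
    for line in sar_lines:
        fields = line.split()
        # Only lines that start with "HH:MM:SS AM|PM" carry a timestamp;
        # the banner, "Average:" and blank lines are skipped.
        if len(fields) < 3 or fields[1] not in ("AM", "PM"):
            continue
        stamp = datetime.strptime(fields[0] + " " + fields[1], "%I:%M:%S %p")
        if previous is not None and stamp - previous > expected + slack:
            gaps.append((previous.strftime("%I:%M:%S %p"),
                         stamp.strftime("%I:%M:%S %p")))
        previous = stamp
    return gaps

if __name__ == "__main__":
    import sys
    # Usage: sar -u | python3 find_gaps.py
    for last_seen, resumed in find_gaps(sys.stdin):
        print("no samples between", last_seen, "and", resumed)
```

On the output above, this should report a single gap between the 03:50:01 AM sample and the 06:58:14 AM restart line.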
dmesg does not show any relevant information.
I don't see any bottleneck in sar. Any idea what I can check next?
Thank you very much.