Sporadic unspecific kernel panic

Posted by koma on Server Fault
Published on 2012-03-26T11:07:38Z Indexed on 2012/03/26 11:33 UTC

I'm seeing infrequent hard crashes (so far about once a month) on our Ubuntu Server 10.04 LTS box. The box itself is quite old (Dell PowerEdge 750 from 2004, Pentium 4 2.8 GHz). I set up netconsole after it crashed twice last Thursday and was able to capture the following output:

[ 9354.062473] invalid opcode: 0000 [#1] SMP
[ 9354.062516] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.0/usb2/2-2/2-2:1.0/uevent
[ 9354.062555] Modules linked in: ppdev adm1026 hwmon_vid i2c_i801 bridge stp dcdbas psmouse serio_raw netconsole configfs shpchp lp parport usbhid hid e1000
[ 9354.062685]
[ 9354.062704] Pid: 3988, comm: rsync Not tainted 2.6.38-12-generic-pae #51~lucid1-Ubuntu Dell Computer Corporation PowerEdge 750              /0R1479
[ 9354.062773] EIP: 0060:[<c104fef1>] EFLAGS: 00010046 CPU: 1
[ 9354.062802] EIP is at check_preempt_wakeup+0x181/0x250
[ 9354.062826] EAX: 00000002 EBX: f2a10ccc ECX: 00000000 EDX: 00000002
[ 9354.062850] ESI: f1db71cc EDI: f1db71a0 EBP: f1dbdea8 ESP: f1dbde8c
[ 9354.062875]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 9354.062900] Process rsync (pid: 3988, ti=f1dbc000 task=f1db71a0 task.ti=f1dbc000)
[ 9354.062933] Stack:
[ 9354.062951]  0053ea60 f7907680 f28da840 f2a10ca0 c153ea60 f7907680 c153ea60 f1dbdebc
[ 9354.063019]  c103f98a f2a10ca0 f7907680 00000001 f1dbdef8 c104f97f 00000000 f2f0bacc
[ 9354.063088]  f7904338 00000001 00000003 00000000 f2f0bacc 00000001 00000001 00000086
[ 9354.063157] Call Trace:
[ 9354.063183]  [<c103f98a>] check_preempt_curr+0x6a/0x80
[ 9354.063210]  [<c104f97f>] try_to_wake_up+0x5f/0x3f0
[ 9354.063236]  [<c1077a00>] ? hrtimer_wakeup+0x0/0x30
[ 9354.063261]  [<c104fd64>] wake_up_process+0x14/0x20
[ 9354.063286]  [<c1077a1d>] hrtimer_wakeup+0x1d/0x30
[ 9354.063310]  [<c1077f4a>] __run_hrtimer+0x7a/0x1c0
[ 9354.063336]  [<c107dbad>] ? ktime_get+0x6d/0x110
[ 9354.063360]  [<c1078310>] hrtimer_interrupt+0x120/0x2b0
[ 9354.063390]  [<c1535c36>] smp_apic_timer_interrupt+0x56/0x8a
[ 9354.063418]  [<c152f459>] apic_timer_interrupt+0x31/0x38
[ 9354.063446]  [<c1520000>] ? mca_attach_bus+0x5/0xc0
[ 9354.063469] Code: 8b 9b 20 01 00 00 8b 86 24 01 00 00 3b 83 24 01 00 00 75 e6 85 db 0f 84 a3 00 00 00 89 da 89 f0 e8 75 f6 fe ff 83 f8 01 0f 85 00 <fe> ff ff 89 f8 e8 95 f9 fe ff 8b 5e 1c 85 db 0f 84 e4 fe ff ff
[ 9354.063804] EIP: [<c104fef1>] check_preempt_wakeup+0x181/0x250 SS:ESP 0068:f1dbde8c
[ 9354.064231] ---[ end trace 290689cea65aea7f ]---
[ 9354.064290] Kernel panic - not syncing: Fatal exception in interrupt
[ 9354.064352] Pid: 3988, comm: rsync Tainted: G      D     2.6.38-12-generic-pae #51~lucid1-Ubuntu
[ 9354.064424] Call Trace:
[ 9354.064481]  [<c152c057>] ? panic+0x5c/0x15b
[ 9354.064539]  [<c15302bd>] ? oops_end+0xcd/0xd0
[ 9354.064539]  [<c100d9e4>] ? die+0x54/0x80
[ 9354.064539]  [<c152f926>] ? do_trap+0x96/0xc0
[ 9354.064539]  [<c100ba00>] ? do_invalid_op+0x0/0xa0
[ 9354.064539]  [<c100ba8b>] ? do_invalid_op+0x8b/0xa0
[ 9354.064539]  [<c104fef1>] ? check_preempt_wakeup+0x181/0x250
[ 9354.064539]  [<c144884d>] ? __kfree_skb+0x3d/0x90
[ 9354.064539]  [<c1042ae7>] ? update_curr+0x247/0x2a0
[ 9354.064539]  [<c10447bb>] ? update_cfs_load+0x11b/0x2d0
[ 9354.064539]  [<c1042a25>] ? update_curr+0x185/0x2a0
[ 9354.064539]  [<c152f6bf>] ? error_code+0x67/0x6c
[ 9354.064539]  [<c104fef1>] ? check_preempt_wakeup+0x181/0x250
[ 9354.064539]  [<c103f98a>] ? check_preempt_curr+0x6a/0x80
[ 9354.064539]  [<c104f97f>] ? try_to_wake_up+0x5f/0x3f0
[ 9354.064539]  [<c1077a00>] ? hrtimer_wakeup+0x0/0x30
[ 9354.064539]  [<c104fd64>] ? wake_up_process+0x14/0x20
[ 9354.064539]  [<c1077a1d>] ? hrtimer_wakeup+0x1d/0x30
[ 9354.064539]  [<c1077f4a>] ? __run_hrtimer+0x7a/0x1c0
[ 9354.064539]  [<c107dbad>] ? ktime_get+0x6d/0x110
[ 9354.064539]  [<c1078310>] ? hrtimer_interrupt+0x120/0x2b0
[ 9354.064539]  [<c1535c36>] ? smp_apic_timer_interrupt+0x56/0x8a
[ 9354.064539]  [<c152f459>] ? apic_timer_interrupt+0x31/0x38
[ 9354.064539]  [<c1520000>] ? mca_attach_bus+0x5/0xc0
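
To make a trace like this easier to compare across crashes, the interesting parts can be pulled out with a small script. The following is only a rough sketch: the function name `parse_oops` and the regular expressions are my own assumptions, written against the 32-bit oops layout shown above, and they may need adjusting for other kernel versions. It extracts the `EIP is at` symbol, the frames of the first call trace (frames printed with a `?` are ones the stack walker found on the stack but could not confirm as part of the call chain), and the byte in the `Code:` line that is wrapped in `<...>`, which is the byte at EIP.

```python
import re

# Matches a frame such as " [<c103f98a>] ? check_preempt_curr+0x6a/0x80".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)
# Matches the "[ 9354.062473] " timestamp prefix at the start of a line.
STAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def parse_oops(text):
    """Summarise the first oops in `text` (hypothetical helper, not a kernel tool)."""
    result = {"eip": None, "frames": [], "fault_byte": None}
    in_trace = False
    for raw in text.splitlines():
        line = STAMP_RE.sub("", raw.strip()).strip()
        if line.startswith("EIP is at "):
            result["eip"] = line[len("EIP is at "):]
        elif line.startswith("Call Trace:"):
            in_trace = True
        elif line.startswith("Code:"):
            # The byte wrapped in <..> is the one the instruction pointer refers to.
            marked = re.search(r"<([0-9a-f]{2})>", line)
            if marked:
                result["fault_byte"] = marked.group(1)
            break  # only the first oops is summarised
        elif in_trace:
            m = FRAME_RE.search(line)
            if m is None:
                in_trace = False
                continue
            result["frames"].append({
                "addr": m.group("addr"),
                "symbol": m.group("sym"),
                "reliable": m.group("guess") is None,  # "?" marks an unconfirmed frame
            })
    return result


# A few lines copied from the log above.
sample = """\
[ 9354.062802] EIP is at check_preempt_wakeup+0x181/0x250
[ 9354.063157] Call Trace:
[ 9354.063183]  [<c103f98a>] check_preempt_curr+0x6a/0x80
[ 9354.063210]  [<c104f97f>] try_to_wake_up+0x5f/0x3f0
[ 9354.063236]  [<c1077a00>] ? hrtimer_wakeup+0x0/0x30
[ 9354.063261]  [<c104fd64>] wake_up_process+0x14/0x20
[ 9354.063469] Code: 8b 9b 20 01 00 00 8b 86 24 01 00 00 3b 83 24 01 00 00 75 e6 85 db 0f 84 a3 00 00 00 89 da 89 f0 e8 75 f6 fe ff 83 f8 01 0f 85 00 <fe> ff ff 89 f8 e8 95 f9 fe ff 8b 5e 1c 85 db 0f 84 e4 fe ff ff
"""

info = parse_oops(sample)
print("faulting location:", info["eip"])
print("byte at EIP:", info["fault_byte"])
for frame in info["frames"]:
    if frame["reliable"]:
        print("  ", frame["symbol"])
```

Running the same function over each captured crash would show whether the panics always land in the same function or move around.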

Googling for this issue didn't turn up anything useful. Most of what I found was related to btrfs, which I don't use (although the module exists and is sometimes loaded). From experience, the crashes might be related to relatively heavy I/O, as two of the panics happened during a backup run.

The kernel is 2.6.38-12-generic-pae, but I'm pretty sure I also saw panics on 2.6.32. In the meantime I have upgraded to 3.0.0-17-generic-pae and am waiting for the next crash ;-)
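
Since the next crash may be weeks away, the receiving side of netconsole has to keep logging unattended. netconsole sends kernel messages as plain text in UDP datagrams, so a receiver can be very small. The sketch below is one possible way to do it, not the setup actually used here: the names `receive_datagrams` and `main`, the log file path, and the port are assumptions (6666 is netconsole's default target port, but the real port depends on how the module was configured).

```python
import socket
import time


def receive_datagrams(sock, log_file, max_datagrams=None):
    """Append each UDP payload to log_file, prefixed with receive time and sender.

    Runs forever when max_datagrams is None; otherwise stops after that many
    datagrams and returns the count.
    """
    count = 0
    while max_datagrams is None or count < max_datagrams:
        payload, (sender_ip, _sender_port) = sock.recvfrom(65535)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        text = payload.decode("utf-8", errors="replace").rstrip("\n")
        log_file.write(f"{stamp} {sender_ip} {text}\n")
        log_file.flush()  # keep the file current even if the receiver is killed
        count += 1
    return count


def main(port=6666, path="netconsole.log"):
    """Listen on `port` (assumed netconsole target port) and log to `path`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    with open(path, "a", encoding="utf-8") as log_file:
        receive_datagrams(sock, log_file)
```

Calling `main()` on the machine that netconsole points at leaves a timestamped file behind. The wall-clock timestamps also make it possible to line a panic up with the backup schedule.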

I'm at a loss here, so any pointers on where to look for the cause, or what it could be, would be great :-) Thanks!

© Server Fault or respective owner
