File corruption on read/write on 2.6.32-22-server (happens across many kernels)

Posted by Jonathan on Server Fault
Published on 2010-05-31T12:24:43Z Indexed on 2010/05/31 12:34 UTC


Hi Guys,

I'm having an issue where, after the server has been up for a while (anywhere from a few days to about a week), it starts reading corrupt data. For instance, after a fresh boot, running sha1sum on a file gives the same result every time. After a while, however, I start to get segfaults, and from then on whenever I read this file I get a different sha1sum.
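To pin down when the checksum first changes, the manual sha1sum check described above could be automated. The following is only a minimal sketch, not something from the original setup: the function names `sha1_of_file` and `watch_file`, the one-hour default interval, and the chunk size are all assumptions chosen for illustration.

```python
import hashlib
import time


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def watch_file(path, baseline, interval_s=3600, max_checks=None, sleep=time.sleep):
    """Re-hash `path` periodically and stop at the first mismatch.

    `baseline` is the digest recorded right after a fresh boot.
    Returns (check_number, digest) for the first digest that differs from
    the baseline, or None if `max_checks` checks all matched.
    With max_checks=None the loop runs until a mismatch is seen.
    """
    check = 0
    while max_checks is None or check < max_checks:
        check += 1
        current = sha1_of_file(path)
        if current != baseline:
            print(f"{time.ctime()}: MISMATCH on check {check}: {current}")
            return check, current
        print(f"{time.ctime()}: ok (check {check})")
        sleep(interval_s)
    return None
```

A possible way to use it: record `baseline = sha1_of_file(path)` right after boot, then call `watch_file(path, baseline)` and note the timestamp of the first mismatch so it can be compared against the kernel log. Repeated reads are normally served from the page cache, so a mismatch here does not by itself show whether the data on disk has changed.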

I've checked S.M.A.R.T. with long tests, and I've run an extended memtest86+ (12 passes).

My lspci is as follows:

00:00.0 Host bridge: Advanced Micro Devices [AMD] RS780 Host Bridge
00:01.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (int gfx)
00:06.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 2)
00:07.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 3)
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [AHCI mode]
00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:12.1 USB Controller: ATI Technologies Inc SB700 USB OHCI1 Controller
00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:13.1 USB Controller: ATI Technologies Inc SB700 USB OHCI1 Controller
00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 3c)
00:14.1 IDE interface: ATI Technologies Inc SB700/SB800 IDE Controller
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:14.5 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Link Control
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon HD 3300 Graphics
01:05.1 Audio device: ATI Technologies Inc RS780 Azalia controller
02:00.0 Ethernet controller: Atheros Communications Atheros AR8121/AR8113/AR8114 PCI-E Ethernet Controller (rev b0)
03:00.0 FireWire (IEEE 1394): VIA Technologies, Inc. Device 3403

I could really use some help with this. Do you have any idea what could cause it? It's really frustrating, as it seems to trigger entirely at random and does not go away until I reboot. I'm also using KVM for virtualization and MD for software RAID on this server, and the processor is a Phenom II X4 965. I don't believe the software RAID is the cause, however, since this also affects files hosted on non-RAID partitions, so I'm not sure.

