What does "single-bit ECC errors were detected on the RAID controller" mean?

Posted by jsp on Server Fault
Published on 2014-02-07T22:02:55Z Indexed on 2014/08/23 16:24 UTC

I have a Dell T7600 with a PERC H710P RAID controller and four attached 3TB drives. Over the past few months the RAID controller has been intermittently reporting errors on boot, such as "no boot device found" and "adapter at baseport is not responding", and disks have frequently been reported as missing or failed.

I have since replaced the RAID controller, the 4 hard drives, and finally the system's motherboard.

After replacing the motherboard and rebooting a few times, I got this error:

Single bit ECC errors were detected on the RAID controller.
Please contact technical support to resolve this issue.

After rebooting about 20 more times, I haven't seen the ECC error again. The system seems otherwise OK, except that the disk fans will sometimes start blowing at full blast while the system is sitting completely idle, and they do not stop until I reboot.

Are the ECC errors in memory on the RAID controller itself? Or does the RAID controller map in system memory, so that the ECC errors are really in system memory? Or are they in the 1GB cache that resides on the RAID controller?

© Server Fault or respective owner
