Linux Software RAID recovery

Posted by Zoredache on Server Fault
Published on 2012-03-15T02:13:55Z


I am seeing a discrepancy between the output of mdadm --detail and mdadm --examine, and I don't understand why.

This output:

mdadm --detail /dev/md2
/dev/md2:
        Version : 0.90
  Creation Time : Wed Mar 14 18:20:52 2012
     Raid Level : raid10
     Array Size : 3662760640 (3493.08 GiB 3750.67 GB)
  Used Dev Size : 1465104256 (1397.23 GiB 1500.27 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 2
    Persistence : Superblock is persistent

seems to contradict this (the output is the same for every disk in the array):

mdadm --examine /dev/sdc2
/dev/sdc2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 1f54d708:60227dd6:163c2a05:89fa2e07 (local to host)
  Creation Time : Wed Mar 14 18:20:52 2012
     Raid Level : raid10
  Used Dev Size : 1465104320 (1397.23 GiB 1500.27 GB)
     Array Size : 2930208640 (2794.46 GiB 3000.53 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 2
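One way to sanity-check the two reported sizes (a rough sketch, assuming the usual RAID10 formula with 2 copies of each block, which is what the 2 in --layout=o2 specifies):

```python
# mdadm prints sizes in 1 KiB blocks. For RAID10 with 2 copies,
# usable array size ~= raid_devices * used_dev_size / copies.
copies = 2

used_detail = 1465104256    # "Used Dev Size" from mdadm --detail
used_examine = 1465104320   # "Used Dev Size" from mdadm --examine

print(5 * used_detail // copies)   # 3662760640 -> matches the --detail Array Size
print(4 * used_examine // copies)  # 2930208640 -> matches the --examine Array Size
```

Oddly, the --examine figure only works out if four devices are assumed rather than five.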

The array was created like this:

mdadm -v --create  /dev/md2 \
  --level=raid10 --layout=o2 --raid-devices=5 \
  --chunk=64 --metadata=0.90 \
 /dev/sdg2 /dev/sdf2 /dev/sde2 /dev/sdd2 /dev/sdc2 

Each of the 5 drives is partitioned like this:

Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders, total 2930277168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00057754

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1            2048       34815       16384   83  Linux
/dev/sdc2           34816  2930243583  1465104384   fd  Linux raid autodetect
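For reference, the partition itself is slightly larger than either reported Used Dev Size (0.90 superblocks live near the end of the device), which simple arithmetic shows:

```python
# Compare the raw partition size to the sizes mdadm reports (all in 1 KiB blocks).
partition_blocks = 1465104384   # from the fdisk output for /dev/sdc2

print(partition_blocks - 1465104320)  # 64  -> gap to --examine's Used Dev Size
print(partition_blocks - 1465104256)  # 128 -> gap to --detail's Used Dev Size
```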

Backstory

So the SATA controller failed in a box I provide some support for. The failure was an ugly one, and individual drives fell out of the array a little at a time. While there are backups, they are not done as frequently as we really need. There is some data that I am trying to recover if I can.

I got additional hardware and was able to access the drives again. The drives appear to be fine, and I can get the array and filesystem active and mounted (in read-only mode). I can access some data on the filesystem and have been copying it off, but I am seeing lots of errors when I try to copy the most recent data.

When I try to access that most recent data, I get errors like the ones below, which makes me think the array-size discrepancy may be the problem.

Mar 14 18:26:04 server kernel: [351588.196299] dm-7: rw=0, want=6619839616, limit=6442450944
Mar 14 18:26:04 server kernel: [351588.196309] attempt to access beyond end of device
Mar 14 18:26:04 server kernel: [351588.196313] dm-7: rw=0, want=6619839616, limit=6442450944
Mar 14 18:26:04 server kernel: [351588.199260] attempt to access beyond end of device
Mar 14 18:26:04 server kernel: [351588.199264] dm-7: rw=0, want=20647626304, limit=6442450944
Mar 14 18:26:04 server kernel: [351588.202446] attempt to access beyond end of device
Mar 14 18:26:04 server kernel: [351588.202450] dm-7: rw=0, want=19973212288, limit=6442450944
Mar 14 18:26:04 server kernel: [351588.205516] attempt to access beyond end of device
Mar 14 18:26:04 server kernel: [351588.205520] dm-7: rw=0, want=8009695096, limit=6442450944
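Converting the kernel's 512-byte sector figures into sizes (a quick sketch; dm-7 is presumably a device-mapper volume layered on top of the array):

```python
# The kernel reports offsets and limits in 512-byte sectors.
SECTOR = 512

limit = 6442450944   # size of dm-7 as the kernel sees it
want = 6619839616    # one of the offending read offsets

print(limit * SECTOR / 2**40)           # 3.0 -> dm-7 is exactly 3 TiB
print((want - limit) * SECTOR / 2**30)  # roughly 84.6 GiB past the end of the device
```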
