Reconstructing the disk order in RAID 6 with 7 disks

A little background to this question first: I am running a RAID-6 within a QNAP TS869L external RAID/NAS system. I started with 5 disks of 3 TB each back in the day, and later added another 2 disks of 3 TB to the RAID. The QNAP internals handled the growing and re-syncing, and everything seemed to be perfectly fine.

About 2 weeks ago, I had one of the disks (disk #5; disk #2 has gone bad in the meantime) fail, and somehow (I have no idea why) disks #1 and #2 also got kicked out of the array. I replaced disk #5, but the RAID didn't start working again.

After some calls to QNAP technical support, they re-created the array (using mdadm --create --force --assume-clean ...), but the resulting array couldn't find a filesystem, and I was kindly referred to a data recovery company that I can't afford.
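In case it matters: the kind of read-only checks I have in mind when I say "couldn't find a filesystem" are roughly the following (nothing here writes to the array; /dev/md0 is simply the device name the array gets on the QNAP):

# read-only checks, neither of these writes to /dev/md0
dumpe2fs -h /dev/md0      # try to read just the ext4 superblock header
fsck.ext4 -n /dev/md0     # -n opens the filesystem read-only and answers "no" to everything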

After some digging through old log files, resetting the disk to factory defaults, etc., I found a few errors that were made during this re-create. I wish I still had some of the original metadata, but unfortunately I don't (I definitely learned that lesson).

I'm currently at the point where I know the correct chunk size (64K) and metadata version (1.0; the factory default was 0.9, but from what I read 0.9 doesn't handle disks over 2 TB, and mine are 3 TB), and I can now find the ext4 filesystem that should be on the disks.
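A cheap first check I'm using for any candidate assembly, in case it helps: the ext superblock sits 1024 bytes into the filesystem and carries the magic 0xEF53 at offset 56 within it, so the bytes "53 ef" should appear at absolute offset 1080 of a correctly assembled array (which of course only validates the very beginning of the array, not the full order):

# look for the little-endian ext4 magic 0xEF53 at byte offset 1080 of the array
dd if=/dev/md0 bs=1 skip=1080 count=2 2>/dev/null | hexdump -C
# a correct start of the array should print something like:  00000000  53 ef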

The only variable left to determine is the right disk order!

I started using the description found in answer #4 of "Recover RAID 5 data after created new array instead of re-using", but I am a little confused about what the order should be for a proper RAID-6. RAID-5 is pretty well documented in a number of places, but RAID-6 much less so.
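To make the question more concrete, the brute-force approach I have in mind (following that answer) is to test one candidate order at a time with --assume-clean and a read-only fsck, roughly like the untested sketch below. The device names and the positions of the two "missing" slots are taken from my current best guess further down; with 7 positions there are obviously a lot of permutations, so the orders themselves would have to be generated by a small helper script. As far as I understand, --create with --assume-clean only rewrites the superblocks and not the data, but see the overlay idea at the very bottom of the post.

# untested sketch: create the array with one candidate order (no resync thanks
# to --assume-clean), then check read-only whether a filesystem shows up
try_order() {
    mdadm --stop /dev/md0 2>/dev/null
    mdadm --create /dev/md0 --run --assume-clean --level=6 --raid-devices=7 \
          --metadata=1.0 --chunk=64 "$@" \
    && fsck.ext4 -n /dev/md0 >/dev/null 2>&1 \
    && echo "plausible order: $*"
}

# my current best guess (slots 2 and 4 are the two dead disks):
try_order /dev/sda3 /dev/sdb3 missing /dev/sdd3 missing /dev/sdg3 /dev/sdf3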

Also, does the layout, i.e. the distribution of parity and data chunks across the disks, change when the array is grown from 5 to 7 disks, or does the re-sync re-organize it the way a native 7-disk RAID-6 would have laid it out?
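For reference, /proc/mdstat below reports "algorithm 2" for md0, which as far as I know is mdadm's default left-symmetric layout; this is how I would double-check what the running array thinks it is using (read-only):

# layout / parity algorithm of the currently assembled array
mdadm --detail /dev/md0 | grep -i layout
cat /sys/block/md0/md/layout     # 2 = left-symmetric, the mdadm default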

Thanks


Some more mdadm output that might be helpful:

mdadm version:

[~] # mdadm --version
mdadm - v2.6.3 - 20th August 2007

mdadm details from one of the disks in the array:

[~] # mdadm --examine /dev/sda3 
/dev/sda3:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x0
     Array UUID : 1c1614a5:e3be2fbb:4af01271:947fe3aa
           Name : 0
  Creation Time : Tue Jun 10 10:27:58 2014
     Raid Level : raid6
   Raid Devices : 7

  Used Dev Size : 5857395112 (2793.02 GiB 2998.99 GB)
     Array Size : 29286975360 (13965.12 GiB 14994.93 GB)
      Used Size : 5857395072 (2793.02 GiB 2998.99 GB)
   Super Offset : 5857395368 sectors
          State : clean
    Device UUID : 7c572d8f:20c12727:7e88c888:c2c357af

    Update Time : Tue Jun 10 13:01:06 2014
       Checksum : d275c82d - correct
         Events : 7036

     Chunk Size : 64K

    Array Slot : 0 (0, 1, failed, 3, failed, 5, 6)
   Array State : Uu_u_uu 2 failed
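The same fields from all seven member partitions can be collected like this (read-only); note that these superblocks were rewritten by the --create, so they reflect my current guess rather than the original order:

# dump the slot/state fields of every member partition
for d in /dev/sd[a-g]3; do
    echo "== $d =="
    mdadm --examine $d | grep -E 'Array Slot|Array State|Device UUID'
done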

mdadm details for the array in the current disk order (based on my best guess, reconstructed from old log files):

[~] # mdadm --detail /dev/md0
/dev/md0:
        Version : 01.00.03
  Creation Time : Tue Jun 10 10:27:58 2014
     Raid Level : raid6
     Array Size : 14643487680 (13965.12 GiB 14994.93 GB)
  Used Dev Size : 2928697536 (2793.02 GiB 2998.99 GB)
   Raid Devices : 7
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Jun 10 13:01:06 2014
          State : clean, degraded
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 64K

           Name : 0
           UUID : 1c1614a5:e3be2fbb:4af01271:947fe3aa
         Events : 7036

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       0        0        2      removed
       3       8       51        3      active sync   /dev/sdd3
       4       0        0        4      removed
       5       8       99        5      active sync   /dev/sdg3
       6       8       83        6      active sync   /dev/sdf3

Output from /proc/mdstat (md8, md9, and md13 are RAIDs used internally, holding swap etc.; the one I'm after is md0):

[~] # more /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md0 : active raid6 sdf3[6] sdg3[5] sdd3[3] sdb3[1] sda3[0]
      14643487680 blocks super 1.0 level 6, 64k chunk, algorithm 2 [7/5] [UU_U_UU]

md8 : active raid1 sdg2[2](S) sdf2[3](S) sdd2[4](S) sdc2[5](S) sdb2[6](S) sda2[1] sde2[0]
      530048 blocks [2/2] [UU]

md13 : active raid1 sdg4[3] sdf4[4] sde4[5] sdd4[6] sdc4[2] sdb4[1] sda4[0]
      458880 blocks [8/7] [UUUUUUU_]
      bitmap: 21/57 pages [84KB], 4KB chunk

md9 : active raid1 sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sda1[0] sdb1[1]
      530048 blocks [8/7] [UUUUUUU_]
      bitmap: 37/65 pages [148KB], 4KB chunk

unused devices: <none>
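One more thing, in case it matters for any suggested procedure: before trying another order I would stop md0 first, and, if losetup/dmsetup are available on the QNAP, point further --create attempts at copy-on-write overlays instead of the real partitions, so nothing else gets overwritten. Untested sketch (file paths and overlay sizes are just examples):

mdadm --stop /dev/md0

# non-persistent dm snapshots on top of the member partitions: any writes
# (new superblocks etc.) go into sparse files instead of the real disks
for d in a b c d e f g; do
    truncate -s 2G /share/overlay-sd${d}3.cow
    loop=$(losetup -f --show /share/overlay-sd${d}3.cow)
    size=$(blockdev --getsz /dev/sd${d}3)         # size in 512-byte sectors
    echo "0 $size snapshot /dev/sd${d}3 $loop N 8" | dmsetup create cow-sd${d}3
done
# ...and then run mdadm --create against /dev/mapper/cow-sd?3 instead of /dev/sd?3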
