Splitting a raidctl mirror safely

Posted by milkfilk on Server Fault
Published on 2010-02-18T19:09:35Z Indexed on 2010/03/27 19:03 UTC

I have a Sun T5220 server with the onboard LSI card and two disks that were in a RAID 1 mirror. The data is not important right now, but one disk failed and we want to work out the procedure properly in case we ever have to recover from a real failure.

The initial situation looked like this:

# raidctl -l c1t0d0
Volume                  Size    Stripe  Status   Cache  RAID
         Sub                     Size                    Level
                 Disk
----------------------------------------------------------------
c1t0d0                  136.6G  N/A     DEGRADED OFF    RAID1
                 0.1.0   136.6G          GOOD
                 N/A     136.6G          FAILED

Green light on the 0.0.0 disk. Running find / lights up the 0.1.0 disk, so I know I have a bad drive and which one it is. The server still boots, obviously.
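For anyone scripting this check, here is a minimal sketch that boils the listing above down to one line per volume and member disk. The function name report_raid_status is made up for illustration, and it assumes the column layout matches the output shown in this post and that a POSIX awk is on the PATH (on Solaris that may mean nawk or /usr/xpg4/bin/awk rather than the default awk).

```shell
#!/bin/sh
# Sketch only: summarise `raidctl -l <volume>` output.
# Assumption: columns are laid out as in the listing shown in this post.

# report_raid_status (hypothetical helper) reads the listing on stdin and
# prints "volume NAME STATUS" followed by "disk ID STATE" for each member.
report_raid_status() {
    awk '
        # Volume row: six fields, the first one shaped like c1t0d0.
        NF == 6 && $1 ~ /^c[0-9]+t[0-9]+d[0-9]+$/ {
            print "volume " $1 " " $4
            next
        }
        # Member row: three fields, the first a disk ID (0.1.0) or N/A.
        NF == 3 && $1 ~ /^([0-9]+\.[0-9]+\.[0-9]+|N\/A)$/ {
            print "disk " $1 " " $3
        }
    '
}
```

Typical use would be raidctl -l c1t0d0 | report_raid_status. A member whose ID shows as N/A or whose state is anything other than GOOD is the one to look at.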

First, we tried putting in a new disk from an unknown source. format would not see it and cfgadm -al would not see it, so raidctl -l would not see it either. I figure it's bad. We then tried a disk from a spare server:

# raidctl -c c1t1d0 c1t0d0  (where t1 is my good disk - 0.1.0)
Disk has occupied space.

The alternative syntax options don't change anything either:

# raidctl -C "0.1.0 0.0.0" -r 1 1
Disk has occupied space.

# raidctl -C "0.1.0 0.0.0" 1
Disk has occupied space.

Ok. Maybe this is because the disk from the spare server had a RAID 1 on it already. Aha, I can see another volume in raidctl:

# raidctl -l
Controller: 1
         Volume:c1t1d0  (this is my server's root mirror)
         Volume:c1t132d0  (this is the foreign root mirror)
         Disk: 0.0.0
         Disk: 0.1.0
         ...
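A small sketch for spotting a foreign volume like this one before touching anything. It assumes the "Volume:NAME" line format from the listing above (without my annotations) and a POSIX awk. The function name list_foreign_volumes is hypothetical, and the name of the volume you expect to see is passed in by hand.

```shell
#!/bin/sh
# Sketch only: list volumes in `raidctl -l` output other than the expected one.
# Assumption: volumes appear as "Volume:NAME" lines, as in this post.

# list_foreign_volumes EXPECTED (hypothetical helper) reads the listing on
# stdin and prints every volume name that is not EXPECTED.
list_foreign_volumes() {
    awk -v keep="$1" '
        /Volume:/ {
            # Drop everything up to and including "Volume:"; the first
            # remaining word is the volume name.
            sub(/^.*Volume:[ \t]*/, "")
            if ($1 != keep) print $1
        }
    '
}
```

Typical use would be raidctl -l | list_foreign_volumes c1t1d0, which on the listing above should print only c1t132d0. It only reports names and does not delete anything.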

No problem. I don't care about the data, I'll just delete the foreign mirror.

# raidctl -d c1t132d0
(warning about data deletion but it works)

At this point, the binaries in /usr/bin freak out. By that I mean ls -l /usr/bin/which shows 1.4k, but cat /usr/bin/which gives me just a newline. Great, did I just blow away the binaries on disk (i.e. only the ones already in memory still work)? I bounce the box and it all comes back fine. WTF. Anyway, back to recreating my mirror.

# raidctl -l
Controller: 1
         Volume:c1t1d0  (this is my server's root mirror)
         Disk: 0.0.0
         Disk: 0.1.0
         ...

The man page says that deleting a mirror will split it. Ok, I'll delete the root mirror.

# raidctl -d c1t0d0
Array in use.  (this might not be the exact error)

I googled this and found that, of course, you can't do this (even with -f) while booted off the mirror. Ok. I booted from the CD (boot cdrom -s) and deleted the volume.

Now I have one disk that has a type of "LSI-Logical-Volume" on c1t1d0 (where my data is) and a brand new "Hitachi 146GB" on c1t0d0 (what I'm trying to mirror to):

(booted off the CD)
# raidctl -c c1t1d0 c1t0d0 (man says it's source destination for mirroring)
Illegal Array Layout.

# raidctl -C "0.1.0 0.0.0" -r 1 1  (alt syntax per man)
Illegal Array Layout.

# raidctl -C "0.1.0 0.0.0" 1  (assumes raid1, no help)
Illegal Array Layout.

The disks are the same size and from the same manufacturer, but I did delete the volume instead of throwing in a blank disk and waiting for it to resync. Maybe this was a critical error. I tried using format to set the type of my good disk to a plain 146GB disk, but that resets the partition table, which I'm pretty sure would wipe the data (bad if this were production).

Am I boned? Does anyone have experience with breaking and resyncing a mirror? There's nothing on Google about "Illegal Array Layout", so here's my contribution to the search gods.

© Server Fault or respective owner