raid1 - Developer IT

Linux software raid fails to include one device for one RAID1 array

- by user1389890

One of my four Linux software raid arrays drops one of its two devices when I reboot my system. The other three arrays work fine. I am running RAID1 on kernel version 2.6.32-5-amd64. Every time I reboot, /dev/md2 comes up with only one device. I can manually add the device by saying $ sudo mdadm /dev/md2 --add /dev/sdc1. This works fine, and mdadm confirms that the device has been re-added as follows: mdadm: re-added /dev/sdc1 After adding the device and and allowing the array time to resynch, this is what the output of $ cat /proc/mdstat looks like: Personalities : [raid1] md3 : active raid1 sda4[0] sdb4[1] 244186840 blocks super 1.2 [2/2] [UU] md2 : active raid1 sdc1[0] sdd1[1] 732574464 blocks [2/2] [UU] md1 : active raid1 sda3[0] sdb3[1] 722804416 blocks [2/2] [UU] md0 : active raid1 sda1[0] sdb1[1] 6835520 blocks [2/2] [UU] unused devices: <none> Then after I reboot, this is what the output of $ cat /proc/mdstat looks like: Personalities : [raid1] md3 : active raid1 sda4[0] sdb4[1] 244186840 blocks super 1.2 [2/2] [UU] md2 : active raid1 sdd1[1] 732574464 blocks [2/1] [_U] md1 : active raid1 sda3[0] sdb3[1] 722804416 blocks [2/2] [UU] md0 : active raid1 sda1[0] sdb1[1] 6835520 blocks [2/2] [UU] unused devices: <none> During reboot, here is the output of $ sudo cat /var/log/syslog | grep mdadm : Jun 22 19:00:08 rook mdadm[1709]: RebuildFinished event detected on md device /dev/md2 Jun 22 19:00:08 rook mdadm[1709]: SpareActive event detected on md device /dev/md2, component device /dev/sdc1 Jun 22 19:00:20 rook kernel: [ 7819.446412] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.446415] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.446782] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.446785] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.515844] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.515847] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.606829] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.606832] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:48 rook kernel: [ 8027.855616] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:48 rook kernel: [ 8027.855620] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:48 rook kernel: [ 8027.855950] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:48 rook kernel: [ 8027.855952] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:49 rook kernel: [ 8027.962169] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:49 rook kernel: [ 8027.962171] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:49 rook kernel: [ 8028.054365] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:49 rook kernel: [ 8028.054368] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.588662] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.588664] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.601990] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.601991] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.602693] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.602695] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.605981] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.605983] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.606138] mdadm: sending ioctl 800c0910 to a partition! Jun 22 19:10:23 rook kernel: [ 9.606139] mdadm: sending ioctl 800c0910 to a partition! Jun 22 19:10:48 rook mdadm[1737]: DegradedArray event detected on md device /dev/md2 Here is the mdadm.conf file: ARRAY /dev/md0 metadata=0.90 UUID=92121d42:37f46b82:926983e9:7d8aad9b ARRAY /dev/md1 metadata=0.90 UUID=9c1bafc3:1762d51d:c1ae3c29:66348110 ARRAY /dev/md2 metadata=0.90 UUID=98cea6ca:25b5f305:49e8ec88:e84bc7f0 ARRAY /dev/md3 metadata=1.2 name=rook:3 UUID=ca3fce37:95d49a09:badd0ddc:b63a4792 I also ran $ sudo smartctl -t long /dev/sdc and no hardware issues were detected. As long as I do not reboot, /dev/md2 seems to work fine. Does anyone have any suggestions? Here is the output of $ sudo mdadm -E /dev/sdc1 after re-adding the device and letting it resync: /dev/sdc1: Magic : a92b4efc Version : 0.90.00 UUID : 98cea6ca:25b5f305:49e8ec88:e84bc7f0 (local to host rook) Creation Time : Sun Jul 13 08:05:55 2008 Raid Level : raid1 Used Dev Size : 732574464 (698.64 GiB 750.16 GB) Array Size : 732574464 (698.64 GiB 750.16 GB) Raid Devices : 2 Total Devices : 2 Preferred Minor : 2 Update Time : Mon Jun 24 07:42:49 2013 State : clean Active Devices : 2 Working Devices : 2 Failed Devices : 0 Spare Devices : 0 Checksum : 5fd6cc13 - correct Events : 180998 Number Major Minor RaidDevice State this 0 8 33 0 active sync /dev/sdc1 0 0 8 33 0 active sync /dev/sdc1 1 1 8 49 1 active sync /dev/sdd1 Here is the output of $ sudo mdadm -D /dev/md2 after re-adding the device and letting it resync: /dev/md2: Version : 0.90 Creation Time : Sun Jul 13 08:05:55 2008 Raid Level : raid1 Array Size : 732574464 (698.64 GiB 750.16 GB) Used Dev Size : 732574464 (698.64 GiB 750.16 GB) Raid Devices : 2 Total Devices : 2 Preferred Minor : 2 Persistence : Superblock is persistent Update Time : Mon Jun 24 07:42:49 2013 State : clean Active Devices : 2 Working Devices : 2 Failed Devices : 0 Spare Devices : 0 UUID : 98cea6ca:25b5f305:49e8ec88:e84bc7f0 (local to host rook) Events : 0.180998 Number Major Minor RaidDevice State 0 8 33 0 active sync /dev/sdc1 1 8 49 1 active sync /dev/sdd1

Read the article

RAID1: Which disk will be mirrored?

- by tmelen

How does a RAID1 system determine which disk to use as the source and which disk to use as the destination when mirroring? Assume for instance the following scenario: A RAID1 array is created with two disks A and B. A is replaced by disk C, which is added to the array. Files are beeing modified as time goes by. Now B is removed and A is reinserted. Will the RAID1 system realize that A and C are out of sync? And that C is more up-to-date than A? And if not, is there a safe way to avoid the mirroring process to start immediately when disk A is inserted?

Read the article

Various problems with software raid1 array built with Samsung 840 Pro SSDs

- by Andy B

I am bringing to ServerFault a problem that is tormenting me for 6+ months. I have a CentOS 6 (64bit) server with an md software raid-1 array with 2 x Samsung 840 Pro SSDs (512GB). Problems: Serious write speed problems: root [~]# time dd if=arch.tar.gz of=test4 bs=2M oflag=sync 146+1 records in 146+1 records out 307191761 bytes (307 MB) copied, 23.6788 s, 13.0 MB/s real 0m23.680s user 0m0.000s sys 0m0.932s When doing the above (or any other larger copy) the load spikes to unbelievable values (even over 100) going up from ~ 1. When doing the above I've also noticed very weird iostat results: Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sda 0.00 1589.50 0.00 54.00 0.00 13148.00 243.48 0.60 11.17 0.46 2.50 sdb 0.00 1627.50 0.00 16.50 0.00 9524.00 577.21 144.25 1439.33 60.61 100.00 md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md2 0.00 0.00 0.00 1602.00 0.00 12816.00 8.00 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 And it keeps it this way until it actually writes the file to the device (out from swap/cache/memory). The problem is that the second SSD in the array has svctm and await roughly 100 times larger than the second. For some reason the wear is different between the 2 members of the array root [~]# smartctl --attributes /dev/sda | grep -i wear 177 Wear_Leveling_Count 0x0013 094% 094 000 Pre-fail Always - 180 root [~]# smartctl --attributes /dev/sdb | grep -i wear 177 Wear_Leveling_Count 0x0013 070% 070 000 Pre-fail Always - 1005 The first SSD has a wear of 6% while the second SSD has a wear of 30%!! It's like the second SSD in the array works at least 5 times as hard as the first one as proven by the first iteration of iostat (the averages since reboot): Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sda 10.44 51.06 790.39 125.41 8803.98 1633.11 11.40 0.33 0.37 0.06 5.64 sdb 9.53 58.35 322.37 118.11 4835.59 1633.11 14.69 0.33 0.76 0.29 12.97 md1 0.00 0.00 1.88 1.33 15.07 10.68 8.00 0.00 0.00 0.00 0.00 md2 0.00 0.00 1109.02 173.12 10881.59 1620.39 9.75 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.41 0.01 3.10 0.02 7.42 0.00 0.00 0.00 0.00 What I've tried: I've updated the firmware to DXM05B0Q (following reports of dramatic improvements for 840Ps after this update). I have looked for "hard resetting link" in dmesg to check for cable/backplane issues but nothing. I have checked the alignment and I believe they are aligned correctly (1MB boundary, listing below) I have checked /proc/mdstat and the array is Optimal (second listing below). root [~]# fdisk -ul /dev/sda Disk /dev/sda: 512.1 GB, 512110190592 bytes 255 heads, 63 sectors/track, 62260 cylinders, total 1000215216 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00026d59 Device Boot Start End Blocks Id System /dev/sda1 2048 4196351 2097152 fd Linux raid autodetect Partition 1 does not end on cylinder boundary. /dev/sda2 * 4196352 4605951 204800 fd Linux raid autodetect Partition 2 does not end on cylinder boundary. /dev/sda3 4605952 814106623 404750336 fd Linux raid autodetect root [~]# fdisk -ul /dev/sdb Disk /dev/sdb: 512.1 GB, 512110190592 bytes 255 heads, 63 sectors/track, 62260 cylinders, total 1000215216 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x0003dede Device Boot Start End Blocks Id System /dev/sdb1 2048 4196351 2097152 fd Linux raid autodetect Partition 1 does not end on cylinder boundary. /dev/sdb2 * 4196352 4605951 204800 fd Linux raid autodetect Partition 2 does not end on cylinder boundary. /dev/sdb3 4605952 814106623 404750336 fd Linux raid autodetect /proc/mdstat root # cat /proc/mdstat Personalities : [raid1] md0 : active raid1 sdb2[1] sda2[0] 204736 blocks super 1.0 [2/2] [UU] md2 : active raid1 sdb3[1] sda3[0] 404750144 blocks super 1.0 [2/2] [UU] md1 : active raid1 sdb1[1] sda1[0] 2096064 blocks super 1.1 [2/2] [UU] unused devices: Running a read test with hdparm root [~]# hdparm -t /dev/sda /dev/sda: Timing buffered disk reads: 664 MB in 3.00 seconds = 221.33 MB/sec root [~]# hdparm -t /dev/sdb /dev/sdb: Timing buffered disk reads: 288 MB in 3.01 seconds = 95.77 MB/sec But look what happens if I add --direct root [~]# hdparm --direct -t /dev/sda /dev/sda: Timing O_DIRECT disk reads: 788 MB in 3.01 seconds = 262.08 MB/sec root [~]# hdparm --direct -t /dev/sdb /dev/sdb: Timing O_DIRECT disk reads: 534 MB in 3.02 seconds = 176.90 MB/sec Both tests increase but /dev/sdb doubles while /dev/sda increases maybe 20%. I just don't know what to make of this. As suggested by Mr. Wagner I've done another read test with dd this time and it confirms the hdparm test: root [/home2]# dd if=/dev/sda of=/dev/null bs=1G count=10 10+0 records in 10+0 records out 10737418240 bytes (11 GB) copied, 38.0855 s, 282 MB/s root [/home2]# dd if=/dev/sdb of=/dev/null bs=1G count=10 10+0 records in 10+0 records out 10737418240 bytes (11 GB) copied, 115.24 s, 93.2 MB/s So sda is 3 times faster than sdb. Or maybe sdb is doing also something else besides what sda does. Is there some way to find out if sdb is doing more than what sda does? UPDATE Again, as suggested by Mr. Wagner, I have swapped the 2 SSDs. And as he thought it would happen, the problem moved from sdb to sda. So I guess I'll RMA one of the SSDs. I wonder if the cage might be problematic. What is wrong with this array? Please help!

Read the article

Rebuilding RAID1 in Ubuntu

- by John Utech

I had my second HD in my RAID1 come up with bad sectors. So I got another drive and pulled out the bad sector drive and put the new drive in. With the original working RAID1 drive in the computer it failed to boot. I manually copied everything from the old drive over via a Gparted Live CD. Still no booting. Kind of scratching my head here as I can see that both of the drives have data on them but are unable to get either of them to boot. I used a Ubuntu live CD and couldn't even manually mount either of the drives, which I thought was really the odd part. Not sure where to go from here.

Read the article

high load average, high wait, dmesg raid error messages (debian nfs server)

- by John Stumbles

Debian 6 on HP proliant (2 CPU) with raid (2*1.5T RAID1 + 2*2T RAID1 joined RAID0 to make 3.5T) running mainly nfs & imapd (plus samba for windows share & local www for previewing web pages); with local ubuntu desktop client mounting $HOME, laptops accessing imap & odd files (e.g. videos) via nfs/smb; boxes connected 100baseT or wifi via home router/switch uname -a Linux prole 2.6.32-5-686 #1 SMP Wed Jan 11 12:29:30 UTC 2012 i686 GNU/Linux Setup has been working for months but prone to intermittently going very slow (user experience on desktop mounting $HOME from server, or laptop playing videos) and now consistently so bad I've had to delve into it to try to find what's wrong(!) Server seems OK at low load e.g. (laptop) client (with $HOME on local disk) connecting to server's imapd and nfs mounting RAID to access 1 file: top shows load ~ 0.1 or less, 0 wait but when (desktop) client mounts $HOME and starts user KDE session (all accessing server) then top shows e.g. top - 13:41:17 up 3:43, 3 users, load average: 9.29, 9.55, 8.27 Tasks: 158 total, 1 running, 157 sleeping, 0 stopped, 0 zombie Cpu(s): 0.4%us, 0.4%sy, 0.0%ni, 49.0%id, 49.7%wa, 0.0%hi, 0.5%si, 0.0%st Mem: 903856k total, 851784k used, 52072k free, 171152k buffers Swap: 0k total, 0k used, 0k free, 476896k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3935 root 20 0 2456 1088 784 R 2 0.1 0:00.02 top 1 root 20 0 2028 680 584 S 0 0.1 0:01.14 init 2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd 3 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/0 4 root 20 0 0 0 0 S 0 0.0 0:00.12 ksoftirqd/0 5 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0 6 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/1 7 root 20 0 0 0 0 S 0 0.0 0:00.16 ksoftirqd/1 8 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/1 9 root 20 0 0 0 0 S 0 0.0 0:00.42 events/0 10 root 20 0 0 0 0 S 0 0.0 0:02.26 events/1 11 root 20 0 0 0 0 S 0 0.0 0:00.00 cpuset 12 root 20 0 0 0 0 S 0 0.0 0:00.00 khelper 13 root 20 0 0 0 0 S 0 0.0 0:00.00 netns 14 root 20 0 0 0 0 S 0 0.0 0:00.00 async/mgr 15 root 20 0 0 0 0 S 0 0.0 0:00.00 pm 16 root 20 0 0 0 0 S 0 0.0 0:00.02 sync_supers 17 root 20 0 0 0 0 S 0 0.0 0:00.02 bdi-default 18 root 20 0 0 0 0 S 0 0.0 0:00.00 kintegrityd/0 19 root 20 0 0 0 0 S 0 0.0 0:00.00 kintegrityd/1 20 root 20 0 0 0 0 S 0 0.0 0:00.02 kblockd/0 21 root 20 0 0 0 0 S 0 0.0 0:00.08 kblockd/1 22 root 20 0 0 0 0 S 0 0.0 0:00.00 kacpid 23 root 20 0 0 0 0 S 0 0.0 0:00.00 kacpi_notify 24 root 20 0 0 0 0 S 0 0.0 0:00.00 kacpi_hotplug 25 root 20 0 0 0 0 S 0 0.0 0:00.00 kseriod 28 root 20 0 0 0 0 S 0 0.0 0:04.19 kondemand/0 29 root 20 0 0 0 0 S 0 0.0 0:02.93 kondemand/1 30 root 20 0 0 0 0 S 0 0.0 0:00.00 khungtaskd 31 root 20 0 0 0 0 S 0 0.0 0:00.18 kswapd0 32 root 25 5 0 0 0 S 0 0.0 0:00.00 ksmd 33 root 20 0 0 0 0 S 0 0.0 0:00.00 aio/0 34 root 20 0 0 0 0 S 0 0.0 0:00.00 aio/1 35 root 20 0 0 0 0 S 0 0.0 0:00.00 crypto/0 36 root 20 0 0 0 0 S 0 0.0 0:00.00 crypto/1 203 root 20 0 0 0 0 S 0 0.0 0:00.00 ksuspend_usbd 204 root 20 0 0 0 0 S 0 0.0 0:00.00 khubd 205 root 20 0 0 0 0 S 0 0.0 0:00.00 ata/0 206 root 20 0 0 0 0 S 0 0.0 0:00.00 ata/1 207 root 20 0 0 0 0 S 0 0.0 0:00.14 ata_aux 208 root 20 0 0 0 0 S 0 0.0 0:00.01 scsi_eh_0 dmesg suggests there's a disk problem: .............. (previous episode) [13276.966004] raid1:md0: read error corrected (8 sectors at 489900360 on sdc7) [13276.966043] raid1: sdb7: redirecting sector 489898312 to another mirror [13279.569186] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0 [13279.569211] ata4.00: irq_stat 0x40000008 [13279.569230] ata4.00: failed command: READ FPDMA QUEUED [13279.569257] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in [13279.569262] res 41/40:00:05:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F> [13279.569306] ata4.00: status: { DRDY ERR } [13279.569321] ata4.00: error: { UNC } [13279.575362] ata4.00: configured for UDMA/133 [13279.575388] ata4: EH complete [13283.169224] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0 [13283.169246] ata4.00: irq_stat 0x40000008 [13283.169263] ata4.00: failed command: READ FPDMA QUEUED [13283.169289] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in [13283.169294] res 41/40:00:07:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F> [13283.169331] ata4.00: status: { DRDY ERR } [13283.169345] ata4.00: error: { UNC } [13283.176071] ata4.00: configured for UDMA/133 [13283.176104] ata4: EH complete [13286.224814] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0 [13286.224837] ata4.00: irq_stat 0x40000008 [13286.224853] ata4.00: failed command: READ FPDMA QUEUED [13286.224879] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in [13286.224884] res 41/40:00:06:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F> [13286.224922] ata4.00: status: { DRDY ERR } [13286.224935] ata4.00: error: { UNC } [13286.231277] ata4.00: configured for UDMA/133 [13286.231303] ata4: EH complete [13288.802623] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0 [13288.802646] ata4.00: irq_stat 0x40000008 [13288.802662] ata4.00: failed command: READ FPDMA QUEUED [13288.802688] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in [13288.802693] res 41/40:00:05:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F> [13288.802731] ata4.00: status: { DRDY ERR } [13288.802745] ata4.00: error: { UNC } [13288.808901] ata4.00: configured for UDMA/133 [13288.808927] ata4: EH complete [13291.380430] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0 [13291.380453] ata4.00: irq_stat 0x40000008 [13291.380470] ata4.00: failed command: READ FPDMA QUEUED [13291.380496] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in [13291.380501] res 41/40:00:05:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F> [13291.380577] ata4.00: status: { DRDY ERR } [13291.380594] ata4.00: error: { UNC } [13291.386517] ata4.00: configured for UDMA/133 [13291.386543] ata4: EH complete [13294.347147] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0 [13294.347169] ata4.00: irq_stat 0x40000008 [13294.347186] ata4.00: failed command: READ FPDMA QUEUED [13294.347211] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in [13294.347217] res 41/40:00:06:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F> [13294.347254] ata4.00: status: { DRDY ERR } [13294.347268] ata4.00: error: { UNC } [13294.353556] ata4.00: configured for UDMA/133 [13294.353583] sd 3:0:0:0: [sdc] Unhandled sense code [13294.353590] sd 3:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [13294.353599] sd 3:0:0:0: [sdc] Sense Key : Medium Error [current] [descriptor] [13294.353610] Descriptor sense data with sense descriptors (in hex): [13294.353616] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 [13294.353635] 23 05 6a 06 [13294.353644] sd 3:0:0:0: [sdc] Add. Sense: Unrecovered read error - auto reallocate failed [13294.353657] sd 3:0:0:0: [sdc] CDB: Read(10): 28 00 23 05 6a 00 00 00 08 00 [13294.353675] end_request: I/O error, dev sdc, sector 587557382 [13294.353726] ata4: EH complete [13294.366953] raid1:md0: read error corrected (8 sectors at 489900544 on sdc7) [13294.366992] raid1: sdc7: redirecting sector 489898496 to another mirror and they're happening quite frequently, which I guess is liable to account for the performance problem(?) # dmesg | grep mirror [12433.561822] raid1: sdc7: redirecting sector 489900464 to another mirror [12449.428933] raid1: sdb7: redirecting sector 489900504 to another mirror [12464.807016] raid1: sdb7: redirecting sector 489900512 to another mirror [12480.196222] raid1: sdb7: redirecting sector 489900520 to another mirror [12495.585413] raid1: sdb7: redirecting sector 489900528 to another mirror [12510.974424] raid1: sdb7: redirecting sector 489900536 to another mirror [12526.374933] raid1: sdb7: redirecting sector 489900544 to another mirror [12542.619938] raid1: sdc7: redirecting sector 489900608 to another mirror [12559.431328] raid1: sdc7: redirecting sector 489900616 to another mirror [12576.553866] raid1: sdc7: redirecting sector 489900624 to another mirror [12592.065265] raid1: sdc7: redirecting sector 489900632 to another mirror [12607.621121] raid1: sdc7: redirecting sector 489900640 to another mirror [12623.165856] raid1: sdc7: redirecting sector 489900648 to another mirror [12638.699474] raid1: sdc7: redirecting sector 489900656 to another mirror [12655.610881] raid1: sdc7: redirecting sector 489900664 to another mirror [12672.255617] raid1: sdc7: redirecting sector 489900672 to another mirror [12672.288746] raid1: sdc7: redirecting sector 489900680 to another mirror [12672.332376] raid1: sdc7: redirecting sector 489900688 to another mirror [12672.362935] raid1: sdc7: redirecting sector 489900696 to another mirror [12674.201177] raid1: sdc7: redirecting sector 489900704 to another mirror [12698.045050] raid1: sdc7: redirecting sector 489900712 to another mirror [12698.089309] raid1: sdc7: redirecting sector 489900720 to another mirror [12698.111999] raid1: sdc7: redirecting sector 489900728 to another mirror [12698.134006] raid1: sdc7: redirecting sector 489900736 to another mirror [12719.034376] raid1: sdc7: redirecting sector 489900744 to another mirror [12734.545775] raid1: sdc7: redirecting sector 489900752 to another mirror [12734.590014] raid1: sdc7: redirecting sector 489900760 to another mirror [12734.624050] raid1: sdc7: redirecting sector 489900768 to another mirror [12734.647308] raid1: sdc7: redirecting sector 489900776 to another mirror [12734.664657] raid1: sdc7: redirecting sector 489900784 to another mirror [12734.710642] raid1: sdc7: redirecting sector 489900792 to another mirror [12734.721919] raid1: sdc7: redirecting sector 489900800 to another mirror [12734.744732] raid1: sdc7: redirecting sector 489900808 to another mirror [12734.779330] raid1: sdc7: redirecting sector 489900816 to another mirror [12782.604564] raid1: sdb7: redirecting sector 1242934216 to another mirror [12798.264153] raid1: sdc7: redirecting sector 1242935080 to another mirror [13245.832193] raid1: sdb7: redirecting sector 489898296 to another mirror [13261.376929] raid1: sdb7: redirecting sector 489898304 to another mirror [13276.966043] raid1: sdb7: redirecting sector 489898312 to another mirror [13294.366992] raid1: sdc7: redirecting sector 489898496 to another mirror although the arrays are still running on all disks - they haven't given up on any yet: # cat /proc/mdstat Personalities : [raid1] [raid0] md10 : active raid0 md0[0] md1[1] 3368770048 blocks super 1.2 512k chunks md1 : active raid1 sde2[2] sdd2[1] 1464087824 blocks super 1.2 [2/2] [UU] md0 : active raid1 sdb7[0] sdc7[2] 1904684920 blocks super 1.2 [2/2] [UU] unused devices: <none> So I think I have some idea what the problem is but I am not a linux sysadmin expert by the remotest stretch of the imagination and would really appreciate some clue checking here with my diagnosis and what do I need to do: obviously I need to source another drive for sdc. (I'm guessing I could buy a larger drive if the price is right: I'm thinking that one day I'll need to grow the size of the array and that would be one less drive to replace with a larger one) then use mdadm to fail out the existing sdc, remove it and fit the new drive fdisk the new drive with the same size partition for the array as the old one had use mdadm to add the new drive into the array that sound OK?

Read the article

Configuring RAID1 on HP Proliant Microserver N54L / Ubuntu 14.04.1 LTS [duplicate]

- by Chris Beach

This question already has an answer here: Cant find my harddrives in ubuntu installation? 2 answers I've bought a N54L and fitted two 3GB drives. Keen to get set up with RAID1. BIOS: SATA controller mode set to "RAID" RAID Option ROM utility: both physical drives set up as one logical drive When I came to install Ubuntu (14.04.1), both drives appeared during the setup process. I was only expecting to see the logical drive, although I'm a complete novice with RAID. I've read that the HP Proliant Microservers don't have "proper" RAID support, and require some kind of driver to be installed. I've tried a few HP utilities from the following apt repo: deb http://downloads.linux.hp.com/SDR/repo/mcp wheezy/current non-free On installation, most say "server not supported" Would appreciate your advice.

Read the article

large RAID 10 vs small RAID1

- by user116399

The machine will store and serve millions of small files (<15Kb each), and all those files require a total storage space of 400G Considering the exact same SATA hard drives maker and models, on the exact same environment (OS, cpu, ram, raid controller, etc...) which one of the setups bellow would be faster? A) RAID 1 with 2 drives of 2T each, making up total storage of 2T B) RAID 10 with 4 drives of 2T each, making up total storage of 4T [EDIT]: I'm aware RAID10 is faster than RAID1. The larger the disk, at least in theory, the longer will take to do seeks/writes. So, will the performance gain of RAID10 will be outweighed by the "drag" caused the larger disk area when seek/write operations happened?

Read the article

RAID1 Broken Mirroring

- by Sanoj

I'm having a little server with Windows Small Business Server 2003. I'm using RAID1, via a HighPoint Rocket RAID 1640 RAID-card, using two harddrives. This week the server alarmed, and durig reboot I got the error-message Broken Mirroring (User Manual page 30). I had a few alternatives (see the manual), first I tried Continue, but the server restarted during boot. Next time I took Power Off, and replaced the oldest harddrive with a new one, and when I booted, I selected Rebuild. Then I selected the new harddrive to be the new one. The rebuild-procedure started and a progress bar at 0% showed up, but after a few seconds I got the message Copy Failed!, then the server booted and Windows Server started. Now it works fine. But I guess that I'm just using one harddrive now, and it's not mirrored. I haven't touched the server since then (two days ago). What should I do now? I have no experience of this situation. Anyone that have some guidance?

Read the article

Very slow write performance on Debian 6.0 (AMD64) with DMCRYPT/LVM/RAID1

- by jdelic

I'm seeing very strange performance characteristics on one of my servers. This server is running a simple two-disk software-RAID1 setup with LVM spanning /dev/md0. One of the logical volumes /dev/vg0/secure is encrypted using dmcrypt with LUKS and mounted with the sync and noatimes flag. Writing to that volume is incredibly slow at 1.8 MB/s and the CPU usage stays near 0%. There are 8 crpyto/1-8 processes running (it's a Intel Quadcore CPU). I hope that someone on serverfault has seen this before :-(. uname -a 2.6.32-5-xen-amd64 #1 SMP Tue Mar 8 00:01:30 UTC 2011 x86_64 GNU/Linux Interestingly, when I read from the device I get good performance numbers: reading without encryption: $ dd if=/dev/vg0/secure of=/dev/null bs=64k count=100000 100000+0 records in 100000+0 records out 6553600000 bytes (6.6 GB) copied, 68.8951 s, 95.1 MB/s reading with encryption: $ dd if=/dev/mapper/secure of=/dev/null bs=64k count=100000 100000+0 records in 100000+0 records out 6553600000 bytes (6.6 GB) copied, 69.7116 s, 94.0 MB/s However, when I try to write to the device: $ dd if=/dev/zero of=./test bs=64k 8809+0 records in 8809+0 records out 577306624 bytes (577 MB) copied, 321.861 s, 1.8 MB/s Also, when I read I see CPU usage, when I write, the CPU stays at almost 0% usage. Here is output of cryptsetup luksDump: LUKS header information for /dev/vg0/secure Version: 1 Cipher name: aes Cipher mode: cbc-essiv:sha256 Hash spec: sha1 Payload offset: 2056 MK bits: 256 MK digest: dd 62 b9 a5 bf 6c ec 23 36 22 92 4c 39 f8 d6 5d c1 3a b7 37 MK salt: cc 2e b3 d9 fb e3 86 a1 bb ab eb 9d 65 df b3 dd d9 6b f4 49 de 8f 85 7d 3b 1c 90 83 5d b2 87 e2 MK iterations: 44500 UUID: a7c9af61-d9f0-4d3f-b422-dddf16250c33 Key Slot 0: ENABLED Iterations: 178282 Salt: 60 24 cb be 5c 51 9f b4 85 64 3d f8 07 22 54 d4 1a 5f 4c bc 4b 82 76 48 d8 a2 d2 6a ee 13 d7 5d Key material offset: 8 AF stripes: 4000 Key Slot 1: DISABLED Key Slot 2: DISABLED Key Slot 3: DISABLED Key Slot 4: DISABLED Key Slot 5: DISABLED Key Slot 6: DISABLED Key Slot 7: DISABLED

Read the article

RAID1 Hard drives with Intel Controller

- by Dave_H

I have an older server with 2 35GB hard drives arranged as a RAID-1 drive. I have 3 open bays, I'd like to add 2 more hard drives to the open bays and have all 4 drives configured as a RAID-1 drive that Windows sees as 1 larger drive. Is this possible? If not, the new drives are 2x as big as the old drives, is it possible to replace 1 of the old drives and rebuild then replace the second drive and rebuild and somehow expand the array to use all of the available space? The server is using an Intel hardware raid controller, the software interface is Intel Storage Console v2.12.

Read the article

Linux Software RAID1 Rebuild Completes, but after reboot, its degraded again

- by zimmy6996

I have been beating my head with an issue here, and I'm now turning to the internet for help. I have a system running Mandrake Linux, with the following configuration: /dev/hda - This is a IDE drive. Has some partitions on it that boot the system and make up most of the file system. /dev/sda - This is drive 1 of 2 for a software raid /dev/md0 /dev/sdb - This is drive 2 of 2 for a software raid /dev/md0 md0 gets mounted but fstab as /data-storage, so it is not critical to the systems ability to boot. We can comment it out of fstab, and the system works just fine either way. The problem is, we have a failed sdb drive. So I shut the box down, and have pulled the failed disk and installed a new disk. When the system boots up, /proc/mdstat shows only sda as part of the raid. I then run the various command to rebuild the RAID to /dev/sdb. Everything rebuilds correctly, and upon completion, you look at /proc/mdstat and it shows 2 drives sda1(0) and sdb1(1). Everything looks great. Then you reboot the box ... UGH!!! Once rebooted, sdb is missing again from the RAID. It is like the rebuild never happened. I can walk through the commands to rebuild it again, and it will work, but again, after reboot, the box seems to make sdb just vanish! The real odd thing is, if after reboot, I pull sda out of the box, and try to get the system to load with the rebuilt sdb drive in the system, and when I do, the system actually throws and error just after grub, and says something about drive error, and the system has to shut down. Thoughts??? I'm starting to wonder if grub has something to do with this mess. That the drive isn't being setup within grub to be visible at boot? This RAID array isn't necessary for the system to boot, but when the replacement drive is in there, without SDA it won't boot system, so it makes me believe there is something to that. On top of that, there just seems to be something wonky here the drive falling off of RAID after reboot. I've hit the point of pounding my head on the keyboard. Any help would be greatly appreciated!!!

Read the article

RAID1: Migrate HDD to SSD?

- by OMG Ponies

My current workstation uses an Adaptec 5805, with Win2008 mirrored between two 72 GB (10K?) savvio drives. My question is if there's a way to migrate the mirror to use SSDs - I've been looking at 90GB Corsair Force (Sandy Bridge) to replace the existing setup. If it's possible, without installing the OS fresh. If I replaced the mirrored drive with an SSD, would the array sync the drives? Then I could promote the SSD mirror to be the primary, and use the second SSD as the mirror. That'd be too easy... Or should use Ghost to get an image of the existing setup, apply it to the SSD for a new mirror to be setup on?

Read the article

RAID1: Migrate HHD to SSD?

- by OMG Ponies

My current workstation uses an Adaptec 5805, with Win2008 mirrored between two 72 GB (10K?) savvio drives. My question is if there's a way to migrate the mirror to use SSDs - I've been looking at 90GB Corsair Force (Sandy Bridge) to replace the existing setup. If it's possible, without installing the OS fresh. If I replaced the mirrored drive with an SSD, would the array sync the drives? Then I could promote the SSD mirror to be the primary, and use the second SSD as the mirror. That'd be too easy... Or should use Ghost to get an image of the existing setup, apply it to the SSD for a new mirror to be setup on?

Read the article

HP Smart Array; Is it possible to convert Raid1 to Raid0 by dropping a failed drive?

- by Erik Heppler

I have a server that was running two 60GB drives as a logical RAID1. At some point the second drive was physically removed and the logical drive has been in "Interim Recovery Mode" for several months now. There's no need for the redundancy of RAID1 on this machine, and I have no intention of replacing the missing drive. If possible I would like to convert the current RAID1 to a single-drive RAID0 by simply dropping the failed drive from the current configuration. I'm only interested in doing this if it can be done in-place. Otherwise I'm perfectly content to leave it in "Interim Recovery Mode" indefinitely.

Read the article

Upon reboot, Linux software raid fails to include one device of a RAID1 array

- by user1389890

One of my four Linux software raid arrays drops one of its two devices when I reboot my system. The other three arrays work fine. I am running RAID1 on kernel version 2.6.32-5-amd64 (Debian Squeeze). Every time I reboot, /dev/md2 comes up with only one device. I can manually add the device by saying $ sudo mdadm /dev/md2 --add /dev/sdc1. This works fine, and mdadm confirms that the device has been re-added as follows: mdadm: re-added /dev/sdc1 After adding the device and allowing the array time to resynch, this is what the output of $ cat /proc/mdstat looks like: Personalities : [raid1] md3 : active raid1 sda4[0] sdb4[1] 244186840 blocks super 1.2 [2/2] [UU] md2 : active raid1 sdc1[0] sdd1[1] 732574464 blocks [2/2] [UU] md1 : active raid1 sda3[0] sdb3[1] 722804416 blocks [2/2] [UU] md0 : active raid1 sda1[0] sdb1[1] 6835520 blocks [2/2] [UU] unused devices: <none> Then after I reboot, this is what the output of $ cat /proc/mdstat looks like: Personalities : [raid1] md3 : active raid1 sda4[0] sdb4[1] 244186840 blocks super 1.2 [2/2] [UU] md2 : active raid1 sdd1[1] 732574464 blocks [2/1] [_U] md1 : active raid1 sda3[0] sdb3[1] 722804416 blocks [2/2] [UU] md0 : active raid1 sda1[0] sdb1[1] 6835520 blocks [2/2] [UU] unused devices: <none> During reboot, here is the output of $ sudo cat /var/log/syslog | grep mdadm : Jun 22 19:00:08 rook mdadm[1709]: RebuildFinished event detected on md device /dev/md2 Jun 22 19:00:08 rook mdadm[1709]: SpareActive event detected on md device /dev/md2, component device /dev/sdc1 Jun 22 19:00:20 rook kernel: [ 7819.446412] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.446415] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.446782] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.446785] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.515844] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.515847] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.606829] mdadm: sending ioctl 1261 to a partition! Jun 22 19:00:20 rook kernel: [ 7819.606832] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:48 rook kernel: [ 8027.855616] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:48 rook kernel: [ 8027.855620] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:48 rook kernel: [ 8027.855950] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:48 rook kernel: [ 8027.855952] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:49 rook kernel: [ 8027.962169] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:49 rook kernel: [ 8027.962171] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:49 rook kernel: [ 8028.054365] mdadm: sending ioctl 1261 to a partition! Jun 22 19:03:49 rook kernel: [ 8028.054368] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.588662] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.588664] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.601990] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.601991] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.602693] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.602695] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.605981] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.605983] mdadm: sending ioctl 1261 to a partition! Jun 22 19:10:23 rook kernel: [ 9.606138] mdadm: sending ioctl 800c0910 to a partition! Jun 22 19:10:23 rook kernel: [ 9.606139] mdadm: sending ioctl 800c0910 to a partition! Jun 22 19:10:48 rook mdadm[1737]: DegradedArray event detected on md device /dev/md2 Here is the result of $ cat /etc/mdadm/mdadm.conf: ARRAY /dev/md0 metadata=0.90 UUID=92121d42:37f46b82:926983e9:7d8aad9b ARRAY /dev/md1 metadata=0.90 UUID=9c1bafc3:1762d51d:c1ae3c29:66348110 ARRAY /dev/md2 metadata=0.90 UUID=98cea6ca:25b5f305:49e8ec88:e84bc7f0 ARRAY /dev/md3 metadata=1.2 name=rook:3 UUID=ca3fce37:95d49a09:badd0ddc:b63a4792 Here is the output of $ sudo mdadm -E /dev/sdc1 after re-adding the device and letting it resync: /dev/sdc1: Magic : a92b4efc Version : 0.90.00 UUID : 98cea6ca:25b5f305:49e8ec88:e84bc7f0 (local to host rook) Creation Time : Sun Jul 13 08:05:55 2008 Raid Level : raid1 Used Dev Size : 732574464 (698.64 GiB 750.16 GB) Array Size : 732574464 (698.64 GiB 750.16 GB) Raid Devices : 2 Total Devices : 2 Preferred Minor : 2 Update Time : Mon Jun 24 07:42:49 2013 State : clean Active Devices : 2 Working Devices : 2 Failed Devices : 0 Spare Devices : 0 Checksum : 5fd6cc13 - correct Events : 180998 Number Major Minor RaidDevice State this 0 8 33 0 active sync /dev/sdc1 0 0 8 33 0 active sync /dev/sdc1 1 1 8 49 1 active sync /dev/sdd1 Here is the output of $ sudo mdadm -D /dev/md2 after re-adding the device and letting it resync: /dev/md2: Version : 0.90 Creation Time : Sun Jul 13 08:05:55 2008 Raid Level : raid1 Array Size : 732574464 (698.64 GiB 750.16 GB) Used Dev Size : 732574464 (698.64 GiB 750.16 GB) Raid Devices : 2 Total Devices : 2 Preferred Minor : 2 Persistence : Superblock is persistent Update Time : Mon Jun 24 07:42:49 2013 State : clean Active Devices : 2 Working Devices : 2 Failed Devices : 0 Spare Devices : 0 UUID : 98cea6ca:25b5f305:49e8ec88:e84bc7f0 (local to host rook) Events : 0.180998 Number Major Minor RaidDevice State 0 8 33 0 active sync /dev/sdc1 1 8 49 1 active sync /dev/sdd1 I also ran $ sudo smartctl -t long /dev/sdc and no hardware issues were detected. As long as I do not reboot, /dev/md2 seems to work fine. Does anyone have any suggestions?

Read the article

Can I rebuild degraded RAID1 on a SAS 5 i controller with a larger mirror drive size?

- by Kenny

A drive failed on a Dell SAS 5i controller - see controller bios screengrab: http://imagebin.ca/view/SkZbszA.html The primary is a 160GB 10k sata drive. I added a 250GB 7k rpm drive in the hope that the array would rebuild onto this drive. This did not happen. (assuming that the controller would just operate at the speed of the slowest drive) The controller could see the new drive, but it didn't automatically rebuild the raid1 onto this drive. (my assumption is that it did not do this rebuild as the drive sizes are different). There was an option to add the new drive to the existing raid1 array - but when I tried this a message appeared stating that all data on the array would be lost. (I didn't get a screenshot of this message, I will later) Should the SAS 5i allow me to rebuild the array onto a larger drive? Is the option to add the drive to the array the right way to go? Many thanks!

Read the article

Can I rebuild degraded RAID1 on a SAS 5 i controller with a larger mirror drive size?

- by Kenny

Hi, A drive failed on a Dell SAS 5i controller - see controller bios screengrab: http://imagebin.ca/view/SkZbszA.html The primary is a 160GB 10k sata drive. I added a 250GB 7k rpm drive in the hope that the array would rebuild onto this drive. This did not happen. (assuming that the controller would just operate at the speed of the slowest drive) The controller could see the new drive, but it didn't automatically rebuild the raid1 onto this drive. (my assumption is that it did not do this rebuild as the drive sizes are different). There was an option to add the new drive to the existing raid1 array - but when I tried this a message appeared stating that all data on the array would be lost. (I didn't get a screenshot of this message, I will later) Should the SAS 5i allow me to rebuild the array onto a larger drive? Is the option to add the drive to the array the right way to go? Many thanks!

Read the article

How to make a Linux software RAID1 detect disc corruption?

- by Paul

This is one of the nightmare days: A virtualized server running on a Linux SW-RAID1 runs a VM that exhibits random segfaults in seemingly random codechunks. While debugging I find that a file gives different md5sums on each and every run. Digging deeper I find this: The raw disc partitions that make up the RAID1 mirror contain 2 bit-differences and ca. 9 sectors are completely empty on one disc and filled with data on the other disc. Obviously Linux gives back a sector from a undeterministically chosen disc of the mirror set. So sometimes the same sector is returned OK, sometimes the corrupted is given back. The docs say: RAID cannot and is not supposed to guard against data corruption on the media. Therefore, it doesn't make any sense either, to purposely corrupt data (using dd for example) on a disk to see how the RAID system will handle that. It is most likely (unless you corrupt the RAID superblock) that the RAID layer will never find out about the corruption, but your filesystem on the RAID device will be corrupted. Thanks. That will help me sleep. :-/ Is there a way to have Linux at least detect this corruption by using sector checksumming or something like that? Would this be detected in a RAID5 setup? Is this the moment I wish I used ZFS or btrfs (once it becomes usable without uber-admin capabilities)?

Read the article

Linux Software RAID1: How to boot after (physically) removing /dev/sda? (LVM, mdadm, Grub2)

- by flight

A server set up with Debian 6.0/squeeze. During the squeeze installation, I configured the two 500GB SATA disks (/dev/sda and /dev/sdb) as a RAID1 (managed with mdadm). The RAID keeps a 500 GB LVM volume group (vg0). In the volume group, there's a single logical volume (lv0). vg0-lv0 is formatted with extfs3 and mounted as root partition (no dedicated /boot partition). The system boots using GRUB2. In normal use, the systems boots fine. Also, when I tried and removed the second SATA drive (/dev/sdb) after a shutdown, the system came up without problem, and after reconnecting the drive, I was able to --re-add /dev/sdb1 to the RAID array. But: After removing the first SATA drive (/dev/sda), the system won't boot any more! A GRUB welcome message shows up for a second, then the system reboots. I tried to install GRUB2 manually on /dev/sdb ("grub-install /dev/sdb"), but that doesn't help. Appearently squeeze fails to set up GRUB2 to launch from the second disk when the first disk is removed, which seems to be quite an essential feature when running this kind of Software RAID1, isn't it? At the moment, I'm lost whether this is a problem with GRUB2, with LVM or with the RAID setup. Any hints?

Read the article

How to delete removed devices from a mdadm RAID1?

- by Kabuto

I had to replace two hard drives in my RAID1. After adding the two new partitions the old ones are still showing up as removed while the new ones are only added as spare. I've had no luck removing the partitions marked as removed. Here's the RAID in question. Note the two devices (0 and 1) with state removed. $ mdadm --detail /dev/md1 mdadm: metadata format 00.90 unknown, ignored. mdadm: metadata format 00.90 unknown, ignored. /dev/md1: Version : 00.90 Creation Time : Thu May 20 12:32:25 2010 Raid Level : raid1 Array Size : 1454645504 (1387.26 GiB 1489.56 GB) Used Dev Size : 1454645504 (1387.26 GiB 1489.56 GB) Raid Devices : 3 Total Devices : 3 Preferred Minor : 1 Persistence : Superblock is persistent Update Time : Tue Nov 12 21:30:39 2013 State : clean, degraded Active Devices : 1 Working Devices : 3 Failed Devices : 0 Spare Devices : 2 UUID : 10d7d9be:a8a50b8e:788182fa:2238f1e4 Events : 0.8717546 Number Major Minor RaidDevice State 0 0 0 0 removed 1 0 0 1 removed 2 8 34 2 active sync /dev/sdc2 3 8 18 - spare /dev/sdb2 4 8 2 - spare /dev/sda2 How do I get rid of these devices and add the new partitions as active RAID devices? Update I seem to have gotten rid of them. My RAID is resyncing, but the two drives are still marked as spares and are number 3 and 4, which looks wrong. I'll have to wait for the resync to finish. All I did was to fix the metadata error by editing my mdadm.conf and rebooting. I tried rebooting before, but this time it worked for whatever reason. Number Major Minor RaidDevice State 3 8 2 0 spare rebuilding /dev/sda2 4 8 18 1 spare rebuilding /dev/sdb2 2 8 34 2 active sync /dev/sdc2

Read the article

Very poor read performance compared to write performance on md(raid1) / crypt(luks) / lvm

- by Android5360

I'm experiencing very poor read performance over raid1/crypt/lvm. In the same time, write speeds are about 2x+ faster on the same setup. On another raid1 setup on the same machine I get normal read speeds (maybe because I'm not using cryptsetup). OS related disks: sda + sdb. I have raid1 configuration with two disks, both are in place. I'm using LVM over the RAID. No encryption. Both disks are WD Green, 5400 rpm. IO test results on this raid1: dd if=/dev/zero of=/tmp/output.img3 bs=8k count=256k conv=fsync - 2147483648 bytes (2.1 GB) copied, 22.3392 s, 96.1 MB/s sync echo 3 > /proc/sys/vm/drop_caches dd if=/tmp/output.img3 of=/dev/null bs=8k - 2147483648 bytes (2.1 GB) copied, 15.9 s, 135 MB/s And here is the problematic setup (on the same machine). Currently I have only one sdc (WD Green, 5400rpm) configured in software raid1 + crypt (luks, serpent-xts-plain) + lvm. Tomorrow I will attach another disk (sdd) to complete this two-disk raid1 setup. IO tests results on this raid1: dd if=/dev/zero of=output.img3 bs=8k count=256k conv=fsync 2147483648 bytes (2.1 GB) copied, 17.7235 s, 121 MB/s sync echo 3 > /proc/sys/vm/drop_caches dd if=output.img3 of=/dev/null bs=8k 2147483648 bytes (2.1 GB) copied, 36.2454 s, 59.2 MB/s We can see that the read performance is very very bad (59MB/s compared to 135MB/s when using no encryption). Nothing is using the disks during benchmark. I can confirm this because I checked with iostat and dstat. Details on the hardware: disks: all are WD green, 5400rpm, 64mb cache. cpu: FX-8350 at stock speed ram: 4x4GB at 1066Mhz. Details on the software: OS: Debian Wheezy 7, amd64 mdadm: v3.2.5 - 18th May 2012 LVM version: 2.02.95(2) (2012-03-06) LVM Library version: 1.02.74 (2012-03-06) LVM Driver version: 4.22.0 cryptsetup: 1.4.3 Here is how I configured the slow raid1+crypt+lvm setup: parted /dev/sdc mklabel gpt type: ext4 start: 2048s end: -1 Now the raid, crypt and the lvm configuration: mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/sdc cryptsetup --cipher serpent-xts-plain luksFormat /dev/md1 cryptsetup luksOpen /dev/md1 md1_crypt vgcreate vg_sql /dev/mapper/md1_crypt lvcreate -l 100%VG vg_sql -n lv_sql mkfs.ext4 /dev/mapper/vg_sql-lv-sql mount /dev/mapper/vg_sql-lv_sql /sql So guys, can you help me identify the reason and fix it? It has to be something with the cryptsetup as there is no such read slowdown on the other setup (sda+sdb) where no encryption is present. But I have no idea what to do. Thanks!

Read the article

Linux software raid robustness for raid1 vs other raid levels

- by Waxhead

I have a raid5 running and now also a raid1 that I set up yesterday. Since raid5 calculates parity it should be able to catch silent data corruption on one disk. However for raid1 the disks are just mirrors. The more I think about it I figure that raid1 is actually quite risky. Sure it will save me from a disk failure but i might not be as good when it comes to protecting the data on disk (who is actually more important for me). How does Linux software raid actually store raid1 type data on disk? How does it know what spindle is giving corrupt data (if the disk(subsystem) is not reporting any errors) If raid1 really is not giving me data protection but rather disk protection is there some tricks I can do with mdadm to create a two disk "raid5 like" setup? e.g. loose capacity but still keep redundancy also for data!?

Read the article

RAID1: can't replace faulty spare (marked again as 'faulty spare' within seconds)

- by user212475

I got a problem that I cannot solve: Our fileserver runs XUbuntu and 3 RAID1s. One has a problem since monday: it consists of sdb and sdc. sdb was marked as faulty by mdadm for unknown reasons. I used --remove to remove it from the RAID and then to add it by --add. All was fine, re-syncing started but never got above 0% and after a few seconds, sdb was again marked as 'faulty spare' (and therefore the RAID degraded, but clean). So I saved the first 512 byte of the old sdb to a file, bought a new HDD of same size (4TB), shut down the computer and replaced sdb physically, switched the computer back on and wrote the 512 byte back to the new drive to have the same partition info as the old drive (both are the same type, from same company). But the new drive shows the same behaviour as the old: I can add, re-syncing starts and after a few seconds its marked as 'faulty spare'. Here exactly what i did: mdadm --remove /dev/md/1 /dev/sdb maadm --detail /dev/md/1 gives me: /dev/md/1: Version : 1.2 Creation Time : Sat Jun 8 22:32:05 2013 Raid Level : raid1 Array Size : 3906887360 (3725.90 GiB 4000.65 GB) Used Dev Size : 3906887360 (3725.90 GiB 4000.65 GB) Raid Devices : 2 Total Devices : 1 Persistence : Superblock is persistent Update Time : Thu Nov 7 06:56:13 2013 State : clean, degraded Active Devices : 1 Working Devices : 1 Failed Devices : 0 Spare Devices : 0 Name : File-Server:1 (local to host File-Server) UUID : 44ed561f:b733e946:e69820f4:aba9b223 Events : 2424 Number Major Minor RaidDevice State 0 0 0 0 removed 1 8 32 1 active sync /dev/sdc mdadm --add /dev/md/1 /dev/sdb mdadm --detail /dev/md/1 gives me: Version : 1.2 Creation Time : Sat Jun 8 22:32:05 2013 Raid Level : raid1 Array Size : 3906887360 (3725.90 GiB 4000.65 GB) Used Dev Size : 3906887360 (3725.90 GiB 4000.65 GB) Raid Devices : 2 Total Devices : 2 Persistence : Superblock is persistent Update Time : Thu Nov 7 06:57:49 2013 State : clean, degraded, recovering Active Devices : 1 Working Devices : 1 Failed Devices : 1 Spare Devices : 0 Rebuild Status : 0% complete Name : File-Server:1 (local to host File-Server) UUID : 44ed561f:b733e946:e69820f4:aba9b223 Events : 2431 Number Major Minor RaidDevice State 2 8 16 0 faulty spare rebuilding /dev/sdb 1 8 32 1 active sync /dev/sdc and after a few seconds: /dev/md/1: Version : 1.2 Creation Time : Sat Jun 8 22:32:05 2013 Raid Level : raid1 Array Size : 3906887360 (3725.90 GiB 4000.65 GB) Used Dev Size : 3906887360 (3725.90 GiB 4000.65 GB) Raid Devices : 2 Total Devices : 2 Persistence : Superblock is persistent Update Time : Thu Nov 7 06:57:50 2013 State : clean, degraded Active Devices : 1 Working Devices : 1 Failed Devices : 1 Spare Devices : 0 Name : File-Server:1 (local to host File-Server) UUID : 44ed561f:b733e946:e69820f4:aba9b223 Events : 2436 Number Major Minor RaidDevice State 0 0 0 0 removed 1 8 32 1 active sync /dev/sdc 2 8 16 - faulty spare /dev/sdb same behaviour if I zero the superblock (mdadm --zero-superblock /dev/sdb) before adding sdb. I do all commands as root and the system holds 3 more 4TB drives, ie the mainboard can handle them. The old harddrive was checked for errors using badblocks, but all is fine. Does anybody have any idea, what the problem is?

Read the article

mdadm - Recovering a 'split' RAID1 array

- by Hamza

I have two drives that used to be part of a single RAID1 volume but it appears that one of them went offline for some time, something I've noticed just now when I rebooted my system. I now seem to have two RAID volumes, as reported by: # cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md126 : active raid1 sdc[1] 2096116 blocks super 1.2 [2/1] [_U] md127 : active (auto-read-only) raid1 sdb[0] 2096116 blocks super 1.2 [2/1] [U_] unused devices: <none> Not exactly sure where to go from here. How can I merge and re-sync these volumes without data loss? Thanks.

Read the article

Unable to access intel fake RAID 1 array in Fedora 14 after reboot

- by Sim

Hello everyone, 1st I am relatively new to linux (but not to *nix). I have 4 disks assembled in the following intel ahci bios fake raid arrays: 2x320GB RAID1 - used for operating systems md126 2x1TB RAID1 - used for data md125 I have used the raid of size 320GB to install my operating system and the second raid I didn't even select during the installation of Fedora 14. After successful partitioning and installation of Fedora, I tried to make the second array available, it was possible to make it visible in linux with mdadm --assembe --scan , after that I created one maximum size partition and 1 maximum size ext4 filesystem in it. Mounted, and used it. After restart - a few I/O errors during boot regarding md125 + inability to mount the filesystem on it and dropped into repair shell. I commented the filesystem in fstab and it booted. To my surprise, the array was marked as "auto read only": [root@localhost ~]# cat /proc/mdstat Personalities : [raid1] md125 : active (auto-read-only) raid1 sdc[1] sdd[0] 976759808 blocks super external:/md127/0 [2/2] [UU] md127 : inactive sdc[1](S) sdd[0](S) 4514 blocks super external:imsm md126 : active raid1 sda[1] sdb[0] 312566784 blocks super external:/md1/0 [2/2] [UU] md1 : inactive sdb[1](S) sda[0](S) 4514 blocks super external:imsm unused devices: <none> [root@localhost ~]# and the partition in it was not available as device special file in /dev: [root@localhost ~]# ls -l /dev/md125* brw-rw---- 1 root disk 9, 125 Jan 6 15:50 /dev/md125 [root@localhost ~]# But the partition is there according to fdisk: [root@localhost ~]# fdisk -l /dev/md125 Disk /dev/md125: 1000.2 GB, 1000202043392 bytes 19 heads, 10 sectors/track, 10281682 cylinders, total 1953519616 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x1b238ea9 Device Boot Start End Blocks Id System /dev/md125p1 2048 1953519615 976758784 83 Linux [root@localhost ~]# I tried to "activate" the array in different ways (I'm not experienced with mdadm and the man page is gigantic so I was only browsing it looking for my answer) but it was impossible - the array would still stay in "auto read only" and the device special file for the partition it will not be in /dev. It was only after I recreated the partition via fdisk that it reappeared in /dev... until next reboot. So, my question is - How do I make the array automatically available after reboot? Here is some additional information: 1st I am able to see the UUID of the array in blkid: [root@localhost ~]# blkid /dev/sdc: UUID="b9a1149f-ae11-4fc8-a600-0d77354dc42a" SEC_TYPE="ext2" TYPE="ext3" /dev/sdd: UUID="b9a1149f-ae11-4fc8-a600-0d77354dc42a" SEC_TYPE="ext2" TYPE="ext3" /dev/md126p1: UUID="60C8D9A7C8D97C2A" TYPE="ntfs" /dev/md126p2: UUID="3d1b38a3-b469-4b7c-b016-8abfb26a5d7d" TYPE="ext4" /dev/md126p3: UUID="1Msqqr-AAF8-k0wi-VYnq-uWJU-y0OD-uIFBHL" TYPE="LVM2_member" /dev/mapper/vg00-rootlv: LABEL="_Fedora-14-x86_6" UUID="34cc1cf5-6845-4489-8303-7a90c7663f0a" TYPE="ext4" /dev/mapper/vg00-swaplv: UUID="4644d857-e13b-456c-ac03-6f26299c1046" TYPE="swap" /dev/mapper/vg00-homelv: UUID="82bd58b2-edab-4b4b-aec4-b79595ecd0e3" TYPE="ext4" /dev/mapper/vg00-varlv: UUID="1b001444-5fdd-41b6-a59a-9712ec6def33" TYPE="ext4" /dev/mapper/vg00-tmplv: UUID="bf7d2459-2b35-4a1c-9b81-d4c4f24a9842" TYPE="ext4" /dev/md125: UUID="b9a1149f-ae11-4fc8-a600-0d77354dc42a" SEC_TYPE="ext2" TYPE="ext3" /dev/sda: TYPE="isw_raid_member" /dev/md125p1: UUID="420adfdd-6c4e-4552-93f0-2608938a4059" TYPE="ext4" [root@localhost ~]# Here is how /etc/mdadm.conf looks like: [root@localhost ~]# cat /etc/mdadm.conf # mdadm.conf written out by anaconda MAILADDR root AUTO +imsm +1.x -all ARRAY /dev/md1 UUID=89f60dee:e46a251f:7475814b:d4cc19a9 ARRAY /dev/md126 UUID=a8775c90:cee66376:5310fc13:63bcba5b ARRAY /dev/md125 UUID=b9a1149f:ae114fc8:a6000d77:354dc42a [root@localhost ~]# here is how /proc/mdstat looks like after I recreate the partition in the array so that it becomes available: [root@localhost ~]# cat /proc/mdstat Personalities : [raid1] md125 : active raid1 sdc[1] sdd[0] 976759808 blocks super external:/md127/0 [2/2] [UU] md127 : inactive sdc[1](S) sdd[0](S) 4514 blocks super external:imsm md126 : active raid1 sda[1] sdb[0] 312566784 blocks super external:/md1/0 [2/2] [UU] md1 : inactive sdb[1](S) sda[0](S) 4514 blocks super external:imsm unused devices: <none> [root@localhost ~]# Detailed output regarding the array in subject: [root@localhost ~]# mdadm --detail /dev/md125 /dev/md125: Container : /dev/md127, member 0 Raid Level : raid1 Array Size : 976759808 (931.51 GiB 1000.20 GB) Used Dev Size : 976759940 (931.51 GiB 1000.20 GB) Raid Devices : 2 Total Devices : 2 Update Time : Fri Jan 7 00:38:00 2011 State : clean Active Devices : 2 Working Devices : 2 Failed Devices : 0 Spare Devices : 0 UUID : 30ebc3c2:b6a64751:4758d05c:fa8ff782 Number Major Minor RaidDevice State 1 8 32 0 active sync /dev/sdc 0 8 48 1 active sync /dev/sdd [root@localhost ~]# and /etc/fstab, with /data commented (the filesystem that is on this array): # # /etc/fstab # Created by anaconda on Thu Jan 6 03:32:40 2011 # # Accessible filesystems, by reference, are maintained under '/dev/disk' # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info # /dev/mapper/vg00-rootlv / ext4 defaults 1 1 UUID=3d1b38a3-b469-4b7c-b016-8abfb26a5d7d /boot ext4 defaults 1 2 #UUID=420adfdd-6c4e-4552-93f0-2608938a4059 /data ext4 defaults 0 1 /dev/mapper/vg00-homelv /home ext4 defaults 1 2 /dev/mapper/vg00-tmplv /tmp ext4 defaults 1 2 /dev/mapper/vg00-varlv /var ext4 defaults 1 2 /dev/mapper/vg00-swaplv swap swap defaults 0 0 tmpfs /dev/shm tmpfs defaults 0 0 devpts /dev/pts devpts gid=5,mode=620 0 0 sysfs /sys sysfs defaults 0 0 proc /proc proc defaults 0 0 [root@localhost ~]# Thanks in advance to everyone that even read this whole issue :-)

Search Results

Search found 303 results on 13 pages for 'raid1'.

Page 1/13 | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by user1389890

- by tmelen

- by Andy B

- by John Utech

- by John Stumbles

- by Chris Beach

- by user116399

- by Sanoj

- by jdelic

- by Dave_H

- by zimmy6996

- by OMG Ponies

- by OMG Ponies

- by Erik Heppler

- by user1389890

- by Kenny

- by Kenny

- by Paul

- by flight

- by Kabuto

- by Android5360

- by Waxhead

- by user212475

- by Hamza

- by Sim

1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >