Search Results

Search found 66 results on 3 pages for 'multipath'.

Page 1/3

  • Ubuntu 10.04 & IBM DS3524 with FC multipath, inactive path is [failed][faulty] instead of [active][ghost]

    - by Graeme Donaldson
    OK, this is my setup: FC Switches IBM/Brocade, Switch1 and Switch2, independent fabrics. Server IBM x3650 M2, 2x QLogic QLE2460, 1 connected to each FC Switch. Storage IBM DS3524, 2x controllers with 4x FC ports each, but only 2x connected on each. +-----------------------------------------------------------------------+ | HBA1 Server HBA2 | +-----------------------------------------------------------------------+ | | | | | | +-----------------------------+ +------------------------------+ | Switch1 | | Switch2 | +-----------------------------+ +------------------------------+ | | | | | | | | | | | | | | | | | | | | +-----------------------------------+-----------------------------------+ | Contr A, port 3 | Contr A, port 4 | Contr B, port 3 | Contr B, port 4 | +-----------------------------------+-----------------------------------+ | Storage | +-----------------------------------------------------------------------+ My /etc/multipath.conf is from the IBM redbook for the DS3500, except I use a different setting for prio_callout, IBM uses /sbin/mpath_prio_tpc, but according to http://changelogs.ubuntu.com/changelogs/pool/main/m/multipath-tools/multipath-tools_0.4.8-7ubuntu2/changelog, this was renamed to /sbin/mpath_prio_rdac, which I'm using. devices { device { #ds3500 vendor "IBM" product "1746 FAStT" hardware_handler "1 rdac" path_checker rdac failback 0 path_grouping_policy multibus prio_callout "/sbin/mpath_prio_rdac /dev/%n" } } multipaths { multipath { wwid xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx alias array07 path_grouping_policy multibus path_checker readsector0 path_selector "round-robin 0" failback "5" rr_weight priorities no_path_retry "5" } } The output of multipath -ll with controller A as the preferred path: root@db06:~# multipath -ll sdg: checker msg is "directio checker reports path is down" sdh: checker msg is "directio checker reports path is down" array07 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) dm-2 IBM ,1746 FASt [size=4.9T][features=1 queue_if_no_path][hwhandler=0] \_ round-robin 0 [prio=2][active] \_ 5:0:1:0 sdd 8:48 [active][ready] \_ 5:0:2:0 sde 8:64 [active][ready] \_ 6:0:1:0 sdg 8:96 [failed][faulty] \_ 6:0:2:0 sdh 8:112 [failed][faulty] If I change the preferred path using IBM DS Storage Manager to Controller B, the output swaps accordingly: root@db06:~# multipath -ll sdd: checker msg is "directio checker reports path is down" sde: checker msg is "directio checker reports path is down" array07 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) dm-2 IBM ,1746 FASt [size=4.9T][features=1 queue_if_no_path][hwhandler=0] \_ round-robin 0 [prio=2][active] \_ 5:0:1:0 sdd 8:48 [failed][faulty] \_ 5:0:2:0 sde 8:64 [failed][faulty] \_ 6:0:1:0 sdg 8:96 [active][ready] \_ 6:0:2:0 sdh 8:112 [active][ready] According to IBM, the inactive path should be "[active][ghost]", not "[failed][faulty]". 
Despite this, I don't seem to have any I/O issues, but my syslog is being spammed with this every 5 seconds: Jun 1 15:30:09 db06 multipathd: sdg: directio checker reports path is down Jun 1 15:30:09 db06 kernel: [ 2350.282065] sd 6:0:2:0: [sdh] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE Jun 1 15:30:09 db06 kernel: [ 2350.282071] sd 6:0:2:0: [sdh] Sense Key : Illegal Request [current] Jun 1 15:30:09 db06 kernel: [ 2350.282076] sd 6:0:2:0: [sdh] <<vendor>> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1 Jun 1 15:30:09 db06 kernel: [ 2350.282083] sd 6:0:2:0: [sdh] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00 Jun 1 15:30:09 db06 kernel: [ 2350.282092] end_request: I/O error, dev sdh, sector 0 Jun 1 15:30:10 db06 multipathd: sdh: directio checker reports path is down Jun 1 15:30:14 db06 kernel: [ 2355.312270] sd 6:0:1:0: [sdg] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE Jun 1 15:30:14 db06 kernel: [ 2355.312277] sd 6:0:1:0: [sdg] Sense Key : Illegal Request [current] Jun 1 15:30:14 db06 kernel: [ 2355.312282] sd 6:0:1:0: [sdg] <<vendor>> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1 Jun 1 15:30:14 db06 kernel: [ 2355.312290] sd 6:0:1:0: [sdg] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00 Jun 1 15:30:14 db06 kernel: [ 2355.312299] end_request: I/O error, dev sdg, sector 0 Does anyone know how I can get the inactive path to show "[active][ghost]" instead of "[failed][faulty]"? I assume that once I can get that right then the spam in my syslog will end as well. One final thing worth mentioning is that the IBM redbook doc targets SLES 11 so I'm assuming there's something a little different under Ubuntu that I just haven't figured out yet. Update: As suggested by Mitch, I've tried removing /etc/multipath.conf, and now the output of multipath -ll looks like this: root@db06:~# multipath -ll sdg: checker msg is "directio checker reports path is down" sdh: checker msg is "directio checker reports path is down" xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxdm-1 IBM ,1746 FASt [size=4.9T][features=0][hwhandler=0] \_ round-robin 0 [prio=1][active] \_ 5:0:2:0 sde 8:64 [active][ready] \_ round-robin 0 [prio=1][enabled] \_ 5:0:1:0 sdd 8:48 [active][ready] \_ round-robin 0 [prio=0][enabled] \_ 6:0:1:0 sdg 8:96 [failed][faulty] \_ round-robin 0 [prio=0][enabled] \_ 6:0:2:0 sdh 8:112 [failed][faulty] So its more or less the same, with the same message in the syslog every 5 minutes as before, but the grouping has changed.
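    A hedged point of comparison: on RDAC-class arrays like the DS3500/DS3524, the directio/readsector0 checkers read sector 0 on the passive controller and get exactly the Illegal Request (ASC 0x94) seen in the syslog above, so those paths are marked failed; the rdac checker asks the controller for path state instead and reports the standby paths as ghost. A sketch of a device stanza along those lines, assuming multipath-tools 0.4.8 as shipped with Ubuntu 10.04 (settings are illustrative, not the vendor-blessed values):

        devices {
            device {
                vendor                 "IBM"
                product                "1746"
                hardware_handler       "1 rdac"
                path_checker           rdac
                path_grouping_policy   group_by_prio
                prio_callout           "/sbin/mpath_prio_rdac /dev/%n"
                failback               immediate
                no_path_retry          30
            }
        }

    The per-LUN multipaths{} entry quoted above also requests path_checker readsector0; whether or not that particular option is honoured there, the logs show a directio-style checker being used on sdg/sdh, which is what produces the failed/faulty state on the standby controller.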

    Read the article

  • Debian and Multipath IO problem

    - by tearman
    Basically the situation is this: I have a box running Debian. The box internally has an Intel SCSI RAID controller which is controlling 2 hard drives in RAID1 mode, which is where the OS is installed. Further, I have a QLogic fibre channel adapter that connects the unit to a Fibre Channel SAN. My installation process is to install Debian to the local drives and leave the QLogic firmware out of it for the time being. Then, once I get the unit online, I install the firmware drivers. This flops my internal drives from /dev/sda to /dev/sdc, which is a bit annoying but recoverable. I probably should address these by UUID anyway. Once I get back online, I have to install multipath-tools (the framework is a multipath framework). However, once I reboot the machine again, it fails on boot after discovering multipath targets, saying my local drives are busy and cannot be mounted to /root. Any idea what the problem may be here? Or at least, how can I disable multipath until after the unit boots, or have it ignore the internal drives?
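    One approach that is sometimes used for this is to blacklist the internal RAID volume in /etc/multipath.conf so multipath-tools never claims it, and to blacklist it by WWID rather than by device node, since the sdX names move around once the QLogic driver loads. A sketch, where the WWID string is a placeholder that would come from scsi_id (exact scsi_id flags vary with the udev version) or from multipath -v3 output:

        blacklist {
            # placeholder WWID for the internal Intel RAID1 volume
            wwid "3600508e000000000aaaabbbbccccdddd"
        }

    If the initramfs also runs multipath (for example after installing multipath-tools-boot), it needs to be regenerated after editing the blacklist so the change applies before /root is mounted.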

    Read the article

  • How to get multipath working for Ubuntu Server 12.04

    - by mlampi
    I'm working on a project which aims to make use of Ubuntu servers running on enterprise-class hardware. In our case that means IBM HS23E blade servers, QLogic 4Gb fibre channel expansion cards and a quite old IBM DS4500 disk array with two controllers. At the moment we have fibre channel as the only boot option, and Ubuntu Server 12.04 installed just fine and is able to boot without multipath. I'm not a Linux professional myself, but in our team we have people who will understand the technical stuff. Don't let my post confuse you :) The current situation is that we have only one fibre channel connection to a single disk array controller. A real-life case would of course be quite different. At minimum we should have two fibre channel ports connected to two different switches and two different controllers. However, we have no idea how to set up the multipath tool. Is DM-MPIO the right software? At minimum we should be able to boot when multiple connections are available and achieve fault tolerance when any of them goes down. Since the disk array is not the latest hardware, I managed to find RDAC driver sources only for the 2.6.x kernel, and we have 3.2.x. Another issue is building a multipath.conf. The said driver sources are from IBM support and the QLogic drivers provided to the Ubuntu installer are from the Ubuntu site. It seems that RHEL and SLES would have near out-of-the-box support, but that is not an option for our project. Actual questions: - What is the recommended software tool for multipath for Ubuntu Server 12.04? - Are there pre-made configurations or templates available? Does it require disk array / controller specific settings, or does a more generic config work? - Do you have experience with a similar setup and would you like to share the knowledge? I'll provide you with any additional information you might require. Thanks in advance.
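    DM-MPIO is indeed the usual answer on Ubuntu; it is what the multipath-tools package provides, and multipath-tools-boot adds the pieces needed to boot from a multipathed LUN. A minimal starting point, sketched under the assumption of a non-boot data LUN first:

        sudo apt-get install multipath-tools multipath-tools-boot
        sudo multipath -ll     # list the multipathed LUNs that were discovered
        sudo multipath -v3     # verbose path/priority detail when debugging

    The DS4500 is an RDAC-style array, so a devices stanza with path_checker rdac and path_grouping_policy group_by_prio (similar to the DS3500 sketch earlier on this page) is the kind of array-specific setting that generally still has to be supplied; the generic defaults alone rarely group the controller paths correctly, and with the in-kernel rdac device handler the separate out-of-tree RDAC driver is usually unnecessary.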

    Read the article

  • MultiPath configuration on RHEL5 and Clariion CX-300

    - by Kamil Z
    I have a problem with discovering my FC-connected CX-300 storage. Frankly speaking, I'm a complete novice in Fibre Channel, so a step-by-step explanation would be appreciated. My configuration consists of two IBM HS20 blades with RHEL 5.4 on board and 2x QLogic ISP2422-based 4Gb Fibre Channel HBAs on each blade. As FC switches there are two Brocades built into the BladeCenter chassis, and finally there is the EMC Clariion CX-300. The CX-300 and the Brocade switches should be configured properly, because they were working fine with the previous configuration, whose main difference was RHEL3 instead of RHEL 5.4. Below is my output from several useful commands: #lspci | grep Fibre 06:01.0 Fibre Channel: QLogic Corp. ISP2422-based 4Gb Fibre Channel to PCI-X HBA (rev 02) 06:01.1 Fibre Channel: QLogic Corp. ISP2422-based 4Gb Fibre Channel to PCI-X HBA (rev 02) #lsmod | grep qla qla2xxx 1084741 0 scsi_transport_fc 37577 1 qla2xxx scsi_mod 141717 10 scsi_dh,qla2xxx,sg,scsi_transport_fc,usb_storage,libata,mptspi,mptscsih,scsi_transport_spi,sd_mod #cat /proc/scsi/scsi Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 00 Vendor: LSILOGIC Model: 1030 IM IM Rev: 1000 Type: Direct-Access ANSI SCSI revision: 02 Host: scsi0 Channel: 01 Id: 00 Lun: 00 Vendor: IBM-ESXS Model: ST936701LC FN Rev: B418 Type: Direct-Access ANSI SCSI revision: 04 Host: scsi0 Channel: 01 Id: 00 Lun: 00 Vendor: IBM-ESXS Model: ST936701LC FN Rev: B418 Type: Direct-Access ANSI SCSI revision: 04 I followed the instructions from this site (editing /etc/multipath.conf), but I failed: after multipath -ll the output was empty. Do you have any suggestions about discovering FC-connected LUNs in such a configuration?
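    For what it is worth, RHEL 5's device-mapper-multipath already ships a built-in definition for Clariion arrays (they report vendor DGC), roughly along these lines if written out explicitly -- a sketch using the RHEL 5 era prio_callout syntax, not values verified against this exact array:

        devices {
            device {
                vendor                 "DGC"
                product                ".*"
                hardware_handler       "1 emc"
                path_checker           emc_clariion
                path_grouping_policy   group_by_prio
                prio_callout           "/sbin/mpath_prio_emc /dev/%n"
                failback               immediate
                no_path_retry          60
            }
        }

    More telling here is that /proc/scsi/scsi lists only the internal LSI and IBM-ESXS disks and no DGC devices at all, so the host is not seeing the LUNs in the first place; that points at zoning or the CX-300 storage-group registration for the new host's WWPNs (and at loading dm-multipath and starting multipathd) rather than at multipath.conf itself.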

    Read the article

  • BGP Multipath & return routes

    - by Dennis van der Stelt
    I'm probably a complete n00b concerning serverfault-related questions, but our IT department makes a bold statement I wish to verify. I've searched the internet, but can find nothing related to my question, so I come here. We have Threat Management Gateway 2010 and we used to just route the request to IIS, and it contained the IP address so we could see where it was coming from. But now they turned on "Requests appear to come from the TMG server", so IP addresses aren't forwarded anymore. Every request has the IP of the TMG server. Now the idea behind this is that because of multipath BGP routes, the incoming request goes over RouteA, but the acknowledgement messages could return over RouteB. The claim is that because the request doesn't come from the first known source, our proxy, but instead from IIS, some smart routers at the visitors of our websites don't recognize the acknowledgement message and filter it out. In other words, the response never arrives. Again, this is the claim. But I cannot find ANY resources on the internet that support this claim. I do read about BGP multipath, but more in the case that there are alternative routes when the fastest route fails for some reason. So is the claim completely bogus or is there (some) truth to it? Can someone explain or point me to resources? Thanks in advance!

    Read the article

  • Merits and demerits of various Linux Fibre Channel multipath options

    - by wzzrd
    On our Linux servers, we currently use HP's qla2xxx drivers, because they have multipathing (active/passive) built in. There are, however, various other options, like Red Hat's device-mapper-multipath with the stock qla2xxx drivers (multibus and failover) and things like SecurePath and PowerPath (both of which can do trunking, iirc). Can someone tell me what the merits and demerits of the various options are (if I can ask such a question), besides the obvious fact that the {Secure,Power}Path options cost vast amounts of money? I'm mainly interested in the freely available options, like HP's qla2xxx vs. Red Hat's multipathd and possibly other open source solutions, but I would like to hear good reasons to go for the commercial solutions too. UPDATE: I'll be benchmarking various options over the coming few days (the average of 10 runs of iozone for each option, the options being native qla2xxx failover, native qla2xxx multibus, and HP qla2xxx failover). I'll post a summary of the results here for those interested.
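    For comparison with the driver-level options, dm-multipath expresses the same two models through its grouping policy; an illustrative fragment, not a complete configuration:

        defaults {
            # failover: one active path, the rest standby (like the HP driver's active/passive)
            # multibus: round-robin I/O across all paths (active/active)
            path_grouping_policy   failover
            path_selector          "round-robin 0"
            failback               immediate
        }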

    Read the article

  • Improving SAS multipath to JBOD performance on Linux

    - by user36825
    Hello all I'm trying to optimize a storage setup on some Sun hardware with Linux. Any thoughts would be greatly appreciated. We have the following hardware: Sun Blade X6270 2* LSISAS1068E SAS controllers 2* Sun J4400 JBODs with 1 TB disks (24 disks per JBOD) Fedora Core 12 2.6.33 release kernel from FC13 (also tried with latest 2.6.31 kernel from FC12, same results) Here's the datasheet for the SAS hardware: http://www.sun.com/storage/storage_networking/hba/sas/PCIe.pdf It's using PCI Express 1.0a, 8x lanes. With a bandwidth of 250 MB/sec per lane, we should be able to do 2000 MB/sec per SAS controller. Each controller can do 3 Gb/sec per port and has two 4 port PHYs. We connect both PHYs from a controller to a JBOD. So between the JBOD and the controller we have 2 PHYs * 4 SAS ports * 3 Gb/sec = 24 Gb/sec of bandwidth, which is more than the PCI Express bandwidth. With write caching enabled and when doing big writes, each disk can sustain about 80 MB/sec (near the start of the disk). With 24 disks, that means we should be able to do 1920 MB/sec per JBOD. multipath { rr_min_io 100 uid 0 path_grouping_policy multibus failback manual path_selector "round-robin 0" rr_weight priorities alias somealias no_path_retry queue mode 0644 gid 0 wwid somewwid } I tried values of 50, 100, 1000 for rr_min_io, but it doesn't seem to make much difference. Along with varying rr_min_io I tried adding some delay between starting the dd's to prevent all of them writing over the same PHY at the same time, but this didn't make any difference, so I think the I/O's are getting properly spread out. According to /proc/interrupts, the SAS controllers are using a "IR-IO-APIC-fasteoi" interrupt scheme. For some reason only core #0 in the machine is handling these interrupts. I can improve performance slightly by assigning a separate core to handle the interrupts for each SAS controller: echo 2 /proc/irq/24/smp_affinity echo 4 /proc/irq/26/smp_affinity Using dd to write to the disk generates "Function call interrupts" (no idea what these are), which are handled by core #4, so I keep other processes off this core too. I run 48 dd's (one for each disk), assigning them to cores not dealing with interrupts like so: taskset -c somecore dd if=/dev/zero of=/dev/mapper/mpathx oflag=direct bs=128M oflag=direct prevents any kind of buffer cache from getting involved. None of my cores seem maxed out. The cores dealing with interrupts are mostly idle and all the other cores are waiting on I/O as one would expect. 
Cpu0 : 0.0%us, 1.0%sy, 0.0%ni, 91.2%id, 7.5%wa, 0.0%hi, 0.2%si, 0.0%st Cpu1 : 0.0%us, 0.8%sy, 0.0%ni, 93.0%id, 0.2%wa, 0.0%hi, 6.0%si, 0.0%st Cpu2 : 0.0%us, 0.6%sy, 0.0%ni, 94.4%id, 0.1%wa, 0.0%hi, 4.8%si, 0.0%st Cpu3 : 0.0%us, 7.5%sy, 0.0%ni, 36.3%id, 56.1%wa, 0.0%hi, 0.0%si, 0.0%st Cpu4 : 0.0%us, 1.3%sy, 0.0%ni, 85.7%id, 4.9%wa, 0.0%hi, 8.1%si, 0.0%st Cpu5 : 0.1%us, 5.5%sy, 0.0%ni, 36.2%id, 58.3%wa, 0.0%hi, 0.0%si, 0.0%st Cpu6 : 0.0%us, 5.0%sy, 0.0%ni, 36.3%id, 58.7%wa, 0.0%hi, 0.0%si, 0.0%st Cpu7 : 0.0%us, 5.1%sy, 0.0%ni, 36.3%id, 58.5%wa, 0.0%hi, 0.0%si, 0.0%st Cpu8 : 0.1%us, 8.3%sy, 0.0%ni, 27.2%id, 64.4%wa, 0.0%hi, 0.0%si, 0.0%st Cpu9 : 0.1%us, 7.9%sy, 0.0%ni, 36.2%id, 55.8%wa, 0.0%hi, 0.0%si, 0.0%st Cpu10 : 0.0%us, 7.8%sy, 0.0%ni, 36.2%id, 56.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu11 : 0.0%us, 7.3%sy, 0.0%ni, 36.3%id, 56.4%wa, 0.0%hi, 0.0%si, 0.0%st Cpu12 : 0.0%us, 5.6%sy, 0.0%ni, 33.1%id, 61.2%wa, 0.0%hi, 0.0%si, 0.0%st Cpu13 : 0.1%us, 5.3%sy, 0.0%ni, 36.1%id, 58.5%wa, 0.0%hi, 0.0%si, 0.0%st Cpu14 : 0.0%us, 4.9%sy, 0.0%ni, 36.4%id, 58.7%wa, 0.0%hi, 0.0%si, 0.0%st Cpu15 : 0.1%us, 5.4%sy, 0.0%ni, 36.5%id, 58.1%wa, 0.0%hi, 0.0%si, 0.0%st Given all this, the throughput reported by running "dstat 10" is in the range of 2200-2300 MB/sec. Given the math above I would expect something in the range of 2*1920 ~= 3600+ MB/sec. Does anybody have any idea where my missing bandwidth went? Thanks!
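    Two small diagnostics that go with the tuning described above, sketched with the IRQ numbers from the question (they differ per machine) and iostat from the sysstat package to confirm the I/O really is spreading evenly across the individual sdX paths:

        # pin each SAS controller's interrupts to its own core
        echo 2 > /proc/irq/24/smp_affinity
        echo 4 > /proc/irq/26/smp_affinity
        # watch per-device throughput, queueing and utilisation while the dd's run
        iostat -xm 5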

    Read the article

  • Failure connecting to Dell MD3200i from XenServer 6.2 pool

    - by Tom Sparrow
    This question also asked at Citrix Forums http://forums.citrix.com/thread.jspa?threadID=332289 I have a MD3200i that is currently working fine with my Xen5.6 pool, but I cannot get a connection to the new 6.2 pool to work. I previously had a problem with a 6.0 upgrade (which is why the old pool is still on 5.6), but rolled back rather than fix it as it wasn't urgent at the time. This install is on new machines - I tried 6.1 first (which had the same problems) then 6.2 was released the second day after installation so I switched to that. I've not installed anything from the Dell resource DVD at this point - I can't find anything saying I should, and everything I have read suggests it shouldn't be necessary. I can ping all 8 ip addresses from both servers in the pool, iscsiadm -m discovery works fine, I can login to the nodes and iscsiadm reports the sessions active correctly. I've added the required sections to multipath.conf, but multipath -ll reports DM multipath kernel driver not loaded immediately after boot. The following is a log of a test session immediately after boot. root@xen3 ~]# iscsiadm -m node --loginall=all Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.130.101,3260] Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.131.101,3260] Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.131.104,3260] Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.131.102,3260] Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.130.103,3260] Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.130.104,3260] Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.130.102,3260] Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.131.103,3260] Login to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.130.101,3260]: successful Login to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.131.101,3260]: successful Login to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.131.104,3260]: successful Login to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.131.102,3260]: successful Login to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.130.103,3260]: successful Login to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.130.104,3260]: successful Login to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.130.102,3260]: successful Login to [iface: default, target: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91, portal: 192.168.131.103,3260]: successful [root@xen3 ~]# iscsiadm -m session tcp: [1] 192.168.130.101:3260,1 
iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [2] 192.168.131.101:3260,1 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [3] 192.168.131.104:3260,2 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [4] 192.168.131.102:3260,2 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [5] 192.168.130.103:3260,1 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [6] 192.168.130.104:3260,2 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [7] 192.168.130.102:3260,2 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [8] 192.168.131.103:3260,1 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 [root@xen3 ~]# service multipathd restart ok Stopping multipathd daemon: [ OK ] Starting multipathd daemon: [ OK ] [root@xen3 ~]# multipath Jul 04 09:58:47 | DM multipath kernel driver not loaded Jul 04 09:58:47 | DM multipath kernel driver not loaded [root@xen3 ~]# multipath -ll Jul 04 09:59:03 | DM multipath kernel driver not loaded Jul 04 09:59:03 | DM multipath kernel driver not loaded [ root@xen3 ~]# modprobe dm_multipath [root@xen3 ~]# multipath Jul 04 10:19:50 | 36b8ca3a0e7024800194a0bd11891cd14: ignoring map create: 1Dell_Internal_Dual_SD_0123456789AB undef Dell,Internal Dual SD size=1.9G features='0' hwhandler='0' wp=undef `-+- policy='round-robin 0' prio=1 status=undef `- 7:0:0:0 sdb 8:16 undef ready running [root@xen3 ~]# multipath -ll 1Dell_Internal_Dual_SD_0123456789AB dm-1 Dell,Internal Dual SD size=1.9G features='0' hwhandler='0' wp=rw `-+- policy='round-robin 0' prio=1 status=enabled `- 7:0:0:0 sdb 8:16 active ready running [root@xen3 ~]# iscsiadm -m session tcp: [1] 192.168.130.101:3260,1 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [2] 192.168.131.101:3260,1 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [3] 192.168.131.104:3260,2 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [4] 192.168.131.102:3260,2 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [5] 192.168.130.103:3260,1 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [6] 192.168.130.104:3260,2 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [7] 192.168.130.102:3260,2 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 tcp: [8] 192.168.131.103:3260,1 iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 [root@xen3 ~]# dmesg | tail -n 50 [ 1161.881010] sd 8:0:0:0: [sdf] Unhandled error code [ 1161.881013] sd 8:0:0:0: [sdf] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK [ 1161.881017] sd 8:0:0:0: [sdf] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00 [ 1161.881024] end_request: I/O error, dev sdf, sector 0 [ 1161.881031] Buffer I/O error on device sdf, logical block 0 [ 1161.881045] sd 15:0:0:0: [sdi] Unhandled error code [ 1161.881048] sd 15:0:0:0: [sdi] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK [ 1161.881052] sd 15:0:0:0: [sdi] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00 [ 1161.881058] end_request: I/O error, dev sdi, sector 0 [ 1161.881065] Buffer I/O error on device sdi, logical block 0 [ 1161.881122] sd 9:0:0:0: [sdg] Unhandled error code [ 1161.881124] sd 9:0:0:0: [sdg] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK [ 1161.881126] sd 9:0:0:0: [sdg] CDB: 
Read(10): 28 00 00 00 00 00 00 00 08 00 [ 1161.881132] end_request: I/O error, dev sdg, sector 0 [ 1161.881140] Buffer I/O error on device sdg, logical block 0 [ 1168.220951] connection6:0: ping timeout of 15 secs expired, recv timeout 10, last rx 84060, last ping 85060, now 86560 [ 1168.220957] connection7:0: ping timeout of 15 secs expired, recv timeout 10, last rx 84060, last ping 85060, now 86560 [ 1168.220967] connection7:0: detected conn error (1011) [ 1168.220969] connection4:0: ping timeout of 15 secs expired, recv timeout 10, last rx 84060, last ping 85060, now 86560 [ 1168.220973] connection4:0: detected conn error (1011) [ 1168.220975] connection3:0: ping timeout of 15 secs expired, recv timeout 10, last rx 84060, last ping 85060, now 86560 [ 1168.220978] connection3:0: detected conn error (1011) [ 1168.220985] connection6:0: detected conn error (1011) [ 1168.480994] sd 14:0:0:0: [sde] Unhandled error code [ 1168.480998] sd 14:0:0:0: [sde] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK [ 1168.481001] sd 14:0:0:0: [sde] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00 [ 1168.481009] end_request: I/O error, dev sde, sector 0 [ 1168.481015] Buffer I/O error on device sde, logical block 0 [ 1168.481076] sd 11:0:0:0: [sdc] Unhandled error code [ 1168.481078] sd 11:0:0:0: [sdc] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK [ 1168.481080] sd 11:0:0:0: [sdc] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00 [ 1168.481087] end_request: I/O error, dev sdc, sector 0 [ 1168.481092] Buffer I/O error on device sdc, logical block 0 [ 1168.481144] sd 10:0:0:0: [sdd] Unhandled error code [ 1168.481147] sd 10:0:0:0: [sdd] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK [ 1168.481150] sd 10:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00 [ 1168.481156] end_request: I/O error, dev sdd, sector 0 [ 1168.481163] Buffer I/O error on device sdd, logical block 0 [ 1168.481168] sd 13:0:0:0: [sdj] Unhandled error code [ 1168.481170] sd 13:0:0:0: [sdj] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK [ 1168.481172] sd 13:0:0:0: [sdj] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00 [ 1168.481178] end_request: I/O error, dev sdj, sector 0 [ 1168.481184] Buffer I/O error on device sdj, logical block 0 [ 1457.105996] device-mapper: multipath round-robin: version 1.0.0 loaded [ 1457.106155] device-mapper: multipath: Cannot access device path 8:0: -16 [ 1457.106164] device-mapper: table: 252:1: multipath: error getting device [ 1457.106172] device-mapper: ioctl: error adding target to table [ 1457.171292] device-mapper: multipath: Cannot access device path 8:0: -16 [ 1457.171299] device-mapper: table: 252:1: multipath: error getting device [ 1457.171304] device-mapper: ioctl: error adding target to table [root@xen3 ~]# fdisk -l Disk /dev/sda: 299.4 GB, 299439751168 bytes 255 heads, 63 sectors/track, 36404 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sda1 1 5 40131 de Dell Utility /dev/sda2 * 6 528 4194304 83 Linux Partition 2 does not end on cylinder boundary. 
/dev/sda3 528 1050 4194304 83 Linux /dev/sda4 1050 36404 283986359+ 8e Linux LVM Disk /dev/sdb: 2040 MB, 2040528896 bytes 255 heads, 63 sectors/track, 248 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdb1 1 248 1992028+ 83 Linux Disk /dev/dm-1: 2040 MB, 2040528896 bytes 255 heads, 63 sectors/track, 248 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/dm-1p1 1 248 1992028+ 83 Linux [root@xen3 ~]# xe sr-probe type=lvmoiscsi device-config:target=192.168.130.101 device-config:targetIQN=iqn.1984-05.com.dell:powervault.md3200i.6782bcb0006bd850000000004ed88b91 Error code: SR_BACKEND_FAILURE_107 Error parameters: , The SCSIid parameter is missing or incorrect, <?xml version="1.0" ?> <iscsi-target/> Note: the xml ends there correctly on the last line - it doesn't ever return a list of LUNs (and there is one in the group on the SAN for those servers.
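    Since multipath starts working as soon as dm_multipath is loaded by hand, one thing worth checking (XenServer's dom0 is CentOS-based, so the usual RHEL-style commands apply) is whether the device-mapper multipath modules are being loaded at boot before the iSCSI sessions and the SR come up; a sketch:

        lsmod | grep dm_multipath        # is the kernel part present?
        modprobe dm_multipath            # load it by hand
        modprobe dm_round_robin          # plus the round-robin path selector
        chkconfig multipathd on          # have the daemon start at boot
        service multipathd restart
        multipath -ll

    If the module still is not there after a clean boot, a RHEL-style hook such as an executable /etc/rc.modules containing the modprobe lines is one way to force it early, though it is also worth comparing against the working 5.6 pool, since the 6.x storage stack expects multipathing to be enabled per host rather than hand-rolled.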

    Read the article

  • Howo to get Multipath IO with Dell MD3600i into active/active setup?

    - by Disco
    I'm desperately trying to improve performance of my SAN connection. Here's what i have: [root@xnode1 dell]# multipath -ll mpath1 (36d4ae520009bd7cc0000030e4fe8230b) dm-2 DELL,MD36xxi [size=5.5T][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw] \_ round-robin 0 [prio=200][active] \_ 18:0:0:0 sdb 8:16 [active][ready] \_ 19:0:0:0 sdd 8:48 [active][ghost] \_ 20:0:0:0 sdf 8:80 [active][ghost] \_ 21:0:0:0 sdh 8:112 [active][ready] And multipath.conf : defaults { udev_dir /dev polling_interval 5 prio_callout none rr_min_io 100 max_fds 8192 user_friendly_names yes path_grouping_policy multibus default_features "1 fail_if_no_path" } blacklist { device { vendor "*" product "Universal Xport" } } devices { device { vendor "DELL" product "MD36xxi" path_checker rdac path_selector "round-robin 0" hardware_handler "1 rdac" failback immediate features "2 pg_init_retries 50" no_path_retry 30 rr_min_io 100 prio_callout "/sbin/mpath_prio_rdac /dev/%n" } } And sessions. [root@xnode1 dell]# iscsiadm -m session tcp: [13] 10.0.51.220:3260,1 iqn.1984-05.com.dell:powervault.md3600i.6d4ae520009bd7cc000000004fd7507c tcp: [14] 10.0.50.221:3260,2 iqn.1984-05.com.dell:powervault.md3600i.6d4ae520009bd7cc000000004fd7507c tcp: [15] 10.0.51.221:3260,2 iqn.1984-05.com.dell:powervault.md3600i.6d4ae520009bd7cc000000004fd7507c tcp: [16] 10.0.50.220:3260,1 iqn.1984-05.com.dell:powervault.md3600i.6d4ae520009bd7cc000000004fd7507c I'm getting very poor read performance : dd if=/dev/mapper/mpath1 of=/dev/null bs=1M count=1000 The SAN is configured as follows: CTRL0,PORT0 : 10.0.50.220 CTRL0,PORT1 : 10.0.50.221 CTRL1,PORT0 : 10.0.51.220 CTRL1,PORT1 : 10.0.51.221 And on the host : IF0 : 10.0.50.1 IF1 : 10.0.51.1 (Dual 10GbE Ethernet Card Intel DA2) It's connected to a 10gbE switch dedicated for SAN traffic. My questions being; why the connection is set up as 'ghost' and not 'ready' like an active/active configuration ?
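    A hedged note on the ghost paths: on the MD36xxi family each virtual disk is owned by one controller at a time, so the two paths to the non-owning controller are genuinely standby and will always show as ghost; true active/active only comes from spreading several virtual disks across the two controllers. For throughput on a single LUN, the usual adjustment is to group paths by priority instead of multibus, so the round-robin only runs over the two owning-controller paths -- a sketch of the fragment that changes, with the rest as in the config above:

        devices {
            device {
                vendor                 "DELL"
                product                "MD36xxi"
                path_grouping_policy   group_by_prio
                hardware_handler       "1 rdac"
                path_checker           rdac
                prio_callout           "/sbin/mpath_prio_rdac /dev/%n"
                failback               immediate
            }
        }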

    Read the article

  • linux multipath routing load balance

    - by user52883
    I would like to know how to load-balance two Business DLS links which have fixed IPs. I believe it would look something like this: ip route add default scope global \ nexthop via gatewayDLS1 dev interface1 weight 1 \ nexthop via gatewayDLS2 dev interface2 weight 1 Is this all I need in order to get multipath routing? Please give me a more detailed answer if possible, thank you.
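    Essentially yes, that is the core of it, provided the kernel has CONFIG_IP_ROUTE_MULTIPATH enabled; a filled-in sketch with placeholder gateways and interfaces (documentation addresses, not real ones):

        ip route add default scope global \
            nexthop via 192.0.2.1    dev eth0 weight 1 \
            nexthop via 198.51.100.1 dev eth1 weight 1

    Two caveats worth knowing: the kernel balances per destination/flow (via the route cache), not per packet, and because each link has its own fixed public IP, replies to inbound connections usually need source-based policy routing (ip rule plus a routing table per link) so they leave via the interface they arrived on.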

    Read the article

  • How to block some disks from probes on Linux boot?

    - by Igor Velkov
    My Linux host is connected to a SAN with an FC interface. It connects over one path and sees some LUNs that it can't access, because they need another path that is not available to the host. On boot, Linux probes every LUN it can see, gets read errors on the inaccessible LUNs, and hangs there for a long, long time. Is there a way to disable any access to some LUNs at boot time, and later? I found filters for ignoring devices in LVM and multipath, but they don't help during the boot process. In fact, LVM is still affected despite the filter and gives me an I/O error on every operation like lvdisplay and vgdisplay, but that is another question.
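    Two mechanisms that are often combined for this: an LVM filter that rejects the unreachable LUNs before anything scans them, and dropping the stuck SCSI devices from the kernel at runtime. A sketch, with sdc and sdd standing in for whichever names the dead LUNs receive:

        # /etc/lvm/lvm.conf, inside the devices { } section
        filter = [ "r|^/dev/sdc.*|", "r|^/dev/sdd.*|", "a|.*|" ]

        # remove an inaccessible device from the SCSI layer after boot
        echo 1 > /sys/block/sdc/device/delete

    For either the LVM filter or a multipath blacklist to take effect during boot rather than after it, the initrd/initramfs usually has to be rebuilt (mkinitrd or update-initramfs, depending on the distribution), since the early userspace carries its own copies of those configuration files; the cleaner long-term fix is LUN masking/zoning on the SAN side so the host is never shown LUNs it cannot reach.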

    Read the article

  • How to best tune my SAN/Initiators for best performance?

    - by Disco
    Recent owner of a Dell PowerVault MD3600i i'm experiencing some weird results. I have a dedicated 24x 10GbE Switch (PowerConnect 8024), setup to jumbo frames 9K. The MD3600 has 2 RAID controllers, each has 2x 10GbE ethernet nics. There's nothing else on the switch; one VLAN for SAN traffic. Here's my multipath.conf defaults { udev_dir /dev polling_interval 5 selector "round-robin 0" path_grouping_policy multibus getuid_callout "/sbin/scsi_id -g -u -s /block/%n" prio_callout none path_checker readsector0 rr_min_io 100 max_fds 8192 rr_weight priorities failback immediate no_path_retry fail user_friendly_names yes # prio rdac } blacklist { device { vendor "*" product "Universal Xport" } # devnode "^sd[a-z]" } devices { device { vendor "DELL" product "MD36xxi" path_grouping_policy group_by_prio prio rdac # polling_interval 5 path_checker rdac path_selector "round-robin 0" hardware_handler "1 rdac" failback immediate features "2 pg_init_retries 50" no_path_retry 30 rr_min_io 100 prio_callout "/sbin/mpath_prio_rdac /dev/%n" } } And iscsid.conf : node.startup = automatic node.session.timeo.replacement_timeout = 15 node.conn[0].timeo.login_timeout = 15 node.conn[0].timeo.logout_timeout = 15 node.conn[0].timeo.noop_out_interval = 5 node.conn[0].timeo.noop_out_timeout = 10 node.session.iscsi.InitialR2T = No node.session.iscsi.ImmediateData = Yes node.session.iscsi.FirstBurstLength = 262144 node.session.iscsi.MaxBurstLength = 16776192 node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144 After my tests; i can barely come to 200 Mb/s read/write. Should I expect more than that ? Providing it has dual 10 GbE my thoughts where to come around the 400 Mb/s. Any ideas ? Guidelines ? Troubleshooting tips ?
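    Two quick checks that often explain numbers in that range: whether 9K frames really survive end to end from the initiator to every target portal, and what the iSCSI sessions actually negotiated. A sketch (the portal address is illustrative):

        # verify jumbo frames end to end: 9000-byte MTU minus 28 bytes of IP/ICMP header
        ping -M do -s 8972 10.0.50.220
        # show negotiated parameters, MaxRecvDataSegmentLength and which NIC each session uses
        iscsiadm -m session -P 3

    A single dd stream also tends to understate what the array can do; several parallel streams against different LUNs, or a larger block size with oflag=direct, give a better picture of the path limit versus the array limit.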

    Read the article

  • Can MySQL use multiple data directories on different physical storage devices

    - by sirlark
    I am running MySQL with its data dir on a 128Gb SSD. I am dealing with large datasets (~20Gb) that are loaded and processed weekly, each stored in a separate DB for the purposes of time-point comparisons. Putting all the data into a single database is unfeasible because the performance on such large databases is already a problem. However, I cannot keep more than 6 datasets on the SSD at a time. Right now I am manually dumping the oldest to a much larger 2Tb spinning disk every week, and dropping the database to make space for the new one. But if I need one of the 'archived' databases (a semi-regular occurrence) I have to drop a current one (after dumping), reload it, do what I need to, then reverse the results. Is there a way to configure MySQL to use multiple data directories, say one on the SSD and one on the 2Tb spinning disk, and 'merge' them transparently? If I could do this, then archiving would no longer mean "moved out of the database entirely", but instead would mean "moved onto the slow physical device". The time taken to do my queries on a spinning disk would be less than that taken to completely dump, drop, load, drop, reload two entire databases, so this is a win. I thought of using something like unionfs, but I can't think of a way to control which database gets stored on which physical drive, because it works by merging on a directory level (from what I understand), so I'm still stuck with using multiple directories. Any help appreciated, thanks in advance.
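    MySQL has no transparent multi-datadir merge, but whole databases (with MyISAM or file-per-table InnoDB) can live on the slow disk and be symlinked back into the data directory, which gives roughly the 'archived = slow device' behaviour described; a sketch with illustrative paths and database name, to be tested on a copy first:

        service mysql stop
        mv /var/lib/mysql/dataset_w01 /mnt/archive2tb/dataset_w01
        ln -s /mnt/archive2tb/dataset_w01 /var/lib/mysql/dataset_w01
        chown -h mysql:mysql /var/lib/mysql/dataset_w01
        service mysql start

    Newer MySQL releases (5.6 onwards) can also place individual InnoDB tables elsewhere at creation time with DATA DIRECTORY, which avoids symlinking entire database directories.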

    Read the article

  • Configuring Oracle ASMLib on multipath disks under Linux

    - by ??-Oracle
    A question that comes up regularly from customers running Oracle ASMLib on Linux is how it should behave when the SAN storage is reached over multiple paths, for example through a device-mapper pseudo device such as "multipatha". Multipathing basics: to avoid a single point of failure, a host is given at least two physical paths to the same LUN (separate HBAs, separate switches, separate array controllers), so the operating system sees the same disk more than once -- the Linux SCSI layer might present it as both /dev/sdb and /dev/sdc -- and the multipathing driver builds one pseudo device, e.g. /dev/multipatha, on top of them. Applications do their I/O against the pseudo device, and the multipath driver decides whether to send it down sdb or sdc, failing over to the surviving path when one of them dies. ASMLib and multipath: ASMLib stamps its label on the LUN once, so during a scan it sees the same label on sdb, sdc and multipatha and must pick exactly one of them as the ASM disk. If it picks one of the single-path devices instead of the pseudo device, ASM still works but loses the protection of multipathing, which is why the scan order has to be controlled. How ASMLib scans: the scan is triggered by /etc/init.d/oracleasm scandisks and is driven by the ASMLib configuration file /etc/sysconfig/oracleasm, which is a symlink to /etc/sysconfig/oracleasm-_dev_oracleasm and is populated by /etc/init.d/oracleasm configure; the settings need to end up in that underlying file. Two parameters control the scan: ORACLEASM_SCANORDER lists device-name prefixes to scan first, and ORACLEASM_SCANEXCLUDE lists device-name prefixes to ignore; matching is done by prefix on the device name as it appears without the leading /dev/. Note that device-mapper multipath devices show up as /dev/dm-XX (the /dev/mapper/XXX names are created by udev), so ORACLEASM_SCANORDER and ORACLEASM_SCANEXCLUDE entries for them should use the "dm" prefix. Scan-order example: with ORACLEASM_SCANORDER="multipath sd", ASMLib first scans devices whose names start with "multipath" (such as /dev/multipatha) and only then the "sd" SCSI devices; because the multipath device is found first and already carries the label, the individual paths /dev/sdb and /dev/sdc are skipped and ASM uses the multipath device. Exclude example: with ORACLEASM_SCANEXCLUDE="sdb sdc", ASMLib never offers /dev/sdb or /dev/sdc to Oracle, so again only /dev/multipatha is used. EMC PowerPath and ASMLib: PowerPath is also supported with ASMLib; on 2.6 kernels (RHEL 4, SLES 9) the 2.0 ASMLib kernel driver is used, while for the 2.4-kernel platforms (RHEL 3, SLES 8) the supported combinations and patch levels are listed in the EMC Support Matrix, so check there or with EMC. For the full discussion, see the OTN article: Configuring Oracle ASMLib on Multipath Disks.

    Read the article

  • Yum Update Failing mod_ssl and glibc_devel

    - by Kerry
    Any ideas on how to get this to not fail? # yum update Freeing read locks for locker 0x82: 4189/140342084876032 Freeing read locks for locker 0x84: 4189/140342084876032 Freeing read locks for locker 0x85: 4189/140342084876032 Freeing read locks for locker 0x86: 4189/140342084876032 Freeing read locks for locker 0x87: 4189/140342084876032 Freeing read locks for locker 0x9a: 4189/140342084876032 Freeing read locks for locker 0x9c: 4189/140342084876032 Freeing read locks for locker 0x9d: 4189/140342084876032 Freeing read locks for locker 0x9e: 4189/140342084876032 Freeing read locks for locker 0x9f: 4189/140342084876032 Freeing read locks for locker 0xa0: 4189/140342084876032 Freeing read locks for locker 0xa1: 4189/140342084876032 Freeing read locks for locker 0xa2: 4189/140342084876032 Freeing read locks for locker 0xa3: 4189/140342084876032 Freeing read locks for locker 0xa4: 4189/140342084876032 Freeing read locks for locker 0xa5: 4189/140342084876032 Freeing read locks for locker 0xa6: 4189/140342084876032 Freeing read locks for locker 0xa7: 4189/140342084876032 Freeing read locks for locker 0xa8: 4189/140342084876032 Freeing read locks for locker 0xa9: 4189/140342084876032 Freeing read locks for locker 0xaa: 4189/140342084876032 Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.hmc.edu * epel: mirrors.kernel.org * extras: centos.mirror.freedomvoice.com * updates: mirrors.sonic.net Setting up Update Process Resolving Dependencies There are unfinished transactions remaining. You might consider running yum-complete-transaction first to finish them. The program yum-complete-transaction is found in the yum-utils package. --> Running transaction check ---> Package device-mapper-persistent-data.x86_64 0:0.2.8-2.el6 will be updated ---> Package device-mapper-persistent-data.x86_64 0:0.2.8-4.el6_5 will be an update ---> Package glibc-headers.x86_64 0:2.12-1.132.el6 will be updated --> Processing Dependency: glibc-headers = 2.12-1.132.el6 for package: glibc-devel-2.12-1.132.el6.x86_64 ---> Package glibc-headers.x86_64 0:2.12-1.132.el6_5.2 will be an update ---> Package httpd.x86_64 0:2.2.15-29.el6.centos will be updated --> Processing Dependency: httpd = 2.2.15-29.el6.centos for package: 1:mod_ssl-2.2.15-29.el6.centos.x86_64 ---> Package httpd.x86_64 0:2.2.15-30.el6.centos will be an update ---> Package kernel.x86_64 0:2.6.32-431.17.1.el6 will be installed ---> Package kernel-devel.x86_64 0:2.6.32-431.17.1.el6 will be installed ---> Package selinux-policy-targeted.noarch 0:3.7.19-231.el6_5.1 will be updated ---> Package selinux-policy-targeted.noarch 0:3.7.19-231.el6_5.3 will be an update --> Finished Dependency Resolution Error: Package: 1:mod_ssl-2.2.15-29.el6.centos.x86_64 (@base) Requires: httpd = 2.2.15-29.el6.centos Removing: httpd-2.2.15-29.el6.centos.x86_64 (@base) httpd = 2.2.15-29.el6.centos Updated By: httpd-2.2.15-30.el6.centos.x86_64 (updates) httpd = 2.2.15-30.el6.centos Error: Package: glibc-devel-2.12-1.132.el6.x86_64 (@base) Requires: glibc-headers = 2.12-1.132.el6 Removing: glibc-headers-2.12-1.132.el6.x86_64 (@base) glibc-headers = 2.12-1.132.el6 Updated By: glibc-headers-2.12-1.132.el6_5.2.x86_64 (updates) glibc-headers = 2.12-1.132.el6_5.2 Available: glibc-headers-2.12-1.132.el6_5.1.x86_64 (updates) glibc-headers = 2.12-1.132.el6_5.1 You could try using --skip-broken to work around the problem ** Found 34 pre-existing rpmdb problem(s), 'yum check' output follows: audit-2.2-4.el6_5.x86_64 is a duplicate with audit-2.2-2.el6.x86_64 
audit-libs-2.2-4.el6_5.x86_64 is a duplicate with audit-libs-2.2-2.el6.x86_64 curl-7.19.7-37.el6_5.3.x86_64 is a duplicate with curl-7.19.7-37.el6_4.x86_64 device-mapper-multipath-0.4.9-72.el6_5.2.x86_64 is a duplicate with device-mapper-multipath-0.4.9-72.el6_5.1.x86_64 device-mapper-multipath-libs-0.4.9-72.el6_5.2.x86_64 is a duplicate with device-mapper-multipath-libs-0.4.9-72.el6_5.1.x86_64 2:ethtool-3.5-1.4.el6_5.x86_64 is a duplicate with 2:ethtool-3.5-1.2.el6_5.x86_64 glibc-2.12-1.132.el6_5.2.x86_64 is a duplicate with glibc-2.12-1.132.el6.x86_64 glibc-common-2.12-1.132.el6_5.2.x86_64 is a duplicate with glibc-common-2.12-1.132.el6.x86_64 glibc-devel-2.12-1.132.el6_5.2.x86_64 is a duplicate with glibc-devel-2.12-1.132.el6.x86_64 glibc-devel-2.12-1.132.el6_5.2.x86_64 has missing requires of glibc-headers = ('0', '2.12', '1.132.el6_5.2') gnutls-2.8.5-14.el6_5.x86_64 is a duplicate with gnutls-2.8.5-13.el6_5.x86_64 httpd-2.2.15-29.el6.centos.x86_64 has missing requires of httpd-tools = ('0', '2.2.15', '29.el6.centos') httpd-manual-2.2.15-30.el6.centos.noarch has missing requires of httpd = ('0', '2.2.15', '30.el6.centos') iproute-2.6.32-32.el6_5.x86_64 is a duplicate with iproute-2.6.32-31.el6.x86_64 kernel-firmware-2.6.32-431.17.1.el6.noarch is a duplicate with kernel-firmware-2.6.32-431.11.2.el6.noarch kernel-headers-2.6.32-431.17.1.el6.x86_64 is a duplicate with kernel-headers-2.6.32-431.11.2.el6.x86_64 kpartx-0.4.9-72.el6_5.2.x86_64 is a duplicate with kpartx-0.4.9-72.el6_5.1.x86_64 krb5-libs-1.10.3-15.el6_5.1.x86_64 is a duplicate with krb5-libs-1.10.3-10.el6_4.6.x86_64 libblkid-2.17.2-12.14.el6_5.x86_64 is a duplicate with libblkid-2.17.2-12.14.el6.x86_64 libcurl-7.19.7-37.el6_5.3.x86_64 is a duplicate with libcurl-7.19.7-37.el6_4.x86_64 libcurl-devel-7.19.7-37.el6_5.3.x86_64 is a duplicate with libcurl-devel-7.19.7-37.el6_4.x86_64 libtasn1-2.3-6.el6_5.x86_64 is a duplicate with libtasn1-2.3-3.el6_2.1.x86_64 libuuid-2.17.2-12.14.el6_5.x86_64 is a duplicate with libuuid-2.17.2-12.14.el6.x86_64 libxml2-2.7.6-14.el6_5.1.x86_64 is a duplicate with libxml2-2.7.6-14.el6.x86_64 mdadm-3.2.6-7.el6_5.2.x86_64 is a duplicate with mdadm-3.2.6-7.el6.x86_64 1:mod_ssl-2.2.15-30.el6.centos.x86_64 is a duplicate with 1:mod_ssl-2.2.15-29.el6.centos.x86_64 1:mod_ssl-2.2.15-30.el6.centos.x86_64 has missing requires of httpd = ('0', '2.2.15', '30.el6.centos') nss-softokn-3.14.3-10.el6_5.x86_64 is a duplicate with nss-softokn-3.14.3-9.el6.x86_64 openssl-1.0.1e-16.el6_5.7.x86_64 is a duplicate with openssl-1.0.1e-16.el6_5.4.x86_64 openssl-1.0.1e-16.el6_5.14.x86_64 is a duplicate with openssl-1.0.1e-16.el6_5.7.x86_64 openssl-devel-1.0.1e-16.el6_5.14.x86_64 is a duplicate with openssl-devel-1.0.1e-16.el6_5.7.x86_64 selinux-policy-3.7.19-231.el6_5.3.noarch is a duplicate with selinux-policy-3.7.19-231.el6_5.1.noarch tzdata-2014d-1.el6.noarch is a duplicate with tzdata-2014b-1.el6.noarch util-linux-ng-2.17.2-12.14.el6_5.x86_64 is a duplicate with util-linux-ng-2.17.2-12.14.el6.x86_64 UPDATE I installed and ran yum-complete-transaction as requested, it finished some things and suggested I run package-cleanup --problems, which yielded this: package-cleanup --problems Loaded plugins: fastestmirror Package httpd-manual-2.2.15-30.el6.centos.noarch requires httpd = ('0', '2.2.15', '30.el6.centos') Package httpd-2.2.15-29.el6.centos.x86_64 requires httpd-tools = ('0', '2.2.15', '29.el6.centos') Package mod_ssl-2.2.15-30.el6.centos.x86_64 requires httpd = ('0', '2.2.15', '30.el6.centos') Package 
glibc-devel-2.12-1.132.el6_5.2.x86_64 requires glibc-headers = ('0', '2.12', '1.132.el6_5.2') I'm definitely not a sys-admin, what would be the next step? UPDATE 2 I ran yum distro-sync: # yum distro-sync Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.hmc.edu * epel: mirrors.kernel.org * extras: centos.mirror.freedomvoice.com * updates: mirrors.sonic.net Setting up Distribution Synchronization Process Resolving Dependencies --> Running transaction check ---> Package glibc-headers.x86_64 0:2.12-1.132.el6 will be updated --> Processing Dependency: glibc-headers = 2.12-1.132.el6 for package: glibc-devel-2.12-1.132.el6.x86_64 ---> Package glibc-headers.x86_64 0:2.12-1.132.el6_5.2 will be an update ---> Package httpd.x86_64 0:2.2.15-29.el6.centos will be updated --> Processing Dependency: httpd = 2.2.15-29.el6.centos for package: 1:mod_ssl-2.2.15-29.el6.centos.x86_64 ---> Package httpd.x86_64 0:2.2.15-30.el6.centos will be an update --> Finished Dependency Resolution Error: Package: 1:mod_ssl-2.2.15-29.el6.centos.x86_64 (@base) Requires: httpd = 2.2.15-29.el6.centos Removing: httpd-2.2.15-29.el6.centos.x86_64 (@base) httpd = 2.2.15-29.el6.centos Updated By: httpd-2.2.15-30.el6.centos.x86_64 (updates) httpd = 2.2.15-30.el6.centos Error: Package: glibc-devel-2.12-1.132.el6.x86_64 (@base) Requires: glibc-headers = 2.12-1.132.el6 Removing: glibc-headers-2.12-1.132.el6.x86_64 (@base) glibc-headers = 2.12-1.132.el6 Updated By: glibc-headers-2.12-1.132.el6_5.2.x86_64 (updates) glibc-headers = 2.12-1.132.el6_5.2 Available: glibc-headers-2.12-1.132.el6_5.1.x86_64 (updates) glibc-headers = 2.12-1.132.el6_5.1 You could try using --skip-broken to work around the problem ** Found 34 pre-existing rpmdb problem(s), 'yum check' output follows: audit-2.2-4.el6_5.x86_64 is a duplicate with audit-2.2-2.el6.x86_64 audit-libs-2.2-4.el6_5.x86_64 is a duplicate with audit-libs-2.2-2.el6.x86_64 curl-7.19.7-37.el6_5.3.x86_64 is a duplicate with curl-7.19.7-37.el6_4.x86_64 device-mapper-multipath-0.4.9-72.el6_5.2.x86_64 is a duplicate with device-mapper-multipath-0.4.9-72.el6_5.1.x86_64 device-mapper-multipath-libs-0.4.9-72.el6_5.2.x86_64 is a duplicate with device-mapper-multipath-libs-0.4.9-72.el6_5.1.x86_64 2:ethtool-3.5-1.4.el6_5.x86_64 is a duplicate with 2:ethtool-3.5-1.2.el6_5.x86_64 glibc-2.12-1.132.el6_5.2.x86_64 is a duplicate with glibc-2.12-1.132.el6.x86_64 glibc-common-2.12-1.132.el6_5.2.x86_64 is a duplicate with glibc-common-2.12-1.132.el6.x86_64 glibc-devel-2.12-1.132.el6_5.2.x86_64 is a duplicate with glibc-devel-2.12-1.132.el6.x86_64 glibc-devel-2.12-1.132.el6_5.2.x86_64 has missing requires of glibc-headers = ('0', '2.12', '1.132.el6_5.2') gnutls-2.8.5-14.el6_5.x86_64 is a duplicate with gnutls-2.8.5-13.el6_5.x86_64 httpd-2.2.15-29.el6.centos.x86_64 has missing requires of httpd-tools = ('0', '2.2.15', '29.el6.centos') httpd-manual-2.2.15-30.el6.centos.noarch has missing requires of httpd = ('0', '2.2.15', '30.el6.centos') iproute-2.6.32-32.el6_5.x86_64 is a duplicate with iproute-2.6.32-31.el6.x86_64 kernel-firmware-2.6.32-431.17.1.el6.noarch is a duplicate with kernel-firmware-2.6.32-431.11.2.el6.noarch kernel-headers-2.6.32-431.17.1.el6.x86_64 is a duplicate with kernel-headers-2.6.32-431.11.2.el6.x86_64 kpartx-0.4.9-72.el6_5.2.x86_64 is a duplicate with kpartx-0.4.9-72.el6_5.1.x86_64 krb5-libs-1.10.3-15.el6_5.1.x86_64 is a duplicate with krb5-libs-1.10.3-10.el6_4.6.x86_64 libblkid-2.17.2-12.14.el6_5.x86_64 is a duplicate with 
libblkid-2.17.2-12.14.el6.x86_64 libcurl-7.19.7-37.el6_5.3.x86_64 is a duplicate with libcurl-7.19.7-37.el6_4.x86_64 libcurl-devel-7.19.7-37.el6_5.3.x86_64 is a duplicate with libcurl-devel-7.19.7-37.el6_4.x86_64 libtasn1-2.3-6.el6_5.x86_64 is a duplicate with libtasn1-2.3-3.el6_2.1.x86_64 libuuid-2.17.2-12.14.el6_5.x86_64 is a duplicate with libuuid-2.17.2-12.14.el6.x86_64 libxml2-2.7.6-14.el6_5.1.x86_64 is a duplicate with libxml2-2.7.6-14.el6.x86_64 mdadm-3.2.6-7.el6_5.2.x86_64 is a duplicate with mdadm-3.2.6-7.el6.x86_64 1:mod_ssl-2.2.15-30.el6.centos.x86_64 is a duplicate with 1:mod_ssl-2.2.15-29.el6.centos.x86_64 1:mod_ssl-2.2.15-30.el6.centos.x86_64 has missing requires of httpd = ('0', '2.2.15', '30.el6.centos') nss-softokn-3.14.3-10.el6_5.x86_64 is a duplicate with nss-softokn-3.14.3-9.el6.x86_64 openssl-1.0.1e-16.el6_5.7.x86_64 is a duplicate with openssl-1.0.1e-16.el6_5.4.x86_64 openssl-1.0.1e-16.el6_5.14.x86_64 is a duplicate with openssl-1.0.1e-16.el6_5.7.x86_64 openssl-devel-1.0.1e-16.el6_5.14.x86_64 is a duplicate with openssl-devel-1.0.1e-16.el6_5.7.x86_64 selinux-policy-3.7.19-231.el6_5.3.noarch is a duplicate with selinux-policy-3.7.19-231.el6_5.1.noarch tzdata-2014d-1.el6.noarch is a duplicate with tzdata-2014b-1.el6.noarch util-linux-ng-2.17.2-12.14.el6_5.x86_64 is a duplicate with util-linux-ng-2.17.2-12.14.el6.x86_64
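    The duplicate rpmdb entries listed at the bottom are what is actually tripping the depsolver; the yum-utils tooling the output refers to can usually clear them, roughly like this (backing up /var/lib/rpm first is prudent):

        yum install yum-utils
        package-cleanup --dupes        # list the duplicate packages
        package-cleanup --cleandupes   # remove the older of each duplicate pair
        yum update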

    Read the article

  • SBD killing both cluster nodes when there are even small SAN network problems

    - by Wieslaw Herr
    I am having problems with stonith SBD in an openais-based cluster. Some background: the active/passive cluster has two nodes, node1 and node2. They are configured to provide an NFS service to users. To avoid problems with split-brain, they are both configured to use SBD. SBD is using two 1MB disks available to the hosts via a multipath fibre-channel network. The problems start if something happens with the SAN network. For example, today one of the Brocade switches got rebooted and both nodes lost 2 out of 4 paths to each disk, which resulted in both nodes committing suicide and rebooting. This, of course, was highly undesirable, because a) there were paths left, and b) even if the switch were only out for 10-20 seconds, a reboot cycle of both nodes takes 5-10 minutes and all NFS locks are lost. I tried increasing the SBD timeout values (to 10sec+ values, dump attached at the end), however a "WARN: Latency: No liveness for 4 s exceeds threshold of 3 s" hints that something isn't working as I would expect it to. Here is what I would like to know: a) Is SBD working as it should, killing nodes when 2 paths are still available? b) If not, is the attached multipath.conf file correct? The storage controller we use is an IBM SVC (IBM 2145); should there be any specific configuration for it (as in multipath.conf defaults)? c) How should I go about increasing the timeouts in SBD? Attachments: multipath.conf and sbd dump (http://hpaste.org/69537)
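    On question c), the SBD timeouts are written into the on-disk slot metadata when the device is initialised, so raising them means dumping the current values and re-creating the headers; a sketch with an illustrative device path and illustrative values (watchdog 10 s, msgwait 20 s -- msgwait should be at least twice the watchdog timeout, longer than the worst-case multipath failover time, and smaller than the cluster's stonith-timeout):

        sbd -d /dev/mapper/sbd_disk1 dump                 # show the current on-disk timeouts
        sbd -d /dev/mapper/sbd_disk1 -1 10 -4 20 create   # re-initialise: -1 watchdog, -4 msgwait

    If a path flap can outlast the watchdog timeout, SBD cannot tell it apart from a lost device, which is consistent with the behaviour described when the Brocade rebooted.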

    Read the article

  • Why is my RAID /dev/md1 showing up as /dev/md126? Is mdadm.conf being ignored?

    - by mmorris
    I created a RAID with: sudo mdadm --create --verbose /dev/md1 --level=mirror --raid-devices=2 /dev/sdb1 /dev/sdc1 sudo mdadm --create --verbose /dev/md2 --level=mirror --raid-devices=2 /dev/sdb2 /dev/sdc2 sudo mdadm --detail --scan returns: ARRAY /dev/md1 metadata=1.2 name=ion:1 UUID=aa1f85b0:a2391657:cfd38029:772c560e ARRAY /dev/md2 metadata=1.2 name=ion:2 UUID=528e5385:e61eaa4c:1db2dba7:44b556fb Which I appended it to /etc/mdadm/mdadm.conf, see below: # mdadm.conf # # Please refer to mdadm.conf(5) for information about this file. # # by default (built-in), scan all partitions (/proc/partitions) and all # containers for MD superblocks. alternatively, specify devices to scan, using # wildcards if desired. #DEVICE partitions containers # auto-create devices with Debian standard permissions CREATE owner=root group=disk mode=0660 auto=yes # automatically tag new arrays as belonging to the local system HOMEHOST <system> # instruct the monitoring daemon where to send mail alerts MAILADDR root # definitions of existing MD arrays # This file was auto-generated on Mon, 29 Oct 2012 16:06:12 -0500 # by mkconf $Id$ ARRAY /dev/md1 metadata=1.2 name=ion:1 UUID=aa1f85b0:a2391657:cfd38029:772c560e ARRAY /dev/md2 metadata=1.2 name=ion:2 UUID=528e5385:e61eaa4c:1db2dba7:44b556fb cat /proc/mdstat returns: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sdb2[0] sdc2[1] 208629632 blocks super 1.2 [2/2] [UU] md1 : active raid1 sdb1[0] sdc1[1] 767868736 blocks super 1.2 [2/2] [UU] unused devices: <none> ls -la /dev | grep md returns: brw-rw---- 1 root disk 9, 1 Oct 30 11:06 md1 brw-rw---- 1 root disk 9, 2 Oct 30 11:06 md2 So I think all is good and I reboot. After the reboot, /dev/md1 is now /dev/md126 and /dev/md2 is now /dev/md127????? sudo mdadm --detail --scan returns: ARRAY /dev/md/ion:1 metadata=1.2 name=ion:1 UUID=aa1f85b0:a2391657:cfd38029:772c560e ARRAY /dev/md/ion:2 metadata=1.2 name=ion:2 UUID=528e5385:e61eaa4c:1db2dba7:44b556fb cat /proc/mdstat returns: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md126 : active raid1 sdc2[1] sdb2[0] 208629632 blocks super 1.2 [2/2] [UU] md127 : active (auto-read-only) raid1 sdb1[0] sdc1[1] 767868736 blocks super 1.2 [2/2] [UU] unused devices: <none> ls -la /dev | grep md returns: drwxr-xr-x 2 root root 80 Oct 30 11:18 md brw-rw---- 1 root disk 9, 126 Oct 30 11:18 md126 brw-rw---- 1 root disk 9, 127 Oct 30 11:18 md127 All is not lost, I: sudo mdadm --stop /dev/md126 sudo mdadm --stop /dev/md127 sudo mdadm --assemble --verbose /dev/md1 /dev/sdb1 /dev/sdc1 sudo mdadm --assemble --verbose /dev/md2 /dev/sdb2 /dev/sdc2 and verify everything: sudo mdadm --detail --scan returns: ARRAY /dev/md1 metadata=1.2 name=ion:1 UUID=aa1f85b0:a2391657:cfd38029:772c560e ARRAY /dev/md2 metadata=1.2 name=ion:2 UUID=528e5385:e61eaa4c:1db2dba7:44b556fb cat /proc/mdstat returns: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sdb2[0] sdc2[1] 208629632 blocks super 1.2 [2/2] [UU] md1 : active raid1 sdb1[0] sdc1[1] 767868736 blocks super 1.2 [2/2] [UU] unused devices: <none> ls -la /dev | grep md returns: brw-rw---- 1 root disk 9, 1 Oct 30 11:26 md1 brw-rw---- 1 root disk 9, 2 Oct 30 11:26 md2 So once again, I think all is good and I reboot. Again, after the reboot, /dev/md1 is /dev/md126 and /dev/md2 is /dev/md127????? 
sudo mdadm --detail --scan returns: ARRAY /dev/md/ion:1 metadata=1.2 name=ion:1 UUID=aa1f85b0:a2391657:cfd38029:772c560e ARRAY /dev/md/ion:2 metadata=1.2 name=ion:2 UUID=528e5385:e61eaa4c:1db2dba7:44b556fb cat /proc/mdstat returns: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md126 : active raid1 sdc2[1] sdb2[0] 208629632 blocks super 1.2 [2/2] [UU] md127 : active (auto-read-only) raid1 sdb1[0] sdc1[1] 767868736 blocks super 1.2 [2/2] [UU] unused devices: <none> ls -la /dev | grep md returns: drwxr-xr-x 2 root root 80 Oct 30 11:42 md brw-rw---- 1 root disk 9, 126 Oct 30 11:42 md126 brw-rw---- 1 root disk 9, 127 Oct 30 11:42 md127 What am I missing here?
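    The detail that usually explains this on Ubuntu is that the arrays are assembled at boot from the copy of mdadm.conf inside the initramfs, not the one in /etc, so edits to /etc/mdadm/mdadm.conf only stick after regenerating it -- a sketch:

        sudo update-initramfs -u
        # reboot, then confirm the names held
        cat /proc/mdstat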

    Read the article

  • How to stop RAID5 array while it is shown to be busy?

    - by RCola
    I have a RAID5 array and need to stop it, but when I try to stop it I get an error.

    # cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid5 sde1[3](F) sdc1[4](F) sdf1[2] sdd1[1] 2120320 blocks level 5, 32k chunk, algorithm 2 [3/2] [_UU]
    unused devices: <none>

    # mdadm --stop
    mdadm: metadata format 00.90 unknown, ignored.
    mdadm: metadata format 00.90 unknown, ignored.
    mdadm: No devices given.

    # mdadm --stop /dev/md0
    mdadm: metadata format 00.90 unknown, ignored.
    mdadm: metadata format 00.90 unknown, ignored.
    mdadm: fail to stop array /dev/md0: Device or resource busy

    and

    # lsof | grep md0
    md0_raid5 965 root cwd DIR 8,1 4096 2 /
    md0_raid5 965 root rtd DIR 8,1 4096 2 /
    md0_raid5 965 root txt unknown /proc/965/exe

    # cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid5 sde1[3](F) sdc1[4](F) sdf1[2] sdd1[1] 2120320 blocks level 5, 32k chunk, algorithm 2 [3/2] [_UU]

    # grep md0 /proc/mdstat
    md0 : active raid5 sde1[3](F) sdc1[4](F) sdf1[2] sdd1[1]

    # grep md0 /proc/partitions
    9 0 2120320 md0

    While booting, md1 mounts OK but md0 fails for some unknown reason:

    # dmesg | grep md[0-9]
    [ 4.399658] raid5: allocated 3179kB for md1
    [ 4.400432] raid5: raid level 5 set md1 active with 3 out of 3 devices, algorithm 2
    [ 4.400678] md1: detected capacity change from 0 to 2121793536
    [ 4.403135] md1: unknown partition table
    [ 38.937932] Filesystem "md1": Disabling barriers, trial barrier write failed
    [ 38.941969] XFS mounting filesystem md1
    [ 41.058808] Ending clean XFS mount for filesystem: md1
    [ 46.325684] raid5: allocated 3179kB for md0
    [ 46.327103] raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
    [ 46.330620] md0: detected capacity change from 0 to 2171207680
    [ 46.335598] md0: unknown partition table
    [ 46.410195] md: recovery of RAID array md0
    [ 117.970104] md: md0: recovery done.

    # cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid5 sde1[0] sdf1[2] sdd1[1] 2120320 blocks level 5, 32k chunk, algorithm 2 [3/3] [UUU]
    md1 : active raid5 sdc2[0] sdf2[2] sde2[3](S) sdd2[1] 2072064 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]
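
    A hedged sketch of the usual way past "Device or resource busy", assuming md0 is not the root or boot device: the md0_raid5 entry in lsof is just the array's kernel thread, so the real holder is normally a mount, swap, or something stacked on top (LVM, dm-crypt). Nothing below is specific to this array; device names are examples:

    grep md0 /proc/mounts          # still mounted somewhere?
    cat /proc/swaps                # in use as swap?
    sudo fuser -vm /dev/md0        # processes with files open on it?
    ls /sys/block/md0/holders/     # LVM / device-mapper stacked on top?
    # release whichever applies (umount, swapoff, vgchange -an, ...), then:
    sudo mdadm --stop /dev/md0

    The repeated "metadata format 00.90 unknown, ignored" warning is probably unrelated; it usually points at a "metadata=00.90" entry in /etc/mdadm/mdadm.conf that newer mdadm versions want spelled "0.90".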

    Read the article

  • Has anybody tried to create a really big storage with ZFS and plain SAS controllers? [closed]

    - by Eccehomo
    I'm considering building one with something like this: http://www.supermicro.com/products/chassis/4U/847/SC847E26-R1400U.cfm (a chassis with two dual-port multipath expanders), http://www.supermicro.com/products/accessories/addon/AOC-SAS2LP-MV8.cfm (four 8-port plain SAS controllers, two for each backplane) and 36 Seagate 3TB SAS drives (ST33000650SS). OS: FreeBSD. The interesting questions: How well do expander SAS backplanes and multipath configurations work with FreeBSD? How do you locate a specific drive in the bay (literally, how do you blink an indicator on a drive in FreeBSD)? How do you detect a failed controller? Will it all work together? I'm asking anyone with experience of a similar build to share it.
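
    No first-hand numbers to offer, but a hedged sketch of how the drive-location part is typically handled on FreeBSD, assuming the expander backplane exposes a ses(4) enclosure device (device names below are examples only):

    camcontrol devlist             # map daX devices to controllers and enclosures
    gmultipath status              # GEOM multipath state, if it is used for the dual-ported drives
    sesutil map                    # FreeBSD 11+: list enclosure slots and the disks in them
    sesutil locate da12 on         # blink the locate LED on the slot holding da12
    sesutil locate da12 off

    On releases without sesutil, sg_ses from the sysutils/sg3_utils port can toggle the same SES locate bit. A failed controller generally shows up as its daX devices disappearing from camcontrol devlist and the corresponding gmultipath providers going DEGRADED.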

    Read the article

  • How do you re-mount an ext3 fs readwrite after it gets mounted readonly from a disk error?

    - by cagenut
    It's a relatively common problem when something goes wrong in a SAN: ext3 detects the disk write errors and remounts the filesystem read-only. That's all well and good, but once the SAN is fixed I can't figure out how to re-re-mount the filesystem read-write without rebooting. Behold:

    [root@localhost ~]# multipath -ll
    mpath0 (36001f93000a310000299000200000000) dm-2 XIOTECH,ISE1400
    [size=1.1T][features=1 queue_if_no_path][hwhandler=0][rw]
    \_ round-robin 0 [prio=2][active]
     \_ 1:0:0:1 sdb 8:16 [active][ready]
     \_ 2:0:0:1 sdc 8:32 [active][ready]
    [root@localhost ~]# mount /dev/mapper/mpath0 /mnt/foo
    [root@localhost ~]# touch /mnt/foo/blah

    All good. Now I yank the LUN out from under it.

    [root@localhost ~]# touch /mnt/foo/blah
    [root@localhost ~]# touch /mnt/foo/blah
    touch: cannot touch `/mnt/foo/blah': Read-only file system
    [root@localhost ~]# tail /var/log/messages
    Mar 18 13:17:33 localhost multipathd: sdb: tur checker reports path is down
    Mar 18 13:17:34 localhost multipathd: sdc: tur checker reports path is down
    Mar 18 13:17:35 localhost kernel: Aborting journal on device dm-2.
    Mar 18 13:17:35 localhost kernel: Buffer I/O error on device dm-2, logical block 1545
    Mar 18 13:17:35 localhost kernel: lost page write due to I/O error on dm-2
    Mar 18 13:17:36 localhost kernel: ext3_abort called.
    Mar 18 13:17:36 localhost kernel: EXT3-fs error (device dm-2): ext3_journal_start_sb: Detected aborted journal
    Mar 18 13:17:36 localhost kernel: Remounting filesystem read-only

    It only thinks it's read-only; in reality it's not even there.

    [root@localhost ~]# multipath -ll
    sdb: checker msg is "tur checker reports path is down"
    sdc: checker msg is "tur checker reports path is down"
    mpath0 (36001f93000a310000299000200000000) dm-2 XIOTECH,ISE1400
    [size=1.1T][features=0][hwhandler=0][rw]
    \_ round-robin 0 [prio=0][enabled]
     \_ 1:0:0:1 sdb 8:16 [failed][faulty]
     \_ 2:0:0:1 sdc 8:32 [failed][faulty]
    [root@localhost ~]# ll /mnt/foo/
    ls: reading directory /mnt/foo/: Input/output error
    total 20
    -rw-r--r-- 1 root root 0 Mar 18 13:11 bar

    How it still remembers that 'bar' file being there is a mystery, but not important right now. Now I re-present the LUN:

    [root@localhost ~]# tail /var/log/messages
    Mar 18 13:23:58 localhost multipathd: sdb: tur checker reports path is up
    Mar 18 13:23:58 localhost multipathd: 8:16: reinstated
    Mar 18 13:23:58 localhost multipathd: mpath0: queue_if_no_path enabled
    Mar 18 13:23:58 localhost multipathd: mpath0: Recovered to normal mode
    Mar 18 13:23:58 localhost multipathd: mpath0: remaining active paths: 1
    Mar 18 13:23:58 localhost multipathd: dm-2: add map (uevent)
    Mar 18 13:23:58 localhost multipathd: dm-2: devmap already registered
    Mar 18 13:23:59 localhost multipathd: sdc: tur checker reports path is up
    Mar 18 13:23:59 localhost multipathd: 8:32: reinstated
    Mar 18 13:23:59 localhost multipathd: mpath0: remaining active paths: 2
    Mar 18 13:23:59 localhost multipathd: dm-2: add map (uevent)
    Mar 18 13:23:59 localhost multipathd: dm-2: devmap already registered
    [root@localhost ~]# multipath -ll
    mpath0 (36001f93000a310000299000200000000) dm-2 XIOTECH,ISE1400
    [size=1.1T][features=1 queue_if_no_path][hwhandler=0][rw]
    \_ round-robin 0 [prio=2][enabled]
     \_ 1:0:0:1 sdb 8:16 [active][ready]
     \_ 2:0:0:1 sdc 8:32 [active][ready]

    Great, right? It says [rw] right there. Not so fast:

    [root@localhost ~]# touch /mnt/foo/blah
    touch: cannot touch `/mnt/foo/blah': Read-only file system

    OK, it doesn't do it automatically; I'll just give it a little push:

    [root@localhost ~]# mount -o remount /mnt/foo
    mount: block device /dev/mapper/mpath0 is write-protected, mounting read-only

    The hell you are:

    [root@localhost ~]# mount -o remount,rw /mnt/foo
    mount: block device /dev/mapper/mpath0 is write-protected, mounting read-only

    Noooooooooo. I have tried all sorts of mount/tune2fs/dmsetup commands and I cannot figure out how to un-flag the block device as write-protected. Rebooting will fix it, but I'd much rather do it online. An hour of googling has gotten me nowhere either. Save me, ServerFault.
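
    A hedged sketch of the recovery sequence that is usually suggested for this situation, assuming the paths really are healthy again: clear the read-only flag on the block devices, then unmount and fsck, because once ext3 has aborted its journal a plain remount,rw keeps failing even when the device is writable again. Device names below are the ones from this box and purely illustrative:

    # 1 means the kernel still flags the device read-only
    blockdev --getro /dev/mapper/mpath0
    blockdev --getro /dev/sdb
    blockdev --getro /dev/sdc
    # clear the flag on whichever devices report 1
    blockdev --setrw /dev/sdb
    blockdev --setrw /dev/sdc
    blockdev --setrw /dev/mapper/mpath0
    # the aborted ext3 journal will not recover in place: unmount, repair, remount
    umount /mnt/foo
    fsck.ext3 -f /dev/mapper/mpath0
    mount /dev/mapper/mpath0 /mnt/foo

    If blockdev --setrw is refused or the flag keeps coming back, reloading the multipath map (multipathd -k and then "reconfigure", or a dmsetup suspend/reload/resume cycle) is the next thing people try; a reboot remains the fallback.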

    Read the article

  • Diagnosing SAN connectivity issues (RHEL5)

    - by Matthew
    We are currently using GFS2 to share a SAN LUN between three servers. However, due to a feature problem with the vendor software we are using, we currently have the volume unmounted on two of the boxes and are instead exporting the GFS2 filesystem via NFS from the first one (the software requires some locking mechanics that GFS2 doesn't support). As of this morning, NFS was no longer able to read or write the volume from any of the servers, including the NFS server itself. I then checked the normal mount (the directory that is exported on the NFS server) and got an input/output error just trying to cd into it. When I ran multipath I got a device-mapper error; however, multipath -l worked just fine. I tried unmounting the GFS2 volume and the CLI hung. I ran init 0, which killed most services, but the shutdown then appeared to hang as well. I logged in via out-of-band access (HP iLO) and saw that the shutdown was stuck trying to unmount GFS2 volumes. My main priority was getting the box back online, so after about 5 minutes of waiting I did a hard reset. I am now trying to figure out what went wrong. What are the correct logs to investigate? I've never run into SAN issues like this before. The SAN is connected via two fibre connections. Any help would really be appreciated. Everything appears to be up and functional now.
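
    A hedged checklist of where to look first on RHEL5 for this kind of failure; nothing here is specific to this setup, and the paths are the stock RHEL5 locations:

    # kernel/SCSI/multipath errors around the time of the incident
    grep -iE 'scsi|sd[a-z]|dm-|multipath' /var/log/messages
    dmesg | tail -n 200
    # current state of the FC HBAs and the multipath map
    cat /sys/class/fc_host/host*/port_state
    multipath -ll
    dmsetup info -c
    # cluster side: fencing/DLM messages often explain a GFS2 unmount that hangs
    grep -iE 'fence|dlm|gfs' /var/log/messages

    If both fc_host ports show Online and multipath -ll now lists every path as [active][ready], the interesting evidence is most likely in /var/log/messages from the window when NFS first started failing.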

    Read the article

1 2 3  | Next Page >