Search Results

Search found 17 results on 1 page for 'megacli'.

Page 1/1

  • How to create Raid 10 with megacli

    - by Henno
    I have an OpenFiler storage server. Without installing Windows and MSM, I want to create a RAID 10 array from disks 2 to 21. I have already successfully installed MegaCli on OpenFiler but I'm stuck figuring out the correct command line for creating a RAID 10 array. The documentation says that the syntax for creating a RAID 10 is: MegaCli -CfgSpanAdd -r10 -Array0[E:S,E:S] -Array1[E:S,E:S] -aN My enclosure ID is 25, so: [root@linux-h5ut ~]# MegaCli -CfgSpanAdd -r10 -Array0[E25:S02,E25:S21] -Array1[E25:S02,E25:S21] WB Cached NoCachedBadBBU -a0 Invalid input at or near token E I have googled high and low but there doesn't seem to be any example of doing RAID 10 with MegaRAID (only the syntax). Can anyone explain what is wrong?

    Read the article
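
    The "Invalid input at or near token E" error suggests MegaCli is being fed the literal E/S letters from the syntax description; in practice the enclosure:slot pairs are given as plain numbers, and each span has to name different physical disks (the command above lists the same two slots in both arrays). A hedged sketch, with placeholder slot numbers standing in for the real disks 2 to 21:

      # E:S pairs are plain numbers; slots 2-5 below are placeholders only.
      MegaCli -CfgSpanAdd -r10 -Array0[25:2,25:3] -Array1[25:4,25:5] WB Cached NoCachedBadBBU -a0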

  • megacli forceWB

    - by Pascal den Bekker
    We are using a RAID controller: LSI Logic / Symbios Logic MegaRAID SAS 2008. There is no BBU on board; is there a way to force the WB cache? I tried the following: - /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp CachedBadBBU -L0 -a0 Failed to set Write Cache OK if bad BBU on Adapter 0, VD 0. FW error description: The requested command has invalid arguments. Can anyone help? Cheers, Pascal

    Read the article
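
    CachedBadBBU only selects the I/O policy (Cached vs Direct) when the battery is bad; the write policy itself is a separate property, and some MegaCli/firmware combinations additionally accept a ForcedWB option (an assumption here, it is not present on every build). A sketch:

      # Set the write policy, then allow it to stay active without a BBU; verify afterwards.
      /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp -WB -L0 -a0
      /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp -ForcedWB -Immediate -L0 -a0
      /opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -L0 -a0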

  • MegaCli newly created disk doesn't appear under /dev/sdX

    - by Henry-Nicolas Tourneur
    After successfully adding 2 new disks in a new RAID virtual drive (background initialization done), I would have expected it to appear under /dev/sdh, but it's not there (so, unusable). The system is running CentOS 5.2 64-bit, the HAL and udev daemons are running, there is no record of any sdh appearing in the message log file or in dmesg, and only MegaCli sees that virtual drive. Any idea? Some data: [root@server ~]# ./MegaCli -LDInfo -LALL -a0 Adapter 0 -- Virtual Drive Information: Virtual Disk: 0 (target id: 0) Name: RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0 Size:139392MB State: Optimal Stripe Size: 64kB Number Of Drives:2 Span Depth:1 Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU Access Policy: Read/Write Disk Cache Policy: Disk's Default Virtual Disk: 1 (target id: 1) Name: RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0 Size:285568MB State: Optimal Stripe Size: 64kB Number Of Drives:2 Span Depth:1 Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU Access Policy: Read/Write Disk Cache Policy: Disk's Default [root@server ~]# ls -l /dev/disk/by-id/scsi-360* lrwxrwxrwx 1 root root 9 Nov 17 2010 /dev/disk/by-id/scsi-36001ec90f82fe100108ca0a704098d09 -> ../../sda lrwxrwxrwx 1 root root 10 Nov 17 2010 /dev/disk/by-id/scsi-36001ec90f82fe100108ca0a704098d09-part1 -> ../../sda1 lrwxrwxrwx 1 root root 10 Nov 17 2010 /dev/disk/by-id/scsi-36001ec90f82fe100108ca0a704098d09-part2 -> ../../sda2 lrwxrwxrwx 1 root root 9 Nov 17 2010 /dev/disk/by-id/scsi-36090a028e0fe07e78f94940c0000a0ee -> ../../sdf lrwxrwxrwx 1 root root 10 Nov 17 2010 /dev/disk/by-id/scsi-36090a028e0fe07e78f94940c0000a0ee-part1 -> ../../sdf1 lrwxrwxrwx 1 root root 9 Nov 17 2010 /dev/disk/by-id/scsi-36090a028e0fe972a3f91240a0000005f -> ../../sdb lrwxrwxrwx 1 root root 10 Nov 17 2010 /dev/disk/by-id/scsi-36090a028e0fe972a3f91240a0000005f-part1 -> ../../sdb1 lrwxrwxrwx 1 root root 9 Nov 17 2010 /dev/disk/by-id/scsi-36090a028e0fea7e18f94640c000020ec -> ../../sde lrwxrwxrwx 1 root root 10 Nov 17 2010 /dev/disk/by-id/scsi-36090a028e0fea7e18f94640c000020ec-part1 -> ../../sde1 lrwxrwxrwx 1 root root 9 Nov 17 2010 /dev/disk/by-id/scsi-36090a028e0feb7da8f94340c0000203d -> ../../sdd lrwxrwxrwx 1 root root 10 Nov 17 2010 /dev/disk/by-id/scsi-36090a028e0feb7da8f94340c0000203d-part1 -> ../../sdd1 lrwxrwxrwx 1 root root 9 Nov 17 2010 /dev/disk/by-id/scsi-36090a028e0fed7d78f94040c000080b7 -> ../../sdc lrwxrwxrwx 1 root root 10 Nov 17 2010 /dev/disk/by-id/scsi-36090a028e0fed7d78f94040c000080b7-part1 -> ../../sdc1 lrwxrwxrwx 1 root root 9 Nov 17 2010 /dev/disk/by-id/scsi-36090a05830145e58e0b9c479000010a1 -> ../../sdg lrwxrwxrwx 1 root root 10 Nov 17 2010 /dev/disk/by-id/scsi-36090a05830145e58e0b9c479000010a1-part1 -> ../../sdg1

    Read the article
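
    On a 2.6 kernel a virtual drive created at runtime usually only shows up after the SCSI host is rescanned; looping over every host is the simplest form, since hosts with nothing new simply report nothing:

      # Rescan all SCSI hosts; the new VD should then appear in dmesg and as the next /dev/sdX.
      for h in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$h"; done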

  • Install MegaCli to Monitor Perc 5/i in Nexentastor 3

    - by Peter Valadez
    I have a Dell 2950 with a Perc 5/i RAID controller on which we've already installed Nexentastor 3 Community Edition. We set up a RAID 10 array and put a ZFS pool on top of the hardware RAID. As I understand, in this configuration ZFS/Nexentastor will not be able to tell when a disk fails in the array. Obviously, this is not optimal. Since the Dell Perc 5/i controller is a rebranded LSI controller, you should be able to use the MegaCli utility to manage the array and monitor its condition. I had seen in a separate forum that the Perc 5/i is very similar to the LSI MegaRAID 8480E, so I tried installing the MegaCli utility at the link below. However, I have not been able to successfully install the utility. http://www.lsi.com/support/products/Pages/MegaRAIDSAS8480E.aspx Here is what happened when I tried to install MegaCli: root@Nexenta2:/files# pkgadd -d MegaCli.pkg Warning: unable to relocate '$BASEDIR' mv: cannot move `solmegacli-8.02.16/' to a subdirectory of itself, `solmegacli-8.02.16//var/lib/dpkg/alien/solmegacli/reloc/solmegacli-8.02.16' mv: cannot move `solmegacli-8.02.16/' to a subdirectory of itself, `solmegacli-8.02.16//opt/solmegacli-8.02.16' 822-date: warning: This program is deprecated. Please use 'date -R' instead. 822-date: warning: This program is deprecated. Please use 'date -R' instead. solmegacli_8.02.16-1_all.deb generated (Reading database ... 41397 files and directories currently installed.) Preparing to replace solmegacli 8.02.16-1 (using solmegacli_8.02.16-1_all.deb) ... Unpacking replacement solmegacli ... Setting up solmegacli (8.02.16-1) ... In /var/logs/dpkg.log: 2012-03-23 20:40:19 status unpacked solmegacli 8.02.16-1 2012-03-23 20:40:19 configure solmegacli 8.02.16-1 8.02.16-1 2012-03-23 20:40:19 status unpacked solmegacli 8.02.16-1 2012-03-23 20:40:19 status half-configured solmegacli 8.02.16-1 2012-03-23 20:40:19 status installed solmegacli 8.02.16-1 So... I've got three questions: Is it possible to install and use MegaCli in Nexentastor 3? If so, how can I install MegaCli on Nexentastor 3? Suggestions welcome!!! If not, is there a better way to monitor the condition of the Perc 5/i hardware RAID? Our 2950 does have a DRAC card, so can I use that to monitor the RAID condition?

    Read the article
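
    Judging from the dpkg log quoted above, the converted package did install; with Solaris-style packages the binary normally lands somewhere under /opt rather than on the PATH, so it is worth locating it before treating the install as failed (the directory name below is an assumption based on the package name in the output):

      # Find whatever the solmegacli package dropped under /opt, then try it directly.
      find /opt -type f -name 'MegaCli*' 2>/dev/null
      /opt/solmegacli-8.02.16/MegaCli -AdpAllInfo -aALL   # hypothetical path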

  • MegaCLI always returns blank output

    - by JamesHannah
    This server is a Dell R200 running Ubuntu 8.04 LTS with an LSI SAS1068E RAID card supplied by Dell. I suspect there might be some kind of RAID issue with the hardware RAID built into the motherboard, but I can't seem to get MegaCLi to return any useful output: root@81 $ ./MegaCli -AdpAllInfo -aALL root@81 $ ./MegaCli -PDList -aALL root@81 $ The disks work and AFAIK the RAID software is installed correctly. I've also seen this issue on Red Hat boxes in the past. The RAID was initially set up through the BIOS on this server and appears to be functioning fine apart from this.

    Read the article
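
    The SAS1068E is an MPT Fusion part rather than a MegaRAID controller, so MegaCli has no adapter to talk to and exits silently; mpt-status (or LSI's lsiutil) is the usual tool for that chip. Package and module names below assume Ubuntu 8.04 and may differ:

      # mpt-status talks to the mptsas/mptctl driver instead of the MegaRAID ioctl interface.
      sudo apt-get install mpt-status
      sudo modprobe mptctl
      sudo mpt-status -p       # probe for the controller and report the volume id
      sudo mpt-status -i 0     # placeholder id; use the one the probe reports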

  • Megacli is killing me, any help appreciated

    - by Stefan
    I run a server with 2 drives in RAID 0, configured through the BIOS. I just added 2 more drives using hotplug (the server is a Dell R610 with RHEL 5.4 64-bit) and I would like to configure a separate RAID 0 array on these drives. I am getting the following error: /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd r0[32:2, 32:3] -a0 The specified physical disk does not have the appropriate attributes to complete the requested command. Exit Code: 0x26 All the parameters are correct and there is no obvious reason why this command should not work; see this (the Fujitsu drives are the current RAID, the Seagates are the new one I want to create): /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | egrep 'Adapter|Enclosure|Slot|Inquiry' Adapter #0 Enclosure Device ID: 32 Slot Number: 0 Enclosure position: 0 Inquiry Data: FUJITSU MBD2147RC D807D0A4PA101174 Enclosure Device ID: 32 Slot Number: 1 Enclosure position: 0 Inquiry Data: FUJITSU MBD2147RC D807D0A4PA10115T Enclosure Device ID: 32 Slot Number: 2 Enclosure position: 0 Inquiry Data: SEAGATE ST9300603SS FS033SE0TF5K Enclosure Device ID: 32 Slot Number: 3 Enclosure position: 0 Inquiry Data: SEAGATE ST9300603SS FS023SE070FK I also tried to set up the drive as a hot spare and got another strange error: /opt/MegaRAID/MegaCli/MegaCli64 -PDHSP -Set -physdrv[32:3] -a0 Adapter: 0: Set Physical Drive at EnclId-32 SlotId-3 as Hot Spare Failed. FW error description: The specified device is in a state that doesn't support the requested command. Exit Code: 0x32 As you can see, the disk is in the Unconfigured(good) state: Enclosure Device ID: 32 Slot Number: 3 Enclosure position: 0 Device Id: 3 Sequence Number: 1 Media Error Count: 0 Other Error Count: 0 Predictive Failure Count: 0 Last Predictive Failure Event Seq Number: 0 PD Type: SAS Raw Size: 279.396 GB [0x22ecb25c Sectors] Non Coerced Size: 278.896 GB [0x22dcb25c Sectors] Coerced Size: 278.875 GB [0x22dc0000 Sectors] Firmware state: Unconfigured(good), Spun Up SAS Address(0): 0x5000c50005cd20b1 SAS Address(1): 0x0 Connected Port Number: 3(path0) Inquiry Data: SEAGATE ST9300603SS FS023SE070FK FDE Capable: Not Capable FDE Enable: Disable Secured: Unsecured Locked: Unlocked Needs EKM Attention: No Foreign State: Foreign Foreign Secure: Drive is not secured by a foreign lock key Device Speed: Unknown Link Speed: Unknown Media Type: Hard Disk Device Drive Temperature :30C (86.00 F)

    Read the article
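
    The PDList output above ends with "Foreign State: Foreign", which is the attribute the error is complaining about: drives still carrying a configuration from a previous array can be neither added to a new config nor made hot spares until that is cleared. A sketch, assuming whatever was on the Seagates before is expendable:

      # Review, then wipe, the foreign configuration the new drives brought with them.
      /opt/MegaRAID/MegaCli/MegaCli64 -CfgForeign -Scan -a0
      /opt/MegaRAID/MegaCli/MegaCli64 -CfgForeign -Clear -a0
      # The original command should then go through:
      /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd r0[32:2,32:3] -a0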

  • Bad Blocks Exist in Virtual Device PERC H700 Integrated

    - by neoX
    I have a Dell server with a PERC H700 Integrated controller. I've made a RAID 5 with 12 hard drives and the virtual device is in the Optimal state, but I receive errors like these under Linux: sd 0:2:0:0: [sda] Unhandled error code sd 0:2:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00 sd 0:2:0:0: [sda] CDB: cdb[0]=0x88: 88 00 00 00 00 07 22 50 bd 98 00 00 00 08 00 00 end_request: I/O error, dev sda, sector 30640487832 sd 0:2:0:0: [sda] Unhandled error code sd 0:2:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00 sd 0:2:0:0: [sda] CDB: cdb[0]=0x88: 88 00 00 00 00 07 22 50 bd 98 00 00 00 08 00 00 end_request: I/O error, dev sda, sector 30640487832 sd 0:2:0:0: [sda] Unhandled error code sd 0:2:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00 sd 0:2:0:0: [sda] CDB: cdb[0]=0x88: 88 00 00 00 00 07 22 50 bc e0 00 00 01 00 00 00 end_request: I/O error, dev sda, sector 30640487648 But all disks are in Firmware state: Online, Spun Up. Also, there is not a single ATA read or write error on any disk in the RAID (I check them with smartctl -a -d sat+megaraid,N -H /dev/sda). The only strange thing is in the megacli output: megacli -LDInfo -L0 -a0 ... Bad Blocks Exist: Yes How can there be bad blocks in a virtual drive that is in the Optimal state when no disk is broken or even shows a single error? I tried a "Consistency Check", but it finished successfully and the errors are still in dmesg. Could someone help me figure out what is wrong with my RAID?

    Read the article
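
    "Bad Blocks Exist: Yes" refers to the controller's per-VD bad block table, which gets an entry whenever parity cannot reconstruct a stripe (a "punctured" stripe), so it can be set even though every member disk reports clean. Some MegaCli builds can dump that table (an assumption, the option is not documented everywhere), which at least shows whether the listed LBAs match the sectors in the dmesg errors:

      # Dump the bad block table for virtual drive 0, if the build supports it.
      megacli -GetBbtEntries -L0 -a0
      # A consistency check only verifies parity; the entries normally clear when the
      # affected LBAs are rewritten, e.g. by restoring the files that own them.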

  • Checking Exadata software versions and patches

    - by Liu Maclean
    ??check Exadata Image & OS versions , GI & DB patches sundiag exacheck cellserv ==> imageinfo dbhost ==> /usr/local/bin/imagehistory Also check the version of the switch. Login to Switch and execute the following command [root@myswitch-1 sbin]# version [root@dmorlsw-ib2 sbin]# cd /usr/local/bin [root@dmorlsw-ib2 bin]# ls -lrt version -rwxr-xr-x 1 root root 20356 Apr 4 2011 version Output will look as below. [root@dmorlsw-ib2 ~]# version SUN DCS 36p version: 1.3.3-2 Build time: Apr 4 2011 11:15:19 SP board info: Manufacturing Date: 2009.05.05 Serial Number: "NCD3X0178" Hardware Revision: 0x0006 Firmware Revision: 0x0102 BIOS version: NOW1R112 BIOS date: 04/24/2009 ib8# cat /sys/class/infiniband/is4_0/fw_ver 7.2.300 ib8 # cat /sys/class/dmi/id/bios_version NOW1R112 ib8 # nm2version NM2-36p version: 1.0.1-1 Build time: Sep 14 2009 12:52:51 ComExpress info: Manufacturing Date: 2009.08.19 Serial Number: Hardware Revision: 0x0006 Firmware Revision: 0x0102 { case `uname` in Linux ) ILOM="/usr/bin/ipmitool sunoem cli" ;; SunOS ) ILOM="/opt/ipmitool/bin/ipmitool sunoem cli" ;; esac ; ImageInfo="/opt/oracle.cellos/imageinfo" ; uname -srm ; head -1 /etc/*release ; uptime | cut -d, -f1 ; $ILOM "show /SP system_description system_identifier" | grep = ; $ImageInfo -activated -node -status -ver | grep -v ^$ ; } | tee /tmp/ExaInfo.log $GRID_HOME/OPatch/opatch lsinv -all -oh $GRID_HOME | tee /tmp/OPatchInv.log $ORACLE_HOME/OPatch/opatch lsinv -all | tee -a /tmp/OPatchInv.log cat /tmp/ExaInfo.log Linux 2.6.18-128.1.16.0.1.el5 x86_64 ==> /etc/enterprise-release <== Enterprise Linux Enterprise Linux Server release 5.3 (Carthage) ==> /etc/redhat-release <== Enterprise Linux Enterprise Linux Server release 5.3 (Carthage) 20:37:56 up 458 days system_description = SUN FIRE X4170 SERVER, ILOM v3.0.6.10.b, r52264 system_identifier = Sun Oracle Database Machine Active image version: 11.2.1.2.3 Active image activated: XXXX-XX-XX 12:27:12 +0800 Active image status: success Active node type: COMPUTE Inactive image version: undefined FileName: OPatchInv.log ---------------- ... Oracle Home       : /u01/app/11.2.0/grid Central Inventory : /u01/app/oraInventory   from           : /etc/oraInst.loc OPatch version    : 11.2.0.1.2 OUI version       : 11.2.0.1.0 OUI location      : /u01/app/11.2.0/grid/oui ... -------------------------------------------------------------------------------- List of Oracle Homes:   Name                                       Location   Ora11g_gridinfrahome1         /u01/app/11.2.0/grid   OraDb11g_home1                  /u01/app/oracle/product/11.2.0/dbhome_1 -------------------------------------------------------------------------------- Installed Top-level Products (1): Oracle Grid Infrastructure                                           11.2.0.1.0 ... Interim patches (2) : Patch  9524394      : applied on Thu Jun 03 20:46:05 CST 2010 ... {TRACKING BUG FOR 11.2.0.1 DB MACHINE BUNDLE PATCH 3} Patch  9455587      : applied on Fri Apr 02 18:27:47 CST 2010 ... {MERGE REQUEST ON TOP OF 11.2.0.1.0 FOR BUGS 8483425 8667622 8702731 8730804} Rac system comprising of multiple nodes  Local node = dbserv01  Remote node = dbserv02  Remote node = dbserv03  Remote node = dbserv04 -------------------------------------------------------------------------------- OPatch succeeded. ... Oracle Home       : /u01/app/oracle/product/11.2.0/dbhome_1 ... Oracle Database 11g                                                  11.2.0.1.0 ... 
Interim patches (5) : Patch  8888434      : applied on Sat Jan 08 00:27:33 CST 2011 ... {AIX-ASM-CF: LMHB TERMINATE INSTANCE WHEN OFFLINE ONE FAILGROUP IN ASM DG} Patch  8730312      : applied on Thu Jun 03 21:30:03 CST 2010 ... {FWD MERGE FOR BASE BUG 8715387 FOR 12G} Patch  9502717      : applied on Thu Jun 03 21:25:54 CST 2010 ... {LMS HIT ORA-600 [KJBLDRMNEXTPKEY:SEEN] AND CRASHED THE INSTANCE} { + same 2 as GI above} ?? cell server Cache Policy cell08# MegaCli64 -LDInfo -Lall -aALL | grep 'Current Cache Policy' Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU cell09# MegaCli64 -LDInfo -Lall -aALL | grep 'Current Cache Policy' Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU Cache policy is in WB Would recommend proactive  battery repalcement. Example : a. /opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp  -Cache -LALL -aALL ####( Will list the cache policy) b. /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp  -WB  -LALL -aALL ####( Will try to change teh policy from xx to WB)     So policy Change to WB will not come into effect immediately     Set Write Policy to WriteBack on Adapter 0, VD 0 (target id: 0) success     Battery capacity is below the threshold value ??cell BBU??????: cell08# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -a0 BBU status for Adapter: 0 BatteryType: iBBU Voltage: 4061 mV Current: 0 mA Temperature: 36 C BBU Firmware Status: Charging Status : None Voltage : OK Temperature : OK Learn Cycle Requested : No Learn Cycle Active : No Learn Cycle Status : OK Learn Cycle Timeout : No I2c Errors Detected : No Battery Pack Missing : No Battery Replacement required : No Remaining Capacity Low : Yes Periodic Learn Required : No Battery state: GasGuageStatus: Fully Discharged : No Fully Charged : Yes Discharging : Yes Initialized : Yes Remaining Time Alarm : No Remaining Capacity Alarm: No Discharge Terminated : No Over Temperature : No Charging Terminated : No Over Charged : No Relative State of Charge: 99 % Charger System State: 49168 Charger System Ctrl: 0 Charging current: 0 mA Absolute state of charge: 21 % Max Error: 2 % Exit Code: 0x00 ????BBU ??: dcli -g ~/cell_group -l root -t '{ uname -srm ; head -1 /etc/*release ; uptime | cut -d, -f1 ; imagehistory ; ipmitool sunoem cli "show /SP system_description system_identifier" | grep = ; ipmitool sunoem cli "show /SP/policy FLASH_ACCELERATOR_CARD_INSTALLED /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -a0 | egrep -i 'BBU|Battery|Charge:|Fully|Low|Learn' ; }' | tee /tmp/ExaInfo.log Target cells: ['cellserv01', 'cellserv02', 'cellserv03', 'cellserv04', 'cellserv05', 'cellserv06', 'cellserv07'] cellserv01: Linux 2.6.18-128.1.16.0.1.el5 x86_64 cellserv01: ==> /etc/enterprise-release <== cellserv01: Enterprise Linux Enterprise Linux Server release 5.3 (Carthage) cellserv01: cellserv01: ==> /etc/redhat-release <== cellserv01: Enterprise Linux Enterprise Linux Server release 5.3 (Carthage) cellserv01: 01:17:39 up 635 days cellserv01: Version : 11.2.1.2.1 cellserv01: Image activation date : 2011-03-25 11:59:34 -0800 cellserv01: Imaging mode : fresh cellserv01: Imaging status : success cellserv01: cellserv01: Version : 11.2.1.2.3 cellserv01: Image activation date : 2011-04-13 12:15:46 +0800 cellserv01: Imaging mode : patch cellserv01: Imaging status : success cellserv01: cellserv01: Version 
: 11.2.1.2.6 cellserv01: Image activation date : 2011-05-27 23:08:22 +0800 cellserv01: Imaging mode : patch cellserv01: Imaging status : success cellserv01: cellserv01: system_description = SUN FIRE X4275 SERVER, ILOM v3.0.6.10.b, r52264 cellserv01: system_identifier = Sun Oracle Database Machine cellserv01: Connected. Use ^D to exit. cellserv01: -> show /SP/policy FLASH_ACCELERATOR_CARD_INSTALLED cellserv01: show: No matching properties found. cellserv01: cellserv01: -> Session closed cellserv01: Disconnected cellserv01: BBU status for Adapter: 0 cellserv01: BatteryType: iBBU cellserv01: BBU Firmware Status: cellserv01: Learn Cycle Requested : No cellserv01: Learn Cycle Active : No cellserv01: Learn Cycle Status : OK cellserv01: Learn Cycle Timeout : No cellserv01: Battery Pack Missing : No cellserv01: Battery Replacement required : No cellserv01: Remaining Capacity Low : Yes cellserv01: Periodic Learn Required : No cellserv01: Battery state: cellserv01: Fully Discharged : No cellserv01: Fully Charged : Yes cellserv01: Relative State of Charge: 99 % cellserv01: Absolute state of charge: 21 % dcli -l root -g /root/all_group '/opt/MegaRAID/MegAaCli/MegaCli64 -AdpBbuCmd -a0' > BBU.out check ipmi: dcli -g ~/cell_group -l root -t '{ > ipmitool sunoem cli "show /SP/policy FLASH_ACCELERATOR_CARD_INSTALLED" | grep = ; MegaCli64 -LDInfo -Lall -aALL | grep 'Current Cache Policy' ; }' | tee /tmp/ExaCells.log

    Read the article

  • Is current SATA 6 gb/s equipment simply unreliable?

    - by korkman
    I have a 45-disk array of Seagate Barracuda 3 TB ST3000DM001 (yes, these are desktop drives, I'm aware of that) in a Supermicro sc847 JBOD, connected via an LSI 9285. I have found a solution for the problem described below by reducing speed via MegaCli -PhySetLinkSpeed -phy0 2 -a0; for i in $(seq 48); do MegaCli -PhySetLinkSpeed -phy${i} 2 -a0; done and rebooting. The question remains: Is this typical for current 6 gb/s equipment? Is this the sad state of SATA storage? Or is some of my equipment (the sff-8088 cables come to mind) bad? The problem was: Synchronizing HW RAID-6, disks kept offlining. Fetching SMART values revealed that the drives which offlined no longer increased their powered-on hours. That is, their firmware (CC4C) seems to crash. Digging into the matter by switching to Software RAID-6, with the disks passed through, I got tons of kernel messages scattered across all disks, with 6 gb/s: sd 0:0:9:0: [sdb] Sense Key : No Sense [current] Info fld=0x0 sd 0:0:9:0: [sdb] Add. Sense: No additional sense information And finally, when a disk offlines: megasas: [ 5]waiting for 160 commands to complete ... megasas: [35]waiting for 159 commands to complete ... megasas: [155]waiting for 156 commands to complete ... megaraid_sas: pending commands remain after waiting, will reset adapter. Ugly controller reset here, then minutes later: megaraid_sas: Reset successful. sd 0:0:28:0: Device offlined - not ready after error recovery ... sd 0:0:28:0: [sdu] Unhandled error code sd 0:0:28:0: [sdu] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK sd 0:0:28:0: [sdu] CDB: Read(10): 28 00 23 21 2f 40 00 00 70 00 sd 0:0:28:0: [sdu] killing request After reducing the speed to 3 gb/s as described above, all problems vanished.

    Read the article
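
    For the "is some of my equipment bad" part, it helps to confirm what each drive actually negotiated after the PhySetLinkSpeed change and whether error counters keep climbing on particular slots; both are visible in PDList, the same fields quoted in other posts on this page:

      # Negotiated speed and error counters, one block per drive.
      MegaCli -PDList -aALL | egrep 'Slot Number|Device Speed|Link Speed|Media Error|Other Error'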

  • Very slow disk performance on Dell PowerEdge 2950 w/ PERC 6/i running RAID 10

    - by vocoder
    I recently set up a server running Ubuntu 10.04 LTS on a Dell PowerEdge 2950 - it has six 500 GB 7200 RPM SATA drives set up in a RAID 10 config. I am seeing extremely poor disk performance - the RAID array reports all disks are normal and, using MegaCLI, it looks like the BBU is fine. hdparm -tT /dev/sda reports: Timing cached reads: 90 MB in 2.05 seconds = 43.96 MB/sec Timing buffered disk reads: 24 MB in 3.11 seconds = 7.72 MB/sec So as you can see, it takes forever to do something as simple as an apt-get upgrade, and even logging into the server is slow. How do I go about troubleshooting what is causing this? I upgraded the firmware on the PERC 6/i RAID controller to the latest, but didn't see any improvement.

    Read the article
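
    One cheap thing to rule out is the controller cache: a PERC 6/i virtual disk configured with "No Write Cache if Bad BBU" silently drops to WriteThrough whenever the battery is failed or in a learn cycle, and the read-ahead setting shows up in the same place. The commands below are the ones used in other posts on this page and assume the stock MegaCli install path:

      /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL | grep -i 'Cache Policy'
      /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -a0 | egrep -i 'Charge|Learn|Replacement'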

  • How to set up Dell OMSA tools on Debian 6 (Squeeze) (PE2950)

    - by javano
    I assume I have set them up incorrectly or am missing a library or dependency. When I log into the OMSA web interface I can't see anything. Also, omreport tells me nothing: root@box:~# omreport storage controller No controllers found I assume these two use the same source of information, so fixing whatever is wrong will fix them both. I set up OMSA as per these instructions. I have also compiled MegaCLI (as this is a PowerEdge 2950 with a Perc 6/i controller) and used it to update the RAID firmware, so that works, but the Dell tools don't. What have I missed during setup? root@kvm1:~# cat /etc/issue Debian GNU/Linux 6.0 \n \l root@kvm1:~# uname -a Linux kvm1.mivoice.cust.vostron.net 3.4.9 #1 SMP Wed Aug 22 19:08:46 BST 2012 x86_64 GNU/Linux

    Read the article
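
    "No controllers found" from omreport usually means the storage-management component of OMSA is missing or its services never started, rather than a controller fault. The package name and path below come from Dell's Linux repository and are an assumption for a Debian install:

      # Storage support is a separate OMSA component from the base web interface.
      apt-get install srvadmin-storageservices
      /opt/dell/srvadmin/sbin/srvadmin-services.sh restart
      omreport storage controller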

  • S.M.A.R.T - Predictive Failure Count

    - by Bastien974
    I'm monitoring my IBM ServeRAID M5015 controller for RAID status with MegaCLI, and I have this on one of the disks: Enclosure Device ID: 252 Slot Number: 6 Enclosure position: 0 Device Id: 14 Sequence Number: 2 Media Error Count: 32 Other Error Count: 0 Predictive Failure Count: 18 Last Predictive Failure Event Seq Number: 8119 PD Type: SAS Raw Size: 279.396 GB [0x22ecb25c Sectors] Non Coerced Size: 278.896 GB [0x22dcb25c Sectors] Coerced Size: 278.464 GB [0x22cee000 Sectors] Firmware state: Online, Spun Up SAS Address(0): 0x5000c50042c319c9 SAS Address(1): 0x0 Connected Port Number: 5(path0) Inquiry Data: IBM-ESXSST9300653SS B6336XN04HC10525B633 IBM FRU/CRU: 81Y9671 FDE Capable: Not Capable FDE Enable: Disable Secured: Unsecured Locked: Unlocked Needs EKM Attention: No Foreign State: None Device Speed: 6.0Gb/s Link Speed: 6.0Gb/s Media Type: Hard Disk Device Drive: Not Certified Drive Temperature :33 Celsius What does this mean exactly? I can't find an exact description; is there a way to get more details? The RAID array is in the Optimal state. Media Error Count: 32 Predictive Failure Count: 18 Is there a way through the CLI to power on the front LED so I physically know which disk I need to replace?

    Read the article
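
    For the last question, MegaCLI can blink the identify LED on a specific drive; using the enclosure and slot from the output above (252:6), a sketch:

      # Start blinking the locate LED on enclosure 252, slot 6...
      MegaCli -PdLocate -start -physdrv[252:6] -a0
      # ...and stop it once the drive has been swapped.
      MegaCli -PdLocate -stop -physdrv[252:6] -a0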

  • Why do load and SSD IOPS increase but CPU iowait decrease?

    - by mq44944
    There is a strange thing on my server which has a mysql running on it. The QPS is more than 4000 but TPS is less than 20. The server load is more than 80 and cpu usr is more than 86% but iowait is less than 8%. The disk iops is more than 16000 and util of disk is more than 99%. When the QPS decreases, the load decreases, the cpu iowait increases. I can't catch this! root@mypc # dmidecode | grep "Product Name" Product Name: PowerEdge R510 Product Name: 084YMW root@mypc # megacli -PDList -aALL |grep "Inquiry Data" Inquiry Data: SEAGATE ST3600057SS ES656SL316PT Inquiry Data: SEAGATE ST3600057SS ES656SL30THV Inquiry Data: ATA INTEL SSDSA2CW300362CVPR201602A6300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR2044037K300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR204402PX300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR204403WN300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR202000HU300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR202001E7300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR204402WE300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR204404E5300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR204401QF300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR20450001300EGN the mysql data files lie on the ssd disks which are organizaed using RAID 10. root@mypc # megacli -LDInfo -L1 -a0 Adapter 0 -- Virtual Drive Information: Virtual Disk: 1 (Target Id: 1) Name: RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0 Size:1427840MB State: Optimal Stripe Size: 64kB Number Of Drives:2 Span Depth:5 Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU Access Policy: Read/Write Disk Cache Policy: Disk's Default Exit Code: 0x00 -------- -----load-avg---- ---cpu-usage--- ---swap--- -------------------------io-usage----------------------- -QPS- -TPS- -Hit%- time | 1m 5m 15m |usr sys idl iow| si so| r/s w/s rkB/s wkB/s queue await svctm %util| ins upd del sel iud| lor hit| 09:05:29|79.80 64.49 42.00| 82 7 6 5| 0 0|16421.1 10.6262705.9 85.2 8.3 0.5 0.1 99.5| 0 0 0 3968 0| 495482 96.58| 09:05:30|79.80 64.49 42.00| 79 7 8 6| 0 0|15907.4 230.6254409.7 6357.5 8.4 0.5 0.1 98.5| 0 0 0 4195 0| 496434 96.68| 09:05:31|81.34 65.07 42.31| 81 7 7 5| 0 0|16198.7 8.6259029.2 99.8 8.1 0.5 0.1 99.3| 0 0 0 4220 0| 508983 96.70| 09:05:32|81.34 65.07 42.31| 82 7 5 5| 0 0|16746.6 8.7267853.3 92.4 8.5 0.5 0.1 99.4| 0 0 0 4084 0| 503834 96.54| 09:05:33|81.34 65.07 42.31| 81 7 6 5| 0 0|16498.7 9.6263856.8 92.3 8.0 0.5 0.1 99.3| 0 0 0 4030 0| 507051 96.60| 09:05:34|81.34 65.07 42.31| 80 8 7 6| 0 0|16328.4 11.5261101.6 95.8 8.1 0.5 0.1 98.3| 0 0 0 4119 0| 504409 96.63| 09:05:35|81.31 65.33 42.52| 82 7 6 5| 0 0|16374.0 8.7261921.9 92.5 8.1 0.5 0.1 99.7| 0 0 0 4127 0| 507279 96.66| 09:05:36|81.31 65.33 42.52| 81 8 6 5| 0 0|16496.2 8.6263832.0 84.5 8.5 0.5 0.1 99.2| 0 0 0 4100 0| 505054 96.59| 09:05:37|81.31 65.33 42.52| 82 8 6 4| 0 0|16239.4 9.6259768.8 84.3 8.0 0.5 0.1 99.1| 0 0 0 4273 0| 510621 96.72| 09:05:38|81.31 65.33 42.52| 81 7 6 5| 0 0|16349.6 8.7261439.2 81.4 8.2 0.5 0.1 98.9| 0 0 0 4171 0| 510145 96.67| 09:05:39|81.31 65.33 42.52| 82 7 6 5| 0 0|16116.8 8.7257667.6 96.5 8.0 0.5 0.1 99.1| 0 0 0 4348 0| 513093 96.74| 09:05:40|79.60 65.24 42.61| 79 7 7 7| 0 0|16154.2 242.9258390.4 6388.4 8.5 0.5 0.1 99.0| 0 0 0 4033 0| 507244 96.70| 09:05:41|79.60 65.24 42.61| 79 7 8 6| 0 0|16583.1 21.2265129.6 173.5 8.2 0.5 0.1 99.1| 0 0 0 3995 0| 501474 96.57| 09:05:42|79.60 65.24 42.61| 81 8 6 5| 0 0|16281.0 9.7260372.2 
69.5 8.3 0.5 0.1 98.7| 0 0 0 4221 0| 509322 96.70| 09:05:43|79.60 65.24 42.61| 80 7 7 6| 0 0|16355.3 8.7261515.5 104.3 8.2 0.5 0.1 99.6| 0 0 0 4087 0| 502052 96.62| -------- -----load-avg---- ---cpu-usage--- ---swap--- -------------------------io-usage----------------------- -QPS- -TPS- -Hit%- time | 1m 5m 15m |usr sys idl iow| si so| r/s w/s rkB/s wkB/s queue await svctm %util| ins upd del sel iud| lor hit| 09:05:44|79.60 65.24 42.61| 83 7 5 4| 0 0|16469.4 11.6263387.0 138.8 8.2 0.5 0.1 98.7| 0 0 0 4292 0| 509979 96.65| 09:05:45|79.07 65.37 42.77| 80 7 6 6| 0 0|16659.5 9.7266478.7 85.0 8.4 0.5 0.1 98.5| 0 0 0 3899 0| 496234 96.54| 09:05:46|79.07 65.37 42.77| 78 7 7 8| 0 0|16752.9 8.7267921.8 97.1 8.4 0.5 0.1 98.9| 0 0 0 4126 0| 508300 96.57| 09:05:47|79.07 65.37 42.77| 82 7 6 5| 0 0|16657.2 9.6266439.3 84.3 8.3 0.5 0.1 98.9| 0 0 0 4086 0| 502171 96.57| 09:05:48|79.07 65.37 42.77| 79 8 6 6| 0 0|16814.5 8.7268924.1 77.6 8.5 0.5 0.1 99.0| 0 0 0 4059 0| 499645 96.52| 09:05:49|79.07 65.37 42.77| 81 7 6 5| 0 0|16553.0 6.8264708.6 42.5 8.3 0.5 0.1 99.4| 0 0 0 4249 0| 501623 96.60| 09:05:50|79.63 65.71 43.01| 79 7 7 7| 0 0|16295.1 246.9260475.0 6442.4 8.7 0.5 0.1 99.1| 0 0 0 4231 0| 511032 96.70| 09:05:51|79.63 65.71 43.01| 80 7 6 6| 0 0|16568.9 8.7264919.7 104.7 8.3 0.5 0.1 99.7| 0 0 0 4272 0| 517177 96.68| 09:05:53|79.63 65.71 43.01| 79 7 7 6| 0 0|16539.0 8.6264502.9 87.6 8.4 0.5 0.1 98.9| 0 0 0 3992 0| 496728 96.52| 09:05:54|79.63 65.71 43.01| 79 7 7 7| 0 0|16527.5 11.6264363.6 92.6 8.5 0.5 0.1 98.8| 0 0 0 4045 0| 502944 96.59| 09:05:55|79.63 65.71 43.01| 80 7 7 6| 0 0|16374.7 12.5261687.2 134.9 8.6 0.5 0.1 99.2| 0 0 0 4143 0| 507006 96.66| 09:05:56|76.05 65.20 42.96| 77 8 8 8| 0 0|16464.9 9.6263314.3 111.9 8.5 0.5 0.1 98.9| 0 0 0 4250 0| 505417 96.64| 09:05:57|76.05 65.20 42.96| 79 7 6 7| 0 0|16460.1 8.8263283.2 93.4 8.3 0.5 0.1 98.8| 0 0 0 4294 0| 508168 96.66| 09:05:58|76.05 65.20 42.96| 80 7 7 7| 0 0|16176.5 9.6258762.1 127.3 8.3 0.5 0.1 98.9| 0 0 0 4160 0| 509349 96.72| 09:05:59|76.05 65.20 42.96| 75 7 9 10| 0 0|16522.0 10.7264274.6 93.1 8.6 0.5 0.1 97.5| 0 0 0 4034 0| 492623 96.51| -------- -----load-avg---- ---cpu-usage--- ---swap--- -------------------------io-usage----------------------- -QPS- -TPS- -Hit%- time | 1m 5m 15m |usr sys idl iow| si so| r/s w/s rkB/s wkB/s queue await svctm %util| ins upd del sel iud| lor hit| 09:06:00|76.05 65.20 42.96| 79 7 7 7| 0 0|16369.6 21.2261867.3 262.5 8.4 0.5 0.1 98.9| 0 0 0 4305 0| 494509 96.59| 09:06:01|75.33 65.23 43.09| 73 6 9 12| 0 0|15864.0 209.3253685.4 6238.0 10.0 0.6 0.1 98.7| 0 0 0 3913 0| 483480 96.62| 09:06:02|75.33 65.23 43.09| 73 7 8 12| 0 0|15854.7 12.7253613.2 93.6 11.0 0.7 0.1 99.0| 0 0 0 4271 0| 483771 96.64| 09:06:03|75.33 65.23 43.09| 75 7 9 9| 0 0|16074.8 8.7257104.3 81.7 8.1 0.5 0.1 98.5| 0 0 0 4060 0| 480701 96.55| 09:06:04|75.33 65.23 43.09| 76 7 8 9| 0 0|16221.7 9.7259500.1 139.4 8.1 0.5 0.1 97.6| 0 0 0 3953 0| 486774 96.56| 09:06:05|74.98 65.33 43.24| 78 7 8 8| 0 0|16330.7 8.7261166.5 85.3 8.2 0.5 0.1 98.5| 0 0 0 3957 0| 481775 96.53| 09:06:06|74.98 65.33 43.24| 75 7 9 9| 0 0|16093.7 11.7257436.1 93.7 8.2 0.5 0.1 99.2| 0 0 0 3938 0| 489251 96.60| 09:06:07|74.98 65.33 43.24| 75 7 5 13| 0 0|15758.9 19.2251989.4 188.2 14.7 0.9 0.1 99.7| 0 0 0 4140 0| 494738 96.70| 09:06:08|74.98 65.33 43.24| 69 7 10 15| 0 0|16166.3 8.7258474.9 81.2 8.9 0.5 0.1 98.7| 0 0 0 3993 0| 487162 96.58| 09:06:09|74.98 65.33 43.24| 74 7 9 10| 0 0|16071.0 8.7257010.9 93.3 8.2 0.5 0.1 99.2| 0 0 0 4098 0| 491557 96.61| 09:06:10|70.98 64.66 43.14| 71 7 9 
12| 0 0|15549.6 216.1248701.1 6188.7 8.3 0.5 0.1 97.8| 0 0 0 3879 0| 480832 96.66| 09:06:11|70.98 64.66 43.14| 71 7 10 13| 0 0|16233.7 22.4259568.1 257.1 8.2 0.5 0.1 99.2| 0 0 0 4088 0| 493200 96.62| 09:06:12|70.98 64.66 43.14| 78 7 8 7| 0 0|15932.4 10.6254779.5 108.1 8.1 0.5 0.1 98.6| 0 0 0 4168 0| 489838 96.63| 09:06:13|70.98 64.66 43.14| 71 8 9 12| 0 0|16255.9 11.5259902.3 103.9 8.3 0.5 0.1 98.0| 0 0 0 3874 0| 481246 96.52| 09:06:14|70.98 64.66 43.14| 60 6 16 18| 0 0|15621.0 9.7249826.1 81.9 8.0 0.5 0.1 99.3| 0 0 0 3956 0| 480278 96.65|

    Read the article

  • Using udev to create a character device based on a driver being loaded

    - by SteveCB
    I'm in the process of setting up RAID monitoring for a number of Dell servers that use the PERC 6i integrated card. We're using Nagios at present and the check_megasasctl plugin seems to fit the bill. However, the plugin relies upon the existence of: /dev/megaraid_sas_ioctl_node This device node doesn't exist by default, you have to create it by hand using something like: mknod /dev/megaraid_sas_ioctl_node c 253 0 Now, to make the existence of this device node persistent across reboots, I thought I could write a udev rule, but as usual, I'm missing something. I thought I could create a file such as /etc/udev/rules.d/10-local/rules that contained: DRIVER=="megasas" NAME="megaraid_sas_ioctl_node" MODE="0600" But this doesn't work - no device node after a reboot. Dmesg output indicates the megasas driver is loaded and functional: megasas: 00.00.04.01-RH1 Thu July 10 09:41:51 PST 2008 megasas: 0x1000:0x0060:0x1028:0x1f0c: bus 1:slot 0:func 0 megasas: FW now in Ready state Further, I don't see any means to instruct udev on which type of device node to create: character or block. I suspect I'm failing to understand exactly how udev is meant to work. I realise I could just cheat and run MegaCLI in /etc/rc.local, redirecting output to /dev/null; it creates the megaraid_sas_ioctl_node device node as part of its execution. I just thought using udev rules would be a) cleaner and b) a useful learning exercise. Perhaps I should just dump the above mknod command in /etc/rc.local... So how do I get udev to create the /dev/megaraid_sas_ioctl_node device node based on the presence of the megasas driver? Cheers Steve

    Read the article
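
    The DRIVER== rule never fires because megaraid_sas registers its management node with register_chrdev(), so no sysfs device or uevent is generated for udev to match, and the major number is assigned dynamically, which is what makes a hard-coded mknod 253 fragile. A boot-time sketch that reads the major from /proc/devices instead (the name looked up there is an assumption based on the driver source):

      # e.g. from /etc/rc.local: create the ioctl node with whatever major was assigned this boot.
      major=$(awk '$2 == "megaraid_sas_ioctl" {print $1}' /proc/devices)
      if [ -n "$major" ] && [ ! -c /dev/megaraid_sas_ioctl_node ]; then
          mknod /dev/megaraid_sas_ioctl_node c "$major" 0
          chmod 0600 /dev/megaraid_sas_ioctl_node
      fi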

  • LSI 9285-8e and Supermicro SC837E26-RJBOD1 duplicate enclosure ID and slot numbers

    - by Andy Shinn
    I am working with 2 x Supermicro SC837E26-RJBOD1 chassis connected to a single LSI 9285-8e card in a Supermicro 1U host. There are 28 drives in each chassis for a total of 56 drives in 28 RAID1 mirrors. The problem I am running in to is that there are duplicate slots for the 2 chassis (the slots list twice and only go from 0 to 27). All the drives also show the same enclosure ID (ID 36). However, MegaCLI -encinfo lists the 2 enclosures correctly (ID 36 and ID 65). My question is, why would this happen? Is there an option I am missing to use 2 enclosures effectively? This is blocking me rebuilding a drive that failed in slot 11 since I can only specify enclosure and slot as parameters to replace a drive. When I do this, it picks the wrong slot 11 (device ID 46 instead of device ID 19). Adapter #1 is the LSI 9285-8e, adapter #0 (which I removed due to space limitations) is the onboard LSI. Adapter information: Adapter #1 ============================================================================== Versions ================ Product Name : LSI MegaRAID SAS 9285-8e Serial No : SV12704804 FW Package Build: 23.1.1-0004 Mfg. Data ================ Mfg. Date : 06/30/11 Rework Date : 00/00/00 Revision No : 00A Battery FRU : N/A Image Versions in Flash: ================ BIOS Version : 5.25.00_4.11.05.00_0x05040000 WebBIOS Version : 6.1-20-e_20-Rel Preboot CLI Version: 05.01-04:#%00001 FW Version : 3.140.15-1320 NVDATA Version : 2.1106.03-0051 Boot Block Version : 2.04.00.00-0001 BOOT Version : 06.253.57.219 Pending Images in Flash ================ None PCI Info ================ Vendor Id : 1000 Device Id : 005b SubVendorId : 1000 SubDeviceId : 9285 Host Interface : PCIE ChipRevision : B0 Number of Frontend Port: 0 Device Interface : PCIE Number of Backend Port: 8 Port : Address 0 5003048000ee8e7f 1 5003048000ee8a7f 2 0000000000000000 3 0000000000000000 4 0000000000000000 5 0000000000000000 6 0000000000000000 7 0000000000000000 HW Configuration ================ SAS Address : 500605b0038f9210 BBU : Present Alarm : Present NVRAM : Present Serial Debugger : Present Memory : Present Flash : Present Memory Size : 1024MB TPM : Absent On board Expander: Absent Upgrade Key : Absent Temperature sensor for ROC : Present Temperature sensor for controller : Absent ROC temperature : 70 degree Celcius Settings ================ Current Time : 18:24:36 3/13, 2012 Predictive Fail Poll Interval : 300sec Interrupt Throttle Active Count : 16 Interrupt Throttle Completion : 50us Rebuild Rate : 30% PR Rate : 30% BGI Rate : 30% Check Consistency Rate : 30% Reconstruction Rate : 30% Cache Flush Interval : 4s Max Drives to Spinup at One Time : 2 Delay Among Spinup Groups : 12s Physical Drive Coercion Mode : Disabled Cluster Mode : Disabled Alarm : Enabled Auto Rebuild : Enabled Battery Warning : Enabled Ecc Bucket Size : 15 Ecc Bucket Leak Rate : 1440 Minutes Restore HotSpare on Insertion : Disabled Expose Enclosure Devices : Enabled Maintain PD Fail History : Enabled Host Request Reordering : Enabled Auto Detect BackPlane Enabled : SGPIO/i2c SEP Load Balance Mode : Auto Use FDE Only : No Security Key Assigned : No Security Key Failed : No Security Key Not Backedup : No Default LD PowerSave Policy : Controller Defined Maximum number of direct attached drives to spin up in 1 min : 10 Any Offline VD Cache Preserved : No Allow Boot with Preserved Cache : No Disable Online Controller Reset : No PFK in NVRAM : No Use disk activity for locate : No Capabilities ================ RAID Level Supported : RAID0, RAID1, RAID5, RAID6, 
RAID00, RAID10, RAID50, RAID60, PRL 11, PRL 11 with spanning, SRL 3 supported, PRL11-RLQ0 DDF layout with no span, PRL11-RLQ0 DDF layout with span Supported Drives : SAS, SATA Allowed Mixing: Mix in Enclosure Allowed Mix of SAS/SATA of HDD type in VD Allowed Status ================ ECC Bucket Count : 0 Limitations ================ Max Arms Per VD : 32 Max Spans Per VD : 8 Max Arrays : 128 Max Number of VDs : 64 Max Parallel Commands : 1008 Max SGE Count : 60 Max Data Transfer Size : 8192 sectors Max Strips PerIO : 42 Max LD per array : 16 Min Strip Size : 8 KB Max Strip Size : 1.0 MB Max Configurable CacheCade Size: 0 GB Current Size of CacheCade : 0 GB Current Size of FW Cache : 887 MB Device Present ================ Virtual Drives : 28 Degraded : 0 Offline : 0 Physical Devices : 59 Disks : 56 Critical Disks : 0 Failed Disks : 0 Supported Adapter Operations ================ Rebuild Rate : Yes CC Rate : Yes BGI Rate : Yes Reconstruct Rate : Yes Patrol Read Rate : Yes Alarm Control : Yes Cluster Support : No BBU : No Spanning : Yes Dedicated Hot Spare : Yes Revertible Hot Spares : Yes Foreign Config Import : Yes Self Diagnostic : Yes Allow Mixed Redundancy on Array : No Global Hot Spares : Yes Deny SCSI Passthrough : No Deny SMP Passthrough : No Deny STP Passthrough : No Support Security : No Snapshot Enabled : No Support the OCE without adding drives : Yes Support PFK : Yes Support PI : No Support Boot Time PFK Change : Yes Disable Online PFK Change : No PFK TrailTime Remaining : 0 days 0 hours Support Shield State : Yes Block SSD Write Disk Cache Change: Yes Supported VD Operations ================ Read Policy : Yes Write Policy : Yes IO Policy : Yes Access Policy : Yes Disk Cache Policy : Yes Reconstruction : Yes Deny Locate : No Deny CC : No Allow Ctrl Encryption: No Enable LDBBM : No Support Breakmirror : No Power Savings : Yes Supported PD Operations ================ Force Online : Yes Force Offline : Yes Force Rebuild : Yes Deny Force Failed : No Deny Force Good/Bad : No Deny Missing Replace : No Deny Clear : No Deny Locate : No Support Temperature : Yes Disable Copyback : No Enable JBOD : No Enable Copyback on SMART : No Enable Copyback to SSD on SMART Error : Yes Enable SSD Patrol Read : No PR Correct Unconfigured Areas : Yes Enable Spin Down of UnConfigured Drives : Yes Disable Spin Down of hot spares : No Spin Down time : 30 T10 Power State : Yes Error Counters ================ Memory Correctable Errors : 0 Memory Uncorrectable Errors : 0 Cluster Information ================ Cluster Permitted : No Cluster Active : No Default Settings ================ Phy Polarity : 0 Phy PolaritySplit : 0 Background Rate : 30 Strip Size : 64kB Flush Time : 4 seconds Write Policy : WB Read Policy : Adaptive Cache When BBU Bad : Disabled Cached IO : No SMART Mode : Mode 6 Alarm Disable : Yes Coercion Mode : None ZCR Config : Unknown Dirty LED Shows Drive Activity : No BIOS Continue on Error : No Spin Down Mode : None Allowed Device Type : SAS/SATA Mix Allow Mix in Enclosure : Yes Allow HDD SAS/SATA Mix in VD : Yes Allow SSD SAS/SATA Mix in VD : No Allow HDD/SSD Mix in VD : No Allow SATA in Cluster : No Max Chained Enclosures : 16 Disable Ctrl-R : Yes Enable Web BIOS : Yes Direct PD Mapping : No BIOS Enumerate VDs : Yes Restore Hot Spare on Insertion : No Expose Enclosure Devices : Yes Maintain PD Fail History : Yes Disable Puncturing : No Zero Based Enclosure Enumeration : No PreBoot CLI Enabled : Yes LED Show Drive Activity : Yes Cluster Disable : Yes SAS Disable : No Auto Detect BackPlane Enable 
: SGPIO/i2c SEP Use FDE Only : No Enable Led Header : No Delay during POST : 0 EnableCrashDump : No Disable Online Controller Reset : No EnableLDBBM : No Un-Certified Hard Disk Drives : Allow Treat Single span R1E as R10 : No Max LD per array : 16 Power Saving option : Don't Auto spin down Configured Drives Max power savings option is not allowed for LDs. Only T10 power conditions are to be used. Default spin down time in minutes: 30 Enable JBOD : No TTY Log In Flash : No Auto Enhanced Import : No BreakMirror RAID Support : No Disable Join Mirror : No Enable Shield State : Yes Time taken to detect CME : 60s Exit Code: 0x00 Enclosure information: # /opt/MegaRAID/MegaCli/MegaCli64 -encinfo -a1 Number of enclosures on adapter 1 -- 3 Enclosure 0: Device ID : 36 Number of Slots : 28 Number of Power Supplies : 2 Number of Fans : 3 Number of Temperature Sensors : 1 Number of Alarms : 1 Number of SIM Modules : 0 Number of Physical Drives : 28 Status : Normal Position : 1 Connector Name : Port B Enclosure type : SES VendorId is LSI CORP and Product Id is SAS2X36 VendorID and Product ID didnt match FRU Part Number : N/A Enclosure Serial Number : N/A ESM Serial Number : N/A Enclosure Zoning Mode : N/A Partner Device Id : 65 Inquiry data : Vendor Identification : LSI CORP Product Identification : SAS2X36 Product Revision Level : 0718 Vendor Specific : x36-55.7.24.1 Number of Voltage Sensors :2 Voltage Sensor :0 Voltage Sensor Status :OK Voltage Value :5020 milli volts Voltage Sensor :1 Voltage Sensor Status :OK Voltage Value :11820 milli volts Number of Power Supplies : 2 Power Supply : 0 Power Supply Status : OK Power Supply : 1 Power Supply Status : OK Number of Fans : 3 Fan : 0 Fan Speed :Low Speed Fan Status : OK Fan : 1 Fan Speed :Low Speed Fan Status : OK Fan : 2 Fan Speed :Low Speed Fan Status : OK Number of Temperature Sensors : 1 Temp Sensor : 0 Temperature : 48 Temperature Sensor Status : OK Number of Chassis : 1 Chassis : 0 Chassis Status : OK Enclosure 1: Device ID : 65 Number of Slots : 28 Number of Power Supplies : 2 Number of Fans : 3 Number of Temperature Sensors : 1 Number of Alarms : 1 Number of SIM Modules : 0 Number of Physical Drives : 28 Status : Normal Position : 1 Connector Name : Port A Enclosure type : SES VendorId is LSI CORP and Product Id is SAS2X36 VendorID and Product ID didnt match FRU Part Number : N/A Enclosure Serial Number : N/A ESM Serial Number : N/A Enclosure Zoning Mode : N/A Partner Device Id : 36 Inquiry data : Vendor Identification : LSI CORP Product Identification : SAS2X36 Product Revision Level : 0718 Vendor Specific : x36-55.7.24.1 Number of Voltage Sensors :2 Voltage Sensor :0 Voltage Sensor Status :OK Voltage Value :5020 milli volts Voltage Sensor :1 Voltage Sensor Status :OK Voltage Value :11760 milli volts Number of Power Supplies : 2 Power Supply : 0 Power Supply Status : OK Power Supply : 1 Power Supply Status : OK Number of Fans : 3 Fan : 0 Fan Speed :Low Speed Fan Status : OK Fan : 1 Fan Speed :Low Speed Fan Status : OK Fan : 2 Fan Speed :Low Speed Fan Status : OK Number of Temperature Sensors : 1 Temp Sensor : 0 Temperature : 47 Temperature Sensor Status : OK Number of Chassis : 1 Chassis : 0 Chassis Status : OK Enclosure 2: Device ID : 252 Number of Slots : 8 Number of Power Supplies : 0 Number of Fans : 0 Number of Temperature Sensors : 0 Number of Alarms : 0 Number of SIM Modules : 1 Number of Physical Drives : 0 Status : Normal Position : 1 Connector Name : Unavailable Enclosure type : SGPIO Failed in first Inquiry commnad FRU Part Number : 
N/A Enclosure Serial Number : N/A ESM Serial Number : N/A Enclosure Zoning Mode : N/A Partner Device Id : Unavailable Inquiry data : Vendor Identification : LSI Product Identification : SGPIO Product Revision Level : N/A Vendor Specific : Exit Code: 0x00 Now, notice that each slot 11 device shows an enclosure ID of 36, I think this is where the discrepancy happens. One should be 36. But the other should be on enclosure 65. Drives in slot 11: Enclosure Device ID: 36 Slot Number: 11 Drive's postion: DiskGroup: 5, Span: 0, Arm: 1 Enclosure position: 0 Device Id: 48 WWN: Sequence Number: 11 Media Error Count: 0 Other Error Count: 0 Predictive Failure Count: 0 Last Predictive Failure Event Seq Number: 0 PD Type: SATA Raw Size: 2.728 TB [0x15d50a3b0 Sectors] Non Coerced Size: 2.728 TB [0x15d40a3b0 Sectors] Coerced Size: 2.728 TB [0x15d400000 Sectors] Firmware state: Online, Spun Up Is Commissioned Spare : YES Device Firmware Level: A5C0 Shield Counter: 0 Successful diagnostics completion on : N/A SAS Address(0): 0x5003048000ee8a53 Connected Port Number: 1(path0) Inquiry Data: MJ1311YNG6YYXAHitachi HDS5C3030ALA630 MEAOA5C0 FDE Enable: Disable Secured: Unsecured Locked: Unlocked Needs EKM Attention: No Foreign State: None Device Speed: 6.0Gb/s Link Speed: 6.0Gb/s Media Type: Hard Disk Device Drive Temperature :30C (86.00 F) PI Eligibility: No Drive is formatted for PI information: No PI: No PI Drive's write cache : Disabled Drive's NCQ setting : Enabled Port-0 : Port status: Active Port's Linkspeed: 6.0Gb/s Drive has flagged a S.M.A.R.T alert : No Enclosure Device ID: 36 Slot Number: 11 Drive's postion: DiskGroup: 19, Span: 0, Arm: 1 Enclosure position: 0 Device Id: 19 WWN: Sequence Number: 4 Media Error Count: 0 Other Error Count: 0 Predictive Failure Count: 0 Last Predictive Failure Event Seq Number: 0 PD Type: SATA Raw Size: 2.728 TB [0x15d50a3b0 Sectors] Non Coerced Size: 2.728 TB [0x15d40a3b0 Sectors] Coerced Size: 2.728 TB [0x15d400000 Sectors] Firmware state: Online, Spun Up Is Commissioned Spare : NO Device Firmware Level: A580 Shield Counter: 0 Successful diagnostics completion on : N/A SAS Address(0): 0x5003048000ee8e53 Connected Port Number: 0(path0) Inquiry Data: MJ1313YNG1VA5CHitachi HDS5C3030ALA630 MEAOA580 FDE Enable: Disable Secured: Unsecured Locked: Unlocked Needs EKM Attention: No Foreign State: None Device Speed: 6.0Gb/s Link Speed: 6.0Gb/s Media Type: Hard Disk Device Drive Temperature :30C (86.00 F) PI Eligibility: No Drive is formatted for PI information: No PI: No PI Drive's write cache : Disabled Drive's NCQ setting : Enabled Port-0 : Port status: Active Port's Linkspeed: 6.0Gb/s Drive has flagged a S.M.A.R.T alert : No Update 06/28/12: I finally have some new information about (what we think) the root cause of this problem so I thought I would share. After getting in contact with a very knowledgeable Supermicro tech, they provided us with a tool called Xflash (doesn't appear to be readily available on their FTP). When we gathered some information using this utility, my colleague found something very strange: root@mogile2 test]# ./xflash.dat -i get avail Initializing Interface. Expander: SAS2X36 (SAS2x36) 1) SAS2X36 (SAS2x36) (50030480:00EE917F) (0.0.0.0) 2) SAS2X36 (SAS2x36) (50030480:00E9D67F) (0.0.0.0) 3) SAS2X36 (SAS2x36) (50030480:0112D97F) (0.0.0.0) This lists the connected enclosures. You see the 3 connected (we have since added a 3rd and a 4th which is not yet showing up) with their respective SAS address / WWN (50030480:00EE917F). 
Now we can use this address to get information on the individual enclosures: [root@mogile2 test]# ./xflash.dat -i 5003048000EE917F get exp Initializing Interface. Expander: SAS2X36 (SAS2x36) Reading the expander information.......... Expander: SAS2X36 (SAS2x36) B3 SAS Address: 50030480:00EE917F Enclosure Logical Id: 50030480:0000007F IP Address: 0.0.0.0 Component Identifier: 0x0223 Component Revision: 0x05 [root@mogile2 test]# ./xflash.dat -i 5003048000E9D67F get exp Initializing Interface. Expander: SAS2X36 (SAS2x36) Reading the expander information.......... Expander: SAS2X36 (SAS2x36) B3 SAS Address: 50030480:00E9D67F Enclosure Logical Id: 50030480:0000007F IP Address: 0.0.0.0 Component Identifier: 0x0223 Component Revision: 0x05 [root@mogile2 test]# ./xflash.dat -i 500304800112D97F get exp Initializing Interface. Expander: SAS2X36 (SAS2x36) Reading the expander information.......... Expander: SAS2X36 (SAS2x36) B3 SAS Address: 50030480:0112D97F Enclosure Logical Id: 50030480:0112D97F IP Address: 0.0.0.0 Component Identifier: 0x0223 Component Revision: 0x05 Did you catch it? The first 2 enclosures logical ID is partially masked out where the 3rd one (which has a correct unique enclosure ID) is not. We pointed this out to Supermicro and were able to confirm that this address is supposed to be set during manufacturing and there was a problem with a certain batch of these enclosures where the logical ID was not set. We believe that the RAID controller is determining the ID based on the logical ID and since our first 2 enclosures have the same logical ID, they get the same enclosure ID. We also confirmed that 0000007F is the default which comes from LSI as an ID. The next pointer that helps confirm this could be a manufacturing problem with a run of JBODs is the fact that all 6 of the enclosures that have this problem begin with 00E. I believe that between 00E8 and 00EE Supermicro forgot to program the logical IDs correctly and neglected to recall or fix the problem post production. Fortunately for us, there is a tool to manage the WWN and logical ID of the devices from Supermicro: ftp://ftp.supermicro.com/utility/ExpanderXtools_Lite/. Our next step is to schedule a shutdown of these JBODs (after data migration) and reprogram the logical ID and see if it solves the problem. Update 06/28/12 #2: I just discovered this FAQ at Supermicro while Google searching for "lsi 0000007f": http://www.supermicro.com/support/faqs/faq.cfm?faq=11805. I still don't understand why, in the last several times we contacted Supermicro, they would have never directed us to this article :\

    Read the article

  • Quantifying the effects of partition mis-alignment

    - by Matt
    I'm experiencing some significant performance issues on an NFS server. I've been reading up a bit on partition alignment, and I think I have my partitions mis-aligned. I can't find anything that tells me how to actually quantify the effects of mis-aligned partitions. Some of the general information I found suggests the performance penalty can be quite high (upwards of 60%) and others say it's negligible. What I want to do is determine if partition alignment is a factor in this server's performance problems or not; and if so, to what degree? So I'll put my info out here, and hopefully the community can confirm if my partitions are indeed mis-aligned, and if so, help me put a number to what the performance cost is. Server is a Dell R510 with dual E5620 CPUs and 8 GB RAM. There are eight 15k 2.5” 600 GB drives (Seagate ST3600057SS) configured in hardware RAID-6 with a single hot spare. RAID controller is a Dell PERC H700 w/512MB cache (Linux sees this as a LSI MegaSAS 9260). OS is CentOS 5.6, home directory partition is ext3, with options “rw,data=journal,usrquota”. I have the HW RAID configured to present two virtual disks to the OS: /dev/sda for the OS (boot, root and swap partitions), and /dev/sdb for a big NFS share: [root@lnxutil1 ~]# parted -s /dev/sda unit s print Model: DELL PERC H700 (scsi) Disk /dev/sda: 134217599s Sector size (logical/physical): 512B/512B Partition Table: msdos Number Start End Size Type File system Flags 1 63s 465884s 465822s primary ext2 boot 2 465885s 134207009s 133741125s primary lvm [root@lnxutil1 ~]# parted -s /dev/sdb unit s print Model: DELL PERC H700 (scsi) Disk /dev/sdb: 5720768639s Sector size (logical/physical): 512B/512B Partition Table: gpt Number Start End Size File system Name Flags 1 34s 5720768606s 5720768573s lvm Edit 1 Using the cfq IO scheduler (default for CentOS 5.6): # cat /sys/block/sd{a,b}/queue/scheduler noop anticipatory deadline [cfq] noop anticipatory deadline [cfq] Chunk size is the same as strip size, right? If so, then 64kB: # /opt/MegaCli -LDInfo -Lall -aALL -NoLog Adapter #0 Number of Virtual Disks: 2 Virtual Disk: 0 (target id: 0) Name:os RAID Level: Primary-6, Secondary-0, RAID Level Qualifier-3 Size:65535MB State: Optimal Stripe Size: 64kB Number Of Drives:7 Span Depth:1 Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU Access Policy: Read/Write Disk Cache Policy: Disk's Default Number of Spans: 1 Span: 0 - Number of PDs: 7 ... physical disk info removed for brevity ... Virtual Disk: 1 (target id: 1) Name:share RAID Level: Primary-6, Secondary-0, RAID Level Qualifier-3 Size:2793344MB State: Optimal Stripe Size: 64kB Number Of Drives:7 Span Depth:1 Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU Access Policy: Read/Write Disk Cache Policy: Disk's Default Number of Spans: 1 Span: 0 - Number of PDs: 7 If it's not obvious, virtual disk 0 corresponds to /dev/sda, for the OS; virtual disk 1 is /dev/sdb (the exported home directory tree).

    Read the article
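
    To put a number on the mis-alignment itself, the partition start sectors from the parted output can be checked against the 64 kB stripe size MegaCli reports (128 sectors of 512 bytes), remembering that LVM adds its own pe_start offset before the first extent. A quick worked check:

      echo $(( 465885 % 128 ))   # /dev/sda2 start -> 93, so not stripe-aligned
      echo $(( 34 % 128 ))       # /dev/sdb1 start -> 34, also not stripe-aligned
      # The data actually begins at partition start + pe_start; show pe_start per PV:
      pvs -o +pe_start --units s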

  • Volume group disappeared, LVs still available

    - by Ben
    I've run into an issue with my KVM host which runs VMs on a LVM volume. As of last night the logical volumes are no longer seen as such (I can't create snapshots of them even though I have been for months now). Running any scans all result in nothing being found: [root@apollo ~]# pvscan No matching physical volumes found [root@apollo ~]# vgscan Reading all physical volumes. This may take a while... No volume groups found root@apollo ~]# lvscan No volume groups found If I try restoring the VG conf backup from /etc/lvm/backups/vg0 I get the following error: [root@apollo ~]# vgcfgrestore -f /etc/lvm/backup/vg0 vg0 Couldn't find device with uuid 20zG25-H8MU-UQPf-u0hD-NftW-ngsC-mG63dt. Cannot restore Volume Group vg0 with 1 PVs marked as missing. Restore failed. /etc/lvm/backups/vg0 has the following for the physical volume: physical_volumes { pv0 { id = "20zG25-H8MU-UQPf-u0hD-NftW-ngsC-mG63dt" device = "/dev/sda5" # Hint only status = ["ALLOCATABLE"] flags = [] dev_size = 4292870143 # 1.99902 Terabytes pe_start = 384 pe_count = 524031 # 1.99902 Terabytes } } fdisk -l /dev/sda shows the following: [root@apollo ~]# fdisk -l /dev/sda Disk /dev/sda: 6000.1 GB, 6000069312512 bytes 64 heads, 32 sectors/track, 5722112 cylinders Units = cylinders of 2048 * 512 = 1048576 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x000188b7 Device Boot Start End Blocks Id System /dev/sda1 2 32768 33553408 82 Linux swap / Solaris /dev/sda2 32769 33280 524288 83 Linux /dev/sda3 33281 1081856 1073741824 83 Linux /dev/sda4 1081857 3177984 2146435072 85 Linux extended /dev/sda5 1081857 3177984 2146435071+ 8e Linux LVM The server is running a 4 disk HW RAID10 which seems perfectly healthy according to megacli and smartd. The only odd message in /var/log/messages is the following which shows up every couple of hours: Jun 10 09:41:57 apollo udevd[527]: failed to create queue file: No space left on device Output of df -h [root@apollo ~]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda3 1016G 119G 847G 13% / /dev/sda2 508M 67M 416M 14% /boot Does anyone have any ideas what to do next? The VMs are all running fine at the moment apart from not being able to snapshot them. Updated with extra info It's not a lack of inodes: [root@apollo ~]# df -i Filesystem Inodes IUsed IFree IUse% Mounted on /dev/sda3 67108864 48066 67060798 1% / /dev/sda2 32768 47 32721 1% /boot pvs, vgs & lvs either output nothing or "No volume groups found".

    Read the article
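
    Before attempting any repair it is worth confirming whether the PV label on /dev/sda5 is really gone and that lvm.conf is not simply filtering the device out; everything below is a sketch, and the pvcreate step rewrites the PV header, so it should only be run against the verified metadata backup (ideally after imaging the disk):

      # Does the start of /dev/sda5 still contain the expected PV UUID?
      dd if=/dev/sda5 bs=512 count=2048 2>/dev/null | strings | grep -c 20zG25
      # Any filter in lvm.conf that could hide the device?
      grep -E '^[[:space:]]*filter' /etc/lvm/lvm.conf
      # If the label is truly gone but the backup is intact, the documented recovery path is:
      pvcreate --uuid "20zG25-H8MU-UQPf-u0hD-NftW-ngsC-mG63dt" --restorefile /etc/lvm/backup/vg0 /dev/sda5
      vgcfgrestore -f /etc/lvm/backup/vg0 vg0
      vgchange -ay vg0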
