iops - Page 2 - Developer IT

Why your Netapp is so slow...

- by Darius Zanganeh

Have you ever wondered why your Netapp FAS box is slow and doesn't perform well at large block workloads? In this blog entry I will give you a little bit of information that will probably help you understand why it’s so slow, why you shouldn't use it for applications that read and write in large blocks like 64k, 128k, 256k ++ etc.. Of course since I work for Oracle at this time, I will show you why the ZS3 storage boxes are excellent choices for these types of workloads. Netapp’s Fundamental Problem The fundamental problem you have running these workloads on Netapp is the backend block size of their WAFL file system. Every application block on a Netapp FAS ends up in a 4k chunk on a disk. Reference: Netapp TR-3001 Whitepaper Netapp has proven this lacking large block performance fact in at least two different ways. They have NEVER posted an SPC-2 Benchmark yet they have posted SPC-1 and SPECSFS, both recently. In 2011 they purchased Engenio to try and fill this GAP in their portfolio. Block Size Matters So why does block size matter anyways? Many applications use large block chunks of data especially in the Big Data movement. Some examples are SAS Business Analytics, Microsoft SQL, Hadoop HDFS is even 64MB! Now let me boil this down for you. If an application such MS SQL is writing data in a 64k chunk then before Netapp actually writes it on disk it will have to split it into 16 different 4k writes and 16 different disk IOPS. When the application later goes to read that 64k chunk the Netapp will have to again do 16 different disk IOPS. In comparison the ZS3 Storage Appliance can write in variable block sizes ranging from 512b to 1MB. So if you put the same MSSQL database on a ZS3 you can set the specific LUNs for this database to 64k and then when you do an application read/write it requires only a single disk IO. That is 16x faster! But, back to the problem with your Netapp, you will VERY quickly run out of disk IO and hit a wall. Now all arrays will have some fancy pre fetch algorithm and some nice cache and maybe even flash based cache such as a PAM card in your Netapp but with large block workloads you will usually blow through the cache and still need significant disk IO. Also because these datasets are usually very large and usually not dedupable they are usually not good candidates for an all flash system. You can do some simple math in excel and very quickly you will see why it matters. Here are a couple of READ examples using SAS and MSSQL. Assume these are the READ IOPS the application needs even after all the fancy cache and algorithms. Here is an example with 128k blocks. Notice the numbers of drives on the Netapp! Here is an example with 64k blocks You can easily see that the Oracle ZS3 can do dramatically more work with dramatically less drives. This doesn't even take into account that the ONTAP system will likely run out of CPU way before you get to these drive numbers so you be buying many more controllers. So with all that said, lets look at the ZS3 and why you should consider it for any workload your running on Netapp today. ZS3 World Record Price/Performance in the SPC-2 benchmark ZS3-2 is #1 in Price Performance $12.08ZS3-2 is #3 in Overall Performance 16,212 MBPS Note: The number one overall spot in the world is held by an AFA 33,477 MBPS but at a Price Performance of $29.79. A customer could purchase 2 x ZS3-2 systems in the benchmark with relatively the same performance and walk away with $600,000 in their pocket.

Read the article

Flash Technology Can Revolutionize your IT Infrastructure

- by kimberly.billings

A recent article in the Data Center Journal written by Mark Teter outlines how flash is becoming a disruptive technology in the data center and how it will soon replace HDDs in the storage hierarchy. As Teter explains, the drivers behind this trend are lower cost/performance and power savings; flash is over 100x faster for reads than the fastest HDD, and while it is expensive, it can produce dramatic reductions in the cost of performance as measured in Input/Outputs per second (IOPS). What's more, flash consumes 1/5th the power of HDD, so it's faster AND greener. Teter writes, "when appropriately used, flash turns the current economics of IT performance on its head. That's disruptive." Exadata Smart Flash Cache in the Sun Oracle Database Machine makes intelligent use of flash storage to deliver extreme performance for OLTP and mixed workloads. It intelligently caches data from the Oracle Database replacing slow mechanical I/O operations to disk with very rapid flash memory operations. Exadata Smart Flash Cache is the fundamental technology of the Sun Oracle Database Machine that enables the processing of up to 1 million random I/O operations per second (IOPS), and the scanning of data within Exadata storage at up to 50 GB/second. Are you incorporating flash into your storage strategy? Let us know! Read more: "Flash technology can revolutionize your IT infrastructure", The Data Center Journal, March 30, 2010. Exadata Smart Flash Cache and the Sun Oracle Database Machine white paper var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); try { var pageTracker = _gat._getTracker("UA-13185312-1"); pageTracker._trackPageview(); } catch(err) {}

Read the article

Ginormous...the new Oracle T5-8 SPARC SuperCluster!

- by user12608550

Ginormous...no other way to describe it...the new Oracle T5-8 SPARC SuperCluster...2000+ fast CPU threads, massive memory (DRAM & Flash), 1.2 M IOPS, HUGE storage and bandwidth...WOW! Wanna build a SPARC Cloud? This is it! Multiple virtualization technologies (VM Server for SPARC, and Solaris zones) for elasticity and resource pooling along with Oracle Enterprise Manager Ops Center 12c providing full cloud management capability. Check it out!: ORACLE SUPERCLUSTER T5-8 Overview and Frequently Asked Questions Oracle SuperCluster T5-8 Oracle T5-8 SuperCluster E-Book

Read the article

Announcing: Oracle's Sun Flash Accelerator F80 PCIe Card

- by uwes

Ramp Up Your Server Performance with Oracle's Sun Flash Accelerator F80 PCIe Card! Oracle’s Sun Flash Accelerator F80 PCIe Card accelerates IO-starved applications and server performance by reducing storage latencies and increasing I/O throughput for greater productivity and business response! Sun Flash Accelerator F80 PCIe Card offers the following: Helps servers and their applications run faster and more efficient, while reducing power and space With 800GB capacity, delivers 2x the capacity of the previous F40 Flash Card for less than half the $/GB Accelerates I/O constrained databases with increased IOPS and consistent low-latency response timers Current and planned server support includes: The F80 is currently supported in Oracle’s SPARC T4-1, T4-2 and X4-2L servers. SPARC T5, M5, M6 and Fujitsu M10 server support is planned for December 2013 (Preliminary only) Please read the Sales Bulletin on Oracle HW TRC for more details. (If you are not registered on Oracle HW TRC, click here ... and follow the instructions..) For More Information Go To: Oracle.com Flash Page Oracle Technology Network Flash Page

Read the article

Samsung 830 very slow benchmark numbers

- by alekop

I just bought a new SSD, and installed a fresh copy of Windows on it. I didn't see any noticeable difference in boot times, app start-up times, so I decided to benchmark it. Asus P7P55D-E Intel i5-760 Samsung 830 256GB SATA III Windows 7 Ultimate 64-bit The Windows experience index gave the drive a 7.3 rating, but real-world performance is not particularly impressive. Any ideas why the numbers are so low? UPDATE: It turns out that SATA III support is turned off by default on the P7P55D motherboard. After enabling it in BIOS (Tools - Level Up), the scores went up: Read Write Seq 325 183 4K 16 49 IOPS 32K 28K It's an improvement, but still far below what they should be for this drive.

Read the article

Database Server Hardware components (order of importance), CPU speed VS CPU cache vs RAM vs DISK

- by nulltorpedo

I am new to database world and would like to know what are crucial hardware specs when it comes to database performance. I have searched the internet and found this so far (In order of decreasing importance): 1) Hard Disk: Get an SSD basically (much more IOPS than spinners) 2) Memory: Get as much as you can afford 3) CPU: For the same $ spent, prefer larger cache size over speed. Are these findings sensible? EDIT: I would like to focus on CPU speed VS CPU cache size. EDIT2: The database is used to store some combination of ints and int arrays with few text fields. There are a lot of Select queries looking for existing entries. If entry is not found, then insert it. I would say most of processing would be trying to find a match across a table with 200 columns and 20k rows. The insert statements are very few. EDIT3: Also, we have a lot of views (basically select queries).

Read the article

Sharing storage on Linux and Solaris

- by devlearn

I'm looking for a solution in order to share a san mounted volume between several hosts running on Linux (RHEL) and/or Solaris (Sparc). Note that I basically need to share a set of directories containing large binary files that are accessed in random R/W mode. I have the following reqs : keep the data on the SAN suitable i/o performances as the software is pretty demanding on IOPS stick to a shared file system as I can't afford a cluster fs (lack of MDS/OSS infrastructure) compression could be really usefull For now I've found only the following candidates : GFS2 , supports Linux only, no compression VxFS , supports Linux and Solaris, compression supported So if you have some suggestions for this list, I'll really welcome them. Thanks in advance,

Read the article

Help with proposed iSCSI SAN VMware implementation.

- by obsidian

We have four (with plans to grow to four more) Dell servers with 6 NICs. They are running VMware ESXi 4.1. We would like to connect all of them to an Openfiler iSCSI SAN via HP ProCurve 1810G switches. Based on the design below, is there anything I should be concerned about or anything unusual that I should look out for when making the iSCSI network configurations on the servers, switches and OpenFiler? Should I bond the connections on the servers or simply setup them up for failover? The primary goal is to maximize IOPS. Thanks in advance.

Read the article

Is the sql backend right choice for LDAP?

- by skomak

Hi, I have felt some troubles with LDAP dif database after unexpected system reboots. This databse was only read so it is confused why database have had errors. So im searching for replacement of this database. I think SQL would be more reliable. What do you think, is it? I need to know how much performance loss i'll meet then. How many more IOPS(I/O per second) in percentage I loss too. Thanks in advance, skomak

Read the article

Backup Mongodb on EC2 through EBS snapshots - timing issue

- by DmitrySemenov

I'm following this guidance http://docs.mongodb.org/ecosystem/tutorial/backup-and-restore-mongodb-on-amazon-ec2/ I have 4 EBS 1000 IOPS volumes assigned to instance These 4 volumes through MDADM assembled into software RAID10 array. I want to do backups through EBS Snapshots as explained in the article above Question: Mongodb says - that I need to mongo shelldb.runCommand({fsync:1,lock:1}); -- this will lock the db for writing ....run snapshot creation... mongo shell db.$cmd.sys.unlock.findOne(); -- this will unlock the db for writing so do I need to unlock the DB for writing after I issued the comand ec2-create-snapshot or after it's finished and the actual snapshot is created thanks, Dmitry

Read the article

Storage sizing for virtual machines

- by njo

I am currently doing research to determine the consolidation ratio my company could expect should we start using a virtualization platform. I find myself continually running into a dead end when researching how to translate observed performance (weeks of perfmon data) to hdd array requirements for a virtualization server. I am familiar with the concept of IOPs, but they seem to be an overly simplistic measurement that fails to take into account cache, write combining, etc. Is there a seminal work on storage array performance analysis that I'm missing? This seems like an area where hearsay and 'black magic' have taken over for cold, hard fact.

Read the article

Vmware peaks NFS load every 30 seconds

- by gtirloni

We were troubleshooting a performance problem on one of our storage servers and after investigating almost everything in sight we saw that every 30 seconds, Vmware would go from 10k IOPS (NFS) to 30k, 50k, 100k or whatever the server would handle. Most of it were reads. What could cause this raise in NFS operations per second every 30 seconds? The virtual machines are managed by external customers and there isn't much in common between them. While breaking utilization down by filename, we discovered 5-10 virtual machines that contributed more to those peaks but it still doesn't explain why every 30 seconds. There are no other peaks outside that 30 sec period (ie. it stays in an almost constant average). Is there an NFS tweak in Vmware to change that 30 second period? If that's really necessary, we would like to introduce some variation so all that workload isn't dropped all at once. It's causing NFS timeout on the ESX 3.5/4.0 hosts when the storage gets overloaded.

Read the article

New Write Flash SSDs and more disk trays

- by Steve Tunstall

In case you haven't heard, the Write SSDs the ZFSSA have been updated. Much faster now for the same price. Sweet. The new write-flash SSDs have a new part number of 7105026 , so make sure you order the right ones. It's important to note that you MUST be on code level 2011.1.4.0 or higher to use these. They have increased in IOPS from 6,000 to 11,000, and increased throughput from 200MB/s to 350MB/s. Also, you can now add six SAS HBAs (up from 4) to the 7420, allowing one to have three SAS channels with 12 disk trays each, for a new total of 36 disk trays. With 3TB drives, that's 2.5 Petabytes. Is that enough for you? Make sure you add new cards to the correct slots. I've talked about this before, but here is the handy-dandy matrix again so you don't have to go find it. Remember the rules: You can have 6 of any one kind of card (like six 10GigE cards), but you only really get 8 slots, since you have two SAS cards no matter what. If you want more than 12 disk trays, you need two more SAS cards, so think about expansion later, too. In fact, if you are going to have two different speeds of drives, in other words you want to mix 15K speed and 7,200 speed drives in the same system, I would highly recommend two different SAS channels. So I would want four SAS cards in that system, no matter how many trays you have.

Read the article

PASS 13 Dispatches: moving to the cloud

- by Tony Davis

PASS Summit 13, Day 1 keynote by Quentin Clarke and we're hearing about “redefiniing mission critical in the cloud”. With a move to the Windows Azure cloud comes the promise of capacity on demand, automatic HA, backups, patching and so on, as well as passing responsibility to MS for managing hardware, upgrades and so on. However, for many databases and applications the best route to the cloud is not necessarily obvious. For most, the path of least resistance is IaaS – SQL Server in a Azure VM. It removes the hardware burden but you still have to manage your databases and implementing HA for SQL Server is your responsibility. Also, scaling up comes at quite a cost – the biggest VM (8 CPU cores, 56 GB RAM, 16 1TB drives with 500 IOPS each) weighs in at over over $4500 per month. With PaaS, in the form of Windows SQL Database, you get a “3-copies replica set” so HA comes out-of the box, and removes the majority of the administration burden, but you are moving your database into a very different environment. For a start, it's a shared environment, with other customers using the same compute nodes in the cluster, and potentially even sharing the same database (multi-tenancy). Unless you pay for SQL DB Premium edition, the resources available for your workload will depends on how nicely others “play” in the shared environment. You'll potentially need to do a lot of tuning, and application rewriting to avoid throttling issues, optimising application-database communication to deal with increased latency between the two, and so on. You'll need aggressive application caching. You'll also need retry logic and to deal with (expected) node failure and the need to reconnect. In Tuesday's PASS Summit pre-con from the SQLCAT team, they spent a lot of time covering some of the telemetric techniques (collect into Azure storage the necessary monitoring data) to perform capacity planning, work out the hotspots and bottlenecks in your cloud applications. Tools like WAD (Windows Azure Diagnostics), performance counters SQL Database DMVs, and others, will be essential. Of course, to truly exploit the vast horizontal scaling that is available from the existence of thousands of compute nodes, you'll also need to need to consider how to “shard” your data so Azure can move it between nodes at will. Finding the right path to the Cloud isn't easy, but it's coming. I spoke to people one year ago who saw no real benefit in trying to move their infrastructure and databases to the cloud, but now at their company, it's the conversation that won't go away. Tony.

Read the article

LSI 9260-8i w/ 6 256gb SSDs - RAID 5, 6, 10, or bad idea overall?

- by Michael Pearson

We're provisioning a new production server for our reasonably busy website. Our choice of host have available a 6 drive configuration with a LSI 9260-8i card. My initial thought was to fill all six bays with SSDs (Intel 520 256gb) and set them up in RAID. Good, bad, or terrible idea? Can the card handle it? Should we be using RAID 5, 6 or 10? This would be the first time the provider have filled all six slots for this rackmount with SSDs, so they're a bit hesitant. I'm wondering if somebody else with this card has done something similar in a production environment. We do about 43gb of writes per day and currently use about 300gb of storage. The server acts as webserver, database, and image store for approx 1 million files. The plan is to underprovision the SSDs by approximately 10% to 20% to increase their overall lifespan & performance. The fallback option is 2x480gb SSDs in RAID 1 and another 2x1TB HDDs in RAID 1. The motivation behind this is that the server rental cost difference between 2xSSDs and 6xSSDs is minimal (compared to the overall cost of the rental). We do not have any special high-IOPs requirements. However, if the configuration is known to work, I don't see a good reason to not use it and not have to worry about having separate 'fast and small' and 'slow and large' disks.

Read the article

DAS vs SAN storage for serving 2 to 4 nodes

- by Luke404

We currently have 4 Linux nodes with local storage, arranged in two active/passive pairs with storage mirrored using DRBD, running virtual machines (actually using Xen Hypervisor) for typical hosting workloads (mail, web, a couple VPS, etc.). We're approaching the (presumed) maximum IOPS of those servers, and we're planning to migrate to an external storage solution with two active nodes, with capacity for up to four active nodes. Since we're an all-Dell shop I've done some research and found the MD3200 / MD3200i products should be the ones we're looking for. We are pretty sure we won't be attaching more than 4 hosts on a single storage and I'm wondering if there is any clear advantage for one or the other. In theory I should be able to attach 4 SAS hosts to a single MD3200 (single links on a single controller MD3200, or dual redundant SAS links from each host to a dual-controller MD3200), or 4 iSCSI hosts to a single MD3200i (directly on its 4 GigE ports without any switch, again with dual links for the dual controller option). Both setups should let us implement live VM migration since all hosts can access all the LUNs at the same time, and also some shared filesystem like GFS2 or OCFS2. Also, both setups should allow full redundancy of the whole system (assuming dual controllers in the storage). One difference I can see is that the DAS solution is actually limited to 4 hosts while the iSCSI one should be able to grow to more hosts (adding two GigE switches to the mix). One point for the iSCSI solution is that it would allow us to start out with our current nodes and upgrade them at a later time (we can't add other SAS controllers, but they already have 4 GigE ports each). With the right (iSCSI|SAS) controllers I should be able to connect diskless nodes and boot them off the external storage which I think is a good thing (get rid of any local storage). On the other hand, I would have thought the SAS one to be cheaper but it seems like an MD3200 actually costs a little less than an MD3200i (?) (please note: I've used Dell gear in my examples since that's what we're looking for but I assume the same goes with other vendors) I would like to know if my assumptions above are correct, and if I'm missing any important difference between the two setups.

Read the article

Dell R320 RAID 10 with CacheCade

- by Geekman

I'm looking for a higher-performance build for our 1RU Dell R320 servers, in terms of IOPS. Right now I'm fairly settled on: 4 x 600 GB 3.5" 15K RPM SAS RAID 1+0 array This should give good performance, but if possible, I want to also add an SSD Cache into the mix, but I'm not sure if there's enough room? According to the tech-specs, there's only up to 4 total 3.5" drive bays available. Is there any way to fit at least a single SSD drive along-side the 4x3.5" drives? I was hoping there's a special spot to put the cache SSD drive (though from memory, I doubt there'd be room). Or am I right in thinking that the cache drives are simply drives plugged in "normally" just as any other drive, but are nominated as CacheCade drives in the PERC controller? Are there any options for having the 4x600GB RAID 10 array, and the SSD cache drive, too? Based on the tech-specs (with up to 8x2.5" drives), maybe I need to use 2.5" SAS drives, leaving another 4 bays spare, plenty of room for the SSD cache drive. Has anyone achieved this using 3.5" drives, somehow?

Read the article

Which upgrade path for disk IO bound postgres server?

- by user41679

Hi all, We currently have a Sun x4270 with 2xquad core Xeon Nehalmen 2.93ghz cores (16 threads), 72 gig of ram and 16 x 10k SAS disks split between the os raid 1, a partition for the Write Ahead Logs which is raid 10 and a partition for the database tables and indexes which is also raid 10, all xfs. I'm currently evaluating which path to go down in terms of upgrades. We'll be sharding the DB at some point soon, but for now I need to focus on hardware upgrades specifically. The machine is not CPU or memory bound at all at the moment, just IOWait is become an issue. The machine is mostly write access as we have a heavy caching layer. We're seeing about 300 write IOPS average on both the database partitions. We don't have any additional storage infrastructure like a Fiber Channel or ISCSI network. Budget isn't too much of a concern, something inline with the size of this server (i.e no $1m IBM machines) Space is ok on the DB side of things, we're running out obviously but there's also some reduction we can do. Additional space would be good though. My current thoughts are either: * ISCSI SAN, possible with 10Gbit network that has solid state acceleration. * FusionIO card / Sun F20 card (will the FusionIO card work in the Sun box? * DAS shelf (something like this http://www.broadberry.co.uk/das-direct-attached-storage-servers/cyberstore-224s-das) which a combination of 15k sas disks and some Intel X25-E drives for DB indexes etc) what would I need to put in the x4270 to add a DAS shelf? I think it's a SAS HBA card, do I have to use Sun's own card or will any PCI Express card work? Anything else??? what would you guys do from your experience? I appreciate it's a lot of questions, but I haven't expanded a DB machine for a number of years and the landscape has changed dramatically since then! Any advice or feedback would be very much appreciated. Let me know if there's anything else I can clarify. Thanks in advance!

Read the article

RAIDs with a lot of spindles - how to safely put to use the "wasted" space

- by kubanczyk

I have a fairly large number of RAID arrays (server controllers as well as midrange SAN storage) that all suffer from the same problem: barely enough spindles to keep the peak I/O performance, and tons of unused disk space. I guess it's a universal issue since vendors offer the smallest drives of 300 GB capacity but the random I/O performance hasn't really grown much since the time when the smallest drives were 36 GB. One example is a database that has 300 GB and needs random performance of 3200 IOPS, so it gets 16 disks (4800 GB minus 300 GB and we have 4.5 TB wasted space). Another common example are redo logs for a OLTP database that is sensitive in terms of response time. The redo logs get their own 300 GB mirror, but take 30 GB: 270 GB wasted. What I would like to see is a systematic approach for both Linux and Windows environment. How to set up the space so sysadmin team would be reminded about the risk of hindering the performance of the main db/app? Or, even better, to be protected from that risk? The typical situation that comes to my mind is "oh, I have this very large zip file, where do I uncompress it? Umm let's see the df -h and we figure something out in no time..." I don't put emphasis on strictness of the security (sysadmins are trusted to act in good faith), but on overall simplicity of the approach. For Linux, it would be great to have a filesystem customized to cap I/O rate to a very low level - is this possible?

Read the article

MD3200 - 3 to 4x less throughput than MD1220. Am I missing something here?

- by Igor Polishchuk

I have two R710 servers with similar configuration. One in my office has MD1220 attached. Another one in the datacenter of my hosting services vendor has MD3200. I'm getting significantly worse throughput from MD3200 at my vendors setup. I'm mostly interested in sequential writes, and I'm getting these results in bonnie++ and dd tests: Seq. writes on MD1220 in my office: 1.1 GB/s - bonnie++, 1.3GB/s - dd Seq. writes on MD3200 at my vendor's: 240MB/s - bonnie++, 310MB/s - dd Unfortunately, I could not test the exactly the same configurations, but the two I have should be comparable. If anything, my good performing environment is cheaper than the bad performing. I expect at least similar throughput from these two setups. My vendor cannot really help me. Hopefully, somebody more familiar with the DAS performance can look at it and tell if I'm missing something here and my expectations are too high. To summarize, the question here is it reasonable to expect about 100MB/s of sequential write throughput per each couple of drives in RAID10 on MD3200? Is there any trick to enable such performance in MD3200 with dual controller as opposed to simple MD1220 with a single H800 adapter? More details about the configurations: A good one in my office: Dell R710 2CPU X5650 @ 2.67GHz 12 cores 96GB DDR3, OS: RHEL 5.5, kernel 2.6.18-194.26.1.el5 x86_64 20x300GB 2.5" SAS 10K in a single RAID10 1MB chunk size on MD1220 + Dell H800 I/O controller with 1GB cache in the host Not so good one at my vendor's: Dell R710 2CPU L5520 @ 2.27GHz 8 cores 144GB DDR3, OS: RHEL 5.5, kernel 2.6.18-194.11.4.el5 x86_64 20x146GB 2.5" SAS 15K in a single RAID10 512KB chunk size, Dell MD3200, 2 I/O controllers in array with 1GB cache each Additional information. I've also ran the same tests on the same vendor's host, but the storage was: two raids of 14x146GB 15K RPM drives RAID 10, striped together on the OS level on MD3000+MD1000. The performance was about 25% worse than on MD3200 despite having more drives. When I ran similar tests on the internal storage of my vendor's host (2x146GB 15K RPM drives RAID1, Perc 6i) I've got about 128MB/s seq. writes. Just two internal drives gave me about a half of 20 drives' throughput on MD3200. The random I/O performance of the MD3200 setup is ok, it gives me at least 1300 IOPS. I'm mostly have problems with sequentioal I/O throughput. Thank you for looking into it. Regards Igor

Read the article

ZFS/Btrfs/LVM2-like storage with advanced features on Linux?

- by Easter Sunshine

I have 3 identical internal 7200 RPM SATA hard disk drives on a Linux machine. I'm looking for a storage set-up that will give me all of this: Different data sets (filesystems or subtrees) can have different RAID levels so I can choose performance, space overhead, and risk trade-offs differently for different data sets while having a few number of physical disks (very important data can be 3xRAID1, important data can be 3xRAID5, unimportant reproducible data can be 3xRAID0). If each data set has an explicit size or size limit, then the ability to grow and shrink the size limit (offline if need be) Avoid out-of-kernel modules R/W or read-only COW snapshots. If it's a block-level snapshots, the filesystem should be synced and quiesced during a snapshot. Ability to add physical disks and then grow/redistribute RAID1, RAID5, and RAID0 volumes to take advantage of the new spindle and make sure no spindle is hotter than the rest (e.g., in NetApp, growing a RAID-DP raid group by a few disks will not balance the I/O across them without an explicit redistribution) Not required but nice-to-haves: Transparent compression, per-file or subtree. Even better if, like NetApps, analyzes the data first for compressibility and only compresses compressible data Deduplication that doesn't have huge performance penalties or require obscene amounts of memory (NetApp does scheduled deduplication on weekends, which is good) Resistance to silent data corruption like ZFS (this is not required because I have never seen ZFS report any data corruption on these specific disks) Storage tiering, either automatic (based on caching rules) or user-defined rules (yes, I have all-identical disks now but this will let me add a read/write SSD cache in the future). If it's user-defined rules, these rules should have the ability to promote to SSD on a file level and not a block level. Space-efficient packing of small files I tried ZFS on Linux but the limitations were: Upgrading is additional work because the package is in an external repository and is tied to specific kernel versions; it is not integrated with the package manager Write IOPS does not scale with number of devices in a raidz vdev. Cannot add disks to raidz vdevs Cannot have select data on RAID0 to reduce overhead and improve performance without additional physical disks or giving ZFS a single partition of the disks ext4 on LVM2 looks like an option except I can't tell whether I can shrink, extend, and redistribute onto new spindles RAID-type logical volumes (of course, I can experiment with LVM on a bunch of files). As far as I can tell, it doesn't have any of the nice-to-haves so I was wondering if there is something better out there. I did look at LVM dangers and caveats but then again, no system is perfect.

Read the article

SQLIO Writes

- by Grant Fritchey

SQLIO is a fantastic utility for testing the abilities of the disks in your system. It has a very unfortunate name though, since it's not really a SQL Server testing utility at all. It really is a disk utility. They ought to call it DiskIO because they'd get more people using I think. Anyway, branding is not the point of this blog post. Writes are the point of this blog post. SQLIO works by slamming your disk. It performs as mean reads as it can or it performs as many writes as it can depending on how you've configured your tests. There are much smarter people than me who will get into all the various types of tests you should run. I'd suggest reading a bit of what Jonathan Kehayias (blog|twitter) has to say or wade into Denny Cherry's (blog|twitter) work. They're going to do a better job than I can describing all the benefits and mechanisms around using this excellent piece of software. My concerns are very focused. I needed to set up a series of tests to see how well our product SQL Storage Compress worked. I wanted to know the effects it would have on a system, the disk for sure, but also memory and CPU. How to stress the system? SQLIO of course. But when I set it up and ran it, following the documentation that comes with it, I was seeing better than 99% compression on the files. Don't get me wrong. Our product is magnificent, wonderful, all things great and beautiful, gets you coffee in the morning and is made mostly from bacon. But 99% compression. No, it's not that good. So what's up? Well, it's the configuration. The default mechanism is to load up a file, something large that will overwhelm your disk cache. You're instructed to load the file with a character 0x0. I never got a computer science degree. I went to film school. Because of this, I didn't memorize ASCII tables so when I saw this, I thought it was zero's or something. Nope. It's NULL. That's right, you're making a very large file, but you're filling it with NULL values. That's actually ok when all you're testing is the disk sub-system. But, when you want to test a compression and decompression, that can be an issue. I got around this fairly quickly. Instead of generating a file filled with NULL values, I just copied a database file for my tests. And to test it with SQL Storage Compress, I used a database file that had already been run through compression (about 40% compression on that file if you're interested). Now the reads were taken care of. I am seeing very realistic performance from decompressing the information for reads through SQLIO. But what about writes? Well, the issue is, what does SQLIO write? I don't have access to the code. But I do have access to the results. I did two different tests, just to be sure of what I was seeing. First test, use the .DAT file as described in the documentation. I opened the .DAT file after I was done with SQLIO, using WordPad. Guess what? It's a giant file full of air. SQLIO writes NULL values. What does that do to compression? I did the test again on a copy of an uncompressed database file. Then I ran the original and the SQLIO modified copy through ZIP to see what happened. I got better than 99% compression out of the SQLIO modified file (original file of 624,896kb went to 275,871kb compressed, after SQLIO it went to 608kb compressed). So, what does SQLIO write? It writes air. If you're trying to test it with compression or maybe some other type of file storage mechanism like dedupe, you need to know this because your tests really won't be valid. Should I find some other mechanism for testing? Yeah, if all I'm interested in is establishing performance to my own satisfaction, yes. But, I want to be able to compare my results with other people's results and we all need to be using the same tool in order for that to happen. SQLIO is the common mechanism that most people I know use to establish disk performance behavior. It'd be better if we could get SQLIO to do writes in some other fashion. Oh, and before I go, I get to brag a bit. Measuring IOPS, SQL Storage Compress outperforms my disk alone by about 30%.

Read the article

mySQL Optimization Suggestions

- by Brian Schroeter

I'm trying to optimize our mySQL configuration for our large Magento website. The reason I believe that mySQL needs to be configured further is because New Relic has shown that our SELECT queries are taking a long time (20,000+ ms) in some categories. I ran MySQLTuner 1.3.0 and got the following results... (Disclaimer: I restarted mySQL earlier after tweaking some settings, and so the results here may not be 100% accurate): >> MySQLTuner 1.3.0 - Major Hayden <[email protected]> >> Bug reports, feature requests, and downloads at http://mysqltuner.com/ >> Run with '--help' for additional options and output filtering [OK] Currently running supported MySQL version 5.5.37-35.0 [OK] Operating on 64-bit architecture -------- Storage Engine Statistics ------------------------------------------- [--] Status: +ARCHIVE +BLACKHOLE +CSV -FEDERATED +InnoDB +MRG_MYISAM [--] Data in MyISAM tables: 7G (Tables: 332) [--] Data in InnoDB tables: 213G (Tables: 8714) [--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 17) [--] Data in MEMORY tables: 0B (Tables: 353) [!!] Total fragmented tables: 5492 -------- Security Recommendations ------------------------------------------- [!!] User '@host5.server1.autopartsnetwork.com' has no password set. [!!] User '@localhost' has no password set. [!!] User 'root@%' has no password set. -------- Performance Metrics ------------------------------------------------- [--] Up for: 5h 3m 4s (5M q [317.443 qps], 42K conn, TX: 18B, RX: 2B) [--] Reads / Writes: 95% / 5% [--] Total buffers: 35.5G global + 184.5M per thread (1024 max threads) [!!] Maximum possible memory usage: 220.0G (174% of installed RAM) [OK] Slow queries: 0% (6K/5M) [OK] Highest usage of available connections: 5% (61/1024) [OK] Key buffer size / total MyISAM indexes: 512.0M/3.1G [OK] Key buffer hit rate: 100.0% (102M cached / 45K reads) [OK] Query cache efficiency: 66.9% (3M cached / 5M selects) [!!] Query cache prunes per day: 3486361 [OK] Sorts requiring temporary tables: 0% (0 temp sorts / 812K sorts) [!!] Joins performed without indexes: 1328 [OK] Temporary tables created on disk: 11% (126K on disk / 1M total) [OK] Thread cache hit rate: 99% (61 created / 42K connections) [!!] Table cache hit rate: 19% (9K open / 49K opened) [OK] Open file limit used: 2% (712/25K) [OK] Table locks acquired immediately: 100% (5M immediate / 5M locks) [!!] InnoDB buffer pool / data size: 32.0G/213.4G [OK] InnoDB log waits: 0 -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance MySQL started within last 24 hours - recommendations may be inaccurate Reduce your overall MySQL memory footprint for system stability Enable the slow query log to troubleshoot bad queries Increasing the query_cache size over 128M may reduce performance Adjust your join queries to always utilize indexes Increase table_cache gradually to avoid file descriptor limits Read this before increasing table_cache over 64: http://bit.ly/1mi7c4C Variables to adjust: *** MySQL's maximum memory usage is dangerously high *** *** Add RAM before increasing MySQL buffer variables *** query_cache_size (> 512M) [see warning above] join_buffer_size (> 128.0M, or always use indexes with joins) table_cache (> 12288) innodb_buffer_pool_size (>= 213G) My my.cnf configuration is as follows... [client] port = 3306 [mysqld_safe] nice = 0 [mysqld] tmpdir = /var/lib/mysql/tmp user = mysql port = 3306 skip-external-locking character-set-server = utf8 collation-server = utf8_general_ci event_scheduler = 0 key_buffer = 512M max_allowed_packet = 64M thread_stack = 512K thread_cache_size = 512 sort_buffer_size = 24M read_buffer_size = 8M read_rnd_buffer_size = 24M join_buffer_size = 128M # for some nightly processes client sessions set the join buffer to 8 GB auto-increment-increment = 1 auto-increment-offset = 1 myisam-recover = BACKUP max_connections = 1024 # max connect errors artificially high to support behaviors of NetScaler monitors max_connect_errors = 999999 concurrent_insert = 2 connect_timeout = 5 wait_timeout = 180 net_read_timeout = 120 net_write_timeout = 120 back_log = 128 # this table_open_cache might be too low because of MySQL bugs #16244691 and #65384) table_open_cache = 12288 tmp_table_size = 512M max_heap_table_size = 512M bulk_insert_buffer_size = 512M open-files-limit = 8192 open-files = 1024 query_cache_type = 1 # large query limit supports SOAP and REST API integrations query_cache_limit = 4M # larger than 512 MB query cache size is problematic; this is typically ~60% full query_cache_size = 512M # set to true on read slaves read_only = false slow_query_log_file = /var/log/mysql/slow.log slow_query_log = 0 long_query_time = 0.2 expire_logs_days = 10 max_binlog_size = 1024M binlog_cache_size = 32K sync_binlog = 0 # SSD RAID10 technically has a write capacity of 10000 IOPS innodb_io_capacity = 400 innodb_file_per_table innodb_table_locks = true innodb_lock_wait_timeout = 30 # These servers have 80 CPU threads; match 1:1 innodb_thread_concurrency = 48 innodb_commit_concurrency = 2 innodb_support_xa = true innodb_buffer_pool_size = 32G innodb_file_per_table innodb_flush_log_at_trx_commit = 1 innodb_log_buffer_size = 2G skip-federated [mysqldump] quick quote-names single-transaction max_allowed_packet = 64M I have a monster of a server here to power our site because our catalog is very large (300,000 simple SKUs), and I'm just wondering if I'm missing anything that I can configure further. :-) Thanks!

Read the article

Why does Process Explorer cause highly targeted failure of some applications / basic UI functions in a high-power EC2 Windows instance?

- by Dan Nissenbaum

Update: I have determined that Process Explorer itself - the program I am using to debug a performance issue - seems to be the cause of the issue. See note, with updated question, at end. I am running a high-power (cc2.8xlarge) Amazon AWS EC2 Windows instance off of a boot EBS volume, provisioned at 2500 PIOPS, which was created from a snapshot of a previous boot volume. My purpose with the instance is to use it as a development workstation with many developer tools installed, such as Visual Studio, a local XAMPP stack, etc. I have upwards of 40 programs installed on the machine. The usability of the instance as a development machine often works quite well. The RDP lag is adequately small. I have used it for hours on end without problems for some of my most intense development tasks. As a result, I have just purchased a reserved instance, and I opted to rebuild my development machine starting from scratch with a Windows Server 2012 AMI. After having installed all of my desired/required applications for development over this past week, again the machine seems to often work well and I have worked for up to an hour at a time without problems doing heavy development work. However, I continue to run into catastrophic OS usability issues that may prevent me from being able to rely on this machine as a development machine. I would like to track down the source of the problem, if there is an easily identifiable source. (Update: I have tracked down the source to be Process Explorer, the very program I was using to debug the problem. See update at end.) The issues are as follows. (These are some primary examples) Some applications, after a period of adequate responsiveness, suddenly begin to respond very, very slowly to basic user interface actions such as clicking on menus and pressing Ctrl-Tab to switch between open documents. Two examples are UltraEdit and PhpEd. It typically takes ~2 seconds for a menu to appear, and ~4 seconds to switch between open documents. Additionally, insertion point motion in the editor is lagged by upwards of ~2 seconds. Process Explorer, which I am using to help debug the problem, seems to run acceptably for a couple of minutes, but on multiple occasions Process Explorer itself hangs completely. It hangs at the same time as the problems noted above. When it hangs, it is 100% unresponsive. Clicking on its taskbar icon neither causes it to come to the top or go behind, and its viewable area is filled with nothing but a region partially containing pure white and partially containing incomplete windows widgets that are unreadable, and that never change. Waiting 10 minutes does not clear the problem. Attempting to force-quit Process Explorer by right-clicking on its taskbar icon and choosing "Close Window" takes about 5 full minutes to exit (Process Explorer itself can't be used to exit Process Explorer, and it is registered as a Task Manager substitute). Other programs work just fine during this time. For example, Chrome tabs flip very quickly back and forth, menus pop open instantly, web pages load quickly, and typing in forms/web applications inside the browser works promptly. Another example of an application that works crisply is Filemaker - its menus open instantly, and switching views in this application occurs promptly. Other applications also work without issue. Also, switching between applications occurs promptly as well. It is only a handful of applications that exhibit the problem, with some primary examples given above. At first I thought that EBS IOPS might be a problem. Therefore, I ran Performance Monitor, and watched the "Disk Transfers/sec" monitor in real time. At no point did this measure come anywhere close to hitting the 2500 PIOPS provisioned for the EBS volume. The RAM was also well under the limit (~10 GB used out of 60 GB). I did notice that one CPU core (out of 32 logical cores) was fully thrashing at 100% (i.e., ~3.1%) during the problematic periods. This seems to indicate that a single CPU core is handling the menus / flipping between open documents (for some applications only) / managing the Process Explorer user interface, and that this single core was hosed for some reason during the problematic periods. Also note that I have a desktop workstation (Windows 7) that I also use as a development machine, via a remote connection, with a nearly identical set of programs installed, and this desktop workstation does not exhibit any of the problems I've discussed above. I have been using it heavily for well over a year now. Any suggestions regarding either the source of the problem, or steps I might take to investigate the source of the problem, would be appreciated. Thanks. Note: After extensive testing & investigation, I have noticed that when I quit Process Explorer, the problem vanishes and the system performance returns to normal, and then reappears quickly when I run Process Explorer again (note: again, the performance problems only appear for a subset of applications - other applications work perfectly fine during the same period). My question is therefore (thankfully) more specific: Why does Process Explorer cause highly targeted failure of some applications (including itself) and basic UI functions, in a high-power EC2 Windows instance?

Search Results

Search found 49 results on 2 pages for 'iops'.

Page 2/2 | < Previous Page | 1 2

- by Darius Zanganeh

- by kimberly.billings

- by user12608550

- by uwes

- by alekop

- by nulltorpedo

- by devlearn

- by obsidian

- by skomak

- by DmitrySemenov

- by njo

- by gtirloni

- by Steve Tunstall

- by Tony Davis

- by Michael Pearson

- by Luke404

- by Geekman

- by user41679

- by kubanczyk

- by Igor Polishchuk

- by Easter Sunshine

- by Grant Fritchey

- by Brian Schroeter

- by Dan Nissenbaum

< Previous Page | 1 2