Search Results

Search found 78 results on 4 pages for 'drbd'.

Page 4/4 | < Previous Page | 1 2 3 4 

  • System Requirements of a write-heavy applications serving hundreds of requests per second

    - by Rolando Cruz
    NOTE: I am a self-taught PHP developer who has little to none experience managing web and database servers. I am about to write a web-based attendance system for a very large userbase. I expect around 1000 to 1500 users logged-in at the same time making at least 1 request every 10 seconds or so for a span of 30 minutes a day, 3 times a week. So it's more or less 100 requests per second, or at the very worst 1000 requests in a second (average of 16 concurrent requests? But it could be higher given the short timeframe that users will make these requests. crosses fingers to avoid 100 concurrent requests). I expect two types of transactions, a local (not referring to a local network) and a foreign transaction. local transactions basically download userdata in their locality and cache it for 1 - 2 weeks. Attendance equests will probably be two numeric strings only: userid and eventid. foreign transactions are for attendance of those do not belong in the current locality. This will pass in the following data instead: (numeric) locality_id, (string) full_name. Both requests are done in Ajax so no HTML data included, only JSON. Both type of requests expect at the very least a single numeric response from the server. I think there will be a 50-50 split on the frequency of local and foreign transactions, but there's only a few bytes of difference anyways in the sizes of these transactions. As of this moment the userid may only reach 6 digits and eventid are 4 to 5-digit integers too. I expect my users table to have at least 400k rows, and the event table to have as many as 10k rows, a locality table with at least 1500 rows, and my main attendance table to increase by 400k rows (based on the number of users in the users table) a day for 3 days a week (1.2M rows a week). For me, this sounds big. But is this really that big? Or can this be handled by a single server (not sure about the server specs yet since I'll probably avail of a VPS from ServInt or others)? I tried to read on multiple server setups Heatbeat, DRBD, master-slave setups. But I wonder if they're really necessary. the users table will add around 500 1k rows a week. If this can't be handled by a single server, then if I am to choose a MySQL replication topology, what would be the best setup for this case? Sorry, if I sound vague or the question is too wide. I just don't know what to ask or what do you want to know at this point.

    Read the article

  • recommendations for efficient offsite remote backup solution of vm's

    - by senorsmile
    I am looking for recommendations for backing up my current 6 vm's(and soon to grow to up to 20). Currently I am running a two node proxmox cluster(which is a debian base using kvm for virtualization with a custom web front end to administer). I have two nearly identical boxes with amd phenom II x4's and asus motherboards. Each has 4 500 GB sata2 hdd's, 1 for the os and other data for the proxmox install, and 3 using mdadm+drbd+lvm to share the 1.5 TB's of storage between the two machines. I mount lvm images to kvm for all of the virtual machines. I currently have the ability to do live transfer from one machine to the other, typically within seconds(it takes about 2 minutes on the largest vm running win2008 with m$ sql server). I am using proxmox's built-in vzdump utility to take snapshots of the vm's and store those on an external harddrive on the network. I then have jungledisk service (using rackspace) to sync the vzdump folder for remote offsite backup. This is all fine and dandy, but it's not very scalable. For one, the backups themselves can take up to a few hours every night. With jungledisk's block level incremental transfers, the sync only transfers a small portion of the data offsite, but that still takes at least a half an hour. The much better solution would of course be something that allows me to instantly take the difference of two time points (say what was written from 6am to 7am), zip it, then send that difference file to the backup server which would instantly transfer to the remote storage on rackspace. I have looked a little into zfs and it's ability to do send/receive. That coupled with a pipe of the data in bzip or something would seem perfect. However, it seems that implementing a nexenta server with zfs would essentially require at least one or two more dedicated storage servers to serve iSCSI block volumes (via zvol's???) to the proxmox servers. I would prefer to keep the setup as minimal as possible (i.e. NOT having separate storage servers) if at all possible. I have also briefly read about zumastor. It looks like it could also do what I want, but it appears to have halted development in 2008. So, zfs, zumastor or other?

    Read the article

  • How can a single disk in a hardware SATA RAID-10 array bring the entire array to a screeching halt?

    - by Stu Thompson
    Prelude: I'm a code-monkey that's increasingly taken on SysAdmin duties for my small company. My code is our product, and increasingly we provide the same app as SaaS. About 18 months ago I moved our servers from a premium hosting centric vendor to a barebones rack pusher in a tier IV data center. (Literally across the street.) This ment doing much more ourselves--things like networking, storage and monitoring. As part the big move, to replace our leased direct attached storage from the hosting company, I built a 9TB two-node NAS based on SuperMicro chassises, 3ware RAID cards, Ubuntu 10.04, two dozen SATA disks, DRBD and . It's all lovingly documented in three blog posts: Building up & testing a new 9TB SATA RAID10 NFSv4 NAS: Part I, Part II and Part III. We also setup a Cacit monitoring system. Recently we've been adding more and more data points, like SMART values. I could not have done all this without the awesome boffins at ServerFault. It's been a fun and educational experience. My boss is happy (we saved bucket loads of $$$), our customers are happy (storage costs are down), I'm happy (fun, fun, fun). Until yesterday. Outage & Recovery: Some time after lunch we started getting reports of sluggish performance from our application, an on-demand streaming media CMS. About the same time our Cacti monitoring system sent a blizzard of emails. One of the more telling alerts was a graph of iostat await. Performance became so degraded that Pingdom began sending "server down" notifications. The overall load was moderate, there was not traffic spike. After logging onto the application servers, NFS clients of the NAS, I confirmed that just about everything was experiencing highly intermittent and insanely long IO wait times. And once I hopped onto the primary NAS node itself, the same delays were evident when trying to navigate the problem array's file system. Time to fail over, that went well. Within 20 minuts everything was confirmed to be back up and running perfectly. Post-Mortem: After any and all system failures I perform a post-mortem to determine the cause of the failure. First thing I did was ssh back into the box and start reviewing logs. It was offline, completely. Time for a trip to the data center. Hardware reset, backup an and running. In /var/syslog I found this scary looking entry: Nov 15 06:49:44 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_00], 6 Currently unreadable (pending) sectors Nov 15 06:49:44 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_07], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 171 to 170 Nov 15 06:49:45 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_10], 16 Currently unreadable (pending) sectors Nov 15 06:49:45 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_10], 4 Offline uncorrectable sectors Nov 15 06:49:45 umbilo smartd[2827]: Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error Nov 15 06:49:45 umbilo smartd[2827]: # 1 Short offline Completed: read failure 90% 6576 3421766910 Nov 15 06:49:45 umbilo smartd[2827]: # 2 Short offline Completed: read failure 90% 6087 3421766910 Nov 15 06:49:45 umbilo smartd[2827]: # 3 Short offline Completed: read failure 10% 5901 656821791 Nov 15 06:49:45 umbilo smartd[2827]: # 4 Short offline Completed: read failure 90% 5818 651637856 Nov 15 06:49:45 umbilo smartd[2827]: So I went to check the Cacti graphs for the disks in the array. Here we see that, yes, disk 7 is slipping away just like syslog says it is. But we also see that disk 8's SMART Read Erros are fluctuating. There are no messages about disk 8 in syslog. More interesting is that the fluctuating values for disk 8 directly correlate to the high IO wait times! My interpretation is that: Disk 8 is experiencing an odd hardware fault that results in intermittent long operation times. Somehow this fault condition on the disk is locking up the entire array Maybe there is a more accurate or correct description, but the net result has been that the one disk is impacting the performance of the whole array. The Question(s) How can a single disk in a hardware SATA RAID-10 array bring the entire array to a screeching halt? Am I being naïve to think that the RAID card should have dealt with this? How can I prevent a single misbehaving disk from impacting the entire array? Am I missing something?

    Read the article

< Previous Page | 1 2 3 4