application performance - Page 133

Postfix spool on ext3 optimiziations in >=linux-2.6.34 days

- by Luke404

Given the very specific nature of the subject (we're not talking about mailboxes, just the spool; we're not talking about other filesystems, just ext3; and so on...) and the maturity of the softwares involved (linux kernel, ext3fs, postfix) I'd think there should be a more or less agreed on set of best practices to filesystem related tuning. I'm trying to get a roundup of them: data=journal became the default in recent kernels (somewhere around 2.6.30 IIRC) so we should be ok with that Wietse Venema says atime must be on, but Postfix documentation recommendsnoatime while talking about the Incoming Queue. Does that mean that postfix needs atime on just for some queue directories and will benefit from noatime on the others? can we use noatime if we just don't use ETRN? filesystem can be mounted nodev,noexec,nosuid - no* won't prevent you from setting attributes (postfix uses exec attr) they just won't have any effect (we don't run anything from the spool) the fsync() issue cited by Wietse and/or the chattr -S are probably linked to sync/async options of ext3fs but I do not understand them enough. Mouting the filesystem with async option is equivalent to chattr -R -S the whole fs? Seems like it will increase performance, but will that pose a risk of "loss of mail after a system crash" or is it really "safe on /var/spool/postfix" ? would you tune anything else on postfix-2.6.x to work better on ext3 or do you leave defaults everywhere? is there a "best" linux I/O scheduler for this kind of workload (namely CFQ or deadline?) or that's something that will vary too much based on hardware configuration? would you tune anything else in the filesystem or in the kernel? anything else? References: Postfix Performance here on SF Postfix documentation about the Incoming Queue Wietse Venema in Best file system on [email protected] here Postfix and ext3 on [email protected] here and there

Read the article

Real benefits of tcp TIME-WAIT and implications in production environment

- by user64204

SOME THEORY I've been doing some reading on tcp TIME-WAIT (here and there) and what I read is that it's a value set to 2 x MSL (maximum segment life) which keeps a connection in the "connection table" for a while to guarantee that, "before your allowed to create a connection with the same tuple, all the packets belonging to previous incarnations of that tuple will be dead". Since segments received (apart from SYN under specific circumstances) while a connection is either in TIME-WAIT or no longer existing would be discarded, why not close the connection right away? Q1: Is it because there is less processing involved in dealing with segments from old connections and less processing to create a new connection on the same tuple when in TIME-WAIT (i.e. are there performance benefits)? If the above explanation doesn't stand, the only reason I see the TIME-WAIT being useful would be if a client sends a SYN for a connection before it sends remaining segments for an old connection on the same tuple in which case the receiver would re-open the connection but then get bad segments and and would have to terminate it. Q2: Is this analysis correct? Q3: Are there other benefits to using TIME-WAIT? SOME PRACTICE I've been looking at the munin graphs on a production server that I administrate. Here is one: As you can see there are more connections in TIME-WAIT than ESTABLISHED, around twice as many most of the time, on some occasions four times as many. Q4: Does this have an impact on performance? Q5: If so, is it wise/recommended to reduce the TIME-WAIT value (and what to)? Q6: Is this ratio of TIME-WAIT / ESTABLISHED connections normal? Could this be related to malicious connection attempts?

Read the article

ZFS with L2ARC (SSD) slower for random seeks than without L2ARC

- by Florian Kruse

I am currently testing ZFS (Opensolaris 2009.06) in an older fileserver to evaluate its use for our needs. Our current setup is as follows: Dual core (2,4 GHz) with 4 GB RAM 3x SATA controller with 11 HDDs (250 GB) and one SSD (OCZ Vertex 2 100 GB) We want to evaluate the use of a L2ARC, so the current ZPOOL is: $ zpool status pool: tank state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM afstank ONLINE 0 0 0 raidz1 ONLINE 0 0 0 c11t0d0 ONLINE 0 0 0 c11t1d0 ONLINE 0 0 0 c11t2d0 ONLINE 0 0 0 c11t3d0 ONLINE 0 0 0 raidz1 ONLINE 0 0 0 c13t0d0 ONLINE 0 0 0 c13t1d0 ONLINE 0 0 0 c13t2d0 ONLINE 0 0 0 c13t3d0 ONLINE 0 0 0 cache c14t3d0 ONLINE 0 0 0 where c14t3d0 is the SSD (of course). We run IO tests with bonnie++ 1.03d, size is set to 200 GB (-s 200g) so that the test sample will never be completely in ARC/L2ARC. The results without SSD are (average values over several runs which show no differences) write_chr write_blk rewrite read_chr read_blk random seeks 101.998 kB/s 214.258 kB/s 96.673 kB/s 77.702 kB/s 254.695 kB/s 900 /s With SSD it becomes interesting. My assumption was that the results should be in worst case at least the same. While write/read/rewrite rates are not different, the random seek rate differs significantly between individual bonnie++ runs (between 188 /s and 1333 /s so far), average is 548 +- 200 /s, so below the value w/o SSD. So, my questions are mainly: Why do the random seek rates differ so much? If the seeks are really random, they should not differ much (my assumption). So, even if the SSD is impairing the performance it should be the same in each bonnie++ run. Why is the random seek performance worse in most of the bonnie++ runs? I would assume that some part of the bonnie++ data is in the L2ARC and random seeks on this data performs better while random seeks on other data just performs similarly like before.

Read the article

Some free cloud solution to enhance your business

- by Saif Bechan

I am co-owner of a small internet business. I am in charge of IT, and I try to get things done as low cost as possible. When investing in servers, resources and overall business costs your project can soon turn into a financial disaster. Cloud solutions can help you in solving some financial problems, they can help you in scalability problems, and overall performance problems of your server or web application. Recently I moved the whole internal/external communication(email,calendar,documents) of my business to the cloud. I did this by using the free version of Google Apps. This works great and is a big advantage on multiple levels. I do not have to fight spam anymore on my system, and there are less resources used on my system. Also switching servers will go a lot easier. Questions Can you name some cloud solution that you have used, or some you just recommend. They can fairy form financial benefits, organizational benefits, performance benefits. It doesn't matter as soon as it helps you spread the load of your business.

Read the article

HP Proliant DL380 G4 - Can this server still perform in 2011?

- by BSchriver

Can the HP Proliant DL380 G4 series server still perform at high a quality in the 2011 IT world? This may sound like a weird question but we are a very small company whose primary business is NOT IT related. So my IT dollars have to stretch a long way. I am in need of a good web and database server. The load and demand for a while will be fairly low so I am not looking nor do I have the money to buy a brand new HP Dl380 G7 series box for $6K. While searching around today I found a company in ATL that buys servers off business leases and then stripes them down to parts. They clean, check and test each part and then custom "rebuild" the server based on whatever specs you request. The interesting thing is they also provide a 3-year warranty on all their servers they sell. I am contemplating buying two of the following: HP Proliant DL380 G4 Dual (2) Intel Xeon 3.6 GHz 800Mhz 1MB Cache processors 8GB PC3200R ECC Memory 6 x 73GB U320 15K rpm SCSI drives Smart Array 6i Card Dual Power Supplies Plus the usual cdrom, dual nic, etc... All this for $750 each or $1500 for two pretty nicely equipped servers. The price then jumps up on the next model up which is the G5 series. It goes from $750 to like $2000 for a comparable server. I just do not have $4000 to buy two servers right now. So back to my original question, if I load Windows 2008 R2 Server and IIS 7 on one of the machines and Windows 2008 R2 server and MS SQL 2008 R2 Server on another machine, what kind of performance might I expect to see from these machines? The facts is this series is now 3 versions behind the G7's and this series of server was built when Windows 200 Server was the dominant OS and Windows 2003 Server was just coming out. If you are running Windows 2008 R2 Server on a G4 with similar or less specs I would love to hear what your performance is like.

Read the article

Caching all files in varnish

- by csgwro

I want my varnish servers to cache all files. At backend there is lighttpd hosting only static files, and there is an md5 in the url in case of file change, ex. /gfx/Bird.b6e0bc2d6cbb7dfe1a52bc45dd2b05c4.swf). However my hit ratio is very poorly (about 0.18) My config: sub vcl_recv { set req.backend=default; ### passing health to backend if (req.url ~ "^/health.html$") { return (pass); } remove req.http.If-None-Match; remove req.http.cookie; remove req.http.authenticate; if (req.request == "GET") { return (lookup); } } sub vcl_fetch { ### do not cache wrong codes if (beresp.status == 404 || beresp.status >= 500) { set beresp.ttl = 0s; } remove beresp.http.Etag; remove beresp.http.Last-Modified; } sub vcl_deliver { set resp.http.expires = "Thu, 31 Dec 2037 23:55:55 GMT"; } I have made an performance tuning: DAEMON_OPTS="${DAEMON_OPTS} -p thread_pool_min=200 -p thread_pool_max=4000 -p thread_pool_add_delay=2 -p session_linger=100" The main url which is missed is... /health.html. Is that forward to backend correctly configured? Disabling health checking hit ratio increases to 0.45. Now mostly "/crossdomain.xml" is missed (from many domains, as it is wildcard). How can I avoid that? Should I carry on other headers like User-Agent or Accept-Encoding? I thing that default hashing mechanism is using url + host/IP. Compression is used at the backend. What else can improve performance?

Read the article

Server slowdown

- by Clinton Bosch

I have a GWT application running on Tomcat on a cloud linux(Ubuntu) server, recently I released a new version of the application and suddenly my server response times have gone from 500ms average to 15s average. I have run every monitoring tool I know. iostat says my disks are 0.03% utilised mysqltuner.pl says I am OK other see below top says my processor is 99% idle and load average: 0.20, 0.31, 0.33 memory usage is 50% (-/+ buffers/cache: 3997 3974) mysqltuner output [OK] Logged in using credentials from debian maintenance account. -------- General Statistics -------------------------------------------------- [--] Skipped version check for MySQLTuner script [OK] Currently running supported MySQL version 5.1.63-0ubuntu0.10.04.1-log [OK] Operating on 64-bit architecture -------- Storage Engine Statistics ------------------------------------------- [--] Status: +Archive -BDB -Federated +InnoDB -ISAM -NDBCluster [--] Data in MyISAM tables: 370M (Tables: 52) [--] Data in InnoDB tables: 697M (Tables: 1749) [!!] Total fragmented tables: 1754 -------- Security Recommendations ------------------------------------------- [OK] All database users have passwords assigned -------- Performance Metrics ------------------------------------------------- [--] Up for: 19h 25m 41s (1M q [28.122 qps], 1K conn, TX: 2B, RX: 1B) [--] Reads / Writes: 98% / 2% [--] Total buffers: 1.0G global + 2.7M per thread (500 max threads) [OK] Maximum possible memory usage: 2.4G (30% of installed RAM) [OK] Slow queries: 0% (1/1M) [OK] Highest usage of available connections: 34% (173/500) [OK] Key buffer size / total MyISAM indexes: 16.0M/279.0K [OK] Key buffer hit rate: 99.9% (50K cached / 40 reads) [OK] Query cache efficiency: 61.4% (844K cached / 1M selects) [!!] Query cache prunes per day: 553779 [OK] Sorts requiring temporary tables: 0% (0 temp sorts / 34K sorts) [OK] Temporary tables created on disk: 4% (4K on disk / 102K total) [OK] Thread cache hit rate: 84% (185 created / 1K connections) [!!] Table cache hit rate: 0% (256 open / 27K opened) [OK] Open file limit used: 0% (20/2K) [OK] Table locks acquired immediately: 100% (692K immediate / 692K locks) [OK] InnoDB data size / buffer pool: 697.2M/1.0G -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance MySQL started within last 24 hours - recommendations may be inaccurate Enable the slow query log to troubleshoot bad queries Increase table_cache gradually to avoid file descriptor limits Variables to adjust: query_cache_size (> 16M) table_cache (> 256)

Read the article

Multiple servers vs 1 big server performace

- by pistacchio

Hi to all! My team of developers has suggested a server structure for an upcoming project we are developing. Our structure is "logical", meaning that the various logical components of the application (it is a distributed one) relies on different servers. Some components are more critical than others and will be subjected to more load. Our proposal was to have 1 server per component but the hardware guys suggested to replace the various machines with a single, bigger one with virtual servers. They're gonna use Blade Servers. Now, I'm not an expert at all, but my question to the guys was: so if we need, for example, 3 2GHz CPU / 2GB RAM machines and you give me 1 machine with 3 2GHz CPUs and 6 GB of RAM it is the same? They told me it is. Is this accurate? What are the advantages or disadvantages of both the solutions? What are the generally accepted best practices? Could you point out some URL reference dealing with the problem? Thank you in advance! EDIT: Some more info. The (internet / intranet) application is already layered. We have some servers on the DMZ that will expose pages to the internet and the databases are on their own machines. What we want to split (and they want to join) are some webservers that mainly expose webservices. One is a DAL that communicates with the database layer, one is our Single Sign On / User Profile application that gets called once per page and one is a clone of what seen on the Internet to be used on our lan.

Read the article

Measure Upload Speed between a client and our server

- by tresstylez

We host a SAAS application specially customized for multiple clients. For one customer in particular -- they are reporting sporadic performance issues from various locations on their network, in particular UPLOADING documents through a form on our website. The client claims they have "bandwidth to spare" and that utilization of their "pipe" is so low that it MUST be our application, but our application has MANY clients and all features are working fine for all other clients. Interestingly enough -- DOWNLOADS (ie. just accessing the website, or downloading documents) is working fine. Speed test shows that they should get 1.2Mbps UP. So, a 3MB file should take 20 secs to upload. It takes 60+ seconds on their network. Sometimes even small files take OVER 10 minutes to upload or they timeout. Pings and Traceroutes don't show any abnormally long hops or response times. They claim other SAAS applications they use allow them to upload just fine. Both IT teams are working together to resolve this issue. What kind of data can I request from the clients to begin ruling things out. Seems like we need to somehow measure LATENCY of the networks involved or even at the switch level, we need to understand if packets are getting dropped somewhere and why. Where should I start? Any help is appreciated. I'll provide more info upon requests

Read the article

Is there an objective way to measure slowness of PC/WINDOWS?

- by ekms

We've a lot of users that usually complain about that his PC is "slow". (we use win XP). We usually check startup programs, virus, fragmentation, disk health and common problems that causes slowness (Symantec AV drops disk to 1mb/s , or a seagate HD firmware error in certain models), but in those cases the slowness is pretty evident. In other hand, the most common is the user complaining about his pc but for us looks OK, even in 6 years old desktops. People sometimes even complains about his new quad core desktops speed!!! So, we are asking if there's a way to OBJECTIVELY check that a computer didn't dropped its performance, compared with similar ones o previous measures, specially for work use (I don't think that 3dmark benchmark o similar may help). The only thing that I found that was useful is HDTune, but it only check hard disk performance. Basically, what we want is something that enable us to say to our users "see? your PC is as slow as was three years ago! stop complaining! Is all in your head!"

Read the article

Why database partitioning didn't work? Extract from thedailywtf.com

- by questzen

Original link. http://thedailywtf.com/Articles/The-Certified-DBA.aspx. Article summary: The DBA suggests an approach involving rigorous partitioning, 10 partitions per disk (3 actual disks and 3 raid). The stats show that the performance is non-optimal. Then the DBA suggests an alternative of 1 partition per disk (with more added disks). This also fails. The sys-admin then sets up a single disk, single partition and saves the day. The size of disks was not mentioned but given today,s typical disk sizes (of the order of 100 GB), the partitions ; would be huge, it surprises me that a single disk with all partitions outperformed. Initially I suspect that the data was segregated and hence faster reads. But how come the performance didn't degrade as time went by with all the inserts and updates happening? Saw this on reddit, but the explanation was by far spindle/platter centered. There was no mention in the article about this. Is there any other reason? I can only guess that the tables were using a incorrect hash distribution causing non-uniform allocation across disks (wrong partitioning); this would increase fetch times. Any thoughts?

Read the article

Disk fragmentation when dealing with many small files

- by Zorlack

On a daily basis we generate about 3.4 Million small jpeg files. We also delete about 3.4 Million 90 day old images. To date, we've dealt with this content by storing the images in a hierarchical manner. The heriarchy is something like this: /Year/Month/Day/Source/ This heirarchy allows us to effectively delete days worth of content across all sources. The files are stored on a Windows 2003 server connected to a 14 disk SATA RAID6. We've started having significant performance issues when writing-to and reading-from the disks. This may be due to the performance of the hardware, but I suspect that disk fragmentation may be a culprit at well. Some people have recommended storing the data in a database, but I've been hesitant to do this. An other thought was to use some sort of container file, like a VHD or something. Does anyone have any advice for mitigating this kind of fragmentation? Additional Info: The average file size is 8-14KB Format information from fsutil: NTFS Volume Serial Number : 0x2ae2ea00e2e9d05d Version : 3.1 Number Sectors : 0x00000001e847ffff Total Clusters : 0x000000003d08ffff Free Clusters : 0x000000001c1a4df0 Total Reserved : 0x0000000000000000 Bytes Per Sector : 512 Bytes Per Cluster : 4096 Bytes Per FileRecord Segment : 1024 Clusters Per FileRecord Segment : 0 Mft Valid Data Length : 0x000000208f020000 Mft Start Lcn : 0x00000000000c0000 Mft2 Start Lcn : 0x000000001e847fff Mft Zone Start : 0x0000000002163b20 Mft Zone End : 0x0000000007ad2000

Read the article

Windows XP to remote server 2008 R2 shares - awful response times

- by nick3216

I have a network infrastructure of Windows XP clients (a mix of XP and 64-bit XP), that are accessing a network share on a Windows 2008 R2 server. Whenever users type the address of a folder into the address bar of Windows Explorer it's as snappy at determining the contents of the current folder and presenting them to you in the address bar as if you're working on a local drive. But if you open one of the subfolders users get the animated red torch and 'Searching for items...' dialog, typically for 45 seconds. Similarly when using the open folder dialog to try and select a subfolder on this share it takes, on average, 45 seconds for the dialog to expand each node and show the subfolders of each node. Also, while the Explorer instance accsesing the network share is running slowly users notice that the performance of all other Explorer windows suffers. So while Explorer is searching for files on the network share they can't switch to another task and navigate around their local drive using Explorer because it's now as slow as a dead dog at accessing anything. Are there any settings we can change which will improve the performance accessing network shares?

Read the article

window login when web application is using network share in IIS 6

- by James

Hi, I have installed a web application which is configured using a network drive But i am keep getting a pop up asking for credentials looking in the event log, the network logon is set to my domain/account which looks fine however caller user name is empty (not sure if this is an issue) the application works fine when i use a local drive the application also runs fine when i set "connect as" user the application also works fine when a share on the local machine is used!! direct asses using the unc path is not a problem Please advise what i can do or should check Thanks and Regards, James

Read the article

map my url with window 2008 r2 for a tomcat web application

- by Dinidu

I have developed a web application using jsp/servelt technology, now i have hosted in my company windows server by using tomcat web server. when i want to go to the application i have to type my server name and the 8080 port no. I want to remove this and want to use my web application name instead of the server name. hope a quick answer. example: now (http://my server name:8080/) what i want (http://my application name)

Read the article

Is segfault signal always sent to application

- by noonex

My application usually crashes and prints stack to log if received segfault signal. But in some environment the 'dmesg' shows segfault messages related to my application, but application uptime is much older. Can segfault be suppressed and application doesn't receive signal? Or what errors from dmesg can mean ?

Read the article

Granting read-write rights to my web application on VPS

- by davykiash

Am currently testing a bulk CSV import functionality web application and I came across a error The given destination is not writeable My application is zend based and uses the MVC structure application --uploads library --Zend public --index.php What Ubuntu command do I exectute to safely grant the necessary rights to my uploads folder in my web application?

Read the article

Multi-core processors: Better performance running multiple applications?

- by aip.cd.aish

I realize for single applications - the application itself has to be designed to take advantage of multiple cores. But what about executing many different applications simultaneously? On my development machine at an average instance, I run multiple servers (a database server, a web server), multiple instance of IDEs (either Visual Studio or NetBeans), Web-browser with multiple tabs (in Chrome, each tab is a process on its own), FTP client, SSH client etc. Does having a multi-core system improve the ability to run multiple applications simultaneously?

Read the article

Echo 404 directly from nginx to improve performance

- by user64204

I am in charge of production servers serving static content for a website. Those servers are constantly being crawled by bots looking for potential exploits (which isn't that much of a problem security-wise because no application can be reached behind the web server) but generates thousands of 404 per day, sometimes per hour. I am looking into ways of blocking those requests but it's tricky (you want to make sure you don't block legitimate traffic and these bots are becoming more and more clever at looking like they're legit) and is going to take me a while to find an acceptable solution. In the meantime I would like to reduce the performance impact of serving those 404 pages. Indeed we're using nginx which by default is configured to serve it's 404 page from the disk (This can be changed using the error_page directive but in the end the 404 will either have to be served from disk or from another external source (e.g. upstream application which would be worst)) which isn't ideal. I ran a test with ab on my local machine with a basic configuration: in one case I echo a message directly from nginx so the disk isn't touched at all, in the other case I hit a missing page and nginx serves its 404 from disk. server { # [...] the default nginx stuff location / { } location /this_page_exists { echo "this page was found"; } } Here are the test results (my laptop has Intel(R) Core(TM) i7-2670QM + SSD in case you're wondering why they are so high): $ ab -n 500000 -c 1000 http://localhost/this_page_exists Requests per second: 25609.16 [#/sec] (mean) $ ab -n 500000 -c 1000 http://localhost/this_page_doesnt_exists Requests per second: 22905.72 [#/sec] (mean) As you can see, returning a value with echo is 11% ((25609-22905)÷22905×100) faster than serving the 404 page from disk. Accordingly I would like to echo a simple 404 Page not Found string from nginx. I tried many things so far but they all failed, essentially the idea was this: location / { try_files $uri @not_found; } location @not_found { echo "404 - Page not found"; } The problem is that as soon as the echo directive is used, the http response code is set to 200. I tried changing that by doing error_page 200 = 400 but that breaks the configuration. How can I serve a 404 page directly from nginx? (without hacking the source which may be might next step)

Read the article

Ubuntu's gui "Open With" command is not memorizing the application

- by Tom Brito

In Ubuntu, when we right-click a icon and use "open with", there is an option to remember that application for that file's type. For some reason, my system is no more memorizing the application (I think it was aftare some update). And, even when I have already used the "open with" command, the application is not showing in the "open with" list, so I have, everytime, to go to "open with"-"Other Application". Any hint to solve this?

Read the article

How to force two process to run on the same CPU?

- by kovan

Context: I'm programming a software system that consists of multiple processes. It is programmed in C++ under Linux. and they communicate among them using Linux shared memory. Usually, in software development, is in the final stage when the performance optimization is made. Here I came to a big problem. The software has high performance requirements, but in machines with 4 or 8 CPU cores (usually with more than one CPU), it was only able to use 3 cores, thus wasting 25% of the CPU power in the first ones, and more than 60% in the second ones. After many research, and having discarded mutex and lock contention, I found out that the time was being wasted on shmdt/shmat calls (detach and attach to shared memory segments). After some more research, I found out that these CPUs, which usually are AMD Opteron and Intel Xeon, use a memory system called NUMA, which basically means that each processor has its fast, "local memory", and accessing memory from other CPUs is expensive. After doing some tests, the problem seems to be that the software is designed so that, basically, any process can pass shared memory segments to any other process, and to any thread in them. This seems to kill performance, as process are constantly accessing memory from other processes. Question: Now, the question is, is there any way to force pairs of processes to execute in the same CPU?. I don't mean to force them to execute always in the same processor, as I don't care in which one they are executed, altough that would do the job. Ideally, there would be a way to tell the kernel: If you schedule this process in one processor, you must also schedule this "brother" process (which is the process with which it communicates through shared memory) in that same processor, so that performance is not penalized.

Read the article

Alternative or succesor to GDBM

- by Anon Guy

We a have a GDBM key-value database as the backend to a load-balanced web-facing application that is in implemented in C++. The data served by the application has grown very large, so our admins have moved the GDBM files from "local" storage (on the webservers, or very close by) to a large, shared, remote, NFS-mounted filesystem. This has affected performance. Our performance tests (in a test environment) show page load times jumping from hundreds of milliseconds (for local disk) to several seconds (over NFS, local network), and sometimes getting as high as 30 seconds. I believe a large part of the problem is that the application makes lots of random reads from the GDBM files, and that these are slow over NFS, and this will be even worse in production (where the front-end and back-end have even more network hardware between them) and as our database gets even bigger. While this is not a critical application, I would like to improve performance, and have some resources available, including the application developer time and Unix admins. My main constraint is time only have the resources for a few weeks. As I see it, my options are: Improve NFS performance by tuning parameters. My instinct is we wont get much out of this, but I have been wrong before, and I don't really know very much about NFS tuning. Move to a different key-value database, such as memcachedb or Tokyo Cabinet. Replace NFS with some other protocol (iSCSI has been mentioned, but i am not familiar with it). How should I approach this problem?

Read the article

Refactoring or Rewriting Monolithic PHP Spaghetti Codebase

- by nategood

I've inherited a really poorly designed PHP spaghetti code project. It's been gaining a good bit of traffic recently and is starting to have performance issues on top of the poor monolithic code base. Its maxing out performance on a chunky 16GB dedicated machine when it really shouldn't be. I'm planning on doing some performance tweaks right off the bat to help the performance issue, but this still won't really help the horrible code base. The team is small but expecting to grow very soon. I've read Joel's article on the troubles of doing a complete rewrite and see the concerns. But how bad does the code base have to be before you consider a rewrite? There is PHP handling logic interjected into what one would usually consider a "view". Even worse, in some places SQL statements are in these same files! The only real separation of presentation and logic are a few PHP scripts that serve as function libraries. These scripts do most of the ORM stuff... if you can even call it that. Trying to slowly refractor this seems like a nightmare. Open to your thoughts and opinions... however not interested in hearing, "Run away, Run away!".

Read the article

Windows Azure Service Bus Scatter-Gather Implementation

- by Alan Smith

One of the more challenging enterprise integration patterns that developers may wish to implement is the Scatter-Gather pattern. In this article I will show the basic implementation of a scatter-gather pattern using the topic-subscription model of the windows azure service bus. I’ll be using the implementation in demos, and also as a lab in my training courses, and the pattern will also be included in the next release of my free e-book the “Windows Azure Service Bus Developer Guide”. The Scatter-Gather pattern answers the following scenario. How do you maintain the overall message flow when a message needs to be sent to multiple recipients, each of which may send a reply? Use a Scatter-Gather that broadcasts a message to multiple recipients and re-aggregates the responses back into a single message. The Enterprise Integration Patterns website provides a description of the Scatter-Gather pattern here. The scatter-gather pattern uses a composite of the publish-subscribe channel pattern and the aggregator pattern. The publish-subscribe channel is used to broadcast messages to a number of receivers, and the aggregator is used to gather the response messages and aggregate them together to form a single message. Scatter-Gather Scenario The scenario for this scatter-gather implementation is an application that allows users to answer questions in a poll based voting scenario. A poll manager application will be used to broadcast questions to users, the users will use a voting application that will receive and display the questions and send the votes back to the poll manager. The poll manager application will receive the users’ votes and aggregate them together to display the results. The scenario should be able to scale to support a large number of users. Scatter-Gather Implementation The diagram below shows the overall architecture for the scatter-gather implementation. Messaging Entities Looking at the scatter-gather pattern diagram it can be seen that the topic-subscription architecture is well suited for broadcasting a message to a number of subscribers. The poll manager application can send the question messages to a topic, and each voting application can receive the question message on its own subscription. The static limit of 2,000 subscriptions per topic in the current release means that 2,000 voting applications can receive question messages and take part in voting. The vote messages can then be sent to the poll manager application using a queue. The voting applications will send their vote messages to the queue, and the poll manager will receive and process the vote messages. The questions topic and answer queue are created using the Windows Azure Developer Portal. Each instance of the voting application will create its own subscription in the questions topic when it starts, allowing the question messages to be broadcast to all subscribing voting applications. Data Contracts Two simple data contracts will be used to serialize the questions and votes as brokered messages. The code for these is shown below. [DataContract] public class Question { [DataMember] public string QuestionText { get; set; } } To keep the implementation of the voting functionality simple and focus on the pattern implementation, the users can only vote yes or no to the questions. [DataContract] public class Vote { [DataMember] public string QuestionText { get; set; } [DataMember] public bool IsYes { get; set; } } Poll Manager Application The poll manager application has been implemented as a simple WPF application; the user interface is shown below. A question can be entered in the text box, and sent to the topic by clicking the Add button. The topic and subscriptions used for broadcasting the messages are shown in a TreeView control. The questions that have been broadcast and the resulting votes are shown in a ListView control. When the application is started any existing subscriptions are cleared form the topic, clients are then created for the questions topic and votes queue, along with background workers for receiving and processing the vote messages, and updating the display of subscriptions. public MainWindow() { InitializeComponent(); // Create a new results list and data bind it. Results = new ObservableCollection<Result>(); lsvResults.ItemsSource = Results; // Create a token provider with the relevant credentials. TokenProvider credentials = TokenProvider.CreateSharedSecretTokenProvider (AccountDetails.Name, AccountDetails.Key); // Create a URI for the serivce bus. Uri serviceBusUri = ServiceBusEnvironment.CreateServiceUri ("sb", AccountDetails.Namespace, string.Empty); // Clear out any old subscriptions. NamespaceManager = new NamespaceManager(serviceBusUri, credentials); IEnumerable<SubscriptionDescription> subs = NamespaceManager.GetSubscriptions(AccountDetails.ScatterGatherTopic); foreach (SubscriptionDescription sub in subs) { NamespaceManager.DeleteSubscription(sub.TopicPath, sub.Name); } // Create the MessagingFactory MessagingFactory factory = MessagingFactory.Create(serviceBusUri, credentials); // Create the topic and queue clients. ScatterGatherTopicClient = factory.CreateTopicClient(AccountDetails.ScatterGatherTopic); ScatterGatherQueueClient = factory.CreateQueueClient(AccountDetails.ScatterGatherQueue); // Start the background worker threads. VotesBackgroundWorker = new BackgroundWorker(); VotesBackgroundWorker.DoWork += new DoWorkEventHandler(ReceiveMessages); VotesBackgroundWorker.RunWorkerAsync(); SubscriptionsBackgroundWorker = new BackgroundWorker(); SubscriptionsBackgroundWorker.DoWork += new DoWorkEventHandler(UpdateSubscriptions); SubscriptionsBackgroundWorker.RunWorkerAsync(); } When the poll manager user nters a question in the text box and clicks the Add button a question message is created and sent to the topic. This message will be broadcast to all the subscribing voting applications. An instance of the Result class is also created to keep track of the votes cast, this is then added to an observable collection named Results, which is data-bound to the ListView control. private void btnAddQuestion_Click(object sender, RoutedEventArgs e) { // Create a new result for recording votes. Result result = new Result() { Question = txtQuestion.Text }; Results.Add(result); // Send the question to the topic Question question = new Question() { QuestionText = result.Question }; BrokeredMessage msg = new BrokeredMessage(question); ScatterGatherTopicClient.Send(msg); txtQuestion.Text = ""; } The Results class is implemented as follows. public class Result : INotifyPropertyChanged { public string Question { get; set; } private int m_YesVotes; private int m_NoVotes; public event PropertyChangedEventHandler PropertyChanged; public int YesVotes { get { return m_YesVotes; } set { m_YesVotes = value; NotifyPropertyChanged("YesVotes"); } } public int NoVotes { get { return m_NoVotes; } set { m_NoVotes = value; NotifyPropertyChanged("NoVotes"); } } private void NotifyPropertyChanged(string prop) { if(PropertyChanged != null) { PropertyChanged(this, new PropertyChangedEventArgs(prop)); } } } The INotifyPropertyChanged interface is implemented so that changes to the number of yes and no votes will be updated in the ListView control. Receiving the vote messages from the voting applications is done asynchronously, using a background worker thread. // This runs on a background worker. private void ReceiveMessages(object sender, DoWorkEventArgs e) { while (true) { // Receive a vote message from the queue BrokeredMessage msg = ScatterGatherQueueClient.Receive(); if (msg != null) { // Deserialize the message. Vote vote = msg.GetBody<Vote>(); // Update the results. foreach (Result result in Results) { if (result.Question.Equals(vote.QuestionText)) { if (vote.IsYes) { result.YesVotes++; } else { result.NoVotes++; } break; } } // Mark the message as complete. msg.Complete(); } } } When a vote message is received, the result that matches the vote question is updated with the vote from the user. The message is then marked as complete. A second background thread is used to update the display of subscriptions in the TreeView, with a dispatcher used to update the user interface. // This runs on a background worker. private void UpdateSubscriptions(object sender, DoWorkEventArgs e) { while (true) { // Get a list of subscriptions. IEnumerable<SubscriptionDescription> subscriptions = NamespaceManager.GetSubscriptions(AccountDetails.ScatterGatherTopic); // Update the user interface. SimpleDelegate setQuestion = delegate() { trvSubscriptions.Items.Clear(); TreeViewItem topicItem = new TreeViewItem() { Header = AccountDetails.ScatterGatherTopic }; foreach (SubscriptionDescription subscription in subscriptions) { TreeViewItem subscriptionItem = new TreeViewItem() { Header = subscription.Name }; topicItem.Items.Add(subscriptionItem); } trvSubscriptions.Items.Add(topicItem); topicItem.ExpandSubtree(); }; this.Dispatcher.BeginInvoke(DispatcherPriority.Send, setQuestion); Thread.Sleep(3000); } } Voting Application The voting application is implemented as another WPF application. This one is more basic, and allows the user to vote “Yes” or “No” for the questions sent by the poll manager application. The user interface for that application is shown below. When an instance of the voting application is created it will create a subscription in the questions topic using a GUID as the subscription name. The application can then receive copies of every question message that is sent to the topic. Clients for the new subscription and the votes queue are created, along with a background worker to receive the question messages. The voting application is set to receiving mode, meaning it is ready to receive a question message from the subscription. public MainWindow() { InitializeComponent(); // Set the mode to receiving. IsReceiving = true; // Create a token provider with the relevant credentials. TokenProvider credentials = TokenProvider.CreateSharedSecretTokenProvider (AccountDetails.Name, AccountDetails.Key); // Create a URI for the serivce bus. Uri serviceBusUri = ServiceBusEnvironment.CreateServiceUri ("sb", AccountDetails.Namespace, string.Empty); // Create the MessagingFactory MessagingFactory factory = MessagingFactory.Create(serviceBusUri, credentials); // Create a subcription for this instance NamespaceManager mgr = new NamespaceManager(serviceBusUri, credentials); string subscriptionName = Guid.NewGuid().ToString(); mgr.CreateSubscription(AccountDetails.ScatterGatherTopic, subscriptionName); // Create the subscription and queue clients. ScatterGatherSubscriptionClient = factory.CreateSubscriptionClient (AccountDetails.ScatterGatherTopic, subscriptionName); ScatterGatherQueueClient = factory.CreateQueueClient(AccountDetails.ScatterGatherQueue); // Start the background worker thread. BackgroundWorker = new BackgroundWorker(); BackgroundWorker.DoWork += new DoWorkEventHandler(ReceiveMessages); BackgroundWorker.RunWorkerAsync(); } I took the inspiration for creating the subscriptions in the voting application from the chat application that uses topics and subscriptions blogged by Ovais Akhter here. The method that receives the question messages runs on a background thread. If the application is in receive mode, a question message will be received from the subscription, the question will be displayed in the user interface, the voting buttons enabled, and IsReceiving set to false to prevent more questing from being received before the current one is answered. // This runs on a background worker. private void ReceiveMessages(object sender, DoWorkEventArgs e) { while (true) { if (IsReceiving) { // Receive a question message from the topic. BrokeredMessage msg = ScatterGatherSubscriptionClient.Receive(); if (msg != null) { // Deserialize the message. Question question = msg.GetBody<Question>(); // Update the user interface. SimpleDelegate setQuestion = delegate() { lblQuestion.Content = question.QuestionText; btnYes.IsEnabled = true; btnNo.IsEnabled = true; }; this.Dispatcher.BeginInvoke(DispatcherPriority.Send, setQuestion); IsReceiving = false; // Mark the message as complete. msg.Complete(); } } else { Thread.Sleep(1000); } } } When the user clicks on the Yes or No button, the btnVote_Click method is called. This will create a new Vote data contract with the appropriate question and answer and send the message to the poll manager application using the votes queue. The user voting buttons are then disabled, the question text cleared, and the IsReceiving flag set to true to allow a new message to be received. private void btnVote_Click(object sender, RoutedEventArgs e) { // Create a new vote. Vote vote = new Vote() { QuestionText = (string)lblQuestion.Content, IsYes = ((sender as Button).Content as string).Equals("Yes") }; // Send the vote message. BrokeredMessage msg = new BrokeredMessage(vote); ScatterGatherQueueClient.Send(msg); // Update the user interface. lblQuestion.Content = ""; btnYes.IsEnabled = false; btnNo.IsEnabled = false; IsReceiving = true; } Testing the Application In order to test the application, an instance of the poll manager application is started; the user interface is shown below. As no instances of the voting application have been created there are no subscriptions present in the topic. When an instance of the voting application is created the subscription will be displayed in the poll manager. Now that a voting application is subscribing, a questing can be sent from the poll manager application. When the message is sent to the topic, the voting application will receive the message and display the question. The voter can then answer the question by clicking on the appropriate button. The results of the vote are updated in the poll manager application. When two more instances of the voting application are created, the poll manager will display the new subscriptions. More questions can then be broadcast to the voting applications. As the question messages are queued up in the subscription for each voting application, the users can answer the questions in their own time. The vote messages will be received by the poll manager application and aggregated to display the results. The screenshots of the applications part way through voting are shown below. The messages for each voting application are queued up in sequence on the voting application subscriptions, allowing the questions to be answered at different speeds by the voters.

Read the article

How can a single disk in a hardware SATA RAID-10 array bring the entire array to a screeching halt?

- by Stu Thompson

Prelude: I'm a code-monkey that's increasingly taken on SysAdmin duties for my small company. My code is our product, and increasingly we provide the same app as SaaS. About 18 months ago I moved our servers from a premium hosting centric vendor to a barebones rack pusher in a tier IV data center. (Literally across the street.) This ment doing much more ourselves--things like networking, storage and monitoring. As part the big move, to replace our leased direct attached storage from the hosting company, I built a 9TB two-node NAS based on SuperMicro chassises, 3ware RAID cards, Ubuntu 10.04, two dozen SATA disks, DRBD and . It's all lovingly documented in three blog posts: Building up & testing a new 9TB SATA RAID10 NFSv4 NAS: Part I, Part II and Part III. We also setup a Cacit monitoring system. Recently we've been adding more and more data points, like SMART values. I could not have done all this without the awesome boffins at ServerFault. It's been a fun and educational experience. My boss is happy (we saved bucket loads of $$$), our customers are happy (storage costs are down), I'm happy (fun, fun, fun). Until yesterday. Outage & Recovery: Some time after lunch we started getting reports of sluggish performance from our application, an on-demand streaming media CMS. About the same time our Cacti monitoring system sent a blizzard of emails. One of the more telling alerts was a graph of iostat await. Performance became so degraded that Pingdom began sending "server down" notifications. The overall load was moderate, there was not traffic spike. After logging onto the application servers, NFS clients of the NAS, I confirmed that just about everything was experiencing highly intermittent and insanely long IO wait times. And once I hopped onto the primary NAS node itself, the same delays were evident when trying to navigate the problem array's file system. Time to fail over, that went well. Within 20 minuts everything was confirmed to be back up and running perfectly. Post-Mortem: After any and all system failures I perform a post-mortem to determine the cause of the failure. First thing I did was ssh back into the box and start reviewing logs. It was offline, completely. Time for a trip to the data center. Hardware reset, backup an and running. In /var/syslog I found this scary looking entry: Nov 15 06:49:44 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_00], 6 Currently unreadable (pending) sectors Nov 15 06:49:44 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_07], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 171 to 170 Nov 15 06:49:45 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_10], 16 Currently unreadable (pending) sectors Nov 15 06:49:45 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_10], 4 Offline uncorrectable sectors Nov 15 06:49:45 umbilo smartd[2827]: Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error Nov 15 06:49:45 umbilo smartd[2827]: # 1 Short offline Completed: read failure 90% 6576 3421766910 Nov 15 06:49:45 umbilo smartd[2827]: # 2 Short offline Completed: read failure 90% 6087 3421766910 Nov 15 06:49:45 umbilo smartd[2827]: # 3 Short offline Completed: read failure 10% 5901 656821791 Nov 15 06:49:45 umbilo smartd[2827]: # 4 Short offline Completed: read failure 90% 5818 651637856 Nov 15 06:49:45 umbilo smartd[2827]: So I went to check the Cacti graphs for the disks in the array. Here we see that, yes, disk 7 is slipping away just like syslog says it is. But we also see that disk 8's SMART Read Erros are fluctuating. There are no messages about disk 8 in syslog. More interesting is that the fluctuating values for disk 8 directly correlate to the high IO wait times! My interpretation is that: Disk 8 is experiencing an odd hardware fault that results in intermittent long operation times. Somehow this fault condition on the disk is locking up the entire array Maybe there is a more accurate or correct description, but the net result has been that the one disk is impacting the performance of the whole array. The Question(s) How can a single disk in a hardware SATA RAID-10 array bring the entire array to a screeching halt? Am I being naïve to think that the RAID card should have dealt with this? How can I prevent a single misbehaving disk from impacting the entire array? Am I missing something?

Search Results

Search found 60072 results on 2403 pages for 'application performance'.

Page 133/2403 | < Previous Page | 129 130 131 132 133 134 135 136 137 138 139 140 | Next Page >

- by Luke404

- by user64204

- by Florian Kruse

- by Saif Bechan

- by BSchriver

- by csgwro

- by Clinton Bosch

- by pistacchio

- by tresstylez

- by ekms

- by questzen

- by Zorlack

- by nick3216

- by James

- by Dinidu

- by noonex

- by davykiash

- by aip.cd.aish

- by user64204

- by Tom Brito

- by kovan

- by Anon Guy

- by nategood

- by Alan Smith

- by Stu Thompson

< Previous Page | 129 130 131 132 133 134 135 136 137 138 139 140 | Next Page >