Search Results

Search found 13151 results on 527 pages for 'performance counters'.

Page 17/527 | < Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24  | Next Page >

  • Premature-Optimization and Performance Anxiety

    - by James Michael Hare
    While writing my post analyzing the new .NET 4 ConcurrentDictionary class (here), I fell into one of the classic blunders that I myself always love to warn about.  After analyzing the differences of time between a Dictionary with locking versus the new ConcurrentDictionary class, I noted that the ConcurrentDictionary was faster with read-heavy multi-threaded operations.  Then, I made the classic blunder of thinking that because the original Dictionary with locking was faster for those write-heavy uses, it was the best choice for those types of tasks.  In short, I fell into the premature-optimization anti-pattern. Basically, the premature-optimization anti-pattern is when a developer is coding very early for a perceived (whether rightly-or-wrongly) performance gain and sacrificing good design and maintainability in the process.  At best, the performance gains are usually negligible and at worst, can either negatively impact performance, or can degrade maintainability so much that time to market suffers or the code becomes very fragile due to the complexity. Keep in mind the distinction above.  I'm not talking about valid performance decisions.  There are decisions one should make when designing and writing an application that are valid performance decisions.  Examples of this are knowing the best data structures for a given situation (Dictionary versus List, for example) and choosing performance algorithms (linear search vs. binary search).  But these in my mind are macro optimizations.  The error is not in deciding to use a better data structure or algorithm, the anti-pattern as stated above is when you attempt to over-optimize early on in such a way that it sacrifices maintainability. In my case, I was actually considering trading the safety and maintainability gains of the ConcurrentDictionary (no locking required) for a slight performance gain by using the Dictionary with locking.  This would have been a mistake as I would be trading maintainability (ConcurrentDictionary requires no locking which helps readability) and safety (ConcurrentDictionary is safe for iteration even while being modified and you don't risk the developer locking incorrectly) -- and I fell for it even when I knew to watch out for it.  I think in my case, and it may be true for others as well, a large part of it was due to the time I was trained as a developer.  I began college in in the 90s when C and C++ was king and hardware speed and memory were still relatively priceless commodities and not to be squandered.  In those days, using a long instead of a short could waste precious resources, and as such, we were taught to try to minimize space and favor performance.  This is why in many cases such early code-bases were very hard to maintain.  I don't know how many times I heard back then to avoid too many function calls because of the overhead -- and in fact just last year I heard a new hire in the company where I work declare that she didn't want to refactor a long method because of function call overhead.  Now back then, that may have been a valid concern, but with today's modern hardware even if you're calling a trivial method in an extremely tight loop (which chances are the JIT compiler would optimize anyway) the results of removing method calls to speed up performance are negligible for the great majority of applications.  Now, obviously, there are those coding applications where speed is absolutely king (for example drivers, computer games, operating systems) where such sacrifices may be made.  But I would strongly advice against such optimization because of it's cost.  Many folks that are performing an optimization think it's always a win-win.  That they're simply adding speed to the application, what could possibly be wrong with that?  What they don't realize is the cost of their choice.  For every piece of straight-forward code that you obfuscate with performance enhancements, you risk the introduction of bugs in the long term technical debt of the application.  It will become so fragile over time that maintenance will become a nightmare.  I've seen such applications in places I have worked.  There are times I've seen applications where the designer was so obsessed with performance that they even designed their own memory management system for their application to try to squeeze out every ounce of performance.  Unfortunately, the application stability often suffers as a result and it is very difficult for anyone other than the original designer to maintain. I've even seen this recently where I heard a C++ developer bemoaning that in VS2010 the iterators are about twice as slow as they used to be because Microsoft added range checking (probably as part of the 0x standard implementation).  To me this was almost a joke.  Twice as slow sounds bad, but it almost never as bad as you think -- especially if you're gaining safety.  The only time twice is really that much slower is when once was too slow to begin with.  Think about it.  2 minutes is slow as a response time because 1 minute is slow.  But if an iterator takes 1 microsecond to move one position and a new, safer iterator takes 2 microseconds, this is trivial!  The only way you'd ever really notice this would be in iterating a collection just for the sake of iterating (i.e. no other operations).  To my mind, the added safety makes the extra time worth it. Always favor safety and maintainability when you can.  I know it can be a hard habit to break, especially if you started out your career early or in a language such as C where they are very performance conscious.  But in reality, these type of micro-optimizations only end up hurting you in the long run. Remember the two laws of optimization.  I'm not sure where I first heard these, but they are so true: For beginners: Do not optimize. For experts: Do not optimize yet. This is so true.  If you're a beginner, resist the urge to optimize at all costs.  And if you are an expert, delay that decision.  As long as you have chosen the right data structures and algorithms for your task, your performance will probably be more than sufficient.  Chances are it will be network, database, or disk hits that will be your slow-down, not your code.  As they say, 98% of your code's bottleneck is in 2% of your code so premature-optimization may add maintenance and safety debt that won't have any measurable impact.  Instead, code for maintainability and safety, and then, and only then, when you find a true bottleneck, then you should go back and optimize further.

    Read the article

  • High-concurrency counters without sharding

    - by dound
    This question concerns two implementations of counters which are intended to scale without sharding (with a tradeoff that they might under-count in some situations): http://appengine-cookbook.appspot.com/recipe/high-concurrency-counters-without-sharding/ (the code in the comments) http://blog.notdot.net/2010/04/High-concurrency-counters-without-sharding My questions: With respect to #1: Running memcache.decr() in a deferred, transactional task seems like overkill. If memcache.decr() is done outside the transaction, I think the worst-case is the transaction fails and we miss counting whatever we decremented. Am I overlooking some other problem that could occur by doing this? What are the significiant tradeoffs between the two implementations? Here are the tradeoffs I see: #2 does not require datastore transactions. To get the counter's value, #2 requires a datastore fetch while with #1 typically only needs to do a memcache.get() and memcache.add(). When incrementing a counter, both call memcache.incr(). Periodically, #2 adds a task to the task queue while #1 transactionally performs a datastore get and put. #1 also always performs memcache.add() (to test whether it is time to persist the counter to the datastore). Conclusions (without actually running any performance tests): #1 should typically be faster at retrieving a counter (#1 memcache vs #2 datastore). Though #1 has to perform an extra memcache.add() too. However, #2 should be faster when updating counters (#1 datastore get+put vs #2 enqueue a task). On the other hand, with #1 you have to be a bit more careful with the update interval since the task queue quota is almost 100x smaller than either the datastore or memcahce APIs.

    Read the article

  • How to avoid Memory "Hard Fault/sec"

    - by Flavio Oliveira
    i've a problem on my windows 2008 server x64, and i cannot understand how can i solve it. i'm looking to Resource Monitor and see about 100 to 200 hard faults/sec. and generally the machine is slow. As i've readed a bit it is caused by a "memory Page" that is no longer available on physical memory and causes a io operations (disk) and it is a problem. The current hardware is a intel core2duo E8400 (3.0GHz) with 6GB RAM on a Windows Server Web 64-bit. Actually the machine have about 2GB Ram used what having 4Gb available to use, Why is the machine requires that high level of Disk operations? what can i do to increase the performance? Im experiencing a memory issues? what should be my starting point?

    Read the article

  • SQL server peformance, virtual memory usage

    - by user45641
    Hello, I have a very large DB used mostly for analytics. The performance overall is very sluggish. I just noticed that when running the query below, the amount of virtual memory used greatly exceeds the amount of physical memory available. Currently, physical memory is 10GB (10238 MB) whereas the virtual memory returns significantly more - 8388607 MB. That seems really wrong, but I'm at a bit of a loss on how to proceed. USE [master]; GO select cpu_count , hyperthread_ratio , physical_memory_in_bytes / 1048576 as 'mem_MB' , virtual_memory_in_bytes / 1048576 as 'virtual_mem_MB' , max_workers_count , os_error_mode , os_priority_class from sys.dm_os_sys_info

    Read the article

  • Benchmarking Java programs

    - by stefan-ock
    For university, I perform bytecode modifications and analyze their influence on performance of Java programs. Therefore, I need Java programs---in best case used in production---and appropriate benchmarks. For instance, I already got HyperSQL and measure its performance by the benchmark program PolePosition. The Java programs running on a JVM without JIT compiler. Thanks for your help! P.S.: I cannot use programs to benchmark the performance of the JVM or of the Java language itself (such as Wide Finder).

    Read the article

  • Can compressing Program Files save space *and* give a significant boost to SSD performance?

    - by Christopher Galpin
    Considering solid-state disk space is still an expensive resource, compressing large folders has appeal. Thanks to VirtualStore, could Program Files be a case where it might even improve performance? Discovery In particular I have been reading: SSD and NTFS Compression Speed Increase? Does NTFS compression slow SSD/flash performance? Will somebody benchmark whole disk compression (HD,SSD) please? (may have to scroll up) The first link is particularly dreamy, but maybe head a little too far in the clouds. The third link has this sexy semi-log graph (logarithmic scale!). Quote (with notes): Using highly compressable data (IOmeter), you get at most a 30x performance increase [for reads], and at least a 49x performance DECREASE [for writes]. Assuming I interpreted and clarified that sentence correctly, this single user's benchmark has me incredibly interested. Although write performance tanks wretchedly, read performance still soars. It gave me an idea. Idea: VirtualStore It so happens that thanks to sanity saving security features introduced in Windows Vista, write access to certain folders such as Program Files is virtualized for non-administrator processes. Which means, in normal (non-elevated) usage, a program or game's attempt to write data to its install location in Program Files (which is perhaps a poor location) is redirected to %UserProfile%\AppData\Local\VirtualStore, somewhere entirely different. Thus, to my understanding, writes to Program Files should primarily only occur when installing an application. This makes compressing it not only a huge source of space gain, but also a potential candidate for performance gain. Testing The beginning of this post has me a bit timid, it suggests benchmarking NTFS compression on a whole drive is difficult because turning it off "doesn't decompress the objects". However it seems to me the compact command is perfectly capable of doing so for both drives and individual folders. Could it be only marking them for decompression the next time the OS reads from them? I need to find the answer before I begin my own testing.

    Read the article

  • NSClient++ FAIL on Windows 2008 R2 -- PDHCollector.cpp(215) Failed to query performance counters

    - by John DaCosta
    I am attempting to monitor windows server 2008 r2 x64 Enterprise with Nagios. When I test/install the nsclientI get the following error: PDHCollector.cpp(215) Failed to query performance counters: \Processor(_total)\% Processor Time: PdhGetFormattedCounterValue failed: A counter with a negative denominator value was detected. (800007D6) Has anyone else encountered the same issue and / or resolved it, found a work around?

    Read the article

  • Performance question: Inverting an array of pointers in-place vs array of values

    - by Anders
    The background for asking this question is that I am solving a linearized equation system (Ax=b), where A is a matrix (typically of dimension less than 100x100) and x and b are vectors. I am using a direct method, meaning that I first invert A, then find the solution by x=A^(-1)b. This step is repated in an iterative process until convergence. The way I'm doing it now, using a matrix library (MTL4): For every iteration I copy all coeffiecients of A (values) in to the matrix object, then invert. This the easiest and safest option. Using an array of pointers instead: For my particular case, the coefficients of A happen to be updated between each iteration. These coefficients are stored in different variables (some are arrays, some are not). Would there be a potential for performance gain if I set up A as an array containing pointers to these coefficient variables, then inverting A in-place? The nice thing about the last option is that once I have set up the pointers in A before the first iteration, I would not need to copy any values between successive iterations. The values which are pointed to in A would automatically be updated between iterations. So the performance question boils down to this, as I see it: - The matrix inversion process takes roughly the same amount of time, assuming de-referencing of pointers is non-expensive. - The array of pointers does not need the extra memory for matrix A containing values. - The array of pointers option does not have to copy all NxN values of A between each iteration. - The values that are pointed to the array of pointers option are generally NOT ordered in memory. Hopefully, all values lie relatively close in memory, but *A[0][1] is generally not next to *A[0][0] etc. Any comments to this? Will the last remark affect performance negatively, thus weighing up for the positive performance effects?

    Read the article

  • Measuring ASP.NET and SharePoint output cache

    - by DigiMortal
    During ASP.NET output caching week in my local blog I wrote about how to measure ASP.NET output cache. As my posting was based on real work and real-life results then I thought that this posting is maybe interesting to you too. So here you can read what I did, how I did and what was the result. Introduction Caching is not effective without measuring it. As MVP Henn Sarv said in one of his sessions then you will get what you measure. And right he is. Lately I measured caching on local Microsoft community portal to make sure that our caching strategy is good enough in environment where this system lives. In this posting I will show you how to start measuring the cache of your web applications. Although the application measured is built on SharePoint Server publishing infrastructure, all those counters have same meaning as similar counters under pure ASP.NET applications. Measured counters I used Performance Monitor and the following performance counters (their names are similar on ASP.NET and SharePoint WCMS): Total number of objects added – how much objects were added to output cache. Total object discards – how much objects were deleted from output cache. Cache hit count – how many times requests were served by cache. Cache hit ratio – percent of requests served from cache. The first three counters are cumulative while last one is coefficient. You can use also other counters to measure the full effect of caching (memory, processor, disk I/O, network load etc before and after caching). Measuring process The measuring I describe here started from freshly restarted web server. I measured application during 12 hours that covered also time ranges when users are most active. The time range does not include late evening hours and night because there is nothing to measure during these hours. During measuring we performed no maintenance or administrative tasks on server. All tasks performed were related to usual daily content management and content monitoring. Also we had no advertisement campaigns or other promotions running at same time. The results You can see the results on following graphic.   Total number of objects added   Total object discards   Cache hit count   Cache hit ratio You can see that adds and discards are growing in same tempo. It is good because cache expires and not so popular items are not kept in memory. If there are more popular content then the these lines may have bigger distance between them. Cache hit count grows faster and this shows that more and more content is served from cache. In current case it shows that cache is filled optimally and we can do even better if we tune caches more. The site contains also pages that are discarded when some subsite changes (page was added/modified/deleted) and one modification may affect about four or five pages. This may also decrease cache hit count because during day the site gets about 5-10 new pages. Cache hit ratio is currently extremely good. The suggested minimum is about 85% but after some tuning and measuring I achieved 98.7% as a result. This is due to the fact that new pages are most often requested and after new pages are added the older ones are requested only sometimes. So they get discarded from cache and only some of these will return sometimes back to cache. Although this may also indicate the need for additional SEO work the result is very well in technical means. Conclusion Measuring ASP.NET output cache is not complex thing to do and you can start by measuring performance of cache as a start. Later you can move on and measure caching effect to other counters such as disk I/O, network, processors etc. What you have to achieve is optimal cache that is not full of items asked only couple of times per day (you can avoid this by not using too long cache durations). After some tuning you should be able to boost cache hit ratio up to at least 85%.

    Read the article

  • Performance impact: What is the optimal payload for SqlBulkCopy.WriteToServer()?

    - by Linchi Shea
    For many years, I have been using a C# program to generate the TPC-C compliant data for testing. The program relies on the SqlBulkCopy class to load the data generated by the program into the SQL Server tables. In general, the performance of this C# data loader is satisfactory. Lately however, I found myself in a situation where I needed to generate a much larger amount of data than I typically do and the data needed to be loaded within a confined time frame. So I was driven to look into the code...(read more)

    Read the article

  • Performance triage

    - by Dave
    Folks often ask me how to approach a suspected performance issue. My personal strategy is informed by the fact that I work on concurrency issues. (When you have a hammer everything looks like a nail, but I'll try to keep this general). A good starting point is to ask yourself if the observed performance matches your expectations. Expectations might be derived from known system performance limits, prototypes, and other software or environments that are comparable to your particular system-under-test. Some simple comparisons and microbenchmarks can be useful at this stage. It's also useful to write some very simple programs to validate some of the reported or expected system limits. Can that disk controller really tolerate and sustain 500 reads per second? To reduce the number of confounding factors it's better to try to answer that question with a very simple targeted program. And finally, nothing beats having familiarity with the technologies that underlying your particular layer. On the topic of confounding factors, as our technology stacks become deeper and less transparent, we often find our own technology working against us in some unexpected way to choke performance rather than simply running into some fundamental system limit. A good example is the warm-up time needed by just-in-time compilers in Java Virtual Machines. I won't delve too far into that particular hole except to say that it's rare to find good benchmarks and methodology for java code. Another example is power management on x86. Power management is great, but it can take a while for the CPUs to throttle up from low(er) frequencies to full throttle. And while I love "turbo" mode, it makes benchmarking applications with multiple threads a chore as you have to remember to turn it off and then back on otherwise short single-threaded runs may look abnormally fast compared to runs with higher thread counts. In general for performance characterization I disable turbo mode and fix the power governor at "performance" state. Another source of complexity is the scheduler, which I've discussed in prior blog entries. Lets say I have a running application and I want to better understand its behavior and performance. We'll presume it's warmed up, is under load, and is an execution mode representative of what we think the norm would be. It should be in steady-state, if a steady-state mode even exists. On Solaris the very first thing I'll do is take a set of "pstack" samples. Pstack briefly stops the process and walks each of the stacks, reporting symbolic information (if available) for each frame. For Java, pstack has been augmented to understand java frames, and even report inlining. A few pstack samples can provide powerful insight into what's actually going on inside the program. You'll be able to see calling patterns, which threads are blocked on what system calls or synchronization constructs, memory allocation, etc. If your code is CPU-bound then you'll get a good sense where the cycles are being spent. (I should caution that normal C/C++ inlining can diffuse an otherwise "hot" method into other methods. This is a rare instance where pstack sampling might not immediately point to the key problem). At this point you'll need to reconcile what you're seeing with pstack and your mental model of what you think the program should be doing. They're often rather different. And generally if there's a key performance issue, you'll spot it with a moderate number of samples. I'll also use OS-level observability tools to lock for the existence of bottlenecks where threads contend for locks; other situations where threads are blocked; and the distribution of threads over the system. On Solaris some good tools are mpstat and too a lesser degree, vmstat. Try running "mpstat -a 5" in one window while the application program runs concurrently. One key measure is the voluntary context switch rate "vctx" or "csw" which reflects threads descheduling themselves. It's also good to look at the user; system; and idle CPU percentages. This can give a broad but useful understanding if your threads are mostly parked or mostly running. For instance if your program makes heavy use of malloc/free, then it might be the case you're contending on the central malloc lock in the default allocator. In that case you'd see malloc calling lock in the stack traces, observe a high csw/vctx rate as threads block for the malloc lock, and your "usr" time would be less than expected. Solaris dtrace is a wonderful and invaluable performance tool as well, but in a sense you have to frame and articulate a meaningful and specific question to get a useful answer, so I tend not to use it for first-order screening of problems. It's also most effective for OS and software-level performance issues as opposed to HW-level issues. For that reason I recommend mpstat & pstack as my the 1st step in performance triage. If some other OS-level issue is evident then it's good to switch to dtrace to drill more deeply into the problem. Only after I've ruled out OS-level issues do I switch to using hardware performance counters to look for architectural impediments.

    Read the article

  • Disabling CPU management

    - by Tiffany Walker
    If I add the following processor.max_cstate=0 to the kernel command line for boot up, does that disable all CPU power management and throttling? I also found: http://www.experts-exchange.com/OS/Linux/Administration/A_3492-Avoiding-CPU-speed-scaling-in-modern-Linux-distributions-Running-CPU-at-full-speed-Tips.html The link talks of Change CPU governor from 'ondemand' to 'performance' for all CPUs/cores and disabling kondemand from kernel. Server is for web hosting UPDATES: 2.6.32-379.1.1.lve1.1.7.6.el6.x86_64 #1 SMP Sat Aug 4 09:56:37 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux . # dmidecode 2.11 SMBIOS 2.6 present. 74 structures occupying 2878 bytes. Table at 0x0009F000. Handle 0x0000, DMI type 0, 24 bytes BIOS Information Vendor: American Megatrends Inc. Version: 1.0c Release Date: 05/27/2010 Address: 0xF0000 Runtime Size: 64 kB ROM Size: 4096 kB Characteristics: ISA is supported PCI is supported PNP is supported BIOS is upgradeable BIOS shadowing is allowed ESCD support is available Boot from CD is supported Selectable boot is supported BIOS ROM is socketed EDD is supported 5.25"/1.2 MB floppy services are supported (int 13h) 3.5"/720 kB floppy services are supported (int 13h) 3.5"/2.88 MB floppy services are supported (int 13h) Print screen service is supported (int 5h) 8042 keyboard services are supported (int 9h) Serial services are supported (int 14h) Printer services are supported (int 17h) CGA/mono video services are supported (int 10h) ACPI is supported USB legacy is supported LS-120 boot is supported ATAPI Zip drive boot is supported BIOS boot specification is supported Targeted content distribution is supported BIOS Revision: 8.16 Handle 0x0001, DMI type 1, 27 bytes System Information Manufacturer: Supermicro Product Name: X8SIE Version: 0123456789 Serial Number: 0123456789 UUID: 49434D53-0200-9033-2500-33902500D52C Wake-up Type: Power Switch SKU Number: To Be Filled By O.E.M. Family: To Be Filled By O.E.M. Handle 0x0002, DMI type 2, 15 bytes Base Board Information Manufacturer: Supermicro Product Name: X8SIE Version: 0123456789 Serial Number: VM11S61561 Asset Tag: To Be Filled By O.E.M. Features: Board is a hosting board Board is replaceable Location In Chassis: To Be Filled By O.E.M. Chassis Handle: 0x0003 Type: Motherboard Contained Object Handles: 0 Handle 0x0003, DMI type 3, 21 bytes Chassis Information Manufacturer: Supermicro Type: Sealed-case PC Lock: Not Present Version: 0123456789 Serial Number: 0123456789 Asset Tag: To Be Filled By O.E.M. Boot-up State: Safe Power Supply State: Safe Thermal State: Safe Security Status: None OEM Information: 0x00000000 Height: Unspecified Number Of Power Cords: 1 Contained Elements: 0

    Read the article

  • Very long (>300s) request processing time on Apache Server serving static content from particular IP

    - by Ron Bieber
    We are running an Apache 2.2 server for a very large web site. Over the past few months we have been having some users reporting slow response times, while others (including our resources, both on the internal network and our home networks) do not see any degradation in performance. After a ton of investigation, we finally found a "Deny from none" statement in our configuration that was causing reverse DNS lookups (which were timing out) that solved the bulk of our issues, but we still have some customers that we are seeing in the Apache logs (using %D in the log format) with request processing times of 300s for images, css, javascript and other static content. We've checked all Deny / Allow statements for reoccurrence of "none", as well as all other things we know of that would cause reverse DNS lookups (such as using "REMOTE_HOST" in rewrite rules, using %a instead of %h in our log format configuration) as well as verified that HostnameLookups is set to "Off". As an aside, we've also validated that reverse DNS lookups for folks having this problem do not time out - so I'm fairly certain DNS is not an issue in this case. I've run out of ideas. Are there any Apache configuration scenarios that someone can point me to that I might be missing that would cause request times for static content to take so long only for certain users? Thank you in advance.

    Read the article

  • Understanding RedHats recommended tuned profiles

    - by espenfjo
    We are going to roll out tuned (and numad) on ~1000 servers, the majority of them being VMware servers either on NetApp or 3Par storage. According to RedHats documentation we should choose the virtual-guestprofile. What it is doing can be seen here: tuned.conf We are changing the IO scheduler to NOOP as both VMware and the NetApp/3Par should do sufficient scheduling for us. However, after investigating a bit I am not sure why they are increasing vm.dirty_ratio and kernel.sched_min_granularity_ns. As far as I have understood increasing increasing vm.dirty_ratio to 40% will mean that for a server with 20GB ram, 8GB can be dirty at any given time unless vm.dirty_writeback_centisecsis hit first. And while flushing these 8GB all IO for the application will be blocked until the dirty pages are freed. Increasing the dirty_ratio would probably mean higher write performance at peaks as we now have a larger cache, but then again when the cache fills IO will be blocked for a considerably longer time (Several seconds). The other is why they are increasing the sched_min_granularity_ns. If I understand it correctly increasing this value will decrease the number of time slices per epoch(sched_latency_ns) meaning that running tasks will get more time to finish their work. I can understand this being a very good thing for applications with very few threads, but for eg. apache or other processes with a lot of threads would this not be counter-productive?

    Read the article

  • F# performance in scientific computing

    - by aaa
    hello. I am curious as to how F# performance compares to C++ performance? I asked a similar question with regards to Java, and the impression I got was that Java is not suitable for heavy numbercrunching. I have read that F# is supposed to be more scalable and more performant, but how is this real-world performance compares to C++? specific questions about current implementation are: How well does it do floating-point? Does it allow vector instructions how friendly is it towards optimizing compilers? How big a memory foot print does it have? Does it allow fine-grained control over memory locality? does it have capacity for distributed memory processors, for example Cray? what features does it have that may be of interest to computational science where heavy number processing is involved? Are there actual scientific computing implementations that use it? Thanks

    Read the article

  • VS2010 + Resharper 5 performance issues

    - by Jeremy Roberts
    I have been using VS2010 with Resharper 5 for several weeks and am having a performance issue. Sometimes when typing, the cursor will lag and the keystrokes won't show instantaneously. Also, scrolling will lag at times. There is a forum thread started and JetBrains has been responding. Several people (including myself) have added their voice and uploaded some performance profiles. If anyone here has has this issue, I would encourage you to visit the thread and let JetBrains know about it. Has anyone had this problem and have a suggestion to restore performance?

    Read the article

  • Measuring Web Page Performance on Client vs. Server

    - by Yaakov Ellis
    I am working with a web page (ASP.net 3.5) that is very complicated and in certain circumstances has major performance issues. It uses Ajax (through the Telerik AjaxManager) for most of its functionality. I would like to be able to measure in some way the amounts of time for the following, for each request: On client submitting request to server Client-to-Server On server initializing request On server processing request Server-to-Client Client rendering, JavaScript processing I have monitored the database traffic and cannot find any obvious culprit. On the other hand, I have a suspicion that some of the Ajax interactions are causing performance issues. However, until I have a way to track the times involved, make a baseline measurement, and measure performance as I tweak, it will be hard to work on the issue. So what is the best way to measure all of these? Is there one tool that can do it? Combination of FireBug and logging inserted into different places in the page life-cycle?

    Read the article

  • Looking for SQL Server Performance Monitor Tools

    - by the-locster
    I may be approaching this problem from the wrong angle but what I'm thinking of is some kind of performance monitor tool for SQl server that works in a similar way to code performance tools, e.g. I;d like to see an output of how many times each stored procedure was called, average executuion time and possibly various resource usage stats such as cache/index utilisation, resultign disk access and table scans, etc. As far as I can tell the performance monitor that comes with SQL Server just logs the various calls but doesn't report he variosu stats I'm looking for. Potentially I just need a tool to analyze the log output?

    Read the article

  • Entity Framework associations killing performance

    - by Chris
    Here is the performance test i am looking at. I have 8 different entities that are table per type. Some of the entities contain over 100 thousand rows. This particular application does several recursive calculations on the client so I think it may be best to preload the data instead of lazy loading. If there are no associations I can load the entire database in about 3 seconds. As I add associations in any way the performance starts to drastically decline. I am loading all the data the same way (just calling toList() on the entity attached to the context). I ran the test with edmx generated classes and self tracking entities and had similar results. I am sure if I were to try and deal with the associations myself, similar to how I would in a dataset, the performance problem would go away. On the other hand I am pretty sure this is not how the entity framework was intended to being used. Any thoughts or ideas?

    Read the article

  • azure performance

    - by Dave K
    I've moved my app from a dedicated server to azure (and sql azure), and have noticed substantial performance degradation. obviously not having the database and web server on the same piece of hardware is much of it, but I'm curious what other people have found in migrating to azure, and if there is anything any of you would suggest I do to improve it. Right now I'm considering moving back to my dedicated server... So in summary, are there any rules of thumb for this, existing research (wasn't able to find much) or other pieces of advice on improving the performance of the app? has anyone else found the same to be true, and improved their site's performance in some way? it's built in C# on asp.net mvc 2. Thanks!

    Read the article

  • Performance statistics hooks

    - by tinny
    Lets be honest, most software that developers produce has quite modest performance requirements. E.g. Systems perhaps serving 100's of requests per second, if that. But lets assume for a moment (or even dream) that you where perhaps involved in the "next big thing" (whatever that means) and you wanted to put some sort of performance statistics logging in place to help you out when all those users come flying in. Performance statistics logging, how would you approach this requirement? Perhaps you would use some sort of generic framework for this? Or roll your own solution? What would you log? How granular? Or would you not even bother putting anything in place and rather deal with this issue when it actually became an issue? It would be really interesting to hear your thoughts on this topic.

    Read the article

  • Is code clearness killing application performance?

    - by Jorge Córdoba
    As today's code is getting more complex by the minute, code needs to be designed to be maintainable - meaning easy to read, and easy to understand. That being said, I can't help but remember the programs that ran a couple of years ago such as Winamp or some games in which you needed a high performance program because your 486 100 Mhz wouldn't play mp3s with that beautiful mp3 player which consumed all of your CPU cycles. Now I run Media Player (or whatever), start playing an mp3 and it eats up a 25-30% of one of my four cores. Come on!! If a 486 can do it, how can the playback take up so much processor to do the same? I'm a developer myself, and I always used to advise: keep your code simple, don't prematurely optimize for performance. It seems that we've gone from "trying to get it to use the least amount of CPU as possible" to "if it doesn't take too much CPU is all right". So, do you think we are killing performance by ignoring optimizations?

    Read the article

  • Oracle, slow performance when using sub select

    - by Wyass
    I have a view that is very slow if you fetch all rows. But if I select a subset (providing an ID in the where clause) the performance is very good. I cannot hardcode the ID so I create a sub select to get the ID from another table. The sub select only returns one ID. Now the performance is very slow and it seems like Oracle is evaluating the whole view before using the where clause. Can I somehow help Oracle so SQL 2 and 3 have the same performance? I’m using Oracle 10g 1 slow select * from ci.my_slow_view 2 fast select * from ci.my_slow_view where id = 1; 3 slow select * from ci.my_slow_view where id in (select id from active_ids)

    Read the article

  • Apache2 benchmarks - very poor performance

    - by andrzejp
    I have two servers on which I test the configuration of apache2. The first server: 4GB of RAM, AMD Athlon (tm) 64 X2 Dual Core Processor 5600 + Apache 2.2.3, mod_php, mpm prefork: Settings: Timeout 100 KeepAlive On MaxKeepAliveRequests 150 KeepAliveTimeout 4 <IfModule Mpm_prefork_module> StartServers 7 MinSpareServers 15 MaxSpareServers 30 MaxClients 250 MaxRequestsPerChild 2000 </ IfModule> Compiled in modules: core.c mod_log_config.c mod_logio.c prefork.c http_core.c mod_so.c Second server: 8GB of RAM, Intel (R) Core (TM) i7 CPU [email protected] Apache 2.2.9, **fcgid, mpm worker, suexec** PHP scripts are running via fcgi-wrapper Settings: Timeout 100 KeepAlive On MaxKeepAliveRequests 100 KeepAliveTimeout 4 <IfModule Mpm_worker_module> StartServers 10 MaxClients 200 MinSpareThreads 25 MaxSpareThreads 75 ThreadsPerChild 25 MaxRequestsPerChild 1000 </ IfModule> Compiled in modules: core.c mod_log_config.c mod_logio.c worker.c http_core.c mod_so.c The following test results, which are very strange! New server (dynamic content - php via fcgid+suexec): Server Software: Apache/2.2.9 Server Hostname: XXXXXXXX Server Port: 80 Document Path: XXXXXXX Document Length: 179512 bytes Concurrency Level: 10 Time taken for tests: 0.26276 seconds Complete requests: 1000 Failed requests: 0 Total transferred: 179935000 bytes HTML transferred: 179512000 bytes Requests per second: 38.06 Transfer rate: 6847.88 kb/s received Connnection Times (ms) min avg max Connect: 2 4 54 Processing: 161 257 449 Total: 163 261 503 Old server (dynamic content - mod_php): Server Software: Apache/2.2.3 Server Hostname: XXXXXX Server Port: 80 Document Path: XXXXXX Document Length: 187537 bytes Concurrency Level: 10 Time taken for tests: 173.073 seconds Complete requests: 1000 Failed requests: 22 (Connect: 0, Length: 22, Exceptions: 0) Total transferred: 188003372 bytes HTML transferred: 187546372 bytes Requests per second: 5777.91 Transfer rate: 1086267.40 kb/s received Connnection Times (ms) min avg max Connect: 3 3 28 Processing: 298 1724 26615 Total: 301 1727 26643 Old server: Static content (jpg file) Server Software: Apache/2.2.3 Server Hostname: xxxxxxxxx Server Port: 80 Document Path: /images/top2.gif Document Length: 40486 bytes Concurrency Level: 100 Time taken for tests: 3.558 seconds Complete requests: 1000 Failed requests: 0 Write errors: 0 Total transferred: 40864400 bytes HTML transferred: 40557482 bytes Requests per second: 281.09 [#/sec] (mean) Time per request: 355.753 [ms] (mean) Time per request: 3.558 [ms] (mean, across all concurrent requests) Transfer rate: 11217.51 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 3 11 4.5 12 23 Processing: 40 329 61.4 339 1009 Waiting: 6 282 55.2 293 737 Total: 43 340 63.0 351 1020 New server - static content (jpg file) Server Software: Apache/2.2.9 Server Hostname: XXXXX Server Port: 80 Document Path: /images/top2.gif Document Length: 40486 bytes Concurrency Level: 100 Time taken for tests: 3.571531 seconds Complete requests: 1000 Failed requests: 0 Write errors: 0 Total transferred: 41282792 bytes HTML transferred: 41030080 bytes Requests per second: 279.99 [#/sec] (mean) Time per request: 357.153 [ms] (mean) Time per request: 3.572 [ms] (mean, across all concurrent requests) Transfer rate: 11287.88 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 2 63 24.8 66 119 Processing: 124 278 31.8 282 391 Waiting: 3 70 28.5 66 164 Total: 126 341 35.9 350 443 I noticed that in the apache error.log is a lot of entries: [notice] mod_fcgid: call /www/XXXXX/public_html/forum/index.php with wrapper /www/php-fcgi-scripts/XXXXXX/php-fcgi-starter What I have omitted, or do not understand? Such a difference in requests per second? Is it possible? What could be the cause?

    Read the article

  • openvpn TCP/UDP slow SSH/SMB performance

    - by Petr Latal
    I have question about strange behavior of my openVPN configuration on Debian lenny. I have 2 server configs (one proto tcp-server based and one proto udp based). ISP bandwidth is 7Mbit/7Mbit. When I uses proto tcp-server my download server rate is fine around 6,4 Mbit/s, but upload rate is about 3Mbit/s. When I uses proto udp, my download server rate is around 3Mbit/s and upload rate around 6,4Mbit/s. I tried to handle the MTU, MSSFIX and cipher on/off on server and client configs to synchronize rates, but without solution. Here is TCP based SERVER config: mode server tls-server port 1194 proto tcp-server dev tap0 ifconfig 11.10.15.1 255.255.255.0 ifconfig-pool 11.10.15.2 11.10.15.20 255.255.255.0 push "route 192.168.1.0 255.255.255.0" push "dhcp-option DNS 192.168.1.200" push "route-gateway 11.10.15.1" push "dhcp-option WINS 192.168.1.200" route-up /etc/openvpn/routeup.sh duplicate-cn ca /etc/openvpn/ca.crt cert /etc/openvpn/server.crt key /etc/openvpn/server.key dh /etc/openvpn/dh2048.pem log-append /var/log/openvpn.log status /var/run/vpn.status 10 user nobody group nogroup keepalive 10 120 comp-lzo verb 3 script-security 3 plugin /usr/lib/openvpn/openvpn-auth-pam.so system-auth persist-tun persist-key mssfix cipher BF-CBC Here is UDP based SERVER config: port 1194 proto udp dev tun0 local xx.xx.xx.xx server 11.10.15.0 255.255.255.0 ca /etc/openvpn/ca.crt cert /etc/openvpn/server.crt key /etc/openvpn/server.key dh /etc/openvpn/dh2048.pem log-append /var/log/openvpn.log status /var/run/vpn.status 10 user nobody group nogroup keepalive 10 120 comp-lzo verb 3 duplicate-cn script-security 3 plugin /usr/lib/openvpn/openvpn-auth-pam.so system-auth persist-tun persist-key tun-mtu 1500 mssfix 1212 client-to-client ifconfig-pool-persist ipp.txt Here is TCP/UDP based windows CLIENT config: remote xx.xx.xx.xx --socket-flags TCP_NODELAY tls-client port 1194 proto tcp-client #proto udp dev tap #dev tun pull ca ca.crt cert latis.crt key latis.key mute 0 comp-lzo adaptive verb 3 resolv-retry infinite nobind persist-key auth-user-pass auth-nocache script-security 2 mssfix cipher BF-CBC

    Read the article

< Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24  | Next Page >