Search Results

Search found 13249 results on 530 pages for 'performance tuning'.

Page 3/530 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

How do I calculate the amount of tuning needed for my server ?

- by Low Kian Seong

I have a server which is running a few discrete Python, Java application which most of the time imports data into a PostGreSQL database. I would like to know from people out there who have experience tuning enterprise grade servers how do i go about calculating in a holistic way the amount of tuning needed for my server for example vm.swappiness, vm.overcommit_ratio and other numerical tunings needed for my server. I tried to enable sar on my server to capture daily numbers but these are more along the lines of total numbers and I can't figure out how to allocate memory for my applications. Help would be appreciated. Thanks.

Read the article
EPM 11.1.2 - In WebLogic Server, Enable Native IO Performance Pack

- by Ahmed Awan

Performance can be improved by enabling native IO in production mode. WebLogic Server benchmarks show major performance improvements when native performance packs are used on machines that host Oracle WebLogic Server instances. Important Note: Always enable native I/O, if available, and check for errors at startup to make sure it is being initialed properly. Tip: The use of NATIVE performance packs are enabled by default in the configuration shipped with your distribution. You can use the Administration Console to verify that performance packs are enabled by clicking on each managed server and click on Tuning tab.

Read the article
Tuning Default WorkManager - Advantages and Disadvantages

- by Murali Veligeti

Before discussing on Tuning Default WorkManager, lets have a brief introduction on What is Default WorkManger Before Weblogic Server 9.0 release, we had the concept of Execute Queues. WebLogic Server (before WLS 9.0), processing was performed in multiple execute queues. Different classes of work were executed in different queues, based on priority and ordering requirements, and to avoid deadlocks. In addition to the default execute queue, weblogic.kernel.default, there were pre-configured queues dedicated to internal administrative traffic, such as weblogic.admin.HTTP and weblogic.admin.RMI.Users could control thread usage by altering the number of threads in the default queue, or configure custom execute queues to ensure that particular applications had access to a fixed number of execute threads, regardless of overall system load. From WLS 9.0 release onwards WebLogic Server uses is a single thread pool (single thread pool which is called Default WorkManager), in which all types of work are executed. WebLogic Server prioritizes work based on rules you define, and run-time metrics, including the actual time it takes to execute a request and the rate at which requests are entering and leaving the pool.The common thread pool changes its size automatically to maximize throughput. The queue monitors throughput over time and based on history, determines whether to adjust the thread count. For example, if historical throughput statistics indicate that a higher thread count increased throughput, WebLogic increases the thread count. Similarly, if statistics indicate that fewer threads did not reduce throughput, WebLogic decreases the thread count. This new strategy makes it easier for administrators to allocate processing resources and manage performance, avoiding the effort and complexity involved in configuring, monitoring, and tuning custom executes queues. The Default WorkManager is used to handle thread management and perform self-tuning.This Work Manager is used by an application when no other Work Managers are specified in the application’s deployment descriptors. In many situations, the default Work Manager may be sufficient for most application requirements. WebLogic Server’s thread-handling algorithms assign each application its own fair share by default. Applications are given equal priority for threads and are prevented from monopolizing them. The default work-manager, as its name tells, is the work-manager defined by default.Thus, all applications deployed on WLS will use it. But sometimes, when your application is already in production, it's obvious you can't take your EAR / WAR, update the deployment descriptor(s) and redeploy it.The default work-manager belongs to a thread-pool, as initial thread-pool comes with only five threads, that's not much. If your application has to face a large number of hits, you may want to start with more than that.Well, that's quite easy. You have two option to do so.1) Modify the config.xmlJust add the following line(s) in your server definition : <server> <name>AdminServer</name> <self-tuning-thread-pool-size-min>100</self-tuning-thread-pool-size-min> <self-tuning-thread-pool-size-max>200</self-tuning-thread-pool-size-max> [...] </server> 2) Adding some JVM parameters Add the following system property in setDomainEnv.sh/setDomainEnv.cmd or startWebLogic.sh/startWebLogic.cmd : -Dweblogic.threadpool.MinPoolSize=100 -Dweblogic.threadpool.MaxPoolSize=100 Reboot WLS and see the option has been taken into account . Disadvantage: So far its fine. But here there is an disadvantage in tuning Default WorkManager. Internally Weblogic Server has many work managers configured for different types of work. if we run out of threads in the self-tuning pool(because of system property -Dweblogic.threadpool.MaxPoolSize) due to being undersized, then important work that WLS might need to do could be starved. So, while limiting the self-tuning would limit the default WorkManager and internally it also limits all other internal WorkManagers which WLS uses.So the best alternative is to override the default WorkManager that means creating a WorkManager for the Application and assign the WorkManager for the application instead of tuning the Default WorkManager.

Read the article
Performance analytics via DBMS "plugins", or other solution

- by Polynomial

I'm working on a systems monitoring product that currently focuses on performance at the system level. We're expanding out to monitoring database systems. Right now we can fetch simple performance information from a selection of DBMS, like connection count, disk IO rates, lock wait times, etc. However, we'd really like a way to measure the execution time of every query going into a DBMS, without requiring the client to implement monitoring in their application code. Some potential solutions might be: Some sort of proxy that sits between client and server. SSL might be an issue here, plus it requires us to reverse engineer and implement the network protocol for each DBMS. Plugin for each DBMS system that automatically records performance information when a query comes in. Other problems include "anonymising" the SQL, i.e. taking something like SELECT * FROM products WHERE price > 20 AND name LIKE "%disk%" and producing SELECT * FROM products WHERE price > ? AND name LIKE "%?%", though this shouldn't be too difficult with some clever parsing and regex. We're mainly focusing on: MySQL MSSQL Oracle Redis mongodb memcached Are there any plugin-style mechanisms we can utilise for any of these? Or is there a simpler solution?

Read the article
SQL SERVER – Database Dynamic Caching by Automatic SQL Server Performance Acceleration

- by pinaldave

My second look at SafePeak’s new version (2.1) revealed to me few additional interesting features. For those of you who hadn’t read my previous reviews SafePeak and not familiar with it, here is a quick brief: SafePeak is in business of accelerating performance of SQL Server applications, as well as their scalability, without making code changes to the applications or to the databases. SafePeak performs database dynamic caching, by caching in memory result sets of queries and stored procedures while keeping all those cache correct and up to date. Cached queries are retrieved from the SafePeak RAM in microsecond speed and not send to the SQL Server. The application gets much faster results (100-500 micro seconds), the load on the SQL Server is reduced (less CPU and IO) and the application or the infrastructure gets better scalability. SafePeak solution is hosted either within your cloud servers, hosted servers or your enterprise servers, as part of the application architecture. Connection of the application is done via change of connection strings or adding reroute line in the c:\windows\system32\drivers\etc\hosts file on all application servers. For those who would like to learn more on SafePeak architecture and how it works, I suggest to read this vendor’s webpage: SafePeak Architecture. More interesting new features in SafePeak 2.1 In my previous review of SafePeak new I covered the first 4 things I noticed in the new SafePeak (check out my article “SQLAuthority News – SafePeak Releases a Major Update: SafePeak version 2.1 for SQL Server Performance Acceleration”): Cache setup and fine-tuning – a critical part for getting good caching results Database templates Choosing which database to cache Monitoring and analysis options by SafePeak Since then I had a chance to play with SafePeak some more and here is what I found. 5. Analysis of SQL Performance (present and history): In SafePeak v.2.1 the tools for understanding of performance became more comprehensive. Every 15 minutes SafePeak creates and updates various performance statistics. Each query (or a procedure execute) that arrives to SafePeak gets a SQL pattern, and after it is used again there are statistics for such pattern. An important part of this product is that it understands the dependencies of every pattern (list of tables, views, user defined functions and procs). From this understanding SafePeak creates important analysis information on performance of every object: response time from the database, response time from SafePeak cache, average response time, percent of traffic and break down of behavior. One of the interesting things this behavior column shows is how often the object is actually pdated. The break down analysis allows knowing the above information for: queries and procedures, tables, views, databases and even instances level. The data is show now on all arriving queries, both read queries (that can be cached), but also any types of updates like DMLs, DDLs, DCLs, and even session settings queries. The stats are being updated every 15 minutes and SafePeak dashboard allows going back in time and investigating what happened within any time frame. 6. Logon trigger, for making sure nothing corrupts SafePeak cache data If you have an application with many parts, many servers many possible locations that can actually update the database, or the SQL Server is accessible to many DBAs or software engineers, each can access some database directly and do some changes without going thru SafePeak – this can create a potential corruption of the data stored in SafePeak cache. To make sure SafePeak cache is correct it needs to get all updates to arrive to SafePeak, and if a DBA will access the database directly and do some changes, for example, then SafePeak will simply not know about it and will not clean SafePeak cache. In the new version, SafePeak brought a new feature called “Logon Trigger” to solve the above challenge. By special click of a button SafePeak can deploy a special server logon trigger (with a CLR object) on your SQL Server that actually monitors all connections and informs SafePeak on any connection that is coming not from SafePeak. In SafePeak dashboard there is an interface that allows to control which logins can be ignored based on login names and IPs, while the rest will invoke cache cleanup of SafePeak and actually locks SafePeak cache until this connection will not be closed. Important to note, that this does not interrupt any logins, only informs SafePeak on such connection. On the Dashboard screen in SafePeak you will be able to see those connections and then decide what to do with them. Configuration of this feature in SafePeak dashboard can be done here: Settings -> SQL instances management -> click on instance -> Logon Trigger tab. Other features: 7. User management ability to grant permissions to someone without changing its configuration and only use SafePeak as performance analysis tool. 8. Better reports for analysis of performance using 15 minute resolution charts. 9. Caching of client cursors 10. Support for IPv6 Summary SafePeak is a great SQL Server performance acceleration solution for users who want immediate results for sites with performance, scalability and peak spikes challenges. Especially if your apps are packaged or 3rd party, since no code changes are done. SafePeak can significantly increase response times, by reducing network roundtrip to the database, decreasing CPU resource usage, eliminating I/O and storage access. SafePeak team provides a free fully functional trial www.safepeak.com/download and actually provides a one-on-one assistance during such trial. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, Pinal Dave, PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL, Technology

Read the article
Windows performance monitor new instances

- by fborozan

Hi all, I am trying to configure performance monitor on 2003/2008R1&R2 to capture new instances of the counters without any luck. For example if I select counter Process\%Processor time (to monitor processor time per any instances of the process) everything works fine until I open or close any application. If in the meanwhile new application is open it will not be included in the monitoring processor, and old application instance will display zero for % processor time. The problem is performance monitor is not refreshing instances of the new applications/users/new terminal session/ or any other metrics that changes instances in the meanwhile. The solution is to stop/start log file, but I don't want to do that every sec and the logging will be split into two files. Anybody knows how do I accomplish to add all new instances? Any help greatly appreciated

Read the article
Postfix performance

- by Brian G

Running postfix on ubuntu, sending alot of mail ( ~ 1 million messages ) per day. loads are extremly high but not much in terms of cpu and memory load. Anyone in a similiar situation and know how to remove the bottleneck? All mail on this server is outbound. I would have to assume the bottleneck is disk. Just an update, here is what iostat looks like: avg-cpu: %user %nice %system %iowait %steal %idle 0.00 0.00 0.12 99.88 0.00 0.00 Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sda 0.00 12.38 0.00 2.48 0.00 118.81 48.00 0.00 0.00 0.00 0.00 sdb 1.49 22.28 72.28 42.57 629.70 1041.58 14.55 135.56 834.31 8.71 100.00 Are these numbers in line with the performance you would expect from a single disk? sdb is dedicated to postfix. I think it is queue shuffling, from incoming-active-deferred More details from questions: Server: Quad core Xeon(R) CPU E5405 @ 2.00GH with 4 GB ram Load average: 464.88, 489.11, 483.91, 4 cores. but the memory utilization and cpu is minimal Postfix instances between 16 - 32

Read the article
Increase application performance on Amazon AWS

- by Honus Wagner

I've got a client with an MVC v1 (.NET) application running on a micro instance. On this instance, I've got .NET, IIS 7.5, and MS SQL Server 2008 running to handle the application. The client has reported that it is taking nearly 10 seconds to process each request. Even loading the initial login page takes about that long, then logging in takes that long, etc etc. The currently running instance specs are as follows: 615 MB RAM Intel Xenon CPU E5430 @ 2.66GHz 2.78 GHz 64-Bit Is the memory availability the issue? or is it the processing power? I forsee two options: Change to a larget instance Set up a 2-tier architecture with two micro instances Which of these will give the application better performance? Thanks in advance.

Read the article
Very poor read performance compared to write performance on md(raid1) / crypt(luks) / lvm

- by Android5360

I'm experiencing very poor read performance over raid1/crypt/lvm. In the same time, write speeds are about 2x+ faster on the same setup. On another raid1 setup on the same machine I get normal read speeds (maybe because I'm not using cryptsetup). OS related disks: sda + sdb. I have raid1 configuration with two disks, both are in place. I'm using LVM over the RAID. No encryption. Both disks are WD Green, 5400 rpm. IO test results on this raid1: dd if=/dev/zero of=/tmp/output.img3 bs=8k count=256k conv=fsync - 2147483648 bytes (2.1 GB) copied, 22.3392 s, 96.1 MB/s sync echo 3 > /proc/sys/vm/drop_caches dd if=/tmp/output.img3 of=/dev/null bs=8k - 2147483648 bytes (2.1 GB) copied, 15.9 s, 135 MB/s And here is the problematic setup (on the same machine). Currently I have only one sdc (WD Green, 5400rpm) configured in software raid1 + crypt (luks, serpent-xts-plain) + lvm. Tomorrow I will attach another disk (sdd) to complete this two-disk raid1 setup. IO tests results on this raid1: dd if=/dev/zero of=output.img3 bs=8k count=256k conv=fsync 2147483648 bytes (2.1 GB) copied, 17.7235 s, 121 MB/s sync echo 3 > /proc/sys/vm/drop_caches dd if=output.img3 of=/dev/null bs=8k 2147483648 bytes (2.1 GB) copied, 36.2454 s, 59.2 MB/s We can see that the read performance is very very bad (59MB/s compared to 135MB/s when using no encryption). Nothing is using the disks during benchmark. I can confirm this because I checked with iostat and dstat. Details on the hardware: disks: all are WD green, 5400rpm, 64mb cache. cpu: FX-8350 at stock speed ram: 4x4GB at 1066Mhz. Details on the software: OS: Debian Wheezy 7, amd64 mdadm: v3.2.5 - 18th May 2012 LVM version: 2.02.95(2) (2012-03-06) LVM Library version: 1.02.74 (2012-03-06) LVM Driver version: 4.22.0 cryptsetup: 1.4.3 Here is how I configured the slow raid1+crypt+lvm setup: parted /dev/sdc mklabel gpt type: ext4 start: 2048s end: -1 Now the raid, crypt and the lvm configuration: mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/sdc cryptsetup --cipher serpent-xts-plain luksFormat /dev/md1 cryptsetup luksOpen /dev/md1 md1_crypt vgcreate vg_sql /dev/mapper/md1_crypt lvcreate -l 100%VG vg_sql -n lv_sql mkfs.ext4 /dev/mapper/vg_sql-lv-sql mount /dev/mapper/vg_sql-lv_sql /sql So guys, can you help me identify the reason and fix it? It has to be something with the cryptsetup as there is no such read slowdown on the other setup (sda+sdb) where no encryption is present. But I have no idea what to do. Thanks!

Read the article
Recommendation using Client side performance monitoring (boomerang/jiffy/episodes)

- by Yasei No Umi

There are a few Client-side JavaScript libraries that check web-site performance on the client side: Jiffy (http://code.google.com/p/jiffy-web/) Episodes (http://stevesouders.com/episodes/) by Steve Sounders Boomerang (http://yahoo.github.com/boomerang/doc/) by Yahoo! Have you used any of them or a similar too? What did you use for the server-side? for reporting? Is this a recommended approach? If not, how should I monitor my web-site performance from the end-user's view?

Read the article
SQL SERVER – Iridium I/O – SQL Server Deduplication that Shrinks Databases and Improves Performance

- by Pinal Dave

Database performance is a common problem for SQL Server DBA’s. It seems like we spend more time on performance than just about anything else. In many cases, we use scripts or tools that point out performance bottlenecks but we don’t have any way to fix them. For example, what do you do when you need to speed up a query that is already tuned as well as possible? Or what do you do when you aren’t allowed to make changes for a database supporting a purchased application? Iridium I/O for SQL Server was originally built at Confio software (makers of Ignite) because DBA’s kept asking for a way to actually fix performance instead of just pointing out performance problems. The technology is certified by Microsoft and was so promising that it was spun out into a separate company that is now run by the Confio Founder/CEO and technology management team. Iridium uses deduplication technology to both shrink the databases as well as boost IO performance. It is intriguing to see it work. It will deduplicate a live database as it is running transactions. You can watch the database get smaller while user queries are running. Iridium is a simple tool to use. After installing the software, you click an “Analyze” button which will spend a minute or two on each database and estimate both your storage and performance savings. Next, you click an “Activate” button to turn on Iridium I/O for your selected databases. You don’t need to reboot the operating system or restart the database during any part of the process. As part of my test, I also wanted to see if there would be an impact on my databases when Iridium was removed. The ‘revert’ process (bringing the files back to their SQL Server native format) was executed by a simple click of a button, and completed while the databases were available for normal processing. I was impressed and enjoyed playing with the software and encourage all of you to try it out. Here is the link to the website to download Iridium for free. . Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Pain of the Week/Expert's Perspective: Performance Tuning for Backups and Restores

- by KKline

First off - the Pain of the Week webcast series has been renamed. It's now known as The Expert's Perspective . Please join us for future webcasts and, if you're interested in speaking, drop me a note to see if we can get you on the roster! The bigger your databases get, the longer backups take. That doesn't really seem like a huge problem — until disaster strikes and you need to restore your databases as fast as possible. Join my buddy Brent Ozar ( blog | twitter ), a Microsoft Certified Master of...(read more)

Read the article
Linux - real-world hardware RAID controller tuning (scsi and cciss)

- by ewwhite

Most of the Linux systems I manage feature hardware RAID controllers (mostly HP Smart Array). They're all running RHEL or CentOS. I'm looking for real-world tunables to help optimize performance for setups that incorporate hardware RAID controllers with SAS disks (Smart Array, Perc, LSI, etc.) and battery-backed or flash-backed cache. Assume RAID 1+0 and multiple spindles (4+ disks). I spend a considerable amount of time tuning Linux network settings for low-latency and financial trading applications. But many of those options are well-documented (changing send/receive buffers, modifying TCP window settings, etc.). What are engineers doing on the storage side? Historically, I've made changes to the I/O scheduling elevator, recently opting for the deadline and noop schedulers to improve performance within my applications. As RHEL versions have progressed, I've also noticed that the compiled-in defaults for SCSI and CCISS block devices have changed as well. This has had an impact on the recommended storage subsystem settings over time. However, it's been awhile since I've seen any clear recommendations. And I know that the OS defaults aren't optimal. For example, it seems that the default read-ahead buffer of 128kb is extremely small for a deployment on server-class hardware. The following articles explore the performance impact of changing read-ahead cache and nr_requests values on the block queues. http://zackreed.me/articles/54-hp-smart-array-p410-controller-tuning http://www.overclock.net/t/515068/tuning-a-hp-smart-array-p400-with-linux-why-tuning-really-matters http://yoshinorimatsunobu.blogspot.com/2009/04/linux-io-scheduler-queue-size-and.html For example, these are suggested changes for an HP Smart Array RAID controller: echo "noop" > /sys/block/cciss\!c0d0/queue/scheduler blockdev --setra 65536 /dev/cciss/c0d0 echo 512 > /sys/block/cciss\!c0d0/queue/nr_requests echo 2048 > /sys/block/cciss\!c0d0/queue/read_ahead_kb What else can be reliably tuned to improve storage performance? I'm specifically looking for sysctl and sysfs options in production scenarios.

Read the article
Server Performance

- by sb12

I know very little about performance tuning of servers etc... so i thought i'd put this up here as i start some research on it, just to get some direction. I am in the process of migrating from my old server to a new one - both are 64 bit machines. One is a few years old, the other brand new (PowerEdge R410). The old server spec is: 2 cpus, 3.4GHz Pentiums, 8G of RAM, Fedora 11 currently installed The new server spec is: 16 cpus, 3.2 GHz Xeon, 16G of RAM, CentOS 6.2 installed. Also RAID10 is on the new server - no RAID on the old one. Both servers currently have the same database (MySQL) with the same data migrated. I wrote a Perl script that simply steps through each row of a table in the database (about 18000 rows) and updates a value in that row. Every row in the table is updated. Out of curiosity i ran this perl script on both machines, just to see how the new server would perform vs. the old one, and it produced interesting results: The old server was twice as fast as the new one to complete. Looking at the database, both are configured exactly the same (the new one being a dump of the old one...)... Anyone any ideas why this would be given the hardware gap between both? As i said i'm about to start some digging, but thought i'd put this up here to maybe get some good direction.... Many thanks in advance..

Read the article
How can I tell which page is creating a high-CPU-load httpd process?

- by Greg

I have a LAMP server (CentOS-based MediaTemple (DV) Extreme with 2GB RAM) running a customized Wordpress+bbPress combination . At about 30k pageviews per day the server is starting to groan. It stumbled earlier today for about 5 minutes when there was an influx of traffic. Even under normal conditions I can see that the virtual server is sometimes at 90%+ CPU load. Using Top I can often see 5-7 httpd processes that are each using 15-30% (and sometimes even 50%) CPU. Before we do a big optimization pass (our use of MySQL is probably the culprit) I would love to find the pages that are the main offenders and deal with them first. Is there a way that I can find out which specific requests were responsible for the most CPU-hungry httpd processes? I have found a lot of info on optimization in general, but nothing on this specific question. Secondly, I know there are a million variables, but if you have any insight on whether we should be at the boundaries of performance with a single dedicated virtual server with a site of this size, then I would love to hear your opinion. Should we be thinking about moving to a more powerful server, or should we be focused on optimization on the current server?

Read the article
How to collect the performance data of a server during an unreachable/down period using Nagios?

- by gsc-frank

Some time services and host stop responding due to a poor server performance. I mean, if for some reason (could be lot of concurrency services access, a expensive backup execution on the server or whatever that consume tons of server resources) a server performance is very degraded, that could lead that the server isn't capable to establish any "normal network communication" (without trigger whatever standards timeouts defined for such communication). Knowing host's performance data (cpu, memory, ...) in case of available during that period (host is not down and despite of its performance degradation still allow plugins collect performance data) could be very useful for sysadmin to try to determine what cause the problem, or at least, if the host performance was good and don't interfered at all in the host/service down. This problem could be solved using remote active (NRPE) or remote passive (NSCA) if such remote solutions could store (buffered) perf data to be send to central Nagios server when host performance or network outage allow it. I read the doc of both solutions and can't find any reference to such buffer mechanism neither what happened in case that NSCA can't reach Nagios server. Any idea of how solve this lack of info? so useful for forensic analysis. EDIT: My questions isn about which tools I can use to debug perf problems or gather perf data to analysis, but is about how collect (using Nagios) host perf data even during a network outage for its posterior analysis (kind of forensic analysis). The idea is integrate such data to Nagios graphers like pnp4nagios and NagiosGrapther. I know that I could install tools like Cacti in each of my host, and have a kind of performance data collection redundancy, but I really want avoid that and try to solve all perf analysis requirements with one tools: Nagios

Read the article
SQL Server 2005 standard filegroups / files for performance on SAN

- by Blootac

I submitted this to stack overflow (here) but realised it should really be on serverfault. so apologies for the incorrect and duplicate posting: Ok so I've just been on a SQL Server course and we discussed the usage scenarios of multiple filegroups and files when in use over local RAID and local disks but we didn't touch SAN scenarios so my question is as follows; I currently have a 250 gig database running on SQL Server 2005 where some tables have a huge number of writes and others are fairly static. The database and all objects reside in a single file group with a single data file. The log file is also on the same volume. My interpretation is that separate data files should be used across different disks to lessen disk contention and that file groups should be used for partitioning of data. However, with a SAN you obviously don't really have the same issue of disk contention that you do with a small RAID setup (or at least we don't at the moment), and standard edition doesn't support partitioning. So in order to improve parallelism what should I do? My understanding of various Microsoft publications is that if I increase the number of data files, separate threads can act across each file separately. Which leads me to the question how many files should I have. One per core? Should I be putting tables and indexes with high levels of activity in separate file groups, each with the same number of data files as we have cores? Thank you

Read the article
KVM Slow performance on XP Guest

- by Gregg Leventhal

The system is very slow to do anything, even browse a local folder, and CPU sits at 100% frequently. Guest is XP 32 bit. Host is Scientific Linux 6.2, Libvirt 0.10, Guest XP OS shows ACPI Multiprocessor HAL and a virtIO driver for NIC and SCSI. Installed. CPUInfo on host: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 42 model name : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz stepping : 7 cpu MHz : 3200.000 cache size : 8192 KB physical id : 0 siblings : 8 core id : 0 cpu cores : 4 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid bogomips : 6784.93 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: <memory unit='KiB'>4194304</memory> <currentMemory unit='KiB'>4194304</currentMemory> <vcpu placement='static' cpuset='0'>1</vcpu> <os> <type arch='x86_64' machine='rhel6.3.0'>hvm</type> <boot dev='hd'/> </os> <features> <acpi/> <apic/> <pae/> </features> <cpu mode='custom' match='exact'> <model fallback='allow'>SandyBridge</model> <vendor>Intel</vendor> <feature policy='require' name='vme'/> <feature policy='require' name='tm2'/> <feature policy='require' name='est'/> <feature policy='require' name='vmx'/> <feature policy='require' name='osxsave'/> <feature policy='require' name='smx'/> <feature policy='require' name='ss'/> <feature policy='require' name='ds'/> <feature policy='require' name='tsc-deadline'/> <feature policy='require' name='dtes64'/> <feature policy='require' name='ht'/> <feature policy='require' name='pbe'/> <feature policy='require' name='tm'/> <feature policy='require' name='pdcm'/> <feature policy='require' name='ds_cpl'/> <feature policy='require' name='xtpr'/> <feature policy='require' name='acpi'/> <feature policy='require' name='monitor'/> <feature policy='force' name='sse'/> <feature policy='force' name='sse2'/> <feature policy='force' name='sse4.1'/> <feature policy='force' name='sse4.2'/> <feature policy='force' name='ssse3'/> <feature policy='force' name='x2apic'/> </cpu> <clock offset='localtime'> <timer name='rtc' tickpolicy='catchup'/> </clock> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>restart</on_crash> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2' cache='none'/> <source file='/var/lib/libvirt/images/Server-10-9-13.qcow2'/> <target dev='vda' bus='virtio'/> <alias name='virtio-disk0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/> </disk>

Read the article
Application Performance: The Best of the Web

- by Michaela Murray

Wisdom A deep understanding and realization […] resulting in the ability to apply perceptions, judgements and actions. It is also the comprehension of what is true coupled with optimum judgment as to action. - Wikipedia We’re writing a book for ASP.NET developers, and we want you to be a part of it. We know that there’s a huge amount of web developer wisdom that never gets shared, and we want to find those golden nuggets of knowledge and experience, and make sure everyone can learn from them. Right now, we want to find out about your top tips, hard-won lessons, and sage advice for avoiding, finding, and fixing application performance problems. If you work with .NET and SQL, even better – a lot of application performance relies on the interaction with the database, so we want to hear from you! “How Do You Want Me To Be Involved?” Right! Details! We want you, our most excellent readers, to email us with the Best Advice you would give to other developers for getting the best performance out of their applications. It doesn’t matter if your advice is for newbies or veterans, .NET or SQL – so long as it’s about application performance, we want to hear from you. (And if you think that there’s developer wisdom out there that “everyone knows”, a) I’m willing to bet you could find someone who doesn’t know about it, and b) it probably bears repeating anyway!) “I’m Interested. What Can You Do For Me?” Excellent question. For starters, there’s a chance to win a Microsoft Surface (the tablet, not the table-top). Once all the ASP.NET Wisdom has been collected, tallied, and labelled, it will then be weighed and measured by a team of expert judges (whose identities are still a closely-guarded secret). The top tip in both SQL & .NET categories will each win their author their very own MS Surface. But that’s not all! We can also give you… immortality! More details? Ok. We’ll be collecting all of the tips sent in by our readers (and we can’t wait to learn from you all,) and with the help of our Simple-Talk editors, we will publish and distribute your combined and documented knowledge as a free, community-created, professionally typeset eBook. You will naturally be credited by name / pseudonym / twitter handle / GitHub username / StackOverflow profile / Whatever, as the clearly ingenious author of hot performance tips. The Not-Very-Fine Print Here’s the breakdown: We want to bring together the best application performance knowledge from ASP.NET developers. Closing date for submissions will be 9am GMT, December 4th. Submissions should be made by email – [email protected] Submissions will be judged by a panel of expert judges (who will be revealed soon). The top submission in both the SQL & .NET categories will each win a Microsoft Surface. ALL the tips which make it through the judging process will be polished by Simple-Talk editors, and turned into a professionally typeset eBook, which will be freely available, and promoted alongside the ANTS Performance Profiler tool. Anyone whose entry makes it into the book will be clearly and profusely credited in the method of their choice (or can remain anonymous.) The really REALLY short version Share what you know about ASP.NET application performance for a chance to win a Microsoft Surface, and then get your name credited in a slick eBook with top-notch production values. For more details, see above. We can’t wait to learn from you!

Read the article
Analysing and measuring the performance of a .NET application (survey results)

- by Laila

Back in December last year, I asked myself: could it be that .NET developers think that you need three days and a PhD to do performance profiling on their code? What if developers are shunning profilers because they perceive them as too complex to use? If so, then what method do they use to measure and analyse the performance of their .NET applications? Do they even care about performance? So, a few weeks ago, I decided to get a 1-minute survey up and running in the hopes that some good, hard data would clear the matter up once and for all. I posted the survey on Simple Talk and got help from a few people to promote it. The survey consisted of 3 simple questions: Amazingly, 533 developers took the time to respond - which means I had enough data to get representative results! So before I go any further, I would like to thank all of you who contributed, because I now have some pretty good answers to the troubling questions I was asking myself. To thank you properly, I thought I would share some of the results with you. First of all, application performance is indeed important to most of you. In fact, performance is an intrinsic part of the development cycle for a good 40% of you, which is much higher than I had anticipated, I have to admit. (I know, "Have a little faith Laila!") When asked what tool you use to measure and analyse application performance, I found that nearly half of the respondents use logging statements, a third use performance counters, and 70% of respondents use a profiler of some sort (a 3rd party performance profilers, the CLR profiler or the Visual Studio profiler). The importance attributed to logging statements did surprise me a little. I am still not sure why somebody would go to the trouble of manually instrumenting code in order to measure its performance, instead of just using a profiler. I personally find the process of annotating code, calculating times from log files, and relating it all back to your source terrifyingly laborious. Not to mention that you then need to remember to turn it all off later! Even when you have logging in place throughout all your code anyway, you still have a fair amount of potentially error-prone calculation to sift through the results; in addition, you'll only get method-level rather than line-level timings, and you won't get timings from any framework or library methods you don't have source for. To top it all, we all know that bottlenecks are rarely where you would expect them to be, so you could be wasting time looking for a performance problem in the wrong place. On the other hand, profilers do all the work for you: they automatically collect the CPU and wall-clock timings, and present the results from method timing all the way down to individual lines of code. Maybe I'm missing a trick. I would love to know about the types of scenarios where you actively prefer to use logging statements. Finally, while a third of the respondents didn't have a strong opinion about code performance profilers, those who had an opinion thought that they were mainly complex to use and time consuming. Three respondents in particular summarised this perfectly: "sometimes, they are rather complex to use, adding an additional time-sink to the process of trying to resolve the existing problem". "they are simple to use, but the results are hard to understand" "Complex to find the more advanced things, easy to find some low hanging fruit". These results confirmed my suspicions: Profilers are seen to be designed for more advanced users who can use them effectively and make sense of the results. I found yet more interesting information when I started comparing samples of "developers for whom performance is an important part of the dev cycle", with those "to whom performance is only looked at in times of crisis", and "developers to whom performance is not important, as long as the app works". See the three graphs below. Sample of developers to whom performance is an important part of the dev cycle: Sample of developers to whom performance is important only in times of crisis: Sample of developers to whom performance is not important, as long as the app works: As you can see, there is a strong correlation between the usage of a profiler and the importance attributed to performance: indeed, the more important performance is to a development team, the more likely they are to use a profiler. In addition, developers to whom performance is an important part of the dev cycle have a higher tendency to use a much wider range of methods for performance measurement and analysis. And, unsurprisingly, the less important performance is, the less varied the methods of measurement are. So all in all, to come back to my random questions: .NET developers do care about performance. Those who care the most use a wider range of performance measurement methods than those who care less. But overall, logging statements, performance counters and third party performance profilers are the performance measurement methods of choice for most developers. Finally, although most of you find code profilers complex to use, those of you who care the most about performance tend to use profilers more than those of you to whom performance is not so important.

Read the article
C# Performance Pitfall – Interop Scenarios Change the Rules

- by Reed

C# and .NET, overall, really do have fantastic performance in my opinion. That being said, the performance characteristics dramatically differ from native programming, and take some relearning if you’re used to doing performance optimization in most other languages, especially C, C++, and similar. However, there are times when revisiting tricks learned in native code play a critical role in performance optimization in C#. I recently ran across a nasty scenario that illustrated to me how dangerous following any fixed rules for optimization can be… The rules in C# when optimizing code are very different than C or C++. Often, they’re exactly backwards. For example, in C and C++, lifting a variable out of loops in order to avoid memory allocations often can have huge advantages. If some function within a call graph is allocating memory dynamically, and that gets called in a loop, it can dramatically slow down a routine. This can be a tricky bottleneck to track down, even with a profiler. Looking at the memory allocation graph is usually the key for spotting this routine, as it’s often “hidden” deep in call graph. For example, while optimizing some of my scientific routines, I ran into a situation where I had a loop similar to: for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i]); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This loop was at a fairly high level in the call graph, and often could take many hours to complete, depending on the input data. As such, any performance optimization we could achieve would be greatly appreciated by our users. After a fair bit of profiling, I noticed that a couple of function calls down the call graph (inside of ProcessElement), there was some code that effectively was doing: // Allocate some data required DataStructure* data = new DataStructure(num); // Call into a subroutine that passed around and manipulated this data highly CallSubroutine(data); // Read and use some values from here double values = data->Foo; // Cleanup delete data; // ... return bar; Normally, if “DataStructure” was a simple data type, I could just allocate it on the stack. However, it’s constructor, internally, allocated it’s own memory using new, so this wouldn’t eliminate the problem. In this case, however, I could change the call signatures to allow the pointer to the data structure to be passed into ProcessElement and through the call graph, allowing the inner routine to reuse the same “data” memory instead of allocating. At the highest level, my code effectively changed to something like: DataStructure* data = new DataStructure(numberToProcess); for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i], data); } delete data; Granted, this dramatically reduced the maintainability of the code, so it wasn’t something I wanted to do unless there was a significant benefit. In this case, after profiling the new version, I found that it increased the overall performance dramatically – my main test case went from 35 minutes runtime down to 21 minutes. This was such a significant improvement, I felt it was worth the reduction in maintainability. In C and C++, it’s generally a good idea (for performance) to: Reduce the number of memory allocations as much as possible, Use fewer, larger memory allocations instead of many smaller ones, and Allocate as high up the call stack as possible, and reuse memory I’ve seen many people try to make similar optimizations in C# code. For good or bad, this is typically not a good idea. The garbage collector in .NET completely changes the rules here. In C#, reallocating memory in a loop is not always a bad idea. In this scenario, for example, I may have been much better off leaving the original code alone. The reason for this is the garbage collector. The GC in .NET is incredibly effective, and leaving the allocation deep inside the call stack has some huge advantages. First and foremost, it tends to make the code more maintainable – passing around object references tends to couple the methods together more than necessary, and overall increase the complexity of the code. This is something that should be avoided unless there is a significant reason. Second, (unlike C and C++) memory allocation of a single object in C# is normally cheap and fast. Finally, and most critically, there is a large advantage to having short lived objects. If you lift a variable out of the loop and reuse the memory, its much more likely that object will get promoted to Gen1 (or worse, Gen2). This can cause expensive compaction operations to be required, and also lead to (at least temporary) memory fragmentation as well as more costly collections later. As such, I’ve found that it’s often (though not always) faster to leave memory allocations where you’d naturally place them – deep inside of the call graph, inside of the loops. This causes the objects to stay very short lived, which in turn increases the efficiency of the garbage collector, and can dramatically improve the overall performance of the routine as a whole. In C#, I tend to: Keep variable declarations in the tightest scope possible Declare and allocate objects at usage While this tends to cause some of the same goals (reducing unnecessary allocations, etc), the goal here is a bit different – it’s about keeping the objects rooted for as little time as possible in order to (attempt) to keep them completely in Gen0, or worst case, Gen1. It also has the huge advantage of keeping the code very maintainable – objects are used and “released” as soon as possible, which keeps the code very clean. It does, however, often have the side effect of causing more allocations to occur, but keeping the objects rooted for a much shorter time. Now – nowhere here am I suggesting that these rules are hard, fast rules that are always true. That being said, my time spent optimizing over the years encourages me to naturally write code that follows the above guidelines, then profile and adjust as necessary. In my current project, however, I ran across one of those nasty little pitfalls that’s something to keep in mind – interop changes the rules. In this case, I was dealing with an API that, internally, used some COM objects. In this case, these COM objects were leading to native allocations (most likely C++) occurring in a loop deep in my call graph. Even though I was writing nice, clean managed code, the normal managed code rules for performance no longer apply. After profiling to find the bottleneck in my code, I realized that my inner loop, a innocuous looking block of C# code, was effectively causing a set of native memory allocations in every iteration. This required going back to a “native programming” mindset for optimization. Lifting these variables and reusing them took a 1:10 routine down to 0:20 – again, a very worthwhile improvement. Overall, the lessons here are: Always profile if you suspect a performance problem – don’t assume any rule is correct, or any code is efficient just because it looks like it should be Remember to check memory allocations when profiling, not just CPU cycles Interop scenarios often cause managed code to act very differently than “normal” managed code. Native code can be hidden very cleverly inside of managed wrappers

Read the article
How to save a perfmon Performance Counter as a textfile (Reliability and Performance Monitor Version

- by Lieven Cardoen

Now the file gets saved as blg, but I would like a txt versin to import in Excel.

Read the article
How often is software speed evident in the eyes of customers?

- by rwong

In theory, customers should be able to feel the software performance improvements from first-hand experience. In practice, sometimes the improvements are not noticible enough, such that in order to monetize from the improvements, it is necessary to use quotable performance figures in marketing in order to attract customers. We already know the difference between perceived performance (GUI latency, etc) and server-side performance (machines, networks, infrastructure, etc). How often is it that programmers need to go the extra length to "write up" performance analyses for which the audience is not fellow programmers, but managers and customers?

Read the article
Strange C++ performance difference?

- by STingRaySC

I just stumbled upon a change that seems to have counterintuitive performance ramifications. Can anyone provide a possible explanation for this behavior? Original code: for (int i = 0; i < ct; ++i) { // do some stuff... int iFreq = getFreq(i); double dFreq = iFreq; if (iFreq != 0) { // do some stuff with iFreq... // do some calculations with dFreq... } } While cleaning up this code during a "performance pass," I decided to move the definition of dFreq inside the if block, as it was only used inside the if. There are several calculations involving dFreq so I didn't eliminate it entirely as it does save the cost of multiple run-time conversions from int to double. I expected no performance difference, or if any at all, a negligible improvement. However, the perfomance decreased by nearly 10%. I have measured this many times, and this is indeed the only change I've made. The code snippet shown above executes inside a couple other loops. I get very consistent timings across runs and can definitely confirm that the change I'm describing decreases performance by ~10%. I would expect performance to increase because the int to double conversion would only occur when iFreq != 0. Chnaged code: for (int i = 0; i < ct; ++i) { // do some stuff... int iFreq = getFreq(i); if (iFreq != 0) { // do some stuff with iFreq... double dFreq = iFreq; // do some stuff with dFreq... } } Can anyone explain this? I am using VC++ 9.0 with /O2. I just want to understand what I'm not accounting for here.

Read the article
mysql settings - using the available resources

- by Christian Payne

I've got a lot of processing work I need to run on a mysql server. I've installed mysql 5.1.45-community on a Win 2007 64bit. Its running on a xenon, 3ghz 6 processors with 8 gig ram. It doesn't seem to matter what queries I run (or the number I run at the same time), when I look in task manager, I'll see one processor is out at 100%. The other 5 are idol. Memory is static at 1.54 gig. When I installed mysql, I used the wizard and selected the default "server" (not workstation) option. I feel like I should be getting more bang for my buck. Is there something else I should be monitoring or something I should change to use the other system resources???

Read the article

Search Results

Search found 13249 results on 530 pages for 'performance tuning'.

Page 3/530 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Low Kian Seong

- by Ahmed Awan

- by Murali Veligeti

- by Polynomial

- by pinaldave

- by fborozan

- by Brian G

- by Honus Wagner

- by Android5360

- by Yasei No Umi

- by Pinal Dave

- by KKline

- by ewwhite

- by sb12

- by Greg

- by gsc-frank

- by Blootac

- by Gregg Leventhal

- by Michaela Murray

- by Laila

- by Reed

- by Lieven Cardoen

- by rwong

- by STingRaySC

- by Christian Payne

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >