Search Results

Search found 13151 results on 527 pages for 'performance counters'.

Page 2/527 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Linux Server Performance Monitoring

- by Jon

I'm looking to monitor performance on my Linux servers (which happen to be Centos). What are the best tools for monitoring things in realtime such as: Disk Performance I/O, swapping etc.. CPU Performance Looking for low level tools, rather than web based tools such as Nagios, Ganglia etc... n.b. I'd like to know exactly what each tool does rather than just having a list of random toolnames if possible please. Why the tool is a better option than others would be good also.

Read the article
The meaning of thermal throttle counters and package power limit notifications in Linux

- by Trustin Lee

Whenever I do some performance testing on my Linux-installed MacBook Pro, I often see the following messages in dmesg: Aug 8 09:29:31 infinity kernel: [79791.789404] CPU1: Package power limit notification (total events = 40365) Aug 8 09:29:31 infinity kernel: [79791.789408] CPU3: Package power limit notification (total events = 40367) Aug 8 09:29:31 infinity kernel: [79791.789411] CPU2: Package power limit notification (total events = 40453) Aug 8 09:29:31 infinity kernel: [79791.789414] CPU0: Package power limit notification (total events = 40453) I also see the throttle counters in the sysfs increases over time: trustin@infinity:/sys/devices/system/cpu/cpu0/thermal_throttle $ ls core_power_limit_count package_power_limit_count core_throttle_count package_throttle_count $ cat core_power_limit_count 0 $ cat core_throttle_count 41912 $ cat package_power_limit_count 67945 $ cat package_throttle_count 67565 What do these counters mean? Do they affect the performance of CPU or system? Do they result in increased deviation of performance numbers? (i.e. Do they prevent me from getting reliable performance numbers?) If so, how do I avoid these messages and increasing counters? Would running the performance tests on a well-cooled desktop system help?

Read the article
Automating XNA Performance Testing?

- by Grofit

I was wondering what peoples approaches or thoughts were on automating performance testing in XNA. Currently I am looking at only working in 2d, but that poses many areas where performance can be improved with different implementations. An example would be if you had 2 different implementations of spatial partitioning, one may be faster than another but without doing some actual performance testing you wouldn't be able to tell which one for sure (unless you saw the code was blatantly slow in certain parts). You could write a unit test which for a given time frame kept adding/updating/removing entities for both implementations and see how many were made in each timeframe and the higher one would be the faster one (in this given example). Another higher level example would be if you wanted to see how many entities you can have on the screen roughly without going beneath 60fps. The problem with this is to automate it you would need to use the hidden form trick or some other thing to kick off a mock game and purely test which parts you care about and disable everything else. I know that this isnt a simple affair really as even if you can automate the tests, really it is up to a human to interpret if the results are performant enough, but as part of a build step you could have it run these tests and publish the results somewhere for comparison. This way if you go from version 1.1 to 1.2 but have changed a few underlying algorithms you may notice that generally the performance score would have gone up, meaning you have improved your overall performance of the application, and then from 1.2 to 1.3 you may notice that you have then dropped overall performance a bit. So has anyone automated this sort of thing in their projects, and if so how do you measure your performance comparisons at a high level and what frameworks do you use to test? As providing you have written your code so its testable/mockable for most parts you can just use your tests as a mechanism for getting some performance results... === Edit === Just for clarity, I am more interested in the best way to make use of automated tests within XNA to track your performance, not play testing or guessing by manually running your game on a machine. This is completely different to seeing if your game is playable on X hardware, it is more about tracking the change in performance as your game engine/framework changes. As mentioned in one of the comments you could easily test "how many nodes can I insert/remove/update within QuadTreeA within 2 seconds", but you have to physically look at these results every time to see if it has changed, which may be fine and is still better than just relying on playing it to see if you notice any difference between version. However if you were to put an Assert in to notify you of a fail if it goes lower than lets say 5000 in 2 seconds you have a brittle test as it is then contextual to the hardware, not just the implementation. Although that being said these sort of automated tests are only really any use if you are running your tests as some sort of build pipeline i.e: Checkout - Run Unit Tests - Run Integration Tests - Run Performance Tests - Package So then you can easily compare the stats from one build to another on the CI server as a report of some sort, and again this may not mean much to anyone if you are not used to Continuous Integration. The main crux of this question is to see how people manage this between builds and how they find it best to report upon. As I said it can be subjective but as knowledge will be gained from the answers it seems a worthwhile question.

Read the article
SQL SERVER – Out of the Box – Activty and Performance Reports from SSSMS

- by pinaldave

SQL Server management Studio 2008 is wonderful tool and has many different features. Many times, an average user does not use them as they are not aware about these features. Today, we will learn one such feature. SSMS comes with many inbuilt performance and activity reports, but we do not use it to the full potential. Let us see how we can access these standard reports. Connect to SQL Server Node >> Right Click on it >> Go to Reports >> Click on Standard Reports >> Pick Any Report. Click to Enlarge You can see there are many reports, which an average users needs right away, are available there. Let me list all the reports available. Server Dashboard Configuration Changes History Schema Changes History Scheduler Health Memory Consumption Activity – All Blocking Transactions Activity – All Cursors Activity – All Sessions Activity – Top Sessions Activity – Dormant Sessions Activity - Top Connections Top Transactions by Age Top Transactions by Blocked Transactions Count Top Transactions by Locks Count Performance – Batch Execution Statistics Performance – Object Execution Statistics Performance – Top Queries by Average CPU Time Performance – Top Queries by Average IO Performance – Top Queries by Total CPU Time Performance – Top Queries by Total IO Service Broker Statistics Transactions Log Shipping Status In fact, when you look at the above list, it is fairly clear that they are very thought out and commonly needed reports that are available in SQL Server 2008. Let us run a couple of reports and observe their result. Performance – Top Queries by Total CPU Time Click to Enlarge Memory Consumption Click to Enlarge There are options for custom reports as well, which we can configure. We will learn about them in some other post. Additionally, you can right click on the reports and export in Excel or PDF. I think this tool can really help those who are just looking for some quick details. Does any of you use this feature, or this feature has some limitations and You would like to see more features? Reference : Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Quick Look at SQL Server Configuration for Performance Indications

- by pinaldave

Earlier I wrote SQL SERVER – Beginning SQL Server: One Step at a Time – SQL Server Magazine. That was the first article on the series of my real world experience of Performance Tuning experience. I have written second part the same series over here. Read second part over here: Quick Look at SQL Server Configuration for Performance Indications. In this second part I talk about two types of my clients. 1) Those who want instant results 2) Those who want the right results It is really fun to work with both the clients. I talk about various configuration options which I look at when I try to give very early opinion about SQL Server Performance. There are various eight configurations, I give quick look and start talking about performance. Head over to original article over here: Quick Look at SQL Server Configuration for Performance Indications. Reference : Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL Server – Learning SQL Server Performance: Indexing Basics – Video

- by pinaldave

Today I remember one of my older cartoon years ago created for Indexing and Performance. Every single time when Performance is discussed, Indexes are mentioned along with it. In recent times, data and application complexity is continuously growing. The demand for faster query response, performance, and scalability by organizations is increasing and developers and DBAs need to now write efficient code to achieve this. DBA and Developers A DBA’s role is critical, because a production environment has to run 24×7, hence maintenance, trouble shooting, and quick resolutions are the need of the hour. The first baby step into any performance tuning exercise in SQL Server involves creating, analysing, and maintaining indexes. Though we have learnt indexing concepts from our college days, indexing implementation inside SQL Server can vary. Understanding this behaviour and designing our applications appropriately will make sure the application is performed to its highest potential. Video Learning Vinod Kumar and myself we often thought about this and realized that practical understanding of the indexes is very important. One can not master every single aspects of the index. However there are some minimum expertise one should gain if performance is one of the concern. We decided to build a course which just addresses the practical aspects of the performance. In this course, we explored some of these indexing fundamentals and we elaborated on how SQL Server goes about using indexes. At the end of this course of you will know the basic structure of indexes, practical insights into implementation, and maintenance tips and tricks revolving around indexes. Finally, we will introduce SQL Server 2012 column store indexes. We have refrained from discussing internal storage structure of the indexes but have taken a more practical, demo-oriented approach to explain these core concepts. Course Outline Here are salient topics of the course. We have explained every single concept along with a practical demonstration. Additionally shared our personal scripts along with the same. Introduction Fundamentals of Indexing Index Fundamentals Index Fundamentals – Visual Representation Practical Indexing Implementation Techniques Primary Key Over Indexing Duplicate Index Clustered Index Unique Index Included Columns Filtered Index Disabled Index Index Maintenance and Defragmentation Introduction to Columnstore Index Indexing Practical Performance Tips and Tricks Index and Page Types Index and Non Deterministic Columns Index and SET Values Importance of Clustered Index Effect of Compression and Fillfactor Index and Functions Dynamic Management Views (DMV) – Fillfactor Table Scan, Index Scan and Index Seek Index and Order of Columns Final Checklist: Index and Performance Well, we believe we have done our part, now waiting for your comments and feedback. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Index, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology, Video

Read the article
Performance analytics via DBMS "plugins", or other solution

- by Polynomial

I'm working on a systems monitoring product that currently focuses on performance at the system level. We're expanding out to monitoring database systems. Right now we can fetch simple performance information from a selection of DBMS, like connection count, disk IO rates, lock wait times, etc. However, we'd really like a way to measure the execution time of every query going into a DBMS, without requiring the client to implement monitoring in their application code. Some potential solutions might be: Some sort of proxy that sits between client and server. SSL might be an issue here, plus it requires us to reverse engineer and implement the network protocol for each DBMS. Plugin for each DBMS system that automatically records performance information when a query comes in. Other problems include "anonymising" the SQL, i.e. taking something like SELECT * FROM products WHERE price > 20 AND name LIKE "%disk%" and producing SELECT * FROM products WHERE price > ? AND name LIKE "%?%", though this shouldn't be too difficult with some clever parsing and regex. We're mainly focusing on: MySQL MSSQL Oracle Redis mongodb memcached Are there any plugin-style mechanisms we can utilise for any of these? Or is there a simpler solution?

Read the article
FreeBSD performance tuning. Sysctls, loader.conf, kernel

- by SaveTheRbtz

I wanted to share knowledge of tuning FreeBSD via sysctl.conf/loader.conf/KENCONF. It was initially based on Igor Sysoev's (author of nginx) presentation about FreeBSD tuning up to 100,000-200,000 active connections. Tunings are for FreeBSD-CURRENT. Since 7.2 amd64 some of them are tuned well by default. Prior 7.0 some of them are boot only (set via /boot/loader.conf) or does not exist at all. sysctl.conf: # No zero mapping feature # May break wine # (There are also reports about broken samba3) #security.bsd.map_at_zero=0 # If you have really busy webserver with apache13 you may run out of processes #kern.maxproc=10000 # Same for servers with apache2 / Pound #kern.threads.max_threads_per_proc=4096 # Max. backlog size kern.ipc.somaxconn=4096 # Shared memory // 7.2+ can use shared memory > 2Gb kern.ipc.shmmax=2147483648 # Sockets kern.ipc.maxsockets=204800 # Can cause this on older kernels: # http://old.nabble.com/Significant-performance-regression-for-increased-maxsockbuf-on-8.0-RELEASE-tt26745981.html#a26745981 ) kern.ipc.maxsockbuf=10485760 # Mbuf 2k clusters (on amd64 7.2+ 25600 is default) # For such high value vm.kmem_size must be increased to 3G kern.ipc.nmbclusters=262144 # Jumbo pagesize(_SC_PAGESIZE) clusters # Used as general packet storage for jumbo frames # can be monitored via `netstat -m` #kern.ipc.nmbjumbop=262144 # Jumbo 9k/16k clusters # If you are using them #kern.ipc.nmbjumbo9=65536 #kern.ipc.nmbjumbo16=32768 # For lower latency you can decrease scheduler's maximum time slice # default: stathz/10 (~ 13) #kern.sched.slice=1 # Increase max command-line length showed in `ps` (e.g for Tomcat/Java) # Default is PAGE_SIZE / 16 or 256 on x86 # This avoids commands to be presented as [executable] in `ps` # For more info see: http://www.freebsd.org/cgi/query-pr.cgi?pr=120749 kern.ps_arg_cache_limit=4096 # Every socket is a file, so increase them kern.maxfiles=204800 kern.maxfilesperproc=200000 kern.maxvnodes=200000 # On some systems HPET is almost 2 times faster than default ACPI-fast # Useful on systems with lots of clock_gettime / gettimeofday calls # See http://old.nabble.com/ACPI-fast-default-timecounter,-but-HPET-83--faster-td23248172.html # After revision 222222 HPET became default: http://svnweb.freebsd.org/base?view=revision&revision=222222 kern.timecounter.hardware=HPET # Small receive space, only usable on http-server, on file server this # should be increased to 65535 or even more #net.inet.tcp.recvspace=8192 # This is useful on Fat-Long-Pipes #net.inet.tcp.recvbuf_max=10485760 #net.inet.tcp.recvbuf_inc=65535 # Small send space is useful for http servers that serve small files # Autotuned since 7.x net.inet.tcp.sendspace=16384 # This is useful on Fat-Long-Pipes #net.inet.tcp.sendbuf_max=10485760 #net.inet.tcp.sendbuf_inc=65535 # Turn off receive autotuning # You can play with it. #net.inet.tcp.recvbuf_auto=0 #net.inet.tcp.sendbuf_auto=0 # This should be enabled if you going to use big spaces (>64k) # Also timestamp field is useful when using syncookies net.inet.tcp.rfc1323=1 # Turn this off on high-speed, lossless connections (LAN 1Gbit+) # If you set it there is no need in TCP_NODELAY sockopt (see man tcp) net.inet.tcp.delayed_ack=0 # This feature is useful if you are serving data over modems, Gigabit Ethernet, # or even high speed WAN links (or any other link with a high bandwidth delay product), # especially if you are also using window scaling or have configured a large send window. # Automatically disables on small RTT ( http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_subr.c?#rev1.237 ) # This sysctl was removed in 10-CURRENT: # See: http://www.mail-archive.com/[email protected]/msg06178.html #net.inet.tcp.inflight.enable=0 # TCP slowstart algorithm tunings # We assuming we have very fast clients #net.inet.tcp.slowstart_flightsize=100 #net.inet.tcp.local_slowstart_flightsize=100 # Disable randomizing of ports to avoid false RST # Before usage check SA here www.bsdcan.org/2006/papers/ImprovingTCPIP.pdf # (it's also says that port randomization auto-disables at some conn.rates, but I didn't checked it thou) #net.inet.ip.portrange.randomized=0 # Increase portrange # For outgoing connections only. Good for seed-boxes and ftp servers. net.inet.ip.portrange.first=1024 net.inet.ip.portrange.last=65535 # # stops route cache degregation during a high-bandwidth flood # http://www.freebsd.org/doc/en/books/handbook/securing-freebsd.html #net.inet.ip.rtexpire=2 net.inet.ip.rtminexpire=2 net.inet.ip.rtmaxcache=1024 # Security net.inet.ip.redirect=0 net.inet.ip.sourceroute=0 net.inet.ip.accept_sourceroute=0 net.inet.icmp.maskrepl=0 net.inet.icmp.log_redirect=0 net.inet.icmp.drop_redirect=1 net.inet.tcp.drop_synfin=1 # # There is also good example of sysctl.conf with comments: # http://www.thern.org/projects/sysctl.conf # # icmp may NOT rst, helpful for those pesky spoofed # icmp/udp floods that end up taking up your outgoing # bandwidth/ifqueue due to all that outgoing RST traffic. # #net.inet.tcp.icmp_may_rst=0 # Security net.inet.udp.blackhole=1 net.inet.tcp.blackhole=2 # IPv6 Security # For more info see http://www.fosslc.org/drupal/content/security-implications-ipv6 # Disable Node info replies # To see this vulnerability in action run `ping6 -a sglAac ::1` or `ping6 -w ::1` on unprotected node net.inet6.icmp6.nodeinfo=0 # Turn on IPv6 privacy extensions # For more info see proposal http://unix.derkeiler.com/Mailing-Lists/FreeBSD/net/2008-06/msg00103.html net.inet6.ip6.use_tempaddr=1 net.inet6.ip6.prefer_tempaddr=1 # Disable ICMP redirect net.inet6.icmp6.rediraccept=0 # Disable acceptation of RA and auto linklocal generation if you don't use them #net.inet6.ip6.accept_rtadv=0 #net.inet6.ip6.auto_linklocal=0 # Increases default TTL, sometimes useful # Default is 64 net.inet.ip.ttl=128 # Lessen max segment life to conserve resources # ACK waiting time in miliseconds # (default: 30000. RFC from 1979 recommends 120000) net.inet.tcp.msl=5000 # Max bumber of timewait sockets net.inet.tcp.maxtcptw=200000 # Don't use tw on local connections # As of 15 Apr 2009. Igor Sysoev says that nolocaltimewait has some buggy realization. # So disable it or now till get fixed #net.inet.tcp.nolocaltimewait=1 # FIN_WAIT_2 state fast recycle net.inet.tcp.fast_finwait2_recycle=1 # Time before tcp keepalive probe is sent # default is 2 hours (7200000) #net.inet.tcp.keepidle=60000 # Should be increased until net.inet.ip.intr_queue_drops is zero net.inet.ip.intr_queue_maxlen=4096 # Interrupt handling via multiple CPU, but with context switch. # You can play with it. Default is 1; #net.isr.direct=0 # This is for routers only #net.inet.ip.forwarding=1 #net.inet.ip.fastforwarding=1 # This speed ups dummynet when channel isn't saturated net.inet.ip.dummynet.io_fast=1 # Increase dummynet(4) hash #net.inet.ip.dummynet.hash_size=2048 #net.inet.ip.dummynet.max_chain_len # Should be increased when you have A LOT of files on server # (Increase until vfs.ufs.dirhash_mem becomes lower) vfs.ufs.dirhash_maxmem=67108864 # Note from commit http://svn.freebsd.org/base/head@211031 : # For systems with RAID volumes and/or virtualization envirnments, where # read performance is very important, increasing this sysctl tunable to 32 # or even more will demonstratively yield additional performance benefits. vfs.read_max=32 # Explicit Congestion Notification (see http://en.wikipedia.org/wiki/Explicit_Congestion_Notification) net.inet.tcp.ecn.enable=1 # Flowtable - flow caching mechanism # Useful for routers #net.inet.flowtable.enable=1 #net.inet.flowtable.nmbflows=65535 # Extreme polling tuning #kern.polling.burst_max=1000 #kern.polling.each_burst=1000 #kern.polling.reg_frac=100 #kern.polling.user_frac=1 #kern.polling.idle_poll=0 # IPFW dynamic rules and timeouts tuning # Increase dyn_buckets till net.inet.ip.fw.curr_dyn_buckets is lower net.inet.ip.fw.dyn_buckets=65536 net.inet.ip.fw.dyn_max=65536 net.inet.ip.fw.dyn_ack_lifetime=120 net.inet.ip.fw.dyn_syn_lifetime=10 net.inet.ip.fw.dyn_fin_lifetime=2 net.inet.ip.fw.dyn_short_lifetime=10 # Make packets pass firewall only once when using dummynet # i.e. packets going thru pipe are passing out from firewall with accept #net.inet.ip.fw.one_pass=1 # shm_use_phys Wires all shared pages, making them unswappable # Use this to lessen Virtual Memory Manager's work when using Shared Mem. # Useful for databases #kern.ipc.shm_use_phys=1 # ZFS # Enable prefetch. Useful for sequential load type i.e fileserver. # FreeBSD sets vfs.zfs.prefetch_disable to 1 on any i386 systems and # on any amd64 systems with less than 4GB of avaiable memory # For additional info check this nabble thread http://old.nabble.com/Samba-read-speed-performance-tuning-td27964534.html #vfs.zfs.prefetch_disable=0 # On highload servers you may notice following message in dmesg: # "Approaching the limit on PV entries, consider increasing either the # vm.pmap.shpgperproc or the vm.pmap.pv_entry_max tunable" vm.pmap.shpgperproc=2048 loader.conf: # Accept filters for data, http and DNS requests # Useful when your software uses select() instead of kevent/kqueue or when you under DDoS # DNS accf available on 8.0+ accf_data_load="YES" accf_http_load="YES" accf_dns_load="YES" # Async IO system calls aio_load="YES" # Linux specific devices in /dev # As for 8.1 it only /dev/full #lindev_load="YES" # Adds NCQ support in FreeBSD # WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+ # 8.0+ only #ahci_load="YES" #siis_load="YES" # FreeBSD 8.2+ # New Congestion Control for FreeBSD # http://caia.swin.edu.au/urp/newtcp/tools/cc_chd-readme-0.1.txt # http://www.ietf.org/proceedings/78/slides/iccrg-5.pdf # Initial merge commit message http://www.mail-archive.com/[email protected]/msg31410.html #cc_chd_load="YES" # Increase kernel memory size to 3G. # # Use ONLY if you have KVA_PAGES in kernel configuration, and you have more than 3G RAM # Otherwise panic will happen on next reboot! # # It's required for high buffer sizes: kern.ipc.nmbjumbop, kern.ipc.nmbclusters, etc # Useful on highload stateful firewalls, proxies or ZFS fileservers # (FreeBSD 7.2+ amd64 users: Check that current value is lower!) #vm.kmem_size="3G" # If your server has lots of swap (>4Gb) you should increase following value # according to http://lists.freebsd.org/pipermail/freebsd-hackers/2009-October/029616.html # Otherwise you'll be getting errors # "kernel: swap zone exhausted, increase kern.maxswzone" # kern.maxswzone="256M" # Older versions of FreeBSD can't tune maxfiles on the fly #kern.maxfiles="200000" # Useful for databases # Sets maximum data size to 1G # (FreeBSD 7.2+ amd64 users: Check that current value is lower!) #kern.maxdsiz="1G" # Maximum buffer size(vfs.maxbufspace) # You can check current one via vfs.bufspace # Should be lowered/upped depending on server's load-type # Usually decreased to preserve kmem # (default is 10% of mem) #kern.maxbcache="512M" # Sendfile buffers # For i386 only #kern.ipc.nsfbufs=10240 # FreeBSD 9+ # HPET "legacy route" support. It should allow HPET to work per-CPU # See http://www.mail-archive.com/[email protected]/msg03603.html #hint.atrtc.0.clock=0 #hint.attimer.0.clock=0 #hint.hpet.0.legacy_route=1 # syncache Hash table tuning net.inet.tcp.syncache.hashsize=1024 net.inet.tcp.syncache.bucketlimit=512 net.inet.tcp.syncache.cachelimit=65536 # Increased hostcache # Later host cache can be viewed via net.inet.tcp.hostcache.list hidden sysctl # Very useful for it's RTT RTTVAR # Must be power of two net.inet.tcp.hostcache.hashsize=65536 # hashsize * bucketlimit (which is 30 by default) # It allocates 255Mb (1966080*136) of RAM net.inet.tcp.hostcache.cachelimit=1966080 # TCP control-block Hash table tuning net.inet.tcp.tcbhashsize=4096 # Disable ipfw deny all # Should be uncommented when there is a chance that # kernel and ipfw binary may be out-of sync on next reboot #net.inet.ip.fw.default_to_accept=1 # # SIFTR (Statistical Information For TCP Research) is a kernel module that # logs a range of statistics on active TCP connections to a log file. # See prerelease notes http://groups.google.com/group/mailing.freebsd.current/browse_thread/thread/b4c18be6cdce76e4 # and man 4 sitfr #siftr_load="YES" # Enable superpages, for 7.2+ only # Also read http://lists.freebsd.org/pipermail/freebsd-hackers/2009-November/030094.html vm.pmap.pg_ps_enabled=1 # Usefull if you are using Intel-Gigabit NIC #hw.em.rxd=4096 #hw.em.txd=4096 #hw.em.rx_process_limit="-1" # Also if you have ALOT interrupts on NIC - play with following parameters # NOTE: You should set them for every NIC #dev.em.0.rx_int_delay: 250 #dev.em.0.tx_int_delay: 250 #dev.em.0.rx_abs_int_delay: 250 #dev.em.0.tx_abs_int_delay: 250 # There is also multithreaded version of em/igb drivers can be found here: # http://people.yandex-team.ru/~wawa/ # # for additional em monitoring and statistics use # sysctl dev.em.0.stats=1 ; dmesg # sysctl dev.em.0.debug=1 ; dmesg # Also after r209242 (-CURRENT) there is a separate sysctl for each stat variable; # Same tunings for igb #hw.igb.rxd=4096 #hw.igb.txd=4096 #hw.igb.rx_process_limit=100 # Some useful netisr tunables. See sysctl net.isr #net.isr.maxthreads=4 #net.isr.defaultqlimit=4096 #net.isr.maxqlimit: 10240 # Bind netisr threads to CPUs #net.isr.bindthreads=1 # # FreeBSD 9.x+ # Increase interface send queue length # See commit message http://svn.freebsd.org/viewvc/base?view=revision&revision=207554 #net.link.ifqmaxlen=1024 # Nicer boot logo =) loader_logo="beastie" And finally here is KERNCONF: # Just some of them, see also # cat /sys/{i386,amd64,}/conf/NOTES # This one useful only on i386 #options KVA_PAGES=512 # You can play with HZ in environments with high interrupt rate (default is 1000) # 100 is for my notebook to prolong it's battery life #options HZ=100 # Polling is goot on network loads with high packet rates and low-end NICs # NB! Do not enable it if you want more than one netisr thread #options DEVICE_POLLING # Eliminate datacopy on socket read-write # To take advantage with zero copy sockets you should have an MTU >= 4k # This req. is only for receiving data. # Read more in man zero_copy_sockets # Also this epic thread on kernel trap: # http://kerneltrap.org/node/6506 # Here Linus says that "anybody that does it that way (FreeBSD) is totally incompetent" #options ZERO_COPY_SOCKETS # Support TCP sign. Used for IPSec options TCP_SIGNATURE # There was stackoverflow found in KAME IPSec stack: # See http://secunia.com/advisories/43995/ # For quick workaround you can use `ipfw add deny proto ipcomp` options IPSEC # This ones can be loaded as modules. They described in loader.conf section #options ACCEPT_FILTER_DATA #options ACCEPT_FILTER_HTTP # Adding ipfw, also can be loaded as modules options IPFIREWALL # On 8.1+ you can disable verbose to see blocked packets on ipfw0 interface. # Also there is no point in compiling verbose into the kernel, because # now there is net.inet.ip.fw.verbose tunable. #options IPFIREWALL_VERBOSE #options IPFIREWALL_VERBOSE_LIMIT=10 options IPFIREWALL_FORWARD # Adding kernel NAT options IPFIREWALL_NAT options LIBALIAS # Traffic shaping options DUMMYNET # Divert, i.e. for userspace NAT options IPDIVERT # This is for OpenBSD's pf firewall device pf device pflog # pf's QoS - ALTQ options ALTQ options ALTQ_CBQ # Class Bases Queuing (CBQ) options ALTQ_RED # Random Early Detection (RED) options ALTQ_RIO # RED In/Out options ALTQ_HFSC # Hierarchical Packet Scheduler (HFSC) options ALTQ_PRIQ # Priority Queuing (PRIQ) options ALTQ_NOPCC # Required for SMP build # Pretty console # Manual can be found here http://forums.freebsd.org/showthread.php?t=6134 #options VESA #options SC_PIXEL_MODE # Disable reboot on Ctrl Alt Del #options SC_DISABLE_REBOOT # Change normal|kernel messages color options SC_NORM_ATTR=(FG_GREEN|BG_BLACK) options SC_KERNEL_CONS_ATTR=(FG_YELLOW|BG_BLACK) # More scroll space options SC_HISTORY_SIZE=8192 # Adding hardware crypto device device crypto device cryptodev # Useful network interfaces device vlan device tap #Virtual Ethernet driver device gre #IP over IP tunneling device if_bridge #Bridge interface device pfsync #synchronization interface for PF device carp #Common Address Redundancy Protocol device enc #IPsec interface device lagg #Link aggregation interface device stf #IPv4-IPv6 port # Also for my notebook, but may be used with Opteron device amdtemp # Same for Intel processors device coretemp # man 4 cpuctl device cpuctl # CPU control pseudo-device # Support for ECMP. More than one route for destination # Works even with default route so one can use it as LB for two ISP # For now code is unstable and panics (panic: rtfree 2) on route deletions. #options RADIX_MPATH # Multicast routing #options MROUTING #options PIM # Debug & DTrace options KDB # Kernel debugger related code options KDB_TRACE # Print a stack trace for a panic options KDTRACE_FRAME # amd64-only(?) options KDTRACE_HOOKS # all architectures - enable general DTrace hooks #options DDB #options DDB_CTF # all architectures - kernel ELF linker loads CTF data # Adaptive spining in lockmgr (8.x+) # See http://www.mail-archive.com/[email protected]/msg10782.html options ADAPTIVE_LOCKMGRS # UTF-8 in console (8.x+) #options TEKEN_UTF8 # FreeBSD 8.1+ # Deadlock resolver thread # For additional information see http://www.mail-archive.com/[email protected]/msg18124.html # (FYI: "resolution" is panic so use with caution) #options DEADLKRES # Increase maximum size of Raw I/O and sendfile(2) readahead #options MAXPHYS=(1024*1024) #options MAXBSIZE=(1024*1024) # For scheduler debug enable following option. # Debug will be available via `kern.sched.stats` sysctl # For more information see http://svnweb.freebsd.org/base/head/sys/conf/NOTES?view=markup #options SCHED_STATS If you are tuning network for maximum performance you may wish to play with ifconfig options like: # You can list all capabilities via `ifconfig -m` ifconfig [-]rxcsum [-]txcsum [-]tso [-]lro mtu In case you've enabled DDB in kernel config, you should edit your /etc/ddb.conf and add something like this to enable automatic reboot (and textdump as bonus): script kdb.enter.panic=textdump set; capture on; show pcpu; bt; ps; alltrace; capture off; call doadump; reset script kdb.enter.default=textdump set; capture on; bt; ps; capture off; call doadump; reset And do not forget to add ddb_enable="YES" to /etc/rc.conf Since FreeBSD 9 you can select to enable/disable flowcontrol on your NIC: # See http://en.wikipedia.org/wiki/Ethernet_flow_control and # http://www.mail-archive.com/[email protected]/msg07927.html for additional info ifconfig bge0 media auto mediaopt flowcontrol PS. Also most of FreeBSD's limits can be monitored by # vmstat -z and # limits PPS. variety of network counters can be monitored via # netstat -s In FreeBSD-9 netstat's -Q option appeared, try following command to display netisr stats # netstat -Q PPPS. also see # man 7 tuning PPPPS. I wanted to thank FreeBSD community, especially author of nginx - Igor Sysoev, nginx-ru@ and FreeBSD-performance@ mailing lists for providing useful information about FreeBSD tuning. FreeBSD WIP * Whats cooking for FreeBSD 7? * Whats cooking for FreeBSD 8? * Whats cooking for FreeBSD 9? So here is the question: What tunings are you using on yours FreeBSD servers? You can also post your /etc/sysctl.conf, /boot/loader.conf, kernel options, etc with description of its' meaning (do not copy-paste from sysctl -d). Don't forget to specify server type (web, smb, gateway, etc) Let's share experience!

Read the article
SQL SERVER – Database Dynamic Caching by Automatic SQL Server Performance Acceleration

- by pinaldave

My second look at SafePeak’s new version (2.1) revealed to me few additional interesting features. For those of you who hadn’t read my previous reviews SafePeak and not familiar with it, here is a quick brief: SafePeak is in business of accelerating performance of SQL Server applications, as well as their scalability, without making code changes to the applications or to the databases. SafePeak performs database dynamic caching, by caching in memory result sets of queries and stored procedures while keeping all those cache correct and up to date. Cached queries are retrieved from the SafePeak RAM in microsecond speed and not send to the SQL Server. The application gets much faster results (100-500 micro seconds), the load on the SQL Server is reduced (less CPU and IO) and the application or the infrastructure gets better scalability. SafePeak solution is hosted either within your cloud servers, hosted servers or your enterprise servers, as part of the application architecture. Connection of the application is done via change of connection strings or adding reroute line in the c:\windows\system32\drivers\etc\hosts file on all application servers. For those who would like to learn more on SafePeak architecture and how it works, I suggest to read this vendor’s webpage: SafePeak Architecture. More interesting new features in SafePeak 2.1 In my previous review of SafePeak new I covered the first 4 things I noticed in the new SafePeak (check out my article “SQLAuthority News – SafePeak Releases a Major Update: SafePeak version 2.1 for SQL Server Performance Acceleration”): Cache setup and fine-tuning – a critical part for getting good caching results Database templates Choosing which database to cache Monitoring and analysis options by SafePeak Since then I had a chance to play with SafePeak some more and here is what I found. 5. Analysis of SQL Performance (present and history): In SafePeak v.2.1 the tools for understanding of performance became more comprehensive. Every 15 minutes SafePeak creates and updates various performance statistics. Each query (or a procedure execute) that arrives to SafePeak gets a SQL pattern, and after it is used again there are statistics for such pattern. An important part of this product is that it understands the dependencies of every pattern (list of tables, views, user defined functions and procs). From this understanding SafePeak creates important analysis information on performance of every object: response time from the database, response time from SafePeak cache, average response time, percent of traffic and break down of behavior. One of the interesting things this behavior column shows is how often the object is actually pdated. The break down analysis allows knowing the above information for: queries and procedures, tables, views, databases and even instances level. The data is show now on all arriving queries, both read queries (that can be cached), but also any types of updates like DMLs, DDLs, DCLs, and even session settings queries. The stats are being updated every 15 minutes and SafePeak dashboard allows going back in time and investigating what happened within any time frame. 6. Logon trigger, for making sure nothing corrupts SafePeak cache data If you have an application with many parts, many servers many possible locations that can actually update the database, or the SQL Server is accessible to many DBAs or software engineers, each can access some database directly and do some changes without going thru SafePeak – this can create a potential corruption of the data stored in SafePeak cache. To make sure SafePeak cache is correct it needs to get all updates to arrive to SafePeak, and if a DBA will access the database directly and do some changes, for example, then SafePeak will simply not know about it and will not clean SafePeak cache. In the new version, SafePeak brought a new feature called “Logon Trigger” to solve the above challenge. By special click of a button SafePeak can deploy a special server logon trigger (with a CLR object) on your SQL Server that actually monitors all connections and informs SafePeak on any connection that is coming not from SafePeak. In SafePeak dashboard there is an interface that allows to control which logins can be ignored based on login names and IPs, while the rest will invoke cache cleanup of SafePeak and actually locks SafePeak cache until this connection will not be closed. Important to note, that this does not interrupt any logins, only informs SafePeak on such connection. On the Dashboard screen in SafePeak you will be able to see those connections and then decide what to do with them. Configuration of this feature in SafePeak dashboard can be done here: Settings -> SQL instances management -> click on instance -> Logon Trigger tab. Other features: 7. User management ability to grant permissions to someone without changing its configuration and only use SafePeak as performance analysis tool. 8. Better reports for analysis of performance using 15 minute resolution charts. 9. Caching of client cursors 10. Support for IPv6 Summary SafePeak is a great SQL Server performance acceleration solution for users who want immediate results for sites with performance, scalability and peak spikes challenges. Especially if your apps are packaged or 3rd party, since no code changes are done. SafePeak can significantly increase response times, by reducing network roundtrip to the database, decreasing CPU resource usage, eliminating I/O and storage access. SafePeak team provides a free fully functional trial www.safepeak.com/download and actually provides a one-on-one assistance during such trial. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, Pinal Dave, PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL, Technology

Read the article
Windows performance monitor new instances

- by fborozan

Hi all, I am trying to configure performance monitor on 2003/2008R1&R2 to capture new instances of the counters without any luck. For example if I select counter Process\%Processor time (to monitor processor time per any instances of the process) everything works fine until I open or close any application. If in the meanwhile new application is open it will not be included in the monitoring processor, and old application instance will display zero for % processor time. The problem is performance monitor is not refreshing instances of the new applications/users/new terminal session/ or any other metrics that changes instances in the meanwhile. The solution is to stop/start log file, but I don't want to do that every sec and the logging will be split into two files. Anybody knows how do I accomplish to add all new instances? Any help greatly appreciated

Read the article
Very poor read performance compared to write performance on md(raid1) / crypt(luks) / lvm

- by Android5360

I'm experiencing very poor read performance over raid1/crypt/lvm. In the same time, write speeds are about 2x+ faster on the same setup. On another raid1 setup on the same machine I get normal read speeds (maybe because I'm not using cryptsetup). OS related disks: sda + sdb. I have raid1 configuration with two disks, both are in place. I'm using LVM over the RAID. No encryption. Both disks are WD Green, 5400 rpm. IO test results on this raid1: dd if=/dev/zero of=/tmp/output.img3 bs=8k count=256k conv=fsync - 2147483648 bytes (2.1 GB) copied, 22.3392 s, 96.1 MB/s sync echo 3 > /proc/sys/vm/drop_caches dd if=/tmp/output.img3 of=/dev/null bs=8k - 2147483648 bytes (2.1 GB) copied, 15.9 s, 135 MB/s And here is the problematic setup (on the same machine). Currently I have only one sdc (WD Green, 5400rpm) configured in software raid1 + crypt (luks, serpent-xts-plain) + lvm. Tomorrow I will attach another disk (sdd) to complete this two-disk raid1 setup. IO tests results on this raid1: dd if=/dev/zero of=output.img3 bs=8k count=256k conv=fsync 2147483648 bytes (2.1 GB) copied, 17.7235 s, 121 MB/s sync echo 3 > /proc/sys/vm/drop_caches dd if=output.img3 of=/dev/null bs=8k 2147483648 bytes (2.1 GB) copied, 36.2454 s, 59.2 MB/s We can see that the read performance is very very bad (59MB/s compared to 135MB/s when using no encryption). Nothing is using the disks during benchmark. I can confirm this because I checked with iostat and dstat. Details on the hardware: disks: all are WD green, 5400rpm, 64mb cache. cpu: FX-8350 at stock speed ram: 4x4GB at 1066Mhz. Details on the software: OS: Debian Wheezy 7, amd64 mdadm: v3.2.5 - 18th May 2012 LVM version: 2.02.95(2) (2012-03-06) LVM Library version: 1.02.74 (2012-03-06) LVM Driver version: 4.22.0 cryptsetup: 1.4.3 Here is how I configured the slow raid1+crypt+lvm setup: parted /dev/sdc mklabel gpt type: ext4 start: 2048s end: -1 Now the raid, crypt and the lvm configuration: mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/sdc cryptsetup --cipher serpent-xts-plain luksFormat /dev/md1 cryptsetup luksOpen /dev/md1 md1_crypt vgcreate vg_sql /dev/mapper/md1_crypt lvcreate -l 100%VG vg_sql -n lv_sql mkfs.ext4 /dev/mapper/vg_sql-lv-sql mount /dev/mapper/vg_sql-lv_sql /sql So guys, can you help me identify the reason and fix it? It has to be something with the cryptsetup as there is no such read slowdown on the other setup (sda+sdb) where no encryption is present. But I have no idea what to do. Thanks!

Read the article
Recommendation using Client side performance monitoring (boomerang/jiffy/episodes)

- by Yasei No Umi

There are a few Client-side JavaScript libraries that check web-site performance on the client side: Jiffy (http://code.google.com/p/jiffy-web/) Episodes (http://stevesouders.com/episodes/) by Steve Sounders Boomerang (http://yahoo.github.com/boomerang/doc/) by Yahoo! Have you used any of them or a similar too? What did you use for the server-side? for reporting? Is this a recommended approach? If not, how should I monitor my web-site performance from the end-user's view?

Read the article
refresh windows network performance counters in command line

- by michalv82

I am testing a USB device connected to a windows PC. When the device is connected then windows has another network interface going through the device. I need to get the bytes transffered for that specific interface, basically I need the data shown in the networking tab in the task manager for my interface adapter. I found this question which helped to get this info: ms windows network activity bytes send and receive in command line However I have a problem when I run multiple tests - each time I disconnect and connect the device there's another line for the interface, like below. In the task manager networking tab I only see one record for my interface but I don't know how I can know from command line which is the lastest and current instance (it's not like the first line or last line is always the current interface, I noticed it's not consistent): wmic path Win32_PerfRawData_Tcpip_NetworkInterface Get Name,PacketsReceivedPerSec,PacketsSentPerSec,BytesReceivedPersec,BytesSentPersec BytesReceivedPersec BytesSentPersec Name PacketsReceivedPersec PacketsSentPersec 422666370 6317989292 Intel[R] 82579LM Gigabit Network Connection 2715169 8109643 49150 375973 My USB Device 432 568 0 0 My USB Device _2 0 0 0 0 My USB Device _3 0 0 0 0 My USB Device _6 0 0 0 0 Local Area Connection* 9 0 0

Read the article
SQL SERVER – Iridium I/O – SQL Server Deduplication that Shrinks Databases and Improves Performance

- by Pinal Dave

Database performance is a common problem for SQL Server DBA’s. It seems like we spend more time on performance than just about anything else. In many cases, we use scripts or tools that point out performance bottlenecks but we don’t have any way to fix them. For example, what do you do when you need to speed up a query that is already tuned as well as possible? Or what do you do when you aren’t allowed to make changes for a database supporting a purchased application? Iridium I/O for SQL Server was originally built at Confio software (makers of Ignite) because DBA’s kept asking for a way to actually fix performance instead of just pointing out performance problems. The technology is certified by Microsoft and was so promising that it was spun out into a separate company that is now run by the Confio Founder/CEO and technology management team. Iridium uses deduplication technology to both shrink the databases as well as boost IO performance. It is intriguing to see it work. It will deduplicate a live database as it is running transactions. You can watch the database get smaller while user queries are running. Iridium is a simple tool to use. After installing the software, you click an “Analyze” button which will spend a minute or two on each database and estimate both your storage and performance savings. Next, you click an “Activate” button to turn on Iridium I/O for your selected databases. You don’t need to reboot the operating system or restart the database during any part of the process. As part of my test, I also wanted to see if there would be an impact on my databases when Iridium was removed. The ‘revert’ process (bringing the files back to their SQL Server native format) was executed by a simple click of a button, and completed while the databases were available for normal processing. I was impressed and enjoyed playing with the software and encourage all of you to try it out. Here is the link to the website to download Iridium for free. . Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
EPM 11.1.2 - In WebLogic Server, Enable Native IO Performance Pack

- by Ahmed Awan

Performance can be improved by enabling native IO in production mode. WebLogic Server benchmarks show major performance improvements when native performance packs are used on machines that host Oracle WebLogic Server instances. Important Note: Always enable native I/O, if available, and check for errors at startup to make sure it is being initialed properly. Tip: The use of NATIVE performance packs are enabled by default in the configuration shipped with your distribution. You can use the Administration Console to verify that performance packs are enabled by clicking on each managed server and click on Tuning tab.

Read the article
Hyper-V R2 Performance Counters

- by Ascendo

Hi all I've been playing around with the WMI performance counters for Hyper-V. Of interest to me are the Virtual NIC bytes/sec input and output counters. I notice that the results are very "spikey". Over what time period is the latest counter averaged? I'm trying to calculate total traffic volume per VM, but sometimes a very high instantaneous poll result is inflating the result as I only poll the result each minute. I would prefer to read a 'bytes total' counter instead of a 'bytes/sec' counter - is there such a thing? Thanks Acendo

Read the article
How to collect the performance data of a server during an unreachable/down period using Nagios?

- by gsc-frank

Some time services and host stop responding due to a poor server performance. I mean, if for some reason (could be lot of concurrency services access, a expensive backup execution on the server or whatever that consume tons of server resources) a server performance is very degraded, that could lead that the server isn't capable to establish any "normal network communication" (without trigger whatever standards timeouts defined for such communication). Knowing host's performance data (cpu, memory, ...) in case of available during that period (host is not down and despite of its performance degradation still allow plugins collect performance data) could be very useful for sysadmin to try to determine what cause the problem, or at least, if the host performance was good and don't interfered at all in the host/service down. This problem could be solved using remote active (NRPE) or remote passive (NSCA) if such remote solutions could store (buffered) perf data to be send to central Nagios server when host performance or network outage allow it. I read the doc of both solutions and can't find any reference to such buffer mechanism neither what happened in case that NSCA can't reach Nagios server. Any idea of how solve this lack of info? so useful for forensic analysis. EDIT: My questions isn about which tools I can use to debug perf problems or gather perf data to analysis, but is about how collect (using Nagios) host perf data even during a network outage for its posterior analysis (kind of forensic analysis). The idea is integrate such data to Nagios graphers like pnp4nagios and NagiosGrapther. I know that I could install tools like Cacti in each of my host, and have a kind of performance data collection redundancy, but I really want avoid that and try to solve all perf analysis requirements with one tools: Nagios

Read the article
Postfix performance

- by Brian G

Running postfix on ubuntu, sending alot of mail ( ~ 1 million messages ) per day. loads are extremly high but not much in terms of cpu and memory load. Anyone in a similiar situation and know how to remove the bottleneck? All mail on this server is outbound. I would have to assume the bottleneck is disk. Just an update, here is what iostat looks like: avg-cpu: %user %nice %system %iowait %steal %idle 0.00 0.00 0.12 99.88 0.00 0.00 Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sda 0.00 12.38 0.00 2.48 0.00 118.81 48.00 0.00 0.00 0.00 0.00 sdb 1.49 22.28 72.28 42.57 629.70 1041.58 14.55 135.56 834.31 8.71 100.00 Are these numbers in line with the performance you would expect from a single disk? sdb is dedicated to postfix. I think it is queue shuffling, from incoming-active-deferred More details from questions: Server: Quad core Xeon(R) CPU E5405 @ 2.00GH with 4 GB ram Load average: 464.88, 489.11, 483.91, 4 cores. but the memory utilization and cpu is minimal Postfix instances between 16 - 32

Read the article
Increase application performance on Amazon AWS

- by Honus Wagner

I've got a client with an MVC v1 (.NET) application running on a micro instance. On this instance, I've got .NET, IIS 7.5, and MS SQL Server 2008 running to handle the application. The client has reported that it is taking nearly 10 seconds to process each request. Even loading the initial login page takes about that long, then logging in takes that long, etc etc. The currently running instance specs are as follows: 615 MB RAM Intel Xenon CPU E5430 @ 2.66GHz 2.78 GHz 64-Bit Is the memory availability the issue? or is it the processing power? I forsee two options: Change to a larget instance Set up a 2-tier architecture with two micro instances Which of these will give the application better performance? Thanks in advance.

Read the article
Application Performance: The Best of the Web

- by Michaela Murray

Wisdom A deep understanding and realization […] resulting in the ability to apply perceptions, judgements and actions. It is also the comprehension of what is true coupled with optimum judgment as to action. - Wikipedia We’re writing a book for ASP.NET developers, and we want you to be a part of it. We know that there’s a huge amount of web developer wisdom that never gets shared, and we want to find those golden nuggets of knowledge and experience, and make sure everyone can learn from them. Right now, we want to find out about your top tips, hard-won lessons, and sage advice for avoiding, finding, and fixing application performance problems. If you work with .NET and SQL, even better – a lot of application performance relies on the interaction with the database, so we want to hear from you! “How Do You Want Me To Be Involved?” Right! Details! We want you, our most excellent readers, to email us with the Best Advice you would give to other developers for getting the best performance out of their applications. It doesn’t matter if your advice is for newbies or veterans, .NET or SQL – so long as it’s about application performance, we want to hear from you. (And if you think that there’s developer wisdom out there that “everyone knows”, a) I’m willing to bet you could find someone who doesn’t know about it, and b) it probably bears repeating anyway!) “I’m Interested. What Can You Do For Me?” Excellent question. For starters, there’s a chance to win a Microsoft Surface (the tablet, not the table-top). Once all the ASP.NET Wisdom has been collected, tallied, and labelled, it will then be weighed and measured by a team of expert judges (whose identities are still a closely-guarded secret). The top tip in both SQL & .NET categories will each win their author their very own MS Surface. But that’s not all! We can also give you… immortality! More details? Ok. We’ll be collecting all of the tips sent in by our readers (and we can’t wait to learn from you all,) and with the help of our Simple-Talk editors, we will publish and distribute your combined and documented knowledge as a free, community-created, professionally typeset eBook. You will naturally be credited by name / pseudonym / twitter handle / GitHub username / StackOverflow profile / Whatever, as the clearly ingenious author of hot performance tips. The Not-Very-Fine Print Here’s the breakdown: We want to bring together the best application performance knowledge from ASP.NET developers. Closing date for submissions will be 9am GMT, December 4th. Submissions should be made by email – [email protected] Submissions will be judged by a panel of expert judges (who will be revealed soon). The top submission in both the SQL & .NET categories will each win a Microsoft Surface. ALL the tips which make it through the judging process will be polished by Simple-Talk editors, and turned into a professionally typeset eBook, which will be freely available, and promoted alongside the ANTS Performance Profiler tool. Anyone whose entry makes it into the book will be clearly and profusely credited in the method of their choice (or can remain anonymous.) The really REALLY short version Share what you know about ASP.NET application performance for a chance to win a Microsoft Surface, and then get your name credited in a slick eBook with top-notch production values. For more details, see above. We can’t wait to learn from you!

Read the article
Server Performance

- by sb12

I know very little about performance tuning of servers etc... so i thought i'd put this up here as i start some research on it, just to get some direction. I am in the process of migrating from my old server to a new one - both are 64 bit machines. One is a few years old, the other brand new (PowerEdge R410). The old server spec is: 2 cpus, 3.4GHz Pentiums, 8G of RAM, Fedora 11 currently installed The new server spec is: 16 cpus, 3.2 GHz Xeon, 16G of RAM, CentOS 6.2 installed. Also RAID10 is on the new server - no RAID on the old one. Both servers currently have the same database (MySQL) with the same data migrated. I wrote a Perl script that simply steps through each row of a table in the database (about 18000 rows) and updates a value in that row. Every row in the table is updated. Out of curiosity i ran this perl script on both machines, just to see how the new server would perform vs. the old one, and it produced interesting results: The old server was twice as fast as the new one to complete. Looking at the database, both are configured exactly the same (the new one being a dump of the old one...)... Anyone any ideas why this would be given the hardware gap between both? As i said i'm about to start some digging, but thought i'd put this up here to maybe get some good direction.... Many thanks in advance..

Read the article
Analysing and measuring the performance of a .NET application (survey results)

- by Laila

Back in December last year, I asked myself: could it be that .NET developers think that you need three days and a PhD to do performance profiling on their code? What if developers are shunning profilers because they perceive them as too complex to use? If so, then what method do they use to measure and analyse the performance of their .NET applications? Do they even care about performance? So, a few weeks ago, I decided to get a 1-minute survey up and running in the hopes that some good, hard data would clear the matter up once and for all. I posted the survey on Simple Talk and got help from a few people to promote it. The survey consisted of 3 simple questions: Amazingly, 533 developers took the time to respond - which means I had enough data to get representative results! So before I go any further, I would like to thank all of you who contributed, because I now have some pretty good answers to the troubling questions I was asking myself. To thank you properly, I thought I would share some of the results with you. First of all, application performance is indeed important to most of you. In fact, performance is an intrinsic part of the development cycle for a good 40% of you, which is much higher than I had anticipated, I have to admit. (I know, "Have a little faith Laila!") When asked what tool you use to measure and analyse application performance, I found that nearly half of the respondents use logging statements, a third use performance counters, and 70% of respondents use a profiler of some sort (a 3rd party performance profilers, the CLR profiler or the Visual Studio profiler). The importance attributed to logging statements did surprise me a little. I am still not sure why somebody would go to the trouble of manually instrumenting code in order to measure its performance, instead of just using a profiler. I personally find the process of annotating code, calculating times from log files, and relating it all back to your source terrifyingly laborious. Not to mention that you then need to remember to turn it all off later! Even when you have logging in place throughout all your code anyway, you still have a fair amount of potentially error-prone calculation to sift through the results; in addition, you'll only get method-level rather than line-level timings, and you won't get timings from any framework or library methods you don't have source for. To top it all, we all know that bottlenecks are rarely where you would expect them to be, so you could be wasting time looking for a performance problem in the wrong place. On the other hand, profilers do all the work for you: they automatically collect the CPU and wall-clock timings, and present the results from method timing all the way down to individual lines of code. Maybe I'm missing a trick. I would love to know about the types of scenarios where you actively prefer to use logging statements. Finally, while a third of the respondents didn't have a strong opinion about code performance profilers, those who had an opinion thought that they were mainly complex to use and time consuming. Three respondents in particular summarised this perfectly: "sometimes, they are rather complex to use, adding an additional time-sink to the process of trying to resolve the existing problem". "they are simple to use, but the results are hard to understand" "Complex to find the more advanced things, easy to find some low hanging fruit". These results confirmed my suspicions: Profilers are seen to be designed for more advanced users who can use them effectively and make sense of the results. I found yet more interesting information when I started comparing samples of "developers for whom performance is an important part of the dev cycle", with those "to whom performance is only looked at in times of crisis", and "developers to whom performance is not important, as long as the app works". See the three graphs below. Sample of developers to whom performance is an important part of the dev cycle: Sample of developers to whom performance is important only in times of crisis: Sample of developers to whom performance is not important, as long as the app works: As you can see, there is a strong correlation between the usage of a profiler and the importance attributed to performance: indeed, the more important performance is to a development team, the more likely they are to use a profiler. In addition, developers to whom performance is an important part of the dev cycle have a higher tendency to use a much wider range of methods for performance measurement and analysis. And, unsurprisingly, the less important performance is, the less varied the methods of measurement are. So all in all, to come back to my random questions: .NET developers do care about performance. Those who care the most use a wider range of performance measurement methods than those who care less. But overall, logging statements, performance counters and third party performance profilers are the performance measurement methods of choice for most developers. Finally, although most of you find code profilers complex to use, those of you who care the most about performance tend to use profilers more than those of you to whom performance is not so important.

Read the article
C# Performance Pitfall – Interop Scenarios Change the Rules

- by Reed

C# and .NET, overall, really do have fantastic performance in my opinion. That being said, the performance characteristics dramatically differ from native programming, and take some relearning if you’re used to doing performance optimization in most other languages, especially C, C++, and similar. However, there are times when revisiting tricks learned in native code play a critical role in performance optimization in C#. I recently ran across a nasty scenario that illustrated to me how dangerous following any fixed rules for optimization can be… The rules in C# when optimizing code are very different than C or C++. Often, they’re exactly backwards. For example, in C and C++, lifting a variable out of loops in order to avoid memory allocations often can have huge advantages. If some function within a call graph is allocating memory dynamically, and that gets called in a loop, it can dramatically slow down a routine. This can be a tricky bottleneck to track down, even with a profiler. Looking at the memory allocation graph is usually the key for spotting this routine, as it’s often “hidden” deep in call graph. For example, while optimizing some of my scientific routines, I ran into a situation where I had a loop similar to: for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i]); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This loop was at a fairly high level in the call graph, and often could take many hours to complete, depending on the input data. As such, any performance optimization we could achieve would be greatly appreciated by our users. After a fair bit of profiling, I noticed that a couple of function calls down the call graph (inside of ProcessElement), there was some code that effectively was doing: // Allocate some data required DataStructure* data = new DataStructure(num); // Call into a subroutine that passed around and manipulated this data highly CallSubroutine(data); // Read and use some values from here double values = data->Foo; // Cleanup delete data; // ... return bar; Normally, if “DataStructure” was a simple data type, I could just allocate it on the stack. However, it’s constructor, internally, allocated it’s own memory using new, so this wouldn’t eliminate the problem. In this case, however, I could change the call signatures to allow the pointer to the data structure to be passed into ProcessElement and through the call graph, allowing the inner routine to reuse the same “data” memory instead of allocating. At the highest level, my code effectively changed to something like: DataStructure* data = new DataStructure(numberToProcess); for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i], data); } delete data; Granted, this dramatically reduced the maintainability of the code, so it wasn’t something I wanted to do unless there was a significant benefit. In this case, after profiling the new version, I found that it increased the overall performance dramatically – my main test case went from 35 minutes runtime down to 21 minutes. This was such a significant improvement, I felt it was worth the reduction in maintainability. In C and C++, it’s generally a good idea (for performance) to: Reduce the number of memory allocations as much as possible, Use fewer, larger memory allocations instead of many smaller ones, and Allocate as high up the call stack as possible, and reuse memory I’ve seen many people try to make similar optimizations in C# code. For good or bad, this is typically not a good idea. The garbage collector in .NET completely changes the rules here. In C#, reallocating memory in a loop is not always a bad idea. In this scenario, for example, I may have been much better off leaving the original code alone. The reason for this is the garbage collector. The GC in .NET is incredibly effective, and leaving the allocation deep inside the call stack has some huge advantages. First and foremost, it tends to make the code more maintainable – passing around object references tends to couple the methods together more than necessary, and overall increase the complexity of the code. This is something that should be avoided unless there is a significant reason. Second, (unlike C and C++) memory allocation of a single object in C# is normally cheap and fast. Finally, and most critically, there is a large advantage to having short lived objects. If you lift a variable out of the loop and reuse the memory, its much more likely that object will get promoted to Gen1 (or worse, Gen2). This can cause expensive compaction operations to be required, and also lead to (at least temporary) memory fragmentation as well as more costly collections later. As such, I’ve found that it’s often (though not always) faster to leave memory allocations where you’d naturally place them – deep inside of the call graph, inside of the loops. This causes the objects to stay very short lived, which in turn increases the efficiency of the garbage collector, and can dramatically improve the overall performance of the routine as a whole. In C#, I tend to: Keep variable declarations in the tightest scope possible Declare and allocate objects at usage While this tends to cause some of the same goals (reducing unnecessary allocations, etc), the goal here is a bit different – it’s about keeping the objects rooted for as little time as possible in order to (attempt) to keep them completely in Gen0, or worst case, Gen1. It also has the huge advantage of keeping the code very maintainable – objects are used and “released” as soon as possible, which keeps the code very clean. It does, however, often have the side effect of causing more allocations to occur, but keeping the objects rooted for a much shorter time. Now – nowhere here am I suggesting that these rules are hard, fast rules that are always true. That being said, my time spent optimizing over the years encourages me to naturally write code that follows the above guidelines, then profile and adjust as necessary. In my current project, however, I ran across one of those nasty little pitfalls that’s something to keep in mind – interop changes the rules. In this case, I was dealing with an API that, internally, used some COM objects. In this case, these COM objects were leading to native allocations (most likely C++) occurring in a loop deep in my call graph. Even though I was writing nice, clean managed code, the normal managed code rules for performance no longer apply. After profiling to find the bottleneck in my code, I realized that my inner loop, a innocuous looking block of C# code, was effectively causing a set of native memory allocations in every iteration. This required going back to a “native programming” mindset for optimization. Lifting these variables and reusing them took a 1:10 routine down to 0:20 – again, a very worthwhile improvement. Overall, the lessons here are: Always profile if you suspect a performance problem – don’t assume any rule is correct, or any code is efficient just because it looks like it should be Remember to check memory allocations when profiling, not just CPU cycles Interop scenarios often cause managed code to act very differently than “normal” managed code. Native code can be hidden very cleverly inside of managed wrappers

Read the article
How to save a perfmon Performance Counter as a textfile (Reliability and Performance Monitor Version

- by Lieven Cardoen

Now the file gets saved as blg, but I would like a txt versin to import in Excel.

Read the article
How often is software speed evident in the eyes of customers?

- by rwong

In theory, customers should be able to feel the software performance improvements from first-hand experience. In practice, sometimes the improvements are not noticible enough, such that in order to monetize from the improvements, it is necessary to use quotable performance figures in marketing in order to attract customers. We already know the difference between perceived performance (GUI latency, etc) and server-side performance (machines, networks, infrastructure, etc). How often is it that programmers need to go the extra length to "write up" performance analyses for which the audience is not fellow programmers, but managers and customers?

Read the article

Search Results

Search found 13151 results on 527 pages for 'performance counters'.

Page 2/527 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Jon

- by Trustin Lee

- by Grofit

- by pinaldave

- by pinaldave

- by pinaldave

- by Polynomial

- by SaveTheRbtz

- by pinaldave

- by fborozan

- by Android5360

- by Yasei No Umi

- by michalv82

- by Pinal Dave

- by Ahmed Awan

- by Ascendo

- by gsc-frank

- by Brian G

- by Honus Wagner

- by Michaela Murray

- by sb12

- by Laila

- by Reed

- by Lieven Cardoen

- by rwong

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >