Search Results

Search found 562 results on 23 pages for 'profiling'.

Page 10/23 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17  | Next Page >

  • In sync query calls, one query causing other query to run slower. Why?

    - by Irchi
    Sorry for the long question, but I think this is an interesting situation and I couldn't find any explanations for it: I was involved in optimization of an application that performed a large number of sequential SELECT and INSERT statements on a single dedicated SQL Server database. The process needs to INSERT a large number of records into a table, but for each of them there should be some value mappings, which performed using SELECT statements on another table in the same database. For a specific execution, it took 90 minutes to run. I used a profiler (JProfiler - the application is Java-based) to determine how much time does each part of the application take. It yields that 60% of the time was spent on INSERT method calls, and almost 20% on SELECT calls (the rest distributed in other parts). After some trials, I came to this situation: I commented out the INSERT query that took 60% of the time. I was expecting for the total run time to be around 35 minutes, as I have removed 60% of the 90 minutes. But the whole process took the same 90 minutes (doing only SELECTs and nothing else), but each SELECT took longer this time! Everything was running sync, there were no async calls. And there was only one single thread of execution. SELECT and INSERT queries are very simple, and don't have anything special, and they are on different tables, but on the same DB. I tested with both the DB on the application machine, and on a remote network machine. I can't think of any explanation for this, as the Profiler (Application profiler, not SQL Profiler) reported the changes in the method call times, and by removing INSERT statements SELECT statements took longer to run. Can anyone give me some kind of explanation of what could have happened? (there can't be cache / query optimization stuff, because the queries were run in sync, and in a single thread, and it was far from affecting the cache this much) I should note that the bottleneck of the speed was in SQL server, using most of the CPU time.

    Read the article

  • Get gprof to profile based on wall-clock time?

    - by jetwolf
    My understanding is that by default gprof takes into account CPU time. Is there a way to get it to profile based on wall-clock time? My program does a lot of disk i/o, so the CPU time it uses only represents a fraction of the actual execution time. I need to know which portions of the disk i/o take up the most time.

    Read the article

  • How to use profiler?

    - by A-ha
    Can anyone provide me with link to a website/book in which would be well explainded how to use profiler in VS2010 for native code? I tried to search the web but all I'm getting is tutorials from Microsoft and those tutorials use C#. Thank you.

    Read the article

  • Count function calls by name or signature. Gcc, C++

    - by MajesticRa
    I have some c++ written package. Linux, gcc. I can modify compilation process (change Makefile, flags, etc.), but can not change C++ source code. One runs the package with different parameters, it does a job and exits. How to count: 1) Number of calls of function with specific name? 2) Number of calls of functions with specific signature? 3) Number of calls of functions where one of the parameters is of specific type i.e. std::string (type is specified by signature)? 4) and extra Number of calls of functions of STL objects, i.e. std::string copy constructor? (I mean count a number of calls during the run. ) I thought to do it with GDB, but I found it very tough to do (1) and have not found how to do (2)-(4) at all. All acceptable answers I will write here for humanity.

    Read the article

  • ANTS CLR and Memory Profiler In Depth Review (Part 1 of 2 &ndash; CLR Profiler)

    - by ToStringTheory
    One of the things that people might not know about me, is my obsession to make my code as efficient as possible.  Many people might not realize how much of a task or undertaking that this might be, but it is surely a task as monumental as climbing Mount Everest, except this time it is a challenge for the mind…  In trying to make code efficient, there are many different factors that play a part – size of project or solution, tiers, language used, experience and training of the programmer, technologies used, maintainability of the code – the list can go on for quite some time. I spend quite a bit of time when developing trying to determine what is the best way to implement a feature to accomplish the efficiency that I look to achieve.  One program that I have recently come to learn about – Red Gate ANTS Performance (CLR) and Memory profiler gives me tools to accomplish that job more efficiently as well.  In this review, I am going to cover some of the features of the ANTS profiler set by compiling some hideous example code to test against. Notice As a member of the Geeks With Blogs Influencers program, one of the perks is the ability to review products, in exchange for a free license to the program.  I have not let this affect my opinions of the product in any way, and Red Gate nor Geeks With Blogs has tried to influence my opinion regarding this product in any way. Introduction The ANTS Profiler pack provided by Red Gate was something that I had not heard of before receiving an email regarding an offer to review it for a license.  Since I look to make my code efficient, it was a no brainer for me to try it out!  One thing that I have to say took me by surprise is that upon downloading the program and installing it you fill out a form for your usual contact information.  Sure enough within 2 hours, I received an email from a sales representative at Red Gate asking if she could help me to achieve the most out of my trial time so it wouldn’t go to waste.  After replying to her and explaining that I was looking to review its feature set, she put me in contact with someone that setup a demo session to give me a quick rundown of its features via an online meeting.  After having dealt with a massive ordeal with one of my utility companies and their complete lack of customer service, Red Gates friendly and helpful representatives were a breath of fresh air, and something I was thankful for. ANTS CLR Profiler The ANTS CLR profiler is the thing I want to focus on the most in this post, so I am going to dive right in now. Install was simple and took no time at all.  It installed both the profiler for the CLR and Memory, but also visual studio extensions to facilitate the usage of the profilers (click any images for full size images): The Visual Studio menu options (under ANTS menu) Starting the CLR Performance Profiler from the start menu yields this window If you follow the instructions after launching the program from the start menu (Click File > New Profiling Session to start a new project), you are given a dialog with plenty of options for profiling: The New Session dialog.  Lots of options.  One thing I noticed is that the buttons in the lower right were half-covered by the panel of the application.  If I had to guess, I would imagine that this is caused by my DPI settings being set to 125%.  This is a problem I have seen in other applications as well that don’t scale well to different dpi scales. The profiler options give you the ability to profile: .NET Executable ASP.NET web application (hosted in IIS) ASP.NET web application (hosted in IIS express) ASP.NET web application (hosted in Cassini Web Development Server) SharePoint web application (hosted in IIS) Silverlight 4+ application Windows Service COM+ server XBAP (local XAML browser application) Attach to an already running .NET 4 process Choosing each option provides a varying set of other variables/options that one can set including options such as application arguments, operating path, record I/O performance performance counters to record (43 counters in all!), etc…  All in all, they give you the ability to profile many different .Net project types, and make it simple to do so.  In most cases of my using this application, I would be using the built in Visual Studio extensions, as they automatically start a new profiling project in ANTS with the options setup, and start your program, however RedGate has made it easy enough to profile outside of Visual Studio as well. On the flip side of this, as someone who lives most of their work life in Visual Studio, one thing I do wish is that instead of opening an entirely separate application/gui to perform profiling after launching, that instead they would provide a Visual Studio panel with the information, and integrate more of the profiling project information into Visual Studio.  So, now that we have an idea of what options that the profiler gives us, its time to test its abilities and features. Horrendous Example Code – Prime Number Generator One of my interests besides development, is Physics and Math – what I went to college for.  I have especially always been interested in prime numbers, as they are something of a mystery…  So, I decided that I would go ahead and to test the abilities of the profiler, I would write a small program, website, and library to generate prime numbers in the quantity that you ask for.  I am going to start off with some terrible code, and show how I would see the profiler being used as a development tool. First off, the IPrimes interface (all code is downloadable at the end of the post): interface IPrimes { IEnumerable<int> GetPrimes(int retrieve); } Simple enough, right?  Anything that implements the interface will (hopefully) provide an IEnumerable of int, with the quantity specified in the parameter argument.  Next, I am going to implement this interface in the most basic way: public class DumbPrimes : IPrimes { public IEnumerable<int> GetPrimes(int retrieve) { //store a list of primes already found var _foundPrimes = new List<int>() { 2, 3 }; //if i ask for 1 or two primes, return what asked for if (retrieve <= _foundPrimes.Count()) return _foundPrimes.Take(retrieve); //the next number to look at int _analyzing = 4; //since I already determined I don't have enough //execute at least once, and until quantity is sufficed do { //assume prime until otherwise determined bool isPrime = true; //start dividing at 2 //divide until number is reached, or determined not prime for (int i = 2; i < _analyzing && isPrime; i++) { //if (i) goes into _analyzing without a remainder, //_analyzing is NOT prime if (_analyzing % i == 0) isPrime = false; } //if it is prime, add to found list if (isPrime) _foundPrimes.Add(_analyzing); //increment number to analyze next _analyzing++; } while (_foundPrimes.Count() < retrieve); return _foundPrimes; } } This is the simplest way to get primes in my opinion.  Checking each number by the straight definition of a prime – is it divisible by anything besides 1 and itself. I have included this code in a base class library for my solution, as I am going to use it to demonstrate a couple of features of ANTS.  This class library is consumed by a simple non-MVVM WPF application, and a simple MVC4 website.  I will not post the WPF code here inline, as it is simply an ObservableCollection<int>, a label, two textbox’s, and a button. Starting a new Profiling Session So, in Visual Studio, I have just completed my first stint developing the GUI and DumbPrimes IPrimes class, so now I want to check my codes efficiency by profiling it.  All I have to do is build the solution (surprised initiating a profiling session doesn’t do this, but I suppose I can understand it), and then click the ANTS menu, followed by Profile Performance.  I am then greeted by the profiler starting up and already monitoring my program live: You are provided with a realtime graph at the top, and a pane at the bottom giving you information on how to proceed.  I am going to start by asking my program to show me the first 15000 primes: After the program finally began responding again (I did all the work on the main UI thread – how bad!), I stopped the profiler, which did kill the process of my program too.  One important thing to note, is that the profiler by default wants to give you a lot of detail about the operation – line hit counts, time per line, percent time per line, etc…  The important thing to remember is that this itself takes a lot of time.  When running my program without the profiler attached, it can generate the 15000 primes in 5.18 seconds, compared to 74.5 seconds – almost a 1500 percent increase.  While this may seem like a lot, remember that there is a trade off.  It may be WAY more inefficient, however, I am able to drill down and make improvements to specific problem areas, and then decrease execution time all around. Analyzing the Profiling Session After clicking ‘Stop Profiling’, the process running my application stopped, and the entire execution time was automatically selected by ANTS, and the results shown below: Now there are a number of interesting things going on here, I am going to cover each in a section of its own: Real Time Performance Counter Bar (top of screen) At the top of the screen, is the real time performance bar.  As your application is running, this will constantly update with the currently selected performance counters status.  A couple of cool things to note are the fact that you can drag a selection around specific time periods to drill down the detail views in the lower 2 panels to information pertaining to only that period. After selecting a time period, you can bookmark a section and name it, so that it is easy to find later, or after reloaded at a later time.  You can also zoom in, out, or fit the graph to the space provided – useful for drilling down. It may be hard to see, but at the top of the processor time graph below the time ticks, but above the red usage graph, there is a green bar. This bar shows at what times a method that is selected in the ‘Call tree’ panel is called. Very cool to be able to click on a method and see at what times it made an impact. As I said before, ANTS provides 43 different performance counters you can hook into.  Click the arrow next to the Performance tab at the top will allow you to change between different counters if you have them selected: Method Call Tree, ADO.Net Database Calls, File IO – Detail Panel Red Gate really hit the mark here I think. When you select a section of the run with the graph, the call tree populates to fill a hierarchical tree of method calls, with information regarding each of the methods.   By default, methods are hidden where the source is not provided (framework type code), however, Red Gate has integrated Reflector into ANTS, so even if you don’t have source for something, you can select a method and get the source if you want.  Methods are also hidden where the impact is seen as insignificant – methods that are only executed for 1% of the time of the overall calling methods time; in other words, working on making them better is not where your efforts should be focused. – Smart! Source Panel – Detail Panel The source panel is where you can see line level information on your code, showing the code for the currently selected method from the Method Call Tree.  If the code is not available, Reflector takes care of it and shows the code anyways! As you can notice, there does seem to be a problem with how ANTS determines what line is the actual line that a call is completed on.  I have suspicions that this may be due to some of the inline code optimizations that the CLR applies upon compilation of the assembly.  In a method with comments, the problem is much more severe: As you can see here, apparently the most offending code in my base library was a comment – *gasp*!  Removing the comments does help quite a bit, however I hope that Red Gate works on their counter algorithm soon to improve the logic on positioning for statistics: I did a small test just to demonstrate the lines are correct without comments. For me, it isn’t a deal breaker, as I can usually determine the correct placements by looking at the application code in the region and determining what makes sense, but it is something that would probably build up some irritation with time. Feature – Suggest Method for Optimization A neat feature to really help those in need of a pointer, is the menu option under tools to automatically suggest methods to optimize/improve: Nice feature – clicking it filters the call tree and stars methods that it thinks are good candidates for optimization.  I do wish that they would have made it more visible for those of use who aren’t great on sight: Process Integration I do think that this could have a place in my process.  After experimenting with the profiler, I do think it would be a great benefit to do some development, testing, and then after all the bugs are worked out, use the profiler to check on things to make sure nothing seems like it is hogging more than its fair share.  For example, with this program, I would have developed it, ran it, tested it – it works, but slowly. After looking at the profiler, and seeing the massive amount of time spent in 1 method, I might go ahead and try to re-implement IPrimes (I actually would probably rewrite the offending code, but so that I can distribute both sets of code easily, I’m just going to make another implementation of IPrimes).  Using two pieces of knowledge about prime numbers can make this method MUCH more efficient – prime numbers fall into two buckets 6k+/-1 , and a number is prime if it is not divisible by any other primes before it: public class SmartPrimes : IPrimes { public IEnumerable<int> GetPrimes(int retrieve) { //store a list of primes already found var _foundPrimes = new List<int>() { 2, 3 }; //if i ask for 1 or two primes, return what asked for if (retrieve <= _foundPrimes.Count()) return _foundPrimes.Take(retrieve); //the next number to look at int _k = 1; //since I already determined I don't have enough //execute at least once, and until quantity is sufficed do { //assume prime until otherwise determined bool isPrime = true; int potentialPrime; //analyze 6k-1 //assign the value to potential potentialPrime = 6 * _k - 1; //if there are any primes that divise this, it is NOT a prime number //using PLINQ for quick boost isPrime = !_foundPrimes.AsParallel() .Any(prime => potentialPrime % prime == 0); //if it is prime, add to found list if (isPrime) _foundPrimes.Add(potentialPrime); if (_foundPrimes.Count() == retrieve) break; //analyze 6k+1 //assign the value to potential potentialPrime = 6 * _k + 1; //if there are any primes that divise this, it is NOT a prime number //using PLINQ for quick boost isPrime = !_foundPrimes.AsParallel() .Any(prime => potentialPrime % prime == 0); //if it is prime, add to found list if (isPrime) _foundPrimes.Add(potentialPrime); //increment k to analyze next _k++; } while (_foundPrimes.Count() < retrieve); return _foundPrimes; } } Now there are definitely more things I can do to help make this more efficient, but for the scope of this example, I think this is fine (but still hideous)! Profiling this now yields a happy surprise 27 seconds to generate the 15000 primes with the profiler attached, and only 1.43 seconds without.  One important thing I wanted to call out though was the performance graph now: Notice anything odd?  The %Processor time is above 100%.  This is because there is now more than 1 core in the operation.  A better label for the chart in my mind would have been %Core time, but to each their own. Another odd thing I noticed was that the profiler seemed to be spot on this time in my DumbPrimes class with line details in source, even with comments..  Odd. Profiling Web Applications The last thing that I wanted to cover, that means a lot to me as a web developer, is the great amount of work that Red Gate put into the profiler when profiling web applications.  In my solution, I have a simple MVC4 application setup with 1 page, a single input form, that will output prime values as my WPF app did.  Launching the profiler from Visual Studio as before, nothing is really different in the profiler window, however I did receive a UAC prompt for a Red Gate helper app to integrate with the web server without notification. After requesting 500, 1000, 2000, and 5000 primes, and looking at the profiler session, things are slightly different from before: As you can see, there are 4 spikes of activity in the processor time graph, but there is also something new in the call tree: That’s right – ANTS will actually group method calls by get/post operations, so it is easier to find out what action/page is giving the largest problems…  Pretty cool in my mind! Overview Overall, I think that Red Gate ANTS CLR Profiler has a lot to offer, however I think it also has a long ways to go.  3 Biggest Pros: Ability to easily drill down from time graph, to method calls, to source code Wide variety of counters to choose from when profiling your application Excellent integration/grouping of methods being called from web applications by request – BRILLIANT! 3 Biggest Cons: Issue regarding line details in source view Nit pick – Processor time vs. Core time Nit pick – Lack of full integration with Visual Studio Ratings Ease of Use (7/10) – I marked down here because of the problems with the line level details and the extra work that that entails, and the lack of better integration with Visual Studio. Effectiveness (10/10) – I believe that the profiler does EXACTLY what it purports to do.  Especially with its large variety of performance counters, a definite plus! Features (9/10) – Besides the real time performance monitoring, and the drill downs that I’ve shown here, ANTS also has great integration with ADO.Net, with the ability to show database queries run by your application in the profiler.  This, with the line level details, the web request grouping, reflector integration, and various options to customize your profiling session I think create a great set of features! Customer Service (10/10) – My entire experience with Red Gate personnel has been nothing but good.  their people are friendly, helpful, and happy! UI / UX (8/10) – The interface is very easy to get around, and all of the options are easy to find.  With a little bit of poking around, you’ll be optimizing Hello World in no time flat! Overall (8/10) – Overall, I am happy with the Performance Profiler and its features, as well as with the service I received when working with the Red Gate personnel.  I WOULD recommend you trying the application and seeing if it would fit into your process, BUT, remember there are still some kinks in it to hopefully be worked out. My next post will definitely be shorter (hopefully), but thank you for reading up to here, or skipping ahead!  Please, if you do try the product, drop me a message and let me know what you think!  I would love to hear any opinions you may have on the product. Code Feel free to download the code I used above – download via DropBox

    Read the article

  • Why AQTime slows execution even when profiling is not on, and can anything be done for it?

    - by Antti Suni
    Hi! In AQTime for Delphi, it boasts to be very fast to get to the trouble spots by using areas and triggers etc. But it seems to me, that especially if you have very much code in the areas to profile, then the execution slows down dramatically even when the profiling is NOT on. For example, if I want to profile a specific routine late in the program flow, but don't know what is called there, I'd think to put this routine only as a trigger and the initial status for threads as Off, and then choose "Full check by Routines/Lines". However, when I do this, the program execution slows down heavily already before the trigger routine has ever been hit. For example if the "preparation flow" takes around 5 minutes without AQTime, then when I run it with profiling disabled, it already has been running for 30 minutes and still goes even when I know the trigger has not yet even been reached. I know I can try to workaround this by reducing the amount of routines/lines profiled, but it is not really a good solution for me, since I'd like to profile all of them once I get to the actual trigger routine. Also another, often better workaround is to start the application without AQTime and then use Attach to Process after the "preparation flow" has finished, but this works well only when the execution pauses in GUI in the proper place or otherwise provides a suitable time frame for doing the attaching. In all cases this is not the case. Any comments on why this is so and is there anything else to do than just try to reduce the code from the areas or attach later to the process?

    Read the article

  • ANTS Memory Profiler 7.0 Review

    - by Michael B. McLaughlin
    (This is my first review as a part of the GeeksWithBlogs.net Influencers program. It’s a program in which I (and the others who have been selected for it) get the opportunity to check out new products and services and write reviews about them. We don’t get paid for this, but we do generally get to keep a copy of the software or retain an account for some period of time on the service that we review. In this case I received a copy of Red Gate Software’s ANTS Memory Profiler 7.0, which was released in January. I don’t have any upgrade rights nor is my review guided, restrained, influenced, or otherwise controlled by Red Gate or anyone else. But I do get to keep the software license. I will always be clear about what I received whenever I do a review – I leave it up to you to decide whether you believe I can be objective. I believe I can be. If I used something and really didn’t like it, keeping a copy of it wouldn’t be worth anything to me. In that case though, I would simply uninstall/deactivate/whatever the software or service and tell the company what I didn’t like about it so they could (hopefully) make it better in the future. I don’t think it’d be polite to write up a terrible review, nor do I think it would be a particularly good use of my time. There are people who get paid for a living to review things, so I leave it to them to tell you what they think is bad and why. I’ll only spend my time telling you about things I think are good.) Overview of Common .NET Memory Problems When coming to land of managed memory from the wilds of unmanaged code, it’s easy to say to one’s self, “Wow! Now I never have to worry about memory problems again!” But this simply isn’t true. Managed code environments, such as .NET, make many, many things easier. You will never have to worry about memory corruption due to a bad pointer, for example (unless you’re working with unsafe code, of course). But managed code has its own set of memory concerns. For example, failing to unsubscribe from events when you are done with them leaves the publisher of an event with a reference to the subscriber. If you eliminate all your own references to the subscriber, then that memory is effectively lost since the GC won’t delete it because of the publishing object’s reference. When the publishing object itself becomes subject to garbage collection then you’ll get that memory back finally, but that could take a very long time depending of the life of the publisher. Another common source of resource leaks is failing to properly release unmanaged resources. When writing a class that contains members that hold unmanaged resources (e.g. any of the Stream-derived classes, IsolatedStorageFile, most classes ending in “Reader” or “Writer”), you should always implement IDisposable, making sure to use a properly written Dispose method. And when you are using an instance of a class that implements IDisposable, you should always make sure to use a 'using' statement in order to ensure that the object’s unmanaged resources are disposed of properly. (A ‘using’ statement is a nicer, cleaner looking, and easier to use version of a try-finally block. The compiler actually translates it as though it were a try-finally block. Note that Code Analysis warning 2202 (CA2202) will often be triggered by nested using blocks. A properly written dispose method ensures that it only runs once such that calling dispose multiple times should not be a problem. Nonetheless, CA2202 exists and if you want to avoid triggering it then you should write your code such that only the innermost IDisposable object uses a ‘using’ statement, with any outer code making use of appropriate try-finally blocks instead). Then, of course, there are situations where you are operating in a memory-constrained environment or else you want to limit or even eliminate allocations within a certain part of your program (e.g. within the main game loop of an XNA game) in order to avoid having the GC run. On the Xbox 360 and Windows Phone 7, for example, for every 1 MB of heap allocations you make, the GC runs; the added time of a GC collection can cause a game to drop frames or run slowly thereby making it look bad. Eliminating allocations (or else minimizing them and calling an explicit Collect at an appropriate time) is a common way of avoiding this (the other way is to simplify your heap so that the GC’s latency is low enough not to cause performance issues). ANTS Memory Profiler 7.0 When the opportunity to review Red Gate’s recently released ANTS Memory Profiler 7.0 arose, I jumped at it. In order to review it, I was given a free copy (which does not include upgrade rights for future versions) which I am allowed to keep. For those of you who are familiar with ANTS Memory Profiler, you can find a list of new features and enhancements here. If you are an experienced .NET developer who is familiar with .NET memory management issues, ANTS Memory Profiler is great. More importantly still, if you are new to .NET development or you have no experience or limited experience with memory profiling, ANTS Memory Profiler is awesome. From the very beginning, it guides you through the process of memory profiling. If you’re experienced and just want dive in however, it doesn’t get in your way. The help items GAHSFLASHDAJLDJA are well designed and located right next to the UI controls so that they are easy to find without being intrusive. When you first launch it, it presents you with a “Getting Started” screen that contains links to “Memory profiling video tutorials”, “Strategies for memory profiling”, and the “ANTS Memory Profiler forum”. I’m normally the kind of person who looks at a screen like that only to find the “Don’t show this again” checkbox. Since I was doing a review, though, I decided I should examine them. I was pleasantly surprised. The overview video clocks in at three minutes and fifty seconds. It begins by showing you how to get started profiling an application. It explains that profiling is done by taking memory snapshots periodically while your program is running and then comparing them. ANTS Memory Profiler (I’m just going to call it “ANTS MP” from here) analyzes these snapshots in the background while your application is running. It briefly mentions a new feature in Version 7, a new API that give you the ability to trigger snapshots from within your application’s source code (more about this below). You can also, and this is the more common way you would do it, take a memory snapshot at any time from within the ANTS MP window by clicking the “Take Memory Snapshot” button in the upper right corner. The overview video goes on to demonstrate a basic profiling session on an application that pulls information from a database and displays it. It shows how to switch which snapshots you are comparing, explains the different sections of the Summary view and what they are showing, and proceeds to show you how to investigate memory problems using the “Instance Categorizer” to track the path from an object (or set of objects) to the GC’s root in order to find what things along the path are holding a reference to it/them. For a set of objects, you can then click on it and get the “Instance List” view. This displays all of the individual objects (including their individual sizes, values, etc.) of that type which share the same path to the GC root. You can then click on one of the objects to generate an “Instance Retention Graph” view. This lets you track directly up to see the reference chain for that individual object. In the overview video, it turned out that there was an event handler which was holding on to a reference, thereby keeping a large number of strings that should have been freed in memory. Lastly the video shows the “Class List” view, which lets you dig in deeply to find problems that might not have been clear when following the previous workflow. Once you have at least one memory snapshot you can begin analyzing. The main interface is in the “Analysis” tab. You can also switch to the “Session Overview” tab, which gives you several bar charts highlighting basic memory data about the snapshots you’ve taken. If you hover over the individual bars (and the individual colors in bars that have more than one), you will see a detailed text description of what the bar is representing visually. The Session Overview is good for a quick summary of memory usage and information about the different heaps. You are going to spend most of your time in the Analysis tab, but it’s good to remember that the Session Overview is there to give you some quick feedback on basic memory usage stats. As described above in the summary of the overview video, there is a certain natural workflow to the Analysis tab. You’ll spin up your application and take some snapshots at various times such as before and after clicking a button to open a window or before and after closing a window. Taking these snapshots lets you examine what is happening with memory. You would normally expect that a lot of memory would be freed up when closing a window or exiting a document. By taking snapshots before and after performing an action like that you can see whether or not the memory is really being freed. If you already know an area that’s giving you trouble, you can run your application just like normal until just before getting to that part and then you can take a few strategic snapshots that should help you pin down the problem. Something the overview didn’t go into is how to use the “Filters” section at the bottom of ANTS MP together with the Class List view in order to narrow things down. The video tutorials page has a nice 3 minute intro video called “How to use the filters”. It’s a nice introduction and covers some of the basics. I’m going to cover a bit more because I think they’re a really neat, really helpful feature. Large programs can bring up thousands of classes. Even simple programs can instantiate far more classes than you might realize. In a basic .NET 4 WPF application for example (and when I say basic, I mean just MainWindow.xaml with a button added to it), the unfiltered Class List view will have in excess of 1000 classes (my simple test app had anywhere from 1066 to 1148 classes depending on which snapshot I was using as the “Current” snapshot). This is amazing in some ways as it shows you how in stark detail just how immensely powerful the WPF framework is. But hunting through 1100 classes isn’t productive, no matter how cool it is that there are that many classes instantiated and doing all sorts of awesome things. Let’s say you wanted to examine just the classes your application contains source code for (in my simple example, that would be the MainWindow and App). Under “Basic Filters”, click on “Classes with source” under “Show only…”. Voilà. Down from 1070 classes in the snapshot I was using as “Current” to 2 classes. If you then click on a class’s name, it will show you (to the right of the class name) two little icon buttons. Hover over them and you will see that you can click one to view the Instance Categorizer for the class and another to view the Instance List for the class. You can also show classes based on which heap they live on. If you chose both a Baseline snapshot and a Current snapshot then you can use the “Comparing snapshots” filters to show only: “New objects”; “Surviving objects”; “Survivors in growing classes”; or “Zombie objects” (if you aren’t sure what one of these means, you can click the helpful “?” in a green circle icon to bring up a popup that explains them and provides context). Remember that your selection(s) under the “Show only…” heading will still apply, so you should update those selections to make sure you are seeing the view you want. There are also links under the “What is my memory problem?” heading that can help you diagnose the problems you are seeing including one for “I don’t know which kind I have” for situations where you know generally that your application has some problems but aren’t sure what the behavior you have been seeing (OutOfMemoryExceptions, continually growing memory usage, larger memory use than expected at certain points in the program). The Basic Filters are not the only filters there are. “Filter by Object Type” gives you the ability to filter by: “Objects that are disposable”; “Objects that are/are not disposed”; “Objects that are/are not GC roots” (GC roots are things like static variables); and “Objects that implement _______”. “Objects that implement” is particularly neat. Once you check the box, you can then add one or more classes and interfaces that an object must implement in order to survive the filtering. Lastly there is “Filter by Reference”, which gives you the option to pare down the list based on whether an object is “Kept in memory exclusively by” a particular item, a class/interface, or a namespace; whether an object is “Referenced by” one or more of those choices; and whether an object is “Never referenced by” one or more of those choices. Remember that filtering is cumulative, so anything you had set in one of the filter sections still remains in effect unless and until you go back and change it. There’s quite a bit more to ANTS MP – it’s a very full featured product – but I think I touched on all of the most significant pieces. You can use it to debug: a .NET executable; an ASP.NET web application (running on IIS); an ASP.NET web application (running on Visual Studio’s built-in web development server); a Silverlight 4 browser application; a Windows service; a COM+ server; and even something called an XBAP (local XAML browser application). You can also attach to a .NET 4 process to profile an application that’s already running. The startup screen also has a large number of “Charting Options” that let you adjust which statistics ANTS MP should collect. The default selection is a good, minimal set. It’s worth your time to browse through the charting options to examine other statistics that may also help you diagnose a particular problem. The more statistics ANTS MP collects, the longer it will take to collect statistics. So just turning everything on is probably a bad idea. But the option to selectively add in additional performance counters from the extensive list could be a very helpful thing for your memory profiling as it lets you see additional data that might provide clues about a particular problem that has been bothering you. ANTS MP integrates very nicely with all versions of Visual Studio that support plugins (i.e. all of the non-Express versions). Just note that if you choose “Profile Memory” from the “ANTS” menu that it will launch profiling for whichever project you have set as the Startup project. One quick tip from my experience so far using ANTS MP: if you want to properly understand your memory usage in an application you’ve written, first create an “empty” version of the type of project you are going to profile (a WPF application, an XNA game, etc.) and do a quick profiling session on that so that you know the baseline memory usage of the framework itself. By “empty” I mean just create a new project of that type in Visual Studio then compile it and run it with profiling – don’t do anything special or add in anything (except perhaps for any external libraries you’re planning to use). The first thing I tried ANTS MP out on was a demo XNA project of an editor that I’ve been working on for quite some time that involves a custom extension to XNA’s content pipeline. The first time I ran it and saw the unmanaged memory usage I was convinced I had some horrible bug that was creating extra copies of texture data (the demo project didn’t have a lot of texture data so when I saw a lot of unmanaged memory I instantly figured I was doing something wrong). Then I thought to run an empty project through and when I saw that the amount of unmanaged memory was virtually identical, it dawned on me that the CLR itself sits in unmanaged memory and that (thankfully) there was nothing wrong with my code! Quite a relief. Earlier, when discussing the overview video, I mentioned the API that lets you take snapshots from within your application. I gave it a quick trial and it’s very easy to integrate and make use of and is a really nice addition (especially for projects where you want to know what, if any, allocations there are in a specific, complicated section of code). The only concern I had was that if I hadn’t watched the overview video I might never have known it existed. Even then it took me five minutes of hunting around Red Gate’s website before I found the “Taking snapshots from your code" article that explains what DLL you need to add as a reference and what method of what class you should call in order to take an automatic snapshot (including the helpful warning to wrap it in a try-catch block since, under certain circumstances, it can raise an exception, such as trying to call it more than 5 times in 30 seconds. The difficulty in discovering and then finding information about the automatic snapshots API was one thing I thought could use improvement. Another thing I think would make it even better would be local copies of the webpages it links to. Although I’m generally always connected to the internet, I imagine there are more than a few developers who aren’t or who are behind very restrictive firewalls. For them (and for me, too, if my internet connection happens to be down), it would be nice to have those documents installed locally or to have the option to download an additional “documentation” package that would add local copies. Another thing that I wish could be easier to manage is the Filters area. Finding and setting individual filters is very easy as is understanding what those filter do. And breaking it up into three sections (basic, by object, and by reference) makes sense. But I could easily see myself running a long profiling session and forgetting that I had set some filter a long while earlier in a different filter section and then spending quite a bit of time trying to figure out why some problem that was clearly visible in the data wasn’t showing up in, e.g. the instance list before remembering to check all the filters for that one setting that was only culling a few things from view. Some sort of indicator icon next to the filter section names that appears you have at least one filter set in that area would be a nice visual clue to remind me that “oh yeah, I told it to only show objects on the Gen 2 heap! That’s why I’m not seeing those instances of the SuperMagic class!” Something that would be nice (but that Red Gate cannot really do anything about) would be if this could be used in Windows Phone 7 development. If Microsoft and Red Gate could work together to make this happen (even if just on the WP7 emulator), that would be amazing. Especially given the memory constraints that apps and games running on mobile devices need to work within, a good memory profiler would be a phenomenally helpful tool. If anyone at Microsoft reads this, it’d be really great if you could make something like that happen. Perhaps even a (subsidized) custom version just for WP7 development. (For XNA games, of course, you can create a Windows version of the game and use ANTS MP on the Windows version in order to get a better picture of your memory situation. For Silverlight on WP7, though, there’s quite a bit of educated guess work and WeakReference creation followed by forced collections in order to find the source of a memory problem.) The only other thing I found myself wanting was a “Back” button. Between my Windows Phone 7, Zune, and other things, I’ve grown very used to having a “back stack” that lets me just navigate back to where I came from. The ANTS MP interface is surprisingly easy to use given how much it lets you do, and once you start using it for any amount of time, you learn all of the different areas such that you know where to go. And it does remember the state of the areas you were previously in, of course. So if you go to, e.g., the Instance Retention Graph from the Class List and then return back to the Class List, it will remember which class you had selected and all that other state information. Still, a “Back” button would be a welcome addition to a future release. Bottom Line ANTS Memory Profiler is not an inexpensive tool. But my time is valuable. I can easily see ANTS MP saving me enough time tracking down memory problems to justify it on a cost basis. More importantly to me, knowing what is happening memory-wise in my programs and having the confidence that my code doesn’t have any hidden time bombs in it that will cause it to OOM if I leave it running for longer than I do when I spin it up real quickly for debugging or just to see how a new feature looks and feels is a good feeling. It’s a feeling that I like having and want to continue to have. I got the current version for free in order to review it. Having done so, I’ve now added it to my must-have tools and will gladly lay out the money for the next version when it comes out. It has a 14 day free trial, so if you aren’t sure if it’s right for you or if you think it seems interesting but aren’t really sure if it’s worth shelling out the money for it, give it a try.

    Read the article

  • How to find and fix performance problems in ORM powered applications

    - by FransBouma
    Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application.  Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter.  As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.  

    Read the article

  • Python Profiling not installed in Ubuntu? How do I get it in a virtualenv and without apt-get?

    - by interstar
    According to the Python documentation, the "profile" module is part of the standard library. But I can't find it. On my home machine, I was able to add it using apt-get install. (ie. it's split out into a separate ubuntu package.) On my work machine, (also ubuntu) I'm running in a virtualenv, so apt-get install isn't relevant. I can install python modules from pypi using easy-install, but I can't see anything on pypi which corresponds to the profiling module. (Presumably because it's meant to be part of the standard python install.) So how can I install it in this environment?

    Read the article

  • Revisiting ANTS Performance Profiler 7.4

    - by James Michael Hare
    Last year, I did a small review on the ANTS Performance Profiler 6.3, now that it’s a year later and a major version number higher, I thought I’d revisit the review and revise my last post. This post will take the same examples as the original post and update them to show what’s new in version 7.4 of the profiler. Background A performance profiler’s main job is to keep track of how much time is typically spent in each unit of code. This helps when we have a program that is not running at the performance we expect, and we want to know where the program is experiencing issues. There are many profilers out there of varying capabilities. Red Gate’s typically seem to be the very easy to “jump in” and get started with very little training required. So let’s dig into the Performance Profiler. I’ve constructed a very crude program with some obvious inefficiencies. It’s a simple program that generates random order numbers (or really could be any unique identifier), adds it to a list, sorts the list, then finds the max and min number in the list. Ignore the fact it’s very contrived and obviously inefficient, we just want to use it as an example to show off the tool: 1: // our test program 2: public static class Program 3: { 4: // the number of iterations to perform 5: private static int _iterations = 1000000; 6: 7: // The main method that controls it all 8: public static void Main() 9: { 10: var list = new List<string>(); 11: 12: for (int i = 0; i < _iterations; i++) 13: { 14: var x = GetNextId(); 15: 16: AddToList(list, x); 17: 18: var highLow = GetHighLow(list); 19: 20: if ((i % 1000) == 0) 21: { 22: Console.WriteLine("{0} - High: {1}, Low: {2}", i, highLow.Item1, highLow.Item2); 23: Console.Out.Flush(); 24: } 25: } 26: } 27: 28: // gets the next order id to process (random for us) 29: public static string GetNextId() 30: { 31: var random = new Random(); 32: var num = random.Next(1000000, 9999999); 33: return num.ToString(); 34: } 35: 36: // add it to our list - very inefficiently! 37: public static void AddToList(List<string> list, string item) 38: { 39: list.Add(item); 40: list.Sort(); 41: } 42: 43: // get high and low of order id range - very inefficiently! 44: public static Tuple<int,int> GetHighLow(List<string> list) 45: { 46: return Tuple.Create(list.Max(s => Convert.ToInt32(s)), list.Min(s => Convert.ToInt32(s))); 47: } 48: } So let’s run it through the profiler and see what happens! Visual Studio Integration First, let’s look at how the ANTS profilers integrate with Visual Studio’s menu system. Once you install the ANTS profilers, you will get an ANTS menu item with several options: Notice that you can either Profile Performance or Launch ANTS Performance Profiler. These sound similar but achieve two slightly different actions: Profile Performance: this immediately launches the profiler with all defaults selected to profile the active project in Visual Studio. Launch ANTS Performance Profiler: this launches the profiler much the same way as starting it from the Start Menu. The profiler will pre-populate the application and path information, but allow you to change the settings before beginning the profile run. So really, the main difference is that Profile Performance immediately begins profiling with the default selections, where Launch ANTS Performance Profiler allows you to change the defaults and attach to an already-running application. Let’s Fire it Up! So when you fire up ANTS either via Start Menu or Launch ANTS Performance Profiler menu in Visual Studio, you are presented with a very simple dialog to get you started: Notice you can choose from many different options for application type. You can profile executables, services, web applications, or just attach to a running process. In fact, in version 7.4 we see two new options added: ASP.NET Web Application (IIS Express) SharePoint web application (IIS) So this gives us an additional way to profile ASP.NET applications and the ability to profile SharePoint applications as well. You can also choose your level of detail in the Profiling Mode drop down. If you choose Line-Level and method-level timings detail, you will get a lot more detail on the method durations, but this will also slow down profiling somewhat. If you really need the profiler to be as unintrusive as possible, you can change it to Sample method-level timings. This is performing very light profiling, where basically the profiler collects timings of a method by examining the call-stack at given intervals. Which method you choose depends a lot on how much detail you need to find the issue and how sensitive your program issues are to timing. So for our example, let’s just go with the line and method timing detail. So, we check that all the options are correct (if you launch from VS2010, the executable and path are filled in already), and fire it up by clicking the [Start Profiling] button. Profiling the Application Once you start profiling the application, you will see a real-time graph of CPU usage that will indicate how much your application is using the CPU(s) on your system. During this time, you can select segments of the graph and bookmark them, giving them mnemonic names. This can be useful if you want to compare performance in one part of the run to another part of the run. Notice that once you select a block, it will give you the call tree breakdown for that selection only, and the relative performance of those calls. Once you feel you have collected enough information, you can click [Stop Profiling] to stop the application run and information collection and begin a more thorough analysis. Analyzing Method Timings So now that we’ve halted the run, we can look around the GUI and see what we can see. By default, the times are shown in terms of percentage of time of the total run of the application, though you can change it in the View menu item to milliseconds, ticks, or seconds as well. This won’t affect the percentages of methods, it only affects what units the times are shown. Notice also that the major hotspot seems to be in a method without source, ANTS Profiler will filter these out by default, but you can right-click on the line and remove the filter to see more detail. This proves especially handy when a bottleneck is due to a method in the BCL. So now that we’ve removed the filter, we see a bit more detail: In addition, ANTS Performance Profiler gives you the ability to decompile the methods without source so that you can dive even deeper, though typically this isn’t necessary for our purposes. When looking at timings, there are generally two types of timings for each method call: Time: This is the time spent ONLY in this method, not including calls this method makes to other methods. Time With Children: This is the total of time spent in both this method AND including calls this method makes to other methods. In other words, the Time tells you how much work is being done exclusively in this method, and the Time With Children tells you how much work is being done inclusively in this method and everything it calls. You can also choose to display the methods in a tree or in a grid. The tree view is the default and it shows the method calls arranged in terms of the tree representing all method calls and the parent method that called them, etc. This is useful for when you find a hot-spot method, you can see who is calling it to determine if the problem is the method itself, or if it is being called too many times. The grid method represents each method only once with its totals and is useful for quickly seeing what method is the trouble spot. In addition, you can choose to display Methods with source which are generally the methods you wrote (as opposed to native or BCL code), or Any Method which shows not only your methods, but also native calls, JIT overhead, synchronization waits, etc. So these are just two ways of viewing the same data, and you’re free to choose the organization that best suits what information you are after. Analyzing Method Source If we look at the timings above, we see that our AddToList() method (and in particular, it’s call to the List<T>.Sort() method in the BCL) is the hot-spot in this analysis. If ANTS sees a method that is consuming the most time, it will flag it as a hot-spot to help call out potential areas of concern. This doesn’t mean the other statistics aren’t meaningful, but that the hot-spot is most likely going to be your biggest bang-for-the-buck to concentrate on. So let’s select the AddToList() method, and see what it shows in the source window below: Notice the source breakout in the bottom pane when you select a method (from either tree or grid view). This shows you the timings in this method per line of code. This gives you a major indicator of where the trouble-spot in this method is. So in this case, we see that performing a Sort() on the List<T> after every Add() is killing our performance! Of course, this was a very contrived, duh moment, but you’d be surprised how many performance issues become duh moments. Note that this one line is taking up 86% of the execution time of this application! If we eliminate this bottleneck, we should see drastic improvement in the performance. So to fix this, if we still wanted to maintain the List<T> we’d have many options, including: delay Sort() until after all Add() methods, using a SortedSet, SortedList, or SortedDictionary depending on which is most appropriate, or forgoing the sorting all together and using a Dictionary. Rinse, Repeat! So let’s just change all instances of List<string> to SortedSet<string> and run this again through the profiler: Now we see the AddToList() method is no longer our hot-spot, but now the Max() and Min() calls are! This is good because we’ve eliminated one hot-spot and now we can try to correct this one as well. As before, we can then optimize this part of the code (possibly by taking advantage of the fact the list is now sorted and returning the first and last elements). We can then rinse and repeat this process until we have eliminated as many bottlenecks as possible. Calls by Web Request Another feature that was added recently is the ability to view .NET methods grouped by the HTTP requests that caused them to run. This can be helpful in determining which pages, web services, etc. are causing hot spots in your web applications. Summary If you like the other ANTS tools, you’ll like the ANTS Performance Profiler as well. It is extremely easy to use with very little product knowledge required to get up and running. There are profilers built into the higher product lines of Visual Studio, of course, which are also powerful and easy to use. But for quickly jumping in and finding hot spots rapidly, Red Gate’s Performance Profiler 7.4 is an excellent choice. Technorati Tags: Influencers,ANTS,Performance Profiler,Profiler

    Read the article

  • Java ME SDK 3.2 is now live

    - by SungmoonCho
    Hi everyone, It has been a while since we released the last version. We have been very busy integrating new features and making lots of usability improvements into this new version. Datasheet is available here. Please visit Java ME SDK 3.2 download page to get the latest and best version yet! Some of the new features in this version are described below. Embedded Application SupportOracle Java ME SDK 3.2 now supports the new Oracle® Java ME Embedded. This includes support for JSR 228, the Information Module Profile-Next Generation API (IMP-NG). You can test and debug applications either on the built-in device emulators or on your device. Memory MonitorThe Memory Monitor shows memory use as an application runs. It displays a dynamic detailed listing of the memory usage per object in table form, and a graphical representation of the memory use over time. Eclipse IDE supportOracle Java ME SDK 3.2 now officially supports Eclipse IDE. Once you install the Java ME SDK plugins on Eclipse, you can start developing, debugging, and profiling your mobile or embedded application. Skin CreatorWith the Custom Device Skin Creator, you can create your own skins. The appearance of the custom skins is generic, but the functionality can be tailored to your own specifications.  Here are the release highlights. Implementation and support for the new Oracle® Java Wireless Client 3.2 runtime and the Oracle® Java ME Embedded runtime. The AMS in the CLDC emulators has a new look and new functionality (Install Application, Manage Certificate Authorities and Output Console). Support for JSR 228, the Information Module Profile-Next Generation API (IMP-NG). The IMP-NG platform is implemented as a subset of CLDC. Support includes: A new emulator for headless devices. Javadocs for the following Oracle APIs: Device Access API, Logging API, AMS API, and AccessPoint API. New demos for IMP-NG features can be run on the emulator or on a real device running the Oracle® Java ME Embedded runtime. New Custom Device Skin Creator. This tool provides a way to create and manage custom emulator skins. The skin appearance is generic, but the functionality, such as the JSRs supported or the device properties, are up to you. This utility only supported in NetBeans. Eclipse plugin for CLDC/MIDP. For the first time Oracle Java ME SDK is available as an Eclipse plugin. The Eclipse version does not support CDC, the Memory Monitor, and the Custom Device Skin Creator in this release. All Java ME tools are implemented as NetBeans plugins. As of the plugin integrates Java ME utilities into the standard NetBeans menus. Tools > Java ME menu is the place to launch Java ME utilities, including the new Skin Creator. Profile > Java ME is the place to work with the Network Monitor and the Memory Monitor. Use the standard NetBeans tools for debugging. Profiling, Network monitoring, and Memory monitoring are integrated with the NetBeans profiling tools. New network monitoring protocols are supported in this release: WMA, SIP, Bluetooth and OBEX, SATSA APDU and JCRMI, and server sockets. Java ME SDK Update Center. Oracle Java ME SDK can be updated or extended by new components. The Update Center can download, install, and uninstall plugins specific to the Java ME SDK. A plugin consists of runtime components and skins. Bug fixes and enhancements. This version comes with a few known problems. All of them have workarounds, so I hope you don't get stuck in these issues when you are using the product. It you cannot watch static variables during an Eclipse debugging session, and sometimes the Variable view cannot show data. In the source code, move the mouse over the required variable to inspect the variable value. A real device shown in the Device Selector is deleted from the Device Manager, yet it still appears. Kill the device manager in the system tray, and relaunch it. Then you will see the device removed from the list. On-device profiling does not work on a device. CPU profiling, networking monitoring, and memory monitoring do not work on the device, since the device runtime does not yet support it. Please do the profiling with your emulator first, and then test your application on the device. In the Device Selector, using Clean Database on real external device causes a null pointer exception. External devices do not have a database recognized by the SDK, so you can disregard this exception message. Suspending the Emulator during a Memory Monitor session hangs the emulator. Do not use the Suspend option (F5) while the Memory Monitor is running. If the emulator is hung, open the Windows task manager and stop the emulator process (javaw). To switch to another application while the Memory Monitor is running, choose Application > AMS Home (F4), and select a different application. Please let us know how we can improve it even better, by sending us your feedback. -Java ME SDK Team

    Read the article

  • java profiler in java

    - by Sunil
    Hello I want to run a java program on linux server in profiling mode means I want to profile a java program on linux server.Is there any java profiling software that can profile java program on linux server. please suggest me what can I do for profiling. Thanks in advance.

    Read the article

  • ANTS CLR and Memory Profiler In Depth Review (Part 2 of 2 &ndash; Memory Profiler)

    - by ToStringTheory
    One of the things that people might not know about me, is my obsession to make my code as efficient as possible. Many people might not realize how much of a task or undertaking that this might be, but it is surely a task as monumental as climbing Mount Everest, except this time it is a challenge for the mind… In trying to make code efficient, there are many different factors that play a part – size of project or solution, tiers, language used, experience and training of the programmer, technologies used, maintainability of the code – the list can go on for quite some time. I spend quite a bit of time when developing trying to determine what is the best way to implement a feature to accomplish the efficiency that I look to achieve. One program that I have recently come to learn about – Red Gate ANTS Performance (CLR) and Memory profiler gives me tools to accomplish that job more efficiently as well. In this review, I am going to cover some of the features of the ANTS memory profiler set by compiling some hideous example code to test against. Notice As a member of the Geeks With Blogs Influencers program, one of the perks is the ability to review products, in exchange for a free license to the program. I have not let this affect my opinions of the product in any way, and Red Gate nor Geeks With Blogs has tried to influence my opinion regarding this product in any way. Introduction – Part 2 In my last post, I reviewed the feature packed Red Gate ANTS Performance Profiler.  Separate from the Red Gate Performance Profiler is the Red Gate ANTS Memory Profiler – a simple, easy to use utility for checking how your application is handling memory management…  A tool that I wish I had had many times in the past.  This post will be focusing on the ANTS Memory Profiler and its tool set. The memory profiler has a large assortment of features just like the Performance Profiler, with the new session looking nearly exactly alike: ANTS Memory Profiler Memory profiling is not something that I have to do very often…  In the past, the few cases I’ve had to find a memory leak in an application I have usually just had to trace the code of the operations being performed to look for oddities…  Sadly, I have come across more undisposed/non-using’ed IDisposable objects, usually from ADO.Net than I would like to ever see.  Support is not fun, however using ANTS Memory Profiler makes this task easier.  For this round of testing, I am going to use the same code from my previous example, using the WPF application. This time, I will choose the ‘Profile Memory’ option from the ANTS menu in Visual Studio, which launches the solution in its currently configured state/start-up project, and then launches the ANTS Memory Profiler to help.  It prepopulates all of the fields with the current project information, and all I have to do is select the ‘Start Profiling’ option. When the window comes up, it is actually quite barren, just giving ideas on how to work the profiler.  You start by getting to the point in your application that you want to profile, and then taking a ‘Memory Snapshot’.  This performs a full garbage collection, and snapshots the managed heap.  Using the same WPF app as before, I will go ahead and take a snapshot now. As you can see, ANTS is already giving me lots of information regarding the snapshot, however this is just a snapshot.  The whole point of the profiler is to perform an action, usually one where a memory problem is being noticed, and then take another snapshot and perform a diff between them to see what has changed.  I am going to go ahead and generate 5000 primes, and then take another snapshot: As you can see, ANTS is already giving me a lot of new information about this snapshot compared to the last.  Information such as difference in memory usage, fragmentation, class usage, etc…  If you take more snapshots, you can use the dropdown at the top to set your actual comparison snapshots. If you beneath the timeline, you will see a breadcrumb trail showing how best to approach profiling memory using ANTS.  When you first do the comparison, you start on the Summary screen.  You can either use the charts at the bottom, or switch to the class list screen to get to the next step.  Here is the class list screen: As you can see, it lists information about all of the instances between the snapshots, as well as at the bottom giving you a way to filter by telling ANTS what your problem is.  I am going to go ahead and select the Int16[] to look at the Instance Categorizer Using the instance categorizer, you can travel backwards to see where all of the instances are coming from.  It may be hard to see in this image, but hopefully the lightbox (click on it) will help: I can see that all of these instances are rooted to the application through the UI TextBlock control.  This image will probably be even harder to see, however using the ‘Instance Retention Graph’, you can trace an objects memory inheritance up the chain to see its roots as well.  This is a simple example, as this is simply a known element.  Usually you would be profiling an actual problem, and comparing those differences.  I know in the past, I have spotted a problem where a new context was created per page load, and it was rooted into the application through an event.  As the application began to grow, performance and reliability problems started to emerge.  A tool like this would have been a great way to identify the problem quickly. Overview Overall, I think that the Red Gate ANTS Memory Profiler is a great utility for debugging those pesky leaks.  3 Biggest Pros: Easy to use interface with lots of options for configuring profiling session Intuitive and helpful interface for drilling down from summary, to instance, to root graphs ANTS provides an API for controlling the profiler. Not many options, but still helpful. 2 Biggest Cons: Inability to automatically snapshot the memory by interval Lack of complete integration with Visual Studio via an extension panel Ratings Ease of Use (9/10) – I really do believe that they have brought simplicity to the once difficult task of memory profiling.  I especially liked how it stepped you further into the drilldown by directing you towards the best options. Effectiveness (10/10) – I believe that the profiler does EXACTLY what it purports to do.  Features (7/10) – A really great set of features all around in the application, however, I would like to see some ability for automatically triggering snapshots based on intervals or framework level items such as events. Customer Service (10/10) – My entire experience with Red Gate personnel has been nothing but good.  their people are friendly, helpful, and happy! UI / UX (9/10) – The interface is very easy to get around, and all of the options are easy to find.  With a little bit of poking around, you’ll be optimizing Hello World in no time flat! Overall (9/10) – Overall, I am happy with the Memory Profiler and its features, as well as with the service I received when working with the Red Gate personnel.  Thank you for reading up to here, or skipping ahead – I told you it would be shorter!  Please, if you do try the product, drop me a message and let me know what you think!  I would love to hear any opinions you may have on the product. Code Feel free to download the code I used above – download via DropBox

    Read the article

  • VS2010 crashes when opening a vsp generated using VS 2012

    - by Tarun Arora
    I recently profiled some web applications using Visual Studio 2012, a vsp (Visual Studio Profile) file was generated as a result of the profiling session. I could successfully open the vsp file in Visual Studio 2012 as expected but when I tried to open the vsp file in Visual Studio 2010 the VS2010 IDE crashed. As a responsible citizen I raised bug # 762202 on Microsoft Connect site using the Microsoft Visual Studio 2012 Feedback Client. Note – In case you didn’t already know, VSP generated in Visual Studio 2012 is not backward compatible. Please refer below for the steps to reproduce the issue and the resolution of the connect bug. 1. Behaviour and Steps to Reproduce the Issue Description I have generated a vsp file by using the Visual Studio 2012 Standalone profiler. When I try and open the vsp file in Visual Studio 2010 the IDE crashes. I understand that a vsp generated by using VS 2012 cannot be opened in VS 2010, but the IDE crashing is not the behaviour I would expect to see. Steps to Reproduce the Issue 1. Pick up the Stand lone profiler from the VS 2012 installation media. The folder has both x 64 and x86 installer, since the machine I am using is x64 bit. I have installed the x64 version of the standalone profiler. 2. I have configured the system path by setting the 'environment variable' path to where the profiler is installed. In my case this is, C:\Program Files (x86)\Microsoft Visual Studio 11.0\Team Tools\Performance Tools 3. Created a new environment variable _NT_SYMBOL_PATH and set its value to CACHE*C:\SYMBOLSCACHE;SRV*C:\SYMBOLSCACHE*HTTP://MSDL.MICROSOFT.COM/DOWNLOAD/SYMBOLS;\\FOO\BUILD1234 4. Open up CMD as an administrator and run 'VSPerfASPNETCmd /tip http://localhost:56180/ /o:C:\Temp\SampleEISK.vsp' 5. This generates the following message on the cmd       Microsoft (R) VSPerf ASP.NET Command, Version 11.0.0.0     Copyright (C) Microsoft Corporation. All rights reserved.     Configuring and attaching to ASP.NET process. Please wait.     Setting up profiling environment.     Starting monitor.     Launching ASP.NET service.     Attaching Monitor to process.     Launching Internet Explorer.     The profiler is attached to ASP.net. Please run your application scenario now.     Press Enter to stop data collection...   6. I perform certain actions and then I come back to the cmd and hit enter to shut down the profiling. Once I do this, the following message is written to the cmd, Press Enter to stop data collection... Profiling now shut down. Report file "C:\Temp\SampleEISK.vsp" was generated. Running VsPerfReport, packing symbols into the .VSP. Shutting down profiling and restarting ASP.NET. Please wait. Restarting w3wp.exe.   7. I look in the C:\Temp folder and I can see the SampleEISK.vsp file generated. I can successfully open this file in Visual Studio 2012. 8. When I am trying to open the vsp file in VS 2010 the VS 2010 IDE crashes. Kaboooom! What I would expect to happen I expect to receive a message "VS 2010 does not support the vsp file generated by VS 2012". What actually happened The VS 2010 IDE crashed 2. Resolution This is a valid bug! However, there isn’t much value in releasing a hotfix for this issue. Refer below to the resolution provided by the Visual Studio Profiler Team.  Thank you for taking the time to report this issue. We completely agree that Visual Studio 2010 should not crash. However in this particular case this is not a bug we are going to retroactively release a fix to 2010 for at this point. Given that a fix would not unblock the scenario of opening a 2012 created file on Visual Studio 2010, and there is not an active update channel for Visual Studio 2010 other than manually locating and installing hot fixes, we will not be fixing this particular issue. Best Regards, Visual Studio Profiler Team   Though it would be great to improve the behaviour however, this is not a defect that would stop you from progressing in any way. It’s important to note however that VSP files generated by Visual Studio 2012 are not backward compatible so you should refrain from opening these files in Visual Studio 2010.

    Read the article

  • ANTS Memory Profiler 8 released!

    - by Ben Emmett
    I’m excited to say that we’ve just released ANTS Memory Profiler 8! The big news is support for profiling .NET’s usage of unmanaged memory. There are two main parts to this. Firstly you can see a breakdown of unmanaged memory usage by module. This lets you see at a high level where unmanaged memory is being used – for example in the image below, it’s being used by a PDF generation library. Separately, when looking at a list of .NET classes, you can see how much unmanaged memory those classes are responsible for holding on to. You can also see that information for individual instances of those classes. Some clues you might need this: You’re using system objects or 3rd party components which deal with unmanaged memory under the hood (this includes things like the GDI+ functions used for working with bitmaps) Your application still relies on some legacy Delphi / C++ / etc code from left over from the days before your company moved over to using .NET You’ve used a previous version of ANTS Memory Profiler, and have ever seen a pie chart that looks something like this: You’ll also notice that the startup process has been entirely redesigned, bringing it in line with ANTS Performance Profiler 8, which was released earlier in the year. This makes it faster to start profiling and to run repeat profiling sessions, lets you profile using any browser instead of Internet Explorer, and also provides a host of stability improvements, particularly when launching websites in IIS. Download the new version (there’s a free trial), and as always I’d love to know what you think – just email [email protected]. Cheers! Ben

    Read the article

  • Java ME SDK 3.0.5 is released!

    - by SungmoonCho
      Java ME SDK 3.0.5 went live! For many months, we have been working hard to fix bugs from previous version, and add a lot of new features demanded by Java ME community. You can download the new version from this link. Please see below for more information. NetBeans Integration All Java ME tools are implemented as NetBeans plugins. Device Manager Java ME SDK now supports multiple device managers. You can switch between different versions of device managers. LWUIT 1.5 Support The Resource Editor is available from the Java ME menu to help you design and organize resources for LWUIT applications. For a description of LWUIT 1.5 features, visit the LWUIT download page Network Monitor Integrated with NetBeans profiling tools, the Network Monitor now supports WMA, SIP, Bluetooth and OBEX, SATSA APDU and JCRMI, and server sockets. CPU Profiler Now uses standard NetBeans profiling facilities to view snapshots. Profiling of VM classes can also be toggled on or off. WURFL Device Database The database has been updated with more than 1000 new devices. Tracing - New tracing functionality now includes CLDC VM events, and monitors events such as exceptions, class loading, garbage collection, and methods invocation. New or updated JSR support - Includes support for JSR 234 (Advanced Multimedia Supplements), JSR 253 (Mobile Telephony API), JSR 257 (Contactless Communication API), JSR 258 (Mobile User Interface Customization API), and JSR 293 (XML API for Java ME).

    Read the article

  • ANTS Performance Profiler 7.0 has been released!

    - by Michaela Murray
    Please join me in welcoming ANTS Performance Profiler 7 to the world of .NET. ANTS Performance Profiler is a .NET code profiling tool. It lets you identify performance bottlenecks within minutes and therefore enables you to optimize your application performance. Version 7.0 includes integrated decompilation: when profiling methods and assemblies with no source code file, you can generate source code right from the profiler interface. You can then browse and navigate this automatically generated source as if it was your own. If you have an assembly's PDB file but no source, integrated decompilation even lets you view line-level timings for each method, pinpointing the exact cause of performance bottlenecks. Integrated decompilation is powered by .NET Reflector, but you don't need Reflector installed to use the functionality. Watch this video to see it in action. Also new in ANTS Performance Profiler 7.0: · Full support for SharePoint 2010 - No need to manually configure profiling for the latest version of SharePoint · Full support for IIS Express · Azure and Amazon EC2 support, enabling you to profile in the cloud Please click here, for more details about the ANTS Performance Profiler 7.0.

    Read the article

  • EDQ Technical Enablement for OPN (Prague - June 17-19)

    - by milomir.vojvodic
    Oracle Enterprise Data Quality (EDQ) Technical Enablement and Partner Training Trusted Data for Your Enterprise Applications Oracle Enterprise Data Quality helps organizations achieve maximum value from their business-critical applications by delivering fit-for-purpose data. These products also enable individuals and collaborative teams to quickly and easily identify and resolve any problems in underlying data. With Oracle Enterprise Data Quality, customers can identify new opportunities, improve operational efficiency, and more efficiently comply with industry or governmental regulation. Oracle Enterprise Data Quality is designed to serve as a very channel friendly platform to OPN.  This means that pre-built extensions, components and even complete business solutions can readily be built and shared.  This allows our customers/partners to be highly efficient in how they deploy custom business solutions, but also allows our partners to develop specialized components, domain knowledge and even complete business solutions. Training is suitable for: · Database administrators · Architects · Technical staff Objectives of the training: After completing this course, participants should: · Have an understanding of the core functionality of EDQ across profiling, auditing, transforming, parsing and matching data · Be able to describe some of the key capabilities and benefits delivered by EDQ · Be able to create and run standalone EDQ processes and jobs · Be ready to start working with data from customers and (with practice) be able to demonstrate EDQ to customers Agenda 17th June Fundamentals For Demoing (Profile, Audit, Transform and More) Profiling Auditing Transforming Writing and exporting data Jobs and scheduling Publishing, packaging and copying EDQ processes Introduction to the Customer Data Extension Pack Realtime Processing via Web Services The Server Console Run Profiles Data Interfaces Sampling Publishing metrics to the Dashboard Users and security 18th June Matching Matching overview Basic matching configuration Matching rule hierarchies Clustering Merging Reviewing possible matches Outputting Match Data Case study 19th June Address Verification Address Verification Overview Configuration Accuracy Flags Parsing Parsing Overview Phrase profiling Tailoring a CDEP Parser Base Tokenization Classification Reclassification Selection Resolution Register Here Don’t miss this FREE event. Space is limited. Oracle University V Parku 2294/4 148 00 Praha 4 17.6. – 19.6. 2014 09:00 a.m.– 17:30 p.m.

    Read the article

  • WPF Animation FPS vs. CPU usage - Am I expecting too much?

    - by Cory Charlton
    Working on a screen saver for my wife, http://cchearts.codeplex.com/, and while I've been able to improve FPS on lower end machines (switch from Path to StreamGeometry, use DrawingVisual instead of UserControl, etc) the CPU usage still seems very high. Here's some numbers I ran from a few 5 minute sampling periods: ~60FPS 35% average CPU on Core 2 Duo T7500 @ 2.2GHz, 3GB ram, NVIDIA Quadro NVS 140M (128MB), Vista [My dev laptop] ~40FPS 50% average CPU on Pentium D @ 3.4GHz, 1.5GB ram, Standard VGA Graphics Adapter (unknown), 2003 Server [A crappy desktop] I can understand the lower frame rate and higher CPU usage on the crappy desktop but it still seems pretty high and 35% on my dev laptop seems high as well. I'd really like to analyze the application to get more details but I'm having issues there as well so I'm wondering if I'm doing something wrong (never profiled WPF before). WPF Performance Suite: Process Launch Error Unable to attach to process: CCHearts.exe Do you want to kill it? This error message occurs when I click cancel after attempting launch. If I don't click cancel it sits there idle, I guess waiting to attach. Performance Explorer: Could not launch C:\Projects2\CC.Hearts\CC.Hearts\bin\Debug (USEVISUAL)\CCHearts.exe. Previous attempt to profile the application finished unsuccessfully. Please restart the application. Output Window from Performance: Profiling started. Profiling process ID 5360 (CCHearts). Process ID 5360 has exited. Data written to C:\Projects2\CC.Hearts\CCHearts100608.vsp. Profiling finished. PRF0025: No data was collected. Profiling complete. So I'm stuck wanting to improve performance but have no concrete way to determine where the bottleneck is. Have been relatively successful throwing darts at this point but I'm beyond that now :) PS: Screensaver is hosted at CodePlex if you want to look at the source and missed the link above. Edit: My RenderOptions darts... // NOTE: Grasping at straws here ;-) RenderOptions.SetBitmapScalingMode(newHeart, BitmapScalingMode.LowQuality); RenderOptions.SetCachingHint(newHeart, CachingHint.Cache); RenderOptions.SetEdgeMode(newHeart, EdgeMode.Aliased); I threw those a little while back and didn't see much difference (not sure if the bitmap scaling even comes into play). Really wish I could get profiling working to know where I should try to optimize. For now I assume there is some overhead in creating a new HeartVisual and the DrawingVisual contained inside. Maybe if I reset and reused the hearts (tossed them in a queue once they completed or something) I'd see an improvement. Shrug Throwing darts while blindfolder is always fun.

    Read the article

  • iPhone openGLES performance tuning

    - by genesys
    Hey there! I'm trying now for quite a while to optimize the framerate of my game without really making progress. I'm running on the newest iPhone SDK and have a iPhone 3G 3.1.2 device. I invoke arround 150 drawcalls, rendering about 1900 Triangles in total (all objects are textured using two texturelayers and multitexturing. most textures come from the same textureAtlasTexture stored in pvrtc 2bpp compressed texture). This renders on my phone at arround 30 fps, which appears to me to be way too low for only 1900 triangles. I tried many things to optimize the performance, including batching together the objects, transforming the vertices on the CPU and rendering them in a single drawcall. this yelds 8 drawcalls (as oposed to 150 drawcalls), but performance is about the same (fps drop to arround 26fps) I'm using 32byte vertices stored in an interleaved array (12bytes position, 12bytes normals, 8bytes uv). I'm rendering triangleLists and the vertices are ordered in TriStrip order. I did some profiling but I don't really know how to interprete it. instruments-sampling using Instruments and Sampling yelds this result: http://neo.cycovery.com/instruments_sampling.gif telling me that a lot of time is spent in "mach_msg_trap". I googled for it and it seems this function is called in order to wait for some other things. But wait for what?? instruments-openGL instruments with the openGL module yelds this result: http://neo.cycovery.com/intstruments_openglES_debug.gif but here i have really no idea what those numbers are telling me shark profiling: profiling with shark didn't tell me much either: http://neo.cycovery.com/shark_profile_release.gif the largest number is 10%, spent by DrawTriangles - and the whole rest is spent in very small percentage functions Can anyone tell me what else I could do in order to figure out the bottleneck and could help me to interprete those profiling information? Thanks a lot!

    Read the article

  • Leak caused by fread

    - by Jack
    I'm profiling code of a game I wrote and I'm wondering how it is possible that the following snippet causes an heap increase of 4kb (I'm profiling with Heapshot Analysis of Xcode) every time it is executed: u8 WorldManager::versionOfMap(FILE *file) { char magic[4]; u8 version; fread(magic, 4, 1, file); <-- this is the line fread(&version,1,1,file); fseek(file, 0, SEEK_SET); return version; } According to the profiler the highlighted line allocates 4.00Kb of memory with a malloc every time the function is called, memory which is never released. This thing seems to happen with other calls to fread around the code, but this was the most eclatant one. Is there anything trivial I'm missing? Is it something internal I shouldn't care about? Just as a note: I'm profiling it on an iPhone and it's compiled as release (-O2).

    Read the article

  • What&rsquo;s new in VS.10 &amp; TFS.10?

    - by johndoucette
    Getting my geek on… I have decided to call the products VS.10 (Visual Studio 2010), TP.10 (Test Professional 2010),  and TFS.10 (Team Foundation Server 2010) Thanks Neno Loje. What's new in Visual Studio & Team Foundation Server 2010? Focusing on Visual Studio Team System (VSTS) ALM-related parts: Visual Studio Ultimate 2010 NEW: IntelliTrace® (aka the historical debugger) NEW: Architecture Tools New Project Type: Modeling Project UML Diagrams UML Use Case Diagram UML Class Diagram UML Sequence Diagram (supports reverse enginneering) UML Activity Diagram UML Component Diagram Layer Diagram (with Team Build integration for layer validation) Architecuture Explorer Dependency visualization DGML Web & Load Tests Visual Studio Premium 2010 NEW: Architecture Tools Read-only model viewer Development Tools Code Analysis New Rules like SQL Injection detection Rule Sets Code Profiler Multi-Tier Profiling JScript Profiling Profiling applications on virtual machines in sampling mode Code Metrics Test Tools Code Coverage NEW: Test Impact Analysis NEW: Coded UI Test Database Tools (DB schema versioning & deployment) Visual Studio Professional 2010 Debuger Mixed Mode Debugging for 64-bit Applications Export/Import of Breakpoints and data tips Visual Studio Test Professional 2010 Microsoft Test Manager (MTM, formerly known as "Camano")) Fast Forward Testing Visual Studio Team Foundation Server 2010 Work Item Tracking and Project Management New MSF templatesfor Agile and CMMI (V 5.0) Hierarchical Work Items Custom Work Item Link Types Ready to use Excel agile project management workbooks for managing your backlogs (including capacity planing) Convert Work Item query to an Excel report MS Excel integration Support for Work Item hierarchies Formatting is preserved after doing a 'Refresh' MS Project integration Hierarchy and successor/predecessor info is now synchronized NEW: Test Case Management Version Control Public Workspaces Branch & Merge Visualization Tracking of Changesets & Work Items Gated Check-In Team Build Build Controllers and Agents Workflow 4-based build process NEW: Lab Management (only a pre-release is avaiable at the moment!) Project Portal & Reporting Dashboards (on SharePoint Portal) Burndown Chart TFS Web Parts (to show data from TFS) Administration & Operations Topology enhancements Application tier network load balancing (NLB) SQL Server scale out Improved Sharepoint flexibility Report Server flexibility Zone support Kerberos support Separation of TFS and SQL administration Setup Separate install from configure Improved installation wizards Optional components Simplified account requirements Improved Reporting Services configuration Setup consolidation Upgrading from previous TFS versions Improved IIS flexibility Administration Consolidation of command line tools User rename support Project Collections Archive/restore individual project collections Move Team Project Collections Server consolidation Team Project Collection Split Team Project Collection Isolation Server request cancellation Licensing: TFS server license included in MSDN subscriptions Removed features (former features not part of Visual Studio 2010): Debug » Start With Application Verifier Object Test Bench IntelliSense for C++ / CLI Debugging support for SQL 2000

    Read the article

  • Using the JRockit Flight Recorder as an Exception Profiler.

    - by Marcus Hirt
    There is a lot of new data points in the JRockit Flight Recorder compared to the data available in the old JRA. One set of data deals with exceptions and where they are thrown. In JRA, it was possible to tell how many exceptions were thrown, but it was not possible to determine from where they were thrown. Here is how to do a recording with exception profiling enabled from JRockit Mission Control. 1. Right click on the JVM to profile, select Start Flight Recording. 2. Select the Profiling with Exceptions template.   3. Wait for the recording to finish. The count down for the time left will show in the Flight Recorder Control view. 4. When done the recording will automatically be downloaded and displayed. To show the exceptions, go to the Code | Exceptions tab.

    Read the article

  • DynamicMethod for ConstructorInfo.Invoke, what do I need to consider?

    - by Lasse V. Karlsen
    My question is this: If I'm going to build a DynamicMethod object, corresponding to a ConstructorInfo.Invoke call, what types of IL do I need to implement in order to cope with all (or most) types of arguments, when I can guarantee that the right type and number of arguments is going to be passed in before I make the call? Background I am on my 3rd iteration of my IoC container, and currently doing some profiling to figure out if there are any areas where I can easily shave off large amounts of time being used. One thing I noticed is that when resolving to a concrete type, ultimately I end up with a constructor being called, using ConstructorInfo.Invoke, passing in an array of arguments that I've worked out. What I noticed is that the invoke method has quite a bit of overhead, and I'm wondering if most of this is just different implementations of the same checks I do. For instance, due to the constructor matching code I have, to find a matching constructor for the predefined parameter names, types, and values that I have passed in, there's no way this particular invoke call will not end up with something it should be able to cope with, like the correct number of arguments, in the right order, of the right type, and with appropriate values. When doing a profiling session containing a million calls to my resolve method, and then replacing it with a DynamicMethod implementation that mimics the Invoke call, the profiling timings was like this: ConstructorInfo.Invoke: 1973ms DynamicMethod: 93ms This accounts for around 20% of the total runtime of this profiling application. In other words, by replacing the ConstructorInfo.Invoke call with a DynamicMethod that does the same, I am able to shave off 20% runtime when dealing with basic factory-scoped services (ie. all resolution calls end up with a constructor call). I think this is fairly substantial, and warrants a closer look at how much work it would be to build a stable DynamicMethod generator for constructors in this context. So, the dynamic method would take in an object array, and return the constructed object, and I already know the ConstructorInfo object in question. Therefore, it looks like the dynamic method would be made up of the following IL: l001: ldarg.0 ; the object array containing the arguments l002: ldc.i4.0 ; the index of the first argument l003: ldelem.ref ; get the value of the first argument l004: castclass T ; cast to the right type of argument (only if not "Object") (repeat l001-l004 for all parameters, l004 only for non-Object types, varying l002 constant from 0 and up for each index) l005: newobj ci ; call the constructor l006: ret Is there anything else I need to consider? Note that I'm aware that creating dynamic methods will probably not be available when running the application in "reduced access mode" (sometimes the brain just won't give up those terms), but in that case I can easily detect that and just calling the original constructor as before, with the overhead and all.

    Read the article

< Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17  | Next Page >