Search Results

Search found 30046 results on 1202 pages for 'document load'.

Page 632/1202 | < Previous Page | 628 629 630 631 632 633 634 635 636 637 638 639  | Next Page >

  • How to create many div's with 100% height?

    - by ChrisBenyamin
    I need a html document, that contains multiple div's with 100% height (screen filling) one below the other. I have tried to apply every element a height of 100%, but that won't work seamless nor clean. Maybe there is a option with JavaScript? I don't have an idea. Please suggest me your solutions. chris

    Read the article

  • A way of doing real-world test-driven development (and some thoughts about it)

    - by Thomas Weller
    Lately, I exchanged some arguments with Derick Bailey about some details of the red-green-refactor cycle of the Test-driven development process. In short, the issue revolved around the fact that it’s not enough to have a test red or green, but it’s also important to have it red or green for the right reasons. While for me, it’s sufficient to initially have a NotImplementedException in place, Derick argues that this is not totally correct (see these two posts: Red/Green/Refactor, For The Right Reasons and Red For The Right Reason: Fail By Assertion, Not By Anything Else). And he’s right. But on the other hand, I had no idea how his insights could have any practical consequence for my own individual interpretation of the red-green-refactor cycle (which is not really red-green-refactor, at least not in its pure sense, see the rest of this article). This made me think deeply for some days now. In the end I found out that the ‘right reason’ changes in my understanding depending on what development phase I’m in. To make this clear (at least I hope it becomes clear…) I started to describe my way of working in some detail, and then something strange happened: The scope of the article slightly shifted from focusing ‘only’ on the ‘right reason’ issue to something more general, which you might describe as something like  'Doing real-world TDD in .NET , with massive use of third-party add-ins’. This is because I feel that there is a more general statement about Test-driven development to make:  It’s high time to speak about the ‘How’ of TDD, not always only the ‘Why’. Much has been said about this, and me myself also contributed to that (see here: TDD is not about testing, it's about how we develop software). But always justifying what you do is very unsatisfying in the long run, it is inherently defensive, and it costs time and effort that could be used for better and more important things. And frankly: I’m somewhat sick and tired of repeating time and again that the test-driven way of software development is highly preferable for many reasons - I don’t want to spent my time exclusively on stating the obvious… So, again, let’s say it clearly: TDD is programming, and programming is TDD. Other ways of programming (code-first, sometimes called cowboy-coding) are exceptional and need justification. – I know that there are many people out there who will disagree with this radical statement, and I also know that it’s not a description of the real world but more of a mission statement or something. But nevertheless I’m absolutely sure that in some years this statement will be nothing but a platitude. Side note: Some parts of this post read as if I were paid by Jetbrains (the manufacturer of the ReSharper add-in – R#), but I swear I’m not. Rather I think that Visual Studio is just not production-complete without it, and I wouldn’t even consider to do professional work without having this add-in installed... The three parts of a software component Before I go into some details, I first should describe my understanding of what belongs to a software component (assembly, type, or method) during the production process (i.e. the coding phase). Roughly, I come up with the three parts shown below:   First, we need to have some initial sort of requirement. This can be a multi-page formal document, a vague idea in some programmer’s brain of what might be needed, or anything in between. In either way, there has to be some sort of requirement, be it explicit or not. – At the C# micro-level, the best way that I found to formulate that is to define interfaces for just about everything, even for internal classes, and to provide them with exhaustive xml comments. The next step then is to re-formulate these requirements in an executable form. This is specific to the respective programming language. - For C#/.NET, the Gallio framework (which includes MbUnit) in conjunction with the ReSharper add-in for Visual Studio is my toolset of choice. The third part then finally is the production code itself. It’s development is entirely driven by the requirements and their executable formulation. This is the delivery, the two other parts are ‘only’ there to make its production possible, to give it a decent quality and reliability, and to significantly reduce related costs down the maintenance timeline. So while the first two parts are not really relevant for the customer, they are very important for the developer. The customer (or in Scrum terms: the Product Owner) is not interested at all in how  the product is developed, he is only interested in the fact that it is developed as cost-effective as possible, and that it meets his functional and non-functional requirements. The rest is solely a matter of the developer’s craftsmanship, and this is what I want to talk about during the remainder of this article… An example To demonstrate my way of doing real-world TDD, I decided to show the development of a (very) simple Calculator component. The example is deliberately trivial and silly, as examples always are. I am totally aware of the fact that real life is never that simple, but I only want to show some development principles here… The requirement As already said above, I start with writing down some words on the initial requirement, and I normally use interfaces for that, even for internal classes - the typical question “intf or not” doesn’t even come to mind. I need them for my usual workflow and using them automatically produces high componentized and testable code anyway. To think about their usage in every single situation would slow down the production process unnecessarily. So this is what I begin with: namespace Calculator {     /// <summary>     /// Defines a very simple calculator component for demo purposes.     /// </summary>     public interface ICalculator     {         /// <summary>         /// Gets the result of the last successful operation.         /// </summary>         /// <value>The last result.</value>         /// <remarks>         /// Will be <see langword="null" /> before the first successful operation.         /// </remarks>         double? LastResult { get; }       } // interface ICalculator   } // namespace Calculator So, I’m not beginning with a test, but with a sort of code declaration - and still I insist on being 100% test-driven. There are three important things here: Starting this way gives me a method signature, which allows to use IntelliSense and AutoCompletion and thus eliminates the danger of typos - one of the most regular, annoying, time-consuming, and therefore expensive sources of error in the development process. In my understanding, the interface definition as a whole is more of a readable requirement document and technical documentation than anything else. So this is at least as much about documentation than about coding. The documentation must completely describe the behavior of the documented element. I normally use an IoC container or some sort of self-written provider-like model in my architecture. In either case, I need my components defined via service interfaces anyway. - I will use the LinFu IoC framework here, for no other reason as that is is very simple to use. The ‘Red’ (pt. 1)   First I create a folder for the project’s third-party libraries and put the LinFu.Core dll there. Then I set up a test project (via a Gallio project template), and add references to the Calculator project and the LinFu dll. Finally I’m ready to write the first test, which will look like the following: namespace Calculator.Test {     [TestFixture]     public class CalculatorTest     {         private readonly ServiceContainer container = new ServiceContainer();           [Test]         public void CalculatorLastResultIsInitiallyNull()         {             ICalculator calculator = container.GetService<ICalculator>();               Assert.IsNull(calculator.LastResult);         }       } // class CalculatorTest   } // namespace Calculator.Test       This is basically the executable formulation of what the interface definition states (part of). Side note: There’s one principle of TDD that is just plain wrong in my eyes: I’m talking about the Red is 'does not compile' thing. How could a compiler error ever be interpreted as a valid test outcome? I never understood that, it just makes no sense to me. (Or, in Derick’s terms: this reason is as wrong as a reason ever could be…) A compiler error tells me: Your code is incorrect, but nothing more.  Instead, the ‘Red’ part of the red-green-refactor cycle has a clearly defined meaning to me: It means that the test works as intended and fails only if its assumptions are not met for some reason. Back to our Calculator. When I execute the above test with R#, the Gallio plugin will give me this output: So this tells me that the test is red for the wrong reason: There’s no implementation that the IoC-container could load, of course. So let’s fix that. With R#, this is very easy: First, create an ICalculator - derived type:        Next, implement the interface members: And finally, move the new class to its own file: So far my ‘work’ was six mouse clicks long, the only thing that’s left to do manually here, is to add the Ioc-specific wiring-declaration and also to make the respective class non-public, which I regularly do to force my components to communicate exclusively via interfaces: This is what my Calculator class looks like as of now: using System; using LinFu.IoC.Configuration;   namespace Calculator {     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         public double? LastResult         {             get             {                 throw new NotImplementedException();             }         }     } } Back to the test fixture, we have to put our IoC container to work: [TestFixture] public class CalculatorTest {     #region Fields       private readonly ServiceContainer container = new ServiceContainer();       #endregion // Fields       #region Setup/TearDown       [FixtureSetUp]     public void FixtureSetUp()     {        container.LoadFrom(AppDomain.CurrentDomain.BaseDirectory, "Calculator.dll");     }       ... Because I have a R# live template defined for the setup/teardown method skeleton as well, the only manual coding here again is the IoC-specific stuff: two lines, not more… The ‘Red’ (pt. 2) Now, the execution of the above test gives the following result: This time, the test outcome tells me that the method under test is called. And this is the point, where Derick and I seem to have somewhat different views on the subject: Of course, the test still is worthless regarding the red/green outcome (or: it’s still red for the wrong reasons, in that it gives a false negative). But as far as I am concerned, I’m not really interested in the test outcome at this point of the red-green-refactor cycle. Rather, I only want to assert that my test actually calls the right method. If that’s the case, I will happily go on to the ‘Green’ part… The ‘Green’ Making the test green is quite trivial. Just make LastResult an automatic property:     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         public double? LastResult { get; private set; }     }         One more round… Now on to something slightly more demanding (cough…). Let’s state that our Calculator exposes an Add() method:         ...   /// <summary>         /// Adds the specified operands.         /// </summary>         /// <param name="operand1">The operand1.</param>         /// <param name="operand2">The operand2.</param>         /// <returns>The result of the additon.</returns>         /// <exception cref="ArgumentException">         /// Argument <paramref name="operand1"/> is &lt; 0.<br/>         /// -- or --<br/>         /// Argument <paramref name="operand2"/> is &lt; 0.         /// </exception>         double Add(double operand1, double operand2);       } // interface ICalculator A remark: I sometimes hear the complaint that xml comment stuff like the above is hard to read. That’s certainly true, but irrelevant to me, because I read xml code comments with the CR_Documentor tool window. And using that, it looks like this:   Apart from that, I’m heavily using xml code comments (see e.g. here for a detailed guide) because there is the possibility of automating help generation with nightly CI builds (using MS Sandcastle and the Sandcastle Help File Builder), and then publishing the results to some intranet location.  This way, a team always has first class, up-to-date technical documentation at hand about the current codebase. (And, also very important for speeding up things and avoiding typos: You have IntelliSense/AutoCompletion and R# support, and the comments are subject to compiler checking…).     Back to our Calculator again: Two more R# – clicks implement the Add() skeleton:         ...           public double Add(double operand1, double operand2)         {             throw new NotImplementedException();         }       } // class Calculator As we have stated in the interface definition (which actually serves as our requirement document!), the operands are not allowed to be negative. So let’s start implementing that. Here’s the test: [Test] [Row(-0.5, 2)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) {     ICalculator calculator = container.GetService<ICalculator>();       Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } As you can see, I’m using a data-driven unit test method here, mainly for these two reasons: Because I know that I will have to do the same test for the second operand in a few seconds, I save myself from implementing another test method for this purpose. Rather, I only will have to add another Row attribute to the existing one. From the test report below, you can see that the argument values are explicitly printed out. This can be a valuable documentation feature even when everything is green: One can quickly review what values were tested exactly - the complete Gallio HTML-report (as it will be produced by the Continuous Integration runs) shows these values in a quite clear format (see below for an example). Back to our Calculator development again, this is what the test result tells us at the moment: So we’re red again, because there is not yet an implementation… Next we go on and implement the necessary parameter verification to become green again, and then we do the same thing for the second operand. To make a long story short, here’s the test and the method implementation at the end of the second cycle: // in CalculatorTest:   [Test] [Row(-0.5, 2)] [Row(295, -123)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) {     ICalculator calculator = container.GetService<ICalculator>();       Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); }   // in Calculator: public double Add(double operand1, double operand2) {     if (operand1 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand1");     }     if (operand2 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand2");     }     throw new NotImplementedException(); } So far, we have sheltered our method from unwanted input, and now we can safely operate on the parameters without further caring about their validity (this is my interpretation of the Fail Fast principle, which is regarded here in more detail). Now we can think about the method’s successful outcomes. First let’s write another test for that: [Test] [Row(1, 1, 2)] public void TestAdd(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Add(operand1, operand2);       Assert.AreEqual(expectedResult, result); } Again, I’m regularly using row based test methods for these kinds of unit tests. The above shown pattern proved to be extremely helpful for my development work, I call it the Defined-Input/Expected-Output test idiom: You define your input arguments together with the expected method result. There are two major benefits from that way of testing: In the course of refining a method, it’s very likely to come up with additional test cases. In our case, we might add tests for some edge cases like ‘one of the operands is zero’ or ‘the sum of the two operands causes an overflow’, or maybe there’s an external test protocol that has to be fulfilled (e.g. an ISO norm for medical software), and this results in the need of testing against additional values. In all these scenarios we only have to add another Row attribute to the test. Remember that the argument values are written to the test report, so as a side-effect this produces valuable documentation. (This can become especially important if the fulfillment of some sort of external requirements has to be proven). So your test method might look something like that in the end: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 2)] [Row(0, 999999999, 999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, double.MaxValue)] [Row(4, double.MaxValue - 2.5, double.MaxValue)] public void TestAdd(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Add(operand1, operand2);       Assert.AreEqual(expectedResult, result); } And this will produce the following HTML report (with Gallio):   Not bad for the amount of work we invested in it, huh? - There might be scenarios where reports like that can be useful for demonstration purposes during a Scrum sprint review… The last requirement to fulfill is that the LastResult property is expected to store the result of the last operation. I don’t show this here, it’s trivial enough and brings nothing new… And finally: Refactor (for the right reasons) To demonstrate my way of going through the refactoring portion of the red-green-refactor cycle, I added another method to our Calculator component, namely Subtract(). Here’s the code (tests and production): // CalculatorTest.cs:   [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtract(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Subtract(operand1, operand2);       Assert.AreEqual(expectedResult, result); }   [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtractGivesExpectedLastResult(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       calculator.Subtract(operand1, operand2);       Assert.AreEqual(expectedResult, calculator.LastResult); }   ...   // ICalculator.cs: /// <summary> /// Subtracts the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the subtraction.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is &lt; 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is &lt; 0. /// </exception> double Subtract(double operand1, double operand2);   ...   // Calculator.cs:   public double Subtract(double operand1, double operand2) {     if (operand1 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand1");     }       if (operand2 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand2");     }       return (this.LastResult = operand1 - operand2).Value; }   Obviously, the argument validation stuff that was produced during the red-green part of our cycle duplicates the code from the previous Add() method. So, to avoid code duplication and minimize the number of code lines of the production code, we do an Extract Method refactoring. One more time, this is only a matter of a few mouse clicks (and giving the new method a name) with R#: Having done that, our production code finally looks like that: using System; using LinFu.IoC.Configuration;   namespace Calculator {     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         #region ICalculator           public double? LastResult { get; private set; }           public double Add(double operand1, double operand2)         {             ThrowIfOneOperandIsInvalid(operand1, operand2);               return (this.LastResult = operand1 + operand2).Value;         }           public double Subtract(double operand1, double operand2)         {             ThrowIfOneOperandIsInvalid(operand1, operand2);               return (this.LastResult = operand1 - operand2).Value;         }           #endregion // ICalculator           #region Implementation (Helper)           private static void ThrowIfOneOperandIsInvalid(double operand1, double operand2)         {             if (operand1 < 0.0)             {                 throw new ArgumentException("Value must not be negative.", "operand1");             }               if (operand2 < 0.0)             {                 throw new ArgumentException("Value must not be negative.", "operand2");             }         }           #endregion // Implementation (Helper)       } // class Calculator   } // namespace Calculator But is the above worth the effort at all? It’s obviously trivial and not very impressive. All our tests were green (for the right reasons), and refactoring the code did not change anything. It’s not immediately clear how this refactoring work adds value to the project. Derick puts it like this: STOP! Hold on a second… before you go any further and before you even think about refactoring what you just wrote to make your test pass, you need to understand something: if your done with your requirements after making the test green, you are not required to refactor the code. I know… I’m speaking heresy, here. Toss me to the wolves, I’ve gone over to the dark side! Seriously, though… if your test is passing for the right reasons, and you do not need to write any test or any more code for you class at this point, what value does refactoring add? Derick immediately answers his own question: So why should you follow the refactor portion of red/green/refactor? When you have added code that makes the system less readable, less understandable, less expressive of the domain or concern’s intentions, less architecturally sound, less DRY, etc, then you should refactor it. I couldn’t state it more precise. From my personal perspective, I’d add the following: You have to keep in mind that real-world software systems are usually quite large and there are dozens or even hundreds of occasions where micro-refactorings like the above can be applied. It’s the sum of them all that counts. And to have a good overall quality of the system (e.g. in terms of the Code Duplication Percentage metric) you have to be pedantic on the individual, seemingly trivial cases. My job regularly requires the reading and understanding of ‘foreign’ code. So code quality/readability really makes a HUGE difference for me – sometimes it can be even the difference between project success and failure… Conclusions The above described development process emerged over the years, and there were mainly two things that guided its evolution (you might call it eternal principles, personal beliefs, or anything in between): Test-driven development is the normal, natural way of writing software, code-first is exceptional. So ‘doing TDD or not’ is not a question. And good, stable code can only reliably be produced by doing TDD (yes, I know: many will strongly disagree here again, but I’ve never seen high-quality code – and high-quality code is code that stood the test of time and causes low maintenance costs – that was produced code-first…) It’s the production code that pays our bills in the end. (Though I have seen customers these days who demand an acceptance test battery as part of the final delivery. Things seem to go into the right direction…). The test code serves ‘only’ to make the production code work. But it’s the number of delivered features which solely counts at the end of the day - no matter how much test code you wrote or how good it is. With these two things in mind, I tried to optimize my coding process for coding speed – or, in business terms: productivity - without sacrificing the principles of TDD (more than I’d do either way…).  As a result, I consider a ratio of about 3-5/1 for test code vs. production code as normal and desirable. In other words: roughly 60-80% of my code is test code (This might sound heavy, but that is mainly due to the fact that software development standards only begin to evolve. The entire software development profession is very young, historically seen; only at the very beginning, and there are no viable standards yet. If you think about software development as a kind of casting process, where the test code is the mold and the resulting production code is the final product, then the above ratio sounds no longer extraordinary…) Although the above might look like very much unnecessary work at first sight, it’s not. With the aid of the mentioned add-ins, doing all the above is a matter of minutes, sometimes seconds (while writing this post took hours and days…). The most important thing is to have the right tools at hand. Slow developer machines or the lack of a tool or something like that - for ‘saving’ a few 100 bucks -  is just not acceptable and a very bad decision in business terms (though I quite some times have seen and heard that…). Production of high-quality products needs the usage of high-quality tools. This is a platitude that every craftsman knows… The here described round-trip will take me about five to ten minutes in my real-world development practice. I guess it’s about 30% more time compared to developing the ‘traditional’ (code-first) way. But the so manufactured ‘product’ is of much higher quality and massively reduces maintenance costs, which is by far the single biggest cost factor, as I showed in this previous post: It's the maintenance, stupid! (or: Something is rotten in developerland.). In the end, this is a highly cost-effective way of software development… But on the other hand, there clearly is a trade-off here: coding speed vs. code quality/later maintenance costs. The here described development method might be a perfect fit for the overwhelming majority of software projects, but there certainly are some scenarios where it’s not - e.g. if time-to-market is crucial for a software project. So this is a business decision in the end. It’s just that you have to know what you’re doing and what consequences this might have… Some last words First, I’d like to thank Derick Bailey again. His two aforementioned posts (which I strongly recommend for reading) inspired me to think deeply about my own personal way of doing TDD and to clarify my thoughts about it. I wouldn’t have done that without this inspiration. I really enjoy that kind of discussions… I agree with him in all respects. But I don’t know (yet?) how to bring his insights into the described production process without slowing things down. The above described method proved to be very “good enough” in my practical experience. But of course, I’m open to suggestions here… My rationale for now is: If the test is initially red during the red-green-refactor cycle, the ‘right reason’ is: it actually calls the right method, but this method is not yet operational. Later on, when the cycle is finished and the tests become part of the regular, automated Continuous Integration process, ‘red’ certainly must occur for the ‘right reason’: in this phase, ‘red’ MUST mean nothing but an unfulfilled assertion - Fail By Assertion, Not By Anything Else!

    Read the article

  • 256 Windows Azure Worker Roles, Windows Kinect and a 90's Text-Based Ray-Tracer

    - by Alan Smith
    For a couple of years I have been demoing a simple render farm hosted in Windows Azure using worker roles and the Azure Storage service. At the start of the presentation I deploy an Azure application that uses 16 worker roles to render a 1,500 frame 3D ray-traced animation. At the end of the presentation, when the animation was complete, I would play the animation delete the Azure deployment. The standing joke with the audience was that it was that it was a “$2 demo”, as the compute charges for running the 16 instances for an hour was $1.92, factor in the bandwidth charges and it’s a couple of dollars. The point of the demo is that it highlights one of the great benefits of cloud computing, you pay for what you use, and if you need massive compute power for a short period of time using Windows Azure can work out very cost effective. The “$2 demo” was great for presenting at user groups and conferences in that it could be deployed to Azure, used to render an animation, and then removed in a one hour session. I have always had the idea of doing something a bit more impressive with the demo, and scaling it from a “$2 demo” to a “$30 demo”. The challenge was to create a visually appealing animation in high definition format and keep the demo time down to one hour.  This article will take a run through how I achieved this. Ray Tracing Ray tracing, a technique for generating high quality photorealistic images, gained popularity in the 90’s with companies like Pixar creating feature length computer animations, and also the emergence of shareware text-based ray tracers that could run on a home PC. In order to render a ray traced image, the ray of light that would pass from the view point must be tracked until it intersects with an object. At the intersection, the color, reflectiveness, transparency, and refractive index of the object are used to calculate if the ray will be reflected or refracted. Each pixel may require thousands of calculations to determine what color it will be in the rendered image. Pin-Board Toys Having very little artistic talent and a basic understanding of maths I decided to focus on an animation that could be modeled fairly easily and would look visually impressive. I’ve always liked the pin-board desktop toys that become popular in the 80’s and when I was working as a 3D animator back in the 90’s I always had the idea of creating a 3D ray-traced animation of a pin-board, but never found the energy to do it. Even if I had a go at it, the render time to produce an animation that would look respectable on a 486 would have been measured in months. PolyRay Back in 1995 I landed my first real job, after spending three years being a beach-ski-climbing-paragliding-bum, and was employed to create 3D ray-traced animations for a CD-ROM that school kids would use to learn physics. I had got into the strange and wonderful world of text-based ray tracing, and was using a shareware ray-tracer called PolyRay. PolyRay takes a text file describing a scene as input and, after a few hours processing on a 486, produced a high quality ray-traced image. The following is an example of a basic PolyRay scene file. background Midnight_Blue   static define matte surface { ambient 0.1 diffuse 0.7 } define matte_white texture { matte { color white } } define matte_black texture { matte { color dark_slate_gray } } define position_cylindrical 3 define lookup_sawtooth 1 define light_wood <0.6, 0.24, 0.1> define median_wood <0.3, 0.12, 0.03> define dark_wood <0.05, 0.01, 0.005>     define wooden texture { noise surface { ambient 0.2  diffuse 0.7  specular white, 0.5 microfacet Reitz 10 position_fn position_cylindrical position_scale 1  lookup_fn lookup_sawtooth octaves 1 turbulence 1 color_map( [0.0, 0.2, light_wood, light_wood] [0.2, 0.3, light_wood, median_wood] [0.3, 0.4, median_wood, light_wood] [0.4, 0.7, light_wood, light_wood] [0.7, 0.8, light_wood, median_wood] [0.8, 0.9, median_wood, light_wood] [0.9, 1.0, light_wood, dark_wood]) } } define glass texture { surface { ambient 0 diffuse 0 specular 0.2 reflection white, 0.1 transmission white, 1, 1.5 }} define shiny surface { ambient 0.1 diffuse 0.6 specular white, 0.6 microfacet Phong 7  } define steely_blue texture { shiny { color black } } define chrome texture { surface { color white ambient 0.0 diffuse 0.2 specular 0.4 microfacet Phong 10 reflection 0.8 } }   viewpoint {     from <4.000, -1.000, 1.000> at <0.000, 0.000, 0.000> up <0, 1, 0> angle 60     resolution 640, 480 aspect 1.6 image_format 0 }       light <-10, 30, 20> light <-10, 30, -20>   object { disc <0, -2, 0>, <0, 1, 0>, 30 wooden }   object { sphere <0.000, 0.000, 0.000>, 1.00 chrome } object { cylinder <0.000, 0.000, 0.000>, <0.000, 0.000, -4.000>, 0.50 chrome }   After setting up the background and defining colors and textures, the viewpoint is specified. The “camera” is located at a point in 3D space, and it looks towards another point. The angle, image resolution, and aspect ratio are specified. Two lights are present in the image at defined coordinates. The three objects in the image are a wooden disc to represent a table top, and a sphere and cylinder that intersect to form a pin that will be used for the pin board toy in the final animation. When the image is rendered, the following image is produced. The pins are modeled with a chrome surface, so they reflect the environment around them. Note that the scale of the pin shaft is not correct, this will be fixed later. Modeling the Pin Board The frame of the pin-board is made up of three boxes, and six cylinders, the front box is modeled using a clear, slightly reflective solid, with the same refractive index of glass. The other shapes are modeled as metal. object { box <-5.5, -1.5, 1>, <5.5, 5.5, 1.2> glass } object { box <-5.5, -1.5, -0.04>, <5.5, 5.5, -0.09> steely_blue } object { box <-5.5, -1.5, -0.52>, <5.5, 5.5, -0.59> steely_blue } object { cylinder <-5.2, -1.2, 1.4>, <-5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, -1.2, 1.4>, <5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <-5.2, 5.2, 1.4>, <-5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, 5.2, 1.4>, <5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <0, -1.2, 1.4>, <0, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <0, 5.2, 1.4>, <0, 5.2, -0.74>, 0.2 steely_blue }   In order to create the matrix of pins that make up the pin board I used a basic console application with a few nested loops to create two intersecting matrixes of pins, which models the layout used in the pin boards. The resulting image is shown below. The pin board contains 11,481 pins, with the scene file containing 23,709 lines of code. For the complete animation 2,000 scene files will be created, which is over 47 million lines of code. Each pin in the pin-board will slide out a specific distance when an object is pressed into the back of the board. This is easily modeled by setting the Z coordinate of the pin to a specific value. In order to set all of the pins in the pin-board to the correct position, a bitmap image can be used. The position of the pin can be set based on the color of the pixel at the appropriate position in the image. When the Windows Azure logo is used to set the Z coordinate of the pins, the following image is generated. The challenge now was to make a cool animation. The Azure Logo is fine, but it is static. Using a normal video to animate the pins would not work; the colors in the video would not be the same as the depth of the objects from the camera. In order to simulate the pin board accurately a series of frames from a depth camera could be used. Windows Kinect The Kenect controllers for the X-Box 360 and Windows feature a depth camera. The Kinect SDK for Windows provides a programming interface for Kenect, providing easy access for .NET developers to the Kinect sensors. The Kinect Explorer provided with the Kinect SDK is a great starting point for exploring Kinect from a developers perspective. Both the X-Box 360 Kinect and the Windows Kinect will work with the Kinect SDK, the Windows Kinect is required for commercial applications, but the X-Box Kinect can be used for hobby projects. The Windows Kinect has the advantage of providing a mode to allow depth capture with objects closer to the camera, which makes for a more accurate depth image for setting the pin positions. Creating a Depth Field Animation The depth field animation used to set the positions of the pin in the pin board was created using a modified version of the Kinect Explorer sample application. In order to simulate the pin board accurately, a small section of the depth range from the depth sensor will be used. Any part of the object in front of the depth range will result in a white pixel; anything behind the depth range will be black. Within the depth range the pixels in the image will be set to RGB values from 0,0,0 to 255,255,255. A screen shot of the modified Kinect Explorer application is shown below. The Kinect Explorer sample application was modified to include slider controls that are used to set the depth range that forms the image from the depth stream. This allows the fine tuning of the depth image that is required for simulating the position of the pins in the pin board. The Kinect Explorer was also modified to record a series of images from the depth camera and save them as a sequence JPEG files that will be used to animate the pins in the animation the Start and Stop buttons are used to start and stop the image recording. En example of one of the depth images is shown below. Once a series of 2,000 depth images has been captured, the task of creating the animation can begin. Rendering a Test Frame In order to test the creation of frames and get an approximation of the time required to render each frame a test frame was rendered on-premise using PolyRay. The output of the rendering process is shown below. The test frame contained 23,629 primitive shapes, most of which are the spheres and cylinders that are used for the 11,800 or so pins in the pin board. The 1280x720 image contains 921,600 pixels, but as anti-aliasing was used the number of rays that were calculated was 4,235,777, with 3,478,754,073 object boundaries checked. The test frame of the pin board with the depth field image applied is shown below. The tracing time for the test frame was 4 minutes 27 seconds, which means rendering the2,000 frames in the animation would take over 148 hours, or a little over 6 days. Although this is much faster that an old 486, waiting almost a week to see the results of an animation would make it challenging for animators to create, view, and refine their animations. It would be much better if the animation could be rendered in less than one hour. Windows Azure Worker Roles The cost of creating an on-premise render farm to render animations increases in proportion to the number of servers. The table below shows the cost of servers for creating a render farm, assuming a cost of $500 per server. Number of Servers Cost 1 $500 16 $8,000 256 $128,000   As well as the cost of the servers, there would be additional costs for networking, racks etc. Hosting an environment of 256 servers on-premise would require a server room with cooling, and some pretty hefty power cabling. The Windows Azure compute services provide worker roles, which are ideal for performing processor intensive compute tasks. With the scalability available in Windows Azure a job that takes 256 hours to complete could be perfumed using different numbers of worker roles. The time and cost of using 1, 16 or 256 worker roles is shown below. Number of Worker Roles Render Time Cost 1 256 hours $30.72 16 16 hours $30.72 256 1 hour $30.72   Using worker roles in Windows Azure provides the same cost for the 256 hour job, irrespective of the number of worker roles used. Provided the compute task can be broken down into many small units, and the worker role compute power can be used effectively, it makes sense to scale the application so that the task is completed quickly, making the results available in a timely fashion. The task of rendering 2,000 frames in an animation is one that can easily be broken down into 2,000 individual pieces, which can be performed by a number of worker roles. Creating a Render Farm in Windows Azure The architecture of the render farm is shown in the following diagram. The render farm is a hybrid application with the following components: ·         On-Premise o   Windows Kinect – Used combined with the Kinect Explorer to create a stream of depth images. o   Animation Creator – This application uses the depth images from the Kinect sensor to create scene description files for PolyRay. These files are then uploaded to the jobs blob container, and job messages added to the jobs queue. o   Process Monitor – This application queries the role instance lifecycle table and displays statistics about the render farm environment and render process. o   Image Downloader – This application polls the image queue and downloads the rendered animation files once they are complete. ·         Windows Azure o   Azure Storage – Queues and blobs are used for the scene description files and completed frames. A table is used to store the statistics about the rendering environment.   The architecture of each worker role is shown below.   The worker role is configured to use local storage, which provides file storage on the worker role instance that can be use by the applications to render the image and transform the format of the image. The service definition for the worker role with the local storage configuration highlighted is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceDefinition name="CloudRay" >   <WorkerRole name="CloudRayWorkerRole" vmsize="Small">     <Imports>     </Imports>     <ConfigurationSettings>       <Setting name="DataConnectionString" />     </ConfigurationSettings>     <LocalResources>       <LocalStorage name="RayFolder" cleanOnRoleRecycle="true" />     </LocalResources>   </WorkerRole> </ServiceDefinition>     The two executable programs, PolyRay.exe and DTA.exe are included in the Azure project, with Copy Always set as the property. PolyRay will take the scene description file and render it to a Truevision TGA file. As the TGA format has not seen much use since the mid 90’s it is converted to a JPG image using Dave's Targa Animator, another shareware application from the 90’s. Each worker roll will use the following process to render the animation frames. 1.       The worker process polls the job queue, if a job is available the scene description file is downloaded from blob storage to local storage. 2.       PolyRay.exe is started in a process with the appropriate command line arguments to render the image as a TGA file. 3.       DTA.exe is started in a process with the appropriate command line arguments convert the TGA file to a JPG file. 4.       The JPG file is uploaded from local storage to the images blob container. 5.       A message is placed on the images queue to indicate a new image is available for download. 6.       The job message is deleted from the job queue. 7.       The role instance lifecycle table is updated with statistics on the number of frames rendered by the worker role instance, and the CPU time used. The code for this is shown below. public override void Run() {     // Set environment variables     string polyRayPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), PolyRayLocation);     string dtaPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), DTALocation);       LocalResource rayStorage = RoleEnvironment.GetLocalResource("RayFolder");     string localStorageRootPath = rayStorage.RootPath;       JobQueue jobQueue = new JobQueue("renderjobs");     JobQueue downloadQueue = new JobQueue("renderimagedownloadjobs");     CloudRayBlob sceneBlob = new CloudRayBlob("scenes");     CloudRayBlob imageBlob = new CloudRayBlob("images");     RoleLifecycleDataSource roleLifecycleDataSource = new RoleLifecycleDataSource();       Frames = 0;       while (true)     {         // Get the render job from the queue         CloudQueueMessage jobMsg = jobQueue.Get();           if (jobMsg != null)         {             // Get the file details             string sceneFile = jobMsg.AsString;             string tgaFile = sceneFile.Replace(".pi", ".tga");             string jpgFile = sceneFile.Replace(".pi", ".jpg");               string sceneFilePath = Path.Combine(localStorageRootPath, sceneFile);             string tgaFilePath = Path.Combine(localStorageRootPath, tgaFile);             string jpgFilePath = Path.Combine(localStorageRootPath, jpgFile);               // Copy the scene file to local storage             sceneBlob.DownloadFile(sceneFilePath);               // Run the ray tracer.             string polyrayArguments =                 string.Format("\"{0}\" -o \"{1}\" -a 2", sceneFilePath, tgaFilePath);             Process polyRayProcess = new Process();             polyRayProcess.StartInfo.FileName =                 Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), polyRayPath);             polyRayProcess.StartInfo.Arguments = polyrayArguments;             polyRayProcess.Start();             polyRayProcess.WaitForExit();               // Convert the image             string dtaArguments =                 string.Format(" {0} /FJ /P{1}", tgaFilePath, Path.GetDirectoryName (jpgFilePath));             Process dtaProcess = new Process();             dtaProcess.StartInfo.FileName =                 Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), dtaPath);             dtaProcess.StartInfo.Arguments = dtaArguments;             dtaProcess.Start();             dtaProcess.WaitForExit();               // Upload the image to blob storage             imageBlob.UploadFile(jpgFilePath);               // Add a download job.             downloadQueue.Add(jpgFile);               // Delete the render job message             jobQueue.Delete(jobMsg);               Frames++;         }         else         {             Thread.Sleep(1000);         }           // Log the worker role activity.         roleLifecycleDataSource.Alive             ("CloudRayWorker", RoleLifecycleDataSource.RoleLifecycleId, Frames);     } }     Monitoring Worker Role Instance Lifecycle In order to get more accurate statistics about the lifecycle of the worker role instances used to render the animation data was tracked in an Azure storage table. The following class was used to track the worker role lifecycles in Azure storage.   public class RoleLifecycle : TableServiceEntity {     public string ServerName { get; set; }     public string Status { get; set; }     public DateTime StartTime { get; set; }     public DateTime EndTime { get; set; }     public long SecondsRunning { get; set; }     public DateTime LastActiveTime { get; set; }     public int Frames { get; set; }     public string Comment { get; set; }       public RoleLifecycle()     {     }       public RoleLifecycle(string roleName)     {         PartitionKey = roleName;         RowKey = Utils.GetAscendingRowKey();         Status = "Started";         StartTime = DateTime.UtcNow;         LastActiveTime = StartTime;         EndTime = StartTime;         SecondsRunning = 0;         Frames = 0;     } }     A new instance of this class is created and added to the storage table when the role starts. It is then updated each time the worker renders a frame to record the total number of frames rendered and the total processing time. These statistics are used be the monitoring application to determine the effectiveness of use of resources in the render farm. Rendering the Animation The Azure solution was deployed to Windows Azure with the service configuration set to 16 worker role instances. This allows for the application to be tested in the cloud environment, and the performance of the application determined. When I demo the application at conferences and user groups I often start with 16 instances, and then scale up the application to the full 256 instances. The configuration to run 16 instances is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*">   <Role name="CloudRayWorkerRole">     <Instances count="16" />     <ConfigurationSettings>       <Setting name="DataConnectionString"         value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." />     </ConfigurationSettings>   </Role> </ServiceConfiguration>     About six minutes after deploying the application the first worker roles become active and start to render the first frames of the animation. The CloudRay Monitor application displays an icon for each worker role instance, with a number indicating the number of frames that the worker role has rendered. The statistics on the left show the number of active worker roles and statistics about the render process. The render time is the time since the first worker role became active; the CPU time is the total amount of processing time used by all worker role instances to render the frames.   Five minutes after the first worker role became active the last of the 16 worker roles activated. By this time the first seven worker roles had each rendered one frame of the animation.   With 16 worker roles u and running it can be seen that one hour and 45 minutes CPU time has been used to render 32 frames with a render time of just under 10 minutes.     At this rate it would take over 10 hours to render the 2,000 frames of the full animation. In order to complete the animation in under an hour more processing power will be required. Scaling the render farm from 16 instances to 256 instances is easy using the new management portal. The slider is set to 256 instances, and the configuration saved. We do not need to re-deploy the application, and the 16 instances that are up and running will not be affected. Alternatively, the configuration file for the Azure service could be modified to specify 256 instances.   <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*">   <Role name="CloudRayWorkerRole">     <Instances count="256" />     <ConfigurationSettings>       <Setting name="DataConnectionString"         value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." />     </ConfigurationSettings>   </Role> </ServiceConfiguration>     Six minutes after the new configuration has been applied 75 new worker roles have activated and are processing their first frames.   Five minutes later the full configuration of 256 worker roles is up and running. We can see that the average rate of frame rendering has increased from 3 to 12 frames per minute, and that over 17 hours of CPU time has been utilized in 23 minutes. In this test the time to provision 140 worker roles was about 11 minutes, which works out at about one every five seconds.   We are now half way through the rendering, with 1,000 frames complete. This has utilized just under three days of CPU time in a little over 35 minutes.   The animation is now complete, with 2,000 frames rendered in a little over 52 minutes. The CPU time used by the 256 worker roles is 6 days, 7 hours and 22 minutes with an average frame rate of 38 frames per minute. The rendering of the last 1,000 frames took 16 minutes 27 seconds, which works out at a rendering rate of 60 frames per minute. The frame counts in the server instances indicate that the use of a queue to distribute the workload has been very effective in distributing the load across the 256 worker role instances. The first 16 instances that were deployed first have rendered between 11 and 13 frames each, whilst the 240 instances that were added when the application was scaled have rendered between 6 and 9 frames each.   Completed Animation I’ve uploaded the completed animation to YouTube, a low resolution preview is shown below. Pin Board Animation Created using Windows Kinect and 256 Windows Azure Worker Roles   The animation can be viewed in 1280x720 resolution at the following link: http://www.youtube.com/watch?v=n5jy6bvSxWc Effective Use of Resources According to the CloudRay monitor statistics the animation took 6 days, 7 hours and 22 minutes CPU to render, this works out at 152 hours of compute time, rounded up to the nearest hour. As the usage for the worker role instances are billed for the full hour, it may have been possible to render the animation using fewer than 256 worker roles. When deciding the optimal usage of resources, the time required to provision and start the worker roles must also be considered. In the demo I started with 16 worker roles, and then scaled the application to 256 worker roles. It would have been more optimal to start the application with maybe 200 worker roles, and utilized the full hour that I was being billed for. This would, however, have prevented showing the ease of scalability of the application. The new management portal displays the CPU usage across the worker roles in the deployment. The average CPU usage across all instances is 93.27%, with over 99% used when all the instances are up and running. This shows that the worker role resources are being used very effectively. Grid Computing Scenarios Although I am using this scenario for a hobby project, there are many scenarios where a large amount of compute power is required for a short period of time. Windows Azure provides a great platform for developing these types of grid computing applications, and can work out very cost effective. ·         Windows Azure can provide massive compute power, on demand, in a matter of minutes. ·         The use of queues to manage the load balancing of jobs between role instances is a simple and effective solution. ·         Using a cloud-computing platform like Windows Azure allows proof-of-concept scenarios to be tested and evaluated on a very low budget. ·         No charges for inbound data transfer makes the uploading of large data sets to Windows Azure Storage services cost effective. (Transaction charges still apply.) Tips for using Windows Azure for Grid Computing Scenarios I found the implementation of a render farm using Windows Azure a fairly simple scenario to implement. I was impressed by ease of scalability that Azure provides, and by the short time that the application took to scale from 16 to 256 worker role instances. In this case it was around 13 minutes, in other tests it took between 10 and 20 minutes. The following tips may be useful when implementing a grid computing project in Windows Azure. ·         Using an Azure Storage queue to load-balance the units of work across multiple worker roles is simple and very effective. The design I have used in this scenario could easily scale to many thousands of worker role instances. ·         Windows Azure accounts are typically limited to 20 cores. If you need to use more than this, a call to support and a credit card check will be required. ·         Be aware of how the billing model works. You will be charged for worker role instances for the full clock our in which the instance is deployed. Schedule the workload to start just after the clock hour has started. ·         Monitor the utilization of the resources you are provisioning, ensure that you are not paying for worker roles that are idle. ·         If you are deploying third party applications to worker roles, you may well run into licensing issues. Purchasing software licenses on a per-processor basis when using hundreds of processors for a short time period would not be cost effective. ·         Third party software may also require installation onto the worker roles, which can be accomplished using start-up tasks. Bear in mind that adding a startup task and possible re-boot will add to the time required for the worker role instance to start and activate. An alternative may be to use a prepared VM and use VM roles. ·         Consider using the Windows Azure Autoscaling Application Block (WASABi) to autoscale the worker roles in your application. When using a large number of worker roles, the utilization must be carefully monitored, if the scaling algorithms are not optimal it could get very expensive!

    Read the article

  • Using R to Analyze G1GC Log Files

    - by user12620111
    Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=||   Using R to Analyze G1GC Log Files   Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z\(\),]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\\( " , " " ); gsub ( "\ \)" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period:     par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight.   plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

    Read the article

  • Enable Php Fastcgi and Get 500 Internal Server Error (Lighttpd)

    - by skycrew
    anyone can help me? I just got this problem today. Before this my site running smooth with Fastcgi enable but now its show 500 internal server error with below logs. I need to disable php fastcgi in LxAdmin so that my visitor can access my site but when I disable php fastgi, my web performance is very slow with high load to server. I also include the performance screenshot. What should I do? This are the error log I got: 2010-06-16 21:59:52: (mod_cgi.c.584) cgi died, pid: 24055 2010-06-16 21:59:52: (mod_cgi.c.584) cgi died, pid: 21622 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 3342 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3207) child exited, pid: 3342 status: 0 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 836 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 860 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 836 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 878 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 878 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 878 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_cgi.c.584) cgi died, pid: 22325 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 852 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_cgi.c.584) cgi died, pid: 24032 2010-06-16 21:59:52: (mod_cgi.c.584) cgi died, pid: 20402 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 3336 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 2010-06-16 21:59:52: (mod_fastcgi.c.3207) child exited, pid: 3336 status: 0 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 855 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24448 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 860 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 for /index.php , closing connection 2010-06-16 21:59:52: (mod_cgi.c.1231) cgi died ? 2010-06-16 21:59:53: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24448 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 2010-06-16 21:59:53: (mod_fastcgi.c.3254) response not received, request sent: 860 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 for /index.php , closing connection 2010-06-16 21:59:53: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24448 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 2010-06-16 21:59:53: (mod_fastcgi.c.3254) response not received, request sent: 878 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 for /index.php , closing connection 2010-06-16 21:59:53: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24448 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 2010-06-16 21:59:53: (mod_fastcgi.c.3254) response not received, request sent: 860 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 for /index.php , closing connection 2010-06-16 21:59:53: (mod_fastcgi.c.1731) connect failed: Connection refused on unix:/var/tmp/lighttpd/php.socket.lyrics-hub.com.3333-1 2010-06-16 21:59:53: (mod_fastcgi.c.2885) backend died; we'll disable it for 5 seconds and send the request to another backend instead: reconnects: 0 load: 1 2010-06-16 21:59:56: (server.c.1470) server stopped by UID = 0 PID = 24439 2010-06-16 22:00:23: (log.c.75) server started Performance Graph as below:- http://img404.imageshack.us/img404/3498/memorylxadmin.jpg

    Read the article

  • Enable Php Fastcgi and Get 500 Internal Server Error (Lighttpd)

    - by skycrew
    Hello everyone, anyone can help me? I just got this problem today. Before this my site running smooth with Fastcgi enable but now its show 500 internal server error with below logs. I need to disable php fastcgi in LxAdmin so that my visitor can access my site but when I disable php fastgi, my web performance is very slow with high load to server. I also include the performance screenshot. What should I do? This are the error log I got: 2010-06-16 21:59:52: (mod_cgi.c.584) cgi died, pid: 24055 2010-06-16 21:59:52: (mod_cgi.c.584) cgi died, pid: 21622 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 3342 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3207) child exited, pid: 3342 status: 0 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 836 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 860 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 836 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 878 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 878 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 878 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_cgi.c.584) cgi died, pid: 22325 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24447 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 852 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-1 for /index.php , closing connection 2010-06-16 21:59:52: (mod_cgi.c.584) cgi died, pid: 24032 2010-06-16 21:59:52: (mod_cgi.c.584) cgi died, pid: 20402 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 3336 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 2010-06-16 21:59:52: (mod_fastcgi.c.3207) child exited, pid: 3336 status: 0 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 855 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 for /index.php , closing connection 2010-06-16 21:59:52: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24448 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 2010-06-16 21:59:52: (mod_fastcgi.c.3254) response not received, request sent: 860 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 for /index.php , closing connection 2010-06-16 21:59:52: (mod_cgi.c.1231) cgi died ? 2010-06-16 21:59:53: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24448 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 2010-06-16 21:59:53: (mod_fastcgi.c.3254) response not received, request sent: 860 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 for /index.php , closing connection 2010-06-16 21:59:53: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24448 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 2010-06-16 21:59:53: (mod_fastcgi.c.3254) response not received, request sent: 878 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 for /index.php , closing connection 2010-06-16 21:59:53: (mod_fastcgi.c.2462) unexpected end-of-file (perhaps the fastcgi process died): pid: 24448 socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 2010-06-16 21:59:53: (mod_fastcgi.c.3254) response not received, request sent: 860 on socket: unix:/var/tmp/lighttpd/php.socket.lyrics.skycrewz.net.3333-0 for /index.php , closing connection 2010-06-16 21:59:53: (mod_fastcgi.c.1731) connect failed: Connection refused on unix:/var/tmp/lighttpd/php.socket.lyrics-hub.com.3333-1 2010-06-16 21:59:53: (mod_fastcgi.c.2885) backend died; we'll disable it for 5 seconds and send the request to another backend instead: reconnects: 0 load: 1 2010-06-16 21:59:56: (server.c.1470) server stopped by UID = 0 PID = 24439 2010-06-16 22:00:23: (log.c.75) server started Performance Graph as below:- http://img404.imageshack.us/img404/3498/memorylxadmin.jpg

    Read the article

  • Weird networking problem ( Linksys, Windows 7 )

    - by Rohit Nair
    Okay it's a bit tough to figure out where to start from, but here is the basic summary of the issue: During general internet usage, there are times when any attempt to visit a website stalls at "Waiting for somedomain.com". This problem occurs in Firefox, IE and Chrome. No website will load, INCLUDING the router configuration page at 192.168.1.1. Curiously, ping works fine, and other network apps such as MSN Messenger continue to work and I can send and receive messages. Disconnecting and reconnecting to the wireless network seems to fix the problem for a bit, but there are times when it relapses into not loading after every 2-3 http requests. Restarting the router seems to fix the issue, but it can crop up hours or days later. I have a CCNA cert and I know my way around the Windows family of operating systems, so I'm going to list all the things I've tried here. Other computers on the network seem to suffer the same problem, which makes me think it might be a specific problem with something in Win7. The random nature of this issue makes it a bit difficult to confirm, but I can definitely say that I have experienced this on the following systems: Windows 7 64-bit on my desktop Windows Vista 32-bit on my desktop ( the desktop has 2 wireless NICs and the problem existed on both ) Windows Vista 32-bit on my laptop ( both with wireless and wired ) Windows XP SP3 on another laptop ( both wireless and wired ) Using Wireshark to sniff packets seemed to indicate that although HTTP requests were being SENT out, no packets were coming in to respond to the HTTP request. However, other network apps continued to work i.e I would still receive IMs on Windows Live Messenger. Disabling IPV6 had no effect. Updating router firmware to the latest stock firmware by Linksys had no effect. Switching to dd-wrt firmware had no effect. By "no effect" I mean that although the restart required by firmware updates fixed the problem at the time, it still came back. A couple of weeks back, after a LOT of googling and flipping of various options, I figured it might be a case of router slowdown ( http://www.dd-wrt.com/wiki/index.php/Router%5FSlowdown ) caused by the fact that I occasionally run a torrent client. I tried changing the configuration as suggested in that router slowdown link, and restarted the router. However I have not run the torrent client for 12 days now, and yet I still randomly experience this problem. Currently the computer I am using is running Windows 7 64-bit. I would just like to reiterate some of the reasons that I was confused by the issue. Even the router config page at 192.168.1.1 would not load, indicating that it's not a problem with the WAN link, but probably a router issue or a local computer issue. For some reason, disconnecting and reconnecting to the wireless network immediately seems to fix the problem. Updating the router firmware, even switching to open source firmware did nothing. So it seemed to be a computer issue. On the other hand, I have not seen any mass outrage of people having networking problems with Windows 7 and Linksys routers, especially a problem of this sort, and I have tweaked every network setting I could think of. Although HTTP seems to have trouble, ping works fine, DNS lookups work fine, other networking apps work fine. However if I disconnect from Windows Live Messenger and try to reconnect, it fails to reconnect. So although it could receive data over the existing TCP/IP connection, trying to start a new one failed? Does anyone have any further ideas on debugging or fixing this issue? I am reasonably certain there are no viruses or other malicious apps on my network, and I am also reasonably certain that nobody is accessing my router without my consent. Router: Linksys WRT54G2 1.0 running dd-wrt firmware Wireless Card: Alfa AWUS036H OS: Windows 7 64-bit EDIT: I tried switching to a clean wireless channel free from interference, but the problem still persisted. I tried connecting directly with a cable, but the problem still persisted. Signed A very confused and bewildered geek whose knowledge seems to be useless in the face of this frustrating network issue.

    Read the article

  • Error installing pkgconfig via macports

    - by Greg K
    I installed Macports 1.8.2 from a DMG. That seemed to install fine. I ran sudo port selfupdate to make sure my ports tree was current. I then tried to install bindfs as I want to mount some directories in my OS X file system (like you can do with mount --bind in linux). pkgconfig and macfuse are two dependencies of bindfs. I had trouble installing bindfs due to errors installing pkgconfig, so I tried to just install pkgconfig, here's the debug output from sudo port install pkgconfig: $ sudo port -d install pkgconfig DEBUG: Found port in file:///opt/local/var/macports/sources/rsync.macports.org/release/ports/devel/pkgconfig DEBUG: Changing to port directory: /opt/local/var/macports/sources/rsync.macports.org/release/ports/devel/pkgconfig DEBUG: OS Platform: darwin DEBUG: OS Version: 10.3.0 DEBUG: Mac OS X Version: 10.6 DEBUG: System Arch: i386 DEBUG: setting option os.universal_supported to yes DEBUG: org.macports.load registered provides 'load', a pre-existing procedure. Target override will not be provided DEBUG: org.macports.unload registered provides 'unload', a pre-existing procedure. Target override will not be provided DEBUG: org.macports.distfiles registered provides 'distfiles', a pre-existing procedure. Target override will not be provided DEBUG: adding the default universal variant DEBUG: Reading variant descriptions from /opt/local/var/macports/sources/rsync.macports.org/release/ports/_resources/port1.0/variant_descriptions.conf DEBUG: Requested variant darwin is not provided by port pkgconfig. DEBUG: Requested variant i386 is not provided by port pkgconfig. DEBUG: Requested variant macosx is not provided by port pkgconfig. ---> Computing dependencies for pkgconfig DEBUG: Executing org.macports.main (pkgconfig) DEBUG: Skipping completed org.macports.fetch (pkgconfig) DEBUG: Skipping completed org.macports.checksum (pkgconfig) DEBUG: Skipping completed org.macports.extract (pkgconfig) DEBUG: Skipping completed org.macports.patch (pkgconfig) ---> Configuring pkgconfig DEBUG: Using compiler 'Mac OS X gcc 4.2' DEBUG: Executing org.macports.configure (pkgconfig) DEBUG: Environment: CFLAGS='-O2 -arch x86_64' CPPFLAGS='-I/opt/local/include' CXXFLAGS='-O2 -arch x86_64' MACOSX_DEPLOYMENT_TARGET='10.6' CXX='/usr/bin/g++-4.2' F90FLAGS='-O2 -m64' LDFLAGS='-L/opt/local/lib' OBJC='/usr/bin/gcc-4.2' FCFLAGS='-O2 -m64' INSTALL='/usr/bin/install -c' OBJCFLAGS='-O2 -arch x86_64' FFLAGS='-O2 -m64' CC='/usr/bin/gcc-4.2' DEBUG: Assembled command: 'cd "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_devel_pkgconfig/work/pkg-config-0.23" && ./configure --prefix=/opt/local --enable-indirect-deps --with-pc-path=/opt/local/lib/pkgconfig:/opt/local/share/pkgconfig' checking for a BSD-compatible install... /usr/bin/install -c checking whether build environment is sane... yes checking for gawk... no checking for mawk... no checking for nawk... no checking for awk... awk checking whether make sets $(MAKE)... no checking whether to enable maintainer-specific portions of Makefiles... no checking build system type... i386-apple-darwin10.3.0 checking host system type... i386-apple-darwin10.3.0 checking for style of include used by make... none checking for gcc... /usr/bin/gcc-4.2 checking for C compiler default output file name... configure: error: C compiler cannot create executables See `config.log' for more details. Error: Target org.macports.configure returned: configure failure: shell command " cd "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_devel_pkgconfig/work/pkg-config-0.23" && ./configure --prefix=/opt/local --enable-indirect-deps --with-pc-path=/opt/local/lib/pkgconfig:/opt/local/share/pkgconfig " returned error 77 DEBUG: Backtrace: configure failure: shell command " cd "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_devel_pkgconfig/work/pkg-config-0.23" && ./configure --prefix=/opt/local --enable-indirect-deps --with-pc-path=/opt/local/lib/pkgconfig:/opt/local/share/pkgconfig " returned error 77 while executing "$procedure $targetname" Warning: the following items did not execute (for pkgconfig): org.macports.activate org.macports.configure org.macports.build org.macports.destroot org.macports.install Error: Status 1 encountered during processing. I have only recently installed Xcode 3.2.2 (prior to installing macports). Am I right in thinking this the issue here: configure: error: C compiler cannot create executables

    Read the article

  • mono 3.0.2 + xsp + lighttpd delivers empty page

    - by Nefal Warnets
    I needed MVC 4 (and basic .NET 4.5) support so I downloaded mono 3.0.2 and deployed it on an lighttpd 1.4.28 installation, together with xsp-2.10.2 (was the latest I could find). After going through the config tutorials I managed to get the fastcgi server to spawn, but all pages are served empty. even if I go to nonexistant urls or direct .aspx files I get an empty HTTP 200 response. The log file on Debug shows nothing suspicious. Here is the log: [2012-12-12 15:15:38Z] Debug Accepting an incoming connection. [2012-12-12 15:15:38Z] Debug Record received. (Type: BeginRequest, ID: 1, Length: 8) [2012-12-12 15:15:38Z] Debug Record received. (Type: Params, ID: 1, Length: 801) [2012-12-12 15:15:38Z] Debug Record received. (Type: Params, ID: 1, Length: 0) [2012-12-12 15:15:38Z] Debug Read parameter. (SERVER_SOFTWARE = lighttpd/1.4.28) [2012-12-12 15:15:38Z] Debug Read parameter. (SERVER_NAME = xxxx) [2012-12-12 15:15:38Z] Debug Read parameter. (GATEWAY_INTERFACE = CGI/1.1) [2012-12-12 15:15:38Z] Debug Read parameter. (SERVER_PORT = 80) [2012-12-12 15:15:38Z] Debug Read parameter. (SERVER_ADDR = xxxx) [2012-12-12 15:15:38Z] Debug Read parameter. (REMOTE_PORT = xxx) [2012-12-12 15:15:38Z] Debug Read parameter. (REMOTE_ADDR = xxxx) [2012-12-12 15:15:38Z] Debug Read parameter. (SCRIPT_NAME = /ViewPage1.aspx) [2012-12-12 15:15:38Z] Debug Read parameter. (PATH_INFO = ) [2012-12-12 15:15:38Z] Debug Read parameter. (SCRIPT_FILENAME = /data/htdocs/ViewPage1.aspx) [2012-12-12 15:15:38Z] Debug Read parameter. (DOCUMENT_ROOT = /data/htdocs) [2012-12-12 15:15:38Z] Debug Read parameter. (REQUEST_URI = /ViewPage1.aspx) [2012-12-12 15:15:38Z] Debug Read parameter. (QUERY_STRING = ) [2012-12-12 15:15:38Z] Debug Read parameter. (REQUEST_METHOD = GET) [2012-12-12 15:15:38Z] Debug Read parameter. (REDIRECT_STATUS = 200) [2012-12-12 15:15:38Z] Debug Read parameter. (SERVER_PROTOCOL = HTTP/1.1) [2012-12-12 15:15:38Z] Debug Read parameter. (HTTP_HOST = xxxxx) [2012-12-12 15:15:38Z] Debug Read parameter. (HTTP_CONNECTION = keep-alive) [2012-12-12 15:15:38Z] Debug Read parameter. (HTTP_CACHE_CONTROL = max-age=0) [2012-12-12 15:15:38Z] Debug Read parameter. (HTTP_USER_AGENT = Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11) [2012-12-12 15:15:38Z] Debug Read parameter. (HTTP_ACCEPT = text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8) [2012-12-12 15:15:38Z] Debug Read parameter. (HTTP_ACCEPT_ENCODING = gzip,deflate,sdch) [2012-12-12 15:15:38Z] Debug Read parameter. (HTTP_ACCEPT_LANGUAGE = en-US,en;q=0.8) [2012-12-12 15:15:38Z] Debug Read parameter. (HTTP_ACCEPT_CHARSET = ISO-8859-1,utf-8;q=0.7,*;q=0.3) [2012-12-12 15:15:38Z] Debug Record received. (Type: StandardInput, ID: 1, Length: 0) [2012-12-12 15:15:38Z] Debug Record sent. (Type: EndRequest, ID: 1, Length: 8) lighttpd config: server.modules += ( "mod_fastcgi" ) include "conf.d/mono.conf" $HTTP["host"] !~ "^vdn\." { $HTTP["url"] !~ "\.(jpg|gif|png|js|css|swf|ico|jpeg|mp4|flv|zip|7z|rar|psd|pdf|html|htm)$" { fastcgi.server += ( "" => (( "socket" => mono_shared_dir + "fastcgi-mono-server", "bin-path" => mono_fastcgi_server, "bin-environment" => ( "PATH" => mono_dir + "bin:/bin:/usr/bin:", "LD_LIBRARY_PATH" => mono_dir + "lib:", "MONO_SHARED_DIR" => mono_shared_dir, "MONO_FCGI_LOGLEVELS" => "Debug", "MONO_FCGI_LOGFILE" => mono_shared_dir + "fastcgi.log", "MONO_FCGI_ROOT" => mono_fcgi_root, "MONO_FCGI_APPLICATIONS" => mono_fcgi_applications ), "max-procs" => 1, "check-local" => "disable" )) ) } } the referenced mono.conf index-file.names += ( "index.aspx", "default.aspx" ) var.mono_dir = "/usr/" var.mono_shared_dir = "/tmp/" var.mono_fastcgi_server = mono_dir + "bin/" + "fastcgi-mono-server4" var.mono_fcgi_root = server.document-root var.mono_fcgi_applications = "/:." The document root for this server is /data/htdocs. The asp.net files reside there. lighttpd error logs show nothing. Every help is greatly appreciated!

    Read the article

  • Monit won't run

    - by Yaniro
    I have two identical EC2 instances (the second is a replica of the first), running Gentoo. The first instance has monit running which monitors a single process and some system resources and functions great. In the second instance, monit runs but quits right away. The configuration is similar on both instances so are the versions of monit. monit.log shows: [GMT Oct 3 08:36:41] info : monit daemon with PID 5 awakened Final lines on strace monit show: write(2, "monit daemon with PID 5 awakened"..., 33monit daemon with PID 5 awakened ) = 33 time(NULL) = 1349252827 open("/etc/localtime", O_RDONLY) = 4 fstat64(4, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 fstat64(4, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb773a000 read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 118 _llseek(4, -6, [112], SEEK_CUR) = 0 read(4, "\nGMT0\n", 4096) = 6 close(4) = 0 munmap(0xb773a000, 4096) = 0 write(3, "[GMT Oct 3 08:27:07] info :"..., 33) = 33 write(3, "monit daemon with PID 5 awakened"..., 33) = 33 waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes) close(3) = 0 exit_group(0) = ? No core dumps (ulimit -c shows unlimited) monit -v shows: monit: Debug: Adding host allow 'localhost' monit: Debug: Skipping redundant host 'localhost' monit: Debug: Skipping redundant host 'localhost' monit: Debug: Adding credentials for user 'xxxx'. Runtime constants: Control file = /etc/monitrc Log file = /var/log/monit/monit.log Pid file = /var/run/monit.pid Id file = /var/run/monit.pid Debug = True Log = True Use syslog = False Is Daemon = True Use process engine = True Poll time = 30 seconds with start delay 0 seconds Expect buffer = 256 bytes Event queue = base directory /var/monit with 100 slots Mail server(s) = xx.xxx.xx.xxx with timeout 30 seconds Mail from = (not defined) Mail subject = (not defined) Mail message = (not defined) Start monit httpd = True httpd bind address = Any/All httpd portnumber = 2812 httpd signature = True Use ssl encryption = False httpd auth. style = Basic Authentication and Host/Net allow list Alert mail to = [email protected] Alert on = All events The service list contains the following entries: System Name = xxxx Monitoring mode = active CPU wait limit = if greater than 20.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert CPU system limit = if greater than 30.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert CPU user limit = if greater than 70.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Swap usage limit = if greater than 25.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Memory usage limit = if greater than 75.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Load avg. (5min) = if greater than 2.0 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Load avg. (1min) = if greater than 4.0 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Process Name = xxxx Group = server Pid file = /var/run/xxxx.pid Monitoring mode = active Start program = '/etc/init.d/xxxx restart' timeout 20 second(s) Stop program = '/etc/init.d/xxxx stop' timeout 30 second(s) Existence = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert Pid = if changed 1 times within 1 cycle(s) then alert Ppid = if changed 1 times within 1 cycle(s) then alert Timeout = If restarted 3 times within 5 cycle(s) then unmonitor Alert mail to = [email protected] Alert on = All events Alert mail to = [email protected] Alert on = All events ------------------------------------------------------------------------------- monit daemon with PID 5 awakened Ran emerge --sync before emerge -va monit which installed monit v5.3.2. When that didn't work i've downloaded v5.5 from their website and compiled from source which did not work either.

    Read the article

  • Diagnosing packet loss / high latency in Ubuntu

    - by Sam Gammon
    We have a Linux box (Ubuntu 12.04) running Nginx (1.5.2), which acts as a reverse proxy/load balancer to some Tornado and Apache hosts. The upstream servers are physically and logically close (same DC, sometimes same-rack) and show sub-millisecond latency between them: PING appserver (10.xx.xx.112) 56(84) bytes of data. 64 bytes from appserver (10.xx.xx.112): icmp_req=1 ttl=64 time=0.180 ms 64 bytes from appserver (10.xx.xx.112): icmp_req=2 ttl=64 time=0.165 ms 64 bytes from appserver (10.xx.xx.112): icmp_req=3 ttl=64 time=0.153 ms We receive a sustained load of about 500 requests per second, and are currently seeing regular packet loss / latency spikes from the Internet, even from basic pings: sam@AM-KEEN ~> ping -c 1000 loadbalancer PING 50.xx.xx.16 (50.xx.xx.16): 56 data bytes 64 bytes from loadbalancer: icmp_seq=0 ttl=56 time=11.624 ms 64 bytes from loadbalancer: icmp_seq=1 ttl=56 time=10.494 ms ... many packets later ... Request timeout for icmp_seq 2 64 bytes from loadbalancer: icmp_seq=2 ttl=56 time=1536.516 ms 64 bytes from loadbalancer: icmp_seq=3 ttl=56 time=536.907 ms 64 bytes from loadbalancer: icmp_seq=4 ttl=56 time=9.389 ms ... many packets later ... Request timeout for icmp_seq 919 64 bytes from loadbalancer: icmp_seq=918 ttl=56 time=2932.571 ms 64 bytes from loadbalancer: icmp_seq=919 ttl=56 time=1932.174 ms 64 bytes from loadbalancer: icmp_seq=920 ttl=56 time=932.018 ms 64 bytes from loadbalancer: icmp_seq=921 ttl=56 time=6.157 ms --- 50.xx.xx.16 ping statistics --- 1000 packets transmitted, 997 packets received, 0.3% packet loss round-trip min/avg/max/stddev = 5.119/52.712/2932.571/224.629 ms The pattern is always the same: things operate fine for a while (<20ms), then a ping drops completely, then three or four high-latency pings (1000ms), then it settles down again. Traffic comes in through a bonded public interface (we will call it bond0) configured as such: bond0 Link encap:Ethernet HWaddr 00:xx:xx:xx:xx:5d inet addr:50.xx.xx.16 Bcast:50.xx.xx.31 Mask:255.255.255.224 inet6 addr: <ipv6 address> Scope:Global inet6 addr: <ipv6 address> Scope:Link UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 RX packets:527181270 errors:1 dropped:4 overruns:0 frame:1 TX packets:413335045 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:240016223540 (240.0 GB) TX bytes:104301759647 (104.3 GB) Requests are then submitted via HTTP to upstream servers on the private network (we can call it bond1), which is configured like so: bond1 Link encap:Ethernet HWaddr 00:xx:xx:xx:xx:5c inet addr:10.xx.xx.70 Bcast:10.xx.xx.127 Mask:255.255.255.192 inet6 addr: <ipv6 address> Scope:Link UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 RX packets:430293342 errors:1 dropped:2 overruns:0 frame:1 TX packets:466983986 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:77714410892 (77.7 GB) TX bytes:227349392334 (227.3 GB) Output of uname -a: Linux <hostname> 3.5.0-42-generic #65~precise1-Ubuntu SMP Wed Oct 2 20:57:18 UTC 2013 x86_64 GNU/Linux We have customized sysctl.conf in an attempt to fix the problem, with no success. Output of /etc/sysctl.conf (with irrelevant configs omitted): # net: core net.core.netdev_max_backlog = 10000 # net: ipv4 stack net.ipv4.tcp_ecn = 2 net.ipv4.tcp_sack = 1 net.ipv4.tcp_fack = 1 net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_tw_recycle = 0 net.ipv4.tcp_timestamps = 1 net.ipv4.tcp_window_scaling = 1 net.ipv4.tcp_no_metrics_save = 1 net.ipv4.tcp_max_syn_backlog = 10000 net.ipv4.tcp_congestion_control = cubic net.ipv4.ip_local_port_range = 8000 65535 net.ipv4.tcp_syncookies = 1 net.ipv4.tcp_synack_retries = 2 net.ipv4.tcp_thin_dupack = 1 net.ipv4.tcp_thin_linear_timeouts = 1 net.netfilter.nf_conntrack_max = 99999999 net.netfilter.nf_conntrack_tcp_timeout_established = 300 Output of dmesg -d, with non-ICMP UFW messages suppressed: [508315.349295 < 19.852453>] [UFW BLOCK] IN=bond1 OUT= MAC=<mac addresses> SRC=118.xx.xx.143 DST=50.xx.xx.16 LEN=68 TOS=0x00 PREC=0x00 TTL=51 ID=43221 PROTO=ICMP TYPE=3 CODE=1 [SRC=50.xx.xx.16 DST=118.xx.xx.143 LEN=40 TOS=0x00 PREC=0x00 TTL=249 ID=10220 DF PROTO=TCP SPT=80 DPT=53817 WINDOW=8190 RES=0x00 ACK FIN URGP=0 ] [517787.732242 < 0.443127>] Peer 190.xx.xx.131:59705/80 unexpectedly shrunk window 1155488866:1155489425 (repaired) How can I go about diagnosing the cause of this problem, on a Debian-family Linux box?

    Read the article

  • RHEL - NFS4: Mounted/Exported as rw, user write permission denied

    - by brendanmac
    Hello, I have nfs4 configured between a RHEL 5.3 server (charlie) and a RHEL 5.4 client (simcom1). The machines are configured to authenticate users via kerberos by a Windows Server 2008 active directory machine called "alpha." Alpha also serves as a dns and dhcp machine for the local network. I notice that when a user logs in to a RHEL machine for the first time they are issued a unique uid to that machine; The first user to log on gets 10001. So, what I see is that users between simcom1 and charlie have different UIDs. When a user does an 'ls -la' command from within an nfs4 mount I would have thought that the usernames in the owner column would indicate 'nobody' or at least the wrong user name - since UIDs are different between the machines for each user, and not all users have logged into each machine. However, the simcom1 is able to resolve usernames in an 'ls -la' executed on files residing on charlie via nfs4 correctly. Most troubling is that users are unable to write to files across the nfs mount. The server, charlie, has the root directory exported as rw. The client, simcom1, mounts the export as rw. My configurations are shown below. My question is, how do I configure the RHEL machines to allow users to write files across nfs4 that is already mounted as read/write? [root@charlie ~]# more /etc/exports / 10.100.0.0/16(rw,no_root_squash,fsid=0) [root@charlie ~]#cat /etc/sysconfig/nfs # # Define which protocol versions mountd # will advertise. The values are "no" or "yes" # with yes being the default #MOUNTD_NFS_V1="no" #MOUNTD_NFS_V2="no" #MOUNTD_NFS_V3="no" # # # Path to remote quota server. See rquotad(8) #RQUOTAD="/usr/sbin/rpc.rquotad" # Port rquotad should listen on. #RQUOTAD_PORT=875 # Optinal options passed to rquotad #RPCRQUOTADOPTS="" # # # TCP port rpc.lockd should listen on. #LOCKD_TCPPORT=32803 # UDP port rpc.lockd should listen on. #LOCKD_UDPPORT=32769 # # # Optional arguments passed to rpc.nfsd. See rpc.nfsd(8) # Turn off v2 and v3 protocol support #RPCNFSDARGS="-N 2 -N 3" # Turn off v4 protocol support #RPCNFSDARGS="-N 4" # Number of nfs server processes to be started. # The default is 8. RPCNFSDCOUNT=8 # Stop the nfsd module from being pre-loaded #NFSD_MODULE="noload" # # # Optional arguments passed to rpc.mountd. See rpc.mountd(8) #STATDARG="" #RPCMOUNTDOPTS="" # Port rpc.mountd should listen on. #MOUNTD_PORT=892 # # # Optional arguments passed to rpc.statd. See rpc.statd(8) #RPCIDMAPDARGS="" # # Set to turn on Secure NFS mounts. SECURE_NFS="no" # Optional arguments passed to rpc.gssd. See rpc.gssd(8) #RPCGSSDARGS="-vvv" # Optional arguments passed to rpc.svcgssd. See rpc.svcgssd(8) #RPCSVCGSSDARGS="-vvv" # Don't load security modules in to the kernel #SECURE_NFS_MODS="noload" # # Don't load sunrpc module. #RPCMTAB="noload" # [root@simcom1 ~]# cat /etc/fstab --start snip-- charlie:/home /usr/local/dev/charlie nfs4 rw,nosuid, 0 0 --end snip-- [brendanmac@simcom1 /usr/local/dev/charlie/brendanmac]# touch file touch: cannot touch 'file': Permission denied [brendanmac@simcom1 /usr/local/dev/charlie/brendanmac]# su Password: [root@simcom1 /usr/local/dev/charlie/brendanmac]# touch file [root@simcom1 /usr/local/dev/charlie/brendanmac]# ls -la file -rw------- 1 root root 0 May 26 10:43 file Thank you for your assistance, Brendan

    Read the article

  • Lag spikes at full CPU usage, lagy mouse, maybe video card

    - by Roberts
    My PC specs: Motherboard Name - Gigabyte GA-945PL-S3 CPU Type - DualCore Intel Core 2 Duo E4300, 1800 MHz (9 x 200) OS - Microsoft Windows 7 Ultimate OS Kernel Type - 32-bit OS Version - 6.1.7601 I bougth a new video card one month ago. GeForce 210. I didn't have any problems. I wanted to overclock it, in other words: "Play with it". So I installed Gigabyte EasyBoost from CD and overclocked the GPU 590 + 110 mhz, memory to max to 960mhz from 800mhz. Benchmarks showed a little bit bigger score. Then I overclocked shader clock from 1405 to [..] (don't remeber really). So I was playing Modern Warfare 2 when off sudden computer froze when I wanted to select team, I was afk before that. I had to reset CMOS. After that I had problems with Skype: unread messages and no sound. Then I figured it out that when ever I open EasyBoost - Skype starts to glitch again. Now I use EVGA Precission X. Now after a month, I cleaned computer and closed the case, it was open all the time. I started to overclock GPU clock only (just a bit) because there was no problems that would stop me. So sometimes on heavy CPU load graphics starts to lag. Dragging a window is painful to watch too. Sometimes the screen freezes for 5 to 10 seconds (I can see that hard disk activity is maximal). You may say that CPU fault it is, isn't it? But sometimes lag spikes starts randomly when CPU load is at maximum. All 3 benchmark softwares (PerformanceTest, NovaBench and MSI Kombustor) shows that performance of my video card has dropped about 25%. BUT! CPU score is lower too. I ignored these problems but when I refreshed Windows Experience Index I was shocked. Month before (in latvian language but not so hard to understand): Now 01.04.2012 (upgraded RAM): This happened when I tried to capture Minecraft with Fraps on underclocked GPU to 580mhz (def: 590mhz): All drivers are up to date. Average CPU temperature from 55°C to 75°C (at 70°C sometimes starts these lag spikes). Video card's tempratures are from 45°C to 60°C (very hard to reach 60°C). So my hope is that the video card is fine, cause this card is very new and I want to upgrade CPU anyways. Aplogies for my mistakes in vocabulary (I am trying to type this as fast I can). Update 02.04.2012 - 7:21 Forgot one thing, my hard disk is extrimly slow and I will upgrade it this week or next week so I will be installing same OS again. I am multi-tasker but I can't do much because of 1.8 GHz CPU and slow hard drive (Model ID - WDC WD800JD-60JRC0). The Windows Experience Index is back to normal. Actually "Spelu grafika" (Gaming graphics) are higher than month ago. During this test mouse was very lagy, but month ago there weren't any problems. WHY!?

    Read the article

  • How to deploy custom MBean to Tomcat?

    - by Christian
    Hi, I'm trying to deploy a custom mbean to a tomcat. This mbean is not part of a webapp. It should be instantiated when tomcat starts. My problem is, I can't find any complete documentation about how to deploy such a mbean. I'm getting different exceptions, depending on my configuration. Has anyone hints, a complete documentation or has implemented a mbean by himself and can post an example? I configured tomcat to read a configuration from his conf directory: <Engine name="Catalina" defaultHost="localhost" mbeansFile="${catalina.base}/conf/mbeans-descriptors.xml"> The content is as follows: <?xml version="1.0"?> <!-- <!DOCTYPE mbeans-descriptors PUBLIC "-//Apache Software Foundation//DTD Model MBeans Configuration File" "http://jakarta.apache.org/commons/dtds/mbeans-descriptors.dtd"> --> <!-- Descriptions of JMX MBeans --> <mbeans-descriptors> <mbean name="Performance" description="Caculate JVM throughput" type="Performance"> <attribute name="throughput" description="calculated throughput (ratio between gc times and uptime of JVM)" type="double" writeable="false"/> </mbean> </mbeans-descriptors> When name in the xml file and class name match, I get this excption: SEVERE: Error creating mbean Performance javax.management.MalformedObjectNameException: Key properties cannot be empty at javax.management.ObjectName.construct(ObjectName.java:467) at javax.management.ObjectName.<init>(ObjectName.java:1403) at org.apache.tomcat.util.modeler.modules.MbeansSource.execute(MbeansSource.java:202) at org.apache.tomcat.util.modeler.modules.MbeansSource.load(MbeansSource.java:137) at org.apache.catalina.core.StandardEngine.readEngineMbeans(StandardEngine.java:517) at org.apache.catalina.core.StandardEngine.init(StandardEngine.java:321) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:411) at org.apache.catalina.core.StandardService.start(StandardService.java:519) at org.apache.catalina.core.StandardServer.start(StandardServer.java:710) at org.apache.catalina.startup.Catalina.start(Catalina.java:581) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:177) When changing the name attribute in the xml file to test.example:type=Performance, I get this exception: SEVERE: Error creating mbean test.example:type=Performance javax.management.NotCompliantMBeanException: MBean class must have public constructor at com.sun.jmx.mbeanserver.Introspector.testCreation(Introspector.java:127) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.createMBean(DefaultMBeanServerInterceptor.java:284) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.createMBean(DefaultMBeanServerInterceptor.java:199) at com.sun.jmx.mbeanserver.JmxMBeanServer.createMBean(JmxMBeanServer.java:393) at org.apache.tomcat.util.modeler.modules.MbeansSource.execute(MbeansSource.java:207) at org.apache.tomcat.util.modeler.modules.MbeansSource.load(MbeansSource.java:137) at org.apache.catalina.core.StandardEngine.readEngineMbeans(StandardEngine.java:517) at org.apache.catalina.core.StandardEngine.init(StandardEngine.java:321) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:411) at org.apache.catalina.core.StandardService.start(StandardService.java:519) at org.apache.catalina.core.StandardServer.start(StandardServer.java:710) at org.apache.catalina.startup.Catalina.start(Catalina.java:581) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:177) The documentation from apache is not really helpful, as it just explains a small part. I'm aware of this question but it doesn't help me. The answer I gave worked just for a short time, after that I got some other exceptions. For additional info, the java interface public interface PerformanceMBean { public double getThroughput(); } and implementing class /* some import statements */ public class Performance implements PerformanceMBean { public double getThroughput() { ... } }

    Read the article

  • What is consuming so much memory?

    - by Christopher
    Hi, I am having a few problems with my server. It is throwing up intermittant errors and running quite slow. Here is the output from top: top - 07:33:33 up 18:57, 1 user, load average: 0.00, 0.00, 0.00 Tasks: 90 total, 1 running, 82 sleeping, 7 stopped, 0 zombie Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 1048576k total, 1048576k used, 0k free, 0k buffers Swap: 0k total, 0k used, 0k free, 0k cached Ordered by %MEM: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 9597 root 16 0 276m 91m 15m S 0.0 8.9 0:29.38 java 9564 tomcat 15 0 249m 34m 11m S 0.0 3.4 0:11.79 java 9636 root 18 0 54804 24m 9784 S 0.0 2.4 0:02.58 httpd 26139 apache 15 0 57520 23m 5996 S 0.0 2.3 0:00.15 httpd 16264 apache 18 0 56984 23m 6104 S 0.0 2.2 0:00.21 httpd 24294 apache 15 0 57512 22m 5864 S 0.0 2.2 0:00.17 httpd 30231 apache 15 0 57272 22m 5748 S 0.0 2.2 0:00.97 httpd 32257 apache 15 0 57512 22m 5416 S 0.0 2.2 0:00.46 httpd 19947 apache 15 0 57512 22m 5320 S 0.0 2.2 0:00.19 httpd 26148 apache 15 0 56688 22m 5992 S 0.0 2.2 0:00.40 httpd 14039 apache 18 0 57000 22m 5492 S 0.0 2.2 0:00.33 httpd 6051 apache 15 0 57736 22m 5128 S 0.0 2.2 0:00.07 httpd 19937 apache 15 0 56992 22m 5400 S 0.0 2.2 0:00.14 httpd 5200 apache 15 0 56984 22m 5376 S 0.0 2.2 0:00.23 httpd 10001 apache 15 0 55636 21m 5636 S 0.0 2.1 0:01.05 httpd 11734 apache 15 0 56712 21m 4548 S 0.0 2.1 0:00.46 httpd 18193 apache 15 0 55100 20m 5508 S 0.0 2.0 0:00.24 httpd 14036 apache 15 0 55128 20m 5412 S 0.0 2.0 0:00.10 httpd 3981 apache 15 0 55128 19m 4860 S 0.0 1.9 0:00.16 httpd 7588 apache 18 0 55112 19m 4848 S 0.0 1.9 0:00.04 httpd 19768 apache 16 0 55112 19m 4844 S 0.0 1.9 0:00.02 httpd 5827 apache 15 0 55112 19m 4828 S 0.0 1.9 0:00.05 httpd 29774 apache 15 0 55112 19m 4544 S 0.0 1.9 0:00.11 httpd 6064 apache 15 0 55112 19m 4536 S 0.0 1.9 0:00.02 httpd 16253 apache 17 0 55116 19m 4532 S 0.0 1.9 0:00.01 httpd 19922 apache 15 0 55112 19m 4540 S 0.0 1.9 0:00.02 httpd 10010 apache 15 0 55100 19m 4524 S 0.0 1.9 0:00.01 httpd 18195 apache 18 0 55104 18m 3872 S 0.0 1.8 0:00.02 httpd 7361 mysql 15 0 134m 18m 6400 S 0.0 1.8 0:10.18 mysqld 19921 apache 15 0 55088 18m 3588 S 0.0 1.8 0:00.02 httpd 11967 apache 15 0 55080 18m 3584 S 0.0 1.8 0:00.00 httpd 13813 apache 15 0 55088 18m 3576 S 0.0 1.8 0:00.14 httpd 23898 apache 18 0 54968 17m 3212 S 0.0 1.7 0:00.00 httpd 13792 apache 15 0 54968 17m 3088 S 0.0 1.7 0:00.00 httpd 14083 apache 15 0 54968 17m 3088 S 0.0 1.7 0:00.00 httpd 32547 apache 15 0 54944 17m 2924 S 0.0 1.7 0:00.00 httpd 13787 apache 15 0 54944 17m 2908 S 0.0 1.7 0:00.00 httpd 3623 apache 17 0 54944 17m 2908 S 0.0 1.7 0:00.00 httpd 16024 apache 19 0 54944 17m 2860 S 0.0 1.7 0:00.00 httpd 13791 apache 15 0 54944 17m 2864 S 0.0 1.7 0:00.00 httpd 20090 named 19 0 110m 4244 2056 S 0.0 0.4 0:01.55 named 9369 cyrus 15 0 15904 3048 1720 S 0.0 0.3 0:00.24 cyrus-master 32735 root 15 0 8852 2888 2116 T 0.0 0.3 0:00.00 mysql The intermittant error I get using Firefox is: Server not found Firefox can't find the server at XXXXXXX.co. * Check the address for typing errors such as ww.example.com instead of www.example.com * If you are unable to load any pages, check your computer's network connection. * If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the Web. And on other browsers, the page just loads for about 10 minutes but never appears. The only way to resolve it is to close the browser completely as the error appears to be saved in the cache. Has anyone got any ideas? Many Thanks.

    Read the article

  • Recommendations for distributed processing/distributed storage systems

    - by Eddie
    At my organization we have a processing and storage system spread across two dozen linux machines that handles over a petabyte of data. The system right now is very ad-hoc; processing automation and data management is handled by a collection of large perl programs on independent machines. I am looking at distributed processing and storage systems to make it easier to maintain, evenly distribute load and data with replication, and grow in disk space and compute power. The system needs to be able to handle millions of files, varying in size between 50 megabytes to 50 gigabytes. Once created, the files will not be appended to, only replaced completely if need be. The files need to be accessible via HTTP for customer download. Right now, processing is automated by perl scripts (that I have complete control over) which call a series of other programs (that I don't have control over because they are closed source) that essentially transforms one data set into another. No data mining happening here. Here is a quick list of things I am looking for: Reliability: These data must be accessible over HTTP about 99% of the time so I need something that does data replication across the cluster. Scalability: I want to be able to add more processing power and storage easily and rebalance the data on across the cluster. Distributed processing: Easy and automatic job scheduling and load balancing that fits with processing workflow I briefly described above. Data location awareness: Not strictly required but desirable. Since data and processing will be on the same set of nodes I would like the job scheduler to schedule jobs on or close to the node that the data is actually on to cut down on network traffic. Here is what I've looked at so far: Storage Management: GlusterFS: Looks really nice and easy to use but doesn't seem to have a way to figure out what node(s) a file actually resides on to supply as a hint to the job scheduler. GPFS: Seems like the gold standard of clustered filesystems. Meets most of my requirements except, like glusterfs, data location awareness. Ceph: Seems way to immature right now. Distributed processing: Sun Grid Engine: I have a lot of experience with this and it's relatively easy to use (once it is configured properly that is). But Oracle got its icy grip around it and it no longer seems very desirable. Both: Hadoop/HDFS: At first glance it looked like hadoop was perfect for my situation. Distributed storage and job scheduling and it was the only thing I found that would give me the data location awareness that I wanted. But I don't like the namename being a single point of failure. Also, I'm not really sure if the MapReduce paradigm fits the type of processing workflow that I have. It seems like you need to write all your software specifically for MapReduce instead of just using Hadoop as a generic job scheduler. OpenStack: I've done some reading on this but I'm having trouble deciding if it fits well with my problem or not. Does anyone have opinions or recommendations for technologies that would fit my problem well? Any suggestions or advise would be greatly appreciated. Thanks!

    Read the article

  • Have I pushed the limits of my current VPS or is there room for optimization?

    - by JRameau
    I am currently on a mediatemple DV server (basic) 512mb dedicated ram, this is a CentOS based VPS with Plesk and Virtuozzo. My experience with it from day 1 has been bad and I only could sooth my server issues with several caching "Band-aids," but my sites are not as small as they were a year ago either so the issues have worsen. I have 3 Drupal installs running on separate (plesk) domains, 1 of those drupal installs is a multisite, that consists of 5-6 sites 2 of those sites are bringing in actual traffic. Those caching "Band-aids" I mentioned are APC, which seemed to help alot initially, and Drupal's Boost, which is considered a poorman's Varnish, it makes all my pages static for anonymous users. Last 30day combined estimate on Google Ananlytics: 90k visitors 260k pageviews. Issue: alot of downtime, I am continually checking if my sites are up, and lately I have been finding it down more than 3 times daily. Restarting Apache will bring it back up, for some time. I have google search every error message and looked up ways to optimize my DV server, and I am beyond stump what is my next move. Is this server bad, have I hit a impossibly low restriction such as the 12mb kernel memory barrier (kmemsize), is it on my end, do I need to optimize some more? *I have provided as much information as I can below, any help or suggestions given will be appreciated Common Error messages I see in the log: [error] (12)Cannot allocate memory: fork: Unable to fork new process [error] make_obcallback: could not import mod_python.apache.\n Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/mod_python/apache.py", line 21, in ? import traceback File "/usr/lib/python2.4/traceback.py", line 3, in ? import linecache ImportError: No module named linecache [error] python_handler: no interpreter callback found. [warn-phpd] mmap cache can't open /var/www/vhosts/***/httpdocs/*** - Too many open files in system (pid ***) [alert] Child 8125 returned a Fatal error... Apache is exiting! [emerg] (43)Identifier removed: couldn't grab the accept mutex [emerg] (22)Invalid argument: couldn't release the accept mutex cat /proc/user_beancounters: Version: 2.5 uid resource held maxheld barrier limit failcnt 41548: kmemsize 4582652 5306699 12288832 13517715 21105036 lockedpages 0 0 600 600 0 privvmpages 38151 42676 229036 249036 0 shmpages 16274 16274 17237 17237 2 dummy 0 0 0 0 0 numproc 43 46 300 300 0 physpages 27260 29528 0 2147483647 0 vmguarpages 0 0 131072 2147483647 0 oomguarpages 27270 29538 131072 2147483647 0 numtcpsock 21 29 300 300 0 numflock 8 8 480 528 0 numpty 1 1 30 30 0 numsiginfo 0 1 1024 1024 0 tcpsndbuf 648440 675272 2867477 4096277 1711499 tcprcvbuf 301620 359716 2867477 4096277 0 othersockbuf 4472 4472 1433738 2662538 0 dgramrcvbuf 0 0 1433738 1433738 0 numothersock 12 12 300 300 0 dcachesize 0 0 2684271 2764800 0 numfile 3447 3496 6300 6300 3872 dummy 0 0 0 0 0 dummy 0 0 0 0 0 dummy 0 0 0 0 0 numiptent 14 14 200 200 0 TOP: (In January the load avg was really high 3-10, I was able to bring it down where it is currently is by giving APC more memory play around with) top - 16:46:07 up 2:13, 1 user, load average: 0.34, 0.20, 0.20 Tasks: 40 total, 2 running, 37 sleeping, 0 stopped, 1 zombie Cpu(s): 0.3% us, 0.1% sy, 0.0% ni, 99.7% id, 0.0% wa, 0.0% hi, 0.0% si Mem: 916144k total, 156668k used, 759476k free, 0k buffers Swap: 0k total, 0k used, 0k free, 0k cached MySQLTuner: (after optimizing every table and repairing any table with overage I got the fragmented count down to 86) [--] Data in MyISAM tables: 285M (Tables: 1105) [!!] Total fragmented tables: 86 [--] Up for: 2h 44m 38s (409K q [41.421 qps], 6K conn, TX: 1B, RX: 174M) [--] Reads / Writes: 79% / 21% [--] Total buffers: 58.0M global + 2.7M per thread (100 max threads) [!!] Query cache prunes per day: 675307 [!!] Temporary tables created on disk: 35% (7K on disk / 20K total)

    Read the article

  • Tomcat dying silently on regular basis

    - by Hendrik
    My tomcat (6.0.32, Java Sun 1.6.0_22-b04 on Ubuntu 10.04) keeps crashing multiple times daily without any specific output in catalina.out. This usually happens on high load (see top output). Update: The pid-file is properly removed when this happens. Update 2: No CATALINA_OPTS set, _JAVA_OPTS are: export _JAVA_OPTIONS="-Xms128m -Xmx1024m -XX:MaxPermSize=512m \ -XX:MinHeapFreeRatio=20 \ -XX:MaxHeapFreeRatio=40 \ -XX:NewSize=10m \ -XX:MaxNewSize=10m \ -XX:SurvivorRatio=6 \ -XX:TargetSurvivorRatio=80 \ -XX:+CMSClassUnloadingEnabled \ -Djava.awt.headless=true \ -Dcom.sun.management.jmxremote \ -Dcom.sun.management.jmxremote.port=37331 \ -Dcom.sun.management.jmxremote.ssl=false \ -Dcom.sun.management.jmxremote.authenticate=true \ -Djava.rmi.server.hostname=(myhostname) \ -Dcom.sun.management.jmxremote.password.file=/etc/java-6-sun/management/jmxremote.password \ -Dcom.sun.management.jmxremote.access.file=/etc/java-6-sun/management/jmxremote.access" Top: top - 12:40:03 up 9 days, 12:15, 3 users, load average: 30.00, 22.39, 21.91 Tasks: 89 total, 4 running, 85 sleeping, 0 stopped, 0 zombie Cpu(s): 53.2%us, 9.7%sy, 0.0%ni, 34.7%id, 1.5%wa, 0.0%hi, 0.8%si, 0.0%st Mem: 4194304k total, 3311304k used, 883000k free, 0k buffers Swap: 4194304k total, 0k used, 4194304k free, 0k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 25850 tomcat6 20 0 1981m 1.2g 11m S 161 29.6 11:41.56 java 12632 mysql 20 0 393m 97m 4452 S 141 2.4 1690:05 mysqld 14932 nobody 20 0 253m 44m 9152 R 56 1.1 3:26.57 php-cgi 7011 nobody 20 0 241m 31m 9124 S 30 0.8 1:35.96 php-cgi 10093 nobody 20 0 228m 18m 8520 S 25 0.5 2:29.97 php-cgi 27071 nobody 20 0 237m 28m 8640 S 11 0.7 3:13.72 php-cgi 3306 nobody 20 0 227m 16m 6736 R 7 0.4 2:29.83 php-cgi 7756 nobody 20 0 261m 58m 15m R 5 1.4 2:22.33 php-cgi 7129 www-data 20 0 3646m 7228 1896 S 2 0.2 0:36.65 nginx 2657 nobody 20 0 228m 18m 8540 S 1 0.5 1:59.51 php-cgi 7131 www-data 20 0 3645m 6464 1960 S 1 0.2 0:34.13 nginx 7140 www-data 20 0 3652m 12m 1896 S 1 0.3 0:35.80 nginx 619 nobody 20 0 231m 29m 15m S 0 0.7 2:33.46 php-cgi 16552 nobody 20 0 250m 41m 8784 S 0 1.0 2:48.12 php-cgi 17134 nobody 20 0 239m 37m 16m S 0 0.9 2:32.86 php-cgi 21004 nobody 20 0 243m 34m 8700 S 0 0.8 1:19.85 php-cgi 26105 root 20 0 19220 1392 1060 R 0 0.0 0:00.82 top 32430 nobody 20 0 256m 47m 9196 S 0 1.2 2:19.01 php-cgi 314 nobody 20 0 256m 47m 8804 S 0 1.1 1:46.00 php-cgi 2111 nobody 20 0 253m 44m 9196 S 0 1.1 3:01.14 php-cgi 2142 root 20 0 26452 2564 868 S 0 0.1 0:00.56 screen 2144 root 20 0 19484 2012 1368 S 0 0.0 0:00.00 bash 2333 nobody 20 0 249m 41m 9160 S 0 1.0 1:10.33 php-cgi 2552 root 20 0 19484 2260 1620 S 0 0.1 0:00.01 bash 2587 nobody 20 0 258m 49m 9192 S 0 1.2 2:04.50 php-cgi 2684 root 20 0 4092 652 540 S 0 0.0 0:00.00 xvfb-run 2696 root 20 0 60720 13m 2352 S 0 0.3 0:09.12 Xvfb 2759 root 20 0 617m 12m 4676 S 0 0.3 0:00.66 node 3514 nobody 20 0 270m 61m 9216 S 0 1.5 3:13.69 php-cgi 5270 root 20 0 25164 1324 1036 S 0 0.0 0:00.01 screen 5402 nobody 20 0 227m 16m 8032 S 0 0.4 1:33.61 php-cgi 5765 root 20 0 81180 3820 3028 S 0 0.1 0:00.31 sshd 5798 nobody 20 0 242m 32m 9124 S 0 0.8 1:52.08 php-cgi 5856 root 20 0 19496 2292 1636 S 0 0.1 0:00.03 bash 6442 root 20 0 62332 20m 1960 S 0 0.5 0:30.58 mrtg 7082 root 20 0 88992 1916 1636 S 0 0.0 0:00.00 PassengerWatchd I can't find any concrete reason for it, no Exceptions or messages of a shutdown in catalina.out (and no other logs in tomcat's log dir). I can start up the service and it will run for a few days or just minutes before dying again. Is there somewhere else i could look for output? Could the kernel start killing threads due to a lack of ressources and by that bring the VM down?

    Read the article

  • Memcached Lagging

    - by Brad Dwyer
    Let me preface this by saying that this is a followup question to this topic. That was "solved" by switching from Solaris (SmartOS) to Ubuntu for the memcached server. Now we've multiplied load by about 5x and are running into problems again. We are running a site that is doing about 1000 requests/minute, each request hits Memcached with approximately 3 reads and 1 write. So load is approximately 65 requests per second. Total data in the cache is about 37M, and each key contains a very small amount of data (a JSON-encoded array of integers amounting to less than 1K). We have setup a benchmarking script on these pages and fed the data into StatsD for logging. The problem is that there are spikes where Memcached takes a very long time to respond. These do not appear to correlate with spikes in traffic. What could be causing these spikes? Why would memcached take over a second to reply? We just booted up a second server to put in the pool and it didn't make any noticeable difference in the frequency or severity of the spikes. This is the output of getStats() on the servers: Array ( [-----------] => Array ( [pid] => 1364 [uptime] => 3715684 [threads] => 4 [time] => 1336596719 [pointer_size] => 64 [rusage_user_seconds] => 7924 [rusage_user_microseconds] => 170000 [rusage_system_seconds] => 187214 [rusage_system_microseconds] => 190000 [curr_items] => 12578 [total_items] => 53516300 [limit_maxbytes] => 943718400 [curr_connections] => 14 [total_connections] => 72550117 [connection_structures] => 165 [bytes] => 2616068 [cmd_get] => 450388258 [cmd_set] => 53493365 [get_hits] => 450388258 [get_misses] => 2244297 [evictions] => 0 [bytes_read] => 2138744916 [bytes_written] => 745275216 [version] => 1.4.2 ) [-----------:11211] => Array ( [pid] => 8099 [uptime] => 4687 [threads] => 4 [time] => 1336596719 [pointer_size] => 64 [rusage_user_seconds] => 7 [rusage_user_microseconds] => 170000 [rusage_system_seconds] => 290 [rusage_system_microseconds] => 990000 [curr_items] => 2384 [total_items] => 225964 [limit_maxbytes] => 943718400 [curr_connections] => 7 [total_connections] => 588097 [connection_structures] => 91 [bytes] => 562641 [cmd_get] => 1012562 [cmd_set] => 225778 [get_hits] => 1012562 [get_misses] => 125161 [evictions] => 0 [bytes_read] => 91270698 [bytes_written] => 350071516 [version] => 1.4.2 ) ) Edit: Here is the result of a set and retrieve of 10,000 values. Normal: Stored 10000 values in 5.6118 seconds. Average: 0.0006 High: 0.1958 Low: 0.0003 Fetched 10000 values in 5.1215 seconds. Average: 0.0005 High: 0.0141 Low: 0.0003 When Spiking: Stored 10000 values in 16.5074 seconds. Average: 0.0017 High: 0.9288 Low: 0.0003 Fetched 10000 values in 19.8771 seconds. Average: 0.0020 High: 0.9478 Low: 0.0003

    Read the article

  • Intermittent 404 on select assets, LAMP stack

    - by Tom Lagier
    We have a LAMP stack WordPress server that is serving most assets correctly. However, one plugin's CSS file and several images are returning soft 404s roughly 20% of the time. I can't find any reference to the 404 in the access logs, but the browser is definitely receiving a 404 response from somewhere (WordPress, I would assume). When I use an alias URL that does not match the site URL but does resolve to the asset path, the resource loads correctly 100% of the time. However, using the site url only resolves for the select, problematic assets 20% of the time. You can test one of the problematic assets here: http://www.mreco.org/wp-content/uploads/2014/05/zero-cost.jpg However the alias link always resolves correctly: http://mr-eco.wordpress.promocampaigns.com/wp-content/uploads/2014/05/zero-cost.jpg Stranger, if I attempt to access outdated content that definitely does not exist on the server, at the live URL it returns the content roughly 50% of the time. Using the alias link, it 404s 100% of the time - the correct behavior. Error log and PHP error log are clean. A sample access log (pulled from grep 'zero-cost.jpg' /var/log/httpd/mr-eco-access_log) from several refreshes of the live direct link (where I am not seeing any 404's): 10.166.202.202 - - [28/May/2014:20:27:41 +0000] "GET /wp-content/uploads/2014/05/zero-cost.jpg HTTP/1.1" 304 - 10.166.202.202 - - [28/May/2014:20:27:42 +0000] "GET /wp-content/uploads/2014/05/zero-cost.jpg HTTP/1.1" 304 - 10.166.202.202 - - [28/May/2014:20:27:43 +0000] "GET /wp-content/uploads/2014/05/zero-cost.jpg HTTP/1.1" 304 - 10.166.202.202 - - [28/May/2014:20:27:43 +0000] "GET /wp-content/uploads/2014/05/zero-cost.jpg HTTP/1.1" 304 - 10.176.201.37 - - [28/May/2014:20:27:56 +0000] "GET /wp-content/uploads/2014/05/zero-cost.jpg HTTP/1.1" 200 57027 Chrome's dev tools list the following network activity before displaying 404 page content: zero-cost.jpg /wp-content/uploads/2014/05 GET 404 Not Found text/html Other 15.9?KB 73.2?KB 953?ms 947?ms My Apache configuration is standard, I've listed the virtual host entry and .htaccess file below. I can provide other parts of Apache config if necessary. Virtual host: <VirtualHost *:80> DocumentRoot /var/www/public_html/mr-eco.wordpress.promocampaigns.com ServerName www.mreco.org ServerAlias mreco.org mr-eco.wordpress.promocampaigns.com ErrorLog logs/mr-eco-error_log CustomLog logs/mr-eco-access_log common <Directory /var/www/public_html/mr-eco.wordpress.promocampaigns.com> AllowOverride All SetOutputFilter DEFLATE </Directory> </VirtualHost> .htaccess: # BEGIN WordPress <IfModule mod_rewrite.c> RewriteEngine On RewriteBase / RewriteRule ^index\.php$ - [L] RewriteCond %{REQUEST_FILENAME} !-f RewriteCond %{REQUEST_FILENAME} !-d RewriteRule . /index.php [L] </IfModule> # END WordPress I have checked for multiple A records and can confirm that there is a single A record pointing at the domain: ;; ANSWER SECTION: mreco.org. 60 IN A 50.18.58.174 I'm fairly new to systems administration, and at a complete loss as to what could cause this. In the past, inconsistently 404ing assets have been because of out-of-sync instances behind a load balancer. In this case, it is a single instance behind the load balancer. Because of the inconsistency, it feels like a caching issue. We don't make use of Apache caching, and as far as I know WordPress should not be caching either. What I've done so far: Reset WordPress permalinks Disabled WordPress plugins Re-generated WordPress .htaccess file Swapped ServerName and ServerAlias directives Cleared browser cache Confirmed disk location of resources Checked PHP, access, and error logs Confirmed correct DNS setup (can post if necessary) I'm at a total loss. Thanks for helping me out!

    Read the article

  • NIC Bonding/balance-rr with Dell PowerConnect 5324

    - by Branden Martin
    I'm trying to get NIC bonding to work with balance-rr so that three NIC ports are combined, so that instead of getting 1 Gbps we get 3 Gbps. We are doing this on two servers connected to the same switch. However, we're only getting the speed of one physical link. We are using 1 Dell PowerConnect 5324, SW version 2.0.1.3, Boot version 1.0.2.02, HW version 00.00.02. Both servers are CentOS 5.9 (Final) running OnApp Hypervisor (CloudBoot) Server 1 is using ports g5-g7 in port-channel 1. Server 2 is using ports g9-g11 in port-channel 2. Switch show interface status Port Type Duplex Speed Neg ctrl State Pressure Mode -------- ------------ ------ ----- -------- ---- ----------- -------- ------- g1 1G-Copper -- -- -- -- Down -- -- g2 1G-Copper Full 1000 Enabled Off Up Disabled Off g3 1G-Copper -- -- -- -- Down -- -- g4 1G-Copper -- -- -- -- Down -- -- g5 1G-Copper Full 1000 Enabled Off Up Disabled Off g6 1G-Copper Full 1000 Enabled Off Up Disabled Off g7 1G-Copper Full 1000 Enabled Off Up Disabled On g8 1G-Copper Full 1000 Enabled Off Up Disabled Off g9 1G-Copper Full 1000 Enabled Off Up Disabled On g10 1G-Copper Full 1000 Enabled Off Up Disabled On g11 1G-Copper Full 1000 Enabled Off Up Disabled Off g12 1G-Copper Full 1000 Enabled Off Up Disabled On g13 1G-Copper -- -- -- -- Down -- -- g14 1G-Copper -- -- -- -- Down -- -- g15 1G-Copper -- -- -- -- Down -- -- g16 1G-Copper -- -- -- -- Down -- -- g17 1G-Copper -- -- -- -- Down -- -- g18 1G-Copper -- -- -- -- Down -- -- g19 1G-Copper -- -- -- -- Down -- -- g20 1G-Copper -- -- -- -- Down -- -- g21 1G-Combo-C -- -- -- -- Down -- -- g22 1G-Combo-C -- -- -- -- Down -- -- g23 1G-Combo-C -- -- -- -- Down -- -- g24 1G-Combo-C Full 100 Enabled Off Up Disabled On Flow Link Ch Type Duplex Speed Neg control State -------- ------- ------ ----- -------- ------- ----------- ch1 1G Full 1000 Enabled Off Up ch2 1G Full 1000 Enabled Off Up ch3 -- -- -- -- -- Not Present ch4 -- -- -- -- -- Not Present ch5 -- -- -- -- -- Not Present ch6 -- -- -- -- -- Not Present ch7 -- -- -- -- -- Not Present ch8 -- -- -- -- -- Not Present Server 1: cat /etc/sysconfig/network-scripts/ifcfg-eth3 DEVICE=eth3 HWADDR=00:1b:21:ac:d5:55 USERCTL=no BOOTPROTO=none ONBOOT=yes MASTER=onappstorebond SLAVE=yes cat /etc/sysconfig/network-scripts/ifcfg-eth4 DEVICE=eth4 HWADDR=68:05:ca:18:28:ae USERCTL=no BOOTPROTO=none ONBOOT=yes MASTER=onappstorebond SLAVE=yes cat /etc/sysconfig/network-scripts/ifcfg-eth5 DEVICE=eth5 HWADDR=68:05:ca:18:28:af USERCTL=no BOOTPROTO=none ONBOOT=yes MASTER=onappstorebond SLAVE=yes cat /etc/sysconfig/network-scripts/ifcfg-onappstorebond DEVICE=onappstorebond IPADDR=10.200.52.1 NETMASK=255.255.0.0 GATEWAY=10.200.2.254 NETWORK=10.200.0.0 USERCTL=no BOOTPROTO=none ONBOOT=yes cat /proc/net/bonding/onappstorebond Ethernet Channel Bonding Driver: v3.4.0-1 (October 7, 2008) Bonding Mode: load balancing (round-robin) MII Status: up MII Polling Interval (ms): 100 Up Delay (ms): 0 Down Delay (ms): 0 Slave Interface: eth3 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 0 Permanent HW addr: 00:1b:21:ac:d5:55 Slave Interface: eth4 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 0 Permanent HW addr: 68:05:ca:18:28:ae Slave Interface: eth5 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 0 Permanent HW addr: 68:05:ca:18:28:af Server 2: cat /etc/sysconfig/network-scripts/ifcfg-eth3 DEVICE=eth3 HWADDR=00:1b:21:ac:d5:a7 USERCTL=no BOOTPROTO=none ONBOOT=yes MASTER=onappstorebond SLAVE=yes cat /etc/sysconfig/network-scripts/ifcfg-eth4 DEVICE=eth4 HWADDR=68:05:ca:18:30:30 USERCTL=no BOOTPROTO=none ONBOOT=yes MASTER=onappstorebond SLAVE=yes cat /etc/sysconfig/network-scripts/ifcfg-eth5 DEVICE=eth5 HWADDR=68:05:ca:18:30:31 USERCTL=no BOOTPROTO=none ONBOOT=yes MASTER=onappstorebond SLAVE=yes cat /etc/sysconfig/network-scripts/ifcfg-onappstorebond DEVICE=onappstorebond IPADDR=10.200.53.1 NETMASK=255.255.0.0 GATEWAY=10.200.3.254 NETWORK=10.200.0.0 USERCTL=no BOOTPROTO=none ONBOOT=yes cat /proc/net/bonding/onappstorebond Ethernet Channel Bonding Driver: v3.4.0-1 (October 7, 2008) Bonding Mode: load balancing (round-robin) MII Status: up MII Polling Interval (ms): 100 Up Delay (ms): 0 Down Delay (ms): 0 Slave Interface: eth3 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 0 Permanent HW addr: 00:1b:21:ac:d5:a7 Slave Interface: eth4 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 0 Permanent HW addr: 68:05:ca:18:30:30 Slave Interface: eth5 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 0 Permanent HW addr: 68:05:ca:18:30:31 Here are the results of iperf. ------------------------------------------------------------ Client connecting to 10.200.52.1, TCP port 5001 TCP window size: 27.7 KByte (default) ------------------------------------------------------------ [ 3] local 10.200.3.254 port 53766 connected with 10.200.52.1 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0-10.0 sec 950 MBytes 794 Mbits/sec

    Read the article

  • http request via iptables --to-destination ip redirect results in no response

    - by Wouter Vegter
    I have two Ubuntu servers with each having their own ip addresses. Let's call them server1 and server2, having respectively ip 1.1.1.1 and 2.2.2.2 I have a nginx running on server2. The sole purpose I want server1 to have is to redirect all incoming http (so port 80) requests to server2 without clients noticing that their request is being redirected. I tried the following command on server1: iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 2.2.2.2 But when I enter 1.1.1.1 in my browser I get no respond: the page keeps trying to load without giving any message or error message (I get a time-out after 2-3 mins). But when I do remove the above iptables rule I immediately do get a "page not found error" when I enter 1.1.1.1 in my browser; so something is working but not as it should: when I enter 1.1.1.1 I want the html page to load that is hosted on 2.2.2.2 Because when i enter 2.2.2.2 in my browser I do see the webpage loaded. Could anyone please help me with this? I am searching quite some time (on severfault & Google) on this now so that's why I ask. Many thanks for reading my question! Update: Thank you all for you information. Unfortunately I still get no response I have the following iptables configuration: root@ip-10-48-238-216:/home/ubuntu# sudo iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination root@ip-10-48-238-216:/home/ubuntu# sudo iptables -t nat -L Chain PREROUTING (policy ACCEPT) target prot opt source destination DNAT tcp -- anywhere anywhere tcp dpt:www to:2.2.2.2 Chain OUTPUT (policy ACCEPT) target prot opt source destination Chain POSTROUTING (policy ACCEPT) target prot opt source destination When i run tcpdump and do request via chrome to 1.1.1.1 i get the following root@ip-10-48-238-216:/home/ubuntu# sudo tcpdump -i eth0 port 80 -vv tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes 13:56:18.346625 IP (tos 0x0, ttl 52, id 12055, offset 0, flags [DF], proto TCP (6), length 60) 212-123-161-112.ip.telfort.nl.16386 ip-10-48-238-216.eu-west-1.compute.internal.www: Flags [S], cksum 0xb398 (correct), seq 2639758575, win 5840, options [mss 1460,sackOK,TS val 1223672 ecr 0,nop,wscale 6], length 0 13:56:18.346662 IP (tos 0x0, ttl 51, id 12055, offset 0, flags [DF], proto TCP (6), length 60) 212-123-161-112.ip.telfort.nl.16386 ww1dc1.shopreme.com.www: Flags [S], cksum 0x9ee0 (correct), seq 2639758575, win 5840, options [mss 1460,sackOK,TS val 1223672 ecr 0,nop,wscale 6], length 0 13:56:18.598747 IP (tos 0x0, ttl 52, id 10138, offset 0, flags [DF], proto TCP (6), length 60) 212-123-161-112.ip.telfort.nl.16387 ip-10-48-238-216.eu-west-1.compute.internal.www: Flags [S], cksum 0xac40 (correct), seq 2645658541, win 5840, options [mss 1460,sackOK,TS val 1223735 ecr 0,nop,wscale 6], length 0 13:56:18.598777 IP (tos 0x0, ttl 51, id 10138, offset 0, flags [DF], proto TCP (6), length 60) 212-123-161-112.ip.telfort.nl.16387 ww1dc1.shopreme.com.www: Flags [S], cksum 0x9788 (correct), seq 2645658541, win 5840, options [mss 1460,sackOK,TS val 1223735 ecr 0,nop,wscale 6], length 0 ^C 4 packets captured 4 packets received by filter 0 packets dropped by kernel the mentioned address relate to the following 212-123-161-112.ip.telfort.nl.16386 : my personal computer ww1dc1.shopreme.com.www : dns of server2 (2.2.2.2) ip-10-48-238-216.eu-west-1.compute.internal.www : amazon web services ec2 internal address of server1 (1.1.1.1) However, the tcpdump log on server2 (2.2.2.2) stays empty and I get no response back in my browser. I am able to ping from server1 to server2. And net.ipv4.ip_forward is set to 1 and so is /proc/sys/net/ipv4/ip_forward Could there be anything else that is missing?

    Read the article

< Previous Page | 628 629 630 631 632 633 634 635 636 637 638 639  | Next Page >