Search Results

Search found 15266 results on 611 pages for 'young people'.

Page 12/611 | < Previous Page | 8 9 10 11 12 13 14 15 16 17 18 19  | Next Page >

  • Where can I find video resources of people programming?

    - by Corey
    This might be a strange question. I'm looking for videos of people actively coding something while explaining it. However, I don't want is a beginner video that delves into what variables and objects are. Nick Gravelyn's tile engine tutorial is a great example of what I'm looking for. (He actually used to host the full, unbroken video files in his site's archive, but I guess he took them down...) I tend to learn best by "action" examples; it's difficult for me to learn by reading through documentation and text tutorials, but if I see somebody actively doing a task, I can immediately register it and apply it myself. I'm hard-of-hearing, so I would really prefer that if the video has a lot of talking, it have captioning or subtitling of some sort, or at the very least, a transcript. The tile engine videos did not have captions, but the code he was writing was very self-documenting, so I understood it for the most part. I've gone through most of the relevant GoogleDevelopers and GoogleTechTalks videos on Youtube, so those need not apply. Are there any resources out there, or even websites dedicated to this kind of thing?

    Read the article

  • How do I (quickly) let people know that software I am providing for free is not abandon-ware?

    - by blueberryfields
    As an independent, individual programmer: How do I let people very quickly know that I have not abandoned the software I've written and given away for free? That I am putting in the effort required to maintain and support my software to a professional level? When software written by one or two developers is available for free, or marked as open-source, usually the default assumption is that it's abandon-ware. This is usually a safe assumption - check out the answers to this question if you doubt it: Why do programmers write applications and then make them free?. There are lots of programmers who provide free and/or open-source tools which are not abandon-ware, though. If we're talking about large companies, ie Google, there's no real problem telling the difference between supported, live tools and software, and those which are abandoned or discontinued. A lively git repository isn't quick - users will have to be savvy enough to understand the repository and know where to look for it. Consistent marketing and community management take more time and effort than I can put in on my own. Also, if my software becomes popular/successful, I assume those will grow on their own, and be supported by power users in the community.

    Read the article

  • Project freezed - what should I leave to the people after me?

    - by Maistora
    So the project I've been working on is now going to be freezed for unknown period of time. May be when the project unfreezes it won't be assigned to me or anybody of the current team. Actually we did also inherit the project after it had been freezed but there was nothing left by the team before us to help us understand even the basic needs of the project, so plenty of time passed by until we got to know the project well. My question is what do you think we should do to help people after us to best understand the needs of the project, what we have done, why we've done it, etc. I am open to other ideas of why should we leave some tracks to the others that will work on this project also. Some steps we already have taken: technical documentation (not full but at least there is some); source-control system history; estimations on which parts of the project need improvement and why we think so; bunch of unit tests. What do you think of what we've already prepared and what else could we do?

    Read the article

  • Project frozen - what should I leave to the people after me?

    - by Maistora
    So the project I've been working on is now going to be frozen indefinitely. It is possible that if and when the project unfreezes again, it won't be assigned to me or anybody from the current team. Actually, we inherited the project after it had been frozen before, but there was nothing left by the prior team to help us understand even the basic needs of the project, so we wasted a lot of time getting to know the project well. My question is what do you think we should do to help the people after us to best understand the needs of the project, what we have done, why we've done it, etc. I am open to other ideas of why should we leave some tracks to the others that will work on this project also. Some steps we already have taken: technical documentation (not full but at least there is some); source-control system history; estimations on which parts of the project need improvement and why we think so; bunch of unit tests. issue tracker with all the tickets we've done (EDIT) What do you think of what we've already prepared and what else can we do?

    Read the article

  • Why most people change from being a contractor to full time at my companies, but not the other way around?

    - by ????
    I have seen most people changed from being a contractor to being a full time employee, but not the other way around. And that happened in startups that had maybe 20% chance of IPO or being acquired, and another that had maybe a 50% chance. As far as I know, the rate (even for a 3 year experience graphics designer, or a programmer), can be $75 to $80 an hour. While a programmer with 15 years of experience may get $120,000 per year. So, the programmer with 15 years of programming experience earns $10,000 per month. At the same time, the programmer with 3 year of experience or the graphics designer will get $14,000 per month ($80 x 22 days x 8 hours). I know I have to buy my own insurance, but I can't imagine buying those for $4,000 each month... maybe $200, $300 at most. I probably need to pay Social Security (FICA) both way (myself and as self-employed = $210 x 2), but still, each month there will be extra $3,000 of income. So is the above calculation correct? But most often, I do see contractors wanting or becoming full time, but not full time employees becoming a contractor. Does somebody know what the reason is?

    Read the article

  • How can I distribute x cakes amongst y people?

    - by Rupert
    I have an associative array with the ids of x cakes and another associative array with the ids of y people. I want to ensure that each cake is enjoyed by exactly 2 people and that each person gets a fair share of cake overall. However, cakes must be kept whole (i.e. if the average cake per person is a fraction, this fraction will be rounded up for some, and down for others). No person can be assigned the same cake twice. For example: $cake = array('id'=>'1','id'=>'2') $people = array('id'=>'1','id'=>'2','id'=>'3') In order to do so, I wish to create a new array where each row represents a cake-person assignment. As each cake is being assigned to two people, the number of rows in this table should be exactly twice the number of cakes. There will not be exactly 1 solution to this problem but a solution to the above example would be: $cake_person = array( '1'=>array('cake_id'=>'1', 'person_id'=>'1'), '2'=>array('cake_id'=>'1', 'person_id'=>'2'), '3'=>array('cake_id'=>'2', 'person_id'=>'2'), '4'=>array('cake_id'=>'2', 'person_id'=>'3'), ) Notice that people 1 and 3 are losing out but that is because there is no more cake to go around! Each cake must be given exactly twice. How can I generate such a solution reliably for larger numbers of people and cakes?

    Read the article

  • How to explain bad software to non-technical people?

    - by mtutty
    In discussing software development with non-technical people (customers, business owners, project sponsors, etc.), I often resort to analogies and metaphors. It's relatively easy and effective to use a "house" or other metaphor for describing the size and complexity of new development. However, we often inherit someone else's code or data, and this approach doesn't seem to hold up as well when trying to explain why we're gutting something that already seems to work. Of course we can point to cycle time and cost to be saved in the future but this generally means nothing to business folks. I know doctors can say "just take this pill," but I'm not sure that software devs have the same authority. Ideas? EDIT: Let me add a bit to the discussion. The specific project I'm talking about has customers that don't realize (or care) about specific aspects of the system we're retiring (i.e., they think it was just fine): The system would save a NEW RECORD every time someone updated a field The system contained tables for reference data. These tables had new records added every day, even though they were duplicates of previous records. And there was no way to tie the reference data used for a particular case at the time it was closed. This is like 99% of the data in the old system. The field NAMES also have spaces, apostrophes and other inappropriate characters in them, making everything harder to work with. In addition to the incredible amount of duplicate data, they have around 1000 XLS files with data they want added to the system. Previously, they would do a spreadsheet for each case in the database, IN ADDITION TO what they typed into the database. Getting rid of this old, unneeded information and piping in the XLS data comprises about 80% of the total project effort, and was not something we could accurately predict. I'm trying to find a concrete way to describe how bad this thing was, mostly so that the customer will understand why the migration process has been so time-consuming. The actual coding was done pretty quickly and the new system works fine, but without the old data they won't be happy. Sorry to get into the weeds, but most of the answers I've seen so far are pretty basic scope/schedule/cost things. I've been doing this for 15 years, so this really is more of a reflective, philosophical question - but without some of the details it can be difficult to really appreciate the awful beauty of this problem.

    Read the article

  • Why is my SharePoint people search WorkEmail property blank?

    - by Nat
    I have an SSP setup for my site and I am trying to get the presence bubble working correctly. However, I cannot get the people search core results webpart to display the workemail. I have output the raw xml into my people search core results xslt and used the SharePoint Query Web Service Test Tool to try and find values for these properties, but they are appearing blank (including sipAddress and HighConfidenceDisplayProperty11). Note: The presence bubble does work when hard coded to users email address, so the problem is absolutely to do with the search results.

    Read the article

  • code duplication in sql case statements

    - by NS
    Hi I'm trying to output something like the following but am finding that there is a lot of code duplication going on. | australian_has_itch | kiwi_has_itch | | yes | no | | no | n/a | | n/a | no | ... My query looks like this with two case statements that do the same thing but flip the country (my real query has 5 of these case statements): SELECT CASE WHEN NOT EXISTS ( SELECT person_id FROM people_with_skin WHERE people_with_skin.person_id = people.person_id AND people.country = "Australia" ) THEN 'N/A' WHEN EXISTS ( SELECT person_id FROM itch_none_to_report WHERE people.country = "Australia" AND person_id = people.person_id ) THEN 'None to report' WHEN EXISTS ( SELECT person_id FROM itchy_people WHERE people.country = "Australia" AND person_id = people.person_id ) THEN 'Yes' ELSE 'No' END australian_has_itch, CASE WHEN NOT EXISTS ( SELECT person_id FROM people_with_skin WHERE people_with_skin.person_id = people.person_id AND people.country = "NZ" ) THEN 'N/A' WHEN EXISTS ( SELECT person_id FROM itch_none_to_report WHERE people.country = "NZ" AND person_id = people.person_id ) THEN 'None to report' WHEN EXISTS ( SELECT person_id FROM itchy_people WHERE people.country = "NZ" AND person_id = people.person_id ) THEN 'Yes' ELSE 'No' END kiwi_has_itch, FROM people Is there a way for me to condense this somehow and not have so much code duplication? Thanks!

    Read the article

  • How can I set up a 404 error page when people access http://ftp.mydomain.com?

    - by Tim B.
    I am a freelance videographer/developer, and part of my job involves transferring large files over FTP to production houses/television stations. While the majority of people in my industry understand the difference between FTP and HTTP, I've experienced several interactions in the past couple months of people who still open Internet Explorer and try to access http://ftp.mydomain.com, receive an error page served by HostGator, and tell me that they cannot access my FTP server. Instead of spending time delivering instructions via e-mail, I'd much prefer to serve up a custom error page in this instance that instructs them how to download and use an FTP client. I tried setting up a sub-domain in Cpanel hoping I could simply drop in an .htaccess file with the error page, but I got this error: ftp.mydomain.com domainadmin-domainexistsglobal I also tried creating a custom error page in PHP which reads the site URL and serves up the custom content only when http://ftp.mydomain.com is accessed. Unfortunately, the error page works for every subdomain except that one. I'm not entirely sure this is even technically possible, which is why I bring it to the good people of StackOverflow to help. Thanks!

    Read the article

  • What tools do people use to make programming tutorial videos?

    - by Pure.Krome
    Hi folks, I'm wanting to make some yee-run-o-the-mill tutorial video's about some programming concepts and stuff i've been doing. Nothing special ... lots of peeps been doing it. What tools are people using to record and edit these videos? What resolutions / fonts / sizes do people generally use/set? The only tool I've had experience with is Camtasia - and i didn't mind it. But i've seen vid's (or live demo's) where people zoom in to code sections.. how do they do that? For final editing, do most people just do some simple power point presentation with some video snippets mashed in. cheers!

    Read the article

  • How do people get around the Carmack's Reverse patent?

    - by Rei Miyasaka
    Apparently, Creative has a patent on Carmack's Reverse, and they successfully forced Id to modify their techniques for the source drop, as well as to include EAX in Doom 3. But Carmack's Reverse is discussed quite often and apparently it's a good choice for deferred shading, so it's presumably used in a lot of other high-budget productions too. Even though it's unlikely that Creative would go after smaller companies, I'm wondering how the bigger studios get around this problem. Do they just cross their fingers and hope Creative doesn't troll them, or do they just not use Carmack's Reverse at all?

    Read the article

  • Do people in non-English-speaking countries code in English?

    - by Damovisa
    With over 100 answers to this question it's highly likely that your answer has already been posted. Please don't post an answer unless you have something new to say I've heard it said (by coworkers) that everyone "codes in English" regardless of where they're from. I find that difficult to believe, however I wouldn't be surprised if, for most programming languages, the supported character set is relatively narrow. Have you ever worked in a country where English is not the primary language? If so, what did their code look like? Edit: Code samples would be great, by the way...

    Read the article

  • Security of logging people in automatically from another app?

    - by Simon
    I have 2 apps. They both have accounts, and each account has users. These apps are going to share the same users and accounts and they will always be in sync. I want to be able to login automatically from one app to the other. So my solution is to generate a login_key, for example: 2sa7439e-a570-ac21-a2ao-z1qia9ca6g25 once a day. And provide a automated login link to the other app... for example if the user clicks on: https://account_name.securityhole.io/login/2sa7439e-a570-ac21-a2ao-z1qia9ca6g25/user/123 They are logged in automatically, session created. So here we have 3 things that a intruder has to get right in order to gain access; account name, login key, and the user id. Bad idea? Or should I can down the path of making one app an oauth provider? Or is there a better way?

    Read the article

  • 2011 - ALMs for your development team and the people they work with.

    - by David V. Corbin
    Welcome to 2011, it is already shaping up to be a very exciting year. The title of the post is not about charitable giving, although that is also a great topic. Application Lifecycle Management and the Systems that support the environment is, and 2011 will be a year where I expect many teams to invest heavily in this area. For those not familiar with ALM, it can be simplified down to "A comprehensive view of all of the iteas, requirements, activities and artifacts that impact an application over the course of its lifecycle, from concept until decommissioning". Obviously, this encompases a large number of different areas even for relatively small and medium sized projects. In recent years, many teams have adapted methodoligies which address individual aspects of this; but the majority of this adoption has resulted in "islands of improvement" rather than the desired comprehensive outcome...Until now! Last year Microsoft released Team Foundation Server 2010 along with Visual Studio 2010 Ultimate Edition, and with these two in combination the situation has drastically changed. At last there is a single environment that is capable of handling all aspects of ALM, and is also capable of dealing with migration and integration with existing systems to make the transition to a single solution much easier. Thse possibilities (and practicalities) are nothing short of amazing, Architecture thru Testing integration? YES. Being able to correlate specific requirement items (and their history) to actual code (and code history)? YES. Identification of which tests will be potentially impacted by a given code change? YES. Resiliant Automated Testing of User Interfaces? YES. Automatic Deployment Management? YES. Integraton Level testing as part of (designated) Builds? YES. I could easily double or triple the above list, but these items should be enough to get you thinking about the "pain points" your team and organization currently face and the fact that there IS a way to relieve the pain. Over the course of the year, I am hoping to bring together some of the "best of breed" information, along with hosting (and participating in) discussions with various experts in the field. There are already a number of groups (including many on LinkedIn) that have an ALM focus, and I encourage everyone out to check them out. I will be posting a list of the ones I find most helpful in the not too distant future. As I said at the beginning, 2011 is shaping up to be a very interesting (and productive) year. Why wait to start investigating and adopting ALM? ps: For those interested in becoming an "Alms Giver" in the charitable sense, I highly recommend checking out GiveCamp. A group of developers, designers and others get together to create a solution for a charity in just under 48 hours. I will be attending the GiveCamp in New York City on Jan 14-16, more information is available at nycgivecamp.org/

    Read the article

  • What are some good Software Engineering books for people who didn't formally study Computer Science or Software Engineering?

    - by Kugathasan Abimaran
    I'm a graduate in the electronic & telecommunication field, but working in a software company. I want to continue in this field and going for Masters in it. Can you recommend me some of the best books on software engineering, which cover almost all the topics in software engineering. I am not looking for books about coding practices such as Code Complete, Pragmatic Programmer, but rather general software engineering references.

    Read the article

  • Would it be a good idea to work on letting people add arrays of numbers in javascript?

    - by OneThreeSeven
    I am a very mathematically oriented programmer, and I happen to be doing a lot of java script these days. I am really disappointed in the math aspects of javascript: the Math object is almost a joke because it has so few methods you can't use ^ for exponentiation the + operator is very limited, you cant add array's of numbers or do scalar multiplication on arrays Now I have written some pretty basic extensions to the Math object and have considered writing a library of advanced Math features, amazingly there doesn't seem to be any sort of standard library already out even for calculus, although there is one for vectors and matricies I was able find. The notation for working with vectors and matricies is really bad when you can't use the + operator on arrays, and you cant do scalar multiplication. For example, here is a hideous expression for subtracting two vectors, A - B: Math.vectorAddition(A,Math.scalarMultiplication(-1,B)); I have been looking for some kind of open-source project to contribute to for awhile, and even though my C++ is a bit rusty I would very much like to get into the code for V8 engine and extend the + operator to work on arrays, to get scalar multiplication to work, and possibly to get the ^ operator to work for exponentiation. These things would greatly enhance the utility of any mathematical javascript framework. I really don't know how to get involved in something like the V8 engine other than download the code and start working on it. Of course I'm afraid that since V8 is chrome specific, that without browser cross-compatibility a fundamental change of this type is likely to be rejected for V8. I was hoping someone could either tell me why this is a bad idea, or else give me some pointers about how to proceed at this point to get some kind of approval to add these features. Thanks!

    Read the article

  • Using R to Analyze G1GC Log Files

    - by user12620111
    Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=||   Using R to Analyze G1GC Log Files   Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z\(\),]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\\( " , " " ); gsub ( "\ \)" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period:     par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight.   plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

    Read the article

  • Web development for people who mainly do client side..

    - by kamziro
    Okay, I'm sure there are a lot of us that has plenty of experience developing c++/opengl/objective C on the iPhone, java development on android, python games, etc (any client side stuff) while having little to no experience on web-based development. So what skillset should one learn in order to be able to work on web projects, say, to make a facebook clone (I kid), or maybe a startup that specializes on connecting random fashionistas with pics etc. I actualy do have some experience with C#/VB.net back-end development a while back, but as part of a team, I had a lot of support from the senior devs. Is C# considered a decent web development language?

    Read the article

  • What is the value of the Cloudera Hadoop Certification for people new to the IT industry?

    - by Saumitra
    I am a software developer with 8 months of experience in the IT industry, currently working on the development of tools for BIG DATA analytics. I have learned Hadoop basics on my own and I am pretty comfortable with writing MapReduce Jobs, PIG, HIVE, Flume and other related projects. I am thinking of taking the exam for the Cloudera Hadoop Certification. Will this certification add value, considering that I have less than 1 year of experience? Many of the jobs I've seen relating to Hadoop require at least 3 years of experience. Should I invest more time in learning Hadoop and improving my skills to take this certification?

    Read the article

  • How can I introduce QA and break it into parts for various people?

    - by Michael Durrant
    I was recently asked how to do this, specifically: How to introduce QA into an organization? How to break up QA into parts that others can do ? How to prioritize what needs QA? How to determine what to buy? code? etc. The organization uses Rails on Rails extensively as the development platform Note: I am posting a lengthy answer myself but I will also upvote additional good answers (and probably incorporate them into my answer).

    Read the article

  • What actions to take when people leave the team?

    - by finrod
    Recently one of our key engineers resigned. This engineer has co-authored a major component of our application. We are not hitting Truck number yet though, but we're getting close :) Before the guy waltzes off, we want to take actions necessary to recover from this loss as smoothly as possible and eventually 'grow' the rest of the team to competently cover the parts he authored. More about the context: the domain the component covers and the code are no rocket science but still a lot of non-trivial stuff. Some team members can already cover a lot of this but those have a lot on their plates and we want to make sure every. (as I see it): Improve tests and test coverage - especially for the non-trivial stuff, Update high level documents, Document any 'funny stuff' the code does (we had to do some heavy duct-taping), Add / update code documentation - have everything with 'public' visibility documented. Finally the questions: What do you think are the actions to take in this situation? What have you done in such situations? What did or did not work well for you?

    Read the article

  • Are small amounts of functional programming understandable by non-FP people?

    - by kd35a
    Case: I'm working at a company, writing an application in Python that is handling a lot of data in arrays. I'm the only developer of this program at the moment, but it will probably be used/modified/extended in the future (1-3 years) by some other programmer, at this moment unknown to me. I will probably not be there directly to help then, but maybe give some support via email if I have time for it. So, as a developer who has learned functional programming (Haskell), I tend to solve, for example, filtering like this: filtered = filter(lambda item: included(item.time, dur), measures) The rest of the code is OO, it's just some small cases where I want to solve it like this, because it is much simpler and more beautiful according to me. Question: Is it OK today to write code like this? How does a developer that hasn't written/learned FP react to code like this? Is it readable? Modifiable? Should I write documentation like explaining to a child what the line does? # Filter out the items from measures for which included(item.time, dur) != True I have asked my boss, and he just says "FP is black magic, but if it works and is the most efficient solution, then it's OK to use it." What is your opinion on this? As a non-FP programmer, how do you react to the code? Is the code "googable" so you can understand what it does? I would love feedback on this :) Edit: I marked phant0m's post as answer, because he gives good advice on how to write the code in a more readable way, and still keep the advantages. But I would also like to recommend superM's post because of his viewpoint as a non-FP programmer.

    Read the article

< Previous Page | 8 9 10 11 12 13 14 15 16 17 18 19  | Next Page >