Search Results

Search found 17517 results on 701 pages for 'keyword search'.

Page 212/701 | < Previous Page | 208 209 210 211 212 213 214 215 216 217 218 219 | Next Page >

MySQL cross table regular expression match

- by Josef Sábl

I have a web application and I am working on engine that analyzes referals. Now I have table with pageviews along with referes that looks something like this: pv_id referer ------------------------------------------------------------ 5531854534 http://www.google.com/search?ie=UTF-8... 8161876343 http://google.cn/search?search=human+rights 8468434831 http://search.yahoo.com/search;_... The second table contains sources definitions like: source regex ------------------------------------------------------------ Google ^https?:\/\/[^\/]*google\.([a-z]{2,4})(\/.*|)$ Yahoo ^https?:\/\/[^\/]*yahoo\.com(\/.*|)$ What I want is third table created by joinin these two: pv_id source ------------------------------------------------------------ 5531854534 Google 8161876343 Google 8468434831 Yahoo How to join these tables with regular expression?

Read the article
Avoid multiple autocomplete calls by wrapping it with SetTimeOut

- by pixelboy

Here's my issue : using an autocomplete jQuery plugin, I'd like to avoid multiple ajax requests when user strikes his keynoard by surrounding the $('#query1').autocomplete({ serviceUrl:'/actions/autocomplete?population=salon', minChars:3, maxHeight:300, width:200, clearCache:true, onSelect: function(suggestions,data){ $(".btn1").attr("href", "${pageContext.request.contextPath}/actions/espaceClients?participantId=" + data) } }); with something like var search = false; $('#query1, #query2, #query3').keyup(function(){ if (!search){ search = true; } if (search) { search = false; autocompleteThem(); } }); A you can see, above code is stupid, but it kinda shows what i'm trying to do. In simple words, if user dosen't type anything else in a certain period of time, then you can call autocomplete. I hope i'm being clear, as my brains are a mess...

Read the article
Codeigniter form action with slashes instead of normal GETs?

- by Ethan

Hey, so this is one of those questions that seems obvious, and I'm probably going to feel stupid, but here goes: I'm doing a CodeIgniter site with a search. Think of a Google type input, where you'd search for "white huskies." I have a search results page that takes a URI (MySite.com/dogs/white huskies), and takes the third part, and performs the search on that term. I'd like this to be done in the URI, and no by POST so my users can bookmark results. The problem I'm having is how to get that search button directed to Mysite.com/dogs/WHATEVER IS IN THE INPUT. How do I get the what is in the input part into the anchor href? I know I could do this with javascript, but I've heard it's bad practice to force people to have javascript for things this small. Thanks for the help!

Read the article
Django adminsite customize search_fields query

- by dArignac

Howdy! In the django admin you can set the search_fields for the ModelAdmin to be able to search over the properties given there. My model class has a property that is not a real model property, means it is not within the database table. The property relates to another database table that is not tied to the current model through relations. But I want to be able to search over it, so I have to somehow customize the query the admin site creates to do the filtering when the search field was filled - is this possible and if, how? I can query the database table of my custom property and it then returns the ids of the model classes fitting the search. This then, as I said, has to flow into the admin site search query. Thanks!

Read the article
How to use Zend_Cache Identifier ?

- by ArneRie

Hi Folks, i think iam getting crazy, iam trying to implement Zend_Cache to cache my database query. I know how it works and how to configure. But i cant find a good way to set the Identifier for the cache entrys. I have an method wich search for records in my database (based on an array with search values). /** * Find Record(s) * Returns one record, or array with objects * * @param array $search Search columns => value * @param integer $limit Limit results * @return array One record , or array with objects */ public function find(array $search, $limit = null) { $identifier = 'NoIdea'; if (!($data = $this->_cache->load($identifier))) { // fetch // save to cache with $identifier.. } But what kind of identifier can use in this situation?

Read the article
Does watir's browser.text.include? count text inside invisible divs? If so, how to search only for v

- by karlthorwald

Does watir's browser.text.include? count text inside invisible divs? If so, how to search only for visible text? I put all the instructions into the html from the beginning and use jQuery to hide and unhide the relevant parts. How can I use watir's waiter to wait for only text that is visible? My problem is, that the waiter always returns true, even before I have shown a certain text.

Read the article
Android on TabHost little more depth of the problem, there is the hard return key on the treatment.

- by user365723

Hello everybody, I have a question on the Android Activity, for example I have a TabHost, and there are included four Activities, the first tab is a search Activity, enter a keyword in the current result of this Activity to return, and in the current Activity display. Is called to display the search results themselves. And after searching several times, and then return to key mobile phone keypad, the display is the result of the last search keyword, I want the press back key to return to the last call of the Activity or TabHost. Should I do? By the way, in a tab in the use of Intent calls a Activity, eg: host.addTab (host.newTabSpec ("friend"). setIndicator ("search") . SetContent (new Intent (this, Search.class))); In this Activity in the need to call another Activity, e.g: startActivity (new Intent (this, Other.class)); Also called another Activity displayed on this tab, but not yet jump out of the show. I ask how you can achieve this?

Read the article
Zend router problem

- by Fluffy

Hi, folks. I have some problems with zend routes I have shops controller. It has 3 actions(for now): index - lists all of shops using paginator(so I have /shops/?page=2) show - shows concrete shop (show/Apple+store) search - shows search form So now I need to make routing for that. I have followin routes 'shop', new Zend_Controller_Router_Route ('/shops/:title',array('controller' = 'shops', 'action' = 'show'),array('title' = '/^(?!search$).+$/')) 'search_shops',new Zend_Controller_Router_Route_Static ('/shop/search',array('controller' = 'shops', 'action' = 'show')) but when i try to go /shops/Apple+store it says, that there is no Apple store action. If I ommit regexp part on shop route, I can't go to search. What am I doing wrong?

Read the article
JMS message received at only one server

- by BJH

I'm having a problem with a JEE6 application running in a clustered environment using WebSphere ApplicationServer 8. A search index is used for quick search in the UI (using Lucene), which must be re-indexed after new data arrived in the corresponding DB layer. To achieve this we're sending a JMS message to the application, then the search index will be refreshed. The problem is, that the messages only arrives at one of the cluster members. So only there the search index is up to date. At the other servers it remains outdated. How can I achieve that the search index gets updated at all cluster members? Can I receive the message somehow on all servers? Or is there a better way to do this?

Read the article
How do you customize a url for a form with Asp MVC?

- by Maudite

I am adding a search box to a Asp Mvc. This is html for the form: @using (Html.BeginForm("Query", "Search", FormMethod.Get)) { <input type="text" name="q" /> <input type="submit" value="Seach" /> } and I added this route routes.MapRoute("Search", "q={query}", new { controller = "Search", action = "Query" }); I would like the form to generate a url that looks like http://localhost:####/q=value in textbox. Is it possible to change the way MVC generates the url? This is currently what I get: http://localhost:50916/Search/Query?q=value in textbox

Read the article
RSS feed without items

- by Jan Hancic

I have webpage where I have a search page. I provide a "dynamic" RSS feed for search so that a user can subscribe to search results for any search term he likes. So I was wondering what is the standard (or best practise) way to do if that search term returns 0 results which means I have no "items" to put in the feed. Do I just return an empty feed (only including the meta data and no item elements). Or should I put some special item element in the feed with some "no results" text? edit: YouTube returns a feed without any item elements. If no-one can answer me I'll take it that this is the right way of doing it since I can't find any info elsewhere :)

Read the article
How to hide button value in url?

- by Levani

This is my search form: <form action="" method="get" name="search"> <input name="s" type="text" size="40" value="<?php echo $_GET["s"]; ?>" /> <input name="submit" type="submit" value="Search" /> </form> When someone clicks the search button the url in browser's address bar is something like this: http://example.com/?s=someting&submit=Search How can I change it so that it only displays: http://example.com/?s=someting Hope I'm clear...

Read the article
UIsearch rearrange the indexpath.row

- by abuyousif

i have a problem that i have spend about a week trying to solve it, but no luke up to now. i have a uisearchbar implmented into my table view. and i also have two nsarray, one for tilte and one for discription. when i search through the array of the titles it returns the rights search, but when i click on a row that the search came with, i get "row 0" if i click on the first row. my question is how to make a connection between the two arrays so when the search rearrange the titles based on the user search the discription array correspond to the same row the title is at. i appreciate any help.

Read the article
backgroundworkers or threadpools

- by vbNewbie

I am trying to create an app that allows multiple search requests to occur whilst maintaining use of the user interface to allow interaction. For the multiple search requests, I initially only had one search request running with user interaction still capable by using a backgroundworker to do this. Now I need to extend these functions by allowing more search functions and basically just queueing them up. I am not sure whether to use multiple backgroundworkers or use the threadpool since I want to be able to know the progress of each search job at any time.

Read the article
focus() jQuery function doesn't work in Safari, but works fine on all other browsers?

- by pMan

I have a search text field and search button, when button is clicked with default text in text field, or null value, an alert pops up and sets focus back on search text field. This works very well on all major browsers but not in safari. I tried it even with out jquery, but didn't work. When the focus falls on search text field, I have another jQuery function, is that the problem. The code that sets focus on search text is: if (defaults.keyword == SEARCH_TIP || defaults.keyword == '') { alert(SEARCH_NULL); $('#store_search_keyword').focus(); return false; } The code on focus is: var search_dom = $('#store_search_keyword'); var search_text = search_dom.val(); search_dom.focus(function(){ if ($(this).val() === SEARCH_TIP) { $(this).val(''); } }); any help is appreciated, thanks..

Read the article
ruby on rails named scopes (searching)

- by houlahan

I have a named scope (name) combination of first and last name and I'm wanting to use this in a search box. I have the code below: named_scope :full_name, lambda { |fn| {:joins => :actor, :conditions => ['first_name LIKE ? OR second_name LIKE ?', "%#{fn}%", "%#{fn}%"]} } def self.search(search) if search self.find(:all, :conditions => [ 'full_name LIKE ?', "%#{search}%"]) else find(:all) end end but this doesn't work as it gives the following error: SQLite3::SQLException: no such column: full_name: SELECT * FROM "actors" WHERE (full_name LIKE '%eli dooley%') Thanks in advance Houlahan

Read the article
My ASP.NET news sources

- by Jon Galloway

I just posted about the ASP.NET Daily Community Spotlight. I was going to list a bunch of my news sources at the end, but figured this deserves a separate post. I've been following a lot of development blogs for a long time - for a while I subscribed to over 1500 feeds and read them all. That doesn't scale very well, though, and it's really time consuming. Since the community spotlight requires an interesting ASP.NET post every day of the year, I've come up with a few sources of ASP.NET news. Top Link Blogs Chris Alcock's The Morning Brew is a must-read blog which highlights each day's best blog posts across the .NET community. He covers the entire Microsoft development, but generally any of the top ASP.NET posts I see either have already been listed on The Morning Brew or will be there soon. Elijah Manor posts a lot of great content, which is available in his Twitter feed at @elijahmanor, on his Delicious feed, and on a dedicated website - Web Dev Tweets. While not 100% ASP.NET focused, I've been appreciating Joe Stagner's Weekly Links series, partly since he includes a lot of links that don't show up on my other lists. Twitter Over the past few years, I've been getting more and more of my information from my Twitter network (as opposed to RSS or other means). Twitter is as good as your network, so if getting good information off Twitter sounds crazy, you're probably not following the right people. I already mentioned Elijah Manor (@elijahmanor). I follow over a thousand people on Twitter, so I'm not going to try to pick and choose a list, but one good way to get started building out a Twitter network is to follow active Twitter users on the ASP.NET team at Microsoft: @scottgu (well, not on the ASP.NET team, but their great grand boss, and always a great source of ASP.NET info) @shanselman @haacked @bradwilson @davidfowl @InfinitiesLoop @davidebbo @marcind @DamianEdwards @stevensanderson @bleroy @humancompiler @osbornm @anurse I'm sure I'm missing a few, and I'll update the list. Building a Twitter network that follows topics you're interested in allows you to use other tools like Cadmus to automatically summarize top content by leveraging the collective input of many users. Twitter Search with Topsy You can search Twitter for hashtags (like #aspnet, #aspnetmvc, and #webmatrix) to get a raw view of what people are talking about on Twitter. Twitter's search is pretty poor; I prefer Topsy. Here's an example search for the #aspnetmvc hashtag: http://topsy.com/s?q=%23aspnetmvc You can also do combined queries for several tags: http://topsy.com/s?q=%23aspnetmvc+OR+%23aspnet+OR+%23webmatrix Paper.li Paper.li is a handy service that builds a custom daily newspaper based on your social network. They've turned a lot of people off by automatically tweeting "The SuperDevFoo Daily is out!!!" messages (which can be turned off), but if you're ignoring them because of those message, you're missing out on a handy, free service. My paper.li page includes content across a lot of interests, including ASP.NET: http://paper.li/jongalloway When I want to drill into a specific tag, though, I'll just look at the Paper.li post for that hashtag. For example, here's the #aspnetmvc paper.li page: http://paper.li/tag/aspnetmvc Delicious I mentioned previously that I use Delicious for managing site links. I also use their network and search features. The tag based search is pretty good: Even better, though, is that I can see who's bookmarked these links, and add them to my Delicious network. After having built out a network, I can optimize by doing less searching and more leaching leveraging of collective intelligence. Community Sites I scan DotNetKicks, the weblogs.asp.net combined feed, and the ASP.NET Community page, CodeBetter, Los Techies, CodeProject, and DotNetSlackers from time to time. They're hit and miss, but they do offer more of an opportunity for finding original content which others may have missed. Terms of Enrampagement When someone's on a tear, I just manually check their sites more often. I could use RSS for that, but it changes pretty often. I just keep a mental note of people who are cranking out a lot of good content and check their sites more often. What works for you?

Read the article
Bash - how can i pass variable to command and read its output

- by Skuja

In my bash script i want to execute command that will find lines in a file that starts with a keyword which is stored in a variable like this: keyword="key" result=(`cat path/to/my/file | grep '^$keyword'`)

Read the article
How to move complete SharePoint Server 2007 from one box to another

- by DipeshBhanani

It was time of my first onsite client assignment on SharePoint. Client had one server production environment. They wanted to upgrade the topology with completely new SharePoint Farm of three servers. So, the task was to move whole MOSS 2007 stuff to the new server environment without impacting data. The last three scary words “… without impacting data…” were actually putting pressure on my head. Moreover SSP was required to move because additional information has been added for users apart from AD import. I thought I had to do only backup and restore. It appeared pretty easy at first thought. Just because of these damn scary words, I thought to check out on internet for guidance related to this scenario. I couldn’t get anything except general guidance of moving server on Microsoft TechNet site. I promised myself for starting blogs with this post if I would be successful in this task. Well, I took long time to write this but finally made it. I hope it will be useful to all guys looking for SharePoint server movement. Before beginning restoration, make sure that, there is no difference in versions of SharePoint at source and destination server. Also check whether the state of SharePoint Installation at the time of backup and restore is same or not. (E.g. SharePoint related service packs and patches if any) The main tasks of the server movement are as follow: Backup all the databases Install and configure SharePoint on new environment Deploy all solution (WSP Files) globally to destination server- for installing features attached to the solutions Install all the custom features Deploy/Copy custom pages/files which are added to the “12Hive” folder later Restore SSP Restore My Site Restore other web application Tasks 3 to 5 are for making sure that we have configured the environment well enough for the web application to be restored successfully. The main and complex task was restoring SSP. I have started restoring SSP through Central Admin. After a while, the restoration status was updated to “unsuccessful”. “Damn it, what went wrong?” I thought looking at the error detail down the page. I couldn’t remember the error message but I had corrected and restored it again. Actually once you fail restoring SSP, until and unless you don’t clean all related stuff well, your restoration will be failed again and again. I wanted to find the actual reason. So cleaned, restored, cleaned, restored… I had tried almost 5-6 times and finally, I succeeded. I had realized how pleasant it is, to see the word “Successful” on the screen. Without wasting your much time to read, let me write all the detailed steps of restoring SSP: Delete the SSP through following STSADM command. stsadm -o deletessp -title <SSP name> -deletedatabases -force e.g.: stsadm -o deletessp -title SharedServices1 -deletedatabases –force Check and delete the web application associated with SSP if it exists. Remove Link from Check and remove “Alternate Access Mapping” associated with SSP if it exists. Check and delete IIS site as well as application pool associated with SSP if it exists. Stop following services: · Office SharePoint Server Search · Windows SharePoint Services Search · Windows SharePoint Services Help Search Delete all the databases associated/related to SSP from SQL Server. Reset IIS. Start again following services: · Office SharePoint Server Search · Windows SharePoint Services Search · Windows SharePoint Services Help Search Restore the new SSP. After the SSP restoration, all other stuffs had completed very smoothly without any more issues. I did few modifications to sites for change of server name and finally, the new environment was ready.

Read the article
Dropped impression 25 days after restructure

- by Hamid

Our website is a non English property related website (moshaver.com) which is similar to rightmove.co.uk. On September 2012 our website was adversely affected by Panda causing our Google incoming clicks to drop from around 3000 clicks to less than a thousand. We were hoping that Google will eventually realize that we are not a spam website and things will get better. However, in August 2013 we were almost sure that we needed to do something, so we started to restructure our web content. We used the canonical tag to remove our search results and point to our listing pages, using the noindex tag to remove it from our listing pages which does not have any properties at the moment. We also changed title tags to more friendly ones, in addition to other changes. Our changes were effective on 10th August. As shown in the graph taken from Google Analytics Search Engine Optimization section, these changes has resulted in an increase in the number of times Google displayed our results in its search results. Our impressions almost doubled starting 15th August. However, as the graph shows, our CTR dropped from this date from around 15% to 8%. This might have been because of our changed title tags (so people were less likely to click on them), or it might be normal for increased impressions. This situation has continued up until 10th September, when our impressions decreased dramatically to less than a thousand. This is almost 30% of our original impressions (before website restructure) and 15% of the new impressions. At the same time our impressions has increased dramatically to around 50%. I have two theories for this increase. The first one is that these statistics are less accurate for lower impressions. The second one is that Google is now only displaying our results for queries directly related to our website (our name, our url), and not for general terms, such as "apartments in a specific city". The second theory also explains the dramatic decrease in impression as well. After digging the analytic data a little more, I constructed the following table. It displays the breakdown of our impressions, clicks and ctr in different Google products (web and image) and in total. What I understand from this table is that, most of our increased impressions after restructure were on the image search section. I don't think users of search would be looking for content in our website. Furthermore, it shows that the drop in our web search ctr, is as dramatic of the overall ctr (-30% in compare to -60%) . I thought posting it here might help you understand the situation better. Is it possible that Google has tested our new structure for 25 days, and then decided to decrease our impressions because of the the new low CTR? Or should we look for another factor? If this is the case, how long does it usually take for Google to give us another chance? It has been one month since our impressions has dropped.

Read the article
Get the Picture: Pinterest for Marketers

- by Mike Stiles

When trying to determine on which networks to conduct social marketing, the usual suspects immediately rise to the top; Facebook & Twitter, then LinkedIn (especially if you’re B2B), then maybe some Google Plus to hedge SEO bets. So at what juncture do brands get excited about Pinterest? Pinterest has been easy for marketers to de-prioritize thanks to the perception its usage is so dominated by women. Um, what’s wrong with that? Women make an estimated 85% of all consumer purchases. So if there are indeed over 30 million US women active on it monthly, and they do 92% of the pinning, and 84% are still active on it after 4 years, when did an audience of highly engaged, very likely sales conversions become low priority? Okay, if you’re a tech B2B SaaS product like the Oracle Social Cloud, Pinterest may not be where you focus. But if you operate in the top Pinterest categories, which are truly far-reaching, it’s time to take note of Pinterest’s performance to date: 40.1 million monthly users in the US (eMarketer). Over 30 billion pins, half of which were pinned in the last 6 months. (Big momentum) 75% of usage is on their mobile app. (In solid shape for the mobile migration) Pinterest sharing grew 58% in 2013, beating Facebook, Twitter, or LinkedIn. (ShareThis) Pinterest is the 3rd most popular sharing platform overall (over email), with 48% of all sharing on tablets. Users referred by Pinterest are 10% more likely to buy on e-commerce sites and tend to spend twice that of users coming from Facebook. (Shopify) To be fair, brands haven’t had any paid marketing opportunities on that platform…until recently. Users are seeing Promoted Pins in both category and search feeds from rollout brands like Gap, ABC Family, Ziploc, and Nestle. Are the paid pins annoying users? It seems more so than other social networks, they’re fitting right in to the intended user experience and being accepted, getting almost as many click-throughs as user pins. New York Magazine’s Kevin Roose laid it out succinctly; Pinterest offers a place that’s image-centric, search-friendly, makes things easy to purchase, makes things easy to share, and puts users in an aspirational mood to buy. Pinterest is very confident in the value of that combo and that audience, with CPM rates 5x that of the most expensive Facebook ad, plus (at least for now) required spending commitments and required pin review by Pinterest for quality. The latest developments; a continued move toward search and discovery with enhancements like Guided Search to help you hone in on what interests you, Custom Categories, and the rumored Visual Search that stands to be a liberation from text. And most recently, Pinterest has opened up its API so brands can get access to deeper insights into the best search terms and categories in which to play ball, as well as what kinds of pins stand to perform best in those areas. As we learned in our rundown this week of Social Media Examiner’s Social Media Marketing Industry Report, around 50% of marketers specifically intend on upping their use of Pinterest. If you’re a big believer in fishing where the fish are, that’s probably an efficient position to take. @mikestiles @oraclesocialPhoto: Adam Lambert_Gorwyn, freeimages.com

Read the article
How to add a variable into a grep command

- by twigg

I'm running the following grep command var=`grep -n "keyword" /var/www/test/testfile.txt` This work just as expected but I need to insert the file name dynamically from a loop like so: var=`grep -n "keyword" /var/www/test/`basename ${hd[$i]}`.txt` But obviously the use of ` brakes this with a unexpected EOF while looking for matching ``' and unexpected end of file Any ideas of away around this?

Read the article
Using R to Analyze G1GC Log Files

- by user12620111

Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=|| Using R to Analyze G1GC Log Files Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z\(\),]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\\( " , " " ); gsub ( "\ \)" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

Read the article
Chrome Equivalent of %s address bar trick in Firefox

- by notbrain

I was curious if there was an equivalent technique in Chrome to do address bar param string replacement like you can do in Firefox. If you create a bookmark and put a %s in the bookmark URL/address part, and set a keyword for the bookmark, you can do things like URL: http://php.net/%s Keyword: php Type in browser: php fopen End up at: http://php.net/fopen Is this making its way into Chrome or is there a way to do it?

Read the article
How do I traverse the filesystem looking for a regex match?

- by editor

I know this is teeball for veteran sysadmins, but I'm looking to search a directory tree for file contents that match a regex (here, the word "Keyword"). I've gotten that far, but now I'm having trouble ignoring files in a hidden (.svn) file tree. Here's what I'm working with: find . -exec grep "Keyword" '{}' \; -print Reading sites via search I know that I need to negate the name flag, but I can't it working in the right order.

Read the article

< Previous Page | 208 209 210 211 212 213 214 215 216 217 218 219 | Next Page >