Search Results

Search found 832 results on 34 pages for 'plot'.


  • I want to make a 2D color plot showing stress magnitude (S) at every location (x, y) based on continuous color change using limited data sets

    - by Alex Liu
    friends, I have to trouble you as I couldn't find a solution after trying for a long time. I have 3 columns of data: x, y, and the stress value (S) at every point (x, y). I want to generate a 2D color plot displaying continuous color change with the magnitude of the stress (S). The stress values range from -3*10^4 Pa to 4*10^4 Pa. I only have hundreds of data points for the area, but I want to see the stress magnitude (read from the color) at every location (x, y). What Matlab command should I use? Thank you very much!
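
    One way to approach this, sketched below in Python/matplotlib rather than Matlab (in Matlab the analogous route is griddata followed by pcolor or contourf): interpolate the scattered (x, y, S) samples onto a regular grid and render the grid as a continuous color map. The array contents here are made-up placeholders for the asker's three data columns.

      import numpy as np
      import matplotlib.pyplot as plt
      from scipy.interpolate import griddata

      # x, y, s stand in for the three data columns (placeholder values here)
      x = np.random.uniform(0.0, 1.0, 300)
      y = np.random.uniform(0.0, 1.0, 300)
      s = np.random.uniform(-3e4, 4e4, 300)

      # Interpolate the scattered samples onto a 200x200 regular grid
      xi = np.linspace(x.min(), x.max(), 200)
      yi = np.linspace(y.min(), y.max(), 200)
      Xi, Yi = np.meshgrid(xi, yi)
      Si = griddata((x, y), s, (Xi, Yi), method='cubic')

      # Continuous color map of the stress magnitude
      plt.pcolormesh(Xi, Yi, Si, shading='auto')
      plt.colorbar(label='Stress (Pa)')
      plt.show()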


  • How can I plot iterated coordinate (x,y) data with each label drawn only once in gnuplot?

    - by Mojiiz Alamode
    I have a set of data like this 0 268 195 1 353 199 2 318 209 3 268 232 4 370 238 5 326 253 6 246 265 7 372 284 8 313 290 9 258 297 0 268 196 1 353 199 2 318 209 3 268 233 4 370 238 5 325 253 6 246 265 7 372 284 8 313 290 9 258 297 I would like to use the first column as the label and the second and third for the (x,y) plot; however, I would like to plot each label only once, not on every iteration. How should I do this? Thank you for the help.
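
    Since each label appears once in the first ten rows, one gnuplot-side option may be to draw points for everything but labels only for those rows, e.g. plot 'data' u 2:3 w points, '' every ::0::9 u 2:3:1 w labels offset 1,1. Outside gnuplot, a minimal matplotlib sketch of the same idea (the rows list is a placeholder subset of the asker's data):

      import matplotlib.pyplot as plt

      # (label, x, y) rows; labels repeat on every iteration
      rows = [(0, 268, 195), (1, 353, 199), (2, 318, 209),
              (0, 268, 196), (1, 353, 199), (2, 318, 209)]

      seen = set()
      for label, x, y in rows:
          plt.plot(x, y, 'ko')
          if label not in seen:   # annotate each label only the first time
              plt.annotate(str(label), (x, y),
                           textcoords='offset points', xytext=(5, 5))
              seen.add(label)
      plt.show()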


  • Are there mapping utilities out there that will let me import geo position data (lat/long) and plot the points on a map?

    - by GregH
    I have a data file with a bunch of lat/long positions. Is there any mapping software out there (Google Maps, etc.) that will allow me to import the positions from the file and plot them on a map? I would bet this can be done through Google Maps, but I'm not sure how to do it. I just want something I can use quickly with a minimal amount of programming. I don't need to annotate anything, just view where the points are on the map. I'm just wondering if there is something already available out there to import into Google Maps.
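
    Google My Maps can import a CSV of lat/long pairs directly, with no programming at all. If a few lines of scripting are acceptable, the folium Python library writes a self-contained HTML map; a minimal sketch, assuming a hypothetical positions.csv with one lat,lon pair per line:

      import csv
      import folium

      # Read the lat/lon pairs from the data file
      with open('positions.csv') as f:
          points = [(float(lat), float(lon)) for lat, lon in csv.reader(f)]

      # Center on the first point and drop one marker per row
      m = folium.Map(location=points[0], zoom_start=10)
      for lat, lon in points:
          folium.Marker([lat, lon]).add_to(m)
      m.save('map.html')   # open the result in any browser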


  • In gnuplot, how to plot with lines but skip missing data points?

    - by Anna
    I've got a value associated with each day, as such: 120530 70.1 120531 69.0 120601 69.2 120602 69.5 # and so on for 200 lines When plotting this data in gnuplot with lines, the data points are nicely connected. Unfortunately, in places over a week of data points can be missing. Gnuplot draws long lines over these intervals. How can I make gnuplot only connect points on consecutive days? Solutions that require preprocessing of the data are fine, as I already smooth it with a script. Here is what I use: set xdata time set timefmt "%y%m%d" plot "vikt_ma.txt" using 1:2 with lines title "first line", \\ "" using 1:3 with lines title "second line"
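
    gnuplot breaks a "with lines" plot at a blank record in the data file, so since preprocessing is acceptable, one fix is a small filter that inserts a blank line wherever consecutive samples are more than a day apart. A sketch in Python (vikt_ma_gapped.txt is a made-up output name; point the plot command at it afterwards):

      from datetime import datetime, timedelta

      # Copy the data file, inserting a blank record at every gap > 1 day;
      # gnuplot starts a new line segment after each blank line.
      prev = None
      with open('vikt_ma.txt') as src, open('vikt_ma_gapped.txt', 'w') as dst:
          for line in src:
              day = datetime.strptime(line.split()[0], '%y%m%d')
              if prev is not None and day - prev > timedelta(days=1):
                  dst.write('\n')
              dst.write(line)
              prev = day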


  • How to plot a line graph from a file in Java?

    - by kiran
    I have a directory containing a list of files. Those files hold lists of x and y values, ordered line by line. My question is just that I would like to read those files one by one and plot line graphs based on those values. Could you please help me with that?


  • Plotting Tweets from DB in Ruby, grouping by hour.

    - by plotti
    Hey guys, I've got a couple of issues with my code. I suspect that I am plotting the results very inefficiently, since the grouping by hour takes ages. The DB is very simple: it contains the tweets, created date and username. It is fed by the Twitter gardenhose. Thanks for your help! require 'rubygems' require 'sequel' require 'gnuplot' DB = Sequel.sqlite("volcano.sqlite") tweets = DB[:tweets] def get_values(keyword,tweets) my_tweets = tweets.filter(:text.like("%#{keyword}%")) r = Hash.new start = my_tweets.first[:created_at] my_tweets.each do |t| hour = ((t[:created_at]-start)/3600).round r[hour] == nil ? r[hour] = 1 : r[hour] += 1 end x = [] y = [] r.sort.each do |e| x << e[0] y << e[1] end [x,y] end keywords = ["iceland", "island", "vulkan", "volcano"] values = {} keywords.each do |k| values[k] = get_values(k,tweets) end Gnuplot.open do |gp| Gnuplot::Plot.new(gp) do |plot| plot.terminal "png" plot.output "volcano.png" plot.data = [] values.each do |k,v| plot.data << Gnuplot::DataSet.new([v[0],v[1]]){ |ds| ds.with = "linespoints" ds.title = k } end end end
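
    The expensive part is pulling every matching row into Ruby and bucketing it there; pushing the hourly grouping into the database is usually far faster. A sketch of that idea using Python's sqlite3 against the same database (assuming created_at is stored in a text format SQLite's strftime understands; the equivalent GROUP BY works from Sequel too):

      import sqlite3

      conn = sqlite3.connect('volcano.sqlite')
      # Let SQLite bucket the tweets by hour instead of looping client-side
      rows = conn.execute(
          "SELECT strftime('%Y-%m-%d %H', created_at) AS hour, COUNT(*) "
          "FROM tweets WHERE text LIKE ? GROUP BY hour ORDER BY hour",
          ('%volcano%',)).fetchall()

      for hour, count in rows:
          print(hour, count)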


  • How can I draw a log-normalized imshow plot with a colorbar representing the raw data in matplotlib

    - by Adam Fraser
    I'm using matplotlib to plot log-normalized images but I would like the original raw image data to be represented in the colorbar rather than the [0-1] interval. I get the feeling there's a more matplotlib'y way of doing this by using some sort of normalization object and not transforming the data beforehand... in any case, there could be negative values in the raw image. import matplotlib.pyplot as plt import numpy as np def log_transform(im): '''returns log(image) scaled to the interval [0,1]''' try: (min, max) = (im[im > 0].min(), im.max()) if (max > min) and (max > 0): return (np.log(im.clip(min, max)) - np.log(min)) / (np.log(max) - np.log(min)) except: pass return im a = np.ones((100,100)) for i in range(100): a[i] = i f = plt.figure() ax = f.add_subplot(111) res = ax.imshow(log_transform(a)) # the colorbar drawn shows [0-1], but I want to see [0-99] cb = f.colorbar(res) I've tried using cb.set_array, but that didn't appear to do anything, and cb.set_clim, but that rescales the colors completely. Thanks in advance for any help :)
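
    The hunch about a normalization object is right: matplotlib.colors provides LogNorm and, for data that may contain negative values, SymLogNorm. With a norm, imshow receives the raw data, so the colorbar is labelled with the raw values. A minimal sketch (the linthresh value is an arbitrary choice):

      import numpy as np
      import matplotlib.pyplot as plt
      from matplotlib.colors import SymLogNorm

      a = np.ones((100, 100))
      for i in range(100):
          a[i] = i

      f = plt.figure()
      ax = f.add_subplot(111)
      # Log-like color scaling, linear within +/- linthresh (handles negatives)
      res = ax.imshow(a, norm=SymLogNorm(linthresh=1.0, vmin=a.min(), vmax=a.max()))
      cb = f.colorbar(res)   # the colorbar now shows the raw range, 0..99
      plt.show()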


  • Is there a way to show a 3-D surface plot in the browser?

    - by Phage
    I've got a bunch of data for 3-D surface plots. I want to build a quick web interface to let me browse through that data. Are there any (free) packages out there that can easily show surface plots? I found this question but the suggested libraries did not support surface plots. If it requires a plugin like flash / java that is fine. This is for prototyping so a quick 'n dirty solution is preferred. Right now, the only option I have come up with is to use gnuplot to serve up static images. It would be awesome if there was some way to provide an interactive 3-D surface plot in the browser.
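
    One option that has appeared since this question was asked is Plotly, which writes an interactive WebGL surface into a plain HTML file that any modern browser can rotate and zoom. A minimal sketch with made-up grid data:

      import numpy as np
      import plotly.graph_objects as go

      # Placeholder surface: z values on a regular grid
      x = np.linspace(-5, 5, 50)
      y = np.linspace(-5, 5, 50)
      X, Y = np.meshgrid(x, y)
      Z = np.sin(np.sqrt(X**2 + Y**2))

      fig = go.Figure(data=[go.Surface(z=Z, x=x, y=y)])
      fig.write_html('surface.html')   # serve or open directly in a browser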


  • ORE graphics using Remote Desktop Protocol

    - by Sherry LaMonica
    Oracle R Enterprise graphics are returned as raster, or bitmap, graphics. Raster images consist of tiny squares of color information, referred to as pixels, that form points of color to create a complete image. Plots that contain raster images render quickly in R and create small, high-quality exported image files in a wide variety of formats. However, it is a known issue that the rendering of raster images can be problematic when creating graphics using a Remote Desktop connection. Raster images do not display in the windows device using Remote Desktop under the default settings. This happens because Remote Desktop restricts the number of colors when connecting to a Windows machine to 16 bits per pixel, and interpolating raster graphics requires many colors, at least 32 bits per pixel. For example, this simple embedded R image plot will be returned in a raster-based format using a standalone Windows machine: R> library(ORE) R> ore.connect(user="rquser", sid="orcl", host="localhost", password="rquser", all=TRUE) R> ore.doEval(function() image(volcano, col=terrain.colors(30))) Here, we first load the ORE packages and connect to the database instance using database login credentials. The ore.doEval function executes the R code within the database embedded R engine and returns the image back to the client R session. Over a Remote Desktop connection under the default settings, this graph will appear blank due to the restricted number of colors. Users who encounter this issue have two options to display ORE graphics over Remote Desktop: either raise Remote Desktop's Color Depth or direct the plot output to an alternate device. Option #1: Raise Remote Desktop Color Depth setting In a Remote Desktop session, all environment variables, including display variables determining Color Depth, are determined by the RDP-Tcp connection settings. For example, users can reduce the Color Depth when connecting over a slow connection. The different settings are 15 bits, 16 bits, 24 bits, or 32 bits per pixel. To raise the Remote Desktop color depth: On the Windows server, launch Remote Desktop Session Host Configuration from the Accessories menu. Under Connections, right click on RDP-Tcp and select Properties. On the Client Settings tab, either uncheck Limit Maximum Color Depth or set it to 32 bits per pixel. Click Apply, then OK, log out of the remote session and reconnect. After reconnecting, the Color Depth on the Display tab will be set to 32 bits per pixel. Raster graphics will now display as expected. For ORE users, the increased color depth results in slightly reduced performance during plot creation, but the graph will be created instead of displaying an empty plot. Option #2: Direct plot output to alternate device Plotting to a non-windows device is a good option if it's not possible to increase Remote Desktop Color Depth, or if performance is degraded when creating the graph. Several device drivers are available for off-screen graphics in R, such as postscript, pdf, and png. On-screen devices include windows, X11 and Cairo. Here we output to the Cairo device to render an on-screen raster graphic. The grid.raster function in the grid package is analogous to other grid graphical primitives - it draws a raster image within the current plot's grid.
    R> options(device = "CairoWin") # use Cairo device for plotting during the session R> library(Cairo) # load Cairo, grid and png libraries R> library(grid) R> library(png) R> res <- ore.doEval(function() image(volcano, col=terrain.colors(30))) # create embedded R plot R> img <- ore.pull(res, graphics = TRUE)$img[[1]] # extract image R> grid.raster(as.raster(readPNG(img)), interpolate = FALSE) # generate raster graph R> dev.off() # turn off first device By default, the interpolate argument to grid.raster is TRUE, which means that what is actually drawn by R is a linear interpolation of the pixels in the original image. Setting interpolate to FALSE uses a sample from the pixels in the original image. A list of graphics devices available in R can be found in the Devices help file from the grDevices package: R> help(Devices)


  • How can I get a Windows server to collate and email 4 daily reports/graphs for server performance?

    - by Glyn Darkin
    I run a Windows 2008 webserver and would like to set up the most basic performance monitoring in the world. What I would like is: a plot of ASP.Net request time for each w3wp process for a 24-hour period a plot of CPU% utilisation for each w3wp process for a 24-hour period a plot of memory utilisation for each w3wp process for a 24-hour period a plot of network utilisation for each w3wp process for a 24-hour period a plot of disk utilisation for each w3wp process for a 24-hour period I would like these plots to be emailed to me each morning. Does anybody know the simplest way to set this up? Thanks for your help in advance. Glyn


  • How to remove space between chart area and plot area?

    - by Gopalakrishnan Subramani
    I am using the chartingToolKit:Chart control. I want to remove the white space that appears between the chart and the plot area. Attached are the WPF sample and an image of the area to be removed. <Window x:Class="WpfApplication2.MainWindow" xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" Title="MainWindow" Height="350" Width="525" xmlns:chartingToolkit="clr-namespace:System.Windows.Controls.DataVisualization.Charting;assembly=System.Windows.Controls.DataVisualization.Toolkit"> <Grid> <chartingToolkit:Chart x:Name="chart" Width="500" Height="300" Margin="0, 0, 0, 0" LegendStyle="{StaticResource LegendStyle}" > <chartingToolkit:AreaSeries ItemsSource="{Binding}" DependentValuePath="Value" IndependentValuePath="Key" Background="Red" > </chartingToolkit:AreaSeries> <chartingToolkit:Chart.Axes> <chartingToolkit:LinearAxis Orientation="X" ShowGridLines="False" Visibility="Hidden"> </chartingToolkit:LinearAxis> <chartingToolkit:LinearAxis Orientation="Y" ShowGridLines="False" Visibility="Hidden"/> </chartingToolkit:Chart.Axes> </chartingToolkit:Chart> </Grid> The area marked by the red arrow must be removed.


  • How can I suppress the vertical gridlines in a ggplot2 plot while retaining the x-axis labels?

    - by Tarek
    This is a follow-on from this question, in which I was trying to suppress the vertical gridlines. The solution, as provided by learnr, was to add scale_x_continuous(breaks = NA), but this had the side effect of also suppressing the x-axis labels as well. I am totally happy to write the labels back in by hand, but it's not clear to me how to figure out where the labels should go. The other option is to suppress all gridlines (using opts( panel.grid.major = theme_blank()) or some such) and then drawing back in just the major horizontal gridlines. Again, the problem here is how to figure out what the breaks are in the plot to supply to geom_hline(). So, essentially, my options are: Suppress vertical gridlines and x-axis labels (using scale_x_continuous(breaks = NA) ) and add the x-axis labels back in. Suppress all gridlines (using opts( panel.grid.major = theme_blank()) ) and add the major horizontal gridlines back in using geom_hline(). Here are the two options: library(ggplot2) data <- data.frame(x = 1:10, y = c(3,5,2,5,6,2,7,6,5,4)) # suppressing vertical gridlines and x-axis labels # need to re-draw x-axis labels ggplot(data, aes(x, y)) + geom_bar(stat = 'identity') + scale_x_continuous(breaks = NA) + opts( panel.grid.major = theme_line(size = 0.5, colour = '#1391FF'), panel.grid.minor = theme_blank(), panel.background = theme_blank(), axis.ticks = theme_blank() ) # suppressing all gridlines # need to re-draw horizontal gridlines, probably with geom_hline() ggplot(data, aes(x, y)) + geom_bar(stat = 'identity') + scale_x_continuous(breaks = NA) + opts( panel.grid.major = theme_blank(), panel.grid.minor = theme_blank(), panel.background = theme_blank(), axis.ticks = theme_blank() )


  • Python Turtle Graphics, how to plot functions over an interval?

    - by TheDragonAce
    I need to plot a function over a specified interval. The function is f1, which is shown below in the code, and the interval is [-7, -3]; [-1, 1]; [3, 7] with a step of .01. When I execute the program, nothing is drawn. Any ideas? import turtle from math import sqrt wn = turtle.Screen() wn.bgcolor("white") wn.title("Plotting") mypen = turtle.Turtle() mypen.shape("classic") mypen.color("black") mypen.speed(10) while True: try: def f1(x): return 2 * sqrt((-abs(abs(x)-1)) * abs(3 - abs(x))/((abs(x)-1)*(3-abs(x)))) * \ (1 + abs(abs(x)-3)/(abs(x)-3))*sqrt(1-(x/7)**2)+(5+0.97*(abs(x-0.5)+abs(x+0.5))-\ 3*(abs(x-0.75)+abs(x+0.75)))*(1+abs(1-abs(x))/(1-abs(x))) mypen.penup() step=.01 startf11=-7 stopf11=-3 startf12=-1 stopf12=1 startf13=3 stopf13=7 def f11 (startf11,stopf11,step): rc=[] y = f1(startf11) while y<=stopf11: rc.append(startf11) #y+=step mypen.setpos(f1(startf11)*25,y*25) mypen.dot() def f12 (startf12,stopf12,step): rc=[] y = f1(startf12) while y<=stopf12: rc.append(startf12) #y+=step mypen.setpos(f1(startf12)*25, y*25) mypen.dot() def f13 (startf13,stopf13,step): rc=[] y = f1(startf13) while y<=stopf13: rc.append(startf13) #y+=step mypen.setpos(f1(startf13)*25, y*25) mypen.dot() f11(startf11,stopf11,step) f12(startf12,stopf12,step) f13(startf13,stopf13,step) except ZeroDivisionError: continue
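
    Two things keep the posted code from drawing: the loop variable never advances (the y += step lines are commented out, and it is x that should advance), and setpos is called with f1(x) as the x coordinate. A minimal working sketch of the intent, stepping x across each interval and skipping points where f1 is undefined:

      import turtle
      from math import sqrt

      def f1(x):
          # The piecewise curve; raises an error where it is undefined
          return 2*sqrt((-abs(abs(x)-1))*abs(3-abs(x))/((abs(x)-1)*(3-abs(x)))) * \
                 (1+abs(abs(x)-3)/(abs(x)-3))*sqrt(1-(x/7)**2) + \
                 (5+0.97*(abs(x-0.5)+abs(x+0.5))-3*(abs(x-0.75)+abs(x+0.75))) * \
                 (1+abs(1-abs(x))/(1-abs(x)))

      pen = turtle.Turtle()
      pen.speed(0)
      pen.penup()

      for start, stop in [(-7, -3), (-1, 1), (3, 7)]:
          x = start
          while x <= stop:
              try:
                  pen.setpos(x*25, f1(x)*25)   # x on the x-axis, f1(x) on the y-axis
                  pen.dot()
              except (ValueError, ZeroDivisionError):
                  pass                         # outside the domain; skip this point
              x += 0.01

      turtle.done()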


  • I want to plot ocean current with a GPS in a bottle.

    - by Fantomas
    Thinking of using a wine bottle with a cork that barely sticks out. Anyhow, I want to put in a GPS, a battery and a transmitter and to be able to collect position about every minute or so. Off-the-shelf components are preferred. What are my options as far as hardware and software choices? Thank you in advance!


  • How can I show figures separately in matplotlib?

    - by Federico Ramponi
    Say that I have two figures in matplotlib, with one plot per figure: import matplotlib.pyplot as plt f1 = plt.figure() plt.plot(range(0,10)) f2 = plt.figure() plt.plot(range(10,20)) Then I show both in one shot plt.show() Is there a way to show them separately, i.e. to show just f1? Or better: how can I manage the figures separately like in the following 'wishful' code (that doesn't work): f1 = plt.figure() f1.plot(range(0,10)) f1.show()
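
    The 'wishful' code is close to something that works: plot through each figure's own Axes, and call each Figure object's show() method (available on interactive backends), so the two figures can be managed independently. A minimal sketch:

      import matplotlib.pyplot as plt

      f1 = plt.figure()
      ax1 = f1.add_subplot(111)
      ax1.plot(range(0, 10))

      f2 = plt.figure()
      ax2 = f2.add_subplot(111)
      ax2.plot(range(10, 20))

      f1.show()    # display just the first figure (interactive backends)
      # f2.show()  # ...and the second whenever it's wanted
      plt.show()   # or block on all open figures at the end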


  • Flex: Create custom stroke on LineSeries?

    - by John Isaacks
    You can easily set a stroke on a line series like this: <mx:LineSeries yField="apple"> <mx:lineStroke> <mx:Stroke color="0x6699FF" weight="4" alpha=".8" /> </mx:lineStroke> </mx:LineSeries> This will set the alpha for the entire stroke to .8. But I want to be able to set a different alpha on the stroke for each plot based on something in the dataProvider. For example, the yField in the lineSeries is "apple", which is how it knows where to plot for the lineSeries. I want to be able to add something like alphaField which tells it what to set the stroke alpha to for each plot. So if my dataProvider was: <result month="Jan-04"> <apple>81768</apple> <alpha>1</alpha> </result> <result month="Feb-04"> <apple>51156</apple> <alpha>1</alpha> </result> <result month="Mar-04"> <apple>51156</apple> <alpha>.5</alpha> </result> And I set alphaField="alpha", then I would have a solid stroke from plot 0 to plot 1 and then a 50% alpha stroke from plot 1 to plot 2. How can I do this??? I am looking in the commitProperties() and updateDisplayList() methods of LineSeries and have no idea what would need to be added/changed to make this work. I am pretty sure this class has to use Graphics.lineTo() to draw each plot, so basically it would need to "get" the current alphaField value somehow, and apply a Graphics.lineStyle() with the correct alpha before drawing each line. Thanks!! UPDATE I have gotten much closer to my answer. When I extend LineRenderer I override updateDisplayList(), which calls GraphicsUtilities.drawPolyLine(). I extend GraphicsUtilities and override the method drawPolyLine(), as this is where the line is actually drawn. I can call lineStyle() in here and change the alpha of the line... I still have 1 thing I cannot figure out: from within the drawPolyLine() method, how can I access the data that dictates what the alpha should be? Thanks!!!!


  • Using R to Analyze G1GC Log Files

    - by user12620111
N)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=||   Using R to Analyze G1GC Log Files   Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 
    } A simple grep can be used to extract a summary: $ grep "\[GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z\(\),]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2: In the log files with the more verbose format, a single trace, i.e. the report of a single garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace.
    # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\\( " , " " ); gsub ( "\ \)" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller than the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ...
    ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPUs and it turns out that the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogeneous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB.
    (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots, histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using time series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo).
    “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collection in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work-intensive time period is between 9:00 PM and 3:00 AM. Let's clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” shows the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e.
    alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateaus in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00, when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GCs (the slope of the cumulative Full GC line) changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage collection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage collection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explanation for the large amount of referenced data resident in the Java heap. There are two possibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the application needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.


  • How can I use RRDTOOL to plot values from a CSV file?

    - by Octopus
    I have to create a graphical representation of staff salaries. The staff get their salaries per day, and I have their information in the format below. This is one month's data, i.e. 1st March to 31st March: <DATE>,<NAME1>,<NAME2>,<NAME3>......<NAME N> YYYY-MM-DD,name1,name2,name3,.......nameN . . so on.. 1) Is rrdtool a good solution for creating graphs and finding AVERAGE, MAX, and MIN? 2) If yes, how can I use the above CSV file to create an RRD? 3) If no, what else can I use to automate the graphical information on my website? Any suggestion in Perl would be really appreciated.
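
    rrdtool can do this: replay the CSV rows as timestamped updates with one data source per person, and the AVERAGE/MIN/MAX consolidation comes for free. The asker mentions Perl (where the RRDs module exposes the same create/update/graph calls); the sketch below drives the rrdtool command line from Python instead, with made-up file names, March 2010 timestamps, and a daily step of 86400 seconds, all assumptions:

      import csv
      import subprocess
      import time

      # One GAUGE data source per staff member, one sample per day
      subprocess.run(['rrdtool', 'create', 'salaries.rrd',
                      '--start', '1267315200',    # just before 1st March 2010
                      '--step', '86400',
                      'DS:name1:GAUGE:172800:0:U',
                      'DS:name2:GAUGE:172800:0:U',
                      'RRA:AVERAGE:0.5:1:400', 'RRA:MIN:0.5:1:400',
                      'RRA:MAX:0.5:1:400'], check=True)

      with open('salaries.csv') as f:
          next(f)                                 # skip the header row
          for date, name1, name2 in csv.reader(f):
              ts = int(time.mktime(time.strptime(date, '%Y-%m-%d')))
              subprocess.run(['rrdtool', 'update', 'salaries.rrd',
                              f'{ts}:{name1}:{name2}'], check=True)

      # A line graph of one person's daily salary over the month
      subprocess.run(['rrdtool', 'graph', 'salaries.png',
                      '--start', '1267401600', '--end', '1269993600',
                      'DEF:s1=salaries.rrd:name1:AVERAGE',
                      'LINE2:s1#FF0000:name1'], check=True)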


  • How can I use Perl and RRD to plot ping times?

    - by ChrisMuench
    I'm trying to do my first rrd graph through Perl. I have tried RRD::Simple and RRDs and just can't get either one to work. Here's what I have so far: use strict; use RRD::Simple (); # Create an interface object my $rrd = RRD::Simple->new( file => "server.rrd" ); # Put some arbitrary data values in the RRD file for the same # 3 data sources called bytesIn, bytesOut and faultsPerSec. $rrd->create( EqSearch => "DERIVE", MfSearch => "DERIVE", EQCostBasis => "DERIVE", MFCostBasis => "DERIVE" ); $rrd->update( EqSearch => 2, MfSearch => 3, EQCostBasis => 10, MFCostBasis => 15 ); # Generate graphs: # /var/tmp/myfile-daily.png, /var/tmp/myfile-weekly.png # /var/tmp/myfile-monthly.png, /var/tmp/myfile-annual.png my %rtn = $rrd->graph( destination => "/Users/cmuench/Documents/Code/perl", title => "Server A", vertical_label => "", interlaced => "", periods => [qw(hour)] );


  • How can I plot NaN values as a special color with imshow in matplotlib?

    - by Adam Fraser
    example: import numpy as np import matplotlib.pyplot as plt f = plt.figure() ax = f.add_subplot(111) a = np.arange(25).reshape((5,5)).astype(float) a[3,:] = np.nan ax.imshow(a, interpolation='nearest') f.canvas.draw() The resultant image is unexpectedly all blue (the lowest color in the jet colormap). However, if I do the plotting like this: ax.imshow(a, interpolation='nearest', vmin=0, vmax=24) --then I get something better, but the NaN values are drawn the same color as vmin... Is there a graceful way that I can set NaNs to be drawn with a special color (eg: gray or transparent)?
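
    One graceful route is to mask the invalid cells and give the colormap a dedicated "bad" color, which imshow uses for masked entries. A minimal sketch (copying the colormap so the shared global instance isn't modified):

      import copy

      import numpy as np
      import matplotlib.pyplot as plt

      a = np.arange(25).reshape((5, 5)).astype(float)
      a[3, :] = np.nan

      # Mask the NaNs and paint them with the colormap's "bad" color
      masked = np.ma.masked_invalid(a)
      cmap = copy.copy(plt.cm.jet)
      cmap.set_bad(color='gray')        # or (0, 0, 0, 0) for transparent

      f = plt.figure()
      ax = f.add_subplot(111)
      ax.imshow(masked, interpolation='nearest', cmap=cmap)
      plt.show()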

