Search Results

Search found 89926 results on 3598 pages for 'abstract data type'.

Page 535/3598 | < Previous Page | 531 532 533 534 535 536 537 538 539 540 541 542  | Next Page >

  • LINQ .Cast() extension method fails but (type)object works.

    - by Ben Robinson
    To convert between some LINQ to SQL objects and DTOs we have created explicit cast operators on the DTOs. That way we can do the following: DTOType MyDTO = (LinqToSQLType)MyLinq2SQLObj; This works well. However when you try to cast using the LINQ .Cast() extension method it trows an invalid cast exception saying cannot cast type Linq2SQLType to type DTOType. i.e. the below does not work List<DTO.Name> Names = dbContact.tNames.Cast<DTO.Name>() .ToList(); But the below works fine: DAL.tName MyDalName = new DAL.tName(); DTO.Name MyDTOName = (DTO.Name)MyDalName; and the below also works fine List<DTO.Name> Names = dbContact.tNames.Select(name => (DTO.Name)name) .ToList(); Why does the .Cast() extension method throw an invalid cast exception? I have used the .Cast() extension method in this way many times in the past and when you are casting something like a base type to a derived type it works fine, but falls over when the object has an explicit cast operator.

    Read the article

  • function not printing out anything

    - by Abdul Latif
    I have the following function below: public function setupHead($title){ $displayHead .='<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <title>'.$title.'</title>'; $displayHead .='<script type="text/javascript" src="'.PATH.'js/jquery-1.3.2.min.js"></script> <script type="text/javascript" src="'.PATH.'js/thickbox.js"></script> <script type="text/javascript" src="'.PATH.'js/ui.core.js"></script> <!--<script type="text/javascript" src="'.PATH.'js/js.js"></script>--> <link rel="stylesheet" href="'.PATH.'css/thickbox.css" type="text/css" media="screen" /> <link rel="stylesheet" type="text/css" href="'.PATH.'css/styles.css"> <link rel="stylesheet" type="text/css" href="'.PATH.'css/menu_allbrowsers.css"> <link rel="stylesheet" href="'.PATH.'css/news.css" type="text/css" media="screen" /> <link rel="stylesheet" href="'.PATH.'css/text.css" type="text/css" media="screen" /> <script type="text/javascript" src="'.PATH.'js/swfobject.js"></script> <!--[if IE 7]><link rel="stylesheet" type="text/css" href="'.PATH.'css/IE7menu.css" /><![endif]--> <!--[if IE 6]><link rel="stylesheet" type="text/css" href="'.PATH.'css/ie6.css" /><![endif]--> <!--[if IE 7]><link rel="stylesheet" type="text/css" href="'.PATH.'css/ie7.css" /><![endif]--> </head>'; return $displayHead; } but when it call is using: echo classname->setupHead($title); nothing gets displayed. doesn't php allow html in strings? Thanks in advance

    Read the article

  • remove the blank rows after i removed xml elements?

    - by fayer
    i removed some elements from a xml file with simpledom. the code: $this->xmlDocument->removeNodes("//entity[name='mac']"); here is the initial file: <entity id="1000070"> <name>apple</name> <type>category</type> <entities> <entity id="7002870"> <name>mac</name> <type>category</type> </entity> <entity id="7024080"> <name>iphone</name> <type>category</type> </entity> <entity id="7024080"> <name>ipad</name> <type>category</type> </entity> </entities> </entity> the file afterwards: <entity id="1000070"> <name>apple</name> <type>category</type> <entities> <entity id="7024080"> <name>iphone</name> <type>category</type> </entity> <entity id="7024080"> <name>ipad</name> <type>category</type> </entity> </entities> </entity> i wonder how i could also remove the blank lines that are left after i ran the removal code? thanks!

    Read the article

  • C#: at design time, how can I reliably determine the type of a variable that is declared using var?

    - by Cheeso
    I'm working on a completion (intellisense) facility for C# in emacs. The idea is, if a user types a fragment, then asks for completion via a particular keystroke combination, the completion facility will use .NET reflection to determine the possible completions. Doing this requires that the type of the thing being completed, be known. If it's a string, it has a set of known methods; if it's an Int32, it has a separate set of methods, and so on. Using semantic, a code lexer/parser package available in emacs, I can locate the variable declarations, and their types. Given that, it's straightforward to use reflection to get the methods and properties on the type, and then present the list of options to the user. The problem arrives when the code uses var in the declaration. How can I reliably determine the actual type used, when the variable is declared with the var keyword? Just to be clear, I don't need to determine it at runtime. I want to determine it at "Design time". So far the best idea I have is: extract the declaration statement, eg var foo = "a string value"; concatenate a statement foo.GetType(); dynamically compile the resulting C# fragment it into a new assembly load the assembly into a new AppDomain, run the framgment and get the return type. unload and discard the assembly This sounds awfully heavyweight, for each completion request in the editor. Any better ideas out there?

    Read the article

  • WCF: get generic type object (e.g. MyObject<T>) from remote machine

    - by Aaron
    I have two applications that are communicating through WCF. On the server the following object exists: public class MyObject<T> { ... public Entry<T> GetValue() } Where Entry<T> is another object with T Data as a public property. T could be any number of types (string, double, etc) On the client I have ClientObject<T> that needs to get the value of Data from the server (same type). Since I'm using WCF, I have to define my ServiceContract as an interface, and I can't have ClientObject<T> call Entry<T> GetMyObjectValue (string Name) which calls GetValue on the correct MyObject<T> because my interface isn't aware of the type information. I've tried implementing separate GetValue functions (GetMyObjectValueDouble, GetMyObjectValueString) in the interface and then have ClientObject determine the correct one to call. However, Entry<T> val = (Entry<T>)GetMyObjectValueDouble(...); doesn't work because it's not sure about the type information. How can I go about getting a generic object over WCF with the correct type information? Let me know if there are other details I can provide. Thanks!

    Read the article

  • Fortran pointer as an argument to interface procedure

    - by icarusthecow
    Im trying to use interfaces to call different subroutines with different types, however, it doesnt seem to work when i use the pointer attribute. for example, take this sample code MODULE ptr_types TYPE, abstract :: parent INTEGER :: q END TYPE TYPE, extends(parent) :: child INTEGER :: m END TYPE INTERFACE ptr_interface MODULE PROCEDURE do_something END INTERFACE CONTAINS SUBROUTINE do_something(atype) CLASS(parent), POINTER :: atype ! code determines that this allocation is correct from input ALLOCATE(child::atype) WRITE (*,*) atype%q END SUBROUTINE END MODULE PROGRAM testpass USE ptr_types CLASS(child), POINTER :: ctype CALL ptr_interface(ctype) END PROGRAM This gives error Error: There is no specific subroutine for the generic 'ptr_interface' at (1) however if i remove the pointer attribute in the subroutine it compiles fine. Now, normally this wouldnt be a problem, but for my use case i need to be able to treat that argument as a pointer, mainly so i can allocate it if necessary. Any suggestions? Mind you I'm new to fortran so I may have missed something edit: forgot to put the allocation in the parents subroutine, the initial input is unallocated EDIT 2 this is my second attempt, with caller side casting MODULE ptr_types TYPE, abstract :: parent INTEGER :: q END TYPE TYPE, extends(parent) :: child INTEGER :: m END TYPE TYPE, extends(parent) :: second INTEGER :: meow END TYPE CONTAINS SUBROUTINE do_something(this, type_num) CLASS(parent), POINTER :: this INTEGER type_num IF (type_num == 0) THEN ALLOCATE (child::this) ELSE IF (type_num == 1) THEN ALLOCATE (second::this) ENDIF END SUBROUTINE END MODULE PROGRAM testpass USE ptr_types CLASS(child), POINTER :: ctype SELECT TYPE(ctype) CLASS is (parent) CALL do_something(ctype, 0) END SELECT WRITE (*,*) ctype%q END PROGRAM however this still fails. in the select statement it complains that parent must extend child. Im sure this is due to restrictions when dealing with the pointer attribute, for type safety, however, im looking for a way to convert a pointer into its parent type for generic allocation. Rather than have to write separate allocation functions for every type and hope they dont collide in an interface or something. hopefully this example will illustrate a little more clearly what im trying to achieve, if you know a better way let me know

    Read the article

  • How do I set an og:type for the first time without losing likes?

    - by Travis
    When I first created my site, I neglected to add the Open Graph tags that Facebook recommends (http://developers.facebook.com/docs/opengraph/), and the site now has about 1200 Facebook likes through a fb:comments widget. http://graph.facebook.com/http://feedtheanimalssamples.com/ shows this: { "id": "http://feedtheanimalssamples.com/", "shares": 1204 } Recently, I've added added the following OG tags: <meta property="fb:app_id" content="59193243341" /> <meta property="og:title" content="Girl Talk - Feed The Animals Samples (old)" /> <meta property="og:image" content="http://feedtheanimalssamples.com/fta_small.png" /> <meta property="og:url" content="http://feedtheanimalssamples.com" /> But when I add the og:type tag: <meta property="og:type" content="website" /> and Lint the site, I lose all my likes. http://graph.facebook.com/http://feedtheanimalssamples.com/ starts showing this: { "id": "170545342993850", "name": "Girl Talk - Feed The Animals Samples", "picture": "http://profile.ak.fbcdn.net/hprofile-ak-snc4/188039_170545342993850_3277642_s.jpg", "link": "http://www.facebook.com/pages/Girl-Talk-Feed-The-Animals-Samples/170545342993850", "category": "Website", "website": "http://feedtheanimalssamples.com/", "description": "Interactively identifies the samples in the 2008 album 'Feed The Animals' by mashup artist Girl Talk.", "likes": 1 } (Note the "likes": 1.) So: How do I set the og:type without losing my likes? I'm trying to let my likers know that I've created a new and improved site. I'm following the instuctions at http://developers.facebook.com/blog/post/397 under "Publishing to Connected Users via Graph API", but using that API apparently requires specifying an og:type. Thanks!

    Read the article

  • C++ and its type system: How to deal with data with multiple types?

    - by sub
    "Introduction" I'm relatively new to C++. I went through all the basic stuff and managed to build 2-3 simple interpreters for my programming languages. The first thing that gave and still gives me a headache: Implementing the type system of my language in C++ Think of that: Ruby, Python, PHP and Co. have a lot of built-in types which obviously are implemented in C. So what I first tried was to make it possible to give a value in my language three possible types: Int, String and Nil. I came up with this: enum ValueType { Int, String, Nil }; class Value { public: ValueType type; int intVal; string stringVal; }; Yeah, wow, I know. It was extremely slow to pass this class around as the string allocator had to be called all the time. Next time I've tried something similar to this: enum ValueType { Int, String, Nil }; extern string stringTable[255]; class Value { public: ValueType type; int index; }; I would store all strings in stringTable and write their position to index. If the type of Value was Int, I just stored the integer in index, it wouldn't make sense at all using an int index to access another int, or? Anyways, the above gave me a headache too. After some time, accessing the string from the table here, referencing it there and copying it over there grew over my head - I lost control. I had to put the interpreter draft down. Now: Okay, so C and C++ are statically typed. How do the main implementations of the languages mentioned above handle the different types in their programs (fixnums, bignums, nums, strings, arrays, resources,...)? What should I do to get maximum speed with many different available types? How do the solutions compare to my simplified versions above?

    Read the article

  • Pass the return type as a parameter in java?

    - by jonderry
    I have some files that contain logs of objects. Each file can store objects of a different type, but a single file is homogeneous -- it only stores objects of a single type. I would like to write a method that returns an array of these objects, and have the array be of a specified type (the type of objects in a file is known and can be passed as a parameter). Roughly, what I want is something like the following: public static <T> T[] parseLog(File log, Class<T> cls) throws Exception { ArrayList<T> objList = new ArrayList<T>(); FileInputStream fis = new FileInputStream(log); ObjectInputStream in = new ObjectInputStream(fis); try { Object obj; while (!((obj = in.readObject()) instanceof EOFObject)) { T tobj = (T) obj; objList.add(tobj); } } finally { in.close(); } return objList.toArray(new T[0]); } The above code doesn't compile (there's an error on the return statement, and a warning on the cast), but it should give you the idea of what I'm trying to do. Any suggestions for the best way to do this?

    Read the article

  • How to make restrictions on XML Schema Complex type?

    - by chobo2
    Hi I am reading the tutorials on w3cschools ( http://www.w3schools.com/schema/schema_complex.asp ) but they don't seem to mention how you could add restrictions on complex types. Like for instance I have this schema. <xs:element name="employee"> <xs:complexType> <xs:sequence> <xs:element name="firstname" type="xs:string"/> <xs:element name="lastname" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> now I want to make sure the firstname is no more then 10 characters long. How do I do this? I tried to put in the simple type for the firstname but it says I can't do that since I am using a complex type. So how do I put restrictions like that on the file so the people who I give the schema to don't try to make the firstname 100 characters.

    Read the article

  • When calling a static method on parent class, can the parent class deduce the type on the child (C#)

    - by Matt
    Suppose we have 2 classes, Child, and the class from which it inherits, Parent. class Parent { public static void MyFunction(){} } class Child : Parent { } Is it possible to determine in the parent class how the method was called? Because we can call it two ways: Parent.MyFunction(); Child.MyFunction(); My current approach was trying to use: MethodInfo.GetCurrentMethod().ReflectedType; // and MethodInfo.GetCurrentMethod().DeclaringType; But both appear to return the Parent type. If you are wondering what, exactly I am trying to accomplish (and why I am violating the basic OOP rule that the parent shouldn't have to know anything about the child), the short of it is this (let me know if you want the long version): I have a Model structure representing some of our data that persists to the database. All of these models inherit from an abstract Parent. This parent implements a couple of events, such as SaveEvent, DeleteEvent, etc. We want to be able to subscribe to events specific to the type. So, even though the event is in the parent, I want to be able to do: Child.SaveEvent += new EventHandler((sender, args) => {}); I have everything in place, where the event is actually backed by a dictionary of event handlers, hashed by type. The last thing I need to get working is correctly detecting the Child type, when doing Child.SaveEvent. I know I can implement the event in each child class (even forcing it through use of abstract), but it would be nice to keep it all in the parent, which is the class actually firing the events (since it implements the common save/delete/change functionality).

    Read the article

  • Why is Scala's type inferencer not able to resolve this?

    - by Levi Greenspan
    In the code snippet below - why do I have to give a type annotation for Nil? Welcome to Scala version 2.8.0.RC2 (OpenJDK Server VM, Java 1.6.0_18). Type in expressions to have them evaluated. Type :help for more information. scala> List(Some(1), Some(2), Some(3), None).foldLeft(Nil)((lst, o) => o match { case Some(i) => i::lst; case None => lst }) <console>:6: error: type mismatch; found : List[Int] required: object Nil List(Some(1), Some(2), Some(3), None).foldLeft(Nil)((lst, o) => o match { case Some(i) => i::lst; case None => lst }) ^ scala> List(Some(1), Some(2), Some(3), None).foldLeft(Nil:List[Int])((lst, o) => o match { case Some(i) => i::lst; case None => lst }) res1: List[Int] = List(3, 2, 1)

    Read the article

  • (Java) Is there a type of object that can handle anything from primitives to arrays?

    - by Michael
    I'm pretty new to Java, so I'm hoping one of you guys knows how to do this. I'm having the user specify both the type and value of arguments, in any XML-like way, to be passed to methods that are external to my application. Example: javac myAppsName externalJavaClass methodofExternalClass [parameters] Of course, to find the proper method, we have to have the proper parameter types as the method may be overloaded and that's the only way to tell the difference between the different versions. Parameters are currently formatted in this manner: (type)value(/type), e.g. (int)71(/int) (string)This is my string that I'm passing as a parameter!(/string) I parse them, getting the constructor for whatever type is indicated, then execute that constructor by running its method, newInstance(<String value>), loading the new instance into an Object. This works fine and dandy, but as we all know, some methods take arrays, or even multi-dimensional arrays. I could handle the argument formatting like so: (array)(array)(int)0(/int)(int)1(/int)(/array)(array)(int)2(/int)(int)3(/int)(/array)(/array)... or perhaps even better... {{(int)0(/int)(int)1(/int)}{(int)2(/int)(int)3(/int)}}. The question is, how can this be implemented? Do I have to start wrapping everything in an Object[] array so I can handle primitives, etc. as argObj[0], but load an array as I normally would? (Unfortunately, I would have to make it an Object[][] array if I wanted to support two-dimensional arrays. This implementation wouldn't be very pretty.)

    Read the article

  • Using R to Analyze G1GC Log Files

    - by user12620111
    Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=||   Using R to Analyze G1GC Log Files   Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z\(\),]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\\( " , " " ); gsub ( "\ \)" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period:     par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight.   plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

    Read the article

  • Puppet: how to use data from a MySQL table in Puppet 3.0 templates?

    - by Luke404
    I have some data whose source-of-truth is in a MySQL database, size is expected to max out at the some-thousands-rows range (in a worst-case scenario) and I'd like to use puppet to configure files on some servers with that data (mostly iterating through those rows in a template). I'm currently using Puppet 3.0.x, and I cannot change the fact that MySQL will be the authoritative source for that data. Please note, data comes from external sources and not from puppet or from the managed nodes. What possible approaches are there? Which one would you recommend? Would External Node Classifiers be useful here? My "last resort" would be regularly dumping the table to a YAML file and reading that through Hiera to a Puppet template, or to directly dump the table in one or more pre-formatted text file(s) ready to be copied to the nodes. There is an unanswered question on SF about system users but the fundamental issue is probably similar to mine - he's trying to get data out of MySQL.

    Read the article

  • Is it faster to create indexes before or after data loading in MySQL?

    - by Josh Glover
    I have a data replication process that drops and recreates a few tables in a target database, then loads them up with data from a source database (running on another host, but that is immaterial to the question at hand). The target database does need primary keys and a few other indexes on its tables, but not during the data loading. I'm currently loading all of the data, then creating the indexes. However, index creation takes a pretty long time--30 minutes of my data loader's 5 and a half hour running time. My intuition tells me that creating the indexes at the end should be faster than creating them first, since the index would need to be rewritten with each insert. Can anyone tell me for sure which way is faster? FWIW, I'm running MySQL 5.1 with InnoDB tables.

    Read the article

  • Is it useful check data integrity in one DAT tape?

    - by maxim
    I backup my data every day on tape using one drive DAT HP Storageworks DAT 160. I use one tape for every day and I turn them weekly. Every monday I check one tape randomly recover some files saved on it. I know that when data is saved on tape, the driver and backup software check data integrity, but I wonder if a manual check of some data saved has a sense or not. I re-use these tapes many times and I would be sure data are safe.

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #039

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 FQL – Facebook Query Language Facebook list following advantages of FQL: Condensed XML reduces bandwidth and parsing costs. More complex requests can reduce the number of requests necessary. Provides a single consistent, unified interface for all of your data. It’s fun! UDF – Get the Day of the Week Function The day of the week can be retrieved in SQL Server by using the DatePart function. The value returned by the function is between 1 (Sunday) and 7 (Saturday). To convert this to a string representing the day of the week, use a CASE statement. UDF – Function to Get Previous And Next Work Day – Exclude Saturday and Sunday While reading ColdFusion blog of Ben Nadel Getting the Previous Day In ColdFusion, Excluding Saturday And Sunday, I realize that I use similar function on my SQL Server Database. This function excludes the Weekends (Saturday and Sunday), and it gets previous as well as next work day. Complete Series of SQL Server Interview Questions and Answers Data Warehousing Interview Questions and Answers – Introduction Data Warehousing Interview Questions and Answers – Part 1 Data Warehousing Interview Questions and Answers – Part 2 Data Warehousing Interview Questions and Answers – Part 3 Data Warehousing Interview Questions and Answers Complete List Download 2008 Introduction to Log Viewer In SQL Server all the windows event logs can be seen along with SQL Server logs. Interface for all the logs is same and can be launched from the same place. This log can be exported and filtered as well. DBCC SHRINKFILE Takes Long Time to Run If you are DBA who are involved with Database Maintenance and file group maintenance, you must have experience that many times DBCC SHRINKFILE operations takes a long time but any other operations with Database are relatively quicker. mssqlsystemresource – Resource Database The purpose of resource database is to facilitates upgrading to the new version of SQL Server without any hassle. In previous versions whenever version of SQL Server was upgraded all the previous version system objects needs to be dropped and new version system objects to be created. 2009 Puzzle – Write Script to Generate Primary Key and Foreign Key In SQL Server Management Studio (SSMS), there is no option to script all the keys. If one is required to script keys they will have to manually script each key one at a time. If database has many tables, generating one key at a time can be a very intricate task. I want to throw a question to all of you if any of you have scripts for the same purpose. Maximizing View of SQL Server Management Studio – Full Screen – New Screen I had explained the following two different methods: 1) Open Results in Separate Tab - This is a very interesting method as result pan shows up in a different tab instead of the splitting screen horizontally. 2) Open SSMS in Full Screen - This works always and to its best. Not many people are aware of this method; hence, very few people use it to enhance performance. 2010 Find Queries using Parallelism from Cached Plan T-SQL script gets all the queries and their execution plan where parallelism operations are kicked up. Pay attention there is TOP 10 is used, if you have lots of transactional operations, I suggest that you change TOP 10 to TOP 50 This is the list of the all the articles in the series of computed columns. SQL SERVER – Computed Column – PERSISTED and Storage This article talks about how computed columns are created and why they take more storage space than before. SQL SERVER – Computed Column – PERSISTED and Performance This article talks about how PERSISTED columns give better performance than non-persisted columns. SQL SERVER – Computed Column – PERSISTED and Performance – Part 2 This article talks about how non-persisted columns give better performance than PERSISTED columns. SQL SERVER – Computed Column and Performance – Part 3 This article talks about how Index improves the performance of Computed Columns. SQL SERVER – Computed Column – PERSISTED and Storage – Part 2 This article talks about how creating index on computed column does not grow the row length of table. SQL SERVER – Computed Columns – Index and Performance This article summarized all the articles related to computed columns. 2011 SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 21 of 31 What is Data Warehousing? What is Business Intelligence (BI)? What is a Dimension Table? What is Dimensional Modeling? What is a Fact Table? What are the Fundamental Stages of Data Warehousing? What are the Different Methods of Loading Dimension tables? Describes the Foreign Key Columns in Fact Table and Dimension Table? What is Data Mining? What is the Difference between a View and a Materialized View? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 22 of 31 What is OLTP? What is OLAP? What is the Difference between OLTP and OLAP? What is ODS? What is ER Diagram? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 23 of 31 What is ETL? What is VLDB? Is OLTP Database is Design Optimal for Data Warehouse? If denormalizing improves Data Warehouse Processes, then why is the Fact Table is in the Normal Form? What are Lookup Tables? What are Aggregate Tables? What is Real-Time Data-Warehousing? What are Conformed Dimensions? What is a Conformed Fact? How do you Load the Time Dimension? What is a Level of Granularity of a Fact Table? What are Non-Additive Facts? What is a Factless Facts Table? What are Slowly Changing Dimensions (SCD)? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 24 of 31 What is Hybrid Slowly Changing Dimension? What is BUS Schema? What is a Star Schema? What Snow Flake Schema? Differences between the Star and Snowflake Schema? What is Difference between ER Modeling and Dimensional Modeling? What is Degenerate Dimension Table? Why is Data Modeling Important? What is a Surrogate Key? What is Junk Dimension? What is a Data Mart? What is the Difference between OLAP and Data Warehouse? What is a Cube and Linked Cube with Reference to Data Warehouse? What is Snapshot with Reference to Data Warehouse? What is Active Data Warehousing? What is the Difference between Data Warehousing and Business Intelligence? What is MDS? Explain the Paradigm of Bill Inmon and Ralph Kimball. SQL SERVER – Azure Interview Questions and Answers – Guest Post by Paras Doshi – Day 25 of 31 Paras Doshi has submitted 21 interesting question and answers for SQL Azure. 1.What is SQL Azure? 2.What is cloud computing? 3.How is SQL Azure different than SQL server? 4.How many replicas are maintained for each SQL Azure database? 5.How can we migrate from SQL server to SQL Azure? 6.Which tools are available to manage SQL Azure databases and servers? 7.Tell me something about security and SQL Azure. 8.What is SQL Azure Firewall? 9.What is the difference between web edition and business edition? 10.How do we synchronize On Premise SQL server with SQL Azure? 11.How do we Backup SQL Azure Data? 12.What is the current pricing model of SQL Azure? 13.What is the current limitation of the size of SQL Azure DB? 14.How do you handle datasets larger than 50 GB? 15.What happens when the SQL Azure database reaches Max Size? 16.How many databases can we create in a single server? 17.How many servers can we create in a single subscription? 18.How do you improve the performance of a SQL Azure Database? 19.What is code near application topology? 20.What were the latest updates to SQL Azure service? 21.When does a workload on SQL Azure get throttled? SQL SERVER – Interview Questions and Answers – Guest Post by Malathi Mahadevan – Day 26 of 31 Malachi had asked a simple question which has several answers. Each answer makes you think and ponder about the reality of the IT world. Look at the simple question – ‘What is the toughest challenge you have faced in your present job and how did you handle it’? and its various answers. Each answer has its own story. SQL SERVER – Interview Questions and Answers – Guest Post by Rick Morelan – Day 27 of 31 Rick Morelan of Joes2Pros has written an excellent blog post on the subject how to find top N values. Most people are fully aware of how the TOP keyword works with a SELECT statement. After years preparing so many students to pass the SQL Certification I noticed they were pretty well prepared for job interviews too. Yes, they would do well in the interview but not great. There seemed to be a few questions that would come up repeatedly for almost everyone. Rick addresses similar questions in his lucid writing skills. 2012 Observation of Top with Index and Order of Resultset SQL Server has lots of things to learn and share. It is amazing to see how people evaluate and understand different techniques and styles differently when implementing. The real reason may be absolutely different but we may blame something totally different for the incorrect results. Read the blog post to learn more. How do I Record Video and Webcast How to Convert Hex to Decimal or INT Earlier I asked regarding a question about how to convert Hex to Decimal. I promised that I will post an answer with Due Credit to the author but never got around to post a blog post around it. Read the original post over here SQL SERVER – Question – How to Convert Hex to Decimal. Query to Get Unique Distinct Data Based on Condition – Eliminate Duplicate Data from Resultset The natural reaction will be to suggest DISTINCT or GROUP BY. However, not all the questions can be solved by DISTINCT or GROUP BY. Let us see the following example, where a user wanted only latest records to be displayed. Let us see the example to understand further. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Organizations &amp; Architecture UNISA Studies &ndash; Chap 7

    - by MarkPearl
    Learning Outcomes Name different device categories Discuss the functions and structure of I/.O modules Describe the principles of Programmed I/O Describe the principles of Interrupt-driven I/O Describe the principles of DMA Discuss the evolution characteristic of I/O channels Describe different types of I/O interface Explain the principles of point-to-point and multipoint configurations Discuss the way in which a FireWire serial bus functions Discuss the principles of InfiniBand architecture External Devices An external device attaches to the computer by a link to an I/O module. The link is used to exchange control, status, and data between the I/O module and the external device. External devices can be classified into 3 categories… Human readable – e.g. video display Machine readable – e.g. magnetic disk Communications – e.g. wifi card I/O Modules An I/O module has two major functions… Interface to the processor and memory via the system bus or central switch Interface to one or more peripheral devices by tailored data links Module Functions The major functions or requirements for an I/O module fall into the following categories… Control and timing Processor communication Device communication Data buffering Error detection I/O function includes a control and timing requirement, to coordinate the flow of traffic between internal resources and external devices. Processor communication involves the following… Command decoding Data Status reporting Address recognition The I/O device must be able to perform device communication. This communication involves commands, status information, and data. An essential task of an I/O module is data buffering due to the relative slow speeds of most external devices. An I/O module is often responsible for error detection and for subsequently reporting errors to the processor. I/O Module Structure An I/O module functions to allow the processor to view a wide range of devices in a simple minded way. The I/O module may hide the details of timing, formats, and the electro mechanics of an external device so that the processor can function in terms of simple reads and write commands. An I/O channel/processor is an I/O module that takes on most of the detailed processing burden, presenting a high-level interface to the processor. There are 3 techniques are possible for I/O operations Programmed I/O Interrupt[t I/O DMA Access Programmed I/O When a processor is executing a program and encounters an instruction relating to I/O it executes that instruction by issuing a command to the appropriate I/O module. With programmed I/O, the I/O module will perform the requested action and then set the appropriate bits in the I/O status register. The I/O module takes no further actions to alert the processor. I/O Commands To execute an I/O related instruction, the processor issues an address, specifying the particular I/O module and external device, and an I/O command. There are four types of I/O commands that an I/O module may receive when it is addressed by a processor… Control – used to activate a peripheral and tell it what to do Test – Used to test various status conditions associated with an I/O module and its peripherals Read – Causes the I/O module to obtain an item of data from the peripheral and place it in an internal buffer Write – Causes the I/O module to take an item of data form the data bus and subsequently transmit that data item to the peripheral The main disadvantage of this technique is it is a time consuming process that keeps the processor busy needlessly I/O Instructions With programmed I/O there is a close correspondence between the I/O related instructions that the processor fetches from memory and the I/O commands that the processor issues to an I/O module to execute the instructions. Typically there will be many I/O devices connected through I/O modules to the system – each device is given a unique identifier or address – when the processor issues an I/O command, the command contains the address of the address of the desired device, thus each I/O module must interpret the address lines to determine if the command is for itself. When the processor, main memory and I/O share a common bus, two modes of addressing are possible… Memory mapped I/O Isolated I/O (for a detailed explanation read page 245 of book) The advantage of memory mapped I/O over isolated I/O is that it has a large repertoire of instructions that can be used, allowing more efficient programming. The disadvantage of memory mapped I/O over isolated I/O is that valuable memory address space is sued up. Interrupts driven I/O Interrupt driven I/O works as follows… The processor issues an I/O command to a module and then goes on to do some other useful work The I/O module will then interrupts the processor to request service when is is ready to exchange data with the processor The processor then executes the data transfer and then resumes its former processing Interrupt Processing The occurrence of an interrupt triggers a number of events, both in the processor hardware and in software. When an I/O device completes an I/O operations the following sequence of hardware events occurs… The device issues an interrupt signal to the processor The processor finishes execution of the current instruction before responding to the interrupt The processor tests for an interrupt – determines that there is one – and sends an acknowledgement signal to the device that issues the interrupt. The acknowledgement allows the device to remove its interrupt signal The processor now needs to prepare to transfer control to the interrupt routine. To begin, it needs to save information needed to resume the current program at the point of interrupt. The minimum information required is the status of the processor and the location of the next instruction to be executed. The processor now loads the program counter with the entry location of the interrupt-handling program that will respond to this interrupt. It also saves the values of the process registers because the Interrupt operation may modify these The interrupt handler processes the interrupt – this includes examination of status information relating to the I/O operation or other event that caused an interrupt When interrupt processing is complete, the saved register values are retrieved from the stack and restored to the registers Finally, the PSW and program counter values from the stack are restored. Design Issues Two design issues arise in implementing interrupt I/O Because there will be multiple I/O modules, how does the processor determine which device issued the interrupt? If multiple interrupts have occurred, how does the processor decide which one to process? Addressing device recognition, 4 general categories of techniques are in common use… Multiple interrupt lines Software poll Daisy chain Bus arbitration For a detailed explanation of these approaches read page 250 of the textbook. Interrupt driven I/O while more efficient than simple programmed I/O still requires the active intervention of the processor to transfer data between memory and an I/O module, and any data transfer must traverse a path through the processor. Thus is suffers from two inherent drawbacks… The I/O transfer rate is limited by the speed with which the processor can test and service a device The processor is tied up in managing an I/O transfer; a number of instructions must be executed for each I/O transfer Direct Memory Access When large volumes of data are to be moved, an efficient technique is direct memory access (DMA) DMA Function DMA involves an additional module on the system bus. The DMA module is capable of mimicking the processor and taking over control of the system from the processor. It needs to do this to transfer data to and from memory over the system bus. DMA must the bus only when the processor does not need it, or it must force the processor to suspend operation temporarily (most common – referred to as cycle stealing). When the processor wishes to read or write a block of data, it issues a command to the DMA module by sending to the DMA module the following information… Whether a read or write is requested using the read or write control line between the processor and the DMA module The address of the I/O device involved, communicated on the data lines The starting location in memory to read from or write to, communicated on the data lines and stored by the DMA module in its address register The number of words to be read or written, communicated via the data lines and stored in the data count register The processor then continues with other work, it delegates the I/O operation to the DMA module which transfers the entire block of data, one word at a time, directly to or from memory without going through the processor. When the transfer is complete, the DMA module sends an interrupt signal to the processor, this the processor is involved only at the beginning and end of the transfer. I/O Channels and Processors Characteristics of I/O Channels As one proceeds along the evolutionary path, more and more of the I/O function is performed without CPU involvement. The I/O channel represents an extension of the DMA concept. An I/O channel ahs the ability to execute I/O instructions, which gives it complete control over I/O operations. In a computer system with such devices, the CPU does not execute I/O instructions – such instructions are stored in main memory to be executed by a special purpose processor in the I/O channel itself. Two types of I/O channels are common A selector channel controls multiple high-speed devices. A multiplexor channel can handle I/O with multiple characters as fast as possible to multiple devices. The external interface: FireWire and InfiniBand Types of Interfaces One major characteristic of the interface is whether it is serial or parallel parallel interface – there are multiple lines connecting the I/O module and the peripheral, and multiple bits are transferred simultaneously serial interface – there is only one line used to transmit data, and bits must be transmitted one at a time With new generation serial interfaces, parallel interfaces are becoming less common. In either case, the I/O module must engage in a dialogue with the peripheral. In general terms the dialog may look as follows… The I/O module sends a control signal requesting permission to send data The peripheral acknowledges the request The I/O module transfers data The peripheral acknowledges receipt of data For a detailed explanation of FireWire and InfiniBand technology read page 264 – 270 of the textbook

    Read the article

  • Spooling in SQL execution plans

    - by Rob Farley
    Sewing has never been my thing. I barely even know the terminology, and when discussing this with American friends, I even found out that half the words that Americans use are different to the words that English and Australian people use. That said – let’s talk about spools! In particular, the Spool operators that you find in some SQL execution plans. This post is for T-SQL Tuesday, hosted this month by me! I’ve chosen to write about spools because they seem to get a bad rap (even in my song I used the line “There’s spooling from a CTE, they’ve got recursion needlessly”). I figured it was worth covering some of what spools are about, and hopefully explain why they are remarkably necessary, and generally very useful. If you have a look at the Books Online page about Plan Operators, at http://msdn.microsoft.com/en-us/library/ms191158.aspx, and do a search for the word ‘spool’, you’ll notice it says there are 46 matches. 46! Yeah, that’s what I thought too... Spooling is mentioned in several operators: Eager Spool, Lazy Spool, Index Spool (sometimes called a Nonclustered Index Spool), Row Count Spool, Spool, Table Spool, and Window Spool (oh, and Cache, which is a special kind of spool for a single row, but as it isn’t used in SQL 2012, I won’t describe it any further here). Spool, Table Spool, Index Spool, Window Spool and Row Count Spool are all physical operators, whereas Eager Spool and Lazy Spool are logical operators, describing the way that the other spools work. For example, you might see a Table Spool which is either Eager or Lazy. A Window Spool can actually act as both, as I’ll mention in a moment. In sewing, cotton is put onto a spool to make it more useful. You might buy it in bulk on a cone, but if you’re going to be using a sewing machine, then you quite probably want to have it on a spool or bobbin, which allows it to be used in a more effective way. This is the picture that I want you to think about in relation to your data. I’m sure you use spools every time you use your sewing machine. I know I do. I can’t think of a time when I’ve got out my sewing machine to do some sewing and haven’t used a spool. However, I often run SQL queries that don’t use spools. You see, the data that is consumed by my query is typically in a useful state without a spool. It’s like I can just sew with my cotton despite it not being on a spool! Many of my favourite features in T-SQL do like to use spools though. This looks like a very similar query to before, but includes an OVER clause to return a column telling me the number of rows in my data set. I’ll describe what’s going on in a few paragraphs’ time. So what does a Spool operator actually do? The spool operator consumes a set of data, and stores it in a temporary structure, in the tempdb database. This structure is typically either a Table (ie, a heap), or an Index (ie, a b-tree). If no data is actually needed from it, then it could also be a Row Count spool, which only stores the number of rows that the spool operator consumes. A Window Spool is another option if the data being consumed is tightly linked to windows of data, such as when the ROWS/RANGE clause of the OVER clause is being used. You could maybe think about the type of spool being like whether the cotton is going onto a small bobbin to fit in the base of the sewing machine, or whether it’s a larger spool for the top. A Table or Index Spool is either Eager or Lazy in nature. Eager and Lazy are Logical operators, which talk more about the behaviour, rather than the physical operation. If I’m sewing, I can either be all enthusiastic and get all my cotton onto the spool before I start, or I can do it as I need it. “Lazy” might not the be the best word to describe a person – in the SQL world it describes the idea of either fetching all the rows to build up the whole spool when the operator is called (Eager), or populating the spool only as it’s needed (Lazy). Window Spools are both physical and logical. They’re eager on a per-window basis, but lazy between windows. And when is it needed? The way I see it, spools are needed for two reasons. 1 – When data is going to be needed AGAIN. 2 – When data needs to be kept away from the original source. If you’re someone that writes long stored procedures, you are probably quite aware of the second scenario. I see plenty of stored procedures being written this way – where the query writer populates a temporary table, so that they can make updates to it without risking the original table. SQL does this too. Imagine I’m updating my contact list, and some of my changes move data to later in the book. If I’m not careful, I might update the same row a second time (or even enter an infinite loop, updating it over and over). A spool can make sure that I don’t, by using a copy of the data. This problem is known as the Halloween Effect (not because it’s spooky, but because it was discovered in late October one year). As I’m sure you can imagine, the kind of spool you’d need to protect against the Halloween Effect would be eager, because if you’re only handling one row at a time, then you’re not providing the protection... An eager spool will block the flow of data, waiting until it has fetched all the data before serving it up to the operator that called it. In the query below I’m forcing the Query Optimizer to use an index which would be upset if the Name column values got changed, and we see that before any data is fetched, a spool is created to load the data into. This doesn’t stop the index being maintained, but it does mean that the index is protected from the changes that are being done. There are plenty of times, though, when you need data repeatedly. Consider the query I put above. A simple join, but then counting the number of rows that came through. The way that this has executed (be it ideal or not), is to ask that a Table Spool be populated. That’s the Table Spool operator on the top row. That spool can produce the same set of rows repeatedly. This is the behaviour that we see in the bottom half of the plan. In the bottom half of the plan, we see that the a join is being done between the rows that are being sourced from the spool – one being aggregated and one not – producing the columns that we need for the query. Table v Index When considering whether to use a Table Spool or an Index Spool, the question that the Query Optimizer needs to answer is whether there is sufficient benefit to storing the data in a b-tree. The idea of having data in indexes is great, but of course there is a cost to maintaining them. Here we’re creating a temporary structure for data, and there is a cost associated with populating each row into its correct position according to a b-tree, as opposed to simply adding it to the end of the list of rows in a heap. Using a b-tree could even result in page-splits as the b-tree is populated, so there had better be a reason to use that kind of structure. That all depends on how the data is going to be used in other parts of the plan. If you’ve ever thought that you could use a temporary index for a particular query, well this is it – and the Query Optimizer can do that if it thinks it’s worthwhile. It’s worth noting that just because a Spool is populated using an Index Spool, it can still be fetched using a Table Spool. The details about whether or not a Spool used as a source shows as a Table Spool or an Index Spool is more about whether a Seek predicate is used, rather than on the underlying structure. Recursive CTE I’ve already shown you an example of spooling when the OVER clause is used. You might see them being used whenever you have data that is needed multiple times, and CTEs are quite common here. With the definition of a set of data described in a CTE, if the query writer is leveraging this by referring to the CTE multiple times, and there’s no simplification to be leveraged, a spool could theoretically be used to avoid reapplying the CTE’s logic. Annoyingly, this doesn’t happen. Consider this query, which really looks like it’s using the same data twice. I’m creating a set of data (which is completely deterministic, by the way), and then joining it back to itself. There seems to be no reason why it shouldn’t use a spool for the set described by the CTE, but it doesn’t. On the other hand, if we don’t pull as many columns back, we might see a very different plan. You see, CTEs, like all sub-queries, are simplified out to figure out the best way of executing the whole query. My example is somewhat contrived, and although there are plenty of cases when it’s nice to give the Query Optimizer hints about how to execute queries, it usually doesn’t do a bad job, even without spooling (and you can always use a temporary table). When recursion is used, though, spooling should be expected. Consider what we’re asking for in a recursive CTE. We’re telling the system to construct a set of data using an initial query, and then use set as a source for another query, piping this back into the same set and back around. It’s very much a spool. The analogy of cotton is long gone here, as the idea of having a continual loop of cotton feeding onto a spool and off again doesn’t quite fit, but that’s what we have here. Data is being fed onto the spool, and getting pulled out a second time when the spool is used as a source. (This query is running on AdventureWorks, which has a ManagerID column in HumanResources.Employee, not AdventureWorks2012) The Index Spool operator is sucking rows into it – lazily. It has to be lazy, because at the start, there’s only one row to be had. However, as rows get populated onto the spool, the Table Spool operator on the right can return rows when asked, ending up with more rows (potentially) getting back onto the spool, ready for the next round. (The Assert operator is merely checking to see if we’ve reached the MAXRECURSION point – it vanishes if you use OPTION (MAXRECURSION 0), which you can try yourself if you like). Spools are useful. Don’t lose sight of that. Every time you use temporary tables or table variables in a stored procedure, you’re essentially doing the same – don’t get upset at the Query Optimizer for doing so, even if you think the spool looks like an expensive part of the query. I hope you’re enjoying this T-SQL Tuesday. Why not head over to my post that is hosting it this month to read about some other plan operators? At some point I’ll write a summary post – once I have you should find a comment below pointing at it. @rob_farley

    Read the article

  • How can I read data from a generated pointcloud, but from an other perspective of the camera? [migrated]

    - by Vlad Lata
    Basically what I'm trying to do is as follows: I have a software that generates and shows a pointcloud by analyzing Time-of-Flight data (Z-Data). This software has a GUI that delivers this pointcloud on a grid, and you can watch it and adjust the camera to change the perspective, or apply filtering to it and so on. Since the Z-data was recorder through a stereoscopic system, I want to obtain a perspective transformation. My idea was to simply change the position of the camera in the GUI and than add a button that sais (ex. New Perspective) that calls a function that would measure the distances from the existing pointcloud to the camera I'm viewing it from. Of course this would generate some occluded areas, but I want this to happen. And now the main question is: How can I do that? Are there any functions in OpenGL that measure the distance from an object to a camera, or is it even possible to do something like his? Or has someone some other idea? P.S. The software uses the qt sdk and opengl

    Read the article

  • How to use the unit of work and repository patterns in a service oriented enviroment

    - by A. Karimi
    I've created an application framework using the unit of work and repository patterns for it's data layer. Data consumer layers such as presentation depend on the data layer design. For example a CRUD abstract form has a dependency to a repository (IRepository). This architecture works like a charm in client/server environments (Ex. a WPF application and a SQL Server). But I'm looking for a good pattern to change or reuse this architecture for a service oriented environment. Of course I have some ideas: Idea 1: The "Adapter" design pattern Keep the current architecture and create a new unit of work and repository implementation which can work with a service instead of the ORM. Data layer consumers are loosely coupled to the data layer so it's possible but the problem is about the unit of work; I have to create a context which tracks the objects state at the client side and sends the changes to the server side on calling the "Commit" (Something that I think the RIA has done for Silverlight). Here the diagram: ----------- CLIENT----------- | ------------------ SERVER ---------------------- [ UI ] -> [ UoW/Repository ] ---> [ Web Services ] -> [ UoW/Repository ] -> [DB] Idea 2: Add another layer Add another layer (let say "local services" or "data provider"), then put it between the data layer (unit of work and repository) and the data consumer layers (like UI). Then I have to rewrite the consumer classes (CRUD and other classes which are dependent to IRepository) to depend on another interface. And the diagram: ----------------- CLIENT ------------------ | ------------------- SERVER --------------------- [ UI ] -> [ Local Services/Data Provider ] ---> [ Web Services ] -> [ UoW/Repository ] -> [DB] Please note that I have the local services layer on the current architecture but it doesn't expose the data layer functionality. In another word the UI layer can communicate with both of the data and local services layers whereas the local services layer also uses the data layer. | | | | | | | | ---> | Local Services | ---> | | | UI | | | | Data | | | | | | | ----------------------------> | |

    Read the article

  • How to create per-vertex normals when reusing vertex data?

    - by Chris Smith
    I am displaying a cube using a vertex buffer object (gl.ELEMENT_ARRAY_BUFFER). This allows me to specify vertex indicies, rather than having duplicate vertexes. In the case of displaying a simple cube, this means I only need to have eight vertices total. Opposed to needing three vertices per triangle, times two triangles per face, times six faces. Sound correct so far? My question is, how do I now deal with vertex attribute data such as color, texture coordinates, and normals when reusing vertices using the vertex buffer object? If I am reusing the same vertex data in my indexed vertex buffer, how can I differentiate when vertex X is used as part of the cube's front face versus the cube's left face? In both cases I would like the surface normal and texture coordinates to be different. I understand I could average the surface normal, however I would like to render a cube. Also, this still doesn't work for texture coordinates. Is there a way to save memory using a vertex buffer object while being able to provide different vertex attribute data based on context? (Per-triangle would be idea.) Or should I just duplicate each vertex for each context in which it gets rendered. (So there is a one-to-one mapping between vertex, normal, color, etc.) Note: I'm using OpenGL ES.

    Read the article

  • ASP.NET MVC 2 from Scratch &ndash; Part 1 Listing Data from Database

    - by Max
    Part 1 - Listing Data from Database: Let us now learn ASP.NET MVC 2 from Scratch by actually developing a front end website for the Chinook database, which is an alternative to the traditional Northwind database. You can get the Chinook database from here. As always the best way to learn something is by working on it and doing something. The Chinook database has the following schema, a quick look will help us implementing the application in a efficient way. Let us first implement a grid view table with the list of Employees with some details, this table also has the Details, Edit and Delete buttons on it to do some operations. This is series of post will concentrate on creating a simple CRUD front end for Chinook DB using ASP.NET MVC 2. In this post, we will look at listing all the possible Employees in the database in a tabular format, from which, we can then edit and delete them as required. In this post, we will concentrate on setting up our environment and then just designing a page to show a tabular information from the database. We need to first setup the SQL Server database, you can download the required version and then set it up in your localhost. Then we need to add the LINQ to SQL Classes required for us to enable interaction with our database. Now after you do the above step, just use your Server Explorer in VS 2010 to actually navigate to the database, expand the tables node and then drag drop all the tables onto the Object Relational Designer space and you go you will have the tables visualized as classes. As simple as that. Now for the purpose of displaying the data from Employee in a table, we will show only the EmployeeID, Firstname and lastname. So let us create a class to hold this information. So let us add a new class called EmployeeList to the ViewModels. We will send this data model to the View and this can be displayed in the page. public class EmployeeList { public int EmployeeID { get; set; } public string Firstname { get; set; } public string Lastname { get; set; } public EmployeeList(int empID, string fname, string lname) { this.EmployeeID = empID; this.Firstname = fname; this.Lastname = lname; } } Ok now we have got the backend ready. Let us now look at the front end view now. We will first create a master called Site.Master and reuse it across the site. The Site.Master content will be <%@ Master Language="C#" AutoEventWireup="true" CodeBehind="Site.Master.cs" Inherits="ChinookMvcSample.Views.Shared.Site" %>   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head id="Head1" runat="server"> <title></title> <style type="text/css"> html { background-color: gray; } .content { width: 880px; position: relative; background-color: #ffffff; min-width: 880px; min-height: 800px; float: inherit; text-align: justify; } </style> <script src="../../Scripts/jquery-1.4.1.min.js" type="text/javascript"></script> <asp:ContentPlaceHolder ID="head" runat="server"> </asp:ContentPlaceHolder> </head> <body> <center> <h1> My Website</h1> <div class="content"> <asp:ContentPlaceHolder ID="body" runat="server"> </asp:ContentPlaceHolder> </div> </center> </body> </html> The backend Site.Master.cs does not contain anything. In the actual Index.aspx view, we add the code to simply iterate through the collection of EmployeeList that was sent to the View via the Controller. So in the top of the Index.aspx view, we have this inherits which says Inherits="System.Web.Mvc.ViewPage<IEnumerable<ChinookMvcSample.ViewModels.EmployeeList>>" In this above line, we dictate that the page is consuming a IEnumerable collection of EmployeeList. So once we specify this and compile the project. Then in our Index.aspx page, we can consume the EmployeeList object and access all its methods and properties. <table class="styled" cellpadding="3" border="0" cellspacing="0"> <tr> <th colspan="3"> </th> <th> First Name </th> <th> Last Name </th> </tr> <% foreach (var item in Model) { %> <tr> <td align="center"> <%: Html.ActionLink("Edit", "Edit", new { id = item.EmployeeID }, new { id = "links" })%> </td> <td align="center"> <%: Html.ActionLink("Details", "Details", new { id = item.EmployeeID }, new { id = "links" })%> </td> <td align="center"> <%: Html.ActionLink("Delete", "Delete", new { id = item.EmployeeID }, new { id = "links" })%> </td> <td> <%: item.Firstname %> </td> <td> <%: item.Lastname %> </td> </tr> <% } %> <tr> <td colspan="5"> <%: Html.ActionLink("Create New", "Create") %> </td> </tr> </table> The Html.ActionLink is a Html Helper to a create a hyperlink in the page, in the one we have used, the first parameter is the text that is to be used for the hyperlink, second one is the action name, third one is the parameter to be passed, last one is the attributes to be added while the hyperlink is rendered in the page. Here we are adding the id=”links” to the hyperlinks that is created in the page. In the index.aspx page, we add some jQuery stuff add alternate row colours and highlight colours for rows on mouse over. Now the Controller that handles the requests and directs the request to the right view. For the index view, the controller would be public ActionResult Index() { //var Employees = from e in data.Employees select new EmployeeList(e.EmployeeId,e.FirstName,e.LastName); //return View(Employees.ToList()); return View(_data.Employees.Select(p => new EmployeeList(p.EmployeeId, p.FirstName, p.LastName))); } Let us also write a unit test using NUnit for the above, just testing EmployeeController’s Index. DataClasses1DataContext _data; public EmployeeControllerTest() { _data = new DataClasses1DataContext("Data Source=(local);Initial Catalog=Chinook;Integrated Security=True"); }   [Test] public void TestEmployeeIndex() { var e = new EmployeeController(_data); var result = e.Index() as ViewResult; var employeeList = result.ViewData.Model; Assert.IsNotNull(employeeList, "Result is null."); } In the first EmployeeControllerTest constructor, we set the data context to be used while running the tests. And then in the actual test, We just ensure that the View results returned by Index is not null. Here is the zip of the entire solution files until this point. Let me know if you have any doubts or clarifications. Cheers! Have a nice day.

    Read the article

  • A Big Data korszakban, túl az 1000. eladott Oracle Exadata Database Machine adatbázisgépen

    - by user645740
    Mint azt már egy ideje a szél is fújja, beköszöntött a BIG DATA korszak, azaz egyre több adat gyulik, egyre több adattal gazdálkodunk. A hatalmas mennyiségu adat jó részét Oracle adatbázisokban tárolják. Mi is futtathatná jobban, gyorsabban és hetékonyabban ezeket az Oracle adatbázisokat, mint az Oracle stratégiai high-end megoldása az Oracle Exadata Database Machine? Rengeteg forrása van a sok adatnak, néhány példa, ahol a növekedés óriási: kommunikációs adatok, CDR-ek banki és kormányzati tranzakciók hely információk spatial, location, GPS,..., mint ahogyan a közelmúltban az egyes telefonokkal ésoperációs rendszerekkel kapcsolatos "ügyekben" is olvashattuk, e-mail-ek, közösségi site-ok, intelligens méromuszerek, háztartási berendezések, .... Milyen ütemben no az Exadata értékesítés? Nos az Exadata 2008 oszén lett bejelentve. Az Oracle pénzügyi év végén a jelentésben azt olvashatjuk, hogy az Exadata páratlanul sikeres megoldás, már több mint 1000 Exadatát vásároltak meg az Oracle ügyfelek, mondta Mark Hurd, az Oracle alelnöke:   “In addition to record setting software sales, our Exadata and Exalogic systems also made a strong contribution to our growth in Q4,” said Oracle President, Mark Hurd. “Today there are more than 1,000 Exadata machines installed worldwide. Our goal is to triple that number in FY12.” Larry Ellison, az Oracle elso embere, azt nyilatkozta, hogy mind a felho - cloud computing, mind a memória-adatbázisok területén egyre gyorsabban növekszik az Oracle:   “In FY11 Oracle’s database business experienced its fastest growth in a decade,” said Oracle CEO, Larry Ellison. “Over the past few years we added features to the Oracle database for both cloud computing and in-memory databases that led to increased database sales this past year. Lately we’ve been focused on the big business opportunity presented by Big Data.” A Big Data korszakban  megtakarításokat érhetünk el az Exadatával, tekintse meg a következo videót, de óvatosan, mert gondolkodásra késztet:    -   Oracle Exadata: Are You Ready?.

    Read the article

< Previous Page | 531 532 533 534 535 536 537 538 539 540 541 542  | Next Page >