Search Results

Search found 21142 results on 846 pages for 'bit manipulation'.

Page 372/846 | < Previous Page | 368 369 370 371 372 373 374 375 376 377 378 379 | Next Page >

ASP.NET MVC JavaScript Routing

- by zowens

Have you ever done this sort of thing in your ASP.NET MVC view? The weird thing about this isn’t the alert function, it’s the code block containing the Url formation using the ASP.NET MVC UrlHelper. The terrible thing about this experience is the obvious lack of IntelliSense and this ugly inline JavaScript code. Inline JavaScript isn’t portable to other pages beyond the current page of execution. It is generally considered bad practice to use inline JavaScript in your public-facing pages. How ludicrous would it be to copy and paste the entire jQuery code base into your pages…? Not something you’d ever consider doing. The problem is that your URLs have to be generated by ASP.NET at runtime and really can’t be copied to your JavaScript code without some trickery. How about this? Does the hard-coded URL bother you? It really bothers me. The typical solution to this whole routing in JavaScript issue is to just hard-code your URLs into your JavaScript files and call it done. But what if your URLs change? You have to now go an track down the places in JavaScript and manually replace them. What if you get the pattern wrong? Do you have tests around it? This isn’t something you should have to worry about. The Solution To Our Problems The solution is to port routing over to JavaScript. Does that sound daunting to you? It’s actually not very hard, but I decided to create my own generator that will do all the work for you. What I have created is a very basic port of the route formation feature of ASP.NET routing. It will generate the formatted URLs based on your routing patterns. Here’s how you’d do this: Does that feel familiar? It looks a lot like something you’d do inside of your ASP.NET MVC views… but this is inside of a JavaScript file… just a plain ol’ .js file. Your first question might be why do you have to have that “.toUrl()” thing. The reason is that I wanted to make POST and GET requests dead simple. Here’s how you’d do a POST request (and the same would work with a GET request): The first parameter is extra data passed to the post request and the second parameter is a function that handles the success of the POST request. If you’re familiar with jQuery’s Ajax goodness, you’ll know how to use it. (if not, check out http://api.jquery.com/jQuery.Post/ and the parameters are essentially the same). But we still haven’t gotten rid of the magic strings. We still have controller names and action names represented as strings. This is going to blow your mind… If you’ve seen T4MVC, this will look familiar. We’re essentially doing the same sort of thing with my JavaScript router, but we’re porting the concept to JavaScript. The good news is that parameters to the controllers are directly reflected in the action function, just like T4MVC. And the even better news… IntlliSense is easily transferred to the JavaScript version if you’re using Visual Studio as your JavaScript editor. The additional data parameter gives you the ability to pass extra routing data to the URL formatter. About the Magic You may be wondering how this all work. It’s actually quite simple. I’ve built a simple jQuery pluggin (called routeManager) that hangs off the main jQuery namespace and routes all the URLs. Every time your solution builds, a routing file will be generated with this pluggin, all your route and controller definitions along with your documentation. Then by the power of Visual Studio, you get some really slick IntelliSense that is hard to live without. But there are a few steps you have to take before this whole thing is going to work. First and foremost, you need a reference to the JsRouting.Core.dll to your projects containing controllers or routes. Second, you have to specify your routes in a bit of a non-standard way. See, we can’t just pull routes out of your App_Start in your Global.asax. We force you to build a route source like this: The way we determine the routes is by pulling in all RouteSources and generating routes based upon the mapped routes. There are various reasons why we can’t use RouteCollection (different post for another day)… but in this case, you get the same route mapping experience. Converting the RouteSource to a RouteCollection is trivial (there’s an extension method for that). Next thing you have to do is generate a documentation XML file. This is done by going to the project settings, going to the build tab and clicking the checkbox. (this isn’t required, but nice to have). The final thing you need to do is hook up the generation mechanism. Pop open your project file and look for the AfterBuild step. Now change the build step task to look like this: The “PathToOutputExe” is the path to the JsRouting.Output.exe file. This will change based on where you put the EXE. The “PathToOutputJs” is a path to the output JavaScript file. The “DicrectoryOfAssemblies” is a path to the directory containing controller and routing DLLs. The JsRouting.Output.exe executable pulls in all these assemblies and scans them for controllers and route sources. Now that wasn’t too bad, was it :) The State of the Project This is definitely not complete… I have a lot of plans for this little project of mine. For starters, I need to look at the generation mechanism. Either I will be creating a utility that will do the project file manipulation or I will go a different direction. I’d like some feedback on this if you feel partial either way. Another thing I don’t support currently is areas. While this wouldn’t be too hard to support, I just don’t use areas and I wanted something up quickly (this is, after all, for a current project of mine). I’ll be adding support shortly. There are a few things that I haven’t covered in this post that I will most certainly be covering in another post, such as routing constraints and how these will be translated to JavaScript. I decided to open source this whole thing, since it’s a nice little utility I think others should really be using. Currently we’re using ASP.NET MVC 2, but it should work with MVC 3 as well. I’ll upgrade it as soon as MVC 3 is released. Along those same lines, I’m investigating how this could be put on the NuGet feed. Show me the Bits! OK, OK! The code is posted on my GitHub account. Go nuts. Tell me what you think. Tell me what you want. Tell me that you hate it. All feedback is welcome! https://github.com/zowens/ASP.NET-MVC-JavaScript-Routing

Read the article
Creating an SMF service for mercurial web server

- by Chris W Beal

I'm working on a project at the moment, which has a number of contributers. We're managing the project gate (which is stand alone) with mercurial. We want to have an easy way of seeing the changelog, so we can show management what is going on. Luckily mercurial provides a basic web server which allows you to see the changes, and drill in to change sets. This can be run as a daemon, but as it was running on our build server, every time it was rebooted, someone needed to remember to start the process again. This is of course a classic usage of SMF. Now I'm not an experienced person at writing SMF services, so it took me 1/2 an hour or so to figure it out the first time. But going forward I should know what I'm doing a bit better. I did reference this doc extensively. Taking a step back, the command to start the mercurial web server is $ hg serve -p <port number> -d So we somehow need to get SMF to run that command for us. In the simplest form, SMF services are really made up of two components. The manifest Usually lives in /var/svc/manifest somewhere Can be imported from any location The method Usually live in /lib/svc/method I simply put the script straight in that directory. Not very repeatable, but it worked Can take an argument of start, stop, or refresh Lets start with the manifest. This looks pretty complex, but all it's doing is describing the service name, the dependencies, the start and stop methods, and some properties. The properties can be by instance, that is to say I could have multiple hg serve processes handling different mercurial projects, on different ports simultaneously Here is the manifest I wrote. I stole extensively from the examples in the Documentation. So my manifest looks like this $ cat hg-serve.xml <?xml version="1.0"?> <!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1"> <service_bundle type='manifest' name='hg-serve'> <service name='application/network/hg-serve' type='service' version='1'> <dependency name='network' grouping='require_all' restart_on='none' type='service'> <service_fmri value='svc:/milestone/network:default' /> </dependency> <exec_method type='method' name='start' exec='/lib/svc/method/hg-serve %m' timeout_seconds='2' /> <exec_method type='method' name='stop' exec=':kill' timeout_seconds='2'> </exec_method> <instance name='project-gate' enabled='true'> <method_context> <method_credential user='root' group='root' /> </method_context> <property_group name='hg-serve' type='application'> <propval name='path' type='astring' value='/src/project-gate'/> <propval name='port' type='astring' value='9998' /> </property_group> </instance> <stability value='Evolving' /> <template> <common_name> <loctext xml:lang='C'>hg-serve</loctext> </common_name> <documentation> <manpage title='hg' section='1' /> </documentation> </template> </service> </service_bundle> So the only things I had to decide on in this are the service name "application/network/hg-serve" the start and stop methods (more of which later) and the properties. This is the information I need to pass to the start method script. In my case the port I want to start the web server on "9998", and the path to the source gate "/src/project-gate". These can be read in to the start method. So now lets look at the method scripts $ cat /lib/svc/method/hg-serve #!/sbin/sh # # # Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved. # # Standard prolog # . /lib/svc/share/smf_include.sh if [ -z $SMF_FMRI ]; then echo "SMF framework variables are not initialized." exit $SMF_EXIT_ERR fi # # Build the command line flags # # Get the port and directory from the SMF properties port=`svcprop -c -p hg-serve/port $SMF_FMRI` dir=`svcprop -c -p hg-serve/path $SMF_FMRI` echo "$1" case "$1" in 'start') cd $dir /usr/bin/hg serve -d -p $port ;; *) echo "Usage: $0 {start|refresh|stop}" exit 1 ;; esac exit $SMF_EXIT_OK This is all pretty self explanatory, we read the port and directory using svcprop, and use those simply to run a command in the start case. We don't need to implement a stop case, as the manifest says to use "exec=':kill'for the stop method. Now all we need to do is import the manifest and start the service, but first verify the manifest # svccfg verify /path/to/hg-serve.xml If that doesn't give an error try importing it # svccfg import /path/to/hg-serve.xml If like me you originally put the hg-serve.xml file in /var/svc/manifest somewhere you'll get an error and told to restart the import service svccfg: Restarting svc:/system/manifest-import The manifest being imported is from a standard location and should be imported with the command : svcadm restart svc:/system/manifest-import # svcadm restart svc:/system/manifest-import and you're nearly done. You can look at the service using svcs -l # svcs -l hg-serve fmri svc:/application/network/hg-serve:project-gate name hg-serve enabled false state disabled next_state none state_time Thu May 31 16:11:47 2012 logfile /var/svc/log/application-network-hg-serve:project-gate.log restarter svc:/system/svc/restarter:default contract_id 15749 manifest /var/svc/manifest/network/hg/hg-serve.xml dependency require_all/none svc:/milestone/network:default (online) And look at the interesting properties # svcprop hg-serve hg-serve/path astring /src/project-gate hg-serve/port astring 9998 ...stuff deleted.... Then simply enable the service and if every things gone right, you can point your browser at http://server:9998 and get a nice graphical log of project activity. # svcadm enable hg-serve # svcs -l hg-serve fmri svc:/application/network/hg-serve:project-gate name hg-serve enabled true state online next_state none state_time Thu May 31 16:18:11 2012 logfile /var/svc/log/application-network-hg-serve:project-gate.log restarter svc:/system/svc/restarter:default contract_id 15858 manifest /var/svc/manifest/network/hg/hg-serve.xml dependency require_all/none svc:/milestone/network:default (online) None of this is rocket science, but a bit fiddly. Hence I thought I'd blog it. It might just be you see this in google and it clicks with you more than one of the many other blogs or how tos about it. Plus I can always refer back to it myself in 3 weeks, when I want to add another project to the server, and I've forgotten how to do it.

Read the article
How to archive data from a table to a local or remote database in SQL 2005 and SQL 2008

- by simonsabin

Often you have the need to archive data from a table. This leads to a number of challenges 1. How can you do it without impacting users 2. How can I make it transactionally consistent, i.e. the data I put in the archive is the data I remove from the main table 3. How can I get it to perform well Points 1 is very much tied to point 3. If it doesn't perform well then the delete of data is going to cause lots of locks and thus potentially blocking. For points 1 and 3 refer to my previous posts DELETE-TOP-x-rows-avoiding-a-table-scan and UPDATE-and-DELETE-TOP-and-ORDER-BY---Part2. In essence you need to be removing small chunks of data from your table and you want to do that avoiding a table scan. So that deals with the delete approach but archiving is about inserting that data somewhere else. Well in SQL 2008 they introduced a new feature INSERT over DML (Data Manipulation Language, i.e. SQL statements that change data), or composable DML. The ability to nest DML statements within themselves, so you can past the results of an insert to an update to a merge. I've mentioned this before here SQL-Server-2008---MERGE-and-optimistic-concurrency. This feature is currently limited to being able to consume the results of a DML statement in an INSERT statement. There are many restrictions which you can find here http://msdn.microsoft.com/en-us/library/ms177564.aspx look for the section "Inserting Data Returned From an OUTPUT Clause Into a Table" Even with the restrictions what we can do is consume the OUTPUT from a DELETE and INSERT the results into a table in another database. Note that in BOL it refers to not being able to use a remote table, remote means a table on another SQL instance. To show this working use this SQL to setup two databases foo and fooArchive create database foo go --create the source table fred in database foo select * into foo..fred from sys.objects go create database fooArchive go if object_id('fredarchive',DB_ID('fooArchive')) is null begin select getdate() ArchiveDate,* into fooArchive..FredArchive from sys.objects where 1=2 end go And then we can use this simple statement to archive the data insert into fooArchive..FredArchive select getdate(),d.* from (delete top (1) from foo..Fred output deleted.*) d go In this statement the delete can be any delete statement you wish so if you are deleting by ids or a range of values then you can do that. Refer to the DELETE-TOP-x-rows-avoiding-a-table-scan post to ensure that your delete is going to perform. The last thing you want to do is to perform 100 deletes each with 5000 records for each of those deletes to do a table scan. For a solution that works for SQL2005 or if you want to archive to a different server then you can use linked servers or SSIS. This example shows how to do it with linked servers. [ONARC-LAP03] is the source server. begin transaction insert into fooArchive..FredArchive select getdate(),d.* from openquery ([ONARC-LAP03],'delete top (1) from foo..Fred output deleted.*') d commit transaction and to prove the transactions work try, you should get the same number of records before and after. select (select count(1) from foo..Fred) fred ,(select COUNT(1) from fooArchive..FredArchive ) fredarchive begin transaction insert into fooArchive..FredArchive select getdate(),d.* from openquery ([ONARC-LAP03],'delete top (1) from foo..Fred output deleted.*') d rollback transaction select (select count(1) from foo..Fred) fred ,(select COUNT(1) from fooArchive..FredArchive ) fredarchive The transactions are very important with this solution. Look what happens when you don't have transactions and an error occurs select (select count(1) from foo..Fred) fred ,(select COUNT(1) from fooArchive..FredArchive ) fredarchive insert into fooArchive..FredArchive select getdate(),d.* from openquery ([ONARC-LAP03],'delete top (1) from foo..Fred output deleted.* raiserror (''Oh doo doo'',15,15)') d select (select count(1) from foo..Fred) fred ,(select COUNT(1) from fooArchive..FredArchive ) fredarchive Before running this think what the result would be. I got it wrong. What seems to happen is that the remote query is executed as a transaction, the error causes that to rollback. However the results have already been sent to the client and so get inserted into the

Read the article
The blocking nature of aggregates

- by Rob Farley

I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here. In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

Read the article
The blocking nature of aggregates

- by Rob Farley

I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here. In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

Read the article
Improved Performance on PeopleSoft Combined Benchmark using SPARC T4-4

- by Brian

Oracle's SPARC T4-4 server running Oracle's PeopleSoft HCM 9.1 combined online and batch benchmark achieved a world record 18,000 concurrent users experiencing subsecond response time while executing a PeopleSoft Payroll batch job of 500,000 employees in 32.4 minutes. This result was obtained with a SPARC T4-4 server running Oracle Database 11g Release 2, a SPARC T4-4 server running PeopleSoft HCM 9.1 application server and a SPARC T4-2 server running Oracle WebLogic Server in the web tier. The SPARC T4-4 server running the application tier used Oracle Solaris Zones which provide a flexible, scalable and manageable virtualization environment. The average CPU utilization on the SPARC T4-2 server in the web tier was 17%, on the SPARC T4-4 server in the application tier it was 59%, and on the SPARC T4-4 server in the database tier was 47% (online and batch) leaving significant headroom for additional processing across the three tiers. The SPARC T4-4 server used for the database tier hosted Oracle Database 11g Release 2 using Oracle Automatic Storage Management (ASM) for database files management with I/O performance equivalent to raw devices. Performance Landscape Results are presented for the PeopleSoft HRMS Self-Service and Payroll combined benchmark. The new result with 128 streams shows significant improvement in the payroll batch processing time with little impact on the self-service component response time. PeopleSoft HRMS Self-Service and Payroll Benchmark Systems Users Ave Response Search (sec) Ave Response Save (sec) Batch Time (min) Streams SPARC T4-2 (web) SPARC T4-4 (app) SPARC T4-4 (db) 18,000 0.988 0.539 32.4 128 SPARC T4-2 (web) SPARC T4-4 (app) SPARC T4-4 (db) 18,000 0.944 0.503 43.3 64 The following results are for the PeopleSoft HRMS Self-Service benchmark that was previous run. The results are not directly comparable with the combined results because they do not include the payroll component. PeopleSoft HRMS Self-Service 9.1 Benchmark Systems Users Ave Response Search (sec) Ave Response Save (sec) Batch Time (min) Streams SPARC T4-2 (web) SPARC T4-4 (app) 2x SPARC T4-2 (db) 18,000 1.048 0.742 N/A N/A The following results are for the PeopleSoft Payroll benchmark that was previous run. The results are not directly comparable with the combined results because they do not include the self-service component. PeopleSoft Payroll (N.A.) 9.1 - 500K Employees (7 Million SQL PayCalc, Unicode) Systems Users Ave Response Search (sec) Ave Response Save (sec) Batch Time (min) Streams SPARC T4-4 (db) N/A N/A N/A 30.84 96 Configuration Summary Application Configuration: 1 x SPARC T4-4 server with 4 x SPARC T4 processors, 3.0 GHz 512 GB memory Oracle Solaris 11 11/11 PeopleTools 8.52 PeopleSoft HCM 9.1 Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 031 Java Platform, Standard Edition Development Kit 6 Update 32 Database Configuration: 1 x SPARC T4-4 server with 4 x SPARC T4 processors, 3.0 GHz 256 GB memory Oracle Solaris 11 11/11 Oracle Database 11g Release 2 PeopleTools 8.52 Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 031 Micro Focus Server Express (COBOL v 5.1.00) Web Tier Configuration: 1 x SPARC T4-2 server with 2 x SPARC T4 processors, 2.85 GHz 256 GB memory Oracle Solaris 11 11/11 PeopleTools 8.52 Oracle WebLogic Server 10.3.4 Java Platform, Standard Edition Development Kit 6 Update 32 Storage Configuration: 1 x Sun Server X2-4 as a COMSTAR head for data 4 x Intel Xeon X7550, 2.0 GHz 128 GB memory 1 x Sun Storage F5100 Flash Array (80 flash modules) 1 x Sun Storage F5100 Flash Array (40 flash modules) 1 x Sun Fire X4275 as a COMSTAR head for redo logs 12 x 2 TB SAS disks with Niwot Raid controller Benchmark Description This benchmark combines PeopleSoft HCM 9.1 HR Self Service online and PeopleSoft Payroll batch workloads to run on a unified database deployed on Oracle Database 11g Release 2. The PeopleSoft HRSS benchmark kit is a Oracle standard benchmark kit run by all platform vendors to measure the performance. It's an OLTP benchmark where DB SQLs are moderately complex. The results are certified by Oracle and a white paper is published. PeopleSoft HR SS defines a business transaction as a series of HTML pages that guide a user through a particular scenario. Users are defined as corporate Employees, Managers and HR administrators. The benchmark consist of 14 scenarios which emulate users performing typical HCM transactions such as viewing paycheck, promoting and hiring employees, updating employee profile and other typical HCM application transactions. All these transactions are well-defined in the PeopleSoft HR Self-Service 9.1 benchmark kit. This benchmark metric is the weighted average response search/save time for all the transactions. The PeopleSoft 9.1 Payroll (North America) benchmark demonstrates system performance for a range of processing volumes in a specific configuration. This workload represents large batch runs typical of a ERP environment during a mass update. The benchmark measures five application business process run times for a database representing large organization. They are Paysheet Creation, Payroll Calculation, Payroll Confirmation, Print Advice forms, and Create Direct Deposit File. The benchmark metric is the cumulative elapsed time taken to complete the Paysheet Creation, Payroll Calculation and Payroll Confirmation business application processes. The benchmark metrics are taken for each respective benchmark while running simultaneously on the same database back-end. Specifically, the payroll batch processes are started when the online workload reaches steady state (the maximum number of online users) and overlap with online transactions for the duration of the steady state. Key Points and Best Practices Two PeopleSoft Domain sets with 200 application servers each on a SPARC T4-4 server were hosted in 2 separate Oracle Solaris Zones to demonstrate consolidation of multiple application servers, ease of administration and performance tuning. Each Oracle Solaris Zone was bound to a separate processor set, each containing 15 cores (total 120 threads). The default set (1 core from first and third processor socket, total 16 threads) was used for network and disk interrupt handling. This was done to improve performance by reducing memory access latency by using the physical memory closest to the processors and offload I/O interrupt handling to default set threads, freeing up cpu resources for Application Servers threads and balancing application workload across 240 threads. A total of 128 PeopleSoft streams server processes where used on the database node to complete payroll batch job of 500,000 employees in 32.4 minutes. See Also Oracle PeopleSoft Benchmark White Papers oracle.com SPARC T4-2 Server oracle.com OTN SPARC T4-4 Server oracle.com OTN PeopleSoft Enterprise Human Capital Managementoracle.com OTN PeopleSoft Enterprise Human Capital Management (Payroll) oracle.com OTN Oracle Solaris oracle.com OTN Oracle Database 11g Release 2 oracle.com OTN Disclosure Statement Copyright 2012, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 8 November 2012.

Read the article
Working With Extended Events

- by Fatherjack

SQL Server 2012 has made working with Extended Events (XE) pretty simple when it comes to what sessions you have on your servers and what options you have selected and so forth but if you are like me then you still have some SQL Server instances that are 2008 or 2008 R2. For those servers there is no built-in way to view the Extended Event sessions in SSMS. I keep coming up against the same situations – Where are the xel log files? What events, actions or predicates are set for the events on the server? What sessions are there on the server already? I got tired of this being a perpetual question and wrote some TSQL to save as a snippet in SQL Prompt so that these details are permanently only a couple of clicks away. First, some history. If you just came here for the code skip down a few paragraphs and it’s all there. If you want a little time to reminisce about SQL Server then stick with me through the next paragraph or two. We are in a bit of a cross-over period currently, there are many versions of SQL Server but I would guess that SQL Server 2008, 2008 R2 and 2012 comprise the majority of installations. With each of these comes a set of management tools, of which SQL Server Management Studio (SSMS) is one. In 2008 and 2008 R2 Extended Events made their first appearance and there was no way to work with them in the SSMS interface. At some point the Extended Events guru Jonathan Kehayias (http://www.sqlskills.com/blogs/jonathan/) created the SQL Server 2008 Extended Events SSMS Addin which is really an excellent tool to ease XE session administration. This addin will install in SSMS 2008 or 2008R2 but not SSMS 2012. If you use a compatible version of SSMS then I wholly recommend downloading and using it to make your work with XE much easier. If you have SSMS 2012 installed, and there is no reason not to as it will let you work with all versions of SQL Server, then you cannot install this addin. If you are working with SQL Server 2012 then SSMS 2012 has built in functionality to manage XE sessions – this functionality does not apply for 2008 or 2008 R2 instances though. This means you are somewhat restricted and have to use TSQL to manage XE sessions on older versions of SQL Server. OK, those of you that skipped ahead for the code, you need to start from here: So, you are working with SSMS 2012 but have a SQL Server that is an earlier version that needs an XE session created or you think there is a session created but you aren’t sure, or you know it’s there but can’t remember if it is running and where the output is going. How do you find out? Well, none of the information is hidden as such but it is a bit of a wrangle to locate it and it isn’t a lot of code that is unlikely to remain in your memory. I have created two pieces of code. The first examines the SYS.Server_Event_… management views in combination with the SYS.DM_XE_… management views to give the name of all sessions that exist on the server, regardless of whether they are running or not and two pieces of TSQL code. One piece will alter the state of the session: if the session is running then the code will stop the session if executed and vice versa. The other piece of code will drop the selected session. If the session is running then the code will stop it first. Do not execute the DROP code unless you are sure you have the Create code to hand. It will be dropped from the server without a second chance to change your mind. /**************************************************************/ /*** To locate and describe event sessions on a server ***/ /*** ***/ /*** Generates TSQL to start/stop/drop sessions ***/ /*** ***/ /*** Jonathan Allen - @fatherjack ***/ /*** June 2013 ***/ /*** ***/ /**************************************************************/ SELECT [EES].[name] AS [Session Name - all sessions] , CASE WHEN [MXS].[name] IS NULL THEN ISNULL([MXS].[name], 'Stopped') ELSE 'Running' END AS SessionState , CASE WHEN [MXS].[name] IS NULL THEN ISNULL([MXS].[name], 'ALTER EVENT SESSION [' + [EES].[name] + '] ON SERVER STATE = START;') ELSE 'ALTER EVENT SESSION [' + [EES].[name] + '] ON SERVER STATE = STOP;' END AS ALTER_SessionState , CASE WHEN [MXS].[name] IS NULL THEN ISNULL([MXS].[name], 'DROP EVENT SESSION [' + [EES].[name] + '] ON SERVER; -- This WILL drop the session. It will no longer exist. Don't do it unless you are certain you can recreate it if you need it.') ELSE 'ALTER EVENT SESSION [' + [EES].[name] + '] ON SERVER STATE = STOP; ' + CHAR(10) + '-- DROP EVENT SESSION [' + [EES].[name] + '] ON SERVER; -- This WILL stop and drop the session. It will no longer exist. Don't do it unless you are certain you can recreate it if you need it.' END AS DROP_Session FROM [sys].[server_event_sessions] AS EES LEFT JOIN [sys].[dm_xe_sessions] AS MXS ON [EES].[name] = [MXS].[name] WHERE [EES].[name] NOT IN ( 'system_health', 'AlwaysOn_health' ) ORDER BY SessionState GO I have excluded the system_health and AlwaysOn sessions as I don’t want to accidentally execute the drop script for these sessions that are created as part of the SQL Server installation. It is possible to recreate the sessions but that is a whole lot of aggravation I’d rather avoid. The second piece of code gathers details of running XE sessions only and provides information on the Events being collected, any predicates that are set on those events, the actions that are set to be collected, where the collected information is being logged and if that logging is to a file target, where that file is located. /**********************************************/ /*** Running Session summary ***/ /*** ***/ /*** Details key values of XE sessions ***/ /*** that are in a running state ***/ /*** ***/ /*** Jonathan Allen - @fatherjack ***/ /*** June 2013 ***/ /*** ***/ /**********************************************/ SELECT [EES].[name] AS [Session Name - running sessions] , [EESE].[name] AS [Event Name] , COALESCE([EESE].[predicate], 'unfiltered') AS [Event Predicate Filter(s)] , [EESA].[Action] AS [Event Action(s)] , [EEST].[Target] AS [Session Target(s)] , ISNULL([EESF].[value], 'No file target in use') AS [File_Target_UNC] -- select * FROM [sys].[server_event_sessions] AS EES INNER JOIN [sys].[dm_xe_sessions] AS MXS ON [EES].[name] = [MXS].[name] INNER JOIN [sys].[server_event_session_events] AS [EESE] ON [EES].[event_session_id] = [EESE].[event_session_id] LEFT JOIN [sys].[server_event_session_fields] AS EESF ON ( [EES].[event_session_id] = [EESF].[event_session_id] AND [EESF].[name] = 'filename' ) CROSS APPLY ( SELECT STUFF(( SELECT ', ' + sest.name FROM [sys].[server_event_session_targets] AS SEST WHERE [EES].[event_session_id] = [SEST].[event_session_id] FOR XML PATH('') ), 1, 2, '') AS [Target] ) AS EEST CROSS APPLY ( SELECT STUFF(( SELECT ', ' + [sesa].NAME FROM [sys].[server_event_session_actions] AS sesa WHERE [sesa].[event_session_id] = [EES].[event_session_id] FOR XML PATH('') ), 1, 2, '') AS [Action] ) AS EESA WHERE [EES].[name] NOT IN ( 'system_health', 'AlwaysOn_health' ) /*Optional to exclude 'out-of-the-box' traces*/ I hope that these scripts are useful to you and I would be obliged if you would keep my name in the script comments. I have no problem with you using it in production or personal circumstances, however it has no warranty or guarantee. Don’t use it unless you understand it and are happy with what it is going to do. I am not ever responsible for the consequences of executing this script on your servers.

Read the article
Anatomy of a .NET Assembly - Custom attribute encoding

- by Simon Cooper

In my previous post, I covered how field, method, and other types of signatures are encoded in a .NET assembly. Custom attribute signatures differ quite a bit from these, which consequently affects attribute specifications in C#. Custom attribute specifications In C#, you can apply a custom attribute to a type or type member, specifying a constructor as well as the values of fields or properties on the attribute type: public class ExampleAttribute : Attribute { public ExampleAttribute(int ctorArg1, string ctorArg2) { ... } public Type ExampleType { get; set; } } [Example(5, "6", ExampleType = typeof(string))] public class C { ... } How does this specification actually get encoded and stored in an assembly? Specification blob values Custom attribute specification signatures use the same building blocks as other types of signatures; the ELEMENT_TYPE structure. However, they significantly differ from other types of signatures, in that the actual parameter values need to be stored along with type information. There are two types of specification arguments in a signature blob; fixed args and named args. Fixed args are the arguments to the attribute type constructor, named arguments are specified after the constructor arguments to provide a value to a field or property on the constructed attribute type (PropertyName = propValue) Values in an attribute blob are limited to one of the basic types (one of the number types, character, or boolean), a reference to a type, an enum (which, in .NET, has to use one of the integer types as a base representation), or arrays of any of those. Enums and the basic types are easy to store in a blob - you simply store the binary representation. Strings are stored starting with a compressed integer indicating the length of the string, followed by the UTF8 characters. Array values start with an integer indicating the number of elements in the array, then the item values concatentated together. Rather than using a coded token, Type values are stored using a string representing the type name and fully qualified assembly name (for example, MyNs.MyType, MyAssembly, Version=1.0.0.0, Culture=neutral, PublicKeyToken=0123456789abcdef). If the type is in the current assembly or mscorlib then just the type name can be used. This is probably done to prevent direct references between assemblies solely because of attribute specification arguments; assemblies can be loaded in the reflection-only context and attribute arguments still processed, without loading the entire assembly. Fixed and named arguments Each entry in the CustomAttribute metadata table contains a reference to the object the attribute is applied to, the attribute constructor, and the specification blob. The number and type of arguments to the constructor (the fixed args) can be worked out by the method signature referenced by the attribute constructor, and so the fixed args can simply be concatenated together in the blob without any extra type information. Named args are different. These specify the value to assign to a field or property once the attribute type has been constructed. In the CLR, fields and properties can be overloaded just on their type; different fields and properties can have the same name. Therefore, to uniquely identify a field or property you need: Whether it's a field or property (indicated using byte values 0x53 and 0x54, respectively) The field or property type The field or property name After the fixed arg values is a 2-byte number specifying the number of named args in the blob. Each named argument has the above information concatenated together, mostly using the basic ELEMENT_TYPE values, in the same way as a method or field signature. A Type argument is represented using the byte 0x50, and an enum argument is represented using the byte 0x55 followed by a string specifying the name and assembly of the enum type. The named argument property information is followed by the argument value, using the same encoding as fixed args. Boxed objects This would be all very well, were it not for object and object[]. Arguments and properties of type object allow a value of any allowed argument type to be specified. As a result, more information needs to be specified in the blob to interpret the argument bytes as the correct type. So, the argument value is simple prepended with the type of the value by specifying the ELEMENT_TYPE or name of the enum the value represents. For named arguments, a field or property of type object is represented using the byte 0x51, with the actual type specified in the argument value. Some examples... All property signatures start with the 2-byte value 0x0001. Similar to my previous post in the series, names in capitals correspond to a particular byte value in the ELEMENT_TYPE structure. For strings, I'll simply give the string value, rather than the length and UTF8 encoding in the actual blob. I'll be using the following enum and attribute types to demonstrate specification encodings: class AttrAttribute : Attribute { public AttrAttribute() {} public AttrAttribute(Type[] tArray) {} public AttrAttribute(object o) {} public AttrAttribute(MyEnum e) {} public AttrAttribute(ushort x, int y) {} public AttrAttribute(string str, Type type1, Type type2) {} public int Prop1 { get; set; } public object Prop2 { get; set; } public object[] ObjectArray; } enum MyEnum : int { Val1 = 1, Val2 = 2 } Now, some examples: Here, the the specification binds to the (ushort, int) attribute constructor, with fixed args only. The specification blob starts off with a prolog, followed by the two constructor arguments, then the number of named arguments (zero): [Attr(42, 84)] 0x0001 0x002a 0x00000054 0x0000 An example of string and type encoding: [Attr("MyString", typeof(Array), typeof(System.Windows.Forms.Form))] 0x0001 "MyString" "System.Array" "System.Windows.Forms.Form, System.Windows.Forms, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" 0x0000 As you can see, the full assembly specification of a type is only needed if the type isn't in the current assembly or mscorlib. Note, however, that the C# compiler currently chooses to fully-qualify mscorlib types anyway. An object argument (this binds to the object attribute constructor), and two named arguments (a null string is represented by 0xff and the empty string by 0x00) [Attr((ushort)40, Prop1 = 12, Prop2 = "")] 0x0001 U2 0x0028 0x0002 0x54 I4 "Prop1" 0x0000000c 0x54 0x51 "Prop2" STRING 0x00 Right, more complicated now. A type array as a fixed argument: [Attr(new[] { typeof(string), typeof(object) })] 0x0001 0x00000002 // the number of elements "System.String" "System.Object" 0x0000 An enum value, which is simply represented using the underlying value. The CLR works out that it's an enum using information in the attribute constructor signature: [Attr(MyEnum.Val1)] 0x0001 0x00000001 0x0000 And finally, a null array, and an object array as a named argument: [Attr((Type[])null, ObjectArray = new object[] { (byte)2, typeof(decimal), null, MyEnum.Val2 })] 0x0001 0xffffffff 0x0001 0x53 SZARRAY 0x51 "ObjectArray" 0x00000004 U1 0x02 0x50 "System.Decimal" STRING 0xff 0x55 "MyEnum" 0x00000002 As you'll notice, a null object is encoded as a null string value, and a null array is represented using a length of -1 (0xffffffff). How does this affect C#? So, we can now explain why the limits on attribute arguments are so strict in C#. Attribute specification blobs are limited to basic numbers, enums, types, and arrays. As you can see, this is because the raw CLR encoding can only accommodate those types. Special byte patterns have to be used to indicate object, string, Type, or enum values in named arguments; you can't specify an arbitary object type, as there isn't a generalised way of encoding the resulting value in the specification blob. In particular, decimal values can't be encoded, as it isn't a 'built-in' CLR type that has a native representation (you'll notice that decimal constants in C# programs are compiled as several integer arguments to DecimalConstantAttribute). Jagged arrays also aren't natively supported, although you can get around it by using an array as a value to an object argument: [Attr(new object[] { new object[] { new Type[] { typeof(string) } }, 42 })] Finally... Phew! That was a bit longer than I thought it would be. Custom attribute encodings are complicated! Hopefully this series has been an informative look at what exactly goes on inside a .NET assembly. In the next blog posts, I'll be carrying on with the 'Inside Red Gate' series.

Read the article
Oracle's Global Single Schema

- by david.butler(at)oracle.com

Maximizing business process efficiencies in a heterogeneous environment is very difficult. The difficulty stems from the fact that the various applications across the Information Technology (IT) landscape employ different integration standards, different message passing strategies, and different workflow engines. Vendors such as Oracle and others are delivering tools to help IT organizations manage the complexities introduced by these differences. But the one remaining intractable problem impacting efficient operations is the fact that these applications have different definitions for the same business data. Business data is your business information codified for computer programs to use. A good data model will represent the way your organization does business. The computer applications your organization deploys to improve operational efficiency are built to operate on the business data organized into this schema. If the schema does not represent how you do business, the applications on that schema cannot provide the features you need to achieve the desired efficiencies. Business processes span these applications. Data problems break these processes rendering them far less efficient than they need to be to achieve organization goals. Thus, the expected return on the investment in these applications is never realized. The success of all business processes depends on the availability of accurate master data. Clearly, the solution to this problem is to consolidate all the master data an organization uses to run its business. Then clean it up, augment it, govern it, and connect it back to the applications that need it. Until now, this obvious solution has been difficult to achieve because no one had defined a data model sufficiently broad, deep and flexible enough to support transaction processing on all key business entities and serve as a master superset to all other operational data models deployed in heterogeneous IT environments. Today, the situation has changed. Oracle has created an operational data model (aka schema) that can support accurate and consistent master data across heterogeneous IT systems. This is foundational for providing a way to consolidate and integrate master data without having to replace investments in existing applications. This Global Single Schema (GSS) represents a revolutionary breakthrough that allows for true master data consolidation. Oracle has deep knowledge of applications dating back to the early 1990s. It developed applications in the areas of Supply Chain Management (SCM), Product Lifecycle Management (PLM), Enterprise Resource Planning (ERP), Customer Relationship Management (CRM), Human Capital Management (HCM), Financials and Manufacturing. In addition, Oracle applications were delivered for key industries such as Communications, Financial Services, Retail, Public Sector, High Tech Manufacturing (HTM) and more. Expertise in all these areas drove requirements for GSS. The following figure illustrates Oracle's unique position that enabled the creation of the Global Single Schema. GSS Requirements Gathering GSS defines all the key business entities and attributes including Customers, Contacts, Suppliers, Accounts, Products, Services, Materials, Employees, Installed Base, Sites, Assets, and Inventory to name just a few. In addition, Oracle delivers GSS pre-integrated with a wide variety of operational applications. Business Process Automation EBusiness is about maximizing operational efficiency. At the highest level, these 'operations' span all that you do as an organization. The following figure illustrates some of these high-level business processes. Enterprise Business Processes Supplies are procured. Assets are maintained. Materials are stored. Inventory is accumulated. Products and Services are engineered, produced and sold. Customers are serviced. And across this entire spectrum, Employees do the procuring, supporting, engineering, producing, selling and servicing. Not shown, but not to be overlooked, are the accounting and the financial processes associated with all this procuring, manufacturing, and selling activity. Supporting all these applications is the master data. When this data is fragmented and inconsistent, the business processes fail and inefficiencies multiply. But imagine having all the data under these operational business processes in one place. · The same accurate and timely customer data will be provided to all your operational applications from the call center to the point of sale. · The same accurate and timely supplier data will be provided to all your operational applications from supply chain planning to procurement. · The same accurate and timely product information will be available to all your operational applications from demand chain planning to marketing. You would have a single version of the truth about your assets, financial information, customers, suppliers, employees, products and services to support your business automation processes as they flow across your business applications. All company and partner personnel will access the same exact data entity across all your channels and across all your lines of business. Oracle's Global Single Schema enables this vision of a single version of the truth across the heterogeneous operational applications supporting the entire enterprise. Global Single Schema Oracle's Global Single Schema organizes hundreds of thousands of attributes into 165 major schema objects supporting over 180 business application modules. It is designed for international operations, and extensibility. The schema is delivered with a full set of public Application Programming Interfaces (APIs) and an Integration Repository with modern Service Oriented Architecture interfaces to make data available as a services (DaaS) to business processes and enable operations in heterogeneous IT environments. · Key tables can be extended with unlimited numbers of additional attributes and attribute groups for maximum flexibility. o This enables model extensions that reflect business entities unique to your organization's operations. · The schema is multi-organization enabled so data manipulation can be controlled along organizational boundaries. · It uses variable byte Unicode to support over 31 languages. · The schema encodes flexible date and flexible address formats for easy localizations. No matter how complex your business is, Oracle's Global Single Schema can hold your business objects and support your global operations. Oracle's Global Single Schema identifies and defines the business objects an enterprise needs within the context of its business operations. The interrelationships between the business objects are also contained within the GSS data model. Their presence expresses fundamental business rules for the interaction between business entities. The following figure illustrates some of these connections. Interconnected Business Entities Interconnecte business processes require interconnected business data. No other MDM vendor has this capability. Everyone else has either one entity they can master or separate disconnected models for various business entities. Higher level integrations are made available, but that is a weak architectural alternative to data level integration in this critically important aspect of Master Data Management.

Read the article
C# Performance Pitfall – Interop Scenarios Change the Rules

- by Reed

C# and .NET, overall, really do have fantastic performance in my opinion. That being said, the performance characteristics dramatically differ from native programming, and take some relearning if you’re used to doing performance optimization in most other languages, especially C, C++, and similar. However, there are times when revisiting tricks learned in native code play a critical role in performance optimization in C#. I recently ran across a nasty scenario that illustrated to me how dangerous following any fixed rules for optimization can be… The rules in C# when optimizing code are very different than C or C++. Often, they’re exactly backwards. For example, in C and C++, lifting a variable out of loops in order to avoid memory allocations often can have huge advantages. If some function within a call graph is allocating memory dynamically, and that gets called in a loop, it can dramatically slow down a routine. This can be a tricky bottleneck to track down, even with a profiler. Looking at the memory allocation graph is usually the key for spotting this routine, as it’s often “hidden” deep in call graph. For example, while optimizing some of my scientific routines, I ran into a situation where I had a loop similar to: for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i]); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This loop was at a fairly high level in the call graph, and often could take many hours to complete, depending on the input data. As such, any performance optimization we could achieve would be greatly appreciated by our users. After a fair bit of profiling, I noticed that a couple of function calls down the call graph (inside of ProcessElement), there was some code that effectively was doing: // Allocate some data required DataStructure* data = new DataStructure(num); // Call into a subroutine that passed around and manipulated this data highly CallSubroutine(data); // Read and use some values from here double values = data->Foo; // Cleanup delete data; // ... return bar; Normally, if “DataStructure” was a simple data type, I could just allocate it on the stack. However, it’s constructor, internally, allocated it’s own memory using new, so this wouldn’t eliminate the problem. In this case, however, I could change the call signatures to allow the pointer to the data structure to be passed into ProcessElement and through the call graph, allowing the inner routine to reuse the same “data” memory instead of allocating. At the highest level, my code effectively changed to something like: DataStructure* data = new DataStructure(numberToProcess); for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i], data); } delete data; Granted, this dramatically reduced the maintainability of the code, so it wasn’t something I wanted to do unless there was a significant benefit. In this case, after profiling the new version, I found that it increased the overall performance dramatically – my main test case went from 35 minutes runtime down to 21 minutes. This was such a significant improvement, I felt it was worth the reduction in maintainability. In C and C++, it’s generally a good idea (for performance) to: Reduce the number of memory allocations as much as possible, Use fewer, larger memory allocations instead of many smaller ones, and Allocate as high up the call stack as possible, and reuse memory I’ve seen many people try to make similar optimizations in C# code. For good or bad, this is typically not a good idea. The garbage collector in .NET completely changes the rules here. In C#, reallocating memory in a loop is not always a bad idea. In this scenario, for example, I may have been much better off leaving the original code alone. The reason for this is the garbage collector. The GC in .NET is incredibly effective, and leaving the allocation deep inside the call stack has some huge advantages. First and foremost, it tends to make the code more maintainable – passing around object references tends to couple the methods together more than necessary, and overall increase the complexity of the code. This is something that should be avoided unless there is a significant reason. Second, (unlike C and C++) memory allocation of a single object in C# is normally cheap and fast. Finally, and most critically, there is a large advantage to having short lived objects. If you lift a variable out of the loop and reuse the memory, its much more likely that object will get promoted to Gen1 (or worse, Gen2). This can cause expensive compaction operations to be required, and also lead to (at least temporary) memory fragmentation as well as more costly collections later. As such, I’ve found that it’s often (though not always) faster to leave memory allocations where you’d naturally place them – deep inside of the call graph, inside of the loops. This causes the objects to stay very short lived, which in turn increases the efficiency of the garbage collector, and can dramatically improve the overall performance of the routine as a whole. In C#, I tend to: Keep variable declarations in the tightest scope possible Declare and allocate objects at usage While this tends to cause some of the same goals (reducing unnecessary allocations, etc), the goal here is a bit different – it’s about keeping the objects rooted for as little time as possible in order to (attempt) to keep them completely in Gen0, or worst case, Gen1. It also has the huge advantage of keeping the code very maintainable – objects are used and “released” as soon as possible, which keeps the code very clean. It does, however, often have the side effect of causing more allocations to occur, but keeping the objects rooted for a much shorter time. Now – nowhere here am I suggesting that these rules are hard, fast rules that are always true. That being said, my time spent optimizing over the years encourages me to naturally write code that follows the above guidelines, then profile and adjust as necessary. In my current project, however, I ran across one of those nasty little pitfalls that’s something to keep in mind – interop changes the rules. In this case, I was dealing with an API that, internally, used some COM objects. In this case, these COM objects were leading to native allocations (most likely C++) occurring in a loop deep in my call graph. Even though I was writing nice, clean managed code, the normal managed code rules for performance no longer apply. After profiling to find the bottleneck in my code, I realized that my inner loop, a innocuous looking block of C# code, was effectively causing a set of native memory allocations in every iteration. This required going back to a “native programming” mindset for optimization. Lifting these variables and reusing them took a 1:10 routine down to 0:20 – again, a very worthwhile improvement. Overall, the lessons here are: Always profile if you suspect a performance problem – don’t assume any rule is correct, or any code is efficient just because it looks like it should be Remember to check memory allocations when profiling, not just CPU cycles Interop scenarios often cause managed code to act very differently than “normal” managed code. Native code can be hidden very cleverly inside of managed wrappers

Read the article
How accurate is "Business logic should be in a service, not in a model"?

- by Jeroen Vannevel

Situation Earlier this evening I gave an answer to a question on StackOverflow. The question: Editing of an existing object should be done in repository layer or in service? For example if I have a User that has debt. I want to change his debt. Should I do it in UserRepository or in service for example BuyingService by getting an object, editing it and saving it ? My answer: You should leave the responsibility of mutating an object to that same object and use the repository to retrieve this object. Example situation: class User { private int debt; // debt in cents private string name; // getters public void makePayment(int cents){ debt -= cents; } } class UserRepository { public User GetUserByName(string name){ // Get appropriate user from database } } A comment I received: Business logic should really be in a service. Not in a model. What does the internet say? So, this got me searching since I've never really (consciously) used a service layer. I started reading up on the Service Layer pattern and the Unit Of Work pattern but so far I can't say I'm convinced a service layer has to be used. Take for example this article by Martin Fowler on the anti-pattern of an Anemic Domain Model: There are objects, many named after the nouns in the domain space, and these objects are connected with the rich relationships and structure that true domain models have. The catch comes when you look at the behavior, and you realize that there is hardly any behavior on these objects, making them little more than bags of getters and setters. Indeed often these models come with design rules that say that you are not to put any domain logic in the the domain objects. Instead there are a set of service objects which capture all the domain logic. These services live on top of the domain model and use the domain model for data. (...) The logic that should be in a domain object is domain logic - validations, calculations, business rules - whatever you like to call it. To me, this seemed exactly what the situation was about: I advocated the manipulation of an object's data by introducing methods inside that class that do just that. However I realize that this should be a given either way, and it probably has more to do with how these methods are invoked (using a repository). I also had the feeling that in that article (see below), a Service Layer is more considered as a façade that delegates work to the underlying model, than an actual work-intensive layer. Application Layer [his name for Service Layer]: Defines the jobs the software is supposed to do and directs the expressive domain objects to work out problems. The tasks this layer is responsible for are meaningful to the business or necessary for interaction with the application layers of other systems. This layer is kept thin. It does not contain business rules or knowledge, but only coordinates tasks and delegates work to collaborations of domain objects in the next layer down. It does not have state reflecting the business situation, but it can have state that reflects the progress of a task for the user or the program. Which is reinforced here: Service interfaces. Services expose a service interface to which all inbound messages are sent. You can think of a service interface as a façade that exposes the business logic implemented in the application (typically, logic in the business layer) to potential consumers. And here: The service layer should be devoid of any application or business logic and should focus primarily on a few concerns. It should wrap Business Layer calls, translate your Domain in a common language that your clients can understand, and handle the communication medium between server and requesting client. This is a serious contrast to other resources that talk about the Service Layer: The service layer should consist of classes with methods that are units of work with actions that belong in the same transaction. Or the second answer to a question I've already linked: At some point, your application will want some business logic. Also, you might want to validate the input to make sure that there isn't something evil or nonperforming being requested. This logic belongs in your service layer. "Solution"? Following the guidelines in this answer, I came up with the following approach that uses a Service Layer: class UserController : Controller { private UserService _userService; public UserController(UserService userService){ _userService = userService; } public ActionResult MakeHimPay(string username, int amount) { _userService.MakeHimPay(username, amount); return RedirectToAction("ShowUserOverview"); } public ActionResult ShowUserOverview() { return View(); } } class UserService { private IUserRepository _userRepository; public UserService(IUserRepository userRepository) { _userRepository = userRepository; } public void MakeHimPay(username, amount) { _userRepository.GetUserByName(username).makePayment(amount); } } class UserRepository { public User GetUserByName(string name){ // Get appropriate user from database } } class User { private int debt; // debt in cents private string name; // getters public void makePayment(int cents){ debt -= cents; } } Conclusion All together not much has changed here: code from the controller has moved to the service layer (which is a good thing, so there is an upside to this approach). However this doesn't look like it had anything to do with my original answer. I realize design patterns are guidelines, not rules set in stone to be implemented whenever possible. Yet I have not found a definitive explanation of the service layer and how it should be regarded. Is it a means to simply extract logic from the controller and put it inside a service instead? Is it supposed to form a contract between the controller and the domain? Should there be a layer between the domain and the service layer? And, last but not least: following the original comment Business logic should really be in a service. Not in a model. Is this correct? How would I introduce my business logic in a service instead of the model?

Read the article
Modern/Metro Internet Explorer: What were they thinking???

- by Rick Strahl

As I installed Windows 8.1 last week I decided that I really should take a closer look at Internet Explorer in the Modern/Metro environment again. Right away I ran into two issues that are real head scratchers to me.Modern Split Windows don't resize Viewport but Zoom OutThis one falls in the "WTF, really?" department: It looks like Modern Internet Explorer's Modern doesn't resize the browser window as every other browser (including IE 11 on the desktop) does, but rather tries to adjust the zoom to the width of the browser. This means that if you use the Modern IE browser and you split the display between IE and another application, IE will be zoomed out, with text becoming much, much smaller, rather than resizing the browser viewport and adjusting the pixel width as you would when a browser window is typically resized.Here's what I'm talking about in a couple of pictures. First here's the full screen Internet Explorer version (this shot is resized down since it's full screen at 1080p, click to see the full image):This brings up the first issue which is: On the desktop who wants to browse a site full screen? Most sites aren't fully optimized for 1080p widescreen experience and frankly most content that wide just looks weird. Even in typical 10" resolutions of 1280 width it's weird to look at things this way. At least this issue can be worked around with @media queries and either constraining the view, or adding additional content to make use of the extra space. Still running a desktop browser full screen is not optimal on a desktop machine - ever.Regardless, this view, while oversized, is what I expect: Everything is rendered in the right ratios, with font-size and the responsive design styling properly respected.But now look what happens when you split the desktop windows and show half desktop and have modern IE (this screen shot is not resized but cropped - this is actual size content as you can see in the cropped Twitter window on the right half of the screen):What's happening here is that IE is zooming out of the content to make it fit into the smaller width, shrinking the content rather than resizing the viewport's pixel width. In effect it looks like the pixel width stays at 1080px and the viewport expands out height-wise in response resulting in some crazy long portrait view.There goes responsive design - out the window literally. If you've built your site using @media queries and fixed viewport sizes, Internet Explorer completely screws you in this split view. On my 1080p monitor, the site shown at a little under half width becomes completely unreadable as the fonts are too small and break up. As you go into split view and you resize the window handle the content of the browser gets smaller and smaller (and effectively longer and longer on the bottom) effectively throwing off any responsive layout to the point of un-readability even on a big display, let alone a small tablet screen.What could POSSIBLY be the benefit of this screwed up behavior? I checked around a bit trying different pages in this shrunk down view. Other than the Microsoft home page, every page I went to was nearly unreadable at a quarter width. The only page I found that worked 'normally' was the Microsoft home page which undoubtedly is optimized just for Internet Explorer specifically.Bottom Address Bar opaquely overlays ContentAnother problematic feature for me is the browser address bar on the bottom. Modern IE shows the status bar opaquely on the bottom, overlaying the content area of the Web Page - until you click on the page. Until you do though, the address bar overlays the bottom content solidly. And not just a little bit but by good sizable chunk.In the application from the screen shot above I have an application toolbar on the bottom and the IE Address bar completely hides that bottom toolbar when the page is first loaded, until the user clicks into the content at which point the address bar shrinks down to a fat border style bar with a … on it. Toolbars on the bottom are pretty common these days, especially for mobile optimized applications, so I'd say this is a common use case. But even if you don't have toolbars on the bottom maybe there's other fixed content on the bottom of the page that is vital to display. While other browsers often also show address bars and then later hide them, these other browsers tend to resize the viewport when the address bar status changes, so the content can respond to the size change. Not so with Modern IE. The address bar overlays content and stays visible until content is clicked. No resize notification or viewport height change is sent to the browser.So basically Internet Explorer is telling me: "Our toolbar is more important than your content!" - AND gives me no chance to re-act to that behavior. The result on this page/application is that the user sees no actionable operations until he or she clicks into the content area, which is terrible from a UI perspective as the user has no idea what options are available on initial load.It's doubly confounding in that IE is running in full screen mode and has an the entire height of the screen at its disposal - there's plenty of real estate available to not require this sort of hiding of content in the first place. Heck, even Windows Phone with its more constrained size doesn't hide content - in fact the address bar on Windows Phone 8 is always visible.What were they thinking?Every time I use anything in the Modern Metro interface in Windows 8/8.1 I get angry. I can pretty much ignore Metro/Modern for my everyday usage, but unfortunately with Internet Explorer in the modern shell I have to live with, because there will be users using it to access my sites. I think it's inexcusable by Microsoft to build such a crappy shell around the browser that impacts the actual usability of Web content. In both of the cases above I can only scratch my head at what could have possibly motivated anybody designing the UI for the browser to make these screwed up choices, that manipulate the content in a totally unmaintainable way.© Rick Strahl, West Wind Technologies, 2005-2013Posted in Windows HTML5 Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
J2EE Applications, SPARC T4, Solaris Containers, and Resource Pools

- by user12620111

I've obtained a substantial performance improvement on a SPARC T4-2 Server running a J2EE Application Server Cluster by deploying the cluster members into Oracle Solaris Containers and binding those containers to cores of the SPARC T4 Processor. This is not a surprising result, in fact, it is consistent with other results that are available on the Internet. See the "references", below, for some examples. Nonetheless, here is a summary of my configuration and results. (1.0) Before deploying a J2EE Application Server Cluster into a virtualized environment, many decisions need to be made. I'm not claiming that all of the decisions that I have a made will work well for every environment. In fact, I'm not even claiming that all of the decisions are the best possible for my environment. I'm only claiming that of the small sample of configurations that I've tested, this is the one that is working best for me. Here are some of the decisions that needed to be made: (1.1) Which virtualization option? There are several virtualization options and isolation levels that are available. Options include: Hard partitions: Dynamic Domains on Sun SPARC Enterprise M-Series Servers Hypervisor based virtualization such as Oracle VM Server for SPARC (LDOMs) on SPARC T-Series Servers OS Virtualization using Oracle Solaris Containers Resource management tools in the Oracle Solaris OS to control the amount of resources an application receives, such as CPU cycles, physical memory, and network bandwidth. Oracle Solaris Containers provide the right level of isolation and flexibility for my environment. To borrow some words from my friends in marketing, "The SPARC T4 processor leverages the unique, no-cost virtualization capabilities of Oracle Solaris Zones" (1.2) How to associate Oracle Solaris Containers with resources? There are several options available to associate containers with resources, including (a) resource pool association (b) dedicated-cpu resources and (c) capped-cpu resources. I chose to create resource pools and associate them with the containers because I wanted explicit control over the cores and virtual processors. (1.3) Cluster Topology? Is it best to deploy (a) multiple application servers on one node, (b) one application server on multiple nodes, or (c) multiple application servers on multiple nodes? After a few quick tests, it appears that one application server per Oracle Solaris Container is a good solution. (1.4) Number of cluster members to deploy? I chose to deploy four big 64-bit application servers. I would like go back a test many 32-bit application servers, but that is left for another day. (2.0) Configuration tested. (2.1) I was using a SPARC T4-2 Server which has 2 CPU and 128 virtual processors. To understand the physical layout of the hardware on Solaris 10, I used the OpenSolaris psrinfo perl script available at http://hub.opensolaris.org/bin/download/Community+Group+performance/files/psrinfo.pl: test# ./psrinfo.pl -pv The physical processor has 8 cores and 64 virtual processors (0-63) The core has 8 virtual processors (0-7) The core has 8 virtual processors (8-15) The core has 8 virtual processors (16-23) The core has 8 virtual processors (24-31) The core has 8 virtual processors (32-39) The core has 8 virtual processors (40-47) The core has 8 virtual processors (48-55) The core has 8 virtual processors (56-63) SPARC-T4 (chipid 0, clock 2848 MHz) The physical processor has 8 cores and 64 virtual processors (64-127) The core has 8 virtual processors (64-71) The core has 8 virtual processors (72-79) The core has 8 virtual processors (80-87) The core has 8 virtual processors (88-95) The core has 8 virtual processors (96-103) The core has 8 virtual processors (104-111) The core has 8 virtual processors (112-119) The core has 8 virtual processors (120-127) SPARC-T4 (chipid 1, clock 2848 MHz) (2.2) The "before" test: without processor binding. I started with a 4-member cluster deployed into 4 Oracle Solaris Containers. Each container used a unique gigabit Ethernet port for HTTP traffic. The containers shared a 10 gigabit Ethernet port for JDBC traffic. (2.3) The "after" test: with processor binding. I ran one application server in the Global Zone and another application server in each of the three non-global zones (NGZ): (3.0) Configuration steps. The following steps need to be repeated for all three Oracle Solaris Containers. (3.1) Stop AppServers from the BUI. (3.2) Stop the NGZ. test# ssh test-z2 init 5 (3.3) Enable resource pools: test# svcadm enable pools (3.4) Create the resource pool: test# poolcfg -dc 'create pool pool-test-z2' (3.5) Create the processor set: test# poolcfg -dc 'create pset pset-test-z2' (3.6) Specify the maximum number of CPU's that may be addd to the processor set: test# poolcfg -dc 'modify pset pset-test-z2 (uint pset.max=32)' (3.7) bash syntax to add Virtual CPUs to the processor set: test# (( i = 64 )); while (( i < 96 )); do poolcfg -dc "transfer to pset pset-test-z2 (cpu $i)"; (( i = i + 1 )) ; done (3.8) Associate the resource pool with the processor set: test# poolcfg -dc 'associate pool pool-test-z2 (pset pset-test-z2)' (3.9) Tell the zone to use the resource pool that has been created: test# zonecfg -z test-z1 set pool=pool-test-z2 (3.10) Boot the Oracle Solaris Container test# zoneadm -z test-z2 boot (3.11) Save the configuration to /etc/pooladm.conf test# pooladm -s (4.0) Results. Using the resource pools improves both throughput and response time: (5.0) References: System Administration Guide: Oracle Solaris Containers-Resource Management and Oracle Solaris Zones Capitalizing on large numbers of processors with WebSphere Portal on Solaris WebSphere Application Server and T5440 (Dileep Kumar's Weblog) http://www.brendangregg.com/zones.html Reuters Market Data System, RMDS 6 Multiple Instances (Consolidated), Performance Test Results in Solaris, Containers/Zones Environment on Sun Blade X6270 by Amjad Khan, 2009.

Read the article
Anatomy of a .NET Assembly - CLR metadata 1

- by Simon Cooper

Before we look at the bytes comprising the CLR-specific data inside an assembly, we first need to understand the logical format of the metadata (For this post I only be looking at simple pure-IL assemblies; mixed-mode assemblies & other things complicates things quite a bit). Metadata streams Most of the CLR-specific data inside an assembly is inside one of 5 streams, which are analogous to the sections in a PE file. The name of each section in a PE file starts with a ., and the name of each stream in the CLR metadata starts with a #. All but one of the streams are heaps, which store unstructured binary data. The predefined streams are: #~ Also called the metadata stream, this stream stores all the information on the types, methods, fields, properties and events in the assembly. Unlike the other streams, the metadata stream has predefined contents & structure. #Strings This heap is where all the namespace, type & member names are stored. It is referenced extensively from the #~ stream, as we'll be looking at later. #US Also known as the user string heap, this stream stores all the strings used in code directly. All the strings you embed in your source code end up in here. This stream is only referenced from method bodies. #GUID This heap exclusively stores GUIDs used throughout the assembly. #Blob This heap is for storing pure binary data - method signatures, generic instantiations, that sort of thing. Items inside the heaps (#Strings, #US, #GUID and #Blob) are indexed using a simple binary offset from the start of the heap. At that offset is a coded integer giving the length of that item, then the item's bytes immediately follow. The #GUID stream is slightly different, in that GUIDs are all 16 bytes long, so a length isn't required. Metadata tables The #~ stream contains all the assembly metadata. The metadata is organised into 45 tables, which are binary arrays of predefined structures containing information on various aspects of the metadata. Each entry in a table is called a row, and the rows are simply concatentated together in the file on disk. For example, each row in the TypeRef table contains: A reference to where the type is defined (most of the time, a row in the AssemblyRef table). An offset into the #Strings heap with the name of the type An offset into the #Strings heap with the namespace of the type. in that order. The important tables are (with their table number in hex): 0x2: TypeDef 0x4: FieldDef 0x6: MethodDef 0x14: EventDef 0x17: PropertyDef Contains basic information on all the types, fields, methods, events and properties defined in the assembly. 0x1: TypeRef The details of all the referenced types defined in other assemblies. 0xa: MemberRef The details of all the referenced members of types defined in other assemblies. 0x9: InterfaceImpl Links the types defined in the assembly with the interfaces that type implements. 0xc: CustomAttribute Contains information on all the attributes applied to elements in this assembly, from method parameters to the assembly itself. 0x18: MethodSemantics Links properties and events with the methods that comprise the get/set or add/remove methods of the property or method. 0x1b: TypeSpec 0x2b: MethodSpec These tables provide instantiations of generic types and methods for each usage within the assembly. There are several ways to reference a single row within a table. The simplest is to simply specify the 1-based row index (RID). The indexes are 1-based so a value of 0 can represent 'null'. In this case, which table the row index refers to is inferred from the context. If the table can't be determined from the context, then a particular row is specified using a token. This is a 4-byte value with the most significant byte specifying the table, and the other 3 specifying the 1-based RID within that table. This is generally how a metadata table row is referenced from the instruction stream in method bodies. The third way is to use a coded token, which we will look at in the next post. So, back to the bytes Now we've got a rough idea of how the metadata is logically arranged, we can now look at the bytes comprising the start of the CLR data within an assembly: The first 8 bytes of the .text section are used by the CLR loader stub. After that, the CLR-specific data starts with the CLI header. I've highlighted the important bytes in the diagram. In order, they are: The size of the header. As the header is a fixed size, this is always 0x48. The CLR major version. This is always 2, even for .NET 4 assemblies. The CLR minor version. This is always 5, even for .NET 4 assemblies, and seems to be ignored by the runtime. The RVA and size of the metadata header. In the diagram, the RVA 0x20e4 corresponds to the file offset 0x2e4 Various flags specifying if this assembly is pure-IL, whether it is strong name signed, and whether it should be run as 32-bit (this is how the CLR differentiates between x86 and AnyCPU assemblies). A token pointing to the entrypoint of the assembly. In this case, 06 (the last byte) refers to the MethodDef table, and 01 00 00 refers to to the first row in that table. (after a gap) RVA of the strong name signature hash, which comes straight after the CLI header. The RVA 0x2050 corresponds to file offset 0x250. The rest of the CLI header is mainly used in mixed-mode assemblies, and so is zeroed in this pure-IL assembly. After the CLI header comes the strong name hash, which is a SHA-1 hash of the assembly using the strong name key. After that comes the bodies of all the methods in the assembly concatentated together. Each method body starts off with a header, which I'll be looking at later. As you can see, this is a very small assembly with only 2 methods (an instance constructor and a Main method). After that, near the end of the .text section, comes the metadata, containing a metadata header and the 5 streams discussed above. We'll be looking at this in the next post. Conclusion The CLI header data doesn't have much to it, but we've covered some concepts that will be important in later posts - the logical structure of the CLR metadata and the overall layout of CLR data within the .text section. Next, I'll have a look at the contents of the #~ stream, and how the table data is arranged on disk.

Read the article
Get Application Title from Windows Phone

- by psheriff

In a Windows Phone application that I am currently developing I needed to be able to retrieve the Application Title of the phone application. You can set the Deployment Title in the Properties of your Windows Phone Application, however getting to this value programmatically can be a little tricky. This article assumes that you have Visual Studio 2010 and the Windows Phone tools installed along with it. The Windows Phone tools must be downloaded separately and installed with Visual Studio2010. You may also download the free Visual Studio2010 Express for Windows Phone developer environment. The WMAppManifest.xml File First off you need to understand that when you set the Deployment Title in the Properties windows of your Windows Phone application, this title actually gets stored into an XML file located under the \Properties folder of your application. This XML file is named WMAppManifest.xml. A portion of this file is shown in the following listing. <?xml version="1.0" encoding="utf-8"?><Deployment http://schemas.microsoft.com/windowsphone/2009/deployment"http://schemas.microsoft.com/windowsphone/2009/deployment" AppPlatformVersion="7.0"> <App xmlns="" ProductID="{71d20842-9acc-4f2f-b0e0-8ef79842ea53}" Title="Mobile Time Track" RuntimeType="Silverlight" Version="1.0.0.0" Genre="apps.normal" Author="PDSA, Inc." Description="Mobile Time Track" Publisher="PDSA, Inc."> ... ... </App></Deployment> Notice the “Title” attribute in the <App> element in the above XML document. This is the value that gets set when you modify the Deployment Title in your Properties Window of your Phone project. The only value you can set from the Properties Window is the Title. All of the other attributes you see here must be set by going into the XML file and modifying them directly. Note that this information duplicates some of the information that you can also set from the Assembly Information… button in the Properties Window. Why Microsoft did not just use that information, I don’t know. Reading Attributes from WMAppManifest I searched all over the namespaces and classes within the Windows Phone DLLs and could not find a way to read the attributes within the <App> element. Thus, I had to resort to good old fashioned XML processing. First off I created a WinPhoneCommon class and added two static methods as shown in the snippet below: public class WinPhoneCommon{ /// <summary> /// Returns the Application Title /// from the WMAppManifest.xml file /// </summary> /// <returns>The application title</returns> public static string GetApplicationTitle() { return GetWinPhoneAttribute("Title"); } /// <summary> /// Returns the Application Description /// from the WMAppManifest.xml file /// </summary> /// <returns>The application description</returns> public static string GetApplicationDescription() { return GetWinPhoneAttribute("Description"); } ... GetWinPhoneAttribute method here ...} In your Windows Phone application you can now simply call WinPhoneCommon.GetApplicationTitle() or WinPhone.GetApplicationDescription() to retrieve the Title or Description properties from the WMAppManifest.xml file respectively. You notice that each of these methods makes a call to the GetWinPhoneAttribute method. This method is shown in the following code snippet: /// <summary>/// Gets an attribute from the Windows Phone WMAppManifest.xml file/// To use this method, add a reference to the System.Xml.Linq DLL/// </summary>/// <param name="attributeName">The attribute to read</param>/// <returns>The Attribute's Value</returns>private static string GetWinPhoneAttribute(string attributeName){ string ret = string.Empty; try { XElement xe = XElement.Load("WMAppManifest.xml"); var attr = (from manifest in xe.Descendants("App") select manifest).SingleOrDefault(); if (attr != null) ret = attr.Attribute(attributeName).Value; } catch { // Ignore errors in case this method is called // from design time in VS.NET } return ret;} I love using the new LINQ to XML classes contained in the System.Xml.Linq.dll. When I did a Bing search the only samples I found for reading attribute information from WMAppManifest.xml used either an XmlReader or XmlReaderSettings objects. These are fine and work, but involve a little extra code. Instead of using these, I added a reference to the System.Xml.Linq.dll, then added two using statements to the top of the WinPhoneCommon class: using System.Linq;using System.Xml.Linq; Now, with just a few lines of LINQ to XML code you can read to the App element and extract the appropriate attribute that you pass into the GetWinPhoneAttribute method. Notice that I added a little bit of exception handling code in this method. I ignore the exception in case you call this method in the Loaded event of a user control. In design-time you cannot access the WMAppManifest file and thus an exception would be thrown. Summary In this article you learned how to retrieve the attributes from the WMAppManifest.xml file. I use this technique to grab information that I would otherwise have to hard-code in my application. Getting the Title or Description for your Windows Phone application is easy with just a little bit of LINQ to XML code. NOTE: You can download the complete sample code at my website. http://www.pdsa.com/downloads. Choose Tips & Tricks, then "Get Application Title from Windows Phone" from the drop-down. Good Luck with your Coding,Paul Sheriff ** SPECIAL OFFER FOR MY BLOG READERS **Visit http://www.pdsa.com/Event/Blog for a free video on Silverlight entitled Silverlight XAML for the Complete Novice - Part 1.

Read the article
Given an XML which contains a representation of a graph, how to apply it DFS algorithm? [on hold]

- by winston smith

Given the followin XML which is a directed graph: <?xml version="1.0" encoding="iso-8859-1" ?> <!DOCTYPE graph PUBLIC "-//FC//DTD red//EN" "../dtd/graph.dtd"> <graph direct="1"> <vertex label="V0"/> <vertex label="V1"/> <vertex label="V2"/> <vertex label="V3"/> <vertex label="V4"/> <vertex label="V5"/> <edge source="V0" target="V1" weight="1"/> <edge source="V0" target="V4" weight="1"/> <edge source="V5" target="V2" weight="1"/> <edge source="V5" target="V4" weight="1"/> <edge source="V1" target="V2" weight="1"/> <edge source="V1" target="V3" weight="1"/> <edge source="V1" target="V4" weight="1"/> <edge source="V2" target="V3" weight="1"/> </graph> With this classes i parsed the graph and give it an adjacency list representation: import java.io.IOException; import java.util.HashSet; import java.util.LinkedList; import java.util.Collection; import java.util.Iterator; import java.util.logging.Level; import java.util.logging.Logger; import practica3.util.Disc; public class ParsingXML { public static void main(String[] args) { try { // TODO code application logic here Collection<Vertex> sources = new HashSet<Vertex>(); LinkedList<String> lines = Disc.readFile("xml/directed.xml"); for (String lin : lines) { int i = Disc.find(lin, "source=\""); String data = ""; if (i > 0 && i < lin.length()) { while (lin.charAt(i + 1) != '"') { data += lin.charAt(i + 1); i++; } Vertex v = new Vertex(); v.setName(data); v.setAdy(new HashSet<Vertex>()); sources.add(v); } } Iterator it = sources.iterator(); while (it.hasNext()) { Vertex ver = (Vertex) it.next(); Collection<Vertex> adyacencias = ver.getAdy(); LinkedList<String> ls = Disc.readFile("xml/graphs.xml"); for (String lin : ls) { int i = Disc.find(lin, "target=\""); String data = ""; if (lin.contains("source=\""+ver.getName())) { Vertex v = new Vertex(); if (i > 0 && i < lin.length()) { while (lin.charAt(i + 1) != '"') { data += lin.charAt(i + 1); i++; } v.setName(data); } i = Disc.find(lin, "weight=\""); data = ""; if (i > 0 && i < lin.length()) { while (lin.charAt(i + 1) != '"') { data += lin.charAt(i + 1); i++; } v.setWeight(Integer.parseInt(data)); } if (v.getName() != null) { adyacencias.add(v); } } } } for (Vertex vert : sources) { System.out.println(vert); System.out.println("adyacencias: " + vert.getAdy()); } } catch (IOException ex) { Logger.getLogger(ParsingXML.class.getName()).log(Level.SEVERE, null, ex); } } } This is another class: import java.util.Collection; import java.util.Objects; public class Vertex { private String name; private int weight; private Collection ady; public Collection getAdy() { return ady; } public void setAdy(Collection adyacencias) { this.ady = adyacencias; } public String getName() { return name; } public void setName(String nombre) { this.name = nombre; } public int getWeight() { return weight; } public void setWeight(int weight) { this.weight = weight; } @Override public int hashCode() { int hash = 7; hash = 43 * hash + Objects.hashCode(this.name); hash = 43 * hash + this.weight; return hash; } @Override public boolean equals(Object obj) { if (obj == null) { return false; } if (getClass() != obj.getClass()) { return false; } final Vertex other = (Vertex) obj; if (!Objects.equals(this.name, other.name)) { return false; } if (this.weight != other.weight) { return false; } return true; } @Override public String toString() { return "Vertice{" + "name=" + name + ", weight=" + weight + '}'; } } And finally: /** * * @author user */ /* -*-jde-*- */ /* <Disc.java> Contains the main argument*/ import java.io.*; import java.util.LinkedList; /** * Lectura y escritura de archivos en listas de cadenas * Ideal para el uso de las clases para gráficas. * * @author Peralta Santa Anna Victor Miguel * @since Julio 2011 */ public class Disc { /** * Metodo para lectura de un archivo * * @param fileName archivo que se va a leer * @return El archivo en representacion de lista de cadenas */ public static LinkedList<String> readFile(String fileName) throws IOException { BufferedReader file = new BufferedReader(new FileReader(fileName)); LinkedList<String> textlist = new LinkedList<String>(); while (file.ready()) { textlist.add(file.readLine().trim()); } file.close(); /* for(String linea:textlist){ if(linea.contains("source")){ //String generado = linea.replaceAll("<\\w+\\s+\"", ""); //System.out.println(generado); } }*/ return textlist; }//readFile public static int find(String linea,String palabra){ int i,j; boolean found = false; for(i=0,j=0;i<linea.length();i++){ if(linea.charAt(i)==palabra.charAt(j)){ j++; if(j==palabra.length()){ found = true; return i; } }else{ continue; } } if(!found){ i= -1; } return i; } /** * Metodo para la escritura de un archivo * * @param fileName archivo que se va a escribir * @param tofile la lista de cadenas que quedaran en el archivo * @param append el bit que dira si se anexa el contenido o se empieza de cero */ public static void writeFile(String fileName, LinkedList<String> tofile, boolean append) throws IOException { FileWriter file = new FileWriter(fileName, append); for (int i = 0; i < tofile.size(); i++) { file.write(tofile.get(i) + "\n"); } file.close(); }//writeFile /** * Metodo para escritura de un archivo * @param msg archivo que se va a escribir * @param tofile la cadena que quedaran en el archivo * @param append el bit que dira si se anexa el contenido o se empieza de cero */ public static void writeFile(String msg, String tofile, boolean append) throws IOException { FileWriter file = new FileWriter(msg, append); file.write(tofile); file.close(); }//writeFile }// I'm stuck on what can be the best way to given an adjacency list representation of the graph how to apply it Depth-first search algorithm. Any idea of how to aproach to complete the task?

Read the article
How to store generated eigen faces for future face recognition?

- by user3237134

My code works in the following manner: 1.First, it obtains several images from the training set 2.After loading these images, we find the normalized faces,mean face and perform several calculation. 3.Next, we ask for the name of an image we want to recognize 4.We then project the input image into the eigenspace, and based on the difference from the eigenfaces we make a decision. 5.Depending on eigen weight vector for each input image we make clusters using kmeans command. Source code i tried: clear all close all clc % number of images on your training set. M=1200; %Chosen std and mean. %It can be any number that it is close to the std and mean of most of the images. um=60; ustd=32; %read and show images(bmp); S=[]; %img matrix for i=1:M str=strcat(int2str(i),'.jpg'); %concatenates two strings that form the name of the image eval('img=imread(str);'); [irow icol d]=size(img); % get the number of rows (N1) and columns (N2) temp=reshape(permute(img,[2,1,3]),[irow*icol,d]); %creates a (N1*N2)x1 matrix S=[S temp]; %X is a N1*N2xM matrix after finishing the sequence %this is our S end %Here we change the mean and std of all images. We normalize all images. %This is done to reduce the error due to lighting conditions. for i=1:size(S,2) temp=double(S(:,i)); m=mean(temp); st=std(temp); S(:,i)=(temp-m)*ustd/st+um; end %show normalized images for i=1:M str=strcat(int2str(i),'.jpg'); img=reshape(S(:,i),icol,irow); img=img'; end %mean image; m=mean(S,2); %obtains the mean of each row instead of each column tmimg=uint8(m); %converts to unsigned 8-bit integer. Values range from 0 to 255 img=reshape(tmimg,icol,irow); %takes the N1*N2x1 vector and creates a N2xN1 matrix img=img'; %creates a N1xN2 matrix by transposing the image. % Change image for manipulation dbx=[]; % A matrix for i=1:M temp=double(S(:,i)); dbx=[dbx temp]; end %Covariance matrix C=A'A, L=AA' A=dbx'; L=A*A'; % vv are the eigenvector for L % dd are the eigenvalue for both L=dbx'*dbx and C=dbx*dbx'; [vv dd]=eig(L); % Sort and eliminate those whose eigenvalue is zero v=[]; d=[]; for i=1:size(vv,2) if(dd(i,i)>1e-4) v=[v vv(:,i)]; d=[d dd(i,i)]; end end %sort, will return an ascending sequence [B index]=sort(d); ind=zeros(size(index)); dtemp=zeros(size(index)); vtemp=zeros(size(v)); len=length(index); for i=1:len dtemp(i)=B(len+1-i); ind(i)=len+1-index(i); vtemp(:,ind(i))=v(:,i); end d=dtemp; v=vtemp; %Normalization of eigenvectors for i=1:size(v,2) %access each column kk=v(:,i); temp=sqrt(sum(kk.^2)); v(:,i)=v(:,i)./temp; end %Eigenvectors of C matrix u=[]; for i=1:size(v,2) temp=sqrt(d(i)); u=[u (dbx*v(:,i))./temp]; end %Normalization of eigenvectors for i=1:size(u,2) kk=u(:,i); temp=sqrt(sum(kk.^2)); u(:,i)=u(:,i)./temp; end % show eigenfaces; for i=1:size(u,2) img=reshape(u(:,i),icol,irow); img=img'; img=histeq(img,255); end % Find the weight of each face in the training set. omega = []; for h=1:size(dbx,2) WW=[]; for i=1:size(u,2) t = u(:,i)'; WeightOfImage = dot(t,dbx(:,h)'); WW = [WW; WeightOfImage]; end omega = [omega WW]; end % Acquire new image % Note: the input image must have a bmp or jpg extension. % It should have the same size as the ones in your training set. % It should be placed on your desktop ed_min=[]; srcFiles = dir('G:\newdatabase\*.jpg'); % the folder in which ur images exists for b = 1 : length(srcFiles) filename = strcat('G:\newdatabase\',srcFiles(b).name); Imgdata = imread(filename); InputImage=Imgdata; InImage=reshape(permute((double(InputImage)),[2,1,3]),[irow*icol,1]); temp=InImage; me=mean(temp); st=std(temp); temp=(temp-me)*ustd/st+um; NormImage = temp; Difference = temp-m; p = []; aa=size(u,2); for i = 1:aa pare = dot(NormImage,u(:,i)); p = [p; pare]; end InImWeight = []; for i=1:size(u,2) t = u(:,i)'; WeightOfInputImage = dot(t,Difference'); InImWeight = [InImWeight; WeightOfInputImage]; end noe=numel(InImWeight); % Find Euclidean distance e=[]; for i=1:size(omega,2) q = omega(:,i); DiffWeight = InImWeight-q; mag = norm(DiffWeight); e = [e mag]; end ed_min=[ed_min MinimumValue]; theta=6.0e+03; %disp(e) z(b,:)=InImWeight; end IDX = kmeans(z,5); clustercount=accumarray(IDX, ones(size(IDX))); disp(clustercount); QUESTIONS: 1.It is working fine for M=50(i.e Training set contains 50 images) but not for M=1200(i.e Training set contains 1200 images).It is not showing any error.There is no output.I waited for 10 min still there is no output. I think it is going infinite loop.What is the problem?Where i was wrong? 2.Instead of running the training set everytime how eigen faces generated are stored so that stored eigen faces are used for future face recoginition for a new input image.So it reduces wastage of time.

Read the article
Vectorization of matlab code for faster execution

- by user3237134

My code works in the following manner: 1.First, it obtains several images from the training set 2.After loading these images, we find the normalized faces,mean face and perform several calculation. 3.Next, we ask for the name of an image we want to recognize 4.We then project the input image into the eigenspace, and based on the difference from the eigenfaces we make a decision. 5.Depending on eigen weight vector for each input image we make clusters using kmeans command. Source code i tried: clear all close all clc % number of images on your training set. M=1200; %Chosen std and mean. %It can be any number that it is close to the std and mean of most of the images. um=60; ustd=32; %read and show images(bmp); S=[]; %img matrix for i=1:M str=strcat(int2str(i),'.jpg'); %concatenates two strings that form the name of the image eval('img=imread(str);'); [irow icol d]=size(img); % get the number of rows (N1) and columns (N2) temp=reshape(permute(img,[2,1,3]),[irow*icol,d]); %creates a (N1*N2)x1 matrix S=[S temp]; %X is a N1*N2xM matrix after finishing the sequence %this is our S end %Here we change the mean and std of all images. We normalize all images. %This is done to reduce the error due to lighting conditions. for i=1:size(S,2) temp=double(S(:,i)); m=mean(temp); st=std(temp); S(:,i)=(temp-m)*ustd/st+um; end %show normalized images for i=1:M str=strcat(int2str(i),'.jpg'); img=reshape(S(:,i),icol,irow); img=img'; end %mean image; m=mean(S,2); %obtains the mean of each row instead of each column tmimg=uint8(m); %converts to unsigned 8-bit integer. Values range from 0 to 255 img=reshape(tmimg,icol,irow); %takes the N1*N2x1 vector and creates a N2xN1 matrix img=img'; %creates a N1xN2 matrix by transposing the image. % Change image for manipulation dbx=[]; % A matrix for i=1:M temp=double(S(:,i)); dbx=[dbx temp]; end %Covariance matrix C=A'A, L=AA' A=dbx'; L=A*A'; % vv are the eigenvector for L % dd are the eigenvalue for both L=dbx'*dbx and C=dbx*dbx'; [vv dd]=eig(L); % Sort and eliminate those whose eigenvalue is zero v=[]; d=[]; for i=1:size(vv,2) if(dd(i,i)>1e-4) v=[v vv(:,i)]; d=[d dd(i,i)]; end end %sort, will return an ascending sequence [B index]=sort(d); ind=zeros(size(index)); dtemp=zeros(size(index)); vtemp=zeros(size(v)); len=length(index); for i=1:len dtemp(i)=B(len+1-i); ind(i)=len+1-index(i); vtemp(:,ind(i))=v(:,i); end d=dtemp; v=vtemp; %Normalization of eigenvectors for i=1:size(v,2) %access each column kk=v(:,i); temp=sqrt(sum(kk.^2)); v(:,i)=v(:,i)./temp; end %Eigenvectors of C matrix u=[]; for i=1:size(v,2) temp=sqrt(d(i)); u=[u (dbx*v(:,i))./temp]; end %Normalization of eigenvectors for i=1:size(u,2) kk=u(:,i); temp=sqrt(sum(kk.^2)); u(:,i)=u(:,i)./temp; end % show eigenfaces; for i=1:size(u,2) img=reshape(u(:,i),icol,irow); img=img'; img=histeq(img,255); end % Find the weight of each face in the training set. omega = []; for h=1:size(dbx,2) WW=[]; for i=1:size(u,2) t = u(:,i)'; WeightOfImage = dot(t,dbx(:,h)'); WW = [WW; WeightOfImage]; end omega = [omega WW]; end % Acquire new image % Note: the input image must have a bmp or jpg extension. % It should have the same size as the ones in your training set. % It should be placed on your desktop ed_min=[]; srcFiles = dir('G:\newdatabase\*.jpg'); % the folder in which ur images exists for b = 1 : length(srcFiles) filename = strcat('G:\newdatabase\',srcFiles(b).name); Imgdata = imread(filename); InputImage=Imgdata; InImage=reshape(permute((double(InputImage)),[2,1,3]),[irow*icol,1]); temp=InImage; me=mean(temp); st=std(temp); temp=(temp-me)*ustd/st+um; NormImage = temp; Difference = temp-m; p = []; aa=size(u,2); for i = 1:aa pare = dot(NormImage,u(:,i)); p = [p; pare]; end InImWeight = []; for i=1:size(u,2) t = u(:,i)'; WeightOfInputImage = dot(t,Difference'); InImWeight = [InImWeight; WeightOfInputImage]; end noe=numel(InImWeight); % Find Euclidean distance e=[]; for i=1:size(omega,2) q = omega(:,i); DiffWeight = InImWeight-q; mag = norm(DiffWeight); e = [e mag]; end ed_min=[ed_min MinimumValue]; theta=6.0e+03; %disp(e) z(b,:)=InImWeight; end IDX = kmeans(z,5); clustercount=accumarray(IDX, ones(size(IDX))); disp(clustercount); Running time for 50 images:Elapsed time is 103.947573 seconds. QUESTIONS: 1.It is working fine for M=50(i.e Training set contains 50 images) but not for M=1200(i.e Training set contains 1200 images).It is not showing any error.There is no output.I waited for 10 min still there is no output. I think it is going infinite loop.What is the problem?Where i was wrong?

Read the article
Big Visible Charts

- by Robert May

An important part of Agile is the concept of transparency and visibility. In proper functioning teams, stakeholders can look at any team at any time in the iteration or release and see how that team is doing by simply looking at what we call Big Visible Charts. If you’ve done Scrum, you’ve seen these charts. However, interpreting these charts can often be an art form. There are several different charts that can be useful. In this newsletter, I’ll focus on the Iteration Burndown and Cumulative Flow charts. I’ve included a copy of the spreadsheet that I used to create the charts, and if you don’t have a tool that creates them for you, you can use this spreadsheet to do so. Our preferred tool for managing Scrum projects is Rally. Rally creates all of these charts for you, saving you quite a bit of time. The Iteration Burndown and Cumulative Flow Charts This is the main chart that teams use. Although less useful to stakeholders, this chart is critical to the team and provides quite a bit of information to the team about how their iteration is going. Most charts are a combination of the charts below, so you may need to combine aspects of each section to understand what is happening in your iterations. Ideal Ah, isn’t that a pretty picture? Unfortunately, it’s also very unrealistic. I’ve seen iterations that come close to ideal, but never that match perfectly. If your iteration matches perfectly, chances are, someone is playing with the numbers. Reality is just too difficult to have a burndown chart that matches this exactly. Late Planning Iteration started, but the team didn’t. You can tell this by the fact that the real number of estimated hours didn’t appear until day two. In the cumulative flow, you can also see that nothing was defined in Day one and two. You want to avoid situations like this. You’ll note that the team had to burn faster than is ideal to meet the iteration because of the late planning. This often results in long weeks and days. Testing Starved Determining whether or not testing is starved is difficult without the cumulative flow. The pattern in the burndown could be nothing more that developers not completing stories early enough or could be caused by stories being too big. With the cumulative flow, however, you see that only small bites are in progress and stories were completed early, but testing didn’t start testing until the end of the iteration, and didn’t complete testing all stories in the iteration. When this happens, question whether or not your testing resources are sufficient for your team and whether or not acceptance is adequately defined. No Testing With this one, both graphs show the same thing; the team needs testers and testing! Without testing, what was completed cannot be verified to make sure that it is acceptable to the business. If you find yourself in this situation, review your testing practices and acceptance testing process and make changes today. Late Development With this situation, both graphs tell a story. In the top graph, you can see that the hours failed to burn down as quickly as the team expected. This could be caused by the team not correctly estimating their hours or the team could have had illness or some other issue that affected them. Often, when teams are tackling something that is more unknown, they’ll run into technical barriers that cause the burn down to happen slower than expected. In the cumulative flow graph, you can see that not much was completed in the first few days. This could be because of illness or technical barriers or simply poor estimation. Testing was able to keep up with everything that was completed, however. No Tool Updating When you see graphs that look like this, you can be assured that it’s because the team is not updating the tool that generates the graphs. Review your policy for when they are to update. On the teams that I run, I require that each team member updates the tool at least once daily. You should also check to see how well the team is breaking down stories into tasks. If they’re creating few large tasks, graphs can look similar to this. As a general rule, I never allow tasks, other than Unit Testing and Uncertainty, to be greater than eight hours in duration. Scope Increase I always encourage team members to enter in however much time they think they have left on a task, even if that means increasing the total amount of time left to do. You get a much better and more realistic picture this way. Increasing time remaining could explain the burndown graph, but by looking at the cumulative flow graph, we can see that stories were added to the iteration and scope was increased. Since planning should consume all of the hours in the iteration, this is almost always a bad thing. If the scope change happened late in the iteration and the hours remaining were well below the ideal burn, then increasing scope is probably o.k., but estimation needs to get better. However, with the charts above, that’s clearly not what happened and the team was required to do extra work to make the iteration. If you find this happening, your product owner and ScrumMasters need training. The team also needs to learn to say no. Scope Decrease Scope decreases are just as bad as scope increases. Usually, graphs above show that the team did a poor job of estimating their stories and part way through had to reduce scope to change the iteration. This will happen once in a while, but if you find it’s a pattern on your team, you need to re-evaluate planning. Some teams are hopelessly optimistic. In those cases, I’ll introduce a task I call “Uncertainty.” With Uncertainty, the team estimates how many hours they might need if things don’t go well with the tasks they’ve defined. They try to estimate things that could go poorly and increase the time appropriately. Having an Uncertainty task allows them to have a low and high estimate. Uncertainty should not just be an arbitrary buffer. It must correlate to real uncertainty in the tasks that have been defined. Stories are too Big Often, we see graphs like the ones above. Note that the burndown looks fairly good, other than the chunky acceptance of stories. However, when you look at cumulative flow, you can see that at one point, everything is in progress. This is a bad thing. When you see graphs like this, you’re in one of two states. You may just have a very small team and can only handle one or two stories in your iteration. If you have more than one or two people, then the most likely problem is that your stories are far too big. To combat this, break large high hour stories into smaller pieces that can be completed independently and accepted independently. If you don’t, you’ll likely be requiring your testers to do heroic things to complete testing on the last day of the iteration and you’re much more likely to have the entire iteration fail, because of the limited amount of things that can be completed. Summary There are other charts that can be useful when doing scrum. If you don’t have any big visible charts, you really need to evaluate your process and change. These charts can provide the team a wealth of information and help you write better software. If you have any questions about charts that you’re seeing on your team, contact me with a screen capture of the charts and I’ll tell you what I’m seeing in those charts. I always want this information to be useful, so please let me know if you have other questions. Technorati Tags: Agile

Read the article
Can you help me fix my broken packages?

- by Andreas Hartmann

I would like to upgrade from 13.04 to 13.10, but some broken packages are preventing upgrade success: grep Broken /var/log/dist-upgrade/apt.log output: Broken libwayland-client0:amd64 Conflicts on libwayland0 [ amd64 ] < 1.0.5-0ubuntu1 > ( libs ) (< 1.1.0) Broken libunity9:amd64 Breaks on unity-common [ amd64 ] < 7.0.0daily13.06.19~13.04-0ubuntu1 > ( gnome ) (< 7.1.2) Broken cups-filters:amd64 Conflicts on ghostscript-cups [ amd64 ] < 9.07~dfsg2-0ubuntu3.1 > ( text ) Broken libpam-systemd:amd64 Conflicts on libpam-xdg-support [ amd64 ] < 0.2-0ubuntu2 > ( admin ) Broken libharfbuzz0a:amd64 Breaks on libharfbuzz0 [ amd64 ] < 0.9.13-1 > ( libs ) Broken libharfbuzz0a:amd64 Breaks on libharfbuzz0 [ i386 ] < 0.9.13-1 > ( libs ) Broken libunity-scopes-json-def-desktop:amd64 Conflicts on libunity-common [ amd64 ] < 6.90.2daily13.04.05-0ubuntu1 > ( gnome ) (< 7.0.7) Broken libunity-scopes-json-def-desktop:amd64 Conflicts on libunity-common [ i386 ] < none > ( none ) (< 7.0.7) Broken libaccount-plugin-generic-oauth:amd64 Conflicts on account-plugin-generic-oauth [ amd64 ] < 0.10bzr13.03.26-0ubuntu1.1 > ( gnome ) (< 0.10bzr13.04.30) Broken libaccount-plugin-generic-oauth:amd64 Breaks on account-plugin-generic-oauth [ amd64 ] < 0.10bzr13.03.26-0ubuntu1.1 > ( gnome ) (< 0.10bzr13.04.30) Broken libmutter0b:amd64 Breaks on libmutter0a [ amd64 ] < 3.6.3-0ubuntu2 > ( libs ) Broken python3-aptdaemon.pkcompat:amd64 Breaks on libpackagekit-glib2-14 [ amd64 ] < 0.7.6-3ubuntu1 > ( libs ) (<= 0.7.6-4) Broken apache2:amd64 Conflicts on apache2.2-common [ amd64 ] < 2.2.22-6ubuntu5.1 > ( httpd ) Broken chromium-codecs-ffmpeg-extra:amd64 Conflicts on chromium-codecs-ffmpeg [ amd64 ] < 28.0.1500.71-0ubuntu1.13.04.1 -> 29.0.1547.65-0ubuntu2 > ( universe/web ) Broken unity-scope-home:amd64 Conflicts on unity-lens-shopping [ amd64 ] < 6.8.0daily13.03.04-0ubuntu1 > ( gnome ) Broken libsnmp30:amd64 Breaks on libsnmp15 [ amd64 ] < 5.4.3~dfsg-2.7ubuntu1 > ( libs ) Broken apache2.2-bin:amd64 Breaks on gnome-user-share [ amd64 ] < 3.0.4-0ubuntu1 > ( gnome ) (< 3.8.0-2~) Broken libgjs0d:amd64 Conflicts on libgjs0c [ amd64 ] < 1.34.0-0ubuntu1 > ( libs ) Broken unity-gtk2-module:amd64 Conflicts on appmenu-gtk [ amd64 ] < 12.10.3daily13.04.03-0ubuntu1 > ( libs ) Broken lib32asound2:amd64 Depends on libasound2 [ amd64 ] < 1.0.25-4ubuntu3.1 -> 1.0.27.2-1ubuntu6 > ( libs ) (= 1.0.25-4ubuntu3.1) Broken unity-gtk3-module:amd64 Conflicts on appmenu-gtk3 [ amd64 ] < 12.10.3daily13.04.03-0ubuntu1 > ( libs ) Broken activity-log-manager:amd64 Conflicts on activity-log-manager-common [ amd64 ] < 0.9.4-0ubuntu6.2 > ( utils ) Broken libgtksourceview-3.0-0:amd64 Depends on libgtksourceview-3.0-common [ amd64 ] < 3.6.3-0ubuntu1 -> 3.8.2-0ubuntu1 > ( libs ) (< 3.7) Broken icaclient:amd64 Depends on lib32asound2 [ amd64 ] < 1.0.25-4ubuntu3.1 > ( libs ) Broken libunity-core-6.0-5:amd64 Depends on unity-services [ amd64 ] < 7.0.0daily13.06.19~13.04-0ubuntu1 -> 7.1.2+13.10.20131014.1-0ubuntu1 > ( gnome ) (= 7.0.0daily13.06.19~13.04-0ubuntu1) Broken libbamf3-1:amd64 Depends on bamfdaemon [ amd64 ] < 0.4.0daily13.06.19~13.04-0ubuntu1 -> 0.5.1+13.10.20131011-0ubuntu1 > ( libs ) (= 0.4.0daily13.06.19~13.04-0ubuntu1) Broken apache2-bin:amd64 Conflicts on apache2.2-bin [ amd64 ] < 2.2.22-6ubuntu5.1 -> 2.4.6-2ubuntu2 > ( httpd ) (< 2.3~) Output for cat /etc/apt/sources.list /etc/apt/sources.list.d/*.list # deb cdrom:[Ubuntu 13.04 _Raring Ringtail_ - Release amd64 (20130424)]/ raring main restricted # See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to # newer versions of the distribution. deb http://de.archive.ubuntu.com/ubuntu/ raring main restricted ## Major bug fix updates produced after the final release of the ## distribution. deb http://de.archive.ubuntu.com/ubuntu/ raring-updates main restricted ## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu ## team. Also, please note that software in universe WILL NOT receive any ## review or updates from the Ubuntu security team. deb http://de.archive.ubuntu.com/ubuntu/ raring universe deb http://de.archive.ubuntu.com/ubuntu/ raring-updates universe ## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu ## team, and may not be under a free licence. Please satisfy yourself as to ## your rights to use the software. Also, please note that software in ## multiverse WILL NOT receive any review or updates from the Ubuntu ## security team. deb http://de.archive.ubuntu.com/ubuntu/ raring multiverse deb http://de.archive.ubuntu.com/ubuntu/ raring-updates multiverse ## N.B. software from this repository may not have been tested as ## extensively as that contained in the main release, although it includes ## newer versions of some applications which may provide useful features. ## Also, please note that software in backports WILL NOT receive any review ## or updates from the Ubuntu security team. deb http://security.ubuntu.com/ubuntu raring-security main restricted deb http://security.ubuntu.com/ubuntu raring-security universe deb http://security.ubuntu.com/ubuntu raring-security multiverse ## Uncomment the following two lines to add software from Canonical's ## 'partner' repository. ## This software is not part of Ubuntu, but is offered by Canonical and the ## respective vendors as a service to Ubuntu users. deb http://archive.canonical.com/ubuntu raring partner # deb-src http://archive.canonical.com/ubuntu raring partner ## This software is not part of Ubuntu, but is offered by third-party ## developers who want to ship their latest software. deb http://extras.ubuntu.com/ubuntu raring main # deb-src http://extras.ubuntu.com/ubuntu raring main # deb http://linux.dropbox.com/ubuntu precise main output for sudo dpkg -l | grep -e "^iU" -e "^rc": rc ibm-lotus-cae 8.5.2-20100805.0821 i386 IBM Lotus Composite Application Editor rc ibm-lotus-cae-nl1 8.5.2-20100805.0821 i386 IBM Lotus CAE NL1 rc ibm-lotus-feedreader 8.5.2-20100805.0821 i386 Feeds for IBM Lotus Notes 8.5.2 rc ibm-lotus-feedreader-nl1 8.5.2-20100805.0821 i386 IBM Lotus Feed Reader NL1 rc ibm-lotus-notes 8.5.2-20100805.0821 i386 IBM Lotus Notes rc ibm-lotus-notes-core-de 8.5.2-20100805.0821 i386 IBM Lotus Notes Native German (de) rc ibm-lotus-notes-nl1 8.5.2-20100805.0821 i386 IBM Lotus Notes Java NL1 rc ibm-lotus-sametime 8.5.2-20100805.0821 i386 IBM Lotus Sametime rc ibm-lotus-symphony 8.5.2-20100805.0821 i386 IBM Lotus Symphony rc ibm-lotus-symphony-nl1 8.5.2-20100805.0821 i386 IBM Lotus Symphony NL1 rc libapache2-mod-php5filter 5.4.9-4ubuntu2.2 amd64 server-side, HTML-embedded scripting language (apache 2 filter module) rc libavcodec53:amd64 6:0.8.6-1ubuntu2 amd64 Libav codec library rc libavutil51:amd64 6:0.8.6-1ubuntu2 amd64 Libav utility library rc libmotif4:amd64 2.3.3-7ubuntu1 amd64 Open Motif - shared libraries rc linux-image-3.8.0-25-generic 3.8.0-25.37 amd64 Linux kernel image for version 3.8.0 on 64 bit x86 SMP rc linux-image-extra-3.8.0-25-generic 3.8.0-25.37 amd64 Linux kernel image for version 3.8.0 on 64 bit x86 SMP

Read the article
Deterministic/Consistent Unique Masking

- by Dinesh Rajasekharan-Oracle

One of the key requirements while masking data in large databases or multi database environment is to consistently mask some columns, i.e. for a given input the output should always be the same. At the same time the masked output should not be predictable. Deterministic masking also eliminates the need to spend enormous amount of time spent in identifying data relationships, i.e. parent and child relationships among columns defined in the application tables. In this blog post I will explain different ways of consistently masking the data across databases using Oracle Data Masking and Subsetting The readers of post should have minimal knowledge on Oracle Enterprise Manager 12c, Application Data Modeling, Data Masking concepts. For more information on these concepts, please refer to Oracle Data Masking and Subsetting document Oracle Data Masking and Subsetting 12c provides four methods using which users can consistently yet irreversibly mask their inputs. 1. Substitute 2. SQL Expression 3. Encrypt 4. User Defined Function SUBSTITUTE The substitute masking format replaces the original value with a value from a pre-created database table. As the method uses a hash based algorithm in the back end the mappings are consistent. For example consider DEPARTMENT_ID in EMPLOYEES table is replaced with FAKE_DEPARTMENT_ID from FAKE_TABLE. The substitute masking transformation that all occurrences of DEPARTMENT_ID say ‘101’ will be replaced with ‘502’ provided same substitution table and column is used , i.e. FAKE_TABLE.FAKE_DEPARTMENT_ID. The following screen shot shows the usage of the Substitute masking format with in a masking definition: Note that the uniqueness of the masked value depends on the number of columns being used in the substitution table i.e. if the original table contains 50000 unique values, then for the masked output to be unique and deterministic the substitution column should also contain 50000 unique values without which only consistency is maintained but not uniqueness. SQL EXPRESSION SQL Expression replaces an existing value with the output of a specified SQL Expression. For example while masking an EMPLOYEES table the EMAIL_ID of an employee has to be in the format EMPLOYEE’s [email protected] while FIRST_NAME and LAST_NAME are the actual column names of the EMPLOYEES table then the corresponding SQL Expression will look like %FIRST_NAME%||’.’||%LAST_NAME%||’@COMPANY.COM’. The advantage of this technique is that if you are masking FIRST_NAME and LAST_NAME of the EMPLOYEES table than the corresponding EMAIL ID will be replaced accordingly by the masking scripts. One of the interesting aspect’s of a SQL Expressions is that you can use sub SQL expressions, which means that you can write a nested SQL and use it as SQL Expression to address a complex masking business use cases. SQL Expression can also be used to consistently replace value with hashed value using Oracle’s PL/SQL function ORA_HASH. The following SQL Expression will help in the previous example for replacing the DEPARTMENT_IDs with a hashed number ORA_HASH (%DEPARTMENT_ID%, 1000) The following screen shot shows the usage of encrypt masking format with in the masking definition: ORA_HASH takes three arguments: 1. Expression which can be of any data type except LONG, LOB, User Defined Type [nested table type is allowed]. In the above example I used the Original value as expression. 2. Number of hash buckets which can be number between 0 and 4294967295. The default value is 4294967295. You can also co-relate the number of hash buckets to a range of numbers. In the above example above the bucket value is specified as 1000, so the end result will be a hashed number in between 0 and 1000. 3. Seed, can be any number which decides the consistency, i.e. for a given seed value the output will always be same. The default seed is 0. In the above SQL Expression a seed in not specified, so it to 0. If you have to use a non default seed then the function will look like. ORA_HASH (%DEPARTMENT_ID%, 1000, 1234 The uniqueness depends on the input and the number of hash buckets used. However as ORA_HASH uses a 32 bit algorithm, considering birthday paradox or pigeonhole principle there is a 0.5 probability of collision after 232-1 unique values. ENCRYPT Encrypt masking format uses a blend of 3DES encryption algorithm, hashing, and regular expression to produce a deterministic and unique masked output. The format of the masked output corresponds to the specified regular expression. As this technique uses a key [string] to encrypt the data, the same string can be used to decrypt the data. The key also acts as seed to maintain consistent outputs for a given input. The following screen shot shows the usage of encrypt masking format with in the masking definition: Regular Expressions may look complex for the first time users but you will soon realize that it’s a simple language. There are many resources in internet, oracle documentation, oracle learning library, my oracle support on writing a Regular Expressions, out of all the following My Oracle Support document helped me to get started with Regular Expressions: Oracle SQL Support for Regular Expressions[Video](Doc ID 1369668.1) USER DEFINED FUNCTION [UDF] User Defined Function or UDF provides flexibility for the users to code their own masking logic in PL/SQL, which can be called from masking Defintion. The standard format of an UDF in Oracle Data Masking and Subsetting is: Function udf_func (rowid varchar2, column_name varchar2, original_value varchar2) returns varchar2; Where • rowid is the row identifier of the column that needs to be masked • column_name is the name of the column that needs to be masked • original_value is the column value that needs to be masked You can achieve deterministic masking by using Oracle’s built in hash functions like, ORA_HASH, DBMS_CRYPTO.MD4, DBMS_CRYPTO.MD5, DBMS_UTILITY. GET_HASH_VALUE.Please refers to the Oracle Database Documentation for more information on the Oracle Hash functions. For example the following masking UDF generate deterministic unique hexadecimal values for a given string input: CREATE OR REPLACE FUNCTION RD_DUX (rid varchar2, column_name varchar2, orig_val VARCHAR2) RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE IS stext varchar2 (26); no_of_characters number(2); BEGIN no_of_characters:=6; stext:=substr(RAWTOHEX(DBMS_CRYPTO.HASH(UTL_RAW.CAST_TO_RAW(text),1)),0,no_of_characters); RETURN stext; END; The uniqueness depends on the input and length of the string and number of bits used by hash algorithm. In the above function MD4 hash is used [denoted by argument 1 in the DBMS_CRYPTO.HASH function which is a 128 bit algorithm which produces 2^128-1 unique hashed values , however this is limited by the length of the input string which is 6, so only 6^6 unique values will be generated. Also do not forget about the birthday paradox/pigeonhole principle mentioned earlier in this post. An another example is to consistently replace characters or numbers preserving the length and special characters as shown below: CREATE OR REPLACE FUNCTION RD_DUS(rid varchar2,column_name varchar2,orig_val VARCHAR2) RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE IS stext varchar2(26); BEGIN DBMS_RANDOM.SEED(orig_val); stext:=TRANSLATE(orig_val,'ABCDEFGHILKLMNOPQRSTUVWXYZ',DBMS_RANDOM.STRING('U',26)); stext:=TRANSLATE(stext,'abcdefghijklmnopqrstuvwxyz',DBMS_RANDOM.STRING('L',26)); stext:=TRANSLATE(stext,'0123456789',to_char(DBMS_RANDOM.VALUE(1,9))); stext:=REPLACE(stext,'.','0'); RETURN stext; END; The following screen shot shows the usage of an UDF with in a masking definition: To summarize, Oracle Data Masking and Subsetting helps you to consistently mask data across databases using one or all of the methods described in this post. It saves the hassle of identifying the parent-child relationships defined in the application table. Happy Masking

Read the article
SQL University: What and why of database refactoring

- by Mladen Prajdic

This is a post for a great idea called SQL University started by Jorge Segarra also famously known as SqlChicken on Twitter. It’s a collection of blog posts on different database related topics contributed by several smart people all over the world. So this week is mine and we’ll be talking about database testing and refactoring. In 3 posts we’ll cover: SQLU part 1 - What and why of database testing SQLU part 2 - What and why of database refactoring SQLU part 3 - Tools of the trade This is a second part of the series and in it we’ll take a look at what database refactoring is and why do it. Why refactor a database To know why refactor we first have to know what refactoring actually is. Code refactoring is a process where we change module internals in a way that does not change that module’s input/output behavior. For successful refactoring there is one crucial thing we absolutely must have: Tests. Automated unit tests are the only guarantee we have that we haven’t broken the input/output behavior before refactoring. If you haven’t go back ad read my post on the matter. Then start writing them. Next thing you need is a code module. Those are views, UDFs and stored procedures. By having direct table access we can kiss fast and sweet refactoring good bye. One more point to have a database abstraction layer. And no, ORM’s don’t fall into that category. But also know that refactoring is NOT adding new functionality to your code. Many have fallen into this trap. Don’t be one of them and resist the lure of the dark side. And it’s a strong lure. We developers in general love to add new stuff to our code, but hate fixing our own mistakes or changing existing code for no apparent reason. To be a good refactorer one needs discipline and focus. Now we know that refactoring is all about changing inner workings of existing code. This can be due to performance optimizations, changing internal code workflows or some other reason. This is a typical black box scenario to the outside world. If we upgrade the car engine it still has to drive on the road (preferably faster) and not fly (no matter how cool that would be). Also be aware that white box tests will break when we refactor. What to refactor in a database Refactoring databases doesn’t happen that often but when it does it can include a lot of stuff. Let us look at a few common cases. Adding or removing database schema objects Adding, removing or changing table columns in any way, adding constraints, keys, etc… All of these can be counted as internal changes not visible to the data consumer. But each of these carries a potential input/output behavior change. Dropping a column can result in views not working anymore or stored procedure logic crashing. Adding a unique constraint shows duplicated data that shouldn’t exist. Foreign keys break a truncate table command executed from an application that runs once a month. All these scenarios are very real and can happen. With the proper database abstraction layer fully covered with black box tests we can make sure something like that does not happen (hopefully at all). Changing physical structures Physical structures include heaps, indexes and partitions. We can pretty much add or remove those without changing the data returned by the database. But the performance can be affected. So here we use our performance tests. We do have them, right? Just by adding a single index we can achieve orders of magnitude performance improvement. Won’t that make users happy? But what if that index causes our write operations to crawl to a stop. again we have to test this. There are a lot of things to think about and have tests for. Without tests we can’t do successful refactoring! Fixing bad code We all have some bad code in our systems. We usually refer to that code as code smell as they violate good coding practices. Examples of such code smells are SQL injection, use of SELECT *, scalar UDFs or cursors, etc… Each of those is huge code smell and can result in major code changes. Take SELECT * from example. If we remove a column from a table the client using that SELECT * statement won’t have a clue about that until it runs. Then it will gracefully crash and burn. Not to mention the widely unknown SELECT * view refresh problem that Tomas LaRock (@SQLRockstar on Twitter) and Colin Stasiuk (@BenchmarkIT on Twitter) talk about in detail. Go read about it, it’s informative. Refactoring this includes replacing the * with column names and most likely change to application using the database. Breaking apart huge stored procedures Have you ever seen seen a stored procedure that was 2000 lines long? I have. It’s not pretty. It hurts the eyes and sucks the will to live the next 10 minutes. They are a maintenance nightmare and turn into things no one dares to touch. I’m willing to bet that 100% of time they don’t have a single test on them. Large stored procedures (and functions) are a clear sign that they contain business logic. General opinion on good database coding practices says that business logic has no business in the database. That’s the applications part. Refactoring such behemoths requires writing lots of edge case tests for the stored procedure input/output behavior and then start to refactor it. First we split the logic inside into smaller parts like new stored procedures and UDFs. Those then get called from the master stored procedure. Once we’ve successfully modularized the database code it’s best to transfer that logic into the applications consuming it. This only leaves the stored procedure with common data manipulation logic. Of course this isn’t always possible so having a plethora of performance and behavior unit tests is absolutely necessary to confirm we’ve actually improved the codebase in some way. Refactoring is not a popular chore amongst developers or managers. The former don’t like fixing old code, the latter can’t see the financial benefit. Remember how we talked about being lousy at estimating future costs in the previous post? But there comes a time when it must be done. Hopefully I’ve given you some ideas how to get started. In the last post of the series we’ll take a look at the tools to use and an example of testing and refactoring.

Read the article
Free Document/Content Management System Using SharePoint 2010

- by KunaalKapoor

That’s right, it’s true. You can use the free version of SharePoint 2010 to meet your document and content management needs and even run your public facing website or an internal knowledge bank. SharePoint Foundation 2010 is free. It may not have all the features that you get in the enterprise license but it still has enough to cater to your needs to build a document management system and replace age old file shares or folders. I’ve built a dozen content management sites for internal and public use exploiting SharePoint. There are hundreds of web content management systems out there (see CMS Matrix). On one hand we have commercial platforms like SharePoint, SiteCore, and Ektron etc. which are the most frequently used and on the other hand there are free options like WordPress, Drupal, Joomla, and Plone etc. which are pretty common popular as well. But I would be very surprised if anyone was able to find a single CMS platform that is all things to all people. Infact not a lot of people consider SharePoint’s free version under the free CMS side but its high time organizations benefit from this. Through this blog post I wanted to present SharePoint Foundation as an option for running a FREE CMS platform. Even if you knew that there is a free version of SharePoint, what most people don’t realize is that SharePoint Foundation is a great option for running web sites of all kinds – not just team sites. It is a great option for many reasons, but in reality it is supported by Microsoft, and above all it is FREE (yay!), and it is extremely easy to get started. From a functionality perspective – it’s hard to beat SharePoint. Even the free version, SharePoint Foundation, offers simple data connectivity (through BCS), cross browser support, accessibility, support for Office Web Apps, blogs, wikis, templates, document support, health analyzer, support for presence, and MUCH more.I often get asked: “Can I use SharePoint 2010 as a document management system?” The answer really depends on · What are your specific requirements? · What systems you currently have in place for managing documents. · And of course how much money you have J Benefits? Not many large organizations have benefited from SharePoint yet. For some it has been an IT project to see what they can achieve with it, for others it has been used as a collaborative platform or in many cases an extended intranet. SharePoint 2010 has changed the game slightly as the improvements that Microsoft have made have been noted by organizations, and we are seeing a lot of companies starting to build specific business applications using SharePoint as the basis, and nearly every business process will require documents at some stage. If you require a document management system and have SharePoint in place then it can be a relatively straight forward decision to use SharePoint, as long as you have reviewed the considerations just discussed. The collaborative nature of SharePoint 2010 is also a massive advantage, as specific departmental or project sites can be created quickly and easily that allow workers to interact in a variety of different ways using one source of information. This also benefits an organization with regards to how they manage the knowledge that they have, as if all of their information is in one source then it is naturally easier to search and manage. Is SharePoint right for your organization? As just discussed, this can only be determined after defining your requirements and also planning a longer term strategy for how you will manage your documents and information. A key factor to look at is how the users would interact with the system and how much value would it get for your organization. The amount of data and documents that organizations are creating is increasing rapidly each year. Therefore the ability to archive this information, whilst keeping the ability to know what you have and where it is, is vital to any organizations management of their information life cycle. SharePoint is best used for the initial life of business documents where they need to be referenced and accessed after time. It is often beneficial to archive these to overcome for storage and performance issues. FREE CMS – SharePoint, Really? In order to show some of the completely of what comes with this free version of SharePoint 2010, I thought it would make sense to use Wikipedia (since every one trusts it as a credible source). Wikipedia shows that a web content management system typically has the following components: Document Management: - CMS software may provide a means of managing the life cycle of a document from initial creation time, through revisions, publication, archive, and document destruction. SharePoint is king when it comes to document management. Version history, exclusive check-out, security, publication, workflow, and so much more. Content Virtualization: - CMS software may provide a means of allowing each user to work within a virtual copy of the entire Web site, document set, and/or code base. This enables changes to multiple interdependent resources to be viewed and/or executed in-context prior to submission. Through the use of versioning, each content manager can preview, publish, and roll-back content of pages, wiki entries, blog posts, documents, or any other type of content stored in SharePoint. The idea of each user having an entire copy of the website virtualized is a bit odd to me – not sure why anyone would need that for anything but the simplest of websites. Automated Templates: - Create standard output templates that can be automatically applied to new and existing content, allowing the appearance of all content to be changed from one central place. Through the use of Master Pages and Themes, SharePoint provides the ability to change the entire look and feel of site. Of course, the older brother version of SharePoint – SharePoint Server 2010 – also introduces the concept of Page Layouts which allows page template level customization and even switching the layout of an individual page using different page templates. I think many organizations really think they want this but rarely end up using this bit of functionality. Easy Edits: - Once content is separated from the visual presentation of a site, it usually becomes much easier and quicker to edit and manipulate. Most WCMS software includes WYSIWYG editing tools allowing non-technical individuals to create and edit content. This is probably easier described with a screen cap of a vanilla SharePoint Foundation page in edit mode. Notice the page editing toolbar, the multiple layout options… It’s actually easier to use than Microsoft Word. Workflow management: - Workflow is the process of creating cycles of sequential and parallel tasks that must be accomplished in the CMS. For example, a content creator can submit a story, but it is not published until the copy editor cleans it up and the editor-in-chief approves it. Workflow, it’s in there. In fact, the same workflow engine is running under SharePoint Foundation that is running under the other versions of SharePoint. The primary difference is that with SharePoint Foundation – you need to configure the workflows yourself. Web Standards: - Active WCMS software usually receives regular updates that include new feature sets and keep the system up to current web standards. SharePoint is in the fourth major iteration under Microsoft with the 2010 release. In addition to the innovation that Microsoft continuously adds, you have the entire global ecosystem available. Scalable Expansion: - Available in most modern WCMSs is the ability to expand a single implementation (one installation on one server) across multiple domains. SharePoint Foundation can run multiple sites using multiple URLs on a single server install. Even more powerful, SharePoint Foundation is scalable and can be part of a multi-server farm to ensure that it will handle any amount of traffic that can be thrown at it. Delegation & Security: - Some CMS software allows for various user groups to have limited privileges over specific content on the website, spreading out the responsibility of content management. SharePoint Foundation provides very granular security capabilities. Read @ http://msdn.microsoft.com/en-us/library/ee537811.aspx Content Syndication: - CMS software often assists in content distribution by generating RSS and Atom data feeds to other systems. They may also e-mail users when updates are available as part of the workflow process. SharePoint Foundation nails it. With RSS syndication and email alerts available out of the box, content syndication is already in the platform. Multilingual Support: - Ability to display content in multiple languages. SharePoint Foundation 2010 supports more than 40 languages. Read More Read more @ http://msdn.microsoft.com/en-us/library/dd776256(v=office.12).aspxYou can download the free version from http://www.microsoft.com/en-us/download/details.aspx?id=5970

Read the article
Integration Patterns with Azure Service Bus Relay, Part 3: Anonymous partial-trust consumer

- by Elton Stoneman

This is the third in the IPASBR series, see also: Integration Patterns with Azure Service Bus Relay, Part 1: Exposing the on-premise service Integration Patterns with Azure Service Bus Relay, Part 2: Anonymous full-trust .NET consumer As the patterns get further from the simple .NET full-trust consumer, all that changes is the communication protocol and the authentication mechanism. In Part 3 the scenario is that we still have a secure .NET environment consuming our service, so we can store shared keys securely, but the runtime environment is locked down so we can't use Microsoft.ServiceBus to get the nice WCF relay bindings. To support this we will expose a RESTful endpoint through the Azure Service Bus, and require the consumer to send a security token with each HTTP service request. Pattern applicability This is a good fit for scenarios where: the runtime environment is secure enough to keep shared secrets the consumer can execute custom code, including building HTTP requests with custom headers the consumer cannot use the Azure SDK assemblies the service may need to know who is consuming it the service does not need to know who the end-user is Note there isn't actually a .NET requirement here. By exposing the service in a REST endpoint, anything that can talk HTTP can be a consumer. We'll authenticate through ACS which also gives us REST endpoints, so the service is still accessed securely. Our real-world example would be a hosted cloud app, where we we have enough room in the app's customisation to keep the shared secret somewhere safe and to hook in some HTTP calls. We will be flowing an identity through to the on-premise service now, but it will be the service identity given to the consuming app - the end user's identity isn't flown through yet. In this post, we’ll consume the service from Part 1 in ASP.NET using the WebHttpRelayBinding. The code for Part 3 (+ Part 1) is on GitHub here: IPASBR Part 3. Authenticating and authorizing with ACS We'll follow the previous examples and add a new service identity for the namespace in ACS, so we can separate permissions for different consumers (see walkthrough in Part 1). I've named the identity partialTrustConsumer. We’ll be authenticating against ACS with an explicit HTTP call, so we need a password credential rather than a symmetric key – for a nice secure option, generate a symmetric key, copy to the clipboard, then change type to password and paste in the key: We then need to do the same as in Part 2 , add a rule to map the incoming identity claim to an outgoing authorization claim that allows the identity to send messages to Service Bus: Issuer: Access Control Service Input claim type: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier Input claim value: partialTrustConsumer Output claim type: net.windows.servicebus.action Output claim value: Send As with Part 2, this sets up a service identity which can send messages into Service Bus, but cannot register itself as a listener, or manage the namespace. RESTfully exposing the on-premise service through Azure Service Bus Relay The part 3 sample code is ready to go, just put your Azure details into Solution Items\AzureConnectionDetails.xml and “Run Custom Tool” on the .tt files. But to do it yourself is very simple. We already have a WebGet attribute in the service for locally making REST calls, so we are just going to add a new endpoint which uses the WebHttpRelayBinding to relay that service through Azure. It's as easy as adding this endpoint to Web.config for the service: <endpoint address="https://sixeyed-ipasbr.servicebus.windows.net/rest" binding="webHttpRelayBinding" contract="Sixeyed.Ipasbr.Services.IFormatService" behaviorConfiguration="SharedSecret"> </endpoint> - and adding the webHttp attribute in your endpoint behavior: <behavior name="SharedSecret"> <webHttp/> <transportClientEndpointBehavior credentialType="SharedSecret"> <clientCredentials> <sharedSecret issuerName="serviceProvider" issuerSecret="gl0xaVmlebKKJUAnpripKhr8YnLf9Neaf6LR53N8uGs="/> </clientCredentials> </transportClientEndpointBehavior> </behavior> Where's my WSDL? The metadata story for REST is a bit less automated. In our local webHttp endpoint we've enabled WCF's built-in help, so if you navigate to: http://localhost/Sixeyed.Ipasbr.Services/FormatService.svc/rest/help - you'll see the uri format for making a GET request to the service. The format is the same over Azure, so this is where you'll be connecting: https://[your-namespace].servicebus.windows.net/rest/reverse?string=abc123 Build the service with the new endpoint, open that in a browser and you'll get an XML version of an HTTP status code - a 401 with an error message stating that you haven’t provided an authorization header: <?xml version="1.0"?><Error><Code>401</Code><Detail>MissingToken: The request contains no authorization header..TrackingId:4cb53408-646b-4163-87b9-bc2b20cdfb75_5,TimeStamp:10/3/2012 8:34:07 PM</Detail></Error> By default, the setup of your Service Bus endpoint as a relying party in ACS expects a Simple Web Token to be presented with each service request, and in the browser we're not passing one, so we can't access the service. Note that this request doesn't get anywhere near your on-premise service, Service Bus only relays requests once they've got the necessary approval from ACS. Why didn't the consumer need to get ACS authorization in Part 2? It did, but it was all done behind the scenes in the NetTcpRelayBinding. By specifying our Shared Secret credentials in the consumer, the service call is preceded by a check on ACS to see that the identity provided is a) valid, and b) allowed access to our Service Bus endpoint. By making manual HTTP requests, we need to take care of that ACS check ourselves now. We do that with a simple WebClient call to the ACS endpoint of our service; passing the shared secret credentials, we will get back an SWT: var values = new System.Collections.Specialized.NameValueCollection(); values.Add("wrap_name", "partialTrustConsumer"); //service identity name values.Add("wrap_password", "suCei7AzdXY9toVH+S47C4TVyXO/UUFzu0zZiSCp64Y="); //service identity password values.Add("wrap_scope", "http://sixeyed-ipasbr.servicebus.windows.net/"); //this is the realm of the RP in ACS var acsClient = new WebClient(); var responseBytes = acsClient.UploadValues("https://sixeyed-ipasbr-sb.accesscontrol.windows.net/WRAPv0.9/", "POST", values); rawToken = System.Text.Encoding.UTF8.GetString(responseBytes); With a little manipulation, we then attach the SWT to subsequent REST calls in the authorization header; the token contains the Send claim returned from ACS, so we will be authorized to send messages into Service Bus. Running the sample Navigate to http://localhost:2028/Sixeyed.Ipasbr.WebHttpClient/Default.cshtml, enter a string and hit Go! - your string will be reversed by your on-premise service, routed through Azure: Using shared secret client credentials in this way means ACS is the identity provider for your service, and the claim which allows Send access to Service Bus is consumed by Service Bus. None of the authentication details make it through to your service, so your service is not aware who the consumer is (MSDN calls this "anonymous authentication").

Read the article
Agile Testing Days 2012 – Day 3 – Agile or agile?

- by Chris George

Another early start for my last Lean Coffee of the conference, and again it was not wasted. We had some really interesting discussions around how to determine what test automation is useful, if agile is not faster, why do it? and a rather existential discussion on whether unicorns exist! First keynote of the day was entitled “Fast Feedback Teams” by Ola Ellnestam. Again this relates nicely to the releasing faster talk on day 2, and something that we are looking at and some teams are actively trying. Introducing the notion of feedback, Ola describes a game he wrote for his eldest child. It was a simple game where every time he clicked a button, it displayed “You’ve Won!”. He then changed it to be a Win-Lose-Win-Lose pattern and watched the feedback from his son who then twigged the pattern and got his younger brother to play, alternating turns… genius! (must do that with my children). The idea behind this was that you need that feedback loop to learn and progress. If you are not getting the feedback you need to close that loop. An interesting point Ola made was to solve problems BEFORE writing software. It may be that you don’t have to write anything at all, perhaps it’s a communication/training issue? Perhaps the problem can be solved another way. Writing software, although it’s the business we are in, is expensive, and this should be taken into account. He again mentions frequent releases, and how they should be made as soon as stuff is ready to be released, don’t leave stuff on the shelf cause it’s not earning you anything, money or data. I totally agree with this and it’s something that we will be aiming for moving forwards. “Exceptions, Assumptions and Ambiguity: Finding the truth behind the story” by David Evans started off very promising by making references to ‘Grim up North’ referring to the north of England. Not sure it was appreciated by most of the audience, but it made me laugh! David explained how there are always risks associated with exceptions, giving the example of a one-way road near where he lives, with an exception sign giving rights to coaches to go the wrong way. Therefore you could merrily swing around the corner of the one way road straight into a coach! David showed the danger in making assumptions with lyrical quotes from Lola by The Kinks “I’m glad I’m a man, and so is Lola” and with a picture of a toilet flush that needed instructions to operate the full and half flush. With this particular flush, you pulled the handle all the way down to half flush, and half way down to full flush! hmmm, a bit of a crappy user experience methinks! Then through a clever use of a passage from the Jabberwocky, David then went onto show how mis-translation/ambiguity is the can completely distort the original meaning of something, and this is a real enemy of software development. This was all helping to demonstrate that the term Story is often heavily overloaded in the Agile world, and should really be stripped back to what it is really for, stating a business problem, and offering a technical solution. Therefore a story could be worded as “In order to {make some improvement}, we will { do something}”. The first ‘in order to’ statement is stakeholder neutral, and states the problem through requesting an improvement to the software/process etc. The second part of the story is the verb, the doing bit. So to achieve the ‘improvement’ which is not currently true, we will do something to make this true in the future. My PM is very interested in this, and he’s observed some of the problems of overloading stories so I’m hoping between us we can use some of David’s suggestions to help clarify our stories better. The second keynote of the day (and our last) proved to be the most entertaining and exhausting of the conference for me. “The ongoing evolution of testing in agile development” by Scott Barber. I’ve never had the pleasure of seeing Scott before… OMG I would love to have even half of the energy he has! What struck me during this presentation was Scott’s explanation of how testing has become the role/job that it is (largely) today, and how this has led to the need for ‘methodologies’ to make dev and test work! The argument that we should be trying to converge the roles again is a very valid one, and one that a couple of the teams at work are actively doing with great results. Making developers as responsible for quality as testers is something that has been lost over the years, but something that we are now striving to achieve. The idea that we (testers) should be testing experts/specialists, not testing ‘union members’, supports this idea so the entire team works on all aspects of a feature/product, with the ‘specialists’ taking the lead and advising/coaching the others. This leads to better propagation of information around the team, a greater holistic understanding of the project and it allows the team to continue functioning if some of it’s members are off sick, for example. Feeling somewhat drained from Scott’s keynote (but at the same time excited that alot of the points he raised supported actions we are taking at work), I headed into my last presentation for Agile Testing Days 2012 before having to make my way to Tegel to catch the flight home. “Thinking and working agile in an unbending world” with Pete Walen was a talk I was not going to miss! Having spoken to Pete several times during the past few days, I was looking forward to hearing what he was going to say, and I was not disappointed. Pete started off by trying to separate the definitions of ‘Agile’ as in the methodology, and ‘agile’ as in the adjective by pronouncing them the ‘english’ and ‘american’ ways. So Agile pronounced (Ajyle) and agile pronounced (ajul). There was much confusion around what the hell he was talking about, although I thought it was quite clear. Agile – Software development methodology agile – Marked by ready ability to move with quick easy grace; Having a quick resourceful and adaptable character. Anyway, that aside (although it provided a few laughs during the presentation), the point was that many teams that claim to be ‘Agile’ but are not, in fact, ‘agile’ by nature. Implementing ‘Agile’ methodologies that are so prescriptive actually goes against the very nature of Agile development where a team should anticipate, adapt and explore. Pete made a valid point that very few companies intentionally put up roadblocks to impede work, so if work is being blocked/delayed, why? This is where being agile as a team pays off because the team can inspect what’s going on, explore options and adapt their processes. It is through experimentation (and that means trying and failing as well as trying and succeeding) that a team will improve and grow leading to focussing on what really needs to be done to achieve X. So, that was it, the last talk of our conference. I was gutted that we had to miss the closing keynote from Matt Heusser, as Matt was another person I had spoken too a few times during the conference, but the flight would not wait, and just as well we left when we did because the traffic was a nightmare! My Takeaway Triple from Day 3: Release often and release small – don’t leave stuff on the shelf Keep the meaning of the word ‘agile’ in mind when working in ‘Agile Look at testing as more of a skill than a role

Read the article

Search Results

Search found 21142 results on 846 pages for 'bit manipulation'.

Page 372/846 | < Previous Page | 368 369 370 371 372 373 374 375 376 377 378 379 | Next Page >

- by zowens

- by Chris W Beal

- by simonsabin

- by Rob Farley

- by Rob Farley

- by Brian

- by Fatherjack

- by Simon Cooper

- by david.butler(at)oracle.com

- by Reed

- by Jeroen Vannevel

- by Rick Strahl

- by user12620111

- by Simon Cooper

- by psheriff

- by winston smith

- by user3237134

- by user3237134

- by Robert May

- by Andreas Hartmann

- by Dinesh Rajasekharan-Oracle

- by Mladen Prajdic

- by KunaalKapoor

- by Elton Stoneman

- by Chris George

< Previous Page | 368 369 370 371 372 373 374 375 376 377 378 379 | Next Page >