Search Results

Search found 100 results on 4 pages for 'costing'.

Page 4/4 | < Previous Page | 1 2 3 4

which is best smart automatic file replication solution for cloud storage based systems.

- by TORr0t

I am looking for a solution for a project i am working on. We are developing a websystem where people can upload their files and other people can download it. (similar to rapidshare.com model) Problem is, some files can be demanded much more than other files. The scenerio is like: I have uploaded my birthday video and shared it with all of my friend, I have uploaded it to myproject.com and it was stored in one of the cluster which has 100mbit connection. Problem is, once all of my friends want to download the file, they cant download it since the bottleneck here is 100mbit which is 15MB per second, but i got 1000 friends and they can only download 15KB per second. I am not taking into account that the hdd is serving same files. My network infrastrucre is as follows: 1 gbit server(client) and connected to 4 Nodes of storage servers that have 100mbit connection. 1gbit server can handle the 1000 users traffic if one of storage node can stream more than 15MB per second to my 1gbit (client) server and visitor will stream directly from client server instead of storage nodes. I can do it by replicating the file into 2 nodes. But i dont want to replicate all files uploadded to my network since it is costing much more. So i need a cloud based system, which will push the files into replicated nodes automatically when demanded to those files are high, and when the demand is low, they will delete from other nodes and it will stay in only 1 node. I have looked to gluster and asked in their irc channel that, gluster cant do such a thing. It is only able to replicate all the files or none of the files. But i need it the cluster software to do it automatically. Any solutions ? (instead of recommending me amazon s3) S

Read the article
Buying an old laser printer -- what will need to be replaced?

- by marienbad

Hi all -- as you can see I'm new. I do IT and wiring for a small local shop but I never deal with printers. I do a LOT of printing, and I'd like to stop spending as much money on it. On my local CL, there is an HP 8100DN (duplex network) printer for a very good price (and the toner is a quarter-cent per page). It has printed 200,000 pages and I don't yet know anything else about it. The model was released in 1999. So my questions: What are the parts that tend to need service on laser printers? On ebay, I see fusers, rollers, DC power boards, and motors. What would you expect to replace soon at 200,000 pages? Are there any good "tests" to find out if certain parts are near failure? Do you have anything to say about the HP 8100 specifically? The bottom line for me is that if there's any chance of repairs costing more than $100, it's not worth it for me.

Read the article
The enterprise vendor con - connecting SSD's using SATA 2 (3Gbits) thus limiting there performance

- by tonyrogerson

When comparing SSD against Hard drive performance it really makes me cross when folk think comparing an array of SSD running on 3GBits/sec to hard drives running on 6GBits/second is somehow valid. In a paper from DELL (http://www.dell.com/downloads/global/products/pvaul/en/PowerEdge-PowerVaultH800-CacheCade-final.pdf) on increasing database performance using the DELL PERC H800 with Solid State Drives they compare four SSD drives connected at 3Gbits/sec against ten 10Krpm drives connected at 6Gbits [Tony slaps forehead while shouting DOH!]. It is true in the case of hard drives it probably doesn’t make much difference 3Gbit or 6Gbit because SAS and SATA are both end to end protocols rather than shared bus architecture like SCSI, so the hard drive doesn’t share bandwidth and probably can’t get near the 600MiBytes/second throughput that 6Gbit gives unless you are doing contiguous reads, in my own tests on a single 15Krpm SAS disk using IOMeter (8 worker threads, queue depth of 16 with a stripe size of 64KiB, an 8KiB transfer size on a drive formatted with an allocation size of 8KiB for a 100% sequential read test) I only get 347MiBytes per second sustained throughput at an average latency of 2.87ms per IO equating to 44.5K IOps, ok, if that was 3GBits it would be less – around 280MiBytes per second, oh, but wait a minute [...fingers tap desk] You’ll struggle to find in the commodity space an SSD that doesn’t have the SATA 3 (6GBits) interface, SSD’s are fast not only low latency and high IOps but they also offer a very large sustained transfer rate, consider the OCZ Agility 3 it so happens that in my masters dissertation I did the same test but on a difference box, I got 374MiBytes per second at an average latency of 2.67ms per IO equating to 47.9K IOps – cost of an 240GB Agility 3 is £174.24 (http://www.scan.co.uk/products/240gb-ocz-agility-3-ssd-25-sata-6gb-s-sandforce-2281-read-525mb-s-write-500mb-s-85k-iops), but that same drive set in a box connected with SATA 2 (3Gbits) would only yield around 280MiBytes per second thus losing almost 100MiBytes per second throughput and a ton of IOps too. So why the hell are “enterprise” vendors still only connecting SSD’s at 3GBits? Well, my conspiracy states that they have no interest in you moving to SSD because they’ll lose so much money, the argument that they use SATA 2 doesn’t wash, SATA 3 has been out for some time now and all the commodity stuff you buy uses it now. Consider the cost, not in terms of price per GB but price per IOps, SSD absolutely thrash Hard Drives on that, it was true that the opposite was also true that Hard Drives thrashed SSD’s on price per GB, but is that true now, I’m not so sure – a 300GByte 2.5” 15Krpm SAS drive costs £329.76 ex VAT (http://www.scan.co.uk/products/300gb-seagate-st9300653ss-savvio-15k3-25-hdd-sas-6gb-s-15000rpm-64mb-cache-27ms) which equates to £1.09 per GB compared to a 480GB OCZ Agility 3 costing £422.10 ex VAT (http://www.scan.co.uk/products/480gb-ocz-agility-3-ssd-25-sata-6gb-s-sandforce-2281-read-525mb-s-write-410mb-s-30k-iops) which equates to £0.88 per GB. Ok, I compared an “enterprise” hard drive with a “commodity” SSD, ok, so things get a little more complicated here, most “enterprise” SSD’s are SLC and most commodity are MLC, SLC gives more performance and wear, I’ll talk about that another day. For now though, don’t get sucked in by vendor marketing, SATA 2 (3Gbit) just doesn’t cut it, SSD need 6Gbit to breath and even that SSD’s are pushing. Alas, SSD’s are connected using SATA so all the controllers I’ve seen thus far from HP and DELL only do SATA 2 – deliberate? Well, I’ll let you decide on that one.

Read the article
Social Targeting: This One's Just for You

- by Mike Stiles

Think of social targeting in terms of the archery competition we just saw in the Olympics. If someone loaded up 5 arrows and shot them straight up into the air all at once, hoping some would land near the target, the world would have united in laughter. But sadly for hysterical YouTube video viewing, that’s not what happened. The archers sought to maximize every arrow by zeroing in on the spot that would bring them the most points. Marketers have always sought to do the same. But they can only work with the tools that are available. A firm grasp of the desired target does little good if the ad products aren’t there to deliver that target. On the social side, both Facebook and Twitter have taken steps to enhance targeting for marketers. And why not? As the demand to monetize only goes up, they’re quite motivated to leverage and deliver their incredible user bases in ways that make economic sense for advertisers. You could target keywords on Twitter with promoted accounts, and get promoted tweets into search. They would surface for your followers and some users that Twitter thought were like them. Now you can go beyond keywords and target Twitter users based on 350 interests in 25 categories. How does a user wind up in one of these categories? Twitter looks at that user’s tweets, they look at whom they follow, and they run data through some sort of Twitter secret sauce. The result is, you have a much clearer shot at Twitter users who are most likely to welcome and be responsive to your tweets. And beyond the 350 interests, you can also create custom segments that find users who resemble followers of whatever Twitter handle you give it. That means you can now use boring tweets to sell like a madman, right? Not quite. This ad product is still quality-based, meaning if you’re not putting out tweets that lead to interest and thus, engagement, that tweet will earn a low quality score and wind up costing you more under Twitter’s auction system to maintain. That means, as the old knight in “Indiana Jones and the Last Crusade” cautions, “choose wisely” when targeting based on these interests and categories to make sure your interests truly do line up with theirs. On the Facebook side, they’re rolling out ad targeting that uses email addresses, phone numbers, game and app developers’ user ID’s, and eventually addresses for you bigger brands. Why? Because you marketers asked for it. Here you were with this amazing customer list but no way to reach those same customers should they be on Facebook. Now you can find and communicate with customers you gathered outside of social, and use Facebook to do it. Fair to say such users are a sensible target and will be responsive to your message since they’ve already bought something from you. And no you’re not giving your customer info to Facebook. They’ll use something called “hashing” to make sure you don’t see Facebook user data (beyond email, phone number, address, or user ID), and Facebook can’t see your customer data. The end result, social becomes far more workable and more valuable to marketers when it delivers on the promise that made it so exciting in the first place. That promise is the ability to move past casting wide nets to the masses and toward concentrating marketing dollars efficiently on the targets most likely to yield results.

Read the article
SQL Server Optimizer Malfunction?

- by Tony Davis

There was a sharp intake of breath from the audience when Adam Machanic declared the SQL Server optimizer to be essentially "stuck in 1997". It was during his fascinating "Query Tuning Mastery: Manhandling Parallelism" session at the recent PASS SQL Summit. Paraphrasing somewhat, Adam (blog | @AdamMachanic) offered a convincing argument that the optimizer often delivers flawed plans based on assumptions that are no longer valid with today’s hardware. In 1997, when Microsoft engineers re-designed the database engine for SQL Server 7.0, SQL Server got its initial implementation of a cost-based optimizer. Up to SQL Server 2000, the developer often had to deploy a steady stream of hints in SQL statements to combat the occasionally wilful plan choices made by the optimizer. However, with each successive release, the optimizer has evolved and improved in its decision-making. It is still prone to the occasional stumble when we tackle difficult problems, join large numbers of tables, perform complex aggregations, and so on, but for most of us, most of the time, the optimizer purrs along efficiently in the background. Adam, however, challenged further any assumption that the current optimizer is competent at providing the most efficient plans for our more complex analytical queries, and in particular of offering up correctly parallelized plans. He painted a picture of a present where complex analytical queries have become ever more prevalent; where disk IO is ever faster so that reads from disk come into buffer cache faster than ever; where the improving RAM-to-data ratio means that we have a better chance of finding our data in cache. Most importantly, we have more CPUs at our disposal than ever before. To get these queries to perform, we not only need to have the right indexes, but also to be able to split the data up into subsets and spread its processing evenly across all these available CPUs. Improvements such as support for ColumnStore indexes are taking things in the right direction, but, unfortunately, deficiencies in the current Optimizer mean that SQL Server is yet to be able to exploit properly all those extra CPUs. Adam’s contention was that the current optimizer uses essentially the same costing model for many of its core operations as it did back in the days of SQL Server 7, based on assumptions that are no longer valid. One example he gave was a "slow disk" bias that may have been valid back in 1997 but certainly is not on modern disk systems. Essentially, the optimizer assesses the relative cost of serial versus parallel plans based on the assumption that there is no IO cost benefit from parallelization, only CPU. It assumes that a single request will saturate the IO channel, and so a query would not run any faster if we parallelized IO because the disk system simply wouldn’t be able to handle the extra pressure. As such, the optimizer often decides that a serial plan is lower cost, often in cases where a parallel plan would improve performance dramatically. It was challenging and thought provoking stuff, as were his techniques for driving parallelism through query logic based on subsets of rows that define the "grain" of the query. I highly recommend you catch the session if you missed it. I’m interested to hear though, when and how often people feel the force of the optimizer’s shortcomings. Barring mistakes, such as stale statistics, how often do you feel the Optimizer fails to find the plan you think it should, and what are the most common causes? Is it fighting to induce it toward parallelism? Combating unexpected plans, arising from table partitioning? Something altogether more prosaic? Cheers, Tony.

Read the article
Design: How to declare a specialized memory handler class

- by Michael Dorgan

On an embedded type system, I have created a Small Object Allocator that piggy backs on top of a standard memory allocation system. This allocator is a Boost::simple_segregated_storage< class and it does exactly what I need - O(1) alloc/dealloc time on small objects at the cost of a touch of internal fragmentation. My question is how best to declare it. Right now, it's scope static declared in our mem code module, which is probably fine, but it feels a bit exposed there and is also now linked to that module forever. Normally, I declare it as a monostate or a singleton, but this uses the dynamic memory allocator (where this is located.) Furthermore, our dynamic memory allocator is being initialized and used before static object initialization occurs on our system (as again, the memory manager is pretty much the most fundamental component of an engine.) To get around this catch 22, I added an extra 'if the small memory allocator exists' to see if the small object allocator exists yet. That if that now must be run on every small object allocation. In the scheme of things, this is nearly negligable, but it still bothers me. So the question is, is there a better way to declare this portion of the memory manager that helps decouple it from the memory module and perhaps not costing that extra isinitialized() if statement? If this method uses dynamic memory, please explain how to get around lack of initialization of the small object portion of the manager.

Read the article
Where to get laptop replacement parts?

- by wai

I am sure some of us have older laptops that still works but just one or two parts started not working very well. However, to send it back to the manufacturer for repair might not be worth the money (as some of them even charge consultation fees just to have it checked) as these old laptops are most probably well beyond warranty. Replacing these one or two parts could give your laptop a new lease of life. For example, I have a Dell Inspiron 640m which the fan recently failed. Everything else still works. I know enough to open up the laptop (after reading the service manual) and put it back together again, hence a replacement fan could save it. However, finding these parts turned out to be difficult. I guess I wouldn't be able to get it from Dell since I have been searching around the Dell forums and concluded if I were going to get Dell to at least talk to me, I would have to pay some money (since my warranty expired). I found this online:- http://www.parts-people.com/index.php?action=item&id=3725 However, as I don't live in the United States, shipping alone would cost me 3 times more than the part in question. In addition, if there's a problem, it would be difficult because I probably would have to mail the parts back (costing more money). One of my friend spilled water on her Macbook, she sent it back to Apple and they told her the cost of repairing it exceeds that of buying a new Macbook. They probably have to change everything except the LCD screen. I opened it up and found that most of the parts are still working, only the RAM had fried. Replaced it and it's working again. For some time before discovering only the RAM fried, I was toying with the idea of replacing the logic board myself but looks like I didn't have to. =) Now I know there's a site for Macbooks used replacement parts:- http://www.ifixit.com/ The question is, does anyone know where to get replacement parts at their own locality? It might benefit the community as a whole. Someone might know a local computer store that sells precisely just the things we needed (for now, I hope to find a shop that sells just the fan). If you own a laptop which you had an easy experience finding the replacement parts, do post that as well. PS: I wish laptop manufacturers have outlets where they sell these replacement parts (or at least let you place an order).

Read the article
Where to get laptop replacement parts?

- by wai

I am sure some of us have older laptops that still works but just one or two parts started not working very well. However, to send it back to the manufacturer for repair might not be worth the money (as some of them even charge consultation fees just to have it checked) as these old laptops are most probably well beyond warranty. Replacing these one or two parts could give your laptop a new lease of life. For example, I have a Dell Inspiron 640m which the fan recently failed. Everything else still works. I know enough to open up the laptop (after reading the service manual) and put it back together again, hence a replacement fan could save it. However, finding these parts turned out to be difficult. I guess I wouldn't be able to get it from Dell since I have been searching around the Dell forums and concluded if I were going to get Dell to at least talk to me, I would have to pay some money (since my warranty expired). I found this online:- http://www.parts-people.com/index.php?action=item&id=3725 However, as I don't live in the United States, shipping alone would cost me 3 times more than the part in question. In addition, if there's a problem, it would be difficult because I probably would have to mail the parts back (costing more money). One of my friend spilled water on her Macbook, she sent it back to Apple and they told her the cost of repairing it exceeds that of buying a new Macbook. They probably have to change everything except the LCD screen. I opened it up and found that most of the parts are still working, only the RAM had fried. Replaced it and it's working again. For some time before discovering only the RAM fried, I was toying with the idea of replacing the logic board myself but looks like I didn't have to. =) Now I know there's a site for Macbooks used replacement parts:- http://www.ifixit.com/ The question is, does anyone know where to get replacement parts at their own locality? It might benefit the community as a whole. Someone might know a local computer store that sells precisely just the things we needed (for now, I hope to find a shop that sells just the fan). If you own a laptop which you had an easy experience finding the replacement parts, do post that as well. PS: I wish laptop manufacturers have outlets where they sell these replacement parts (or at least let you place an order).

Read the article
Choosing parts for a high-spec custom PC - feedback required [closed]

- by James

I'm looking to build a high-spec PC costing under ~£800 (bearing in mind I can get the CPU half price). This is my first time doing this so I have plenty of questions! I have been doing lots of research and this is what I have come up with: http://pcpartpicker.com/uk/p/j4lE Usage: I will be using it for Adobe CS6, rendering in 3DS Max, particle simulations in Realflow and for playing games like GTA IV (and V when it comes out), Crysis 1/2, Saints Row The Third, Deus Ex HR, etc. Questions: Can you see any obvious problem areas with the current setup? Will it be sufficient for the above usage? I won't be doing any overclocking initially. Is it worth buying the H60 liquid cooler, or will the fan that comes with the CPU be sufficient? Is water cooling generally quieter? Is the chosen motherboard good for the current components? And is it future-proof? I read that the HDD is often the bottleneck when it comes to gaming. I presume this is true to other high-end applications? If so, is my selection good? I keep changing my mind about the GPU; first the 560, now the 660. Can anyone shed some light on how to choose? I read mixed opinions about matching the GPU to the CPU. Will the 560 or the 660 be sufficient for my required usage? Atm I'm basing my choice on the PassMark benchmarks and how much they cost. The specs on the GeForce website state that the 560 and the 660 both require 450W. Is this a good figure to base the wattage of my PSU on? If so, how do you decide? Do I really need 750W? The latest GTX 690 requires 650W. Is it a good idea to buy a 750W PSU now to future-proof myself?

Read the article
Management and Monitoring Tools for Windows Azure

- by BuckWoody

With such a large platform, Windows Azure has a lot of moving parts. We’ve done our best to keep the interface as simple as possible, while giving you the most control and visibility we can. However, as with most Microsoft products, there are multiple ways to do something – and I’ve always found that to be a good strength. Depending on the situation, I might want a graphical interface, a command-line interface, or just an API so I can incorporate the management into my own tools, or have third-party companies write other tools. While by no means exhaustive, I thought I might put together a quick list of a few tools you can use to manage and monitor Windows Azure components, from our IaaS, SaaS and PaaS offerings. Some of the products focus on one area more than another, but all are available today. I’ll try and maintain this list to keep it current, but make sure you check the date of this post’s update – if it’s more than six months old, it’s most likely out of date. Things move fast in the cloud. The Windows Azure Management Portal The primary tool for managing Windows Azure is our portal – most everything you need is there, from creating new services to querying a database. There are two versions as of this writing – a Silverlight client version, and a newer HTML5 version. The latter is being updated constantly to be in parity with the Silverlight client. There’s a balance in this portal between simplicity and power – we’re following the “less is more” approach, with increasing levels of detail as you work through the portal rather than overwhelming you with a single, long “more is more” page. You can find the Portal here: http://windowsazure.com (then click “Log In” and then “Portal”) Windows Azure Management API You can also use programming tools to either write your own interface, or simply provide management functions directly within your solution. You have two options – you can use the more universal REST API’s, which area bit more complex but work with any system that can write to them, or the more approachable .NET API calls in code. You can find the reference for the API’s here: http://msdn.microsoft.com/en-us/library/windowsazure/ee460799.aspx All Class Libraries, for each part of Windows Azure: http://msdn.microsoft.com/en-us/library/ee393295.aspx PowerShell Command-lets PowerShell is one of the most powerful scripting languages I’ve used with Windows – and it’s baked into all of our products. When you need to work with multiple servers, scripting is really the only way to go, and the Windows Azure PowerShell Command-Lets allow you to work across most any part of the platform – and can even be used within the services themselves. You can do everything with them from creating a new IaaS, PaaS or SaaS service, to controlling them and even working with security and more. You can find more about the Command-Lets here: http://wappowershell.codeplex.com/documentation (older link, still works, will point you to the new ones as well) We have command-line utilities for other operating systems as well: https://www.windowsazure.com/en-us/manage/downloads/ Video walkthrough of using the Command-Lets: http://channel9.msdn.com/Events/BUILD/BUILD2011/SAC-859T System Center System Center is actually a suite of graphical tools you can use to manage, deploy, control, monitor and tune software from Microsoft and even other platforms. This will be the primary tool we’ll recommend for managing a hybrid or contiguous management process – and as time goes on you’ll see more and more features put into System Center for the entire Windows Azure suite of products. You can find the Management Pack and README for it here: http://www.microsoft.com/en-us/download/details.aspx?id=11324 SQL Server Management Studio / Data Tools / Visual Studio SQL Server has two built-in management and development, and since Version 2008 R2, you can use them to manage Windows Azure Databases. Visual Studio also lets you connect to and manage portions of Windows Azure as well as Windows Azure Databases. You can read more about Visual Studio here: http://msdn.microsoft.com/en-us/library/windowsazure/ee405484 You can read more about the SQL tools here: http://msdn.microsoft.com/en-us/library/windowsazure/ee621784.aspx Vendor-Provided Tools Microsoft does not suggest or endorse a specific third-party product. We do, however, use them, and see lots of other customers use them. You can browse to these sites to learn more, and chat with their folks directly on how they support Windows Azure. Cerebrata: Tools for managing from the command-line, graphical diagnostics, graphical storage management - http://www.cerebrata.com/ Quest Cloud Tools: Monitoring, Storage Management, and costing tools - http://communities.quest.com/community/cloud-tools Paraleap: Monitoring tool - http://www.paraleap.com/AzureWatch Cloudgraphs: Monitoring too - http://www.cloudgraphs.com/ Opstera: Monitoring for Windows Azure and a Scale-out pattern manager - http://www.opstera.com/products/Azureops/ Compuware: SaaS performance monitoring, load testing - http://www.compuware.com/application-performance-management/gomez-apm-products.html SOASTA: Penetration and Security Testing - http://www.soasta.com/cloudtest/enterprise/ LoadStorm: Load-testing tool - http://loadstorm.com/windows-azure Open-Source Tools This is probably the most specific set of tools, and the list I’ll have to maintain most often. Smaller projects have a way of coming and going, so I’ll try and make sure this list is current. Windows Azure MMC: (I actually use this one a lot) http://wapmmc.codeplex.com/ Windows Azure Diagnostics Monitor: http://archive.msdn.microsoft.com/wazdmon Azure Application Monitor: http://azuremonitor.codeplex.com/ Azure Web Log: http://www.xentrik.net/software/azure_web_log.html Cloud Ninja:Multi-Tennant billing and performance monitor - http://cnmb.codeplex.com/ Cloud Samurai: Multi-Tennant Management- http://cloudsamurai.codeplex.com/ If you have additions to this list, please post them as a comment and I’ll research and then add them. Thanks!

Read the article
Management and Monitoring Tools for Windows Azure

- by BuckWoody

With such a large platform, Windows Azure has a lot of moving parts. We’ve done our best to keep the interface as simple as possible, while giving you the most control and visibility we can. However, as with most Microsoft products, there are multiple ways to do something – and I’ve always found that to be a good strength. Depending on the situation, I might want a graphical interface, a command-line interface, or just an API so I can incorporate the management into my own tools, or have third-party companies write other tools. While by no means exhaustive, I thought I might put together a quick list of a few tools you can use to manage and monitor Windows Azure components, from our IaaS, SaaS and PaaS offerings. Some of the products focus on one area more than another, but all are available today. I’ll try and maintain this list to keep it current, but make sure you check the date of this post’s update – if it’s more than six months old, it’s most likely out of date. Things move fast in the cloud. The Windows Azure Management Portal The primary tool for managing Windows Azure is our portal – most everything you need is there, from creating new services to querying a database. There are two versions as of this writing – a Silverlight client version, and a newer HTML5 version. The latter is being updated constantly to be in parity with the Silverlight client. There’s a balance in this portal between simplicity and power – we’re following the “less is more” approach, with increasing levels of detail as you work through the portal rather than overwhelming you with a single, long “more is more” page. You can find the Portal here: http://windowsazure.com (then click “Log In” and then “Portal”) Windows Azure Management API You can also use programming tools to either write your own interface, or simply provide management functions directly within your solution. You have two options – you can use the more universal REST API’s, which area bit more complex but work with any system that can write to them, or the more approachable .NET API calls in code. You can find the reference for the API’s here: http://msdn.microsoft.com/en-us/library/windowsazure/ee460799.aspx All Class Libraries, for each part of Windows Azure: http://msdn.microsoft.com/en-us/library/ee393295.aspx PowerShell Command-lets PowerShell is one of the most powerful scripting languages I’ve used with Windows – and it’s baked into all of our products. When you need to work with multiple servers, scripting is really the only way to go, and the Windows Azure PowerShell Command-Lets allow you to work across most any part of the platform – and can even be used within the services themselves. You can do everything with them from creating a new IaaS, PaaS or SaaS service, to controlling them and even working with security and more. You can find more about the Command-Lets here: http://wappowershell.codeplex.com/documentation (older link, still works, will point you to the new ones as well) We have command-line utilities for other operating systems as well: https://www.windowsazure.com/en-us/manage/downloads/ Video walkthrough of using the Command-Lets: http://channel9.msdn.com/Events/BUILD/BUILD2011/SAC-859T System Center System Center is actually a suite of graphical tools you can use to manage, deploy, control, monitor and tune software from Microsoft and even other platforms. This will be the primary tool we’ll recommend for managing a hybrid or contiguous management process – and as time goes on you’ll see more and more features put into System Center for the entire Windows Azure suite of products. You can find the Management Pack and README for it here: http://www.microsoft.com/en-us/download/details.aspx?id=11324 SQL Server Management Studio / Data Tools / Visual Studio SQL Server has two built-in management and development, and since Version 2008 R2, you can use them to manage Windows Azure Databases. Visual Studio also lets you connect to and manage portions of Windows Azure as well as Windows Azure Databases. You can read more about Visual Studio here: http://msdn.microsoft.com/en-us/library/windowsazure/ee405484 You can read more about the SQL tools here: http://msdn.microsoft.com/en-us/library/windowsazure/ee621784.aspx Vendor-Provided Tools Microsoft does not suggest or endorse a specific third-party product. We do, however, use them, and see lots of other customers use them. You can browse to these sites to learn more, and chat with their folks directly on how they support Windows Azure. Cerebrata: Tools for managing from the command-line, graphical diagnostics, graphical storage management - http://www.cerebrata.com/ Quest Cloud Tools: Monitoring, Storage Management, and costing tools - http://communities.quest.com/community/cloud-tools Paraleap: Monitoring tool - http://www.paraleap.com/AzureWatch Cloudgraphs: Monitoring too - http://www.cloudgraphs.com/ Opstera: Monitoring for Windows Azure and a Scale-out pattern manager - http://www.opstera.com/products/Azureops/ Compuware: SaaS performance monitoring, load testing - http://www.compuware.com/application-performance-management/gomez-apm-products.html SOASTA: Penetration and Security Testing - http://www.soasta.com/cloudtest/enterprise/ LoadStorm: Load-testing tool - http://loadstorm.com/windows-azure Open-Source Tools This is probably the most specific set of tools, and the list I’ll have to maintain most often. Smaller projects have a way of coming and going, so I’ll try and make sure this list is current. Windows Azure MMC: (I actually use this one a lot) http://wapmmc.codeplex.com/ Windows Azure Diagnostics Monitor: http://archive.msdn.microsoft.com/wazdmon Azure Application Monitor: http://azuremonitor.codeplex.com/ Azure Web Log: http://www.xentrik.net/software/azure_web_log.html Cloud Ninja:Multi-Tennant billing and performance monitor - http://cnmb.codeplex.com/ Cloud Samurai: Multi-Tennant Management- http://cloudsamurai.codeplex.com/ If you have additions to this list, please post them as a comment and I’ll research and then add them. Thanks!

Read the article
EPM and Business Analytics Talking-head Videos from Oracle OpenWorld 2013

- by Mike.Hallett(at)Oracle-BI&EPM

Normal 0 false false false EN-GB X-NONE X-NONE Here is a selection of 2 to 3 minute video interviews at this year’s Oracle OpenWorld: 1. George Somogyi, Solutions Architect, New Edge Group, talks about the importance of having their integrated Oracle Hyperion Platform consisting of Oracle Hyperion Financial Management, Oracle Hyperion Financial Data Quality Management, Oracle E-Business Suite R12 and Oracle Business Intelligence Extended Edition plus their use of Oracle Managed Cloud Services. Speaker: George Somogyi @ http://youtu.be/kWn0dQxCUy8 2. Gregg Thompson, Director of Financial Systems for ADT, talks about using Oracle Data Relationship Management prior to implementing an Enterprise Performance Management solution. Gregg confirmed that there are big benefits to bringing the full Oracle Hyperion Financial Close suite online with Oracle DRM as the metadata source. Reduced maintenance time and use of external consultants translates into significant time and cost savings and faster implementation times. Speaker: Gregg Thompson @ http://youtu.be/XnFrR9Uk4xk 3. Jeff Spangler, Director Financial Planning and Analysis for Speedy Cash Holdings Corp, talked to us about the benefits achieved through implementing Oracle Hyperion Planning and financial reporting solutions. He also describes how the use of Data Relationship Management will keep the process running smoothly now and in the future. Speaker: Jeff Spangler @ http://youtu.be/kkkuMkgJ22U 4. Marc Seewald, Senior Director of Product Management for Oracle Hyperion Tax Provision at Oracle, talks about Oracle Hyperion Tax Provision, how it is an integral part of the financial close process and that it provides better internal controls and automation of this task. Marc talks about Oracle Partners and customers alike who are seeing great value. Speaker: Marc Seewald @ http://youtu.be/lM_nfvACGuA 5. Matt Bradley, SVP of Product Development for Enterprise Performance Management (EPM) Applications at Oracle, talked to us about different deployment options for Oracle EPM. Cloud services (SaaS), managed services, on-premise, off-premise all have their merits, and organizations need flexibility to easily move between them as their companies evolve. Speaker: Matt Bradley @ http://youtu.be/ATO7Z9dbE-o 6. Neil Sellers, Partner, Qubix International talks about their experience with previewing Oracle’s new Planning and Budgeting Cloud Service. He describes the benefits of the step-by-step task lists, the speed of getting the application up and running, and the huge benefits of not having to manage the software and hardware side of the planning process. Speaker: Neil Sellers @ http://youtu.be/xmosO28e4_I 7. Praveen Pasupuleti, Senior Business Intelligence Development Manager of Citrix Systems Inc., talks about their Oracle Hyperion Planning upgrade and the huge performance improvement now experienced in forecasting. He also talked about the benefits of Oracle Hyperion Workforce Planning achieved by Citrix. Speaker: Praveen Pasupuleti @ http://youtu.be/d1e_4hLqw8c 8. CheckPoint Consulting, talked to us about how Enterprise Performance Management should be viewed as an entire solution, rather than as a bunch of applications in silos, to provide significant benefits; and how Data Relationship Management can tie it all together effectively. Speaker: Ron Dimon @ http://youtu.be/sRwbdbbXvUE 9. Sonal Kulkarni, Enterprise Performance Management Leader, Cummins Inc., talks about their use of Oracle Hyperion Financial Close Management (Account Reconciliation Manager), Oracle Hyperion Financial Management and Oracle Hyperion Financial Data Quality Management and how this is providing efficiency, visibility and compliance benefits. Speaker: Sonal Kulkarni @ http://youtu.be/OEgup5dKyVc 10. Todd Renard, Manager Financial Planning and Business Analytics for B/E Aerospace Inc., talks about the huge benefits that B/E Aerospace is experiencing from Oracle Financial Close Suite. He was extremely excited about Oracle Hyperion Financial Data Quality Management and how this helps them integrate a new business in as little as three weeks. Speaker: Todd Renard @ http://youtu.be/nIfqK46uVI8 11. Peter Smolianski, Chief Technology Officer for the District of Columbia Courts, talked to us about how D.C. Courts is using Oracle Scorecard and Strategy Management to push their 5 year plan forward, to report results to their constituents, and take accountability for process changes to become more efficient. Speaker: Peter Smolianski @ http://www.youtube.com/watch?v=T-DtB5pl-uk 12. Rich Wilkie, Senior Director of Product Management for Financial Close Suite at Oracle, talked to us about Oracle Financial Management Analytics. He told us how the prebuilt dashboards on top of Oracle Hyperion Financial Close Suite make it easy for everyone to see the numbers and understand where they are in the close process, and if there is an issue, they can see where it is. Executives are excited to get this information on mobile devices too. Speaker: Rich Wilkie @ http://www.youtube.com/watch?v=4UHuHgx74Yg 13. Dinesh Balebail, Senior Director of Software Development for Oracle Hyperion Profitability and Cost Management, talked to us about the power and speed of Oracle Hyperion Profitability and Cost Management and how it is being used to do deep costing for Telecoms, Hospitals, Banks and other high transaction volume organizations effectively. Speaker: Dinesh Balebail @ http://youtu.be/ivx5AZCXAfs /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman"; mso-ansi-language:EN-US; mso-fareast-language:EN-US;}

Read the article
Dealing with "I-am-cool-and-you-are-dumb" manager [closed]

- by Software Guy

I have been working with a software company for about 6 months now. I like the projects I work on there and I really like all the people there except for 1 guy. That guy is technically smart, and he is a co-founder of the company. He is an okay guy in person (the kind you wouldn't want to care about much) but things get tricky when he is your manager. In general I am all okay but there are times when I feel I am not being treated fairly: He doesn't give much thought to when he makes mistakes and when I do something similar, he is super critical. Recently he went as far as to say "I am not sure if I can trust you with this feature". The detais of this specific case are this: I was working on this feature, and I was already a couple of hours over my normal working hours, and then I decided to stop and continue tomorrow. We use git, and I like to commit changes locally and only push when I feel they are ready. This manager insists that I push all the changes to the central repo (in case my hard drive crashes). So I push the change, and the ticket is marked as "to be tested". Next day I come in, he sits next to me and starts complaining and says that I posted above. I really didn't know what to say, I tried to explain to him that the ticket is still being worked upon but he didn't seem to listen. He interrupts me in-between when I am coding, which I do not mind, but when I do that same, his face turns like this :| and reacts as if his work was super important and I am just wasting his time. He asks me to accumulate all questions, and then ask him altogether which is not always possible, as you need a clarification before you can continue on a feature implementation. And when I am coding, he talks on the phone with his customers next to me (when he can go to the meeting room with his laptop) and doesn't care. He made me switch to a whole new IDE (from Netbeans to a commercial IDE costing a lot of money) for a really tiny feature (which I later found out was in Netbeans as well!). I didn't make a big deal out of it as I am equally comfortable working with this new IDE, but I couldn't get the science behind his obsession. He said this feature makes sure that if any method is updated by a programmer, the IDE will turn the method name to red in places where it is used. I told him that I do not have a problem since I always search for method usage in the project and make sure its updated. IDEs even have refactoring features for exactly that, but... I recently implemented a feature for a project, and I was happy about it and considering him a senior, I asked him his comments about the implementation quality.. he thought long and hard, made a few funny faces, and when he couldn't find anything, he said "ummm, your program will crash if JS is disabled" - he was wrong, since I had made sure it would work fine with default values even if JS was disabled. I told him that and then he said "oh okay". BUT, the funny thing is, a few days back, he implemented something and I objected with "But that would not run if JS is disabled" and his response was "We don't have to care about people who disable JS" :-/ Once he asked me to investigate if there was a way to modify a CMS generated menu programmatically by extending the CMS, I did my research and told him that the only was is to inject a menu item using JavaScript / jQuery and his reaction was "ah that's ugly, and hacky, not acceptable" and two days later, I see that feature implemented in the same way as I had suggested. The point is, his reaction was not respectful at all, even if what I proposed was hacky, he should be respectful, that I know what's hacky and if I am suggesting something hacky, there must be a reason for it. There are plenty of other reasons / examples where I feel I am not being treated fairly. I want your advice as to what is it that I am doing wrong and how to deal with such a situation. The other guys in the team are actually very good people, and I do not want to leave the job either (although I could, if I want to). All I want is respect and equal treatment. I have thought about talking to this guy in a face to face meeting, but that worries me that his attitude might get worse and make things more difficult for me (since he doesn't seem to be the guy who thinks he can be wrong too). I am also considering talking to the other co-founder but I am not sure how he will take it (as both founders have been friends forever). Thanks for reading the long message, I really appreciate your help.

Read the article
Internal drives vs USB-3 with external SSD or eSata with External SSD

- by normstorm

I have a need to carry VMWare Virtual Machines with me for work. These are very large files (each VM is 20GB or more) and I carry around about 40 to 50 VM's to simulate different software configurations for different client needs. Key: they won't fit on the internal hard drive of my current laptop. I currently execute the VM's from an external 7200RPM 2.5" USB-2 drive. I keep copies of the VM's on other 5400 external USB-2 drives. The VM's work from this drive, but they are slow, costing me much time and frustration. It can take upwards of 30 minutes just to make a copy of one of the VM's. They can take upwards of 10-15 minutes to fully launch and then they operate sluggishly. I am buying a new laptop (Core I7, 8GB RAM and other high-end specs). I intend to buy an SSD for the O/S volume (C:). This SSD will not be large enough to hold the VM's. I have always wanted a second internal hard drive to operate the VM's. To have two hard drives, though, I am finding that I will have to go to a 17" laptop which would be bulky/heavy. I am instead considering purchasing a 15" laptop with either an eSATA port or USB-3 ports and then purchasing two external drives. One of the drives might be an external SSD (maybe OCX brand) for operating the VM's and the other a 7400RPM 1TB hard drive for carrying around the VM's not currently in use. The question is which options would give me the biggest bang for the buck and the weight: 1) 2nd Internal SSD hard drive. This would mean buying a 17" laptop with two drive "bays". The first bay would hold an SSD drive for the C: drive. I would leave the first bay empty from the manufacture and then purchase/install an aftermarket SSD drive. This second SSD drive would have to be very large (256 GB), which would be expensive. I would still also need another external hard drive for carrying around the VM's not in use. 2) 2nd internal hard drive - 7400 RPM. Again, a 17" laptop would be required, but there are models available with on SSD drive for the C: drive and a second 7200 RPM hard drives. The second drive could probably be large enough to hold the VM's in use as well as those not in use. But would it be fast enough to drive the VM's? 3) USB-3 with External SSD. I could buy a 15" laptop with an SSD drive for the C: drive and a second hard drive for general files. I would operate the VM's from an external USB-3 SSD drive and have a third USB-3 external 7200 RPM drive for holding the VM's not in use. 4) eSATA with External SSD. Ditto, just eSATA instead of USB-3 5) USB-3 with External 7400 RPM drive. Ditto, but the drive running the VM's would be USB-3 attached 7400 RPM drives rather than SSD. 6) eSATA with External 7400 RPM drive. Dittor, but the drive running the VM's would be eSATA attached 7400 RPM drives rather than SSD. Any thoughts on this and any creative solutions?

Read the article
Using different SSDs types (not only SATA based) as system drive

- by Hubert Kario

Currently I have a Thinkpad X61s and want to make it both a bit faster and a bit more power efficient. For that reason I thought that adding SSD drive would make most sense. Unfortunately, because of financial reasons, buying SSD of over 200GB capacity is out of reach for me (not only it would be worth more than the rest of the laptop, but also I currently have a 500GB drive in it, so even such a drive would be kind of a downgrade for me). During preliminary testing with a cheap Transcend 4GB Class 6 (14MiB/s streaming, 9MiB/s random read) card I experienced boot times to be reduced by half so putting the OS only on it would already would be an improvement. Unfortunately, my system now is about 11GiB in size so anything less than 16GB would be constraining. In this laptop I can connect additional drives on at least 5 different ways: using SATA-ATA converter caddy in the X6 Ultrabase using internal mini PCIe slot using integrated SDHC slot using CardBus (a.k.a PCMCIA or PC Card) slot using USB Thankfully, because I use only Linux on this PC the bootability of them is irrelevant as I can put the /boot partition on internal HDD and / on any of the above mentioned Flash memories (as I already did for the SDHC test). From what I was able to research and from my own experience those options come with rather big downsides or other problems: SATA-ATA caddy It has three downsides: I have to carry the Ultrabse with me at all times (it's not really inconvenient, but those grams do add) and couldn't disconnect it when I want to disconnect the battery It makes the bay unusable for the optical drive and occasional quick access to other hard drives the only caddies I could buy have rather flaky controllers in them so putting my OS on it would hamper its stability Internal mini PCIe slot This would be an ideal solution, if only I could find real PCIe SSDs, not only devices that could talk only SATA or ATA over PCIe mechanical connection (the ones used in Dell Mini or Asus EEE). Theoretically Samsung did release such devices but I couldn't find them in retail anywhere. Integrated SDHC slot It's a nice solution with a single drawback: the fastest 16GB SDHC card on the market can only do around 35MiB/s read and 15MiB/s write while still costing like a normal 40GB SATA SSD that's 10 times faster. Not really cost-effective. CardBus (a.k.a PCMCIA or PC Card) slot Those cards are much faster than the SDHC option (there are ones that can do well over 50MiB/s read in benchmarks) and from what I could find the PCMCIA controller in my laptop does support UDMA so it should be able to deliver comparable speeds. They still cost similarly to SD cards but at least they provide streaming performance comparable to my current HDD. USB That's the worst option. Not only is it limited to 20-30MiB/s by the interface itself the drive would stick out of the laptop so it's a big no no. The question As such I think that going the "CF in a CardBus adapter" route will be the best option. My question is: did anyone try using CF cards in CardBus adapters as system drives with Linux on Thinkpad laptops? Laptops in general? What was the real-world performance? I don't have any CF cards so I can't check how well does it work with suspend/resume, or whatever it's easy to make it work in initramfs (I'm using ArchLinux and SD card was trivial — add 3 modules in single config line and rebuilding initramfs) so any tips/gotchas on this are welcome as well.

Read the article
Product Development Investment: A Measure of Vendor Performance

- by Jim Mcglothlin

The relationship between a large, complex organization and its key suppliers of information technology is normally more than just "strategic". Expectations about the duration of the relationship typically exceed 20 years. Enterprise applications and technology infrastructure are not expected to be changed out like petunias. So how would you rate the due diligence processes as performed in Higher Education when selecting critical, transformational information technology? My observation: I see a lot of effort put into elaborate demonstration of basic software functionality. I see a lot of attention paid to the cost element of technology acquisition, including the contracted cost of implementation consulting services. But the factor that receives only cursory analysis and due diligence is long-term performance--the ability of a vendor to grow, expand, and develop, and bring its customers along with it. So what should you look for in a long-term IT supplier? Oracle has a public track record for product development. The annual investment has been on a run rate of almost $3 Billion organic product development. Oracle's well-publicized acquisitions and mergers have been supplemental to its R&D. This is important for Higher Education. Another meaningful way to evaluate a company is to look at the tangible track record of enhancement. Consider the Oracle-PeopleSoft enterprise business platform since acquired by Oracle 6 years ago: Product or Technology Enhancement Customer or User Impact Service Oriented Architecture (SOA) 300+ new web services delivered in versions 9.0 & 9.1 provide flexibility, so that customers can integrate PeopleSoft with other applications. Campus Solutions has added Admissions and Constituent Web Services. Constituent Relationship Management PeopleSoft CRM 9.1 for Higher Education introduced new process flows for student recruiting and retention to support "Student Success" initiatives. A 360 view of the constituent is now delivered, and the concept of a single-stop Student Services Center is now in CRM 9.1 with tight integration to PeopleSoft Campus Solutions. Human Capital Management Contract Pay for Education, with flexibility for configuration and calculation, has been extended in HCM 9.1. New chartfield integration among Project Costing - Time & Labor - Payroll to serve the labor distribution requirements for Grants / Sponsored Research. Talent Management PeopleSoft 9.0 and 9.1 feature an integrated talent management approach centered on definitions in "Profile Manager", with all new usability improvements. Internal and external candidate pools, and the entire recruitment process, are driven by delivered configurable selection and on-boarding processes. Interview scheduling, and online job offers are newly delivered processes. Performance Management PeopleSoft HCM ePerformance 9.1 will include significant new functionality designed to help organizations more effectively align business objectives with employee goals. Using an Organization Chart view, your business goals can flow down to become tangible objectives per employee. Succession Planning / Workforce Development New in HCM 9.0, enhanced in 9.1, is a planning capability for regular or unusual (major organizational change) succession of internal or external candidates. PeopleSoft supports employee-based career planning, which ultimately increases the integrity of the succession planning process (identify their career needs, plans, preferences, and interests). Dashboards / Oracle Business Intelligence Application Suite Oracle Human Resources Analytics provides the workforce information foundation that integrates data from HR functional areas and Finance. Oracle Human Resources Analytics delivers 9 dashboards and over 200 reports. Provide your HR professionals and front-line managers the tools to analyze workforce staffing, retention, productivity, to better source high-quality applicants, and to reduce absence costs. Multi-year Planning and Commitment Control External funding sources, especially Grants, require a multi-year encumbrance business process. PeopleSoft HCM 9.1 adds multi-year funding and commitment control, including budget checking. The newly designed Real Time Budget Checking will provide the customer with an updated snapshot of their budget and encumbrances at any given time. Position Budgeting with Hyperion Hyperion Planning world-class products now include delivered integration to PeopleSoft HCM. Position Budgeting is available in the new Public Sector Planning module of Hyperion. Web 2.0 features for the latest in usability PeopleSoft 9.1 features a contemporary internet user experience: Partial-page refreshing Drag and drop pagelets New menu structure Navigation pagelets Modal popup message windows Favorites & recently used links Type-ahead Drag and drop grid columns, pop-out grids Portal Workspaces Enterprise 2.0 for your collaborative web communities, using new content management, along with Wikis, blogs, and discussion forums in PeopleSoft Portal 9.1. PeopleTools enhanced by Oracle Fusion Middleware Standards-based tools have been added to the PeopleTools application infrastructure: BI (XML) Publisher, Java tools. Certified for use with PeopleSoft: Oracle Business Intelligence (OBIEE), Oracle Enterprise Manager, Oracle Weblogic Server, Oracle SOA Suite. Hosting for PeopleSoft applications A solid new deployment option: Oracle On Demand remote hosting center for high scalability, security, and continuity of operations. Business Process Outsourcing (BPO) for HCM / Payroll functions Partnership with AT&T provides hosting of HR/Payroll application along with payroll business process operations, and subscription-based service fees (SaaS). AT&T BPO full service includes pay sheet processing, bank and 3rd party file transfer, payroll tax handling, etc. Continuous Delivery Model Feature Packs provide faster time-to-benefit; new features become available in PeopleSoft 9.1 (or Campus Solutions 9.0) without need to perform upgrade. Golden person data model across all campus applications Oracle Higher Education Constituent Hub provides synchronization and data governance of person data across any application, e.g. HR/ Payroll, Student Information System, Housing, Emergency Contact, LMS, CRM. Oracle's aggressive enhancement plans within the "Applications Unlimited" program continue, as new functionality is under development for a new version of a PeopleSoft release planned for 2012. Meanwhile, new capabilities are planned on an annual basis in Feature Packs. PeopleSoft just delivered the HCM 2010 Feature Pack and another is planned for 2011. In February we plan to have over 100 customers from our Customer Advisory Boards at our PeopleSoft Development Center in California to review designs for all of these releases. For those of you near New York City The investment and progressive development story described above is the subject of an Oracle road show event on February 9, 2011. Charting Your Course with Oracle Applications is a global event series designed to help business and IT executives assess the impact of new inflection points on their business and applications roadmap: changing workforces, shifting customer and constituent bases, and increased volatility. Learn how innovations ranging from new deployment models like cloud computing to the introduction of social applications and smart devices are delivering results across all areas of business and industry. THIS DOCUMENT IS FOR INFORMATIONAL PURPOSES ONLY AND MAY NOT BE INCORPORATED INTO A CONTRACT OR AGREEMENT.

Read the article
Cost Comparison Hard Disk Drive to Solid State Drive on Price per Gigabyte - dispelling a myth!

- by tonyrogerson

It is often said that Hard Disk Drive storage is significantly cheaper per GiByte than Solid State Devices – this is wholly inaccurate within the database space. People need to look at the cost of the complete solution and not just a single component part in isolation to what is really required to meet the business requirement. Buying a single Hitachi Ultrastar 600GB 3.5” SAS 15Krpm hard disk drive will cost approximately £239.60 (http://scan.co.uk, 22nd March 2012) compared to an OCZ 600GB Z-Drive R4 CM84 PCIe costing £2,316.54 (http://scan.co.uk, 22nd March 2012); I’ve not included FusionIO ioDrive because there is no public pricing available for it – something I never understand and personally when companies do this I immediately think what are they hiding, luckily in FusionIO’s case the product is proven though is expensive compared to OCZ enterprise offerings. On the face of it the single 15Krpm hard disk has a price per GB of £0.39, the SSD £3.86; this is what you will see in the press and this is what sales people will use in comparing the two technologies – do not be fooled by this bullshit people! What is the requirement? The requirement is the database will have a static size of 400GB kept static through archiving so growth and trim will balance the database size, the client requires resilience, there will be several hundred call centre staff querying the database where queries will read a small amount of data but there will be no hot spot in the data so the randomness will come across the entire 400GB of the database, estimates predict that the IOps required will be approximately 4,000IOps at peak times, because it’s a call centre system the IO latency is important and must remain below 5ms per IO. The balance between read and write is 70% read, 30% write. The requirement is now defined and we have three of the most important pieces of the puzzle – space required, estimated IOps and maximum latency per IO. Something to consider with regard SQL Server; write activity requires synchronous IO to the storage media specifically the transaction log; that means the write thread will wait until the IO is completed and hardened off until the thread can continue execution, the requirement has stated that 30% of the system activity will be write so we can expect a high amount of synchronous activity. The hardware solution needs to be defined; two possible solutions: hard disk or solid state based; the real question now is how many hard disks are required to achieve the IO throughput, the latency and resilience, ditto for the solid state. Hard Drive solution On a test on an HP DL380, P410i controller using IOMeter against a single 15Krpm 146GB SAS drive, the throughput given on a transfer size of 8KiB against a 40GiB file on a freshly formatted disk where the partition is the only partition on the disk thus the 40GiB file is on the outer edge of the drive so more sectors can be read before head movement is required: For 100% sequential IO at a queue depth of 16 with 8 worker threads 43,537 IOps at an average latency of 2.93ms (340 MiB/s), for 100% random IO at the same queue depth and worker threads 3,733 IOps at an average latency of 34.06ms (34 MiB/s). The same test was done on the same disk but the test file was 130GiB: For 100% sequential IO at a queue depth of 16 with 8 worker threads 43,537 IOps at an average latency of 2.93ms (340 MiB/s), for 100% random IO at the same queue depth and worker threads 528 IOps at an average latency of 217.49ms (4 MiB/s). From the result it is clear random performance gets worse as the disk fills up – I’m currently writing an article on short stroking which will cover this in detail. Given the work load is random in nature looking at the random performance of the single drive when only 40 GiB of the 146 GB is used gives near the IOps required but the latency is way out. Luckily I have tested 6 x 15Krpm 146GB SAS 15Krpm drives in a RAID 0 using the same test methodology, for the same test above on a 130 GiB for each drive added the performance boost is near linear, for each drive added throughput goes up by 5 MiB/sec, IOps by 700 IOps and latency reducing nearly 50% per drive added (172 ms, 94 ms, 65 ms, 47 ms, 37 ms, 30 ms). This is because the same 130GiB is spread out more as you add drives 130 / 1, 130 / 2, 130 / 3 etc. so implicit short stroking is occurring because there is less file on each drive so less head movement required. The best latency is still 30 ms but we have the IOps required now, but that’s on a 130GiB file and not the 400GiB we need. Some reality check here: a) the drive randomness is more likely to be 50/50 and not a full 100% but the above has highlighted the effect randomness has on the drive and the more a drive fills with data the worse the effect. For argument sake let us assume that for the given workload we need 8 disks to do the job, for resilience reasons we will need 16 because we need to RAID 1+0 them in order to get the throughput and the resilience, RAID 5 would degrade performance. Cost for hard drives: 16 x £239.60 = £3,833.60 For the hard drives we will need disk controllers and a separate external disk array because the likelihood is that the server itself won’t take the drives, a quick spec off DELL for a PowerVault MD1220 which gives the dual pathing with 16 disks 146GB 15Krpm 2.5” disks is priced at £7,438.00, note its probably more once we had two controller cards to sit in the server in, racking etc. Minimum cost taking the DELL quote as an example is therefore: {Cost of Hardware} / {Storage Required} £7,438.60 / 400 = £18.595 per GB £18.59 per GiB is a far cry from the £0.39 we had been told by the salesman and the myth. Yes, the storage array is composed of 16 x 146 disks in RAID 10 (therefore 8 usable) giving an effective usable storage availability of 1168GB but the actual storage requirement is only 400 and the extra disks have had to be purchased to get the IOps up. Solid State Drive solution A single card significantly exceeds the IOps and latency required, for resilience two will be required. ( £2,316.54 * 2 ) / 400 = £11.58 per GB With the SSD solution only two PCIe sockets are required, no external disk units, no additional controllers, no redundant controllers etc. Conclusion I hope by showing you an example that the myth that hard disk drives are cheaper per GiB than Solid State has now been dispelled - £11.58 per GB for SSD compared to £18.59 for Hard Disk. I’ve not even touched on the running costs, compare the costs of running 18 hard disks, that’s a lot of heat and power compared to two PCIe cards!Just a quick note: I've left a fair amount of information out due to this being a blog! If in doubt, email me :)I'll also deal with the myth that SSD's wear out at a later date as well - that's just way over done still, yes, 5 years ago, but now - no.

Read the article
Creating a Training Lab on Windows Azure

- by Michael Stephenson

Originally posted on: http://geekswithblogs.net/michaelstephenson/archive/2013/06/17/153149.aspxThis week we are preparing for a training course that Alan Smith will be running for the support teams at one of my customers around Windows Azure. In order to facilitate the training lab we have a few prerequisites we need to handle. One of the biggest ones is that although the support team all have MSDN accounts the local desktops they work on are not ideal for running most of the labs as we want to give them some additional developer background training around Azure. Some recent Azure announcements really help us in this area: MSDN software can now be used on Azure VM You don't pay for Azure VM's when they are no longer used Since the support team only have limited experience of Windows Azure and the organisation also have an Enterprise Agreement we decided it would be best value for money to spin up a training lab in a subscription on the EA and then we can turn the machines off when we are done. At the same time we would be able to spin them back up when the users need to do some additional lab work once the training course is completed. In order to achieve this I wanted to create a powershell script which would setup my training lab. The aim was to create 18 VM's which would be based on a prebuilt template with Visual Studio and the Azure development tools. The script I used is described below The Start & Variables The below text will setup the powershell environment and some variables which I will use elsewhere in the script. It will also import the Azure Powershell cmdlets. You can see below that I will need to download my publisher settings file and know some details from my Azure account. At this point I will assume you have a basic understanding of Azure & Powershell so already know how to do this. Set-ExecutionPolicy Unrestrictedcls $startTime = get-dateImport-Module "C:\Program Files (x86)\Microsoft SDKs\Windows Azure\PowerShell\Azure\Azure.psd1"# Azure Publisher Settings $azurePublisherSettings = '<Your settings file>.publishsettings' # Subscription Details $subscriptionName = "<Your subscription name>" $defaultStorageAccount = "<Your default storage account>" # Affinity Group Details $affinityGroup = '<Your affinity group>' $dataCenter = 'West Europe' # From Get-AzureLocation # VM Details $baseVMName = 'TRN' $adminUserName = '<Your admin username>' $password = '<Your admin password>' $size = 'Medium' $vmTemplate = '<The name of your VM template image>' $rdpFilePath = '<File path to save RDP files to>' $machineSettingsPath = '<File path to save machine info to>' Functions In the next section of the script I have some functions which are used to perform certain actions. The first is called CreateVM. This will do the following actions: If the VM already exists it will be deleted Create the cloud service Create the VM from the template I have created Add an endpoint so we can RDP to them all over the same port Download the RDP file so there is a short cut the trainees can easily access the machine via Write settings for the machine to a log file function CreateVM($machineNo) { # Specify a name for the new VM $machineName = "$baseVMName-$machineNo" Write-Host "Creating VM: $machineName" # Get the Azure VM Image $myImage = Get-AzureVMImage $vmTemplate #If the VM already exists delete and re-create it $existingVm = Get-AzureVM -Name $machineName -ServiceName $serviceName if($existingVm -ne $null) { Write-Host "VM already exists so deleting it" Remove-AzureVM -Name $machineName -ServiceName $serviceName } "Creating Service" $serviceName = "bupa-azure-train-$machineName" Remove-AzureService -Force -ServiceName $serviceName New-AzureService -Location $dataCenter -ServiceName $serviceName Write-Host "Creating VM: $machineName" New-AzureQuickVM -Windows -name $machineName -ServiceName $serviceName -ImageName $myImage.ImageName -InstanceSize $size -AdminUsername $adminUserName -Password $password Write-Host "Updating the RDP endpoint for $machineName" Get-AzureVM -name $machineName -ServiceName $serviceName ` | Add-AzureEndpoint -Name RDP -Protocol TCP -LocalPort 3389 -PublicPort 550 ` | Update-AzureVM Write-Host "Get the RDP File for machine $machineName" $machineRDPFilePath = "$rdpFilePath\$machineName.rdp" Get-AzureRemoteDesktopFile -name $machineName -ServiceName $serviceName -LocalPath "$machineRDPFilePath" WriteMachineSettings "$machineName" "$serviceName" } The delete machine settings function is used to delete the log file before we start re-running the process. function DeleteMachineSettings() { Write-Host "Deleting the machine settings output file" [System.IO.File]::Delete("$machineSettingsPath"); } The write machine settings function will get the VM and then record its details to the log file. The importance of the log file is that I can easily provide the information for all of the VM's to our infrastructure team to be able to configure access to all of the VM's function WriteMachineSettings([string]$vmName, [string]$vmServiceName) { Write-Host "Writing to the machine settings output file" $vm = Get-AzureVM -name $vmName -ServiceName $vmServiceName $vmEndpoint = Get-AzureEndpoint -VM $vm -Name RDP $sb = new-object System.Text.StringBuilder $sb.Append("Service Name: "); $sb.Append($vm.ServiceName); $sb.Append(", "); $sb.Append("VM: "); $sb.Append($vm.Name); $sb.Append(", "); $sb.Append("RDP Public Port: "); $sb.Append($vmEndpoint.Port); $sb.Append(", "); $sb.Append("Public DNS: "); $sb.Append($vmEndpoint.Vip); $sb.AppendLine(""); [System.IO.File]::AppendAllText($machineSettingsPath, $sb.ToString()); } # end functions Rest of Script In the rest of the script it is really just the bit that orchestrates the actions we want to happen. It will load the publisher settings, select the Azure subscription and then loop around the CreateVM function and create 16 VM's Import-AzurePublishSettingsFile $azurePublisherSettings Set-AzureSubscription -SubscriptionName $subscriptionName -CurrentStorageAccount $defaultStorageAccount Select-AzureSubscription -SubscriptionName $subscriptionName DeleteMachineSettings "Starting creating Bupa International Azure Training Lab" $numberOfVMs = 16 for ($index=1; $index -le $numberOfVMs; $index++) { $vmNo = "$index" CreateVM($vmNo); } "Finished creating Bupa International Azure Training Lab" # Give it a Minute Start-Sleep -s 60 $endTime = get-date "Script run time " + ($endTime - $startTime) Conclusion As you can see there is nothing too fancy about this script but in our case of creating a small isolated training lab which is not connected to our corporate network then we can easily use this to provision the lab. Im sure if this is of use to anyone you can easily modify it to do other things with the lab environment too. A couple of points to note are that there are some soft limits in Azure about the number of cores and services your subscription can use. You may need to contact the Azure support team to be able to increase this limit. In terms of the real business value of this approach, it was not possible to use the existing desktops to do the training on, and getting some internal virtual machines would have been relatively expensive and time consuming for our ops team to do. With the Azure option we are able to spin these machines up for a temporary period during the training course and then throw them away when we are done. We expect the costing of this test lab to be very small, especially considering we have EA pricing. As a ball park I think my 18 lab VM training environment will cost in the region of $80 per day on our EA. This is a fraction of the cost of the creation of a single VM on premise.

Read the article
Conducting Effective Web Meetings

- by BuckWoody

There are several forms of corporate communication. From immediate, rich communications like phones and IM messaging to historical transactions like e-mail, there are a lot of ways to get information to one or more people. From time to time, it's even useful to have a meeting. (This is where a witty picture of a guy sleeping in a meeting goes. I won't bother actually putting one here; you're already envisioning it in your mind) Most meetings are pointless, and a complete waste of time. This is the fault, completely and solely, of the organizer. It's because he or she hasn't thought things through enough to think about alternate forms of information passing. Here's the criteria for a good meeting - whether in-person or over the web: 100% of the content of a meeting should require the participation of 100% of the attendees for 100% of the time It doesn't get any simpler than that. If it doesn't meet that criteria, then don't invite that person to that meeting. If you're just conveying information and no one has the need for immediate interaction with that information (like telling you something that modifies the message), then send an e-mail. If you're a manager, and you need to get status from lots of people, pick up the phone.If you need a quick answer, use IM. I once had a high-level manager that called frequent meetings. His real need was status updates on various processes, so 50 of us would sit in a room while he asked each one of us questions. He believed this larger meeting helped us "cross pollinate ideas". In fact, it was a complete waste of time for most everyone, except in the one or two moments that they interacted with him. So I wrote some code for a Palm Pilot (which was a kind of SmartPhone but with no phone and no real graphics, but this was in the days when we had just discovered fire and the wheel, although the order of those things is still in debate) that took an average of the salaries of the people in the room (I guessed at it) and ran a timer which multiplied the number of people against the salaries. I left that running in plain sight for him, and when he asked about it, I explained how much the meetings were really costing the company. We had far fewer meetings after. Meetings are now web-enabled. I believe that's largely a good thing, since it saves on travel time and allows more people to participate, but I think the rule above still holds. And in fact, there are some other rules that you should follow to have a great meeting - and fewer of them. Be Clear About the Goal This is important in any meeting, but all of us have probably gotten an invite with a web link and an ambiguous title. Then you get to the meeting, and it's a 500-level deep-dive on something everyone expects you to know. This is unfair to the "expert" and to the participants. I always tell people that invite me to a meeting that I will be as detailed as I can - but the more detail they can tell me about the questions, the more detailed I can be in my responses. Granted, there are times when you don't know what you don't know, but the more you can say about the topic the better. There's another point here - and it's that you should have a clearly defined "win" for the meeting. When the meeting is over, and everyone goes back to work, what were you expecting them to do with the information? Have that clearly defined in your head, and in the meeting invite. Understand the Technology There are several web-meeting clients out there. I use them all, since I meet with clients all over the world. They all work differently - so I take a few moments and read up on the different clients and find out how I can use the tools properly. I do this with the technology I use for everything else, and it's important to understand it if the meeting is to be a success. If you're running the meeting, know the tools. I don't care if you like the tools or not, learn them anyway. Don't waste everyone else's time just because you're too bitter/snarky/lazy to spend a few minutes reading. Check your phone or mic. Check your video size. Install (and learn to use) ZoomIT (http://technet.microsoft.com/en-us/sysinternals/bb897434.aspx). Format your slides or screen or output correctly. Learn to use the voting features of the meeting software, and especially it's whiteboard features. Figure out how multiple monitors work. Try a quick meeting with someone to test all this. Do this *before* you invite lots of other people to your meeting. Use a WebCam I'm not a pretty man. I have a face fit for radio. But after attending a meeting with clients where one Microsoft person used a webcam and another did not, I'm convinced that people pay more attention when a face is involved. There are tons of studies around this, or you can take my word for it, but toss a shirt on over those pajamas and turn the webcam on. Set Up Early Whether you're attending or leading the meeting, don't wait to sign on to the meeting at the time when it starts. I can almost plan that a 10:00 meeting will actually start at 10:10 because the participants/leader is just now installing the web client for the meeting at 10:00. Sign on early, go on mute, and then wait for everyone to arrive. Mute When Not Talking No one wants to hear your screaming offspring / yappy dog / other cubicle conversations / car wind noise (are you driving in a desert storm or something?) while the person leading the meeting is trying to talk. I use the Lync software from Microsoft for my meetings, and I mute everyone by default, and then tell them to un-mute to talk to the group. Share Collateral If you have a PowerPoint deck, mail it out in case you have a tech failure. If you have a document, share it as an attachment to the meeting. Don't make people ask you for the information - that's why you're there to begin with. Even better, send it out early. "But", you say, "then no one will come to the meeting if they have the deck first!" Uhm, then don't have a meeting. Send out the deck and a quick e-mail and let everyone get on with their productive day. Set Actions At the Meeting A meeting should have some sort of outcome (see point one). That means there are actions to take, a follow up, or some deliverable. Otherwise, it's an e-mail. At the meeting, decide who will do what, when things are needed, and so on. And avoid, if at all possible, setting up another meeting, unless absolutely necessary. So there you have it. Whether it's on-premises or on the web, meetings are a necessary evil, and should be treated that way. Like politicians, you should have as few of them as are necessary to keep the roads paved and public libraries open.

Read the article
Different Flavors of Leases Back On

- by Theresa Hickman

Given the continued interest regarding the proposed changes to Lease Accounting, I decided to write another entry on this controversial topic with colorful commentary from our resident accounting expert, Seamus Moran. Background (A History Lesson) Back in 1976, the FASB issued FAS 13, “Accounting for Leases” that permitted leases to be either an operating lease or capital (finance) lease. In substance, operating leases are a form of off-balance sheet financing. According to Seamus, operating leases date back to the launch of the Boeing 707 in the 1950s. Because the aircraft was so much more expensive than previous aircrafts, the industry came up with the operating lease concept to accommodate these jet liners that dominated air transport. How it worked was the bank would buy the plane and lease it to the airline. Because the bank never controlled or flew the plane, they never placed the asset on their balance sheet, and because the airline never owned the plane, they didn’t place it on their balance sheet either. They simply treated the monthly lease payments as rental expenses on the P&L. August 2010 Original Lease Accounting Changes In August 2010, FASB and IASB decided to overhaul lease accounting as part of their joint commitment “to insure that investors and other users of financial statements are provided useful, transparent, and complete information about leasing transactions in the financial statements.” Some say that the current lease accounting standards are broken because it keeps assets off the balance sheet, hidden from investors’ view. The original proposal abolished operating leases and only permitted capital leases where all leases would be recorded on the balance sheet as assets and liabilities. The asset side would reflect the right to use the asset for the leased term, and the liability side would reflect the obligation to make lease payments. Why Companies Were Freaking Out According to the SEC, the financial impact of the aforementioned lease changes was estimated to add more than $1.3 trillion of operating lease obligations to corporate balance sheets. Many companies in various industries, especially retail, are concerned because the changes are significant and will impact existing leases with no grandfather clause for existing operating leases. Of course, the banks and airlines I mentioned earlier really hate this because neither wants to report the airplane (now costing around $60 M) as an asset. Regular companies were concerned that they would have to report routine short term leases of real estate or equipment as fixed assets, even though they were really just longer term rentals. One company we spoke to leased roadside billboards, and really did not consider them to be fixed assets in any way. Obviously, these changes would have had a profound and lasting effect on a company’s financial and real estate strategies and significantly impact its financial statements. Financial statements would show higher depreciation and interest expense with significantly higher total assets and debt. In terms of financial metrics, they’re negatively impacted. It would raise a company’s debt-to-capital ratio to reflect the higher debt compared to equity, it would negatively impact their return-on-assets because now companies will appear more asset intensive, and it will decrease EPS, lowering shareholder ROI. Feb. 2011 Recent Update The comment period on leases closed in December 2010. The FASB and the IASB have met several times since then and published their initial responses to the input they received from the various interested parties. They are “redeliberating” the principles involved in Lease Accounting. Some of the issues they are looking at include: The core definition of a lease. This will articulate principles on what is a lease and what is “not-a-lease.” One theory or supposition is that they might define a lease as the transfer of certain but not all major ownership attributes for a certain period of time. So a year’s lease of an aircraft might be a “lease,” but a year’s lease of half a floor in an office building would be “not-a-lease.” The ownership attributes transferred from the core owner to the user are different; the airline must maintain, paint, and do whatever it needs to do on the aircraft. However, the office renter will have strictly limited rights in respect to the rented space. The differences between a lease contract and service contract. Even if they call them “leases” for the purpose of commercial law, a service contract might not be accounted for as a lease. The accounting to be done by the lessee. They would define when the bank or landlord would retain the asset on their balance sheet, and perhaps by implication, when the lessor would not need to include the asset on theirs. So if the finance house keeps the airplane or office on their balance sheet, the tenant doesn’t need to. I’m not sure that I can draw the opposite conclusion where the finance house doesn’t report but the tenant must. The difference, if any, between a financing lease and other leases, and the implications to the accounting. The present value calculation when renewable terms exist. They have reduced the circumstances in which one must look at the renewable terms of a lease in calculating the present value. In most circumstances, you will use the lease term rather than the potential renewable term. Their latest discussion this past week with the contents of the discussion was not available at the time of me writing this entry. For more details, the results of the discussions are posted on both the FASB and the IASB websites. Implied Software Changes Whatever the final rules turn out to be, all ERP systems, such as Oracle E-Business Suite, PeopleSoft Enterprise, JD Edwards, and Oracle Hyperion will need to change their software to accommodate the new rules. The following lists some changes that might have to be made to accounting software depending on what the final standards will be in June 2011: Lease tracking may require modifications with tracking of additional lease details that might require a centralized repository to maintain Accounting may need to be modified as there are many changes to how capital leases and the new “other than finance” leases are accounted for both on the lessee and lessor side. For example, valuation, amortization, and disclosure will be considerably different requiring different types of data to be captured. Companies may need to modify their chart of accounts depending on how they want to track leases, which could then impact financial reporting and consolidation Business processes may require changes which could then impact internal controls Software applications may need to perform more advanced computations on leases Reports and KPIs may need to reflect new operating metrics Hold Onto Your Seats Before you redo all your lease agreements and call your software vendors asking when the changes to the software will be made, remember that the rules are not finalized yet, and from appearances, will not reflect the proposals in the exposure draft. Not only are there objections to putting the operating lease assets on anyone’s balance sheet, there are lots of objections to subjectivity and the data required for the valuation. According to Seamus, there is huge opposition from New York bankers, the airlines, the EU, the Communist Party of China (since it impacts their exporting business), and Republicans (hearing complaints from small and large businesses). Even if everyone can agree on the proposed changes, 2013 might be the earliest that companies would need to change how they report leases. The Boards will finish their deliberations in April, May or June 2011. As we’ve seen with other Exposure Drafts, if the changes are minor and the principles met the General Acceptance consensus criteria, the Standard could be finalized at that time. However, if substantial changes are made, a fresh exposure draft, comment period, and review period might be involved, too. Seamus added an interesting perspective. Even if the proposed changes do pass, don’t you think our customers, such as Boeing, GE Capital, United Airlines, etc. will be clever enough to come up with a new kind of financing arrangement that complies with the new accounting? How about the large retail customers, such as Best Buy and Macerich? Don’t you think they might simply cut deals around retail locations with new contracts that prevent their leases from being capital leases? Instead of blindly adapting the software to meet the principles outlined in the final standard, our software needs to accommodate how businesses will respond to the new rules. We cannot know our customers’ responses until the rules are finalized. Oracle is aware of the potential changes and is staying abreast of the developments through our domain expertise staff, our relationship with customers, our market awareness, and, of course, our relationships with the Big 4. This is part of our normal process with respect to worldwide regulatory compliance. Oracle products have been IFRS and GAAP compliant for years and we will continue to maintain those standards going forward.

Read the article
When is a SQL function not a function?

- by Rob Farley

Should SQL Server even have functions? (Oh yeah – this is a T-SQL Tuesday post, hosted this month by Brad Schulz) Functions serve an important part of programming, in almost any language. A function is a piece of code that is designed to return something, as opposed to a piece of code which isn’t designed to return anything (which is known as a procedure). SQL Server is no different. You can call stored procedures, even from within other stored procedures, and you can call functions and use these in other queries. Stored procedures might query something, and therefore ‘return data’, but a function in SQL is considered to have the type of the thing returned, and can be used accordingly in queries. Consider the internal GETDATE() function. SELECT GETDATE(), SomeDatetimeColumn FROM dbo.SomeTable; There’s no logical difference between the field that is being returned by the function and the field that’s being returned by the table column. Both are the datetime field – if you didn’t have inside knowledge, you wouldn’t necessarily be able to tell which was which. And so as developers, we find ourselves wanting to create functions that return all kinds of things – functions which look up values based on codes, functions which do string manipulation, and so on. But it’s rubbish. Ok, it’s not all rubbish, but it mostly is. And this isn’t even considering the SARGability impact. It’s far more significant than that. (When I say the SARGability aspect, I mean “because you’re unlikely to have an index on the result of some function that’s applied to a column, so try to invert the function and query the column in an unchanged manner”) I’m going to consider the three main types of user-defined functions in SQL Server: Scalar Inline Table-Valued Multi-statement Table-Valued I could also look at user-defined CLR functions, including aggregate functions, but not today. I figure that most people don’t tend to get around to doing CLR functions, and I’m going to focus on the T-SQL-based user-defined functions. Most people split these types of function up into two types. So do I. Except that most people pick them based on ‘scalar or table-valued’. I’d rather go with ‘inline or not’. If it’s not inline, it’s rubbish. It really is. Let’s start by considering the two kinds of table-valued function, and compare them. These functions are going to return the sales for a particular salesperson in a particular year, from the AdventureWorks database. CREATE FUNCTION dbo.FetchSales_inline(@salespersonid int, @orderyear int) RETURNS TABLE AS RETURN ( SELECT e.LoginID as EmployeeLogin, o.OrderDate, o.SalesOrderID FROM Sales.SalesOrderHeader AS o LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = o.SalesPersonID WHERE o.SalesPersonID = @salespersonid AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101') AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101') ) ; GO CREATE FUNCTION dbo.FetchSales_multi(@salespersonid int, @orderyear int) RETURNS @results TABLE ( EmployeeLogin nvarchar(512), OrderDate datetime, SalesOrderID int ) AS BEGIN INSERT @results (EmployeeLogin, OrderDate, SalesOrderID) SELECT e.LoginID, o.OrderDate, o.SalesOrderID FROM Sales.SalesOrderHeader AS o LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = o.SalesPersonID WHERE o.SalesPersonID = @salespersonid AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101') AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101') ; RETURN END ; GO You’ll notice that I’m being nice and responsible with the use of the DATEADD function, so that I have SARGability on the OrderDate filter. Regular readers will be hoping I’ll show what’s going on in the execution plans here. Here I’ve run two SELECT * queries with the “Show Actual Execution Plan” option turned on. Notice that the ‘Query cost’ of the multi-statement version is just 2% of the ‘Batch cost’. But also notice there’s trickery going on. And it’s nothing to do with that extra index that I have on the OrderDate column. Trickery. Look at it – clearly, the first plan is showing us what’s going on inside the function, but the second one isn’t. The second one is blindly running the function, and then scanning the results. There’s a Sequence operator which is calling the TVF operator, and then calling a Table Scan to get the results of that function for the SELECT operator. But surely it still has to do all the work that the first one is doing... To see what’s actually going on, let’s look at the Estimated plan. Now, we see the same plans (almost) that we saw in the Actuals, but we have an extra one – the one that was used for the TVF. Here’s where we see the inner workings of it. You’ll probably recognise the right-hand side of the TVF’s plan as looking very similar to the first plan – but it’s now being called by a stack of other operators, including an INSERT statement to be able to populate the table variable that the multi-statement TVF requires. And the cost of the TVF is 57% of the batch! But it gets worse. Let’s consider what happens if we don’t need all the columns. We’ll leave out the EmployeeLogin column. Here, we see that the inline function call has been simplified down. It doesn’t need the Employee table. The join is redundant and has been eliminated from the plan, making it even cheaper. But the multi-statement plan runs the whole thing as before, only removing the extra column when the Table Scan is performed. A multi-statement function is a lot more powerful than an inline one. An inline function can only be the result of a single sub-query. It’s essentially the same as a parameterised view, because views demonstrate this same behaviour of extracting the definition of the view and using it in the outer query. A multi-statement function is clearly more powerful because it can contain far more complex logic. But a multi-statement function isn’t really a function at all. It’s a stored procedure. It’s wrapped up like a function, but behaves like a stored procedure. It would be completely unreasonable to expect that a stored procedure could be simplified down to recognise that not all the columns might be needed, but yet this is part of the pain associated with this procedural function situation. The biggest clue that a multi-statement function is more like a stored procedure than a function is the “BEGIN” and “END” statements that surround the code. If you try to create a multi-statement function without these statements, you’ll get an error – they are very much required. When I used to present on this kind of thing, I even used to call it “The Dangers of BEGIN and END”, and yes, I’ve written about this type of thing before in a similarly-named post over at my old blog. Now how about scalar functions... Suppose we wanted a scalar function to return the count of these. CREATE FUNCTION dbo.FetchSales_scalar(@salespersonid int, @orderyear int) RETURNS int AS BEGIN RETURN ( SELECT COUNT(*) FROM Sales.SalesOrderHeader AS o LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = o.SalesPersonID WHERE o.SalesPersonID = @salespersonid AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101') AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101') ); END ; GO Notice the evil words? They’re required. Try to remove them, you just get an error. That’s right – any scalar function is procedural, despite the fact that you wrap up a sub-query inside that RETURN statement. It’s as ugly as anything. Hopefully this will change in future versions. Let’s have a look at how this is reflected in an execution plan. Here’s a query, its Actual plan, and its Estimated plan: SELECT e.LoginID, y.year, dbo.FetchSales_scalar(p.SalesPersonID, y.year) AS NumSales FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year) CROSS JOIN Sales.SalesPerson AS p LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = p.SalesPersonID; We see here that the cost of the scalar function is about twice that of the outer query. Nicely, the query optimizer has worked out that it doesn’t need the Employee table, but that’s a bit of a red herring here. There’s actually something way more significant going on. If I look at the properties of that UDF operator, it tells me that the Estimated Subtree Cost is 0.337999. If I just run the query SELECT dbo.FetchSales_scalar(281,2003); we see that the UDF cost is still unchanged. You see, this 0.0337999 is the cost of running the scalar function ONCE. But when we ran that query with the CROSS JOIN in it, we returned quite a few rows. 68 in fact. Could’ve been a lot more, if we’d had more salespeople or more years. And so we come to the biggest problem. This procedure (I don’t want to call it a function) is getting called 68 times – each one between twice as expensive as the outer query. And because it’s calling it in a separate context, there is even more overhead that I haven’t considered here. The cheek of it, to say that the Compute Scalar operator here costs 0%! I know a number of IT projects that could’ve used that kind of costing method, but that’s another story that I’m not going to go into here. Let’s look at a better way. Suppose our scalar function had been implemented as an inline one. Then it could have been expanded out like a sub-query. It could’ve run something like this: SELECT e.LoginID, y.year, (SELECT COUNT(*) FROM Sales.SalesOrderHeader AS o LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = o.SalesPersonID WHERE o.SalesPersonID = p.SalesPersonID AND o.OrderDate >= DATEADD(year,y.year-2000,'20000101') AND o.OrderDate < DATEADD(year,y.year-2000+1,'20000101') ) AS NumSales FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year) CROSS JOIN Sales.SalesPerson AS p LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = p.SalesPersonID; Don’t worry too much about the Scan of the SalesOrderHeader underneath a Nested Loop. If you remember from plenty of other posts on the matter, execution plans don’t push the data through. That Scan only runs once. The Index Spool sucks the data out of it and populates a structure that is used to feed the Stream Aggregate. The Index Spool operator gets called 68 times, but the Scan only once (the Number of Executions property demonstrates this). Here, the Query Optimizer has a full picture of what’s being asked, and can make the appropriate decision about how it accesses the data. It can simplify it down properly. To get this kind of behaviour from a function, we need it to be inline. But without inline scalar functions, we need to make our function be table-valued. Luckily, that’s ok. CREATE FUNCTION dbo.FetchSales_inline2(@salespersonid int, @orderyear int) RETURNS table AS RETURN (SELECT COUNT(*) as NumSales FROM Sales.SalesOrderHeader AS o LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = o.SalesPersonID WHERE o.SalesPersonID = @salespersonid AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101') AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101') ); GO But we can’t use this as a scalar. Instead, we need to use it with the APPLY operator. SELECT e.LoginID, y.year, n.NumSales FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year) CROSS JOIN Sales.SalesPerson AS p LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = p.SalesPersonID OUTER APPLY dbo.FetchSales_inline2(p.SalesPersonID, y.year) AS n; And now, we get the plan that we want for this query. All we’ve done is tell the function that it’s returning a table instead of a single value, and removed the BEGIN and END statements. We’ve had to name the column being returned, but what we’ve gained is an actual inline simplifiable function. And if we wanted it to return multiple columns, it could do that too. I really consider this function to be superior to the scalar function in every way. It does need to be handled differently in the outer query, but in many ways it’s a more elegant method there too. The function calls can be put amongst the FROM clause, where they can then be used in the WHERE or GROUP BY clauses without fear of calling the function multiple times (another horrible side effect of functions). So please. If you see BEGIN and END in a function, remember it’s not really a function, it’s a procedure. And then fix it. @rob_farley

Read the article
West Wind WebSurge - an easy way to Load Test Web Applications

- by Rick Strahl

A few months ago on a project the subject of load testing came up. We were having some serious issues with a Web application that would start spewing SQL lock errors under somewhat heavy load. These sort of errors can be tough to catch, precisely because they only occur under load and not during typical development testing. To replicate this error more reliably we needed to put a load on the application and run it for a while before these SQL errors would flare up. It’s been a while since I’d looked at load testing tools, so I spent a bit of time looking at different tools and frankly didn’t really find anything that was a good fit. A lot of tools were either a pain to use, didn’t have the basic features I needed, or are extravagantly expensive. In the end I got frustrated enough to build an initially small custom load test solution that then morphed into a more generic library, then gained a console front end and eventually turned into a full blown Web load testing tool that is now called West Wind WebSurge. I got seriously frustrated looking for tools every time I needed some quick and dirty load testing for an application. If my aim is to just put an application under heavy enough load to find a scalability problem in code, or to simply try and push an application to its limits on the hardware it’s running I shouldn’t have to have to struggle to set up tests. It should be easy enough to get going in a few minutes, so that the testing can be set up quickly so that it can be done on a regular basis without a lot of hassle. And that was the goal when I started to build out my initial custom load tester into a more widely usable tool. If you’re in a hurry and you want to check it out, you can find more information and download links here: West Wind WebSurge Product Page Walk through Video Download link (zip) Install from Chocolatey Source on GitHub For a more detailed discussion of the why’s and how’s and some background continue reading. How did I get here? When I started out on this path, I wasn’t planning on building a tool like this myself – but I got frustrated enough looking at what’s out there to think that I can do better than what’s available for the most common simple load testing scenarios. When we ran into the SQL lock problems I mentioned, I started looking around what’s available for Web load testing solutions that would work for our whole team which consisted of a few developers and a couple of IT guys both of which needed to be able to run the tests. It had been a while since I looked at tools and I figured that by now there should be some good solutions out there, but as it turns out I didn’t really find anything that fit our relatively simple needs without costing an arm and a leg… I spent the better part of a day installing and trying various load testing tools and to be frank most of them were either terrible at what they do, incredibly unfriendly to use, used some terminology I couldn’t even parse, or were extremely expensive (and I mean in the ‘sell your liver’ range of expensive). Pick your poison. There are also a number of online solutions for load testing and they actually looked more promising, but those wouldn’t work well for our scenario as the application is running inside of a private VPN with no outside access into the VPN. Most of those online solutions also ended up being very pricey as well – presumably because of the bandwidth required to test over the open Web can be enormous. When I asked around on Twitter what people were using– I got mostly… crickets. Several people mentioned Visual Studio Load Test, and most other suggestions pointed to online solutions. I did get a bunch of responses though with people asking to let them know what I found – apparently I’m not alone when it comes to finding load testing tools that are effective and easy to use. As to Visual Studio, the higher end skus of Visual Studio and the test edition include a Web load testing tool, which is quite powerful, but there are a number of issues with that: First it’s tied to Visual Studio so it’s not very portable – you need a VS install. I also find the test setup and terminology used by the VS test runner extremely confusing. Heck, it’s complicated enough that there’s even a Pluralsight course on using the Visual Studio Web test from Steve Smith. And of course you need to have one of the high end Visual Studio Skus, and those are mucho Dinero ($$$) – just for the load testing that’s rarely an option. Some of the tools are ultra extensive and let you run analysis tools on the target serves which is useful, but in most cases – just plain overkill and only distracts from what I tend to be ultimately interested in: Reproducing problems that occur at high load, and finding the upper limits and ‘what if’ scenarios as load is ramped up increasingly against a site. Yes it’s useful to have Web app instrumentation, but often that’s not what you’re interested in. I still fondly remember early days of Web testing when Microsoft had the WAST (Web Application Stress Tool) tool, which was rather simple – and also somewhat limited – but easily allowed you to create stress tests very quickly. It had some serious limitations (mainly that it didn’t work with SSL), but the idea behind it was excellent: Create tests quickly and easily and provide a decent engine to run it locally with minimal setup. You could get set up and run tests within a few minutes. Unfortunately, that tool died a quiet death as so many of Microsoft’s tools that probably were built by an intern and then abandoned, even though there was a lot of potential and it was actually fairly widely used. Eventually the tools was no longer downloadable and now it simply doesn’t work anymore on higher end hardware. West Wind Web Surge – Making Load Testing Quick and Easy So I ended up creating West Wind WebSurge out of rebellious frustration… The goal of WebSurge is to make it drop dead simple to create load tests. It’s super easy to capture sessions either using the built in capture tool (big props to Eric Lawrence, Telerik and FiddlerCore which made that piece a snap), using the full version of Fiddler and exporting sessions, or by manually or programmatically creating text files based on plain HTTP headers to create requests. I’ve been using this tool for 4 months now on a regular basis on various projects as a reality check for performance and scalability and it’s worked extremely well for finding small performance issues. I also use it regularly as a simple URL tester, as it allows me to quickly enter a URL plus headers and content and test that URL and its results along with the ability to easily save one or more of those URLs. A few weeks back I made a walk through video that goes over most of the features of WebSurge in some detail: Note that the UI has slightly changed since then, so there are some UI improvements. Most notably the test results screen has been updated recently to a different layout and to provide more information about each URL in a session at a glance. The video and the main WebSurge site has a lot of info of basic operations. For the rest of this post I’ll talk about a few deeper aspects that may be of interest while also giving a glance at how WebSurge works. Session Capturing As you would expect, WebSurge works with Sessions of Urls that are played back under load. Here’s what the main Session View looks like: You can create session entries manually by individually adding URLs to test (on the Request tab on the right) and saving them, or you can capture output from Web Browsers, Windows Desktop applications that call services, your own applications using the built in Capture tool. With this tool you can capture anything HTTP -SSL requests and content from Web pages, AJAX calls, SOAP or REST services – again anything that uses Windows or .NET HTTP APIs. Behind the scenes the capture tool uses FiddlerCore so basically anything you can capture with Fiddler you can also capture with Web Surge Session capture tool. Alternately you can actually use Fiddler as well, and then export the captured Fiddler trace to a file, which can then be imported into WebSurge. This is a nice way to let somebody capture session without having to actually install WebSurge or for your customers to provide an exact playback scenario for a given set of URLs that cause a problem perhaps. Note that not all applications work with Fiddler’s proxy unless you configure a proxy. For example, .NET Web applications that make HTTP calls usually don’t show up in Fiddler by default. For those .NET applications you can explicitly override proxy settings to capture those requests to service calls. The capture tool also has handy optional filters that allow you to filter by domain, to help block out noise that you typically don’t want to include in your requests. For example, if your pages include links to CDNs, or Google Analytics or social links you typically don’t want to include those in your load test, so by capturing just from a specific domain you are guaranteed content from only that one domain. Additionally you can provide url filters in the configuration file – filters allow to provide filter strings that if contained in a url will cause requests to be ignored. Again this is useful if you don’t filter by domain but you want to filter out things like static image, css and script files etc. Often you’re not interested in the load characteristics of these static and usually cached resources as they just add noise to tests and often skew the overall url performance results. In my testing I tend to care only about my dynamic requests. SSL Captures require Fiddler Note, that in order to capture SSL requests you’ll have to install the Fiddler’s SSL certificate. The easiest way to do this is to install Fiddler and use its SSL configuration options to get the certificate into the local certificate store. There’s a document on the Telerik site that provides the exact steps to get SSL captures to work with Fiddler and therefore with WebSurge. Session Storage A group of URLs entered or captured make up a Session. Sessions can be saved and restored easily as they use a very simple text format that simply stored on disk. The format is slightly customized HTTP header traces separated by a separator line. The headers are standard HTTP headers except that the full URL instead of just the domain relative path is stored as part of the 1st HTTP header line for easier parsing. Because it’s just text and uses the same format that Fiddler uses for exports, it’s super easy to create Sessions by hand manually or under program control writing out to a simple text file. You can see what this format looks like in the Capture window figure above – the raw captured format is also what’s stored to disk and what WebSurge parses from. The only ‘custom’ part of these headers is that 1st line contains the full URL instead of the domain relative path and Host: header. The rest of each header are just plain standard HTTP headers with each individual URL isolated by a separator line. The format used here also uses what Fiddler produces for exports, so it’s easy to exchange or view data either in Fiddler or WebSurge. Urls can also be edited interactively so you can modify the headers easily as well: Again – it’s just plain HTTP headers so anything you can do with HTTP can be added here. Use it for single URL Testing Incidentally I’ve also found this form as an excellent way to test and replay individual URLs for simple non-load testing purposes. Because you can capture a single or many URLs and store them on disk, this also provides a nice HTTP playground where you can record URLs with their headers, and fire them one at a time or as a session and see results immediately. It’s actually an easy way for REST presentations and I find the simple UI flow actually easier than using Fiddler natively. Finally you can save one or more URLs as a session for later retrieval. I’m using this more and more for simple URL checks. Overriding Cookies and Domains Speaking of HTTP headers – you can also overwrite cookies used as part of the options. One thing that happens with modern Web applications is that you have session cookies in use for authorization. These cookies tend to expire at some point which would invalidate a test. Using the Options dialog you can actually override the cookie: which replaces the cookie for all requests with the cookie value specified here. You can capture a valid cookie from a manual HTTP request in your browser and then paste into the cookie field, to replace the existing Cookie with the new one that is now valid. Likewise you can easily replace the domain so if you captured urls on west-wind.com and now you want to test on localhost you can do that easily easily as well. You could even do something like capture on store.west-wind.com and then test on localhost/store which would also work. Running Load Tests Once you’ve created a Session you can specify the length of the test in seconds, and specify the number of simultaneous threads to run each session on. Sessions run through each of the URLs in the session sequentially by default. One option in the options list above is that you can also randomize the URLs so each thread runs requests in a different order. This avoids bunching up URLs initially when tests start as all threads run the same requests simultaneously which can sometimes skew the results of the first few minutes of a test. While sessions run some progress information is displayed: By default there’s a live view of requests displayed in a Console-like window. On the bottom of the window there’s a running total summary that displays where you’re at in the test, how many requests have been processed and what the requests per second count is currently for all requests. Note that for tests that run over a thousand requests a second it’s a good idea to turn off the console display. While the console display is nice to see that something is happening and also gives you slight idea what’s happening with actual requests, once a lot of requests are processed, this UI updating actually adds a lot of CPU overhead to the application which may cause the actual load generated to be reduced. If you are running a 1000 requests a second there’s not much to see anyway as requests roll by way too fast to see individual lines anyway. If you look on the options panel, there is a NoProgressEvents option that disables the console display. Note that the summary display is still updated approximately once a second so you can always tell that the test is still running. Test Results When the test is done you get a simple Results display: On the right you get an overall summary as well as breakdown by each URL in the session. Both success and failures are highlighted so it’s easy to see what’s breaking in your load test. The report can be printed or you can also open the HTML document in your default Web Browser for printing to PDF or saving the HTML document to disk. The list on the right shows you a partial list of the URLs that were fired so you can look in detail at the request and response data. The list can be filtered by success and failure requests. Each list is partial only (at the moment) and limited to a max of 1000 items in order to render reasonably quickly. Each item in the list can be clicked to see the full request and response data: This particularly useful for errors so you can quickly see and copy what request data was used and in the case of a GET request you can also just click the link to quickly jump to the page. For non-GET requests you can find the URL in the Session list, and use the context menu to Test the URL as configured including any HTTP content data to send. You get to see the full HTTP request and response as well as a link in the Request header to go visit the actual page. Not so useful for a POST as above, but definitely useful for GET requests. Finally you can also get a few charts. The most useful one is probably the Request per Second chart which can be accessed from the Charts menu or shortcut. Here’s what it looks like: Results can also be exported to JSON, XML and HTML. Keep in mind that these files can get very large rather quickly though, so exports can end up taking a while to complete. Command Line Interface WebSurge runs with a small core load engine and this engine is plugged into the front end application I’ve shown so far. There’s also a command line interface available to run WebSurge from the Windows command prompt. Using the command line you can run tests for either an individual URL (similar to AB.exe for example) or a full Session file. By default when it runs WebSurgeCli shows progress every second showing total request count, failures and the requests per second for the entire test. A silent option can turn off this progress display and display only the results. The command line interface can be useful for build integration which allows checking for failures perhaps or hitting a specific requests per second count etc. It’s also nice to use this as quick and dirty URL test facility similar to the way you’d use Apache Bench (ab.exe). Unlike ab.exe though, WebSurgeCli supports SSL and makes it much easier to create multi-URL tests using either manual editing or the WebSurge UI. Current Status Currently West Wind WebSurge is still in Beta status. I’m still adding small new features and tweaking the UI in an attempt to make it as easy and self-explanatory as possible to run. Documentation for the UI and specialty features is also still a work in progress. I plan on open-sourcing this product, but it won’t be free. There’s a free version available that provides a limited number of threads and request URLs to run. A relatively low cost license removes the thread and request limitations. Pricing info can be found on the Web site – there’s an introductory price which is $99 at the moment which I think is reasonable compared to most other for pay solutions out there that are exorbitant by comparison… The reason code is not available yet is – well, the UI portion of the app is a bit embarrassing in its current monolithic state. The UI started as a very simple interface originally that later got a lot more complex – yeah, that never happens, right? Unless there’s a lot of interest I don’t foresee re-writing the UI entirely (which would be ideal), but in the meantime at least some cleanup is required before I dare to publish it :-). The code will likely be released with version 1.0. I’m very interested in feedback. Do you think this could be useful to you and provide value over other tools you may or may not have used before? I hope so – it already has provided a ton of value for me and the work I do that made the development worthwhile at this point. You can leave a comment below, or for more extensive discussions you can post a message on the West Wind Message Board in the WebSurge section Microsoft MVPs and Insiders get a free License If you’re a Microsoft MVP or a Microsoft Insider you can get a full license for free. Send me a link to your current, official Microsoft profile and I’ll send you a not-for resale license. Send any messages to [email protected]. Resources For more info on WebSurge and to download it to try it out, use the following links. West Wind WebSurge Home Download West Wind WebSurge Getting Started with West Wind WebSurge Video© Rick Strahl, West Wind Technologies, 2005-2014Posted in ASP.NET Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
Will these optimizations to my Ruby implementation of diff improve performance in a Rails app?

- by grg-n-sox

<tl;dr> In source version control diff patch generation, would it be worth it to use the optimizations listed at the very bottom of this writing (see <optimizations>) in my Ruby implementation of diff for making diff patches? </tl;dr> <introduction> I am programming something I have never done before and there might already be tools out there to do the exact thing I am programming but at this point I am having too much fun to care so I am still going to do it from scratch, even if there is a tool for this. So anyways, I am working on a Ruby on Rails app and need a certain feature. Basically I want each entry in a table of mine, let's say for example a table of video games, to have a stored chunk of text that represents a review or something of the sort for that table entry. However, I want this text to be both editable by any registered user and also keep track of different submissions in a version control system. The simplest solution I could think of is just implement a solution that keeps track of the text body and the diff patch history of different versions of the text body as objects in Ruby and then serialize it, preferably in human readable form (so I'll most likely use YAML for this) for editing if needed due to corruption by a software bug or a mistake is made by an admin doing some version editing. So at first I just tried to dive in head first into this feature to find that the problem of generating a diff patch is more difficult that I thought to do efficiently. So I did some research and came across some ideas. Some I have implemented already and some I have not. However, it all pretty much revolves around the longest common subsequence problem, as you would already know if you have already done anything with diff or diff-like features, and optimization the function that solves it. Currently I have it so it truncates the compared versions of the text body from the beginning and end until non-matching lines are found. Then it solves the problem using a comparison matrix, but instead of incrementing the value stored in a cell when it finds a matching line like in most longest common subsequence algorithms I have seen examples of, I increment when I have a non-matching line so as to calculate edit distance instead of longest common subsequence. Although as far as I can tell between the two approaches, they are essentially two sides of the same coin so either could be used to derive an answer. It then back-traces through the comparison matrix and notes when there was an incrementation and in which adjacent cell (West, Northwest, or North) to determine that line's diff entry and assumes all other lines to be unchanged. Normally I would leave it at that, but since this is going into a Rails environment and not just some stand-alone Ruby script, I started getting worried about needing to optimize at least enough so if a spammer that somehow knew how I implemented the version control system and knew my worst case scenario entry still wouldn't be able to hit the server that bad. After some searching and reading of research papers and articles through the internet, I've come across several that seem decent but all seem to have pros and cons and I am having a hard time deciding how well in this situation that the pros and cons balance out. So are the ones listed here worth it? I have listed them with known pros and cons. </introduction> <optimizations> Chop the compared sequences into multiple chucks of subsequences by splitting where lines are unchanged, and then truncating each section of unchanged lines at the beginning and end of each section. Then solve the edit distance of each subsequence. Pro: Changes the time increase as the changed area gets bigger from a quadratic increase to something more similar to a linear increase. Con: Figuring out where to split already seems like you have to solve edit distance except now you don't care how it is changed. Would be fine if this was solvable by a process closer to solving hamming distance but a single insertion would throw this off. Use a cryptographic hash function to both convert all sequence elements into integers and ensure uniqueness. Then solve the edit distance comparing the hash integers instead of the sequence elements themselves. Pro: The operation of comparing two integers is faster than the operation of comparing two strings, so a slight performance gain is received after every comparison, which can be a lot overall. Con: Using a cryptographic hash function takes time to convert all the sequence elements and may end up costing more time to do the conversion that you gain back from the integer comparisons. You could use the built in hash function for a string but that will not guarantee uniqueness. Use lazy evaluation to only calculate the three center-most diagonals of the comparison matrix and then only calculate additional diagonals as needed. And then also use this approach to possibly remove the need on some comparisons to compare all three adjacent cells as desribed here. Pro: Can turn an algorithm that always takes O(n * m) time and make it so only worst case scenario is that time, best case becomes practically linear, and average case is somewhere between the two. Con: It is an algorithm I've only seen implemented in functional programming languages and I am having a difficult time comprehending how to convert this into Ruby based on how it is described at the site linked to above. Make a C module and do the hard work at the native level in C and just make a Ruby wrapper for it so Ruby can make all the calls to it that it needs. Pro: I have to imagine that evaluating something like this in could be a LOT faster. Con: I have no idea how Rails handles apps with ruby code that has C extensions and it hurts the portability of the app. This is an optimization for after the solving of edit distance, but idea is to store additional combined diffs with the ones produced by each version to make a delta-tree data structure with the most recently made diff as the root node of the tree so getting to any version takes worst case time of O(log n) instead of O(n). Pro: Would make going back to an old version a lot faster. Con: It would mean every new commit, the delta-tree would get a new root node that will cost time to reorganize the delta-tree for an operation that will be carried out a lot more often than going back a version, not to mention the unlikelihood it will be an old version. </optimizations> So are these things worth the effort?

Read the article
How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article
client problems - misaligned expectations & not following SDLC protocols

- by louism

hi guys, im having some serious problems with a client on a project - i could use some advice please the short version i have been working with this client now for almost 6 months without any problems (a classified website project in the range of 500 hours) over the last few days things have drastically deteriorated to the point where ive had to place the project on-hold whilst i work-out what to do (this has pissed the client off even more) to be simplistic, the root cause of the issue is this: the client doesnt read the specs i make for him, i code the feature, he than wants to change things, i tell him its not to the agreed spec and that that change will have to be postponed and possibly charged for, he gets upset and rants saying 'hes paid for the feature' and im not keeping to the agreement (<- misalignment of expectations) i think the root cause of the root cause is my clients failure to take my SDLC protocols seriously. i have a bug tracking system in place which he practically refuses to use (he still emails me bugs), he doesnt seem to care to much for the protocols i use for dealing with scope creep and change control the whole situation came to a head recently where he 'cracked it' (an aussie term for being fed-up). the more terms like 'postponed for post-launch implementation', 'costed feature addition', and 'not to agreed spec' i kept using, the worse it got finally, he began to bully me - basically insisting i shut-up and do the work im being paid for. i wrote a long-winded email explaining how wrong he was on all these different points, and explaining what all the SDLC protocols do to protect the success of the project. than i deleted that email and wrote a new one in the new email, i suggested as a solution i write up a list of grievances we both had. we than review the list and compromise on different points: he gets some things he wants, i get some things i want. sometimes youve got to give ground to get ground his response to this suggestion was flat-out refusal, and a restatement that i should just get on with the work ive been paid to do so there you have the very subjective short version. if you have the time and inclination, the long version may be a little less bias as it has the email communiques between me and my client the long version (with background) the long version works by me showing you the email communiques which lead to the situation coming to a head. so here it is, judge for yourself where the trouble started... 1. client asked me why something was missing from a feature i just uploaded, my response was to show him what was in the spec: it basically said the item he was looking for was never going to be included 2. [clients response...] Memo Louis, We are following your own title fields and keeping a consistent layout. Why the big fuss about not adding "Part". It simply replaces "model" and is consistent with your current title fields. 3. [my response...] hi [client], the 'part' field appeared to me as a redundancy / mistake. i requested clarification but never received any in a timely manner (about 2 weeks ago) the specification for this feature also indicated it wasnt going to be included: RE: "Why the big fuss about not adding "Part" " it may not appear so, but it would actually be a lot of work for me to now add a 'Part' field it could take me up to 15-20 minutes to properly explain why its such a big undertaking to do this, but i would prefer to use that time instead to work on completing your v1.1 features as a simplistic explanation - it connects to the change in paradigm from a 'generic classified ad' model to a 'specific attributes for specific categories' model basically, i am saying it is a big fuss, but i understand that it doesnt look that way - after all, it is just one ity-bitty field :) if you require a fuller explanation, please let me know and i will commit the time needed to write that out also, if you recall when we first started on the project, i said that with the effort/time required for features, you would likely not know off the top of your head. you may think something is really complex, but in reality its quite simple, you might think something is easy - but it could actually be a massive trauma to code (which is the case here with the 'Part' field). if you also recalled, i said the best course of action is to just ask, and i would let you know on a case-by-case basis 4. [email from me to client...] hi [client], the online catalogue page is now up live (see my email from a few days ago for information on how it works) note: the window of opportunity for input/revisions on what data the catalogue stores has now closed (as i have put the code up live now) RE: the UI/layout of the online catalogue page you may still do visual/ui tweaks to the page at the moment (this window for input/revisions will close in a couple of days time) 5. [email from client to me...] *(note: i had put up the feature & asked the client to review it, never heard back from them for a few days)* Memo Louis, Here you go again. CLOSED without a word of input from the customer. I don't think so. I will reply tomorrow regarding the content and functionality we require from this feature. 5. [from me to client...] hi [client]: RE: from my understanding, you are saying that the mini-sale yard control would change itself based on the fact someone was viewing for parts & accessories <- is that correct? this change is outside the scope of the v1.1 mini-spec and therefore will need to wait 'til post launch for costing/implementation 6. [email from client to me...] Memo Louis, Following your v1.1 mini-spec and all your time paid in full for the work selected. We need to make the situation clear. There will be no further items held for post-launch. Do not expect us to pay for any further items other than those we have agreed upon. You have undertaken to complete the Parts and accessories feature as follows. Obviously, as part of this process the "mini search" will be effected, and will require "adaption to make sense". 7. [email from me to client...] hi [client], RE: "There will be no further items held for post-launch. Do not expect us to pay for any further items other than those we have agreed upon." a few points to consider: 1) the specification for the 'parts & accessories' feature was as follows: (i.e. [what] "...we have agreed upon.") 2) you have received the 'parts & accessories' feature free of charge (you have paid $0 for it). ive spent two days coding that feature as a gesture of good will i would request that you please consider these two facts carefully and sincerely 8. [email from client to me...] Memo Louis, I don't see how you are giving us anything for free. From your original fee proposal you have deleted more than 30 hours of included features. Your title "shelved features". Further you have charged us twice by adding back into the site, at an addition cost, some of those "shelved features" features. See v1.1 mini-spec. Did include in your original fee proposal a change request budget but then charge without discussion items included in v1.1 mini-spec. Included a further Features test plan for a regression test, a fee of 10 hours that would not have been required if the "shelved features" were not left out of the agreed fee proposal. I have made every attempt to satisfy your your uneven business sense by offering you everything your heart desired, in the v1.1 mini-spec, to be left once again with your attitude of "its too hard, lets leave it for post launch". I am no longer accepting anything less than what we have contracted you to do. That is clearly defined in v1.1 mini-spec, and you are paid in advance for delivering those items as an acceptable function. a few notes about the above email... i had to cull features from the original spec because it didnt fit into the budget. i explained this to the client at the start of the project (he wanted more features than he had budget hours to do them all) nothing has been charged for twice, i didnt charge the client for culled features. im charging him to now do those culled features the draft version of the project schedule included a change request budget of 10 hours, but i had to remove that to meet the budget (the client may not have been aware of this to be fair to them) what the client refers to as my attitude of 'too hard/leave it for post-launch', i called a change request protocol and a method for keeping scope creep under control 9. [email from me to client...] hi [client], RE: "...all your grievances..." i had originally written out a long email response; it was fantastic, it had all these great points of how 'you were wrong' and 'i was right', you would of loved it (and by 'loved it', i mean it would of just infuriated you more) so, i decided to deleted it start over, for two reasons: 1) a long email is being disrespectful of your time (youre a busy businessman with things to do) 2) whos wrong or right gets us no closer to fixing the problems we are experiencing what i propose is this... i prepare a bullet point list of your grievances and my grievances (yes, im unhappy too about how things are going - and it has little to do with money) i submit this list to you for you to add to as necessary we then both take a good hard look at this list, and we decide which areas we are willing to give ground on as an example, the list may look something like this: "louis, you keep taking away features you said you would do" [your grievance 2] [your grievance 3] [your grievance ...] "[client], i feel you dont properly read the specs i prepare for you..." [my grievance 2] [my grievance 3] [my grievance ...] if you are willing to give this a try, let me know will it work? who knows. but if it doesnt, we can always go back to arguing some more :) obviously, this will only work if you are willing to give it a genuine try, and you can accept that you may have to 'give some ground to get some ground' what do you think? 10. [email from client to me ...] Memo Louis, Instead of wasting your time listing grievances, I would prefer you complete the items in v1.1 mini-spec, to a satisfactory conclusion. We almost had the website ready for launch until you brought the v1.1 mini-spec into the frame. Obviously I expected you could complete the v1.1 mini-spec in a two-week time frame as you indicated and give the site a more profession presentation. Most of the problems have been caused by you not following our instructions, but deciding to do what you feel like at the time. And then arguing with us how the missing information is not necessary. For instance "Parts and Accessories". Why on earth would you leave out the parts heading, when it ties-in with the fields you have already developed. It replaces "model" and is just as important in the context of information that appears in the "Details" panel. We are at a stage where the the v1.1 mini-spec needs to be completed without further time wasting and the site is complete (subject to all features working). We are on standby at this end to do just that. Let me know when you are back, working on the site and we will process and complete each v1.1 mini-spec, item by item, until the job is complete. 11. [last email from me to client...] hi [client], based on this reply, and your demonstrated unwillingness to compromise/give any ground on issues at hand, i have decided to place your project on-hold for the moment i will be considering further options on how to over-come our challenges over the next few days i will contact you by monday 17/may to discuss any new options i have come up with, and if i believe it is appropriate to restart work on your project at that point or not told you it was long... what do you think?

Read the article

Search Results

Search found 100 results on 4 pages for 'costing'.

Page 4/4 | < Previous Page | 1 2 3 4

- by TORr0t

- by marienbad

- by tonyrogerson

- by Mike Stiles

- by Tony Davis

- by Michael Dorgan

- by wai

- by wai

- by James

- by BuckWoody

- by BuckWoody

- by Mike.Hallett(at)Oracle-BI&EPM

- by Software Guy

- by normstorm

- by Hubert Kario

- by Jim Mcglothlin

- by tonyrogerson

- by Michael Stephenson

- by BuckWoody

- by Theresa Hickman

- by Rob Farley

- by Rick Strahl

- by grg-n-sox

- by rchrd

- by louism

< Previous Page | 1 2 3 4