Search Results

Search found 37647 results on 1506 pages for 'sql performance'.


sql query selecting one name no matter how many rows it was mentioned in

- by Baruch

Basically, what I'm trying to do is get the information from column x no matter how many times it was mentioned. That is, if I have this kind of table:

x     | y     | z
------+-------+--------
hello | one   | bye
hello | two   | goodbye
hi    | three | see you

what I'm trying to do is create a query that gets all of the names mentioned in the x column, without duplicates, and puts them into a select list. My goal is a select list with TWO options, not THREE: hello and hi.

This is what I have so far, which isn't working. Hope you guys know the answer to that:

function getList(){
    $options="<select id='names' style='margin-right:40px;'>";
    $c_id = $_SESSION['id'];
    $sql="SELECT * FROM names";
    $result=mysql_query($sql);
    $options.="<option value='blank'>-- Select something --</option>" ;
    while ($row=mysql_fetch_array($result)) {
        $name=$row["x"];
        $options.="<option value='$name'>$name</option>";
    }
    $options.= "</SELECT>";
    return "$options";
}

Sorry for the confusion... I edited my source.
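
One way to get each name only once is to let the database remove the duplicates with SELECT DISTINCT. The sketch below is only an illustration in Python, using an in-memory SQLite table as a stand-in for the asker's MySQL table; the helper name distinct_names is made up for the example, and the same query text should carry over to the PHP code in the question.

```python
import sqlite3

def distinct_names(conn):
    """Return each value of column x once, in alphabetical order."""
    rows = conn.execute("SELECT DISTINCT x FROM names ORDER BY x")
    return [row[0] for row in rows]

# Throwaway in-memory table mirroring the sample data in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (x TEXT, y TEXT, z TEXT)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?)",
    [("hello", "one", "bye"), ("hello", "two", "goodbye"), ("hi", "three", "see you")],
)
print(distinct_names(conn))  # ['hello', 'hi']
```

With the duplicates removed in the query, the loop that builds the option tags can stay as it is.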

Java - Can i have a faster performance for this loop ?

- by Brad

I am reading a book and deleting a number of words from it. My problem is that the process takes a long time, and I want to make it faster. Example:

Vector<String> pages = new Vector<String>(); // Contains about 1500 pages, each page has about 1000 words.
Vector<String> wordsToDelete = new Vector<String>(); // Contains about 50000 words.

for( String page: pages ) {
    String pageInLowCase = page.toLowerCase();
    for( String wordToDelete: wordsToDelete ) {
        if( pageInLowCase.contains( wordToDelete ) )
            page = page.replaceAll( "(?i)\\b" + wordToDelete + "\\b" , "" );
    }
    // Do some stuff with the final page that does not take much time.
}

This code takes around 3 minutes to execute. If I skip the replaceAll(...) loop, I save more than 2 minutes. So is there a way to do the same loop faster?
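
The loop in the question scans every page once per word to delete, so the work grows with pages times words. A common alternative is to walk each page once and look every word up in a hash set. The sketch below shows that idea in Python rather than Java; it assumes each entry in the delete list is a single word made of letters, digits or underscores, and the name delete_words is invented for the example.

```python
import re

# One compiled pattern that matches whole words.
WORD = re.compile(r"\b\w+\b")

def delete_words(page, words_to_delete):
    """Remove every whole word whose lowercase form is in words_to_delete."""
    doomed = {w.lower() for w in words_to_delete}  # set lookup is O(1) on average
    return WORD.sub(
        lambda m: "" if m.group(0).lower() in doomed else m.group(0),
        page,
    )

print(delete_words("The quick brown Fox", ["quick", "fox"]))
```

As in the original loop, deleted words are replaced by an empty string, so the surrounding spaces are left in place.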

Manipulating data in sql / asp.net / c# - how?

- by SLC

Not sure how to word the question... Basically, so far all my SQL stuff has been stored procedures dumped into a gridview. The odd cases where I had to perform an action based on a value (such as highlighting a row green if a certain value was true) were done as the gridview was rendering, in one of the overrides.

Now, however, I have to do something far more complicated: pull three sets of data down, run a series of checks on all three plus some date-related checks, then populate a gridview with some of the items. In logic terms, I want to run three queries, store the lists of results (presumably in Lists?), run some logic, then populate the gridview.

Specifically, what I don't know how to do is:

Best way of pulling the data and putting it into a List or other data structure that lets me easily run through it and retrieve data based on column (myList.age, or more likely, myList["Age"]).

Once I have compared the data, I assume I create a new list that will be put into the gridview... how do I put the contents of a list INTO a gridview? How would I add other stuff such as buttons or checkboxes at the same time?

Any nudge in the right direction would be appreciated! Particularly doing cool stuff with lists and SQL (if there is anything cool you can do with them).

INSERT 2000 records into SQL Database all at a time from C# .NET code

- by padmavathi

We need to INSERT 2000 records into a SQL DB from C# .NET code. Is there any way to INSERT all 2000 records at once instead of executing an INSERT query for each record? Also, what would the performance impact of doing this be? Thanks & Regards, Padma
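
The general idea behind "all at once" is to send the rows as one batch inside a single transaction instead of committing 2000 separate statements. The sketch below only illustrates that pattern in Python against an in-memory SQLite database; the table name records, its columns and the helper insert_batch are assumptions made for the example, not SQL Server or .NET APIs.

```python
import sqlite3

def insert_batch(conn, rows):
    """Insert all rows in one transaction using one parameterized statement."""
    with conn:  # commits once at the end, or rolls back if an insert fails
        conn.executemany("INSERT INTO records (id, payload) VALUES (?, ?)", rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
insert_batch(conn, [(i, f"row {i}") for i in range(2000)])
print(conn.execute("SELECT COUNT(*) FROM records").fetchone()[0])  # 2000
```

A single commit usually removes most of the per-row overhead; how much is saved depends on the database and driver and would need to be measured.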

Why could "insert (...) values (...)" not insert a new row?

- by nang

Hi, I have a simple SQL insert statement of the form:

insert into MyTable (...) values (...)

It is used repeatedly to insert rows and usually works as expected. It inserts exactly 1 row into MyTable, which is also the value returned by the Delphi statement AffectedRows:= myInsertADOQuery.ExecSQL.

After some time there was a temporary network connectivity problem. As a result, other threads of the same application perceived EOleExceptions (Connection failure, -2147467259 = unspecified error). Later, the network connection was reestablished, these threads reconnected and were fine. The thread responsible for executing the insert statement described above, however, did not perceive the connectivity problems (no exceptions) - probably it was simply not executed while the network was down. But after the network connectivity problems, myInsertADOQuery.ExecSQL always returned 0 and no rows were inserted into MyTable anymore. After a restart of the application the insert statement worked again as expected.

For SQL Server, is there any defined case where an insert statement like the one above would not insert a row and return 0 as the number of affected rows? The primary key is an autogenerated GUID. There are no unique or check constraints (which should result in an exception anyway rather than not inserting a row). Are there any known ADO bugs (Provider=SQLOLEDB.1)? Any other explanations for this behaviour?

Thanks, Nang.

My first blog post…

- by steveh99999

I’ve been meaning to start a blog for a while now (OK, for several years...). Finally, here it begins.

First post, something really simple: a wise man once told me about the best way to improve SQL Server performance. Store Less Data. That's it.. that's all there is to it...

Over the years, I've seen the following:

- a 200GB database which held 3 days' data. Once business requirements changed, we were able to hold only 1 day's data in this database.
- a table developed by DBAs to hold application table cardinality information. That information was collected at 2-hour intervals every day for 7 years! After 7 years the DBA space-info table had become the largest table in the database: 60 million rows! It was a simple change to remove a lot of the historical intra-day data and change the schedule to run only once per evening. Suddenly that table held 6 million rows instead of 60 million....
- lots of backup and restore history held in msdb. See this post by Brent Ozar for more details on this issue.

Imagine how much faster the backups, DBCC checks and reindexes ran when the above 3 changes were implemented.

How often do you review your big databases \ tables to see if you’re actually holding only data that is really required by the business?

Fixing Robocopy for SQL Server Jobs

- by Most Valuable Yak (Rob Volk)

Robocopy is one of, if not the, best life-saving/greatest-thing-since-sliced-bread command line utilities ever to come from Microsoft. If you're not using it already, what are you waiting for? Of course, being a Microsoft product, it's not exactly perfect. ;) Specifically, it sets the ERRORLEVEL to a non-zero value even if the copy is successful. This causes a problem in SQL Server job steps, since non-zero ERRORLEVELs report as failed. You can work around this by having your SQL job go to the next step on failure, but then you can't determine if there was a genuine error. Plus you still see annoying red X's in your job history.

One way I've found to avoid this is to use a batch file that runs Robocopy, and I add some commands after it:

robocopy d:\backups \\BackupServer\BackupFolder *.bak
rem suppress successful robocopy exit statuses, only report genuine errors (bitmask 16 and 8 settings)
set/A errlev="%ERRORLEVEL% & 24"
rem exit batch file with errorlevel so SQL job can succeed or fail appropriately
exit/B %errlev%

(The REM statements are simply comments and don't need to be included in the batch file.)

The SET command lets you use expressions when you use the /A switch. So I set an environment variable "errlev" to a bitwise AND with the ERRORLEVEL value. Robocopy's exit codes use a bitmap/bitmask to specify its exit status. The bits for 1, 2, and 4 do not indicate any kind of failure, but 8 and 16 do. So by adding 16 + 8 to get 24, and doing a bitwise AND, I suppress any of the other bits that might be set, and allow either or both of the error bits to pass.

The next step is to use the EXIT command with the /B switch to set a new ERRORLEVEL value, using the "errlev" variable. This will now return zero (unless Robocopy had real errors) and allow your SQL job step to report success. This technique should also work for other command-line utilities.

The only issue I've found is that it requires the commands to be part of a batch file, so if you use Robocopy directly in your SQL job step you'd need to place it in a batch. If you also have multiple Robocopy calls, you'll need to place the SET/A command ONLY after the last one. You'd therefore lose any errors from previous calls, unless you use multiple "errlev" variables and OR them together. (I'll leave this as an exercise for the reader.)

The SET/A syntax also permits other kinds of expressions to be calculated. You can get a full list by running "SET /?" on a command prompt.
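
The masking arithmetic is easy to check outside a batch file. The following is a small Python sketch of the same calculation, not part of the original batch approach; the names job_exit_code and combine are made up for the illustration. Combining several calls is shown with a bitwise OR, so that a failure bit from any one call survives.

```python
ROBOCOPY_ERROR_BITS = 8 | 16  # 24: the two bits the post treats as genuine failures

def job_exit_code(robocopy_exit_code):
    """Keep only the failure bits, mirroring: set/A errlev="%ERRORLEVEL% & 24"."""
    return robocopy_exit_code & ROBOCOPY_ERROR_BITS

def combine(exit_codes):
    """Merge the masked results of several Robocopy calls with bitwise OR."""
    merged = 0
    for code in exit_codes:
        merged |= job_exit_code(code)
    return merged

for code in (0, 1, 3, 7, 8, 9, 16):
    print(code, "->", job_exit_code(code))
```

Codes built only from bits 1, 2 and 4 come out as 0, while anything containing 8 or 16 stays non-zero and would still fail the job step.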

Using linked servers, OPENROWSET and OPENQUERY

- by BuckWoody

SQL Server has a few mechanisms to reach out to another server (even another server type) and query data from within a Transact-SQL statement. Among them are a set of stored credentials and information (called a Linked Server), a statement that uses a linked server called OPENQUERY, another called OPENROWSET, and one called OPENDATASOURCE. This post isn’t about those particular functions or statements – hit the links for more if you’re new to those topics. I’m actually more concerned about where I see these used than the particular method.

In many cases, a Linked Server isn’t another Relational Database Management System (RDBMS) like Oracle or DB2 (which is possible with a linked server), but another SQL Server. My concern is that linked servers are the new Data Transformation Services (DTS) from SQL Server 2000 – something that was designed for one purpose but which is being morphed into something much more. In the case of DTS, most of us turned that feature into a full-fledged job system. What was designed as a simple data import and export system has been pressed into service doing logic, routing and timing. And of course we all know how painful it was to move off of a complex DTS system onto SQL Server Integration Services.

In the case of linked servers, what should be used as a method of running a simple query or two on another server, where you have an occasional connection or need a quick import of a small data set, is morphing into a full federation strategy. In some cases I’ve seen a complex web of linked servers, and when credentials, names or anything else changes there are huge problems.

Now don’t get me wrong – linked servers and other forms of distributing queries are a fantastic set of tools that we have to move data around. I’m just saying that when you start having lots of workarounds and when things get really complicated, you might want to step back a little and ask if there’s a better way. Are you able to tolerate some latency? Perhaps you’re able to use Service Broker. Would you like to be platform-independent on the data source? Perhaps a middle tier might make more sense, abstracting the queries there and sending them to the proper server. Designed properly, I’ve seen these systems scale further and be more resilient than loading up on linked servers.

SQL Saturday #162 Cambridge

- by Most Valuable Yak (Rob Volk)

Despite the efforts of American Airlines, this past weekend I attended the first SQL Saturday in the UK! Hosted by the SQLCambs Chapter of PASS and organized by Mark & Lorraine Broadbent, ably assisted by John Martin, Mark Pryce-Maher and other folks whose names I've unfortunately forgotten, it was held at the Crowne Plaza Hotel, which is completely surrounded by Cambridge University.

On Friday, they presented 3 pre-conference sessions given by the brilliant American Cloud & DBA Guru, Buck Woody, the brilliant Danish SQL Server Internals Guru, Mark Rasmussen, and the brilliant Scottish Business Intelligence Guru and recent Outstanding PASS Volunteer, Jen Stirrup. While I would have loved to attend any of their pre-cons (having seen them present several times already), finances and American Airlines ultimately made that impossible. But not to worry, I caught up with them during the regular sessions and at the speaker dinner. And I got back the money they all owed me. (Actually I owed Mark some money.)

The schedule was jam-packed even with only 4 tracks: there were 8 regular slots, a lunch session for sponsor presentations, and a 15 minute keynote given by Buck Woody, who besides giving an excellent history of SQL Server at Microsoft (and before), also explained the source of the "unknown contact" image that appears in Outlook. Hint: it's not Buck himself.

Amazingly, and against all better judgment, I even got to present at SQL Saturday 162! I did a 5 minute Lightning Talk on Regular Expressions in SSMS. I then did a regular 50 minute session on Constraints. You can download the content for the regular session at that link, and for the regular expression presentation here. I had a great time and had a great audience for both of my sessions.

You would never have guessed this was the first event for the organizers; everything went very smoothly, especially for the number of attendees and the relative smallness of the space. 
The event sponsors also deserve a lot of credit for making themselves fit in a small area and for staying through the entire event until the giveaways at the very end. Overall this was one of the best SQL Saturdays I've ever attended and I have to congratulate Mark B, Lorraine, John, Mark P-M, and all the volunteers and speakers for making this an astoundingly hard act to follow! Well done!

Create Outlook Appointments from PowerShell

- by BuckWoody

I've been toying around with a script to create a special set of calendar objects in Outlook that show when my SQL Server Agent Jobs are scheduled to run. I haven't finished yet, but I thought I would share the part that creates the Outlook Appointments. I have yet to fill a variable with the start and end times, and then loop through that to create the appointments. I'm thinking I'll make the script below into a function, and feed it those variables in a loop.

The script below creates a whole new Calendar Folder in Outlook called "SQL Server Agent Jobs". I also use categories quite a bit, so you'll see that too.

Caution: If you plan to play with this script, do it on an isolated workstation, not on your "regular" Outlook calendar. Otherwise, you'll have lots of appointments in there that you don't care about!

# Add a new calendar item to a new Outlook folder called "SQL Server Agent Jobs"
$outlook = new-object -com Outlook.Application
$calendar = $outlook.Session.folders.Item(1).Folders.Item("SQL Server Agent Jobs")
$appt = $calendar.Items.Add(1) # == olAppointmentItem
$appt.Start = [datetime]"03/11/2010 11:00"
$appt.End = [datetime]"03/11/2010 12:00"
$appt.Subject = "JobName"
$appt.Location = "ServerName"
$appt.Body = "Job Details"
$appt.Categories = "SQL server Agent Job"
$appt.Save()

Script Disclaimer, for people who need to be told this sort of thing: Never trust any script, including those that you find here, until you understand exactly what it does and how it will act on your systems. Always check the script on a test system or Virtual Machine, not a production system. All scripts on this site are performed by a professional stunt driver on a closed course. Your mileage may vary. Void where prohibited. Offer good for a limited time only. Keep out of reach of small children. Do not operate heavy machinery while using this script. If you experience blurry vision, indigestion or diarrhea during the operation of this script, see a physician immediately.

SQL Server Memory Manager Changes in Denali

- by SQLOS Team

The next version of SQL Server will contain significant changes to the memory manager component. The memory manager component has been rewritten for Denali. In the previous versions of SQL Server there were two distinct memory managers. There was one memory manager which handled allocation sizes of 8k or less and another for greater than 8k. For Denali there will be one memory manager for all allocation sizes.

The majority of the changes will be transparent to the end user. However, some changes will be visible to the user. These are listed below:

· The ‘max server memory’ configuration option has new lower limits. Specifically, 32-bit versions of SQL Server will have a lower limit of 64 MB. The 64-bit versions will have a lower limit of 128 MB.
· All memory allocations by SQL Server components will observe the ‘max server memory’ configuration option. In previous SQL versions only the 8k allocations were limited by the ‘max server memory’ configuration option. Allocations larger than 8k weren’t constrained.
· DMVs which refer to memory manager internals have been modified. This includes adding or removing columns and changing column names.
· The memory manager configuration messages in the error log have minor changes.
· DBCC memorystatus output has been changed.
· Address Windowing Extensions (AWE) has been deprecated.

In the next blog post I will discuss the changes to the memory manager DMVs in greater detail. In future blog posts I will discuss the other changes in greater detail.

Sensible Way to Pass Web Data in XML to a SQL Server Database

- by Emtucifor

After exploring several different ways to pass web data to a database for update purposes, I'm wondering if XML might be a good strategy. The database is currently SQL 2000. In a few months it will move to SQL 2005 and I will be able to change things if needed, but I need a SQL 2000 solution now. First of all, the database in question uses the EAV model. I know that this kind of database is generally highly frowned on, so for the purposes of this question, please just accept that this is not going to change. The current update method has the web server inserting values (that have all been converted first to their correct underlying types, then to sql_variant) to a temp table. A stored procedure is then run which expects the temp table to exist and it takes care of updating, inserting, or deleting things as needed. So far, only a single element has needed to be updated at a time. But now, there is a requirement to be able to edit multiple elements at once, and also to support hierarchical elements, each of which can have its own list of attributes. Here's some example XML I hand-typed to demonstrate what I'm thinking of. Note that in this database the Entity is Element and an ID of 0 signifies "create" aka an insert of a new item. 

<Elements>
  <Element ID="1234">
    <Attr ID="221">Value</Attr>
    <Attr ID="225">287</Attr>
    <Attr ID="234">
      <Element ID="99825">
        <Attr ID="7">Value1</Attr>
        <Attr ID="8">Value2</Attr>
        <Attr ID="9" Action="delete" />
      </Element>
      <Element ID="99826" Action="delete" />
      <Element ID="0" Type="24">
        <Attr ID="7">Value4</Attr>
        <Attr ID="8">Value5</Attr>
        <Attr ID="9">Value6</Attr>
      </Element>
      <Element ID="0" Type="24">
        <Attr ID="7">Value7</Attr>
        <Attr ID="8">Value8</Attr>
        <Attr ID="9">Value9</Attr>
      </Element>
    </Attr>
    <Rel ID="3827" Action="delete" />
    <Rel ID="2284" Role="parent">
      <Element ID="3827" />
      <Element ID="3829" />
      <Attr ID="665">1</Attr>
    </Rel>
    <Rel ID="0" Type="23" Role="child">
      <Element ID="3830" />
      <Attr ID="67"
    </Rel>
  </Element>
  <Element ID="0" Type="87">
    <Attr ID="221">Value</Attr>
    <Attr ID="225">569</Attr>
    <Attr ID="234">
      <Element ID="0" Type="24">
        <Attr ID="7">Value10</Attr>
        <Attr ID="8">Value11</Attr>
        <Attr ID="9">Value12</Attr>
      </Element>
    </Attr>
  </Element>
  <Element ID="1235" Action="delete" />
</Elements>

Some Attributes are straight value types, such as AttrID 221. But AttrID 234 is a special "multi-value" type that can have a list of elements underneath it, and each one can have one or more values. Types only need to be presented when a new item is created, since the ElementID fully implies the type if it already exists. I'll probably support only passing in changed items (as detected by javascript). And there may be an Action="Delete" on Attr elements as well, since NULLs are treated as "unselected"--sometimes it's very important to know if a Yes/No question has intentionally been answered No or if no one's bothered to say Yes yet.

There is also a different kind of data, a Relationship. At this time, those are updated through individual AJAX calls as things are edited in the UI, but I'd like to include those so that changes to relationships can be canceled (right now, once you change it, it's done). 
So those are really elements, too, but they are called Rel instead of Element. Relationships are implemented as ElementID1 and ElementID2, so the RelID 2284 in the XML above is in the database as: ElementID 2284 ElementID1 1234 ElementID2 3827 Having multiple children in one relationship isn't currently supported, but it would be nice later. Does this strategy and the example XML make sense? Is there a more sensible way? I'm just looking for some broad critique to help save me from going down a bad path. Any aspect that you'd like to comment on would be helpful. The web language happens to be Classic ASP, but that could change to ASP.Net at some point. A persistence engine like Linq or nHibernate is probably not acceptable right now--I just want to get this already working application enhanced without a huge amount of development time. I'll choose the answer that shows experience and has a balance of good warnings about what not to do, confirmations of what I'm planning to do, and recommendations about something else to do. I'll make it as objective as possible. P.S. I'd like to handle unicode characters as well as very long strings (10k +). UPDATE I have had this working for some time and I used the ADO Recordset Save-To-Stream trick to make creating the XML really easy. The result seems to be fairly fast, though if speed ever becomes a problem I may revisit this. In the meantime, my code works to handle any number of elements and attributes on the page at once, including updating, deleting, and creating new items all in one go. I settled on a scheme like so for all my elements: Existing data elements Example: input name e12345_a678 (element 12345, attribute 678), the input value is the value of the attribute. New elements Javascript copies a hidden template of the set of HTML elements needed for the type into the correct location on the page, increments a counter to get a new ID for this item, and prepends the number to the names of the form items. 

var newid = 0;

function metadataAdd(reference, nameid, value) {
    var t = document.createElement('input');
    t.setAttribute('name', nameid);
    t.setAttribute('id', nameid);
    t.setAttribute('type', 'hidden');
    t.setAttribute('value', value);
    reference.appendChild(t);
}

function multiAdd(target, parentelementid, attrid, elementtypeid) {
    var proto = document.getElementById('a' + attrid + '_proto');
    var instance = document.createElement('p');
    target.parentNode.parentNode.insertBefore(instance, target.parentNode);
    var thisid = ++newid;
    instance.innerHTML = proto.innerHTML.replace(/{prefix}/g, 'n' + thisid + '_');
    instance.id = 'n' + thisid;
    instance.className += ' new';
    metadataAdd(instance, 'n' + thisid + '_p', parentelementid);
    metadataAdd(instance, 'n' + thisid + '_c', attrid);
    metadataAdd(instance, 'n' + thisid + '_t', elementtypeid);
    return false;
}

Example: Template input name _a678 becomes n1_a678 (a new element, the first one on the page, attribute 678). All attributes of this new element are tagged with the same prefix of n1. The next new item will be n2, and so on. Some hidden form inputs are created:

n1_t, value is the elementtype of the element to be created
n1_p, value is the parent id of the element (if it is a relationship)
n1_c, value is the child id of the element (if it is a relationship)

Deleting elements

A hidden input is created in the form e12345_t with value set to 0. The existing controls displaying that attribute's values are disabled so they are not included in the form post. So "set type to 0" is treated as delete.

With this scheme, every item on the page has a unique name and can be distinguished properly, and every action can be represented properly.

When the form is posted, here's a sample of building one of the two recordsets used (classic ASP code):

Set Data = Server.CreateObject("ADODB.Recordset")
Data.Fields.Append "ElementID", adInteger, 4, adFldKeyColumn
Data.Fields.Append "AttrID", adInteger, 4, adFldKeyColumn
Data.Fields.Append "Value", adLongVarWChar, 2147483647, adFldIsNullable Or adFldMayBeNull
Data.CursorLocation = adUseClient
Data.CursorType = adOpenDynamic
Data.Open

This is the recordset for values, the other is for the elements themselves. I step through the posted form and for the element recordset use a Scripting.Dictionary populated with instances of a custom Class that has the properties I need, so that I can add the values piecemeal, since they don't always come in order. New elements are added as negative to distinguish them from regular elements (rather than requiring a separate column to indicate if it is new or addresses an existing element). I use a regular expression to tear apart the form keys:

"^(e|n)([0-9]{1,10})_(a|p|t|c)([0-9]{0,10})$"

Then, adding an attribute looks like this.

Data.AddNew
ElementID.Value = DataID
AttrID.Value = Integerize(Matches(0).SubMatches(3))
AttrValue.Value = Request.Form(Key)
Data.Update

ElementID, AttrID, and AttrValue are references to the fields of the recordset. This method is hugely faster than using Data.Fields("ElementID").Value each time. I loop through the Dictionary of element updates and ignore any that don't have all the proper information, adding the good ones to the recordset.

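
To make the key format concrete, here is a rough Python sketch that applies the same regular expression to a few form keys. It is not the author's ASP code: the function name parse_form_key and the returned tuple layout are invented for the illustration, and the negative ID for new elements follows the convention described in the text.

```python
import re

# Same pattern as quoted in the post.
FORM_KEY = re.compile(r"^(e|n)([0-9]{1,10})_(a|p|t|c)([0-9]{0,10})$")

def parse_form_key(key):
    """Split a posted form key into (kind, element_id, field, attr_id), or None."""
    m = FORM_KEY.match(key)
    if m is None:
        return None
    kind, element_id, field, attr_id = m.groups()
    # New elements ("n" prefix) are stored with a negative ID.
    signed_id = -int(element_id) if kind == "n" else int(element_id)
    return kind, signed_id, field, int(attr_id) if attr_id else None

for key in ("e12345_a678", "n1_a678", "n1_t", "bogus"):
    print(key, "->", parse_form_key(key))
```

Keys that do not match, such as unrelated form fields, simply return None and can be skipped.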
Then I call my data-updating stored procedure like so:

Set Cmd = Server.CreateObject("ADODB.Command")
With Cmd
    Set .ActiveConnection = MyDBConn
    .CommandType = adCmdStoredProc
    .CommandText = "DataPost"
    .Prepared = False
    .Parameters.Append .CreateParameter("@ElementMetadata", adLongVarWChar, adParamInput, 2147483647, XMLFromRecordset(Element))
    .Parameters.Append .CreateParameter("@ElementData", adLongVarWChar, adParamInput, 2147483647, XMLFromRecordset(Data))
End With
Result.Open Cmd ' previously created recordset object with options set

Here's the function that does the xml conversion:

Private Function XMLFromRecordset(Recordset)
    Dim Stream
    Set Stream = Server.CreateObject("ADODB.Stream")
    Stream.Open
    Recordset.Save Stream, adPersistXML
    Stream.Position = 0
    XMLFromRecordset = Stream.ReadText
End Function

Just in case the web page needs to know, the SP returns a recordset of any new elements, showing their page value and their created value (so I can see that n1 is now e12346 for example). Here are some key snippets from the stored procedure.

Note this is SQL 2000 for now, though I'll be able to switch to 2005 soon:

CREATE PROCEDURE [dbo].[DataPost]
    @ElementMetaData ntext,
    @ElementData ntext
AS
DECLARE @hdoc int
--- snip ---
EXEC sp_xml_preparedocument @hdoc OUTPUT, @ElementMetaData, '<xml xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" />'

INSERT #ElementMetadata (ElementID, ElementTypeID, ElementID1, ElementID2)
SELECT *
FROM OPENXML(@hdoc, '/xml/rs:data/rs:insert/z:row', 0)
WITH (
    ElementID int,
    ElementTypeID int,
    ElementID1 int,
    ElementID2 int
)
ORDER BY ElementID -- orders negative items (new elements) first so they begin counting at 1 for later ID calculation

EXEC sp_xml_removedocument @hdoc
--- snip ---
UPDATE E
SET E.ElementTypeID = M.ElementTypeID
FROM Element E
INNER JOIN #ElementMetadata M ON E.ElementID = M.ElementID
WHERE E.ElementID >= 1 AND M.ElementTypeID >= 1

The following query does the correlation of the negative new element ids to the newly inserted ones:

UPDATE #ElementMetadata -- Correlate the new ElementIDs with the input rows
SET NewElementID = Scope_Identity() - @@RowCount + DataID
WHERE ElementID < 0

Other set-based queries do all the other work of validating that the attributes are allowed, are the correct data type, and inserting, updating, and deleting elements and attributes.

I hope this brief run-down is useful to others some day! Converting ADO Recordsets to an XML stream was a huge winner for me as it saved all sorts of time and had a namespace and schema already defined that made the results come out correctly. Using a flatter XML format with 2 inputs was also much easier than sticking to some ideal about having everything in a single XML stream.
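
The correlation formula Scope_Identity() - @@RowCount + DataID is plain arithmetic, and a worked example may help. The Python sketch below assumes, as the technique itself does, that the batch of new rows received contiguous identity values in DataID order; that ordering is an assumption of the approach rather than something shown in the post. The function name and the sample numbers are made up.

```python
def correlate_new_ids(last_identity, row_count):
    """Map each 1-based input position (DataID) to its new identity value.

    Mirrors: Scope_Identity() - @@RowCount + DataID
    """
    return {
        data_id: last_identity - row_count + data_id
        for data_id in range(1, row_count + 1)
    }

# Three new rows were inserted and the last identity handed out was 12348.
print(correlate_new_ids(last_identity=12348, row_count=3))
# {1: 12346, 2: 12347, 3: 12348}
```

With these sample numbers the first new row maps to 12346, the second to 12347 and the third to 12348.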

Read the article
Sensible Way to Pass Web Data to Sql Server Database

- by Emtucifor

After exploring several different ways to pass web data to a database for update purposes, I'm wondering if XML might be a good strategy. The database is currently SQL 2000. In a few months it will move to SQL 2005 and I will be able to change things if needed, but I need a SQL 2000 solution now. First of all, the database in question uses the EAV model. I know that this kind of database is generally highly frowned on, so for the purposes of this question, please just accept that this is not going to change. The current update method has the web server inserting values (that have all been converted first to their correct underlying types, then to sql_variant) to a temp table. A stored procedure is then run which expects the temp table to exist and it takes care of updating, inserting, or deleting things as needed. So far, only a single element has needed to be updated at a time. But now, there is a requirement to be able to edit multiple elements at once, and also to support hierarchical elements, each of which can have its own list of attributes. Here's some example XML I hand-typed to demonstrate what I'm thinking of. Note that in this database the Entity is Element and an ID of 0 signifies "create" aka an insert of a new item. 
<Elements> <Element ID="1234"> <Attr ID="221">Value</Attr> <Attr ID="225">287</Attr> <Attr ID="234"> <Element ID="99825"> <Attr ID="7">Value1</Attr> <Attr ID="8">Value2</Attr> <Attr ID="9" Action="delete" /> </Element> <Element ID="99826" Action="delete" /> <Element ID="0" Type="24"> <Attr ID="7">Value4</Attr> <Attr ID="8">Value5</Attr> <Attr ID="9">Value6</Attr> </Element> <Element ID="0" Type="24"> <Attr ID="7">Value7</Attr> <Attr ID="8">Value8</Attr> <Attr ID="9">Value9</Attr> </Element> </Attr> <Rel ID="3827" Action="delete" /> <Rel ID="2284" Role="parent"> <Element ID="3827" /> <Element ID="3829" /> <Attr ID="665">1</Attr> </Rel> <Rel ID="0" Type="23" Role="child"> <Element ID="3830" /> <Attr ID="67" </Rel> </Element> <Element ID="0" Type="87"> <Attr ID="221">Value</Attr> <Attr ID="225">569</Attr> <Attr ID="234"> <Element ID="0" Type="24"> <Attr ID="7">Value10</Attr> <Attr ID="8">Value11</Attr> <Attr ID="9">Value12</Attr> </Element> </Attr> </Element> <Element ID="1235" Action="delete" /> </Elements> Some Attributes are straight value types, such as AttrID 221. But AttrID 234 is a special "multi-value" type that can have a list of elements underneath it, and each one can have one or more values. Types only need to be presented when a new item is created, since the ElementID fully implies the type if it already exists. I'll probably support only passing in changed items (as detected by javascript). And there may be an Action="Delete" on Attr elements as well, since NULLs are treated as "unselected"--sometimes it's very important to know if a Yes/No question has intentionally been answered No or if no one's bothered to say Yes yet. There is also a different kind of data, a Relationship. At this time, those are updated through individual AJAX calls as things are edited in the UI, but I'd like to include those so that changes to relationships can be canceled (right now, once you change it, it's done). 
So those are really elements, too, but they are called Rel instead of Element. Relationships are implemented as ElementID1 and ElementID2, so the RelID 2284 in the XML above is in the database as: ElementID 2284 ElementID1 1234 ElementID2 3827 Having multiple children in one relationship isn't currently supported, but it would be nice later. Does this strategy and the example XML make sense? Is there a more sensible way? I'm just looking for some broad critique to help save me from going down a bad path. Any aspect that you'd like to comment on would be helpful. The web language happens to be Classic ASP, but that could change to ASP.Net at some point. A persistence engine like Linq or nHibernate is probably not acceptable right now--I just want to get this already working application enhanced without a huge amount of development time. I'll choose the answer that shows experience and has a balance of good warnings about what not to do, confirmations of what I'm planning to do, and recommendations about something else to do. I'll make it as objective as possible. P.S. I'd like to handle unicode characters as well as very long strings (10k +). UPDATE I have had this working for some time and I used the ADO Recordset Save-To-Stream trick to make creating the XML really easy. The result seems to be fairly fast, though if speed ever becomes a problem I may revisit this. In the meantime, my code works to handle any number of elements and attributes on the page at once, including updating, deleting, and creating new items all in one go. I settled on a scheme like so for all my elements: Existing data elements Example: input name e12345_a678 (element 12345, attribute 678), the input value is the value of the attribute. New elements Javascript copies a hidden template of the set of HTML elements needed for the type into the correct location on the page, increments a counter to get a new ID for this item, and prepends the number to the names of the form items. 
    var newid = 0;

    function metadataAdd(reference, nameid, value) {
        var t = document.createElement('input');
        t.setAttribute('name', nameid);
        t.setAttribute('id', nameid);
        t.setAttribute('type', 'hidden');
        t.setAttribute('value', value);
        reference.appendChild(t);
    }

    function multiAdd(target, parentelementid, attrid, elementtypeid) {
        var proto = document.getElementById('a' + attrid + '_proto');
        var instance = document.createElement('p');
        target.parentNode.parentNode.insertBefore(instance, target.parentNode);
        var thisid = ++newid;
        instance.innerHTML = proto.innerHTML.replace(/{prefix}/g, 'n' + thisid + '_');
        instance.id = 'n' + thisid;
        instance.className += ' new';
        metadataAdd(instance, 'n' + thisid + '_p', parentelementid);
        metadataAdd(instance, 'n' + thisid + '_c', attrid);
        metadataAdd(instance, 'n' + thisid + '_t', elementtypeid);
        return false;
    }

Example: Template input name _a678 becomes n1_a678 (a new element, the first one on the page, attribute 678). All attributes of this new element are tagged with the same prefix of n1. The next new item will be n2, and so on. Some hidden form inputs are created:

- n1_t, value is the elementtype of the element to be created
- n1_p, value is the parent id of the element (if it is a relationship)
- n1_c, value is the child id of the element (if it is a relationship)

Deleting elements

A hidden input is created in the form e12345_t with value set to 0. The existing controls displaying that attribute's values are disabled so they are not included in the form post. So "set type to 0" is treated as delete.

With this scheme, every item on the page has a unique name and can be distinguished properly, and every action can be represented properly.
When the form is posted, here's a sample of building one of the two recordsets used (classic ASP code):

    Set Data = Server.CreateObject("ADODB.Recordset")
    Data.Fields.Append "ElementID", adInteger, 4, adFldKeyColumn
    Data.Fields.Append "AttrID", adInteger, 4, adFldKeyColumn
    Data.Fields.Append "Value", adLongVarWChar, 2147483647, adFldIsNullable Or adFldMayBeNull
    Data.CursorLocation = adUseClient
    Data.CursorType = adOpenDynamic
    Data.Open

This is the recordset for values; the other is for the elements themselves. I step through the posted form and, for the element recordset, use a Scripting.Dictionary populated with instances of a custom Class that has the properties I need, so that I can add the values piecemeal, since they don't always come in order. New elements are added with negative IDs to distinguish them from regular elements (rather than requiring a separate column to indicate whether a row is new or addresses an existing element). I use a regular expression to tear apart the form keys:

    "^(e|n)([0-9]{1,10})_(a|p|t|c)([0-9]{0,10})$"

Then, adding an attribute looks like this:

    Data.AddNew
    ElementID.Value = DataID
    AttrID.Value = Integerize(Matches(0).SubMatches(3))
    AttrValue.Value = Request.Form(Key)
    Data.Update

ElementID, AttrID, and AttrValue are references to the fields of the recordset. This method is hugely faster than using Data.Fields("ElementID").Value each time. I loop through the Dictionary of element updates and ignore any that don't have all the proper information, adding the good ones to the recordset.
Then I call my data-updating stored procedure like so:

    Set Cmd = Server.CreateObject("ADODB.Command")
    With Cmd
        Set .ActiveConnection = MyDBConn
        .CommandType = adCmdStoredProc
        .CommandText = "DataPost"
        .Prepared = False
        .Parameters.Append .CreateParameter("@ElementMetadata", adLongVarWChar, adParamInput, 2147483647, XMLFromRecordset(Element))
        .Parameters.Append .CreateParameter("@ElementData", adLongVarWChar, adParamInput, 2147483647, XMLFromRecordset(Data))
    End With
    Result.Open Cmd ' previously created recordset object with options set

Here's the function that does the xml conversion:

    Private Function XMLFromRecordset(Recordset)
        Dim Stream
        Set Stream = Server.CreateObject("ADODB.Stream")
        Stream.Open
        Recordset.Save Stream, adPersistXML
        Stream.Position = 0
        XMLFromRecordset = Stream.ReadText
    End Function

Just in case the web page needs to know, the SP returns a recordset of any new elements, showing their page value and their created value (so I can see that n1 is now e12346 for example). Here are some key snippets from the stored procedure.
Note this is SQL 2000 for now, though I'll be able to switch to 2005 soon:

    CREATE PROCEDURE [dbo].[DataPost]
        @ElementMetaData ntext,
        @ElementData ntext
    AS
    DECLARE @hdoc int

    --- snip ---

    EXEC sp_xml_preparedocument @hdoc OUTPUT, @ElementMetaData, '<xml xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" />'

    INSERT #ElementMetadata (ElementID, ElementTypeID, ElementID1, ElementID2)
    SELECT *
    FROM OPENXML(@hdoc, '/xml/rs:data/rs:insert/z:row', 0)
    WITH (
        ElementID int,
        ElementTypeID int,
        ElementID1 int,
        ElementID2 int
    )
    ORDER BY ElementID -- orders negative items (new elements) first so they begin counting at 1 for later ID calculation

    EXEC sp_xml_removedocument @hdoc

    --- snip ---

    UPDATE E
    SET E.ElementTypeID = M.ElementTypeID
    FROM Element E
        INNER JOIN #ElementMetadata M ON E.ElementID = M.ElementID
    WHERE E.ElementID >= 1
        AND M.ElementTypeID >= 1

The following query does the correlation of the negative new element ids to the newly inserted ones:

    UPDATE #ElementMetadata -- Correlate the new ElementIDs with the input rows
    SET NewElementID = Scope_Identity() - @@RowCount + DataID
    WHERE ElementID < 0

Other set-based queries do all the other work of validating that the attributes are allowed, are the correct data type, and inserting, updating, and deleting elements and attributes.

I hope this brief run-down is useful to others some day! Converting ADO Recordsets to an XML stream was a huge winner for me as it saved all sorts of time and had a namespace and schema already defined that made the results come out correctly. Using a flatter XML format with 2 inputs was also much easier than sticking to some ideal about having everything in a single XML stream.
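To make the form-key scheme described above concrete, here is a minimal Python sketch of how keys such as e12345_a678 or n1_t could be taken apart with the same regular expression. It is not the author's classic ASP code: the function name parse_form_key and the tuple it returns are hypothetical, chosen only for illustration. The sketch follows the post's convention of giving new ("n") elements negative IDs.

```python
import re

# Same pattern the post uses to tear apart the posted form keys.
FORM_KEY = re.compile(r"^(e|n)([0-9]{1,10})_(a|p|t|c)([0-9]{0,10})$")


def parse_form_key(key):
    """Split a form key into (element_id, kind, attr_id).

    Hypothetical helper, not from the original post. Returns None when the
    key does not follow the scheme. New elements ("n" prefix) get a negative
    element_id so they can be told apart from existing ones.
    """
    match = FORM_KEY.match(key)
    if match is None:
        return None
    prefix, number, kind, attr = match.groups()
    element_id = int(number)
    if prefix == "n":
        element_id = -element_id  # negative marks "to be created"
    attr_id = int(attr) if attr else None  # p/t/c keys carry no attribute id
    return element_id, kind, attr_id
```

For example, parse_form_key("e12345_a678") describes attribute 678 of existing element 12345, while parse_form_key("n1_t") describes the type marker of the first new element on the page.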

Read the article
ADNOC talks about 50x increase in performance

- by KLaker

If you are still wondering about how Exadata can revolutionise your business then I would recommend watching this great video which was recorded at this year's OpenWorld.

First a little background... The Abu Dhabi National Oil Company for Distribution (ADNOC) is an integrated energy company that was founded in 1973. ADNOC Distribution markets and distributes petroleum products and services within the United Arab Emirates and internationally. As one of the largest and most innovative government-owned petroleum companies in the Arab Gulf, ADNOC Distribution is renowned and respected for the exceptional quality and reliability of its products and services. Its five corporate divisions include more than 200 filling stations (a number that is growing at 8% annually), more than 150 convenience stores, 10 vehicle inspection stations, as well as wholesale and retail sales of bulk fuel, gas, oil, diesel, and lubricants.

ADNOC selected Oracle Exadata Database Machine after extensive research because it provided them with a single platform that can run mixed workloads in a single unified machine:

"We chose Oracle Exadata Database Machine because it offered a fully integrated and highly engineered system that was ready to deploy. With our infrastructure running all the same technology, we can operate any type of Oracle Database without restrictions and be prepared for business growth," said Ali Abdul Aziz Al-Ali, IT division manager, ADNOC Distribution.

"...we could consolidate our transaction processing and business intelligence onto one platform. Competing solutions are just not capable of doing that." - Awad Ahmed Ali El-Sidiq, Senior Database Administrator, ADNOC Distribution

In this new video Awad Ahmen Ali El Sidddig, Senior DBA at ADNOC, talks about the impact that Exadata has had on his team and the whole business.
ADNOC is using our engineered systems to drive and manage all their workloads: from transaction systems to payments systems to the data warehouse and BI environment. A true Disk-to-Dashboard revolution using Engineered Systems. This engineered approach is delivering a 50x improvement in performance, with one query running 100x faster! The IT team has even revolutionised some of their data warehouse related processes with the help of Exadata, and jobs that were taking over 4 hours now run in a few minutes.

To watch the video click on the image below which will take you to our Oracle YouTube page: (if the above link does not work, click here: http://www.youtube.com/watch?v=zcRpxc6u5Ic)

Now that queries are running 100x faster and jobs are completing in minutes not hours, what is next for the IT team at ADNOC? Like many of our customers ADNOC is now looking to take advantage of big data to help them better align their business operations with customer behaviour and customer insights. To help deliver this next level of insight the IT team is looking at the new features in Oracle Database 12c, such as the new in-memory feature, to deliver even more performance gains.

The great news is that Awad Ahmen Ali El Sidddig was awarded DBA of the Year - EMEA within our Data Warehouse Global Leaders programme and you can see the badge for this award pop up at the start of the video. Well done to everyone at ADNOC and thanks for spending the time with us at OOW to create this great video.

Read the article
JIT compiler for C, C++, and the likes

- by Ebrahim

Is there any just-in-time compiler out there for compiled languages, such as C and C++? (The first names that come to mind are Clang and LLVM! But I don't think they currently support it.) Explanation: I think the software could benefit from runtime profiling feedback and aggressively optimized recompilation of hotspots at runtime, even for compiled-to-machine languages like C and C++. Profile-guided optimization does a similar job, but with the difference that a JIT would be more flexible in different environments. In PGO you run your binary prior to releasing it. After you release it, it uses no environment/input feedback collected at runtime. So if the input pattern changes, it is prone to a performance penalty. But a JIT works well even in those conditions. However, I think it is controversial whether the performance benefit of JIT compiling outweighs its own overhead. Edit: Grammar

Read the article
Enterprise Performance Management: Driving Management Excellence

Extending operational excellence to management excellence is the new strategic imperative for organizations large and small, all around the world. Management Excellence is a strategy for organizations to differentiate from their competition, by being smarter, more agile and more aligned. Tune into this conversation with John Kopcke, Senior Vice President of Oracle’s Enterprise Performance Management Global Business Unit to learn how leading companies are integrating their management processes and using Oracle’s EPM System to achieve management excellence.

Read the article
Performance Impact of Using Spring.NET Dependency Injection

Looking to start using Spring.NET to provide Dependency Injection in your next project? In this article I will show the performance impact of Spring.NET Dependency Injection and compare it to performing the same functions natively.

Read the article
Advanced MySQL Replication - Improving Performance

MySQL Replication can be made quite reliable and robust if the right tools are used to keep it running smoothly--but what if enormous loads on the primary server are overloading the slave server? Are there ways to speed up performance, so the slave can keep up?

Read the article
Performance Gains using Indexed Views and Computed Columns

- by NeilHambly

Hello. This is a quick follow-up blog to the presentation I gave last night @ the London UG Meeting (17th March 2010). It was a great evening and we had a full house (over 120 registered for this event). Due to time constraints I was unable to spend enough time on this topic to really do it justice, or to answer any of the myriad questions that arose from the session. I will be gathering all my material and putting a comprehensive BLOG entry on this topic in the next couple of days. In the meantime, here are the slides from last night if you want to review them again or if you were not @ the meeting. If you wish to contact me then please feel free to send me emails @ [email protected] Finally - a quick thanks to Tony Rogerson for allowing me to be a presenter last night (so we know who we can blame!) and to all the other presenters for their support. Watch this space, folks, more to follow soon.

Read the article
Performance Tuning Tips for Apache

Apache is one of the most successful open source projects of our times. A big advantage of this popularity is that over the years people have spent a great deal of time fine tuning the software for better performance. Read on to learn more.

Read the article
How to factor out data layer in nopCommerce and replace MS SQL with RavenDB?

- by Kaveh Shahbazian

I am new to nopCommerce and ecommerce in general but I am involved in an ecommerce project. Now, from my past experiences with RavenDB (which were mostly absolutely pleasant) and based on the needs of the business (fast changes with awkward business workflows), it seemed to be an appealing option to have RavenDB handling all sorts of things related to the database. I do not understand the design and architecture of nopCommerce fully, so I did not reach a conclusion on how to factor out the data parts, since it seems the services layer does not actually abstract data-layer concepts away; for example, it brings the EF working model into other layers. I have found another project, a nopCommerce fork, which used NuDB as its database. But it did not help because NuDB still has the feeling of an RDBMS and is not as different as RavenDB. Now, first, how can I learn about the internals of nopCommerce (other than investigating the code)? Its workflows? Its conventions? Second, has anyone tried something similar before with a NoSQL database (say, like MongoDB or RavenDB)? Is it possible to achieve this in a 1 (~2) month time frame? Thanks in advance.

Read the article
World Record Performance on PeopleSoft Enterprise Financials Benchmark on SPARC T4-2

- by Brian

Oracle's SPARC T4-2 server achieved World Record performance on Oracle's PeopleSoft Enterprise Financials 9.1, executing 20 million journal lines in 8.92 minutes on Oracle Database 11g Release 2 running on Oracle Solaris 11. This is the first result published on this version of the benchmark.

The SPARC T4-2 server was able to process 20 million general ledger journal edit and post batch jobs in 8.92 minutes on this benchmark, which reflects a large customer environment that utilizes a back-end database of nearly 500 GB. This benchmark demonstrates that the SPARC T4-2 server with PeopleSoft Financials 9.1 can easily process 100 million journal lines in less than 1 hour (at the measured rate, and assuming throughput scales linearly, 100 million lines would take about 44.6 minutes). The SPARC T4-2 server delivered more than 146 MB/sec of IO throughput with Oracle Database 11g running on Oracle Solaris 11.

Performance Landscape

Results are presented for PeopleSoft Financials Benchmark 9.1. Results obtained with PeopleSoft Financials Benchmark 9.1 are not comparable to the previous version of the benchmark, PeopleSoft Financials Benchmark 9.0, due to significant changes in the data model and because version 9.1 supports only batch.

PeopleSoft Financials Benchmark, Version 9.1

    Solution Under Test                      Batch (min)
    SPARC T4-2 (2 x SPARC T4, 2.85 GHz)      8.92

Results from PeopleSoft Financials Benchmark 9.0.

PeopleSoft Financials Benchmark, Version 9.0

    Solution Under Test                                              Batch (min)   Batch with Online (min)
    SPARC Enterprise M4000 (Web/App), SPARC Enterprise M5000 (DB)    33.09         34.72
    SPARC T3-1 (Web/App), SPARC Enterprise M5000 (DB)                35.82         37.01

Configuration Summary

Hardware Configuration:
- 1 x SPARC T4-2 server
  - 2 x SPARC T4 processors, 2.85 GHz
  - 128 GB memory

Storage Configuration:
- 1 x Sun Storage F5100 Flash Array (for database and redo logs)
- 2 x Sun Storage 2540-M2 arrays and 2 x Sun Storage 2501-M2 arrays (for backup)

Software Configuration:
- Oracle Solaris 11 11/11 SRU 7.5
- Oracle Database 11g Release 2 (11.2.0.3)
- PeopleSoft Financials 9.1 Feature Pack 2
- PeopleSoft Supply Chain Management 9.1 Feature Pack 2
- PeopleSoft PeopleTools 8.52 latest patch - 8.52.03
- Oracle WebLogic Server 10.3.5
- Java Platform, Standard Edition Development Kit 6 Update 32

Benchmark Description

The PeopleSoft Enterprise Financials 9.1 benchmark emulates a large enterprise that processes and validates a large number of financial journal transactions before posting the journal entry to the ledger. The validation process certifies that the journal entries are accurate, ensuring that ChartFields values are valid, debits and credits equal out, and inter/intra-units are balanced. Once validated, the entries are processed, ensuring that each journal line posts to the correct target ledger, and then the journal status is changed to posted. In this benchmark, the Journal Edit & Post is set up to edit and post both Inter-Unit and Regular multi-currency journals. The benchmark processes 20 million journal lines using AppEngine for edits and Cobol for post processes.

See Also

- Oracle PeopleSoft Benchmark White Papers (oracle.com)
- SPARC T4-2 Server (oracle.com, OTN)
- PeopleSoft Financial Management (oracle.com, OTN)
- Oracle Solaris (oracle.com, OTN)
- Oracle Database 11g Release 2 Enterprise Edition (oracle.com, OTN)

Disclosure Statement

Copyright 2012, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 1 October 2012.

Read the article
SQL Server devs–what source control system do you use, if any? (answer and maybe win free stuff)

- by jamiet

Recently I noticed a tweet from notable SQL Server author and community dude-at-large Steve Jones in which he asked how many SQL Server developers were putting their SQL Server source code (i.e. DDL) under source control (I’m paraphrasing because I can’t remember the exact tweet and Twitter’s search functionality is useless). The question surprised me slightly as I thought a more pertinent question would be “how many SQL Server developers are not using source control?” because I have been doing just that for many years now and I simply assumed that use of source control is a given in this day and age. Then I started thinking about it. “Perhaps I’m wrong” I pondered, “perhaps the SQL Server folks that do use source control in their day-to-day jobs are in the minority”. So, dear reader, I’m interested to know a little bit more about your use of source control. Are you putting your SQL Server code into a source control system? If so, what source control server software (e.g. TFS, Git, SVN, Mercurial, SourceSafe, Perforce) are you using? What source control client software are you using (e.g. TFS Team Explorer, Tortoise, Red Gate SQL Source Control, Red Gate SQL Connect, Git Bash, etc…)? Why did you make those particular software choices? Any interesting anecdotes to share in regard to your use of source control and SQL Server? To encourage you to contribute I have five pairs of licenses for Red Gate SQL Source Control and Red Gate SQL Connect to give away to what I consider to be the five best replies (“best” is totally subjective of course but this is my blog so my decision is final ), if you want to be considered don’t forget to leave contact details; email address, Twitter handle or similar will do. To start you off and to perhaps get the brain cells whirring, here are my answers to the questions above: Are you putting your SQL Server code into a source control system? As I think I’ve already said…yes. Always. If so, what source control server software (e.g. 
TFS, Git, SVN, Mercurial, SourceSafe, Perforce) are you using? I move around a lot between many clients so it changes on a fairly regular basis; my current client uses Team Foundation Server (aka TFS) and as part of a separate project is trialing the use of Team Foundation Service. I have used SVN extensively in the past which I am a fan of (I generally prefer it to TFS) and am trying to get my head around Git by using it for ObjectStorageHelper. What source control client software are you using (e.g. TFS Team Explorer, Tortoise, Red Gate SQL Source Control, Red Gate SQL Connect, Git Bash, etc…)? On my current project, Team Explorer. In the past I have used Tortoise to connect to SVN. Why did you make those particular software choices? I generally use whatever the client uses and given that I work with SQL Server I find that the majority of my clients use TFS, I guess simply because they are Microsoft development shops. Any interesting anecdotes to share in regard to your use of source control and SQL Server? Not an anecdote as such but I am going to share some frustrations about TFS. In many ways TFS is a great product because it integrates many separate functions (source control, work item tracking, build agents) into one whole and I’m firmly of the opinion that that is a good thing if for no reason other than being able to associate your check-ins with a work-item. However, like many people there are aspects to TFS source control that annoy me day-in, day-out. Chief among them has to be the fact that it uses a file’s read-only property to determine if a file should be checked-out or not and, if it determines that it should, it will happily do that check-out on your behalf without you even asking it to. 
I didn’t realise how ridiculous this was until I first used SVN about three years ago – with SVN you make any changes you wish and then use your source control client to determine which files have changed and thus be checked-in; the notion of “check-out” doesn’t even exist. That sounds like a small thing but you don’t realise how liberating it is until you actually start working that way. Hoping to hear some more anecdotes and opinions in the comments. Remember….free software is up for grabs! @jamiet

Read the article
Of C# Iterators and Performance

- by James Michael Hare

Some of you reading this will be wondering, "what is an iterator" and think I'm locked in the world of C++. Nope, I'm talking C# iterators. No, not enumerators, iterators. So, for those of you who do not know what iterators are in C#, I will explain them in summary, and for those of you who know what iterators are but are curious about the performance impact, I will explore that as well.

Iterators have been around for a bit now, and there are still a bunch of people who don't know what they are or what they do. I don't know how many times at work I've had a code review on my code and had someone ask me, "what's that yield word do?"

Basically, this post came to me as I was writing some extension methods to extend IEnumerable<T> -- I'll post some of the fun ones in a later post. Since I was filtering the resulting list down, I was using the standard C# iterator concept; but that got me wondering: what are the performance implications of using an iterator versus returning a new enumeration?

So, to begin, let's look at a couple of methods. This is a new (albeit contrived) method called Every(...). The goal of this method is to access an enumeration and return every nth item in the enumeration (including the first). So Every(2) would return items 0, 2, 4, 6, etc.

Now, if you wanted to write this in the traditional way, you may come up with something like this:

    public static IEnumerable<T> Every<T>(this IEnumerable<T> list, int interval)
    {
        // Eager version: builds the full result list before returning.
        List<T> newList = new List<T>();
        int count = 0;
        foreach (var i in list)
        {
            if ((count++ % interval) == 0)
            {
                newList.Add(i);
            }
        }
        return newList;
    }

So basically this method takes any IEnumerable<T> and returns a new IEnumerable<T> that contains every nth item. Pretty straightforward.

The problem? Well, Every<T>(...) will construct a list containing every nth item whether or not you care. What happens if you were searching this result for a certain item and find that item after five tries? You would have generated the rest of the list for nothing.

Enter iterators. This C# construct uses the yield keyword to effectively defer evaluation of the next item until it is asked for. This can be very handy if the evaluation itself is expensive or if there's a fair chance you'll never want to fully evaluate a list. We see this all the time in Linq, where many expressions are chained together to do complex processing on a list. This would be very expensive if each of these expressions evaluated its entire possible result set on call.

Let's look at the same example function, this time using an iterator:

    public static IEnumerable<T> Every<T>(this IEnumerable<T> list, int interval)
    {
        // Lazy version: each item is produced only when the caller asks for it.
        int count = 0;
        foreach (var i in list)
        {
            if ((count++ % interval) == 0)
            {
                yield return i;
            }
        }
    }

Notice it does not create a new return value explicitly; the only evidence of a return is the "yield return" statement. What this means is that when an item is requested from the enumeration, it will enter this method and evaluate until it either hits a yield return (in which case that item is returned) or until it exits the method or hits a yield break (in which case the iteration ends). Behind the scenes, this is all done with a class that the C# compiler generates, which keeps track of the state of the iteration, so that every time the next item is asked for, it finds that item and then updates the current position so it knows where to start next time.

It doesn't seem like a big deal, does it? But keep in mind the key point here: it only returns items as they are requested. Thus if there's a good chance you will only process a portion of the returned list and/or if the evaluation of each item is expensive, an iterator may be of benefit. This is especially true if you intend your methods to be chainable, similar to the way Linq methods can be chained.

For example, perhaps you have a List<int> and you want to take every tenth one until you find one greater than 10. We could write that as:

    List<int> someList = new List<int>();
    // fill list here
    someList.Every(10).TakeWhile(i => i <= 10);

Now is the difference more apparent? If we use the first form of Every, which makes a copy, it's going to walk the entire list and copy every tenth item whether we need those items or not, and that can be costly! With the iterator version, however, it will only take items from the list until it finds one that is > 10, at which point no further items in the list are evaluated.

So, sounds neat eh? But what's the cost, you're probably wondering. So I ran some tests using the two forms of Every above on lists varying from 5 to 500,000 integers and tried various things.

Now, iteration isn't free. If you are more likely than not to iterate the entire collection every time, the iterator has some very slight overhead:

Copy vs Iterator on 100% of Collection (10,000 iterations)

    Collection Size   Num Iterated   Type       Total ms
    5                 5              Copy       5
    5                 5              Iterator   5
    50                50             Copy       28
    50                50             Iterator   27
    500               500            Copy       227
    500               500            Iterator   247
    5000              5000           Copy       2266
    5000              5000           Iterator   2444
    50,000            50,000         Copy       24,443
    50,000            50,000         Iterator   24,719
    500,000           500,000        Copy       250,024
    500,000           500,000        Iterator   251,521

Notice that when iterating over the entire produced list, the times for the iterator are about the same or a little better for the smallest lists, then get just a slight bit worse for larger lists. In reality, given the number of items and iterations, the difference is near negligible, but it does show that iterators come at a price. However, it should also be noted that the form of Every that returns a copy will have a left-over collection to garbage collect.

However, as we evaluate less and less of the list, the savings start to show and make it well worth the overhead. Let's look at what happens if you stop looking after 80% of the list:

Copy vs Iterator on 80% of Collection (10,000 iterations)

    Collection Size   Num Iterated   Type       Total ms
    5                 4              Copy       5
    5                 4              Iterator   5
    50                40             Copy       27
    50                40             Iterator   23
    500               400            Copy       215
    500               400            Iterator   200
    5000              4000           Copy       2099
    5000              4000           Iterator   1962
    50,000            40,000         Copy       22,385
    50,000            40,000         Iterator   19,599
    500,000           400,000        Copy       236,427
    500,000           400,000        Iterator   196,010

Notice that the iterator form is now operating quite a bit faster. But the savings really add up if you stop on average at 50% (which most searches would typically do):

Copy vs Iterator on 50% of Collection (10,000 iterations)

    Collection Size   Num Iterated   Type       Total ms
    5                 2              Copy       5
    5                 2              Iterator   4
    50                25             Copy       25
    50                25             Iterator   16
    500               250            Copy       188
    500               250            Iterator   126
    5000              2500           Copy       1854
    5000              2500           Iterator   1226
    50,000            25,000         Copy       19,839
    50,000            25,000         Iterator   12,233
    500,000           250,000        Copy       208,667
    500,000           250,000        Iterator   122,336

Now we see that if we only expect to go on average 50% into the results, we tend to shave off around 40% of the time. And this is only for one level deep. If we are using this in a chain of query expressions it only adds to the savings.

So my recommendation? If you have a reasonable expectation that someone may only want to partially consume your enumerable result, I would always tend to favor an iterator. The cost if they iterate the whole thing does not add much at all -- and if they consume only partially, you reap some really good performance gains.

Next time I'll discuss some of my favorite extensions I've created to make development life a little easier and maintainability a little better.
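The deferred-evaluation behaviour described above is not specific to C#. As a rough illustration only, here is a Python analogue of the two forms of Every, using a generator in place of a C# iterator. The names every_copy, every_lazy, counting, and demo are hypothetical and do not appear in the original post; counting is a small helper added purely to record how many source items each form pulls. This is a sketch of the idea, not a reproduction of the author's benchmark.

```python
from itertools import takewhile


def every_copy(items, interval):
    """Eager form: walks the whole source and builds a list up front."""
    result = []
    for count, item in enumerate(items):
        if count % interval == 0:
            result.append(item)
    return result


def every_lazy(items, interval):
    """Lazy form: yields each nth item only when the caller asks for it."""
    for count, item in enumerate(items):
        if count % interval == 0:
            yield item


def counting(source, log):
    """Hypothetical helper: records every item pulled from the source."""
    for item in source:
        log.append(item)
        yield item


def demo():
    data = list(range(100))
    eager_log, lazy_log = [], []
    # Equivalent of someList.Every(10).TakeWhile(i => i <= 10)
    eager = list(takewhile(lambda i: i <= 10,
                           every_copy(counting(data, eager_log), 10)))
    lazy = list(takewhile(lambda i: i <= 10,
                          every_lazy(counting(data, lazy_log), 10)))
    return eager, lazy, len(eager_log), len(lazy_log)
```

Both forms produce the same result, but with 100 source items the eager form reads all 100 while the lazy form stops after reading the first 21 (it has to reach item 20 to see that the condition fails).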

Read the article
Google I/O 2012 - Building High Performance Mobile Web Applications

Ryan Fioravanti

Learn what it takes to build an HTML5 mobile app that will wow your users. This session will focus on speed, offline support, UI layouts, and the tools necessary to set up a productive development environment. Come to this session if you're looking to make a killer mobile web app that stands out amongst the competition. For all I/O 2012 sessions, go to developers.google.com

From: GoogleDevelopers

Read the article
