Search Results

Search found 34476 results on 1380 pages for 'sql blog'.

Page 77/1380 | < Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84  | Next Page >

  • SQL Server: What locale should be used to format numeric values into SQL Server format?

    - by Ian Boyd
    It seems that SQL Server does not accept numbers formatted using any particular locale. It also doesn't support locales that have digits other than 0-9. For example, if the current locale is bengali, then the number 123456789 would come out as "?????????". And that's just the digits, nevermind what the digit grouping would be. But the same problem happens for numbers in the Invariant locale, which formats numbers as "123,456,789", which SQL Server won't accept. Is there a culture that matches what SQL Server accepts for numeric values? Or will i have to create some custom "sql server" culture, generating rules for that culture myself from lower level formatting routines? If i was in .NET (which i'm not), i could peruse the Standard Numeric Format strings. Of the format codes available in .NET: c (Currency): $123.46 d (Decimal): 1234 e (Exponentional): 1.052033E+003 f (Fixed Point): 1234.57 g (General): 123.456 n (Number): 1,234.57 p (Percent): 100.00 % r (Round Trip): 123456789.12345678 x (Hexadecimal): FF Only 6 accept all numeric types: c (Currency): $123.46 d (Decimal): 1234 e (Exponentional): 1.052033E+003 f (Fixed Point): 1234.57 g (General): 123.456 n (Number): 1,234.57 p (Percent): 100.00 % r (Round Trip): 123456789.12345678 x (Hexadecimal): FF And of those only 2 generate string representations, in the en-US locale anyway, that would be accepted by SQL Server: c (Currency): $123.46 d (Decimal): 1234 e (Exponentional): 1.052033E+003 f (Fixed Point): 1234.57 g (General): 123.456 n (Number): 1,234.57 p (Percent): 100.00 % r (Round Trip): 123456789.12345678 x (Hexadecimal): FF Of the remaining two, fixed is dependant on the locale's digits, rather than the number being used, leaving General g format: c (Currency): $123.46 d (Decimal): 1234 e (Exponentional): 1.052033E+003 f (Fixed Point): 1234.57 g (General): 123.456 n (Number): 1,234.57 p (Percent): 100.00 % r (Round Trip): 123456789.12345678 x (Hexadecimal): FF And i can't even say for certain that the g format won't add digit groupings (e.g. 1,234). Is there a locale that formats numbers in the way SQL Server expects? Is there a .NET format code? A java format code? A Delphi format code? A VB format code? A stdio format code? latin-numeral-digits

    Read the article

  • SQL Server 2000: Why is this query w/ variables so slow vs w/o variables?

    - by William DiStefano
    I can't figure out why this query would be so slow with variables versus without them. I read some where that I need to enable "Dynamic Parameters" but I cannot find where to do this. DECLARE @BeginDate AS DATETIME ,@EndDate AS DATETIME SELECT @BeginDate = '2010-05-20' ,@EndDate = '2010-05-25' -- Fix date range to include time values SET @BeginDate = CONVERT(VARCHAR(10), ISNULL(@BeginDate, '01/01/1990'), 101) + ' 00:00' SET @EndDate = CONVERT(VARCHAR(10), ISNULL(@EndDate, '12/31/2099'), 101) + ' 23:59' SELECT * FROM claim c WHERE (c.Received_Date BETWEEN @BeginDate AND @EndDate) --this is much slower --(c.Received_Date BETWEEN '2010-05-20' AND '2010-05-25') --this is much faster

    Read the article

  • How many records can i store in a Sql server table before it's getting ugly?

    - by Michel
    Hi, i've been asked to do some performance tests for a new system. It is only just running with a few client, but as they expect to grow, these are the numbers i work with for my test: 200 clients, 4 years of data, and the data changes per.... 5 minutes. So for every 5 minutes for every client there is 1 record. That means 365*24*12 = 105.000 records per client per year, that means 80 milion records for my test. It has one FK to another table, one PK (uniqueidentifier) and one index on the clientID. Is this something SqlServer laughs about because it isn't scaring him, is this getting too much for one quad core 8 GB machine, is this on the edge, or..... Has anybody had any experience with these kind of numbers?

    Read the article

  • Sql Server Compact Edition version error.

    - by Tim
    I am working on .NET ClickOnce project that uses Sql Server 2005 Compact Edition to synchronize remote data through the use of a Merge replication. This application has been live for nearly a year now, and while we encounter occasional synchronization errors, things run quite smoothly for the most part. Yesterday a user reported an error that I have never seen before and have yet to find any information for online. Many users synchronize every night, and I haven't received error reports from anyone else, so this issue must be isolated to this particular user / client machine. Here are the full details of the error: -Error Code : 80004005 -Message : The message contains an unexpected replication operation code. The version of SQL Server Compact Edition Client Agent and SQL Server Compact Edition Server Agent should match. [ replication operation code = 31 ] -Minor Error : 28526 -Source : Microsoft SQL Server Compact Edition -Numeric Parameters : 31 One interesting thing that I've found is that his data does get synchronized to the server, so this error must occur after the upload completes. I have yet to determine whether or not changes at the server are still being downloaded to his subscription. Thinking that maybe there was some kind of version conflict going on, I had a remote desktop session with this user last night and uninstalled both the application and the SQL Server Compact Edition prerequisite, then reinstalled both from our ClickOnce publication site. I also removed his existing local database file so that upon synchronization, an entirely new subscription would be issued to him. Still his errors continue. I suppose the error may be somewhat general, and the text in the error message stating that the versions should match may not necessarily reflect the problem at hand. This site contains the only official reference to this error that I've been able to find, and it offers no more detail than the error message itself. Has anyone else encountered this error? Or at least know more about SQL Compact to have a better guess as to what is going on here? Any help / suggestions will be greatly appreciated!

    Read the article

  • Authentication Problem in Silverlight - Cannot connect to SQL Server

    - by Johann
    Hi All, At the moment, when I try to register a user using the default Business Template in Silverlight, I am getting an error. Basically I am following a tutorial found on Channel 9 on how to build a business app with Silverlight. "An error has occurred while establishing a connection to the server. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server) (Microsoft SQL Server, Error: 5) An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server) (Microsoft SQL Server, Error: 1326)". I found a blog where it identifies the problem, and followed every step, however I am still getting the same problems. Here is my connection string:- <add name="SIEventManagerEntities" connectionString="metadata=res://*/EventManagerDBModel.csdl|res://*/EventManagerDBModel.ssdl|res://*/EventManagerDBModel.msl;provider=System.Data.SqlClient;provider connection string=&quot;Data Source=MONFU-PC;Initial Catalog=SIEventManager;Integrated Security=SSPI;MultipleActiveResultSets=True&quot;" providerName="System.Data.EntityClient" /> Any help would really be very much appreciated! Thanks

    Read the article

  • Why date comparison in sql is not working. Please help

    - by Shantanu Gupta
    I am trying to fetch some records from table but when i use OR instead of AND it returns me few records but not in other case. dates given exactly are present in table. What mistake i am doing ? select newsid,title,detail,hotnews from view_newsmaster where datefrom>=CONVERT(datetime, '4-22-2010',111) AND dateto<=CONVERT(datetime, '4-22-2010',111)

    Read the article

  • Have SSIS' differing type systems ever caused you problems?

    - by jamiet
    One thing that has always infuriated me about SSIS is the fact that every package has three different type systems; to give you an idea of what I am talking about consider the following: The SSIS dataflow's type system is made up of types called DT_*  (e.g. DT_STR, DT_I4) The SSIS variable type system is based on .Net datatypes (e.g. String, Int32) The types available for Execute SQL Task's parameters are based on something else - I don't exactly know what (e.g. VARCHAR, LONG) Speaking euphemistically ... this is not an optimum situation (were I not speaking euphemistically I would be a lot ruder) and hence I have submitted a suggestion to Connect at [SSIS] Consolidate three type systems into one requesting that it be remedied. This accompanying blog post is not however a request for votes (though that would be nice); the reason is actually subtler than that. Let me explain. I have been submitting bugs and suggestions pertaining to SSIS for years and have, so far, submitted over 200 Connect items. If that experience has taught me anything it is this - Connect items are not generally actioned because they are considered "nice to have". No, SSIS Connect items get actioned because they cause customers grief and if I am perfectly honest I must admit that, other than being a bit gnarly, SSIS' three type system architecture has never knowingly caused me any significant problems. The reason for this blog post is to ask if any reader out there has ever encountered any problems on account of SSIS' three type systems or have you, like me, never found them to be a problem? Errors or performance degredation caused by implicit type conversions would, I believe, present a strong case for getting this situation remedied in a future version of SSIS so if you HAVE encountered such problems I would encourage you to leave a comment on the Connect submission accordingly. Let me know in the comments too - I would be interested to hear others' opinions on this. @Jamiet

    Read the article

  • SQL Server Management Data Warehouse - quick tour on setting health monitoring policies

    - by ssqa.net
    Profiler, Perfmon, DMVs & scripts are legendary tools for a DBA to monitor the SQL arena. In line with these tools SQL Server 2008 throws a powerful stream with policy based management (PBM) framework & management data warehouse (MDW) methods, which is a relational database that contains the data that is collected from a server that is a data collection target. This data is used to generate the reports for the System Data collection sets, and can also be used to create custom reports. .....(read more)

    Read the article

  • Don’t forget the London SQL Social tomorrow evening - all things SQL Server and beyond

    - by simonsabin
    Its not too late to register for the SQLSocial event in London on Tuesday (7th June, tomorrow). This is a must attend event for anyone that wants to know what’s coming with SQL Server in the next release or are considering SQL Azure. You can register here http://sqlsocial20110607.eventbrite.com/ For full details of the event go to http://www.sqlsocial.com/Events/11-05-09/An_evening_with_the_SQL_Server_Leadership_Team.aspx...(read more)

    Read the article

  • SQL Windowing screencast session for Cuppa Corner - rolling totals, data cleansing

    - by tonyrogerson
    In this 10 minute screencast I go through the basics of what I term windowing, which is basically the technique of filtering to a set of rows given a specific value, for instance a Sub-Query that aggregates or a join that returns more than just one row (for instance on a one to one relationship). http://sqlserverfaq.com/content/SQL-Basic-Windowing-using-Joins.aspx SQL below... USE tempdb go CREATE TABLE RollingTotals_Nesting ( client_id int not null, transaction_date date not null, transaction_amount...(read more)

    Read the article

  • SQL Server SELECT INTO

    - by Derek Dieter
    The most efficient method of copying a result set into a new table is to use the SELECT INTO method. This method also follows a very simple syntax. [/sql] SELECT * INTO dbo.NewTableName FROM dbo.ExistingTable [sql] Once the query above is executed, all the columns and data in the table ExistingTable (along with their datatypes) will be copied into a [...]

    Read the article

  • SQL Server Select

    - by Derek D.
    The SQL Server Select statement is the first statement used when returning data. It is the most used and most important statement in the T-SQL language. The Select statement has many different clauses. We will step through each clause further in the tutorial, however now, we will look at Select itself. The following [...]

    Read the article

  • Looking for SQL 2008 R2 Training Resources

    - by NeilHambly
    Are you looking for some R2 Training Resources - then this would most likely keep you busy for a while digesting all the content http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=fffaad6a-0153-4d41-b289-a3ed1d637c0d SQL Server 2008 R2 Update for Developers Training Kit (April 2010 Update) it Contains the following Presentations (22) Demos (29) Hands-on Labs (18) Videos (35) SQL Server 2008 R2 offers an impressive array of capabilities for developers that build upon key innovations...(read more)

    Read the article

  • T-SQL bits - ROW_NUMBER

    - by MartinIsti
    About a month ago I found the SQLShare site which provides useful, clear tutorial videos of how to use some SQL functions, or how to fine tune a query. Their videos are roughly 3-5 minutes long and have proved to be very good for me with a strong BI background with less first-hand T-SQL experience. I decided to make notes of the ones I watched and found useful and instead of putting them into a word document somewhere locally I'll publish them on this blog so. These would be very simple and short...(read more)

    Read the article

  • Stairway to T-SQL DML Level 5: The Mathematics of SQL: Part 2

    Joining tables is a crucial concept to understanding data relationships in a relational database. When you are working with your SQL Server data, you will often need to join tables to produce the results your application requires. Having a good understanding of set theory, and the mathematical operators available and how they are used to join tables will make it easier for you to retrieve the data you need from SQL Server.

    Read the article

  • We’re looking got SQL People

    - by simonsabin
    We are growing our data team at Wonga. If you are working in the SQL Server space and would like to join the one the fastest growing tech companies in Europe then please get in touch ( http://sqlblogcasts.com/blogs/simons/contact.aspx ) We have positions for production DBAs, Data QA analysts and SQL generalists (with a BI tendency). We also have generalist production support roles   Wonga is currently 3rd in the Times Tech Track 100 having been 1st last year. Being in the top 3 for two years...(read more)

    Read the article

  • SQL Server 2008 - Error starting service - model.mdf not found?!

    - by alex
    my SQL server 2008 was running fine. About an hour ago, it suddenly stopped - the MSSQLSERVER service had stopped I right clicked, clicked start, and it said the service had started, and stopped I looked in the event log and saw these two errors: 17207 : udopen: Operating system error 3(error not found) during the creation/opening of physical device C:\Program Files\Microsoft SQL Server\MSSQL\data\model.mdf. 17204 : FCB::Open failed: Could not open device C:\Program Files\Microsoft SQL Server\MSSQL\data\model.mdf for virtual device number (VDN) 1. The model.mdf db has NEVER been in that location - i specified drive F: to use for data / log during install. I checked the SQL Configuration Manager, to try and set startup params, but SQL Server is not listed as one of the services..... EDIT: I've now moved the db to where it was looking for: C:\Program Files\Microsoft SQL Server\MSSQL\data\ directory. Now if I start the service, it still does not work - i get this error message in the log: Could not find row in sysindexes for database ID 3, object ID 1, index ID 1. Run DBCC CHECKTABLE on sysindexes. Interestingly, i checked the error log - around the time users reported problems, there is this: 2010-01-08 17:11:26.44 spid51 Configuration option 'show advanced options' changed from 0 to 1. Run the RECONFIGURE statement to install. 2010-01-08 17:11:26.44 spid51 FILESTREAM: effective level = 0, configured level = 0, file system access share name = 'MSSQLSERVER'. 2010-01-08 17:11:26.44 spid51 Configuration option 'Agent XPs' changed from 1 to 0. Run the RECONFIGURE statement to install. 2010-01-08 17:11:26.44 spid51 FILESTREAM: effective level = 0, configured level = 0, file system access share name = 'MSSQLSERVER'. 2010-01-08 17:11:26.44 spid51 Configuration option 'show advanced options' changed from 1 to 0. Run the RECONFIGURE statement to install. 2010-01-08 17:11:26.44 spid51 FILESTREAM: effective level = 0, configured level = 0, file system access share name = 'MSSQLSERVER'. 2010-01-08 17:11:44.89 spid10s Service Broker manager has shut down. 2010-01-08 17:11:47.83 spid7s SQL Server is terminating in response to a 'stop' request from Service Control Manager. This is an informational message only. No user action is required. 2010-01-08 17:11:47.83 spid7s SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.

    Read the article

  • Cannot connect to a 2008 sql server named instance hosted in a azure virtual machine

    - by emardini
    When I try to connect to a named instance in a SQL SERVER hosted in a azure VM I get this message: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified) (Microsoft SQL Server, Error: -1) The problem is the sql browser is not working properly, when I start the sql browser service it closes after a few seconds and the event log says "There are no instances of SQL Server or SQL Server Analysis Services." But I do have a named instance, I can connect locally to this instance. I've re-installed sql browser and the instance but ii does not work. The host is a azure virtual machine windows server 2008 datacenter. Please help. Thank you

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

< Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84  | Next Page >