Search Results

Search found 1595 results on 64 pages for 'tim farley'.

Page 4/64 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Stored Procedures with SSRS? Hmm… not so much

    - by Rob Farley
    Little Bobby Tables’ mother says you should always sanitise your data input. Except that I think she’s wrong. The SQL Injection aspect is for another post, where I’ll show you why I think SQL Injection is the same kind of attack as many other attacks, such as the old buffer overflow, but here I want to have a bit of a whinge about the way that some people sanitise data input, and even have a whinge about people who insist on using stored procedures for SSRS reports. Let me say that again, in case you missed it the first time: I want to have a whinge about people who insist on using stored procedures for SSRS reports. Let’s look at the data input sanitisation aspect – except that I’m going to call it ‘parameter validation’. I’m talking about code that looks like this: create procedure dbo.GetMonthSummaryPerSalesPerson(@eomdate datetime) as begin     /* First check that @eomdate is a valid date */     if isdate(@eomdate) != 1     begin         select 'Please enter a valid date' as ErrorMessage;         return;     end     /* Then check that time has passed since @eomdate */     if datediff(day,@eomdate,sysdatetime()) < 5     begin         select 'Sorry - EOM is not complete yet' as ErrorMessage;         return;     end         /* If those checks have succeeded, return the data */     select SalesPersonID, count(*) as NumSales, sum(TotalDue) as TotalSales     from Sales.SalesOrderHeader     where OrderDate >= dateadd(month,-1,@eomdate)         and OrderDate < @eomdate     group by SalesPersonID     order by SalesPersonID; end Notice that the code checks that a date has been entered. Seriously??!! This must only be to check for NULL values being passed in, because anything else would have to be a valid datetime to avoid an error. The other check is maybe fair enough, but I still don’t like it. The two problems I have with this stored procedure are the result sets and the small fact that the stored procedure even exists in the first place. But let’s consider the first one of these problems for starters. I’ll get to the second one in a moment. If you read Jes Borland (@grrl_geek)’s recent post about returning multiple result sets in Reporting Services, you’ll be aware that Reporting Services doesn’t support multiple results sets from a single query. And when it says ‘single query’, it includes ‘stored procedure call’. It’ll only handle the first result set that comes back. But that’s okay – we have RETURN statements, so our stored procedure will only ever return a single result set.  Sometimes that result set might contain a single field called ErrorMessage, but it’s still only one result set. Except that it’s not okay, because Reporting Services needs to know what fields to expect. Your report needs to hook into your fields, so SSRS needs to have a way to get that information. For stored procs, it uses an option called FMTONLY. When Reporting Services tries to figure out what fields are going to be returned by a query (or stored procedure call), it doesn’t want to have to run the whole thing. That could take ages. (Maybe it’s seen some of the stored procedures I’ve had to deal with over the years!) So it turns on FMTONLY before it makes the call (and turns it off again afterwards). FMTONLY is designed to be able to figure out the shape of the output, without actually running the contents. It’s very useful, you might think. set fmtonly on exec dbo.GetMonthSummaryPerSalesPerson '20030401'; set fmtonly off Without the FMTONLY lines, this stored procedure returns a result set that has three columns and fourteen rows. But with FMTONLY turned on, those rows don’t come back. But what I do get back hurts Reporting Services. It doesn’t run the stored procedure at all. It just looks for anything that could be returned and pushes out a result set in that shape. Despite the fact that I’ve made sure that the logic will only ever return a single result set, the FMTONLY option kills me by returning three of them. It would have been much better to push these checks down into the query itself. alter procedure dbo.GetMonthSummaryPerSalesPerson(@eomdate datetime) as begin     select SalesPersonID, count(*) as NumSales, sum(TotalDue) as TotalSales     from Sales.SalesOrderHeader     where     /* Make sure that @eomdate is valid */         isdate(@eomdate) = 1     /* And that it's sufficiently past */     and datediff(day,@eomdate,sysdatetime()) >= 5     /* And now use it in the filter as appropriate */     and OrderDate >= dateadd(month,-1,@eomdate)     and OrderDate < @eomdate     group by SalesPersonID     order by SalesPersonID; end Now if we run it with FMTONLY turned on, we get the single result set back. But let’s consider the execution plan when we pass in an invalid date. First let’s look at one that returns data. I’ve got a semi-useful index in place on OrderDate, which includes the SalesPersonID and TotalDue fields. It does the job, despite a hefty Sort operation. …compared to one that uses a future date: You might notice that the estimated costs are similar – the Index Seek is still 28%, the Sort is still 71%. But the size of that arrow coming out of the Index Seek is a whole bunch smaller. The coolest thing here is what’s going on with that Index Seek. Let’s look at some of the properties of it. Glance down it with me… Estimated CPU cost of 0.0005728, 387 estimated rows, estimated subtree cost of 0.0044385, ForceSeek false, Number of Executions 0. That’s right – it doesn’t run. So much for reading plans right-to-left... The key is the Filter on the left of it. It has a Startup Expression Predicate in it, which means that it doesn’t call anything further down the plan (to the right) if the predicate evaluates to false. Using this method, we can make sure that our stored procedure contains a single query, and therefore avoid any problems with multiple result sets. If we wanted, we could always use UNION ALL to make sure that we can return an appropriate error message. alter procedure dbo.GetMonthSummaryPerSalesPerson(@eomdate datetime) as begin     select SalesPersonID, count(*) as NumSales, sum(TotalDue) as TotalSales, /*Placeholder: */ '' as ErrorMessage     from Sales.SalesOrderHeader     where     /* Make sure that @eomdate is valid */         isdate(@eomdate) = 1     /* And that it's sufficiently past */     and datediff(day,@eomdate,sysdatetime()) >= 5     /* And now use it in the filter as appropriate */     and OrderDate >= dateadd(month,-1,@eomdate)     and OrderDate < @eomdate     group by SalesPersonID     /* Now include the error messages */     union all     select 0, 0, 0, 'Please enter a valid date' as ErrorMessage     where isdate(@eomdate) != 1     union all     select 0, 0, 0, 'Sorry - EOM is not complete yet' as ErrorMessage     where datediff(day,@eomdate,sysdatetime()) < 5     order by SalesPersonID; end But still I don’t like it, because it’s now a stored procedure with a single query. And I don’t like stored procedures that should be functions. That’s right – I think this should be a function, and SSRS should call the function. And I apologise to those of you who are now planning a bonfire for me. Guy Fawkes’ night has already passed this year, so I think you miss out. (And I’m not going to remind you about when the PASS Summit is in 2012.) create function dbo.GetMonthSummaryPerSalesPerson(@eomdate datetime) returns table as return (     select SalesPersonID, count(*) as NumSales, sum(TotalDue) as TotalSales, '' as ErrorMessage     from Sales.SalesOrderHeader     where     /* Make sure that @eomdate is valid */         isdate(@eomdate) = 1     /* And that it's sufficiently past */     and datediff(day,@eomdate,sysdatetime()) >= 5     /* And now use it in the filter as appropriate */     and OrderDate >= dateadd(month,-1,@eomdate)     and OrderDate < @eomdate     group by SalesPersonID     union all     select 0, 0, 0, 'Please enter a valid date' as ErrorMessage     where isdate(@eomdate) != 1     union all     select 0, 0, 0, 'Sorry - EOM is not complete yet' as ErrorMessage     where datediff(day,@eomdate,sysdatetime()) < 5 ); We’ve had to lose the ORDER BY – but that’s fine, as that’s a client thing anyway. We can have our reports leverage this stored query still, but we’re recognising that it’s a query, not a procedure. A procedure is designed to DO stuff, not just return data. We even get entries in sys.columns that confirm what the shape of the result set actually is, which makes sense, because a table-valued function is the right mechanism to return data. And we get so much more flexibility with this. If you haven’t seen the simplification stuff that I’ve preached on before, jump over to http://bit.ly/SimpleRob and watch the video of when I broke a microphone and nearly fell off the stage in Wales. You’ll see the impact of being able to have a simplifiable query. You can also read the procedural functions post I wrote recently, if you didn’t follow the link from a few paragraphs ago. So if we want the list of SalesPeople that made any kind of sales in a given month, we can do something like: select SalesPersonID from dbo.GetMonthSummaryPerSalesPerson(@eomonth) order by SalesPersonID; This doesn’t need to look up the TotalDue field, which makes a simpler plan. select * from dbo.GetMonthSummaryPerSalesPerson(@eomonth) where SalesPersonID is not null order by SalesPersonID; This one can avoid having to do the work on the rows that don’t have a SalesPersonID value, pushing the predicate into the Index Seek rather than filtering the results that come back to the report. If we had joins involved, we might see some of those being simplified out. We also get the ability to include query hints in individual reports. We shift from having a single-use stored procedure to having a reusable stored query – and isn’t that one of the main points of modularisation? Stored procedures in Reporting Services are just a bit limited for my liking. They’re useful in plenty of ways, but if you insist on using stored procedures all the time rather that queries that use functions – that’s rubbish. @rob_farley

    Read the article

  • SQL Spatial: Getting “nearest” calculations working properly

    - by Rob Farley
    If you’ve ever done spatial work with SQL Server, I hope you’ve come across the ‘nearest’ problem. You have five thousand stores around the world, and you want to identify the one that’s closest to a particular place. Maybe you want the store closest to the LobsterPot office in Adelaide, at -34.925806, 138.605073. Or our new US office, at 42.524929, -87.858244. Or maybe both! You know how to do this. You don’t want to use an aggregate MIN or MAX, because you want the whole row, telling you which store it is. You want to use TOP, and if you want to find the closest store for multiple locations, you use APPLY. Let’s do this (but I’m going to use addresses in AdventureWorks2012, as I don’t have a list of stores). Oh, and before I do, let’s make sure we have a spatial index in place. I’m going to use the default options. CREATE SPATIAL INDEX spin_Address ON Person.Address(SpatialLocation); And my actual query: WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country FROM MyLocations AS l CROSS APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a JOIN Person.StateProvince AS s     ON s.StateProvinceID = a.StateProvinceID JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; Great! This is definitely working. I know both those City locations, even if the AddressLine1s don’t quite ring a bell. I’m sure I’ll be able to find them next time I’m in the area. But of course what I’m concerned about from a querying perspective is what’s happened behind the scenes – the execution plan. This isn’t pretty. It’s not using my index. It’s sucking every row out of the Address table TWICE (which sucks), and then it’s sorting them by the distance to find the smallest one. It’s not pretty, and it takes a while. Mind you, I do like the fact that it saw an indexed view it could use for the State and Country details – that’s pretty neat. But yeah – users of my nifty website aren’t going to like how long that query takes. The frustrating thing is that I know that I can use the index to find locations that are within a particular distance of my locations quite easily, and Microsoft recommends this for solving the ‘nearest’ problem, as described at http://msdn.microsoft.com/en-au/library/ff929109.aspx. Now, in the first example on this page, it says that the query there will use the spatial index. But when I run it on my machine, it does nothing of the sort. I’m not particularly impressed. But what we see here is that parallelism has kicked in. In my scenario, it’s split the data up into 4 threads, but it’s still slow, and not using my index. It’s disappointing. But I can persuade it with hints! If I tell it to FORCESEEK, or use my index, or even turn off the parallelism with MAXDOP 1, then I get the index being used, and it’s a thing of beauty! Part of the plan is here: It’s massive, and it’s ugly, and it uses a TVF… but it’s quick. The way it works is to hook into the GeodeticTessellation function, which is essentially finds where the point is, and works out through the spatial index cells that surround it. This then provides a framework to be able to see into the spatial index for the items we want. You can read more about it at http://msdn.microsoft.com/en-us/library/bb895265.aspx#tessellation – including a bunch of pretty diagrams. One of those times when we have a much more complex-looking plan, but just because of the good that’s going on. This tessellation stuff was introduced in SQL Server 2012. But my query isn’t using it. When I try to use the FORCESEEK hint on the Person.Address table, I get the friendly error: Msg 8622, Level 16, State 1, Line 1 Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN. And I’m almost tempted to just give up and move back to the old method of checking increasingly large circles around my location. After all, I can even leverage multiple OUTER APPLY clauses just like I did in my recent Lookup post. WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT     l.Name,     COALESCE(a1.AddressLine1,a2.AddressLine1,a3.AddressLine1),     COALESCE(a1.City,a2.City,a3.City),     s.Name AS [State],     c.Name AS Country FROM MyLocations AS l OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 1000     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a1 OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 5000     AND a1.AddressID IS NULL     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a2 OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 20000     AND a2.AddressID IS NULL     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a3 JOIN Person.StateProvince AS s     ON s.StateProvinceID = COALESCE(a1.StateProvinceID,a2.StateProvinceID,a3.StateProvinceID) JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; But this isn’t friendly-looking at all, and I’d use the method recommended by Isaac Kunen, who uses a table of numbers for the expanding circles. It feels old-school though, when I’m dealing with SQL 2012 (and later) versions. So why isn’t my query doing what it’s supposed to? Remember the query... WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country FROM MyLocations AS l CROSS APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a JOIN Person.StateProvince AS s     ON s.StateProvinceID = a.StateProvinceID JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; Well, I just wasn’t reading http://msdn.microsoft.com/en-us/library/ff929109.aspx properly. The following requirements must be met for a Nearest Neighbor query to use a spatial index: A spatial index must be present on one of the spatial columns and the STDistance() method must use that column in the WHERE and ORDER BY clauses. The TOP clause cannot contain a PERCENT statement. The WHERE clause must contain a STDistance() method. If there are multiple predicates in the WHERE clause then the predicate containing STDistance() method must be connected by an AND conjunction to the other predicates. The STDistance() method cannot be in an optional part of the WHERE clause. The first expression in the ORDER BY clause must use the STDistance() method. Sort order for the first STDistance() expression in the ORDER BY clause must be ASC. All the rows for which STDistance returns NULL must be filtered out. Let’s start from the top. 1. Needs a spatial index on one of the columns that’s in the STDistance call. Yup, got the index. 2. No ‘PERCENT’. Yeah, I don’t have that. 3. The WHERE clause needs to use STDistance(). Ok, but I’m not filtering, so that should be fine. 4. Yeah, I don’t have multiple predicates. 5. The first expression in the ORDER BY is my distance, that’s fine. 6. Sort order is ASC, because otherwise we’d be starting with the ones that are furthest away, and that’s tricky. 7. All the rows for which STDistance returns NULL must be filtered out. But I don’t have any NULL values, so that shouldn’t affect me either. ...but something’s wrong. I do actually need to satisfy #3. And I do need to make sure #7 is being handled properly, because there are some situations (eg, differing SRIDs) where STDistance can return NULL. It says so at http://msdn.microsoft.com/en-us/library/bb933808.aspx – “STDistance() always returns null if the spatial reference IDs (SRIDs) of the geography instances do not match.” So if I simply make sure that I’m filtering out the rows that return NULL… …then it’s blindingly fast, I get the right results, and I’ve got the complex-but-brilliant plan that I wanted. It just wasn’t overly intuitive, despite being documented. @rob_farley

    Read the article

  • Visualising data a different way with Pivot collections

    - by Rob Farley
    Roger’s been doing a great job extending PivotViewer recently, and you can find the list of LobsterPot pivots at http://pivot.lobsterpot.com.au Many months back, the TED Talk that Gary Flake did about Pivot caught my imagination, and I did some research into it. At the time, most of what we did with Pivot was geared towards what we could do for clients, including making Pivot collections based on students at a school, and using it to browse PDF invoices by their various properties. We had actual commercial work based on Pivot collections back then, and it was all kinds of fun. Later, we made some collections for events that were happening, and even got featured in the TechEd Australia keynote. But I’m getting ahead of myself... let me explain the concept. A Pivot collection is an XML file (with .cxml extension) which lists Items, each linking to an image that’s stored in a Deep Zoom format (this means that it contains tiles like Bing Maps, so that the browser can request only the ones of interest according to the zoom level). This collection can be shown in a Silverlight application that uses the PivotViewer control, or in the Pivot Browser that’s available from getpivot.com. Filtering and sorting the items according to their facets (attributes, such as size, age, category, etc), the PivotViewer rearranges the way that these are shown in a very dynamic way. To quote Gary Flake, this lets us “see patterns which are otherwise hidden”. This browsing mechanism is very suited to a number of different methods, because it’s just that – browsing. It’s not searching, it’s more akin to window-shopping than doing an internet search. When we decided to put something together for the conferences such as TechEd Australia 2010 and the PASS Summit 2010, we did some screen-scraping to provide a different view of data that was already available online. Nick Hodge and Michael Kordahi from Microsoft liked the idea a lot, and after a bit of tweaking, we produced one that Michael used in the TechEd Australia keynote to show the variety of talks on offer. It’s interesting to see a pattern in this data: The Office track has the most sessions, but if the Interactive Sessions and Instructor-Led Labs are removed, it drops down to only the sixth most popular track, with Cloud Computing taking over. This is something which just isn’t obvious when you look an ordinary search tool. You get a much better feel for the data when moving around it like this. The more observant amongst you will have noticed some difference in the collection that Michael is demonstrating in the picture above with the screenshots I’ve shown. That’s because it’s been extended some more. At the SQLBits conference in the UK this year, I had some interesting discussions with the guys from Xpert360, particularly Phil Carter, who I’d met in 2009 at an earlier SQLBits conference. They had got around to producing a Pivot collection based on the SQLBits data, which we had been planning to do but ran out of time. We discussed some of ways that Pivot could be used, including the ways that my old friend Howard Dierking had extended it for the MSDN Magazine. I’m not suggesting I influenced Xpert360 at all, but they certainly inspired us with some of their posts on the matter So with LobsterPot guys David Gardiner and Roger Noble both having dabbled in Pivot collections (and Dave doing some for clients), I set Roger to work on extending it some more. He’s used various events and so on to be able to make an environment that allows us to do quick deployment of new collections, as well as showing the data in a grid view which behaves as if it were simply a third view of the data (the other two being the array of images and the ‘histogram’ view). I see PivotViewer as being a significant step in data visualisation – so much so that I feature it when I deliver talks on Spatial Data Visualisation methods. Any time when there is information that can be conveyed through an image, you have to ask yourself how best to show that image, and whether that image is the focal point. For Spatial data, the image is most often a map, and the map becomes the central mode for navigation. I show Pivot with postcode areas, since I can browse the postcodes based on their data, and many of the images are recognisable (to locals of South Australia). Naturally, the images could link through to the map itself, and so on, but generally people think of Spatial data in terms of navigating a map, which doesn’t always gel with the information you’re trying to extract. Roger’s even looking into ways to hook PivotViewer into the Bing Maps API, in a similar way to the Deep Earth project, displaying different levels of map detail according to how ‘zoomed in’ the images are. Some of the work that Dave did with one of the schools was generating the Deep Zoom tiles “on the fly”, based on images stored in a database, and Roger has produced a collection which uses images from flickr, that lets you move from one search term to another. Pulling the images down from flickr.com isn’t particularly ideal from a performance aspect, and flickr doesn’t store images in a small-enough format to really lend itself to this use, but you might agree that it’s an interesting concept which compares nicely to using Maps. I’m looking forward to future versions of the PivotViewer control, and hope they provide many more events that can be used, and even more hooks into it. Naturally, LobsterPot could help provide your business with a PivotViewer experience, but you can probably do a lot of it yourself too. There’s a thorough guide at getpivot.com, which is how we got into it. For some examples of what we’ve done, have a look at http://pivot.lobsterpot.com.au. I’d like to see PivotViewer really catch on a data visualisation tool.

    Read the article

  • Slide-decks from recent Adelaide SQL Server UG meetings

    - by Rob Farley
    The UK has been well represented this summer at the Adelaide SQL Server User Group, with presentations from Chris Testa-O’Neill (isn’t that the right link? Maybe try this one) and Martin Cairney. The slides are available here and here. I thought I’d particularly mention Martin’s, and how it’s relevant to this month’s T-SQL Tuesday. Martin spoke about Policy-Based Management and the Enterprise Policy Management Framework – something which is remarkably under-used, and yet which can really impact your ability to look after environments. If you have policies set up, then you can easily test each of your SQL instances to see if they are still satisfying a set of policies as defined. Automation (the topic of this month’s T-SQL Tuesday) should mean that your life is made easier, thereby enabling to you to do more. It shouldn’t remove the human element, but should remove (most of) the human errors. People still need to manage the situation, and work out what needs to be done, etc. We haven’t reached a point where computers can replace people, but they are very good at replace the mundaneness and monotony of our jobs. They’ve made our lives more interesting (although many would rightly argue that they have also made our lives more complex) by letting us focus on the stuff that changes. Martin named his talk Put Your Feet Up, which nicely expresses the fact that managing systems shouldn’t be about running around checking things all the time. It must be about having systems in place which tell you when things aren’t going well. It’s never quite as simple as being able to actually put your feet up, but certainly no system should require constant attention. It’s definitely a policy we at LobsterPot adhere to, whether it’s an alert to let us know that an ETL package has run successfully, or a script that generates some code for a report. If things can be automated, it reduces the chance of error, reduces the repetitive nature of work, and in general, keeps both consultants and clients much happier.

    Read the article

  • Slide-decks from recent Adelaide SQL Server UG meetings

    - by Rob Farley
    The UK has been well represented this summer at the Adelaide SQL Server User Group, with presentations from Chris Testa-O’Neill (isn’t that the right link? Maybe try this one) and Martin Cairney. The slides are available here and here. I thought I’d particularly mention Martin’s, and how it’s relevant to this month’s T-SQL Tuesday. Martin spoke about Policy-Based Management and the Enterprise Policy Management Framework – something which is remarkably under-used, and yet which can really impact your ability to look after environments. If you have policies set up, then you can easily test each of your SQL instances to see if they are still satisfying a set of policies as defined. Automation (the topic of this month’s T-SQL Tuesday) should mean that your life is made easier, thereby enabling to you to do more. It shouldn’t remove the human element, but should remove (most of) the human errors. People still need to manage the situation, and work out what needs to be done, etc. We haven’t reached a point where computers can replace people, but they are very good at replace the mundaneness and monotony of our jobs. They’ve made our lives more interesting (although many would rightly argue that they have also made our lives more complex) by letting us focus on the stuff that changes. Martin named his talk Put Your Feet Up, which nicely expresses the fact that managing systems shouldn’t be about running around checking things all the time. It must be about having systems in place which tell you when things aren’t going well. It’s never quite as simple as being able to actually put your feet up, but certainly no system should require constant attention. It’s definitely a policy we at LobsterPot adhere to, whether it’s an alert to let us know that an ETL package has run successfully, or a script that generates some code for a report. If things can be automated, it reduces the chance of error, reduces the repetitive nature of work, and in general, keeps both consultants and clients much happier.

    Read the article

  • Xorg does not see my monitor EDID

    - by sean farley
    Below is the output from my Xorg.0. X.Org X Server 1.11.3 Release Date: 2011-12-16 [ 22.311] X Protocol Version 11, Revision 0 [ 22.311] Build Operating System: Linux 2.6.42-23-generic x86_64 Ubuntu [ 22.311] Current Operating System: Linux sean-P55-USB3 3.2.0-34-generic #53-Ubuntu SMP Thu Nov 15 10:48:16 UTC 2012 x86_64 [ 22.311] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-34-generic root=UUID=0a34603e-aee9-44d1-8982-a5a5a38c3e4d ro quiet splash [ 22.311] Build Date: 29 August 2012 12:12:33AM [ 22.311] xorg-server 2:1.11.4-0ubuntu10.8 (For technical support please see http://www.ubuntu.com/support) [ 22.311] Current version of pixman: 0.24.4 [ 22.311] Before reporting problems, check http://wiki.x.org to make sure that you have the latest version. [ 22.311] Markers: (--) probed, (**) from config file, (==) default setting, (++) from command line, (!!) notice, (II) informational, (WW) warning, (EE) error, (NI) not implemented, (??) unknown. [ 22.311] (==) Log file: "/var/log/Xorg.0.log", Time: Sat Nov 17 13:20:45 2012 [ 22.311] (==) Using config file: "/etc/X11/xorg.conf" [ 22.311] (==) Using system config directory "/usr/share/X11/xorg.conf.d" [ 22.311] (==) No Layout section. Using the first Screen section. [ 22.311] (==) No screen section available. Using defaults. [ 22.311] (**) |-->Screen "Default Screen Section" (0) [ 22.311] (**) | |-->Monitor "<default monitor>" [ 22.311] (==) No device specified for screen "Default Screen Section". Using the first device section listed. [ 22.311] (**) | |-->Device "Default Device" [ 22.311] (==) No monitor specified for screen "Default Screen Section". Using a default monitor configuration. [ 22.311] (==) Automatically adding devices I have searched all over, and followed lots of dead ends and unanswered questions on this issue. I need to get this monitor recognised so I can use the native resolution of 1600x1200. The Nvidia driver in Windows has no problem with this. The monitor is an old Iiyama HM204DT A. Is there a way of configuring Xorg manually to get these working? I have tried xrandr but this will not work. Output:- sean@sean-P55-USB3:~$ xrandr Screen 0: minimum 8 x 8, current 1152 x 864, maximum 16384 x 16384 DVI-I-0 connected 1152x864+0+0 (normal left inverted right x axis y axis) 0mm x 0mm 1024x768 60.0 + 1360x768 60.0 59.8 1152x864 60.0* 800x600 72.2 60.3 56.2 680x384 119.9 119.6 640x480 59.9 512x384 120.0 400x300 144.4 320x240 120.1 DVI-I-1 disconnected (normal left inverted right x axis y axis) DVI-I-2 disconnected (normal left inverted right x axis y axis) HDMI-0 disconnected (normal left inverted right x axis y axis) DVI-I-3 disconnected (normal left inverted right x axis y axis) Tried Nvidia Xorg.config: sean@sean-P55-USB3:~$ sudo nvidia-xconfig [sudo] password for sean: Using X configuration file: "/etc/X11/xorg.conf". VALIDATION ERROR: Data incomplete in file /etc/X11/xorg.conf. Device section "Default Device" must have a Driver line. Backed up file '/etc/X11/xorg.conf' as '/etc/X11/xorg.conf.nvidia-xconfig-original' Backed up file '/etc/X11/xorg.conf' as '/etc/X11/xorg.conf.backup' New X configuration file written to '/etc/X11/xorg.conf' How do I insert a driver line? This is a bit of a pain as I want to use my Vectorworks cad program in a WinXP Vbox at 1600x1200 but all virtual drives are restricted to the host screen resolution. Do i need to manually create EIDI info in Xorg? I am slightly confused about how Xorg and Nvidia relate Please help

    Read the article

  • An MCM exam, Rob? Really?

    - by Rob Farley
    I took the SQL 2008 MCM Knowledge exam while in Seattle for the PASS Summit ten days ago. I wasn’t planning to do it, but I got persuaded to try. I was meaning to write this post to explain myself before the result came out, but it seems I didn’t get typing quickly enough. Those of you who know me will know I’m a big fan of certification, to a point. I’ve been involved with Microsoft Learning to help create exams. I’ve kept my certifications current since I first took an exam back in 1998, sitting many in beta, across quite a variety of topics. I’ve probably become quite good at them – I know I’ve definitely passed some that I really should’ve failed. I’ve also written that I don’t think exams are worth studying for. (That’s probably not entirely true, but it depends on your motivation. If you’re doing learning, I would encourage you to focus on what you need to know to do your job better. That will help you pass an exam – but the two skills are very different. I can coach someone on how to pass an exam, but that’s a different kind of teaching when compared to coaching someone about how to do a job. For example, the real world includes a lot of “it depends”, where you develop a feel for what the influencing factors might be. In an exam, its better to be able to know some of the “Don’t use this technology if XYZ is true” concepts better.) As for the Microsoft Certified Master certification… I’m not opposed to the idea of having the MCM (or in the future, MCSM) cert. But the barrier to entry feels quite high for me. When it was first introduced, the nearest testing centres to me were in Kuala Lumpur and Manila. Now there’s one in Perth, but that’s still a big effort. I know there are options in the US – such as one about an hour’s drive away from downtown Seattle, but it all just seems too hard. Plus, these exams are more expensive, and all up – I wasn’t sure I wanted to try them, particularly with the fact that I don’t like to study. I used to study for exams. It would drive my wife crazy. I’d have some exam scheduled for some time in the future (like the time I had two booked for two consecutive days at TechEd Australia 2005), and I’d make sure I was ready. Every waking moment would be spent pouring over exam material, and it wasn’t healthy. I got shaken out of that, though, when I ended up taking four exams in those two days in 2005 and passed them all. I also worked out that if I had a Second Shot available, then failing wasn’t a bad thing at all. Even without Second Shot, I’m much more okay about failing. But even just trying an MCM exam is a big effort. I wouldn’t want to fail one of them. Plus there’s the illusion to maintain. People have told me for a long time that I should just take the MCM exams – that I’d pass no problem. I’ve never been so sure. It was almost becoming a pride-point. Perhaps I should fail just to demonstrate that I can fail these things. Anyway – boB Taylor (@sqlboBT) persuaded me to try the SQL 2008 MCM Knowledge exam at the PASS Summit. They set up a testing centre in one of the room there, so it wasn’t out of my way at all. I had to squeeze it in between other commitments, and I certainly didn’t have time to even see what was on the syllabus, let alone study. In fact, I was so exhausted from the week that I fell asleep at least once (just for a moment though) during the actual exam. Perhaps the questions need more jokes, I’m not sure. I knew if I failed, then I might disappoint some people, but that I wouldn’t’ve spent a great deal of effort in trying to pass. On the other hand, if I did pass I’d then be under pressure to investigate the MCM Lab exam, which can be taken remotely (therefore, a much smaller amount of effort to make happen). In some ways, passing could end up just putting a bunch more pressure on me. Oh, and I did.

    Read the article

  • Plan Operator Tuesday round-up

    - by Rob Farley
    Eighteen posts for T-SQL Tuesday #43 this month, discussing Plan Operators. I put them together and made the following clickable plan. It’s 1000px wide, so I hope you have a monitor wide enough. Let me explain this plan for you (people’s names are the links to the articles on their blogs – the same links as in the plan above). It was clearly a SELECT statement. Wayne Sheffield (@dbawayne) wrote about that, so we start with a SELECT physical operator, leveraging the logical operator Wayne Sheffield. The SELECT operator calls the Paul White operator, discussed by Jason Brimhall (@sqlrnnr) in his post. The Paul White operator is quite remarkable, and can consume three streams of data. Let’s look at those streams. The first pulls data from a Table Scan – Boris Hristov (@borishristov)’s post – using parallel threads (Bradley Ball – @sqlballs) that pull the data eagerly through a Table Spool (Oliver Asmus – @oliverasmus). A scalar operation is also performed on it, thanks to Jeffrey Verheul (@devjef)’s Compute Scalar operator. The second stream of data applies Evil (I figured that must mean a procedural TVF, but could’ve been anything), courtesy of Jason Strate (@stratesql). It performs this Evil on the merging of parallel streams (Steve Jones – @way0utwest), which suck data out of a Switch (Paul White – @sql_kiwi). This Switch operator is consuming data from up to four lookups, thanks to Kalen Delaney (@sqlqueen), Rick Krueger (@dataogre), Mickey Stuewe (@sqlmickey) and Kathi Kellenberger (@auntkathi). Unfortunately Kathi’s name is a bit long and has been truncated, just like in real plans. The last stream performs a join of two others via a Nested Loop (Matan Yungman – @matanyungman). One pulls data from a Spool (my post – @rob_farley) populated from a Table Scan (Jon Morisi). The other applies a catchall operator (the catchall is because Tamera Clark (@tameraclark) didn’t specify any particular operator, and a catchall is what gets shown when SSMS doesn’t know what to show. Surprisingly, it’s showing the yellow one, which is about cursors. Hopefully that’s not what Tamera planned, but anyway...) to the output from an Index Seek operator (Sebastian Meine – @sqlity). Lastly, I think everyone put in 110% effort, so that’s what all the operators cost. That didn’t leave anything for me, unfortunately, but that’s okay. Also, because he decided to use the Paul White operator, Jason Brimhall gets 0%, and his 110% was given to Paul’s Switch operator post. I hope you’ve enjoyed this T-SQL Tuesday, and have learned something extra about Plan Operators. Keep your eye out for next month’s one by watching the Twitter Hashtag #tsql2sday, and why not contribute a post to the party? Big thanks to Adam Machanic as usual for starting all this. @rob_farley

    Read the article

  • Exploding maps in Reporting Services 2008 R2

    - by Rob Farley
    Kaboom! Well, that was the imagery that secretly appeared in my mind when I saw “USA By State Exploded” in the list of installed maps in Report Builder 3.0 – part of the spatial offering of SQL Server Reporting Server 2008 R2. Alas, it just means that the borders are bigger. Clicking on it showed me. Unfortunately, I’m not interested in maps of the US. None of my clients are there (at least, not yet – feel free to get in touch if you want to change this ‘feature’ of my company). So instead, I’ve recently been getting hold of some data for Australian areas. I’ve just bought some PostCode shapes for South Australia, and will use this in demos for conferences and for showing clients how this kind of report can really impact their reporting. One of the companies I was talking about getting shape files sent me a sample. So I chose the “ESRI shapefile” option you see above, and browsed to my file. It appeared in the window like this: Australians will immediately recognise this as the area around Wollongong, just south of Sydney. Well, apart from me. I didn’t. I had to put a Bing Maps layer behind it to work that out, but that’s not for this post. The thing that I discovered was that if I selected the Exploded USA option (but without clicking Next), and then chose my shape file, then my area around Wollongong would be exploded too! Huh! I think this is actually a bug, but a potentially useful one! Some further investigation (involving creating two identical reports, one with this exploded view, one without), showed that the Exploded View is done by reducing the ScaleFactor property of the PolygonLayer in the map control. The Exploded version has it below 1. If you set to above one, your shapes overlap. I discovered this by accident… I guess I hadn’t looked through all the PolygonLayer options to work out what they all do. And because this post is about Reporting, it can qualify for this month’s T-SQL Tuesday, hosted by Aaron Nelson (@sqlvariant). Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • BUILD 2013 Session&ndash;What&rsquo;s New In XAML

    - by Tim Murphy
    Originally posted on: http://geekswithblogs.net/tmurphy/archive/2013/06/27/build-2013-sessionndashwhatrsquos-new-in-xaml.aspx If ever there was a session that you felt like your head was going to explode, this one would do it.  Tim Heuer proceeded to try to fit as many of the changes and additions to XAML as he could in one hour. There were a number of improvements that struck me.  The first was the fact that we no longer need to put stack panels in the AppBar in order to add buttons.  This has been changed to a CommandBar which at the very least makes the markup read more cleanly.  Now if they would just bring this same improvement to Windows Phone we would be set. There was a lot of cheering at the beginning of his talk when he showed that there are now date time pickers.  I understand that it makes life easier, but I just couldn’t get that excited. The couple of features that did grab my attention being able to select a group of tags and then add an encapsulating tag such as a StackPanel around them and the fact that they have optimized XAML so that now runs on average 25% faster. I’d go crazy trying to list off all the improvements and new features so be sure to go and review the recording of the session. del.icio.us Tags: BUILD 2013,XAML,Windows 8.1

    Read the article

  • 24 hours to pass until 24 Hours of PASS

    - by Rob Farley
    There’s a bunch of stuff going on at the moment in the SQL world, so if you’ve missed this particular piece of news, let me tell you a bit about it. Twice a year, the SQL community puts on its biggest virtual event – 24 Hours of PASS. And the next one is tomorrow – March 21st, 2012. Twenty-four sessions, back-to-back, featuring a selection of some of the best presenters in the SQL world, speakers from all over the world, coming together in an online collaboration that so far has well over thirty thousand registrations across the presentations. Some people are signed up for all 24 sessions, some only one. Traditionally, LiveMeeting has been used as the platform for this event, but this year we’re going with a new platform – IBTalk. It promises big, and we’re hoping it won’t let us down. LiveMeeting has been great, and we thank Microsoft for providing it as a platform for the past few years. However, as the event has grown, we’ve found that a new idea is necessary. Last year a search was done for a new platform, and IBTalk ticked the right boxes. The feedback from the presenters and moderators so far has been overwhelmingly positive, and we’re hoping that this is going to really enhance the user experience. One of my favourite features of the platform is the language side. It provides a pretty good translation service. Users who join a session will see a flag on the left of the screen. If they click it, they can change the language to one of 15 on offer. Picking this changes all the labels on everything. It even translates the text in the Q&A window. What this means is that someone from Brazil can ask their question in Portuguese, and the presenter will see it in English. Then if the answer is typed in English, the questioner will be able to see the answer, also in Portuguese. Or they can switch to English to see it as the answerer typed it. I know there’s always the risk of bad translations going on, but I’ve heard good things about this translation service. But there’s more – IBTalk are providing staff to type up closed captioning live during the event. So if English isn’t your first language, don’t worry! Picking your language will also let you see subtitles in your chosen language. I’m hoping that this event is the start of PASS being able to reach people from all corners of the world. Wouldn’t it be great to find that this event is successful, and that the next 24HOP (later in the year, our Summit Preview event) has just as many non-English speakers tuning in as English speakers? If you haven’t been planning which sessions you’re going to attend, you really should get over to sqlpass.org/24hours and have a look through what’s on offer. There’s some amazing material from some of the industry’s brightest, covering a wide range of topics, from classic SQL areas to the brand new SQL 2012 features. There really should be something for every SQL professional. Check the time zones though – if you’re in the US you might be on Summer time, and an hour closer to GMT than normal. Massive thanks must go to Microsoft, SQL Sentry and Idera for sponsoring this event. Without sponsors we wouldn’t be able to put any of this on. These companies are helping 24HOP continue to grow into an event for the whole world. See you tomorrow! @rob_farley | #24hop | #sqlpass

    Read the article

  • 24 hours to pass until 24 Hours of PASS

    - by Rob Farley
    There’s a bunch of stuff going on at the moment in the SQL world, so if you’ve missed this particular piece of news, let me tell you a bit about it. Twice a year, the SQL community puts on its biggest virtual event – 24 Hours of PASS. And the next one is tomorrow – March 21st, 2012. Twenty-four sessions, back-to-back, featuring a selection of some of the best presenters in the SQL world, speakers from all over the world, coming together in an online collaboration that so far has well over thirty thousand registrations across the presentations. Some people are signed up for all 24 sessions, some only one. Traditionally, LiveMeeting has been used as the platform for this event, but this year we’re going with a new platform – IBTalk. It promises big, and we’re hoping it won’t let us down. LiveMeeting has been great, and we thank Microsoft for providing it as a platform for the past few years. However, as the event has grown, we’ve found that a new idea is necessary. Last year a search was done for a new platform, and IBTalk ticked the right boxes. The feedback from the presenters and moderators so far has been overwhelmingly positive, and we’re hoping that this is going to really enhance the user experience. One of my favourite features of the platform is the language side. It provides a pretty good translation service. Users who join a session will see a flag on the left of the screen. If they click it, they can change the language to one of 15 on offer. Picking this changes all the labels on everything. It even translates the text in the Q&A window. What this means is that someone from Brazil can ask their question in Portuguese, and the presenter will see it in English. Then if the answer is typed in English, the questioner will be able to see the answer, also in Portuguese. Or they can switch to English to see it as the answerer typed it. I know there’s always the risk of bad translations going on, but I’ve heard good things about this translation service. But there’s more – IBTalk are providing staff to type up closed captioning live during the event. So if English isn’t your first language, don’t worry! Picking your language will also let you see subtitles in your chosen language. I’m hoping that this event is the start of PASS being able to reach people from all corners of the world. Wouldn’t it be great to find that this event is successful, and that the next 24HOP (later in the year, our Summit Preview event) has just as many non-English speakers tuning in as English speakers? If you haven’t been planning which sessions you’re going to attend, you really should get over to sqlpass.org/24hours and have a look through what’s on offer. There’s some amazing material from some of the industry’s brightest, covering a wide range of topics, from classic SQL areas to the brand new SQL 2012 features. There really should be something for every SQL professional. Check the time zones though – if you’re in the US you might be on Summer time, and an hour closer to GMT than normal. Massive thanks must go to Microsoft, SQL Sentry and Idera for sponsoring this event. Without sponsors we wouldn’t be able to put any of this on. These companies are helping 24HOP continue to grow into an event for the whole world. See you tomorrow! @rob_farley | #24hop | #sqlpass

    Read the article

  • LobsterPot Solutions in the USA

    - by Rob Farley
    We’re expanding! I’m thrilled to announce that Microsoft Gold Partner LobsterPot Solutions has started another branch appointing the amazing Ted Krueger (5-time SQL MVP awardee) as the US lead. Ted is well-known in the SQL Server world, having written books on indexing, consulting and on being a DBA (not to mention contributing chapters to both MVP Deep Dives books). He is an expert on replication and high availability, and strong in the Business Intelligence space – vast experience which is both broad and deep. Ted is based in the south east corner of Wisconsin, just north of Chicago. He has been a consultant for eons and has helped many clients with their projects and problems, taking the role as both technical lead and consulting lead. He is also tireless in supporting and developing the SQL Server community, presenting at conferences across America, and helping people through his blog, Twitter and more. Despite all this – it’s neither his technical excellence with SQL Server nor his consulting skill that made me want him to lead LobsterPot’s US venture. I wanted Ted because of his values. In the time I’ve known Ted, I’ve found his integrity to be excellent, and found him to be morally beyond reproach. This is the biggest priority I have when finding people to represent the LobsterPot brand. I have no qualms in recommending Ted’s character or work ethic. It’s not just my thoughts on him – all my trusted friends that know Ted agree about this. So last week, LobsterPot Solutions LLC was formed in the United States, and in a couple of weeks, we will be open for business! LobsterPot Solutions can be contacted via email at [email protected], on the web at either www.lobsterpot.com.au or www.lobsterpotsolutions.com, and on Twitter as @lobsterpot_au and @lobsterpot_us. Ted Kruger blogs at LessThanDot, and can also be found on Twitter and LinkedIn. This post is cross-posted from http://lobsterpotsolutions.com/lobsterpot-solutions-in-the-usa

    Read the article

  • Visually stunning maps and PivotViewer

    - by Rob Farley
    One of the things about PivotViewer is that it runs in the Silverlight platform and can be extended recently. One of my guys at LobsterPot, Roger Noble, has used this to incorporate a Bing Maps layer, showing items which have  Latitude and Longitude values there. We’re already talking to a hospital about using this to allow them to browse their patient data, including showing the patients on a map according to which bed they’re in. Interesting times – this will involve having custom tiles instead of the ones from Bing Maps, but the idea is similar. Of course, we’ll be using Bing Maps to show where the patients live. I should also mention that this is a work-in-progress still. Figuring out how to use PivotViewer isn’t trivial, and we’ve done quite a lot of experimenting to see how to get things working. If you find bugs, please feel free to let me know (rob_farley at hotmail will usually reach me), and we’ll add them to our list. Here are some screenshots that I made recently using the collection at http://pivot.lobsterpot.com.au/flickr – by selecting a tag, you can get a new bunch of images. A couple of images that were taken in Iceland. Some from St Mary’s Lighthouse near Newcastle, UK. And some from around Big Ben in London. I’d recommend using either Firefox or Internet Explorer if you choose to browse this yourself. It seems the Chrome browser support for Silverlight doesn’t quite handle things as nicely as we’d all like. I imagine that at some point, we may enhance the Flickr collection, to be able to search on more than tags, but as a sample collection, it seems to work quite well.

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • Summit Old, Summit New, Summit Borrowed...

    - by Rob Farley
    PASS Summit is coming up, and I thought I’d post a few things. Summit Old... At the PASS Summit, you will get the chance to hear presentations by the SQL Server establishment. Just about every big name in the SQL Server world is a regular at the PASS Summit, so you will get to hear and meet people like Kalen Delaney (@sqlqueen) (who just recently got awarded MVP status for the 20th year running), and from all around the world such as the UK’s Chris Webb (@technitrain) or Pinal Dave (@pinaldave) from India. Almost all the household names in SQL Server will be there, including a large contingent from Microsoft. The PASS Summit is by far the best place to meet the legends of SQL Server. And they’re not all old. Some are, but most of them are younger than you might think. ...Summit New... The hottest topics are often about the newest technologies (such as SQL Server 2012). But you will almost certainly learn new stuff about older versions too. But that’s not what I wanted to pick on for this point. There are many new speakers at every PASS Summit, and content that has not been covered in other places. This year, for example, LobsterPot’s Roger Noble (@roger_noble) is giving a presentation for the first time. He’s a regular around the Australian circuit, but this is his first time presenting to a US audience. New Zealand’s Paul White (@sql_kiwi) is attending his first PASS Summit, and will be giving over four hours of incredibly deep stuff that has never been presented anywhere in the US before (I can’t say the world, because he did present similar material in Adelaide earlier in the year). ...Summit Borrowed... No, I’m not talking about plagiarism – the talks you’ll hear are all their own work. But you will get a lot of stuff you’ll be able to take back and apply at work. The PASS Summit sessions are not full of sales-pitches, telling you about how great things could be if only you’d buy some third-party vendor product. It’s simply not that kind of conference, and PASS doesn’t allow that kind of talk to take place. Instead, you’ll be taught techniques, and be able to download scripts and slides to let you perform that magic back at work when you get home. You will definitely find plenty of ideas to borrow at the PASS Summit. ...Summit Blue Yeah – and there’s karaoke. Blue - Jason - SQL Karaoke - YouTube

    Read the article

  • Spooling in SQL execution plans

    - by Rob Farley
    Sewing has never been my thing. I barely even know the terminology, and when discussing this with American friends, I even found out that half the words that Americans use are different to the words that English and Australian people use. That said – let’s talk about spools! In particular, the Spool operators that you find in some SQL execution plans. This post is for T-SQL Tuesday, hosted this month by me! I’ve chosen to write about spools because they seem to get a bad rap (even in my song I used the line “There’s spooling from a CTE, they’ve got recursion needlessly”). I figured it was worth covering some of what spools are about, and hopefully explain why they are remarkably necessary, and generally very useful. If you have a look at the Books Online page about Plan Operators, at http://msdn.microsoft.com/en-us/library/ms191158.aspx, and do a search for the word ‘spool’, you’ll notice it says there are 46 matches. 46! Yeah, that’s what I thought too... Spooling is mentioned in several operators: Eager Spool, Lazy Spool, Index Spool (sometimes called a Nonclustered Index Spool), Row Count Spool, Spool, Table Spool, and Window Spool (oh, and Cache, which is a special kind of spool for a single row, but as it isn’t used in SQL 2012, I won’t describe it any further here). Spool, Table Spool, Index Spool, Window Spool and Row Count Spool are all physical operators, whereas Eager Spool and Lazy Spool are logical operators, describing the way that the other spools work. For example, you might see a Table Spool which is either Eager or Lazy. A Window Spool can actually act as both, as I’ll mention in a moment. In sewing, cotton is put onto a spool to make it more useful. You might buy it in bulk on a cone, but if you’re going to be using a sewing machine, then you quite probably want to have it on a spool or bobbin, which allows it to be used in a more effective way. This is the picture that I want you to think about in relation to your data. I’m sure you use spools every time you use your sewing machine. I know I do. I can’t think of a time when I’ve got out my sewing machine to do some sewing and haven’t used a spool. However, I often run SQL queries that don’t use spools. You see, the data that is consumed by my query is typically in a useful state without a spool. It’s like I can just sew with my cotton despite it not being on a spool! Many of my favourite features in T-SQL do like to use spools though. This looks like a very similar query to before, but includes an OVER clause to return a column telling me the number of rows in my data set. I’ll describe what’s going on in a few paragraphs’ time. So what does a Spool operator actually do? The spool operator consumes a set of data, and stores it in a temporary structure, in the tempdb database. This structure is typically either a Table (ie, a heap), or an Index (ie, a b-tree). If no data is actually needed from it, then it could also be a Row Count spool, which only stores the number of rows that the spool operator consumes. A Window Spool is another option if the data being consumed is tightly linked to windows of data, such as when the ROWS/RANGE clause of the OVER clause is being used. You could maybe think about the type of spool being like whether the cotton is going onto a small bobbin to fit in the base of the sewing machine, or whether it’s a larger spool for the top. A Table or Index Spool is either Eager or Lazy in nature. Eager and Lazy are Logical operators, which talk more about the behaviour, rather than the physical operation. If I’m sewing, I can either be all enthusiastic and get all my cotton onto the spool before I start, or I can do it as I need it. “Lazy” might not the be the best word to describe a person – in the SQL world it describes the idea of either fetching all the rows to build up the whole spool when the operator is called (Eager), or populating the spool only as it’s needed (Lazy). Window Spools are both physical and logical. They’re eager on a per-window basis, but lazy between windows. And when is it needed? The way I see it, spools are needed for two reasons. 1 – When data is going to be needed AGAIN. 2 – When data needs to be kept away from the original source. If you’re someone that writes long stored procedures, you are probably quite aware of the second scenario. I see plenty of stored procedures being written this way – where the query writer populates a temporary table, so that they can make updates to it without risking the original table. SQL does this too. Imagine I’m updating my contact list, and some of my changes move data to later in the book. If I’m not careful, I might update the same row a second time (or even enter an infinite loop, updating it over and over). A spool can make sure that I don’t, by using a copy of the data. This problem is known as the Halloween Effect (not because it’s spooky, but because it was discovered in late October one year). As I’m sure you can imagine, the kind of spool you’d need to protect against the Halloween Effect would be eager, because if you’re only handling one row at a time, then you’re not providing the protection... An eager spool will block the flow of data, waiting until it has fetched all the data before serving it up to the operator that called it. In the query below I’m forcing the Query Optimizer to use an index which would be upset if the Name column values got changed, and we see that before any data is fetched, a spool is created to load the data into. This doesn’t stop the index being maintained, but it does mean that the index is protected from the changes that are being done. There are plenty of times, though, when you need data repeatedly. Consider the query I put above. A simple join, but then counting the number of rows that came through. The way that this has executed (be it ideal or not), is to ask that a Table Spool be populated. That’s the Table Spool operator on the top row. That spool can produce the same set of rows repeatedly. This is the behaviour that we see in the bottom half of the plan. In the bottom half of the plan, we see that the a join is being done between the rows that are being sourced from the spool – one being aggregated and one not – producing the columns that we need for the query. Table v Index When considering whether to use a Table Spool or an Index Spool, the question that the Query Optimizer needs to answer is whether there is sufficient benefit to storing the data in a b-tree. The idea of having data in indexes is great, but of course there is a cost to maintaining them. Here we’re creating a temporary structure for data, and there is a cost associated with populating each row into its correct position according to a b-tree, as opposed to simply adding it to the end of the list of rows in a heap. Using a b-tree could even result in page-splits as the b-tree is populated, so there had better be a reason to use that kind of structure. That all depends on how the data is going to be used in other parts of the plan. If you’ve ever thought that you could use a temporary index for a particular query, well this is it – and the Query Optimizer can do that if it thinks it’s worthwhile. It’s worth noting that just because a Spool is populated using an Index Spool, it can still be fetched using a Table Spool. The details about whether or not a Spool used as a source shows as a Table Spool or an Index Spool is more about whether a Seek predicate is used, rather than on the underlying structure. Recursive CTE I’ve already shown you an example of spooling when the OVER clause is used. You might see them being used whenever you have data that is needed multiple times, and CTEs are quite common here. With the definition of a set of data described in a CTE, if the query writer is leveraging this by referring to the CTE multiple times, and there’s no simplification to be leveraged, a spool could theoretically be used to avoid reapplying the CTE’s logic. Annoyingly, this doesn’t happen. Consider this query, which really looks like it’s using the same data twice. I’m creating a set of data (which is completely deterministic, by the way), and then joining it back to itself. There seems to be no reason why it shouldn’t use a spool for the set described by the CTE, but it doesn’t. On the other hand, if we don’t pull as many columns back, we might see a very different plan. You see, CTEs, like all sub-queries, are simplified out to figure out the best way of executing the whole query. My example is somewhat contrived, and although there are plenty of cases when it’s nice to give the Query Optimizer hints about how to execute queries, it usually doesn’t do a bad job, even without spooling (and you can always use a temporary table). When recursion is used, though, spooling should be expected. Consider what we’re asking for in a recursive CTE. We’re telling the system to construct a set of data using an initial query, and then use set as a source for another query, piping this back into the same set and back around. It’s very much a spool. The analogy of cotton is long gone here, as the idea of having a continual loop of cotton feeding onto a spool and off again doesn’t quite fit, but that’s what we have here. Data is being fed onto the spool, and getting pulled out a second time when the spool is used as a source. (This query is running on AdventureWorks, which has a ManagerID column in HumanResources.Employee, not AdventureWorks2012) The Index Spool operator is sucking rows into it – lazily. It has to be lazy, because at the start, there’s only one row to be had. However, as rows get populated onto the spool, the Table Spool operator on the right can return rows when asked, ending up with more rows (potentially) getting back onto the spool, ready for the next round. (The Assert operator is merely checking to see if we’ve reached the MAXRECURSION point – it vanishes if you use OPTION (MAXRECURSION 0), which you can try yourself if you like). Spools are useful. Don’t lose sight of that. Every time you use temporary tables or table variables in a stored procedure, you’re essentially doing the same – don’t get upset at the Query Optimizer for doing so, even if you think the spool looks like an expensive part of the query. I hope you’re enjoying this T-SQL Tuesday. Why not head over to my post that is hosting it this month to read about some other plan operators? At some point I’ll write a summary post – once I have you should find a comment below pointing at it. @rob_farley

    Read the article

  • T-SQL Tuesday - the swag

    - by Rob Farley
    This month’s T-SQL Tuesday is hosted by Kendal van Dyke (@SQLDBA), and is on the topic of swag. He asks about the best SQL Server swag that we’ve ever received from a conference. I can’t say I ever focus on getting the swag at conferences, as I see some people doing. I know there are plenty of people that get around all the sponsors as soon as they’ve arrived, collecting whatever goodies they can, sometimes as token gifts for those at home, sometimes as giveaways for the user groups they attend. I remember a few years ago at my first PASS Summit, the SQLCAT team gave me a large pile of leftover SQL Server swag to give away to my user group – piles of branded things to stop your phone sliding off your car dashboard, and other things. The user group members thought it was great, and over the course of a few months, happily cleared me out of it all. I tend to consider swag to be something that you haven’t earned except by being at a conference, and there was no winning associated with it, it was simply a giveaway item at a sponsor booth. That means I don’t include the HP Mini laptop that was given away at TechEd Australia a few years ago to every attendee, or the SQL Server bag and Camelbak bottle that I was given as a thank-you for writing a guest blog post (which I use as my regular laptop bag and water bottle for work). I don’t even include the copy of Midtown Madness that I got as a door prize at my vey first TechEd event in 1999 (that was a really good game, and even meant that when I went to Chicago last year, I felt a strange familiarity about the place). I don’t want to include shirts in the mix either. I was given a nice SQL Server shirt about five years ago TechEd Australia. It’s a business shirt (buttons, cuffs, pocket on the chest), black with the SQL Server logo on it. It was such a nice shirt that I commented about it to the Product Marketing Manager for Australia (Christine, at the time), who unexpectedly arranged for me to get another one. That was certainly an improvement on the tent I was given at one of the MVP conference I attended. So when I consider these ‘rules’, two pieces of swag come to mind, and I think both were at PASS Summits (although I can’t be sure). One was a hand-warmer from HP, one of the “crystallisation-type” ones, which proved extremely popular when I got home, until one day when it didn’t survive being recharged – not overly SQL related, but still it was good swag. The other was an umbrella, from expressor, which was from the PASS Summit in 2010, my first PASS Summit. I remember it well – Blythe Morrow (now Gietz) (@blythemorrow) was working the booth, having stopped working for PASS some time before, but she’d been on my list of people to meet, as I’d had plenty of contact with her while she’d worked at PASS, my being a chapter leader and general volunteer. There had been an expressor dinner on one of the first evenings, which I’d been asked to be at, which is when I’d met lots of SQL people in person for the first time, including Ted Krueger (@onpnt), Jessica Moss (@jessicamoss) and Blythe. Anyway, at some point the next day I swung by their booth to say hello and thank them for the dinner, and Blythe says “Oh, we have the best swag – here!” and handed me an umbrella. And she was right. It’s excellent. @rob_farley

    Read the article

  • How to start gnome-shell?

    - by Tim
    I successfully installed the gnome 3 gnome-shell from the launchpad ppa on my fully updated Natty test system. However, nothing I tried could get it to actually run. If I selected it in the startup options, I got a plain light blue screen with absolutely nothing on it. If I tried to start it using "gnome-shell --replace", I got: gnome-shell --replace & [2] 3251 tim@nattytest:/usr/lib$ Traceback (most recent call last): File "/usr/bin/gnome-shell", line 705, in normal_exit = run_shell() File "/usr/bin/gnome-shell", line 293, in run_shell if shell is None: UnboundLocalError: local variable 'shell' referenced before assignment /usr/bin/compiz (core) - Error: Screen 0 on display ":0.0" already has a window manager; try using the --replace option to replace the current window manager. [2]+ Exit 1 gnome-shell --replace I also tried preceding that with metacity --replace, as suggested at ubuntuforums.com. But, I got the same failure. I also linked /usr/lib/libmozjs.so to /usr/lib/xulrunner-2.0b12/libmozjs.so, which did not help either. No matter what I try, I get the same error messages. Thanks.

    Read the article

  • My Lightning Talk in MP3 format

    - by Rob Farley
    Download it now via http://bit.ly/RFCollation  Lots of people tell me they wish they’d heard my Lightning Talk from the PASS Summit. This was the one that was five minutes, in which I explained Collation using examples comparing US English, UK English and Australian English. At the end, I showed my Arsenal thongs. You can see a picture of them below. There was a visual joke involving the name Arsenal too... After the recordings became available, I asked the PASS legal people, and they said I could do what I liked with my own five-minute set, so long as I didn’t sell it. So I made an MP3. I’ve uploaded it to the LobsterPot Solutions web server, and provided an easy link via http://bit.ly/RFCollation. It’s a link straight to the MP3, and you’re welcome to download it, put it on your iPod, whatever you like. And also feel free to write comments here, to let me know what you think.

    Read the article

  • PowerShell to fetch a SQL Execution Plan

    - by Rob Farley
    With PowerShell becoming the scripting language of choice for many people, I’ve occasionally wondered about using it to analyse execution plans. After all, an execution plan is just XML, and PowerShell is just one tool which will very easily handle xml. The thing is – there’s no Get-SqlPlan cmdlet available, which has frustrated me in the past. Today I figured I’d make one. I know that I can write T-SQL to get an execution plan using SET SHOWPLAN_XML ON, but the problem is that this must be the only statement in a batch. So I used go, and a couple of newlines, and whipped up the following one-liner: function Get-SqlPlan([string] $query, [string] $server, [string] $db) { return ([xml] (invoke-sqlcmd -Server $server -Database $db -Query "set showplan_xml on;`ngo`n$query").Item( 0)) } (but please bear in mind that I have the SQL Snapins installed, which provides invoke-sqlcmd) To use this, I just do something like: $plan = get-sqlplan "select name from Production.Product" "." "AdventureWorks" And then find myself with an easy way to navigate through an execution plan! At some point I should make the function more robust, but this should be a good starter for any SQL PowerShell enthusiasts (like Aaron Nelson) out there.

    Read the article

  • How to start gnome-shell?

    - by Tim
    I successfully installed GNOME 3 and GNOME Shell from the Launchpad PPA on my fully updated Natty test system. However, nothing I tried could get it to actually run. If I selected it in the startup options, I got a plain light blue screen with absolutely nothing on it. If I tried to start it using gnome-shell --replace, I got: gnome-shell --replace & [2] 3251 tim@nattytest:/usr/lib$ Traceback (most recent call last): File "/usr/bin/gnome-shell", line 705, in <module> normal_exit = run_shell() File "/usr/bin/gnome-shell", line 293, in run_shell if shell is None: UnboundLocalError: local variable 'shell' referenced before assignment /usr/bin/compiz (core) - Error: Screen 0 on display ":0.0" already has a window manager; try using the --replace option to replace the current window manager. [2]+ Exit 1 gnome-shell --replace I also tried preceding that with metacity --replace, as suggested at ubuntuforums.com. But, I got the same failure. I also linked /usr/lib/libmozjs.so to /usr/lib/xulrunner-2.0b12/libmozjs.so, which did not help either. No matter what I try, I get the same error messages.

    Read the article

  • T-SQL in Chicago – the LobsterPot teams with DataEducation

    - by Rob Farley
    In May, I’ll be in the US. I have board meetings for PASS at the SQLRally event in Dallas, and then I’m going to be spending a bit of time in Chicago. The big news is that while I’m in Chicago (May 14-16), I’m going to teach my “Advanced T-SQL Querying and Reporting: Building Effectiveness” course. This is a course that I’ve been teaching since the 2005 days, and have modified over time for 2008 and 2012. It’s very much my most popular course, and I love teaching it. Let me tell you why. For years, I wrote queries and thought I was good at it. I was a developer. I’d written a lot of C (and other, more fun languages like Prolog and Lisp) at university, and then got into the ‘real world’ and coded in VB, PL/SQL, and so on through to C#, and saw SQL (whichever database system it was) as just a way of getting the data back. I could write a query to return just about whatever data I wanted, and that was good. I was better at it than the people around me, and that helped. (It didn’t help my progression into management, then it just became a frustration, but for the most part, it was good to know that I was good at this particular thing.) But then I discovered the other side of querying – the execution plan. I started to learn about the translation from what I’d written into the plan, and this impacted my query-writing significantly. I look back at the queries I wrote before I understood this, and shudder. I wrote queries that were correct, but often a long way from effective. I’d done query tuning, but had largely done it without considering the plan, just inferring what indexes would help. This is not a performance-tuning course. It’s focused on the T-SQL that you read and write. But performance is a significant and recurring theme. Effective T-SQL has to be about performance – it’s the biggest way that a query becomes effective. There are other aspects too though – such as using constructs better. For example – I can write code that modifies data nicely, but if I haven’t learned about the MERGE statement and the way that it can impact things, I’m missing a few tricks. If you’re going to do this course, a good place to be is the situation I was in a few years before I wrote this course. You’re probably comfortable with writing T-SQL queries. You know how to make a SELECT statement do what you need it to, but feel there has to be a better way. You can write JOINs easily, and understand how to use LEFT JOIN to make sure you don’t filter out rows from the first table, but you’re coding blind. The first module I cover is on Query Execution. Take a look at the Course Outline at Data Education’s website. The first part of the first module is on the components of a SELECT statement (where I make you think harder about GROUP BY than you probably have before), but then we jump straight into Execution Plans. Some stuff on indexes is in there too, as is simplification and SARGability. Some of this is stuff that you may have heard me present on at conferences, but here you have me for three days straight. I’m sure you can imagine that we revisit these topics throughout the rest of the course as well, and you’d be right. In the second and third modules we look at a bunch of other aspects, including some of the T-SQL constructs that lots of people don’t know, and various other things that can help your T-SQL be, well, more effective. I’ve had quite a lot of people do this course and be itching to get back to work even on the first day. That’s not a comment about the jokes I tell, but because people want to look at the queries they run. LobsterPot Solutions is thrilled to be partnering with Data Education to bring this training to Chicago. Visit their website to register for the course. @rob_farley

    Read the article

  • The SSIS tuning tip that everyone misses

    - by Rob Farley
    I know that everyone misses this, because I’m yet to find someone who doesn’t have a bit of an epiphany when I describe this. When tuning Data Flows in SQL Server Integration Services, people see the Data Flow as moving from the Source to the Destination, passing through a number of transformations. What people don’t consider is the Source, getting the data out of a database. Remember, the source of data for your Data Flow is not your Source Component. It’s wherever the data is, within your database, probably on a disk somewhere. You need to tune your query to optimise it for SSIS, and this is what most people fail to do. I’m not suggesting that people don’t tune their queries – there’s plenty of information out there about making sure that your queries run as fast as possible. But for SSIS, it’s not about how fast your query runs. Let me say that again, but in bolder text: The speed of an SSIS Source is not about how fast your query runs. If your query is used in a Source component for SSIS, the thing that matters is how fast it starts returning data. In particular, those first 10,000 rows to populate that first buffer, ready to pass down the rest of the transformations on its way to the Destination. Let’s look at a very simple query as an example, using the AdventureWorks database: We’re picking the different Weight values out of the Product table, and it’s doing this by scanning the table and doing a Sort. It’s a Distinct Sort, which means that the duplicates are discarded. It'll be no surprise to see that the data produced is sorted. Obvious, I know, but I'm making a comparison to what I'll do later. Before I explain the problem here, let me jump back into the SSIS world... If you’ve investigated how to tune an SSIS flow, then you’ll know that some SSIS Data Flow Transformations are known to be Blocking, some are Partially Blocking, and some are simply Row transformations. Take the SSIS Sort transformation, for example. I’m using a larger data set for this, because my small list of Weights won’t demonstrate it well enough. Seven buffers of data came out of the source, but none of them could be pushed past the Sort operator, just in case the last buffer contained the data that would be sorted into the first buffer. This is a blocking operation. Back in the land of T-SQL, we consider our Distinct Sort operator. It’s also blocking. It won’t let data through until it’s seen all of it. If you weren’t okay with blocking operations in SSIS, why would you be happy with them in an execution plan? The source of your data is not your OLE DB Source. Remember this. The source of your data is the NCIX/CIX/Heap from which it’s being pulled. Picture it like this... the data flowing from the Clustered Index, through the Distinct Sort operator, into the SELECT operator, where a series of SSIS Buffers are populated, flowing (as they get full) down through the SSIS transformations. Alright, I know that I’m taking some liberties here, because the two queries aren’t the same, but consider the visual. The data is flowing from your disk and through your execution plan before it reaches SSIS, so you could easily find that a blocking operation in your plan is just as painful as a blocking operation in your SSIS Data Flow. Luckily, T-SQL gives us a brilliant query hint to help avoid this. OPTION (FAST 10000) This hint means that it will choose a query which will optimise for the first 10,000 rows – the default SSIS buffer size. And the effect can be quite significant. First let’s consider a simple example, then we’ll look at a larger one. Consider our weights. We don’t have 10,000, so I’m going to use OPTION (FAST 1) instead. You’ll notice that the query is more expensive, using a Flow Distinct operator instead of the Distinct Sort. This operator is consuming 84% of the query, instead of the 59% we saw from the Distinct Sort. But the first row could be returned quicker – a Flow Distinct operator is non-blocking. The data here isn’t sorted, of course. It’s in the same order that it came out of the index, just with duplicates removed. As soon as a Flow Distinct sees a value that it hasn’t come across before, it pushes it out to the operator on its left. It still has to maintain the list of what it’s seen so far, but by handling it one row at a time, it can push rows through quicker. Overall, it’s a lot more work than the Distinct Sort, but if the priority is the first few rows, then perhaps that’s exactly what we want. The Query Optimizer seems to do this by optimising the query as if there were only one row coming through: This 1 row estimation is caused by the Query Optimizer imagining the SELECT operation saying “Give me one row” first, and this message being passed all the way along. The request might not make it all the way back to the source, but in my simple example, it does. I hope this simple example has helped you understand the significance of the blocking operator. Now I’m going to show you an example on a much larger data set. This data was fetching about 780,000 rows, and these are the Estimated Plans. The data needed to be Sorted, to support further SSIS operations that needed that. First, without the hint. ...and now with OPTION (FAST 10000): A very different plan, I’m sure you’ll agree. In case you’re curious, those arrows in the top one are 780,000 rows in size. In the second, they’re estimated to be 10,000, although the Actual figures end up being 780,000. The top one definitely runs faster. It finished several times faster than the second one. With the amount of data being considered, these numbers were in minutes. Look at the second one – it’s doing Nested Loops, across 780,000 rows! That’s not generally recommended at all. That’s “Go and make yourself a coffee” time. In this case, it was about six or seven minutes. The faster one finished in about a minute. But in SSIS-land, things are different. The particular data flow that was consuming this data was significant. It was being pumped into a Script Component to process each row based on previous rows, creating about a dozen different flows. The data flow would take roughly ten minutes to run – ten minutes from when the data first appeared. The query that completes faster – chosen by the Query Optimizer with no hints, based on accurate statistics (rather than pretending the numbers are smaller) – would take a minute to start getting the data into SSIS, at which point the ten-minute flow would start, taking eleven minutes to complete. The query that took longer – chosen by the Query Optimizer pretending it only wanted the first 10,000 rows – would take only ten seconds to fill the first buffer. Despite the fact that it might have taken the database another six or seven minutes to get the data out, SSIS didn’t care. Every time it wanted the next buffer of data, it was already available, and the whole process finished in about ten minutes and ten seconds. When debugging SSIS, you run the package, and sit there waiting to see the Debug information start appearing. You look for the numbers on the data flow, and seeing operators going Yellow and Green. Without the hint, I’d sit there for a minute. With the hint, just ten seconds. You can imagine which one I preferred. By adding this hint, it felt like a magic wand had been waved across the query, to make it run several times faster. It wasn’t the case at all – but it felt like it to SSIS.

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >