Search Results

Search found 25 results on 1 pages for 'lobsterpot'.

Page 1/1 | 1 

  • T-SQL in Chicago – the LobsterPot teams with DataEducation

    - by Rob Farley
    In May, I’ll be in the US. I have board meetings for PASS at the SQLRally event in Dallas, and then I’m going to be spending a bit of time in Chicago. The big news is that while I’m in Chicago (May 14-16), I’m going to teach my “Advanced T-SQL Querying and Reporting: Building Effectiveness” course. This is a course that I’ve been teaching since the 2005 days, and have modified over time for 2008 and 2012. It’s very much my most popular course, and I love teaching it. Let me tell you why. For years, I wrote queries and thought I was good at it. I was a developer. I’d written a lot of C (and other, more fun languages like Prolog and Lisp) at university, and then got into the ‘real world’ and coded in VB, PL/SQL, and so on through to C#, and saw SQL (whichever database system it was) as just a way of getting the data back. I could write a query to return just about whatever data I wanted, and that was good. I was better at it than the people around me, and that helped. (It didn’t help my progression into management, then it just became a frustration, but for the most part, it was good to know that I was good at this particular thing.) But then I discovered the other side of querying – the execution plan. I started to learn about the translation from what I’d written into the plan, and this impacted my query-writing significantly. I look back at the queries I wrote before I understood this, and shudder. I wrote queries that were correct, but often a long way from effective. I’d done query tuning, but had largely done it without considering the plan, just inferring what indexes would help. This is not a performance-tuning course. It’s focused on the T-SQL that you read and write. But performance is a significant and recurring theme. Effective T-SQL has to be about performance – it’s the biggest way that a query becomes effective. There are other aspects too though – such as using constructs better. For example – I can write code that modifies data nicely, but if I haven’t learned about the MERGE statement and the way that it can impact things, I’m missing a few tricks. If you’re going to do this course, a good place to be is the situation I was in a few years before I wrote this course. You’re probably comfortable with writing T-SQL queries. You know how to make a SELECT statement do what you need it to, but feel there has to be a better way. You can write JOINs easily, and understand how to use LEFT JOIN to make sure you don’t filter out rows from the first table, but you’re coding blind. The first module I cover is on Query Execution. Take a look at the Course Outline at Data Education’s website. The first part of the first module is on the components of a SELECT statement (where I make you think harder about GROUP BY than you probably have before), but then we jump straight into Execution Plans. Some stuff on indexes is in there too, as is simplification and SARGability. Some of this is stuff that you may have heard me present on at conferences, but here you have me for three days straight. I’m sure you can imagine that we revisit these topics throughout the rest of the course as well, and you’d be right. In the second and third modules we look at a bunch of other aspects, including some of the T-SQL constructs that lots of people don’t know, and various other things that can help your T-SQL be, well, more effective. I’ve had quite a lot of people do this course and be itching to get back to work even on the first day. That’s not a comment about the jokes I tell, but because people want to look at the queries they run. LobsterPot Solutions is thrilled to be partnering with Data Education to bring this training to Chicago. Visit their website to register for the course. @rob_farley

    Read the article

  • LobsterPot Solutions in the USA

    - by Rob Farley
    We’re expanding! I’m thrilled to announce that Microsoft Gold Partner LobsterPot Solutions has started another branch appointing the amazing Ted Krueger (5-time SQL MVP awardee) as the US lead. Ted is well-known in the SQL Server world, having written books on indexing, consulting and on being a DBA (not to mention contributing chapters to both MVP Deep Dives books). He is an expert on replication and high availability, and strong in the Business Intelligence space – vast experience which is both broad and deep. Ted is based in the south east corner of Wisconsin, just north of Chicago. He has been a consultant for eons and has helped many clients with their projects and problems, taking the role as both technical lead and consulting lead. He is also tireless in supporting and developing the SQL Server community, presenting at conferences across America, and helping people through his blog, Twitter and more. Despite all this – it’s neither his technical excellence with SQL Server nor his consulting skill that made me want him to lead LobsterPot’s US venture. I wanted Ted because of his values. In the time I’ve known Ted, I’ve found his integrity to be excellent, and found him to be morally beyond reproach. This is the biggest priority I have when finding people to represent the LobsterPot brand. I have no qualms in recommending Ted’s character or work ethic. It’s not just my thoughts on him – all my trusted friends that know Ted agree about this. So last week, LobsterPot Solutions LLC was formed in the United States, and in a couple of weeks, we will be open for business! LobsterPot Solutions can be contacted via email at [email protected], on the web at either www.lobsterpot.com.au or www.lobsterpotsolutions.com, and on Twitter as @lobsterpot_au and @lobsterpot_us. Ted Kruger blogs at LessThanDot, and can also be found on Twitter and LinkedIn. This post is cross-posted from http://lobsterpotsolutions.com/lobsterpot-solutions-in-the-usa

    Read the article

  • Visualising data a different way with Pivot collections

    - by Rob Farley
    Roger’s been doing a great job extending PivotViewer recently, and you can find the list of LobsterPot pivots at http://pivot.lobsterpot.com.au Many months back, the TED Talk that Gary Flake did about Pivot caught my imagination, and I did some research into it. At the time, most of what we did with Pivot was geared towards what we could do for clients, including making Pivot collections based on students at a school, and using it to browse PDF invoices by their various properties. We had actual commercial work based on Pivot collections back then, and it was all kinds of fun. Later, we made some collections for events that were happening, and even got featured in the TechEd Australia keynote. But I’m getting ahead of myself... let me explain the concept. A Pivot collection is an XML file (with .cxml extension) which lists Items, each linking to an image that’s stored in a Deep Zoom format (this means that it contains tiles like Bing Maps, so that the browser can request only the ones of interest according to the zoom level). This collection can be shown in a Silverlight application that uses the PivotViewer control, or in the Pivot Browser that’s available from getpivot.com. Filtering and sorting the items according to their facets (attributes, such as size, age, category, etc), the PivotViewer rearranges the way that these are shown in a very dynamic way. To quote Gary Flake, this lets us “see patterns which are otherwise hidden”. This browsing mechanism is very suited to a number of different methods, because it’s just that – browsing. It’s not searching, it’s more akin to window-shopping than doing an internet search. When we decided to put something together for the conferences such as TechEd Australia 2010 and the PASS Summit 2010, we did some screen-scraping to provide a different view of data that was already available online. Nick Hodge and Michael Kordahi from Microsoft liked the idea a lot, and after a bit of tweaking, we produced one that Michael used in the TechEd Australia keynote to show the variety of talks on offer. It’s interesting to see a pattern in this data: The Office track has the most sessions, but if the Interactive Sessions and Instructor-Led Labs are removed, it drops down to only the sixth most popular track, with Cloud Computing taking over. This is something which just isn’t obvious when you look an ordinary search tool. You get a much better feel for the data when moving around it like this. The more observant amongst you will have noticed some difference in the collection that Michael is demonstrating in the picture above with the screenshots I’ve shown. That’s because it’s been extended some more. At the SQLBits conference in the UK this year, I had some interesting discussions with the guys from Xpert360, particularly Phil Carter, who I’d met in 2009 at an earlier SQLBits conference. They had got around to producing a Pivot collection based on the SQLBits data, which we had been planning to do but ran out of time. We discussed some of ways that Pivot could be used, including the ways that my old friend Howard Dierking had extended it for the MSDN Magazine. I’m not suggesting I influenced Xpert360 at all, but they certainly inspired us with some of their posts on the matter So with LobsterPot guys David Gardiner and Roger Noble both having dabbled in Pivot collections (and Dave doing some for clients), I set Roger to work on extending it some more. He’s used various events and so on to be able to make an environment that allows us to do quick deployment of new collections, as well as showing the data in a grid view which behaves as if it were simply a third view of the data (the other two being the array of images and the ‘histogram’ view). I see PivotViewer as being a significant step in data visualisation – so much so that I feature it when I deliver talks on Spatial Data Visualisation methods. Any time when there is information that can be conveyed through an image, you have to ask yourself how best to show that image, and whether that image is the focal point. For Spatial data, the image is most often a map, and the map becomes the central mode for navigation. I show Pivot with postcode areas, since I can browse the postcodes based on their data, and many of the images are recognisable (to locals of South Australia). Naturally, the images could link through to the map itself, and so on, but generally people think of Spatial data in terms of navigating a map, which doesn’t always gel with the information you’re trying to extract. Roger’s even looking into ways to hook PivotViewer into the Bing Maps API, in a similar way to the Deep Earth project, displaying different levels of map detail according to how ‘zoomed in’ the images are. Some of the work that Dave did with one of the schools was generating the Deep Zoom tiles “on the fly”, based on images stored in a database, and Roger has produced a collection which uses images from flickr, that lets you move from one search term to another. Pulling the images down from flickr.com isn’t particularly ideal from a performance aspect, and flickr doesn’t store images in a small-enough format to really lend itself to this use, but you might agree that it’s an interesting concept which compares nicely to using Maps. I’m looking forward to future versions of the PivotViewer control, and hope they provide many more events that can be used, and even more hooks into it. Naturally, LobsterPot could help provide your business with a PivotViewer experience, but you can probably do a lot of it yourself too. There’s a thorough guide at getpivot.com, which is how we got into it. For some examples of what we’ve done, have a look at http://pivot.lobsterpot.com.au. I’d like to see PivotViewer really catch on a data visualisation tool.

    Read the article

  • On technical talent

    - by Rob Farley
    In honour of the regular T-SQL Tuesday blogging, the UnSQL theme started, looking at topics that were not directly SQL related, but nevertheless quite interesting. This is the brainchild of Jen McCown, who posted the second of these recently. I’m actually a bit late in responding, as I haven’t got it in my head to look for these posts yet. Still, Jen says I can still contribute now, hence this post. The theme this time is on Tech Giants. I could list people all day for those I admire in the SQL Server space, and go on even longer if I branch out to other areas. But I actually want to highlight four guys that I admire so much for their skills, integrity and general awesomeness that I hired them. Yes – the guys that work for me at LobsterPot Solutions, being Ben McNamara, David Gardiner, Roger Noble and Ashley Sewell. I admire them all, and they present the company with a platform on which to grow.

    Read the article

  • Slide-decks from recent Adelaide SQL Server UG meetings

    - by Rob Farley
    The UK has been well represented this summer at the Adelaide SQL Server User Group, with presentations from Chris Testa-O’Neill (isn’t that the right link? Maybe try this one) and Martin Cairney. The slides are available here and here. I thought I’d particularly mention Martin’s, and how it’s relevant to this month’s T-SQL Tuesday. Martin spoke about Policy-Based Management and the Enterprise Policy Management Framework – something which is remarkably under-used, and yet which can really impact your ability to look after environments. If you have policies set up, then you can easily test each of your SQL instances to see if they are still satisfying a set of policies as defined. Automation (the topic of this month’s T-SQL Tuesday) should mean that your life is made easier, thereby enabling to you to do more. It shouldn’t remove the human element, but should remove (most of) the human errors. People still need to manage the situation, and work out what needs to be done, etc. We haven’t reached a point where computers can replace people, but they are very good at replace the mundaneness and monotony of our jobs. They’ve made our lives more interesting (although many would rightly argue that they have also made our lives more complex) by letting us focus on the stuff that changes. Martin named his talk Put Your Feet Up, which nicely expresses the fact that managing systems shouldn’t be about running around checking things all the time. It must be about having systems in place which tell you when things aren’t going well. It’s never quite as simple as being able to actually put your feet up, but certainly no system should require constant attention. It’s definitely a policy we at LobsterPot adhere to, whether it’s an alert to let us know that an ETL package has run successfully, or a script that generates some code for a report. If things can be automated, it reduces the chance of error, reduces the repetitive nature of work, and in general, keeps both consultants and clients much happier.

    Read the article

  • Slide-decks from recent Adelaide SQL Server UG meetings

    - by Rob Farley
    The UK has been well represented this summer at the Adelaide SQL Server User Group, with presentations from Chris Testa-O’Neill (isn’t that the right link? Maybe try this one) and Martin Cairney. The slides are available here and here. I thought I’d particularly mention Martin’s, and how it’s relevant to this month’s T-SQL Tuesday. Martin spoke about Policy-Based Management and the Enterprise Policy Management Framework – something which is remarkably under-used, and yet which can really impact your ability to look after environments. If you have policies set up, then you can easily test each of your SQL instances to see if they are still satisfying a set of policies as defined. Automation (the topic of this month’s T-SQL Tuesday) should mean that your life is made easier, thereby enabling to you to do more. It shouldn’t remove the human element, but should remove (most of) the human errors. People still need to manage the situation, and work out what needs to be done, etc. We haven’t reached a point where computers can replace people, but they are very good at replace the mundaneness and monotony of our jobs. They’ve made our lives more interesting (although many would rightly argue that they have also made our lives more complex) by letting us focus on the stuff that changes. Martin named his talk Put Your Feet Up, which nicely expresses the fact that managing systems shouldn’t be about running around checking things all the time. It must be about having systems in place which tell you when things aren’t going well. It’s never quite as simple as being able to actually put your feet up, but certainly no system should require constant attention. It’s definitely a policy we at LobsterPot adhere to, whether it’s an alert to let us know that an ETL package has run successfully, or a script that generates some code for a report. If things can be automated, it reduces the chance of error, reduces the repetitive nature of work, and in general, keeps both consultants and clients much happier.

    Read the article

  • Be the surgeon

    - by Rob Farley
    It’s a phrase I use often, especially when teaching, and I wish I had realised the concept years earlier. (And of course, fits with this month’s T-SQL Tuesday topic, hosted by Argenis Fernandez) When I’m sick enough to go to the doctor, I see a GP. I used to typically see the same guy, but he’s moved on now. However, when he has been able to roughly identify the area of the problem, I get referred to a specialist, sometimes a surgeon. Being a surgeon requires a refined set of skills. It’s why they often don’t like to be called “Doctor”, and prefer the traditional “Mister” (the history is that the doctor used to make the diagnosis, and then hand the patient over to the person who didn’t have a doctorate, but rather was an expert cutter, typically from a background in butchering). But if you ask the surgeon about the pain you have in your leg sometimes, you’ll get told to ask your GP. It’s not that your surgeon isn’t interested – they just don’t know the answer. IT is the same now. That wasn’t something that I really understood when I got out of university. I knew there was a lot to know about IT – I’d just done an honours degree in it. But I also knew that I’d done well in just about all my subjects, and felt like I had a handle on everything. I got into developing, and still felt that having a good level of understanding about every aspect of IT was a good thing. This got me through for the first six or seven years of my career. But then I started to realise that I couldn’t compete. I’d moved into management, and was spending my days running projects, rather than writing code. The kids were getting older. I’d had a bad back injury (ask anyone with chronic pain how it affects  your ability to concentrate, retain information, etc). But most of all, IT was getting larger. I knew kids without lives who knew more than I did. And I felt like I could easily identify people who were better than me in whatever area I could think of. Except writing queries (this was before I discovered technical communities, and people like Paul White and Dave Ballantyne). And so I figured I’d specialise. I wish I’d done it years earlier. Now, I can tell you plenty of people who are better than me at any area you can pick. But there are also more people who might consider listing me in some of their lists too. If I’d stayed the GP, I’d be stuck in management, and finding that there were better managers than me too. If you’re reading this, SQL could well be your thing. But it might not be either. Your thing might not even be in IT. Find out, and then see if you can be a world-beater at it. But it gets even better, because you can find other people to complement the things that you’re not so good at. My company, LobsterPot Solutions, has six people in it at the moment. I’ve hand-picked those six people, along with the one who quit. The great thing about it is that I’ve been able to pick people who don’t necessarily specialise in the same way as me. I don’t write their T-SQL for them – generally they’re good enough at that themselves. But I’m on-hand if needed. Consider Roger Noble, for example. He’s doing stuff in HTML5 and jQuery that I could never dream of doing to create an amazing HTML5 version of PivotViewer. Or Ashley Sewell, a guy who does project management far better than I do. I could go on. My team is brilliant, and I love them to bits. We’re all surgeons, and when we work together, I like to think we’re pretty good! @rob_farley

    Read the article

  • Be the surgeon

    - by Rob Farley
    It’s a phrase I use often, especially when teaching, and I wish I had realised the concept years earlier. (And of course, fits with this month’s T-SQL Tuesday topic, hosted by Argenis Fernandez) When I’m sick enough to go to the doctor, I see a GP. I used to typically see the same guy, but he’s moved on now. However, when he has been able to roughly identify the area of the problem, I get referred to a specialist, sometimes a surgeon. Being a surgeon requires a refined set of skills. It’s why they often don’t like to be called “Doctor”, and prefer the traditional “Mister” (the history is that the doctor used to make the diagnosis, and then hand the patient over to the person who didn’t have a doctorate, but rather was an expert cutter, typically from a background in butchering). But if you ask the surgeon about the pain you have in your leg sometimes, you’ll get told to ask your GP. It’s not that your surgeon isn’t interested – they just don’t know the answer. IT is the same now. That wasn’t something that I really understood when I got out of university. I knew there was a lot to know about IT – I’d just done an honours degree in it. But I also knew that I’d done well in just about all my subjects, and felt like I had a handle on everything. I got into developing, and still felt that having a good level of understanding about every aspect of IT was a good thing. This got me through for the first six or seven years of my career. But then I started to realise that I couldn’t compete. I’d moved into management, and was spending my days running projects, rather than writing code. The kids were getting older. I’d had a bad back injury (ask anyone with chronic pain how it affects  your ability to concentrate, retain information, etc). But most of all, IT was getting larger. I knew kids without lives who knew more than I did. And I felt like I could easily identify people who were better than me in whatever area I could think of. Except writing queries (this was before I discovered technical communities, and people like Paul White and Dave Ballantyne). And so I figured I’d specialise. I wish I’d done it years earlier. Now, I can tell you plenty of people who are better than me at any area you can pick. But there are also more people who might consider listing me in some of their lists too. If I’d stayed the GP, I’d be stuck in management, and finding that there were better managers than me too. If you’re reading this, SQL could well be your thing. But it might not be either. Your thing might not even be in IT. Find out, and then see if you can be a world-beater at it. But it gets even better, because you can find other people to complement the things that you’re not so good at. My company, LobsterPot Solutions, has six people in it at the moment. I’ve hand-picked those six people, along with the one who quit. The great thing about it is that I’ve been able to pick people who don’t necessarily specialise in the same way as me. I don’t write their T-SQL for them – generally they’re good enough at that themselves. But I’m on-hand if needed. Consider Roger Noble, for example. He’s doing stuff in HTML5 and jQuery that I could never dream of doing to create an amazing HTML5 version of PivotViewer. Or Ashley Sewell, a guy who does project management far better than I do. I could go on. My team is brilliant, and I love them to bits. We’re all surgeons, and when we work together, I like to think we’re pretty good! @rob_farley

    Read the article

  • SQL Spatial: Getting “nearest” calculations working properly

    - by Rob Farley
    If you’ve ever done spatial work with SQL Server, I hope you’ve come across the ‘nearest’ problem. You have five thousand stores around the world, and you want to identify the one that’s closest to a particular place. Maybe you want the store closest to the LobsterPot office in Adelaide, at -34.925806, 138.605073. Or our new US office, at 42.524929, -87.858244. Or maybe both! You know how to do this. You don’t want to use an aggregate MIN or MAX, because you want the whole row, telling you which store it is. You want to use TOP, and if you want to find the closest store for multiple locations, you use APPLY. Let’s do this (but I’m going to use addresses in AdventureWorks2012, as I don’t have a list of stores). Oh, and before I do, let’s make sure we have a spatial index in place. I’m going to use the default options. CREATE SPATIAL INDEX spin_Address ON Person.Address(SpatialLocation); And my actual query: WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country FROM MyLocations AS l CROSS APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a JOIN Person.StateProvince AS s     ON s.StateProvinceID = a.StateProvinceID JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; Great! This is definitely working. I know both those City locations, even if the AddressLine1s don’t quite ring a bell. I’m sure I’ll be able to find them next time I’m in the area. But of course what I’m concerned about from a querying perspective is what’s happened behind the scenes – the execution plan. This isn’t pretty. It’s not using my index. It’s sucking every row out of the Address table TWICE (which sucks), and then it’s sorting them by the distance to find the smallest one. It’s not pretty, and it takes a while. Mind you, I do like the fact that it saw an indexed view it could use for the State and Country details – that’s pretty neat. But yeah – users of my nifty website aren’t going to like how long that query takes. The frustrating thing is that I know that I can use the index to find locations that are within a particular distance of my locations quite easily, and Microsoft recommends this for solving the ‘nearest’ problem, as described at http://msdn.microsoft.com/en-au/library/ff929109.aspx. Now, in the first example on this page, it says that the query there will use the spatial index. But when I run it on my machine, it does nothing of the sort. I’m not particularly impressed. But what we see here is that parallelism has kicked in. In my scenario, it’s split the data up into 4 threads, but it’s still slow, and not using my index. It’s disappointing. But I can persuade it with hints! If I tell it to FORCESEEK, or use my index, or even turn off the parallelism with MAXDOP 1, then I get the index being used, and it’s a thing of beauty! Part of the plan is here: It’s massive, and it’s ugly, and it uses a TVF… but it’s quick. The way it works is to hook into the GeodeticTessellation function, which is essentially finds where the point is, and works out through the spatial index cells that surround it. This then provides a framework to be able to see into the spatial index for the items we want. You can read more about it at http://msdn.microsoft.com/en-us/library/bb895265.aspx#tessellation – including a bunch of pretty diagrams. One of those times when we have a much more complex-looking plan, but just because of the good that’s going on. This tessellation stuff was introduced in SQL Server 2012. But my query isn’t using it. When I try to use the FORCESEEK hint on the Person.Address table, I get the friendly error: Msg 8622, Level 16, State 1, Line 1 Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN. And I’m almost tempted to just give up and move back to the old method of checking increasingly large circles around my location. After all, I can even leverage multiple OUTER APPLY clauses just like I did in my recent Lookup post. WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT     l.Name,     COALESCE(a1.AddressLine1,a2.AddressLine1,a3.AddressLine1),     COALESCE(a1.City,a2.City,a3.City),     s.Name AS [State],     c.Name AS Country FROM MyLocations AS l OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 1000     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a1 OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 5000     AND a1.AddressID IS NULL     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a2 OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 20000     AND a2.AddressID IS NULL     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a3 JOIN Person.StateProvince AS s     ON s.StateProvinceID = COALESCE(a1.StateProvinceID,a2.StateProvinceID,a3.StateProvinceID) JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; But this isn’t friendly-looking at all, and I’d use the method recommended by Isaac Kunen, who uses a table of numbers for the expanding circles. It feels old-school though, when I’m dealing with SQL 2012 (and later) versions. So why isn’t my query doing what it’s supposed to? Remember the query... WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country FROM MyLocations AS l CROSS APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a JOIN Person.StateProvince AS s     ON s.StateProvinceID = a.StateProvinceID JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; Well, I just wasn’t reading http://msdn.microsoft.com/en-us/library/ff929109.aspx properly. The following requirements must be met for a Nearest Neighbor query to use a spatial index: A spatial index must be present on one of the spatial columns and the STDistance() method must use that column in the WHERE and ORDER BY clauses. The TOP clause cannot contain a PERCENT statement. The WHERE clause must contain a STDistance() method. If there are multiple predicates in the WHERE clause then the predicate containing STDistance() method must be connected by an AND conjunction to the other predicates. The STDistance() method cannot be in an optional part of the WHERE clause. The first expression in the ORDER BY clause must use the STDistance() method. Sort order for the first STDistance() expression in the ORDER BY clause must be ASC. All the rows for which STDistance returns NULL must be filtered out. Let’s start from the top. 1. Needs a spatial index on one of the columns that’s in the STDistance call. Yup, got the index. 2. No ‘PERCENT’. Yeah, I don’t have that. 3. The WHERE clause needs to use STDistance(). Ok, but I’m not filtering, so that should be fine. 4. Yeah, I don’t have multiple predicates. 5. The first expression in the ORDER BY is my distance, that’s fine. 6. Sort order is ASC, because otherwise we’d be starting with the ones that are furthest away, and that’s tricky. 7. All the rows for which STDistance returns NULL must be filtered out. But I don’t have any NULL values, so that shouldn’t affect me either. ...but something’s wrong. I do actually need to satisfy #3. And I do need to make sure #7 is being handled properly, because there are some situations (eg, differing SRIDs) where STDistance can return NULL. It says so at http://msdn.microsoft.com/en-us/library/bb933808.aspx – “STDistance() always returns null if the spatial reference IDs (SRIDs) of the geography instances do not match.” So if I simply make sure that I’m filtering out the rows that return NULL… …then it’s blindingly fast, I get the right results, and I’ve got the complex-but-brilliant plan that I wanted. It just wasn’t overly intuitive, despite being documented. @rob_farley

    Read the article

  • Nepotism In The SQL Family

    - by Rob Farley
    There’s a bunch of sayings about nepotism. It’s unpopular, unless you’re the family member who is getting the opportunity. But of course, so much in life (and career) is about who you know. From the perspective of the person who doesn’t get promoted (when the family member is), nepotism is simply unfair; even more so when the promoted one seems less than qualified, or incompetent in some way. We definitely get a bit miffed about that. But let’s also look at it from the other side of the fence – the person who did the promoting. To them, their son/daughter/nephew/whoever is just another candidate, but one in whom they have more faith. They’ve spent longer getting to know that person. They know their weaknesses and their strengths, and have seen them in all kinds of situations. They expect them to stay around in the company longer. And yes, they may have plans for that person to inherit one day. Sure, they have a vested interest, because they’d like their family members to have strong careers, but it’s not just about that – it’s often best for the company as well. I’m not announcing that the next LobsterPot employee is one of my sons (although I wouldn’t be opposed to the idea of getting them involved), but actually, admitting that almost all the LobsterPot employees are SQLFamily members… …which makes this post good for T-SQL Tuesday, this month hosted by Jeffrey Verheul (@DevJef). You see, SQLFamily is the concept that the people in the SQL Server community are close. We have something in common that goes beyond ordinary friendship. We might only see each other a few times a year, at events like the PASS Summit and SQLSaturdays, but the bonds that are formed are strong, going far beyond typical professional relationships. And these are the people that I am prepared to hire. People that I have got to know. I get to know their skill level, how well they explain things, how confident people are in their expertise, and what their values are. Of course there people that I wouldn’t hire, but I’m a lot more comfortable hiring someone that I’ve already developed a feel for. I need to trust the LobsterPot brand to people, and that means they need to have a similar value system to me. They need to have a passion for helping people and doing what they can to make a difference. Above all, they need to have integrity. Therefore, I believe in nepotism. All the people I’ve hired so far are people from the SQL community. I don’t know whether I’ll always be able to hire that way, but I have no qualms admitting that the things I look for in an employee are things that I can recognise best in those that are referred to as SQLFamily. …like Ted Krueger (@onpnt), LobsterPot’s newest employee and the guy who is representing our brand in America. I’m completely proud of this guy. He’s everything I want in an employee. He’s an experienced consultant (even wrote a book on it!), loving husband and father, genuine expert, and incredibly respected by his peers. It’s not favouritism, it’s just choosing someone I’ve been interviewing for years. @rob_farley

    Read the article

  • Visually stunning maps and PivotViewer

    - by Rob Farley
    One of the things about PivotViewer is that it runs in the Silverlight platform and can be extended recently. One of my guys at LobsterPot, Roger Noble, has used this to incorporate a Bing Maps layer, showing items which have  Latitude and Longitude values there. We’re already talking to a hospital about using this to allow them to browse their patient data, including showing the patients on a map according to which bed they’re in. Interesting times – this will involve having custom tiles instead of the ones from Bing Maps, but the idea is similar. Of course, we’ll be using Bing Maps to show where the patients live. I should also mention that this is a work-in-progress still. Figuring out how to use PivotViewer isn’t trivial, and we’ve done quite a lot of experimenting to see how to get things working. If you find bugs, please feel free to let me know (rob_farley at hotmail will usually reach me), and we’ll add them to our list. Here are some screenshots that I made recently using the collection at http://pivot.lobsterpot.com.au/flickr – by selecting a tag, you can get a new bunch of images. A couple of images that were taken in Iceland. Some from St Mary’s Lighthouse near Newcastle, UK. And some from around Big Ben in London. I’d recommend using either Firefox or Internet Explorer if you choose to browse this yourself. It seems the Chrome browser support for Silverlight doesn’t quite handle things as nicely as we’d all like. I imagine that at some point, we may enhance the Flickr collection, to be able to search on more than tags, but as a sample collection, it seems to work quite well.

    Read the article

  • What makes a great place to work

    - by Rob Farley
    Co-incidentally, I’ve been looking for office space for LobsterPot Solutions during the same few days that Luke Hayler ( @lukehayler ) has asked for my thoughts (okay, he ‘tagged’ me) on what makes a great place to work . He lists People and Environment, and I’m inclined to agree, but with a couple of other things too. I have three children. Two of them (both boys) are in school, but my daughter is only two. For the boys’ schools, we quickly realised that what they need most is a feeling of safety...(read more)

    Read the article

  • What makes a great place to work

    - by Rob Farley
    Co-incidentally, I’ve been looking for office space for LobsterPot Solutions during the same few days that Luke Hayler ( @lukehayler ) has asked for my thoughts (okay, he ‘tagged’ me) on what makes a great place to work . He lists People and Environment, and I’m inclined to agree, but with a couple of other things too. I have three children. Two of them (both boys) are in school, but my daughter is only two. For the boys’ schools, we quickly realised that what they need most is a feeling of safety...(read more)

    Read the article

  • 2011 PASS Board Applicants: Rob Farley

    - by andyleonard
    Introduction I am interviewing 2011 PASS Board Nominee Applicants. As listed on the PASS Board Elections site the applicants are: Rob Farley Geoff Hiten Adam Jorgensen Denise McInerney Sri Sridharan Kendal Van Dyke I'm asking everyone the same questions and blogging the responses in the order received. Rob Farley is first up: Interview With Rob Farley 1. What's your day job? I run LobsterPot Solutions out of Adelaide, Australia. We're a SQL & BI consultancy, and were the first Microsoft Partner...(read more)

    Read the article

  • My Lightning Talk in MP3 format

    - by Rob Farley
    Download it now via http://bit.ly/RFCollation  Lots of people tell me they wish they’d heard my Lightning Talk from the PASS Summit. This was the one that was five minutes, in which I explained Collation using examples comparing US English, UK English and Australian English. At the end, I showed my Arsenal thongs. You can see a picture of them below. There was a visual joke involving the name Arsenal too... After the recordings became available, I asked the PASS legal people, and they said I could do what I liked with my own five-minute set, so long as I didn’t sell it. So I made an MP3. I’ve uploaded it to the LobsterPot Solutions web server, and provided an easy link via http://bit.ly/RFCollation. It’s a link straight to the MP3, and you’re welcome to download it, put it on your iPod, whatever you like. And also feel free to write comments here, to let me know what you think.

    Read the article

  • Summit Old, Summit New, Summit Borrowed...

    - by Rob Farley
    PASS Summit is coming up, and I thought I’d post a few things. Summit Old... At the PASS Summit, you will get the chance to hear presentations by the SQL Server establishment. Just about every big name in the SQL Server world is a regular at the PASS Summit, so you will get to hear and meet people like Kalen Delaney (@sqlqueen) (who just recently got awarded MVP status for the 20th year running), and from all around the world such as the UK’s Chris Webb (@technitrain) or Pinal Dave (@pinaldave) from India. Almost all the household names in SQL Server will be there, including a large contingent from Microsoft. The PASS Summit is by far the best place to meet the legends of SQL Server. And they’re not all old. Some are, but most of them are younger than you might think. ...Summit New... The hottest topics are often about the newest technologies (such as SQL Server 2012). But you will almost certainly learn new stuff about older versions too. But that’s not what I wanted to pick on for this point. There are many new speakers at every PASS Summit, and content that has not been covered in other places. This year, for example, LobsterPot’s Roger Noble (@roger_noble) is giving a presentation for the first time. He’s a regular around the Australian circuit, but this is his first time presenting to a US audience. New Zealand’s Paul White (@sql_kiwi) is attending his first PASS Summit, and will be giving over four hours of incredibly deep stuff that has never been presented anywhere in the US before (I can’t say the world, because he did present similar material in Adelaide earlier in the year). ...Summit Borrowed... No, I’m not talking about plagiarism – the talks you’ll hear are all their own work. But you will get a lot of stuff you’ll be able to take back and apply at work. The PASS Summit sessions are not full of sales-pitches, telling you about how great things could be if only you’d buy some third-party vendor product. It’s simply not that kind of conference, and PASS doesn’t allow that kind of talk to take place. Instead, you’ll be taught techniques, and be able to download scripts and slides to let you perform that magic back at work when you get home. You will definitely find plenty of ideas to borrow at the PASS Summit. ...Summit Blue Yeah – and there’s karaoke. Blue - Jason - SQL Karaoke - YouTube

    Read the article

  • When is your interview?

    - by Rob Farley
    Sometimes it’s tough to evaluate someone – to figure out if you think they’d be worth hiring. These days, since starting LobsterPot Solutions, I have my share of interviews, on both sides of the desk. Sometimes I’m checking out potential staff members; sometimes I’m persuading someone else to get us on board for a project. Regardless of who is on which side of the desk, we’re both checking each other out. The world is not how it was some years ago. I’m pretty sure that every time I walk into a room for an interview, I’ve searched for them online, and they’ve searched for me. I suspect they usually have the easier time finding me, although there are obviously other Rob Farleys in the world. They may have even checked out some of my presentations from conferences, read my blog posts, maybe even heard me tell jokes or sing. I know some people need me to explain who I am, but for the most part, I think they’ve done plenty of research long before I’ve walked in the room. I remember when this was different (as it could be for you still). I remember a time when I dealt with recruitment agents, looking for work. I remember sitting in rooms having been giving a test designed to find out if I knew my stuff or not, and then being pulled into interviews with managers who had to find out if I could communicate effectively. I’d need to explain who I was, what kind of person I was, what my value-system involved, and so on. I’m sure you understand what I’m getting at. (Oh, and in case you hadn’t realised, it’s a T-SQL Tuesday post, this month about interviews.) At TechEd Australia some years ago (either 2009 or 2010 – I forget which), I remember hearing a comment made during the ‘locknote’, the closing session. The presenter described a conversation he’d heard between two girls, discussing a guy that one of them had just started dating. The other girl expressed horror at the fact that her friend had met this guy in person, rather than through an online dating agency. The presenter pointed out that people realise that there’s a certain level of safety provided through the checks that those sites do. I’m not sure I completely trust this, but I’m sure it’s true for people’s technical profiles. If I interview someone, I hope they have a profile. I hope I can look at what they already know. I hope I can get samples of their work, and see how they communicate. I hope I can get a feel for their sense of humour. I hope I already know exactly what kind of person they are – their value system, their beliefs, their passions. Even their grammar. I can work out if the person is a good risk or not from who they are online. If they don’t have an online presence, then I don’t have this information, and the risk is higher. So if you’re interviewing with me, your interview started long before the conversation. I hope it started before I’d ever heard of you. I know the interview in which I’m being assessed started before I even knew there was a product called SQL Server. It’s reflected in what I write. It’s in the way I present. I have spent my life becoming me – so let’s talk! @rob_farley

    Read the article

  • Cloud – the forecast is improving

    - by Rob Farley
    There is a lot of discussion about “the cloud”, and how that affects people’s data stories. Today the discussion enters the realm of T-SQL Tuesday, hosted this month by Jorge Segarra. Over the years, companies have invested a lot in making sure that their data is good, and I mean every aspect of it – the quality of it, the security of it, the performance of it, and more. Experts such as those of us at LobsterPot Solutions have helped these companies with this, and continue to work with clients to make sure that data is a strong part of their business, not an oversight. Whether business intelligence systems are being utilised or not, every business needs to be able to rely on its data, and have the confidence in it. Data should be a foundation upon which a business is built. In the past, data had been stored in paper-based systems. Filing cabinets stored vital information. Today, people have server rooms with storage of various kinds, recognising that filing cabinets don’t necessarily scale particularly well. It’s easy to ‘lose’ data in a filing cabinet, when you have people who need to make sure that the sheets of paper are in the right spot, and that you know how things are stored. Databases help solve that problem, but still the idea of a large filing cabinet continues, it just doesn’t involve paper. If something happens to the physical ‘filing cabinet’, then the problems are larger still. Then the data itself is under threat. Many clients have generators in case the power goes out, redundant cables in case the connectivity dies, and spare servers in other buildings just in case they’re required. But still they’re maintaining filing cabinets. You see, people like filing cabinets. There’s something to be said for having your data ‘close’. Even if the data is not in readable form, living as bits on a disk somewhere, the idea that its home is ‘in the building’ is comforting to many people. They simply don’t want to move their data anywhere else. The cloud offers an alternative to this, and the human element is an obstacle. By leveraging the cloud, companies can have someone else look after their filing cabinet. A lot of people really don’t like the idea of this, partly because the administrators of the data, those people who could potentially log in with escalated rights and see more than they should be allowed to, who need to be trusted to respond if there’s a problem, are now a faceless entity in the cloud. But this doesn’t mean that the cloud is bad – this is simply a concern that some people may have. In new functionality that’s on its way, we see other hybrid mechanisms that mean that people can leverage parts of the cloud with less fear. Companies can use cloud storage to hold their backup data, for example, backups that have been encrypted and are therefore not able to be read by anyone (including administrators) who don’t have the right password. Companies can have a database instance that runs locally, but which has its data files in the cloud, complete with Transparent Data Encryption if needed. There can be a higher level of control, making the change easier to accept. Hybrid options allow people who have had fears (potentially very justifiable) to take a new look at the cloud, and to start embracing some of the benefits of the cloud (such as letting someone else take care of storage, high availability, and more) without losing the feeling of the data being close. @rob_farley

    Read the article

  • Summit reflections

    - by Rob Farley
    So far, my three PASS Summit experiences have been notably different to each other. My first, I wasn’t on the board and I gave two regular sessions and a Lightning Talk in which I told jokes. My second, I was a board advisor, and I delivered a precon, a spotlight and a Lightning Talk in which I sang. My third (last week), I was a full board director, and I didn’t present at all. Let’s not talk about next year. I’m not sure there are many options left. This year, I noticed that a lot more people recognised me and said hello. I guess that’s potentially because of the singing last year, but could also be because board elections can bring a fair bit of attention, and because of the effort I’ve put in through things like 24HOP... Yeah, ok. It’d be the singing. My approach was very different though. I was watching things through different eyes. I looked for the things that seemed to be working and the things that didn’t. I had staff there again, and was curious to know how their things were working out. I knew a lot more about what was going on behind the scenes to make various things happen, and although very little about the Summit was actually my responsibility (based on not having that portfolio), my perspective had moved considerably. Before the Summit started, Board Members had been given notebooks – an idea Tom (who heads up PASS’ marketing) had come up with after being inspired by seeing Bill walk around with a notebook. The plan was to take notes about feedback we got from people. It was a good thing, and the notebook forms a nice pair with the SQLBits one I got a couple of years ago when I last spoke there. I think one of the biggest impacts of this was that during the first keynote, Bill told everyone present about the notebooks. This set a tone of “we’re listening”, and a number of people were definitely keen to tell us things that would cause us to pull out our notebooks. PASSTV was a new thing this year. Justin, the host, featured on the couch and talked a lot of people about a lot of things, including me (he talked to me about a lot of things, I don’t think he talked to a lot people about me). Reaching people through online methods is something which interests me a lot – it has huge potential, and I love the idea of being able to broadcast to people who are unable to attend in person. I’m keen to see how this medium can be developed over time. People who know me will know that I’m a keen advocate of certification – I've been SQL certified since version 6.5, and have even been involved in creating exams. However, I don’t believe in studying for exams. I think training is worthwhile for learning new skills, but the goal should be on learning those skills, not on passing an exam. Exams should be for proving that the skills are there, not a goal in themselves. The PASS Summit is an excellent place to take exams though, and with an attitude of professional development throughout the event, why not? So I did. I wasn’t expecting to take one, but I was persuaded and took the MCM Knowledge Exam. I hadn’t even looked at the syllabus, but tried it anyway. I was very tired, and even fell asleep at one point during it. I’ll find out my result at some point in the future – the Prometric site just says “Tested” at the moment. As I said, it wasn’t something I was expecting to do, but it was good to have something unexpected during the week. Of course it was good to catch up with old friends and make new ones. I feel like every time I’m in the US I see things develop a bit more, with more and more people knowing who I am, who my staff are, and recognising the LobsterPot brand. I missed being a presenter, but I definitely enjoyed seeing many friends on the list of presenters. I won’t try to list them, because there are so many these days that people might feel sad if I don’t mention them. For those that I managed to see, I was pleased to see that the majority of them have lifted their presentation skills since I last saw them, and I happily told them as much. One person who I will mention was Paul White, who travelled from New Zealand to his first PASS Summit. He gave two sessions (a regular session and a half-day), packed large rooms of people, and had everyone buzzing with enthusiasm. I spoke to him after the event, and he told me that his expectations were blown away. Paul isn’t normally a fan of crowds, and the thought of 4000 people would have been scary. But he told me he had no idea that people would welcome him so well, be so friendly and so down to earth. He’s seen the significance of the SQL Server community, and says he’ll be back. It’ll be good to see him there. Will you be there too?

    Read the article

  • Summit reflections

    - by Rob Farley
    So far, my three PASS Summit experiences have been notably different to each other. My first, I wasn’t on the board and I gave two regular sessions and a Lightning Talk in which I told jokes. My second, I was a board advisor, and I delivered a precon, a spotlight and a Lightning Talk in which I sang. My third (last week), I was a full board director, and I didn’t present at all. Let’s not talk about next year. I’m not sure there are many options left. This year, I noticed that a lot more people recognised me and said hello. I guess that’s potentially because of the singing last year, but could also be because board elections can bring a fair bit of attention, and because of the effort I’ve put in through things like 24HOP... Yeah, ok. It’d be the singing. My approach was very different though. I was watching things through different eyes. I looked for the things that seemed to be working and the things that didn’t. I had staff there again, and was curious to know how their things were working out. I knew a lot more about what was going on behind the scenes to make various things happen, and although very little about the Summit was actually my responsibility (based on not having that portfolio), my perspective had moved considerably. Before the Summit started, Board Members had been given notebooks – an idea Tom (who heads up PASS’ marketing) had come up with after being inspired by seeing Bill walk around with a notebook. The plan was to take notes about feedback we got from people. It was a good thing, and the notebook forms a nice pair with the SQLBits one I got a couple of years ago when I last spoke there. I think one of the biggest impacts of this was that during the first keynote, Bill told everyone present about the notebooks. This set a tone of “we’re listening”, and a number of people were definitely keen to tell us things that would cause us to pull out our notebooks. PASSTV was a new thing this year. Justin, the host, featured on the couch and talked a lot of people about a lot of things, including me (he talked to me about a lot of things, I don’t think he talked to a lot people about me). Reaching people through online methods is something which interests me a lot – it has huge potential, and I love the idea of being able to broadcast to people who are unable to attend in person. I’m keen to see how this medium can be developed over time. People who know me will know that I’m a keen advocate of certification – I've been SQL certified since version 6.5, and have even been involved in creating exams. However, I don’t believe in studying for exams. I think training is worthwhile for learning new skills, but the goal should be on learning those skills, not on passing an exam. Exams should be for proving that the skills are there, not a goal in themselves. The PASS Summit is an excellent place to take exams though, and with an attitude of professional development throughout the event, why not? So I did. I wasn’t expecting to take one, but I was persuaded and took the MCM Knowledge Exam. I hadn’t even looked at the syllabus, but tried it anyway. I was very tired, and even fell asleep at one point during it. I’ll find out my result at some point in the future – the Prometric site just says “Tested” at the moment. As I said, it wasn’t something I was expecting to do, but it was good to have something unexpected during the week. Of course it was good to catch up with old friends and make new ones. I feel like every time I’m in the US I see things develop a bit more, with more and more people knowing who I am, who my staff are, and recognising the LobsterPot brand. I missed being a presenter, but I definitely enjoyed seeing many friends on the list of presenters. I won’t try to list them, because there are so many these days that people might feel sad if I don’t mention them. For those that I managed to see, I was pleased to see that the majority of them have lifted their presentation skills since I last saw them, and I happily told them as much. One person who I will mention was Paul White, who travelled from New Zealand to his first PASS Summit. He gave two sessions (a regular session and a half-day), packed large rooms of people, and had everyone buzzing with enthusiasm. I spoke to him after the event, and he told me that his expectations were blown away. Paul isn’t normally a fan of crowds, and the thought of 4000 people would have been scary. But he told me he had no idea that people would welcome him so well, be so friendly and so down to earth. He’s seen the significance of the SQL Server community, and says he’ll be back. It’ll be good to see him there. Will you be there too?

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • Merge replication stopping without errors in SQL 2008 R2

    - by Rob Farley
    A non-SQL MVP friend of mine, who also happens to be a client, asked me for some help again last week. I was planning on writing this up even before Rob Volk (@sql_r) listed his T-SQL Tuesday topic for this month. Earlier in the year, I (well, LobsterPot Solutions, although I’d been the person mostly involved) had helped out with a merge replication problem. The Merge Agent on the subscriber was just stopping every time, shortly after it started. With no errors anywhere – not in the Windows Event Log, the SQL Agent logs, not anywhere. We’d managed to get the system working again, but didn’t have a good reason about what had happened, and last week, the problem occurred again. I asked him about writing up the experience in a blog post, largely because of the red herrings that we encountered. It was an interesting experience for me, also because I didn’t end up touching my computer the whole time – just tapping on my phone via Twitter and Live Msgr. You see, the thing with replication is that a useful troubleshooting option is to reinitialise the thing. We’d done that last time, and it had started to work again – eventually. I say eventually, because the link being used between the sites is relatively slow, and it took a long while for the initialisation to finish. Meanwhile, we’d been doing some investigation into what the problem could be, and were suitably pleased when the problem disappeared. So I got a message saying that a replication problem had occurred again. Reinitialising wasn’t going to be an option this time either. In this scenario, the subscriber having the problem happened to be in a different domain to the publisher. The other subscribers (within the domain) were fine, just this one in a different domain had the problem. Part of the problem seemed to be a log file that wasn’t being backed up properly. They’d been trying to back up to a backup device that had a corruption, and the log file was growing. Turned out, this wasn’t related to the problem, but of course, any time you’re troubleshooting and you see something untoward, you wonder. Having got past that problem, my next thought was that perhaps there was a problem with the account being used. But the other subscribers were using the same account, without any problems. The client pointed out that that it was almost exactly six months since the last failure (later shown to be a complete red herring). It sounded like something might’ve expired. Checking through certificates and trusts showed no sign of anything, and besides, there wasn’t a problem running a command-prompt window using the account in question, from the subscriber box. ...except that when he ran the sqlcmd –E –S servername command I recommended, it failed with a Named Pipes error. I’ve seen problems with firewalls rejecting connections via Named Pipes but letting TCP/IP through, so I got him to look into SQL Configuration Manager to see what kind of connection was being preferred... Everything seemed fine. And strangely, he could connect via Management Studio. Turned out, he had a typo in the servername of the sqlcmd command. That particular red herring must’ve been reflected in his cheeks as he told me. During the time, I also pinged a friend of mine to find out who I should ask, and Ted Kruger (@onpnt) ‘s name came up. Ted (and thanks again, Ted – really) reconfirmed some of my thoughts around the idea of an account expiring, and also suggesting bumping up the logging to level 4 (2 is Verbose, 4 is undocumented ridiculousness). I’d just told the client to push the logging up to level 2, but the log file wasn’t appearing. Checking permissions showed that the user did have permission on the folder, but still no file was appearing. Then it was noticed that the user had been switched earlier as part of the troubleshooting, and switching it back to the real user caused the log file to appear. Still no errors. A lot more information being pushed out, but still no errors. Ted suggested making sure the FQDNs were okay from both ends, in case the servers were unable to talk to each other. DNS problems can lead to hassles which can stop replication from working. No luck there either – it was all working fine. Another server started to report a problem as well. These two boxes were both SQL 2008 R2 (SP1), while the others, still working, were SQL 2005. Around this time, the client tried an idea that I’d shown him a few years ago – using a Profiler trace to see what was being called on the servers. It turned out that the last call being made on the publisher was sp_MSenumschemachange. A quick interwebs search on that showed a problem that exists in SQL Server 2008 R2, when stored procedures have more than 4000 characters. Running that stored procedure (with the same parameters) manually on SQL 2005 listed three stored procedures, the first of which did indeed have more than 4000 characters. Still no error though, and the problem as listed at http://support.microsoft.com/kb/2539378 describes an error that should occur in the Event log. However, this problem is the type of thing that is fixed by a reinitialisation (because it doesn’t need to send the procedure change across as a transaction). And a look in the change history of the long stored procs (you all keep them, right?), showed that the problem from six months earlier could well have been down to this too. Applying SP2 (with sufficient paranoia about backups and how to get back out again if necessary) fixed the problem. The stored proc changes went through immediately after the service pack was applied, and it’s been running happily since. The funny thing is that I didn’t solve the problem. He had put the Profiler trace on the server, and had done the search that found a forum post pointing at this particular problem. I’d asked Ted too, and although he’d given some useful information, nothing that he’d come up with had actually been the solution either. Sometimes, asking for help is the most useful thing you can do. Often though, you don’t end up getting the help from the person you asked – the sounding board is actually what you need. @rob_farley

    Read the article

  • PASS Summit Feedback

    - by Rob Farley
    PASS Feedback came in last week. I also saw my dentist for some fillings... At the PASS Summit this year, I delivered a couple of regular sessions and a Lightning Talk. People told me they enjoyed it, but when the rankings came out, they showed that I didn’t score particularly well. Brent Ozar was keen to discuss it with me. Brent: PASS speaker feedback is out. You did two sessions and a Lightning Talk. How did you go? Rob: Not so well actually, thanks for asking. Brent: Ha! Sorry. Of course you know that's why I wanted to discuss this with you. I was in one of your sessions at SQLBits in the UK a month before PASS, and I thought you rocked. You've got a really good and distinctive delivery style.  Then I noticed your talks were ranked in the bottom quarter of the Summit ratings and wanted to discuss it. Rob: Yeah, I know. You did ask me if we could do this...  I should explain – my presentation style is not the stereotypical IT conference one. I throw in jokes, and try to engage the audience thoroughly. I find many talks amazingly dry, and I guess I try to buck that trend. I also run training courses, and find that I get a lot of feedback from people thanking me for keeping things interesting. That said, I also get feedback criticising me for my style, and that’s basically what’s happened here. For the rest of this discussion, let’s focus on my talk about the Incredible Shrinking Execution Plan, which I considered to be my main talk. Brent: I thought that session title was the very best one at the entire Summit, and I had it on my recommended sessions list.  In four words, you managed to sum up the topic and your sense of humor.  I read that and immediately thought, "People need to be in this session," and then it didn't score well.  Tell me about your scores. Rob: The questions on the feedback form covered the usefulness of the information, the speaker’s presentation skills, their knowledge of the subject, how well the session was described, the amount of time allocated, and the quality of the presentation materials. Brent: Presentation materials? But you don’t do slides.  Did they rate your thong? Rob: No-one saw my flip-flops in this talk, Brent. I created a script in Management Studio, and published that afterwards, but I think people will have scored that question based on the lack of slides. I wasn’t expecting to do particularly well on that one. That was the only section that didn’t have 5/5 as the most popular score. Brent: See, that sucks, because cookbook-style scripts are often some of my favorites.  Adam Machanic's Service Broker workbench series helped me immensely when I was prepping for the MCM.  As an attendee, I'd rather have a commented script than a slide deck.  So how did you rank so low? Rob: When I look at the scores that you got (based on your blog post), you got very few scores below 3 – people that felt strong enough about your talk to post a negative score. In my scores, between 5% and 10% were below 3 (except on the question about whether I knew my stuff – I guess I came as knowledgeable). Brent: Wow – so quite a few people really didn’t like your talk then? Rob: Yeah. Mind you, based on the comments, some people really loved it. I’d like to think that there would be a certain portion of the room who may have rated the talk as one of the best of the conference. Some of my comments included “amazing!”, “Best presentation so far!”, “Wow, best session yet”, “fantastic” and “Outstanding!”. I think lots of talks can be “Great”, but not so many talks can be “Outstanding” without the word losing its meaning. One wrote “Pretty amazing presentation, considering it was completely extemporaneous.” Brent: Extemporaneous, eh? Rob: Yeah. I guess they don’t realise how much preparation goes into coming across as unprepared. In many ways it’s much easier to give a written speech than to deliver a presentation without slides as a prompt. Brent: That delivery style, the really relaxed, casual, college-professor approach was one of the things I really liked about your presentation at SQLbits.  As somebody who presents a lot, I "get" it - I know how hard it is to come off as relaxed and comfortable with your own material.  It's like improv done by jazz players and comedians - if you've never tried it, you don't realize how hard it is.  People also don't realize how hard it is to make a tough subject fun. Rob: Yeah well... There will be people writing comments on this post that say I wasn't trying to make the subject fun, and that I was making it all about me. Sometimes the style works, sometimes it doesn't. Most of the comments mentioned the fact that I tell jokes, some in a nice way, but some not so much (and it wasn't just a PASS thing - that's the mix of feedback I generally get). One comment at PASS was: “great stand up comedian - not what I'm looking for at pass”, and there were certainly a few that said “too many jokes”. I’m not trying to do stand-up – jokes are my way of engaging with the audience while I demonstrate some of the amazing things that the Query Optimizer can do if you write your queries the right way. Some people didn’t think it was technical enough, but I’ve also had some people tell me that the concepts I’m explaining are deep and profound. Brent: To me, that's a hallmark of a great explanation - when someone says, "But of course it has to work that way - how could it work any other way?  It seems so simple and logical."  Well, sure it does when it's explained correctly, but now pick up any number of thick SQL Server books and try to understand the Redundant Joins concept.  I guarantee it'll take more than 45 minutes. Rob: Some people in my audiences realise that, but definitely not everyone. There's only so much you can tell someone that something is profound. Generally it's something that they either have an epiphany on or not. I like to lull my audience into knowing what's going on, and do something that surprises them. Gain their trust, build a rapport, and then show them the deeper truth of what just happened. Brent: So you've learned your lesson about presentation scores, right?  From here on out, you're going to be dry, humorless, and all your presentations will consist of you reading bullet points off the screen. Rob: No Brent, I’m not. I'm also not going to suggest that most presentations at PASS are like that. No-one tries to present like that. There's a big space to occupy between what "dry and humourless" and me. My difference is to focus on the relationship I have with the crowd, rather than focussing on delivering the perfect session. I want to see people smiling and know they're relaxed. I think most presenters focus on the material, which is completely reasonable and safe. I remember once hearing someone talking about product creation. They talked about mediocrity. They said that one of the worst things that people can ever say about your product is that it’s “good”. What you want is for 10% of the world to love it enough to want to buy it. If 10% the world gave me a dollar, I’d have more money than I could ever use (assuming it wasn’t the SAME dollar they were giving me I guess). Brent: It's the Raving Fans theory.  It's better to have a small number of raving customers than a large number of almost-but-not-really customers who don't care that much about your product or service.  I know exactly how you feel - when I got survey feedback from my Quest video presentation when I was dressed up in a Richard Simmons costume, some of the attendees said I was unprofessional and distracting.  Some of the attendees couldn't get enough and Photoshopped all kinds of stuff into the screen captures.  On a whole, I probably didn't score that well, and I'm fine with that.  It sucks to look at the scores though - do those lower scores bother you? Rob: Of course they do. It hurts deeply. I open myself up and give presentations in a very personal way. All presenters do that, and we all feel the pain of negative feedback. I hate coming 146th & 162nd out of 185, but have to acknowledge that many sessions did worse still. Plus, once I feel the wounds have healed, I’ll be able to remember that there are people in the world that rave about my presentation style, and figure that people will hopefully talk about me. One day maybe those people that don’t like my presentation style will stay away and I might be able to score better. You don’t pay to hear country music if you prefer western... Lots of people find chili too spicy, but it’s still a popular food. Brent: But don’t you want to appeal to everyone? Rob: I do, but I don’t want to be lukewarm as in Revelation 3:16. I’d rather disgust and be discussed. Well, maybe not ‘disgust’, but I don’t want to conform. Conformity just isn’t the same any more. I’m not sure I’ve ever been one to do that. I try not to offend, but definitely like to be different. Brent: Count me among your raving fans, sir.  Where can we see you next? Rob: Considering I live in Adelaide in Australia, I’m not about to appear at anyone’s local SQL Saturday. I’m still trying to plan which events I’ll get to in 2011. I’ve submitted abstracts for TechEd North America, but won’t hold my breath. I’m also considering the SQLBits conferences in the UK in April, PASS in October, and I’m sure I’ll do some LiveMeeting presentations for user groups. Online, people download some of my recent SQLBits presentations at http://bit.ly/RFSarg and http://bit.ly/Simplification though. And they can download a 5-minute MP3 of my Lightning Talk at http://www.lobsterpot.com.au/files/Collation.mp3, in which I try to explain the idea behind collation, using thongs as an example. Brent: I was in the audience for http://bit.ly/RFSarg. That was a great presentation. Rob: Thanks, Brent. Now where’s my dollar?

    Read the article

  • When someone deletes a shared data source in SSRS

    - by Rob Farley
    SQL Server Reporting Services plays nicely. You can have things in the catalogue that get shared. You can have Reports that have Links, Datasets that can be used across different reports, and Data Sources that can be used in a variety of ways too. So if you find that someone has deleted a shared data source, you potentially have a bit of a horror story going on. And this works for this month’s T-SQL Tuesday theme, hosted by Nick Haslam, who wants to hear about horror stories. I don’t write about LobsterPot client horror stories, so I’m writing about a situation that a fellow MVP friend asked me about recently instead. The best thing to do is to grab a recent backup of the ReportServer database, restore it somewhere, and figure out what’s changed. But of course, this isn’t always possible. And it’s much nicer to help someone with this kind of thing, rather than to be trying to fix it yourself when you’ve just deleted the wrong data source. Unfortunately, it lets you delete data sources, without trying to scream that the data source is shared across over 400 reports in over 100 folders, as was the case for my friend’s colleague. So, suddenly there’s a big problem – lots of reports are failing, and the time to turn it around is small. You probably know which data source has been deleted, but getting the shared data source back isn’t the hard part (that’s just a connection string really). The nasty bit is all the re-mapping, to get those 400 reports working again. I know from exploring this kind of stuff in the past that the ReportServer database (using its default name) has a table called dbo.Catalog to represent the catalogue, and that Reports are stored here. However, the information about what data sources these deployed reports are configured to use is stored in a different table, dbo.DataSource. You could be forgiven for thinking that shared data sources would live in this table, but they don’t – they’re catalogue items just like the reports. Let’s have a look at the structure of these two tables (although if you’re reading this because you have a disaster, feel free to skim past). Frustratingly, there doesn’t seem to be a Books Online page for this information, sorry about that. I’m also not going to look at all the columns, just ones that I find interesting enough to mention, and that are related to the problem at hand. These fields are consistent all the way through to SQL Server 2012 – there doesn’t seem to have been any changes here for quite a while. dbo.Catalog The Primary Key is ItemID. It’s a uniqueidentifier. I’m not going to comment any more on that. A minor nice point about using GUIDs in unfamiliar databases is that you can more easily figure out what’s what. But foreign keys are for that too… Path, Name and ParentID tell you where in the folder structure the item lives. Path isn’t actually required – you could’ve done recursive queries to get there. But as that would be quite painful, I’m more than happy for the Path column to be there. Path contains the Name as well, incidentally. Type tells you what kind of item it is. Some examples are 1 for a folder and 2 a report. 4 is linked reports, 5 is a data source, 6 is a report model. I forget the others for now (but feel free to put a comment giving the full list if you know it). Content is an image field, remembering that image doesn’t necessarily store images – these days we’d rather use varbinary(max), but even in SQL Server 2012, this field is still image. It stores the actual item definition in binary form, whether it’s actually an image, a report, whatever. LinkSourceID is used for Linked Reports, and has a self-referencing foreign key (allowing NULL, of course) back to ItemID. Parameter is an ntext field containing XML for the parameters of the report. Not sure why this couldn’t be a separate table, but I guess that’s just the way it goes. This field gets changed when the default parameters get changed in Report Manager. There is nothing in dbo.Catalog that describes the actual data sources that the report uses. The default data sources would be part of the Content field, as they are defined in the RDL, but when you deploy reports, you typically choose to NOT replace the data sources. Anyway, they’re not in this table. Maybe it was already considered a bit wide to throw in another ntext field, I’m not sure. They’re in dbo.DataSource instead. dbo.DataSource The Primary key is DSID. Yes it’s a uniqueidentifier... ItemID is a foreign key reference back to dbo.Catalog Fields such as ConnectionString, Prompt, UserName and Password do what they say on the tin, storing information about how to connect to the particular source in question. Link is a uniqueidentifier, which refers back to dbo.Catalog. This is used when a data source within a report refers back to a shared data source, rather than embedding the connection information itself. You’d think this should be enforced by foreign key, but it’s not. It does allow NULLs though. Flags this is an int, and I’ll come back to this. When a Data Source gets deleted out of dbo.Catalog, you might assume that it would be disallowed if there are references to it from dbo.DataSource. Well, you’d be wrong. And not because of the lack of a foreign key either. Deleting anything from the catalogue is done by calling a stored procedure called dbo.DeleteObject. You can look at the definition in there – it feels very much like the kind of Delete stored procedures that many people write, the kind of thing that means they don’t need to worry about allowing cascading deletes with foreign keys – because the stored procedure does the lot. Except that it doesn’t quite do that. If it deleted everything on a cascading delete, we’d’ve lost all the data sources as configured in dbo.DataSource, and that would be bad. This is fine if the ItemID from dbo.DataSource hooks in – if the report is being deleted. But if a shared data source is being deleted, you don’t want to lose the existence of the data source from the report. So it sets it to NULL, and it marks it as invalid. We see this code in that stored procedure. UPDATE [DataSource]    SET       [Flags] = [Flags] & 0x7FFFFFFD, -- broken link       [Link] = NULL FROM    [Catalog] AS C    INNER JOIN [DataSource] AS DS ON C.[ItemID] = DS.[Link] WHERE    (C.Path = @Path OR C.Path LIKE @Prefix ESCAPE '*') Unfortunately there’s no semi-colon on the end (but I’d rather they fix the ntext and image types first), and don’t get me started about using the table name in the UPDATE clause (it should use the alias DS). But there is a nice comment about what’s going on with the Flags field. What I’d LIKE it to do would be to set the connection information to a report-embedded copy of the connection information that’s in the shared data source, the one that’s about to be deleted. I understand that this would cause someone to lose the benefit of having the data sources configured in a central point, but I’d say that’s probably still slightly better than LOSING THE INFORMATION COMPLETELY. Sorry, rant over. I should log a Connect item – I’ll put that on my todo list. So it sets the Link field to NULL, and marks the Flags to tell you they’re broken. So this is your clue to fixing it. A bitwise AND with 0x7FFFFFFD is basically stripping out the ‘2’ bit from a number. So numbers like 2, 3, 6, 7, 10, 11, etc, whose binary representation ends in either 11 or 10 get turned into 0, 1, 4, 5, 8, 9, etc. We can test for it using a WHERE clause that matches the SET clause we’ve just used. I’d also recommend checking for Link being NULL and also having no ConnectionString. And join back to dbo.Catalog to get the path (including the name) of broken reports are – in case you get a surprise from a different data source being broken in the past. SELECT c.Path, ds.Name FROM dbo.[DataSource] AS ds JOIN dbo.[Catalog] AS c ON c.ItemID = ds.ItemID WHERE ds.[Flags] = ds.[Flags] & 0x7FFFFFFD AND ds.[Link] IS NULL AND ds.[ConnectionString] IS NULL; When I just ran this on my own machine, having deleted a data source to check my code, I noticed a Report Model in the list as well – so if you had thought it was just going to be reports that were broken, you’d be forgetting something. So to fix those reports, get your new data source created in the catalogue, and then find its ItemID by querying Catalog, using Path and Name to find it. And then use this value to fix them up. To fix the Flags field, just add 2. I prefer to use bitwise OR which should do the same. Use the OUTPUT clause to get a copy of the DSIDs of the ones you’re changing, just in case you need to revert something later after testing (doing it all in a transaction won’t help, because you’ll just lock out the table, stopping you from testing anything). UPDATE ds SET [Flags] = [Flags] | 2, [Link] = '3AE31CBA-BDB4-4FD1-94F4-580B7FAB939D' /*Insert your own GUID*/ OUTPUT deleted.Name, deleted.DSID, deleted.ItemID, deleted.Flags FROM dbo.[DataSource] AS ds JOIN dbo.[Catalog] AS c ON c.ItemID = ds.ItemID WHERE ds.[Flags] = ds.[Flags] & 0x7FFFFFFD AND ds.[Link] IS NULL AND ds.[ConnectionString] IS NULL; But please be careful. Your mileage may vary. And there’s no reason why 400-odd broken reports needs to be quite the nightmare that it could be. Really, it should be less than five minutes. @rob_farley

    Read the article

1