blog spotlight - Page 115

NomCom Time

- by RickHeiges

Well, it is official... there is a race for the community seats on the PASS NomCom. I am very pleased to see that we have 12 people who decided to put their names forward for this task. This is largely a thankless job that takes a great deal of time, judgement, and consideration. I have put my name forward as one of those people who would like to take on this task and serve PASS (and the greater SQL Community) in this effort. You can find out more about me and the other candidates for the NomCom...(read more)

Read the article

Presenting at Roanoke Code Camp Saturday!

- by andyleonard

Introduction I am honored to once again be selected to present at Roanoke Code Camp ! An Introductory Topic One of my presentations is titled "I See a Control Flow Tab. Now What?" It's a Level 100 talk for those wishing to learn how to build their very first SSIS package. This highly-interactive, demo-intense presentation is for beginners and developers just getting started with SSIS. Attend and learn how to build SSIS packages from the ground up . Designing an SSIS Framework I'm also presenting...(read more)

Read the article

Less Useful Soft Skills

- by andyleonard

Introduction This post is the fifty-sixth part of a ramble-rant about the software business. The current posts in this series can be found on the series landing page . Over a career that spans decades, one encounters useful and “less useful” soft skills in the modern enterprise. I thought I would share a few of the less useful variety: Free Advice If someone asks another for advice, that’s a cool compliment. The person asking has seen something that compels them to seek information about how-another-does-or-sees-things....(read more)

Read the article

Be the surgeon

- by Rob Farley

It’s a phrase I use often, especially when teaching, and I wish I had realised the concept years earlier. (And of course, fits with this month’s T-SQL Tuesday topic, hosted by Argenis Fernandez) When I’m sick enough to go to the doctor, I see a GP. I used to typically see the same guy, but he’s moved on now. However, when he has been able to roughly identify the area of the problem, I get referred to a specialist, sometimes a surgeon. Being a surgeon requires a refined set of skills. It’s why they often don’t like to be called “Doctor”, and prefer the traditional “Mister” (the history is that the doctor used to make the diagnosis, and then hand the patient over to the person who didn’t have a doctorate, but rather was an expert cutter, typically from a background in butchering). But if you ask the surgeon about the pain you have in your leg sometimes, you’ll get told to ask your GP. It’s not that your surgeon isn’t interested – they just don’t know the answer. IT is the same now. That wasn’t something that I really understood when I got out of university. I knew there was a lot to know about IT – I’d just done an honours degree in it. But I also knew that I’d done well in just about all my subjects, and felt like I had a handle on everything. I got into developing, and still felt that having a good level of understanding about every aspect of IT was a good thing. This got me through for the first six or seven years of my career. But then I started to realise that I couldn’t compete. I’d moved into management, and was spending my days running projects, rather than writing code. The kids were getting older. I’d had a bad back injury (ask anyone with chronic pain how it affects your ability to concentrate, retain information, etc). But most of all, IT was getting larger. I knew kids without lives who knew more than I did. And I felt like I could easily identify people who were better than me in whatever area I could think of. Except writing queries (this was before I discovered technical communities, and people like Paul White and Dave Ballantyne). And so I figured I’d specialise. I wish I’d done it years earlier. Now, I can tell you plenty of people who are better than me at any area you can pick. But there are also more people who might consider listing me in some of their lists too. If I’d stayed the GP, I’d be stuck in management, and finding that there were better managers than me too. If you’re reading this, SQL could well be your thing. But it might not be either. Your thing might not even be in IT. Find out, and then see if you can be a world-beater at it. But it gets even better, because you can find other people to complement the things that you’re not so good at. My company, LobsterPot Solutions, has six people in it at the moment. I’ve hand-picked those six people, along with the one who quit. The great thing about it is that I’ve been able to pick people who don’t necessarily specialise in the same way as me. I don’t write their T-SQL for them – generally they’re good enough at that themselves. But I’m on-hand if needed. Consider Roger Noble, for example. He’s doing stuff in HTML5 and jQuery that I could never dream of doing to create an amazing HTML5 version of PivotViewer. Or Ashley Sewell, a guy who does project management far better than I do. I could go on. My team is brilliant, and I love them to bits. We’re all surgeons, and when we work together, I like to think we’re pretty good! @rob_farley

Read the article

Your Next IT Job

- by BuckWoody

Some data professionals have worked (and plan to work) in the same place for a long time. In organizations large and small, the turnover rate just isn’t that high. This has not been my experience. About every 3-5 years I’ve changed either roles or companies. That might be due to the IT environment or my personality (or a mix of the two), but the point is that I’ve had many roles and worked for many companies large and small throughout my 27+ years in IT. At one point this might have been a detriment – a prospective employer looks at the resume and says “it seems you’ve moved around quite a bit.” But I haven’t found that to be the case all the time –in fact, in some cases the variety of jobs I’ve held has been an asset because I’ve seen what works (and doesn’t) in other environments, which can save time and money. So if you’re in the first camp – great! Stay where you are, and continue doing the work you love. but if you’re in the second, then this post might be useful. If you are planning on making a change, or perhaps you’ve hit a wall at your current location, you might start looking around for a better paying job – and there’s nothing wrong with that. We all try to make our lives better, and for some that involves more money. Money, however, isn’t always the primary motivator. I’ve gone to another job that doesn’t have as many benefits or has the same salary as the current job I’m working to gain more experience, get a better work/life balance and so on. It’s a mix of factors that only you know about. So I thought I would lay out a few advantages and disadvantages in the shops I’ve worked at. This post isn’t aimed at a single employer, but represents a mix of what I’ve experienced, and of course the opinions here are my own. You will most certainly have a different take – if so, please post a response! I also won’t mention a specific industry – I’ve worked everywhere from medical firms, legal offices, retail, billing centers, manufacturing, government, even to NASA. I’m focusing here mostly on size and composition. And I’m making some very broad generalizations here – I am fully aware that a small company might have great benefits and a large company might allow a lot of role flexibility. your mileage may vary – and again, post those comments! Small Company To me a “small company” means around 100 people or less – sometimes a lot less. These can be really fun, frustrating places to to work. Advantages: a great deal of flexibility, a wide range of roles (often at the same time), a large degree of responsibility, immediate feedback, close relationships with co-workers, work directly with your customer. Disadvantages: Too much responsibility, little work/life balance, immature political structure, few (if any) benefits. If the business is family-owned, they can easily violate work/life boundaries. Medium Size company In my experience the next size company I would work for involves from a few hundred people to around five thousand. Advantages: Good mobility – fairly easy to get promoted, acceptable benefits, more defined responsibilities, better work/life balance, balanced load for expertise, but still the organizational structure is fairly simple to understand. Disadvantages: Pay is not always highest, rapid changes in structure as the organization grows, transient workforce. You may not be given the opportunity to work with another technology if someone already “owns” it. Politics are painful at this level as people try to learn how to do it. Large Company When you get into the tens of thousands of folks employed around the world, you’re in a large company. Advantages: Lots of room to move around – sometimes you can work (as I have) multiple jobs through the years and yet stay at the same company, building time for benefits, very defined roles, trained managers (yes, I know some of them are still awful – trust me – I DO know that), higher-end benefits, long careers possible, discounts at retailers and other “soft” benefits, prestige. For some, a higher level of politics (done professionally) is a good thing. Disadvantages: You could become another faceless name in the crowd, might not allow a great deal of flexibility, large organizational changes might take away any control you have of your career. I’ve also seen large layoffs happen, and good people get let go while “dead weight” is retained. For some, a higher level of politics is distasteful. So what are your experiences? Share with the group! Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article

Disqus Comment Form Missing from Posts

- by Saad

I decided to transition from IntenseDebate to Disqus for the blog. So I uninstalled ID via their uninstall process (you upload the template to them, they remove code, you reupload your new template onto the site). Then I went to install Disqus into the site through their Blogger widget method. The problem is that there is no comment form present on any of the blog posts' pages. For example, when you click on the 'Comments' link it jumps to #disqus-thread but there is no thread there. So is there any fix that I can do in order to make the comment form appear? I checked Disqus' knowledgebase for Blogger installation but as far as I can tell my template should be compatible.

Read the article

SQL Community – stronger than ever

- by Rob Farley

I posted a few hours ago about a reflection of the Summit, but I wanted to write another one for this month’s T-SQL Tuesday, hosted by Chris Yates. In January of this year, Adam Jorgensen and I joked around in a video that was used for the SQL Server 2012 launch. We were asked about SQLFamily, and we said how we were like brothers – how we could drive each other crazy (the look he gave me as I patted his stomach was priceless), but that we’d still look out for each other, just like in a real family. And this is really true. Last week at the PASS Summit, there was a lot going on. I was busy as always, as were many others. People told me their good news, their awful news, and some whinged to me about other people who were driving them crazy. But throughout this, people in the SQL Server community genuinely want the best for each other. I’m sure there are exceptions, but I don’t see much of this. Australians aren’t big on cheering for each other. Neither are the English. I think we see it as an American thing. It could be easy for me to consider that the SQL Community that I see at the PASS Summit is mainly there because it’s a primarily American organisation. But when you speak to people like sponsors, or people involved in several types of communities, you quickly hear that it’s not just about that – that PASS has something special. It goes beyond cheering, it’s a strong desire to see each other succeed. I see MVPs feel disappointed for those people who don’t get awarded. I see Summit speakers concerned for those who missed out on the chance to speak. I see chapter leaders excited about the opportunity to help other chapters. And throughout, I see a gentleness and love for people that you rarely see outside the church (and sadly, many churches don’t have it either). Chris points out that the M-W dictionary defined community as “a unified body of individuals”, and I feel like this is true of the SQL Server community. It goes deeper though. It’s not just unity – and we’re most definitely different to each other – it’s more than that. We all want to see each other grow. We all want to pull ourselves up, to serve each other, and to grow PASS into something more than it is today. In that other post of mine I wrote a bit about Paul White’s experience at his first Summit. His missus wrote to me on Facebook saying that she welled up over it. But that emotion was nothing about what I wrote – it was about the reaction that the SQL Community had had to Paul. Be proud of it, my SQL brothers and sisters, and never lose it.

Read the article

When is a Seek not a Seek?

- by Paul White

The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive. IF OBJECT_ID(N'tempdb..#Test', N'U') IS NOT NULL DROP TABLE #Test ; GO CREATE TABLE #Test ( id INTEGER PRIMARY KEY CLUSTERED ); ; INSERT #Test (id) SELECT V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 1000 ; Let’s say we need to find the rows with values from 100 to 170, excluding any values that divide exactly by 10. One way to write that query would be: SELECT T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; That query produces a pretty efficient-looking query plan: Knowing that the source column is defined as an INTEGER, we could also express the query this way: SELECT T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; We get a similar-looking plan: If you look closely, you might notice that the line connecting the two icons is a little thinner than before. The first query is estimated to produce 61.9167 rows – very close to the 63 rows we know the query will return. The second query presents a tougher challenge for SQL Server because it doesn’t know how to predict the selectivity of the modulo expression (T.id % 10 > 0). Without that last line, the second query is estimated to produce 68.1667 rows – a slight overestimate. Adding the opaque modulo expression results in SQL Server guessing at the selectivity. As you may know, the selectivity guess for a greater-than operation is 30%, so the final estimate is 30% of 68.1667, which comes to 20.45 rows. The second difference is that the Clustered Index Seek is costed at 99% of the estimated total for the statement. For some reason, the final SELECT operator is assigned a small cost of 0.0000484 units; I have absolutely no idea why this is so, or what it models. Nevertheless, we can compare the total cost for both queries: the first one comes in at 0.0033501 units, and the second at 0.0034054. The important point is that the second query is costed very slightly higher than the first, even though it is expected to produce many fewer rows (20.45 versus 61.9167). If you run the two queries, they produce exactly the same results, and both complete so quickly that it is impossible to measure CPU usage for a single execution. We can, however, compare the I/O statistics for a single run by running the queries with STATISTICS IO ON: Table '#Test'. Scan count 63, logical reads 126, physical reads 0. Table '#Test'. Scan count 01, logical reads 002, physical reads 0. The query with the IN list uses 126 logical reads (and has a ‘scan count’ of 63), while the second query form completes with just 2 logical reads (and a ‘scan count’ of 1). It is no coincidence that 126 = 63 * 2, by the way. It is almost as if the first query is doing 63 seeks, compared to one for the second query. In fact, that is exactly what it is doing. There is no indication of this in the graphical plan, or the tool-tip that appears when you hover your mouse over the Clustered Index Seek icon. To see the 63 seek operations, you have click on the Seek icon and look in the Properties window (press F4, or right-click and choose from the menu): The Seek Predicates list shows a total of 63 seek operations – one for each of the values from the IN list contained in the first query. I have expanded the first seek node to show the details; it is seeking down the clustered index to find the entry with the value 101. Each of the other 62 nodes expands similarly, and the same information is contained (even more verbosely) in the XML form of the plan. Each of the 63 seek operations starts at the root of the clustered index B-tree and navigates down to the leaf page that contains the sought key value. Our table is just large enough to need a separate root page, so each seek incurs 2 logical reads (one for the root, and one for the leaf). We can see the index depth using the INDEXPROPERTY function, or by using the a DMV: SELECT S.index_type_desc, S.index_depth FROM sys.dm_db_index_physical_stats ( DB_ID(N'tempdb'), OBJECT_ID(N'tempdb..#Test', N'U'), 1, 1, DEFAULT ) AS S ; Let’s look now at the Properties window when the Clustered Index Seek from the second query is selected: There is just one seek operation, which starts at the root of the index and navigates the B-tree looking for the first key that matches the Start range condition (id >= 101). It then continues to read records at the leaf level of the index (following links between leaf-level pages if necessary) until it finds a row that does not meet the End range condition (id <= 169). Every row that meets the seek range condition is also tested against the Residual Predicate highlighted above (id % 10 > 0), and is only returned if it matches that as well. You will not be surprised that the single seek (with a range scan and residual predicate) is much more efficient than 63 singleton seeks. It is not 63 times more efficient (as the logical reads comparison would suggest), but it is around three times faster. Let’s run both query forms 10,000 times and measure the elapsed time: DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON; SET STATISTICS XML OFF; ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; GO DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; On my laptop, running SQL Server 2008 build 4272 (SP2 CU2), the IN form of the query takes around 830ms and the range query about 300ms. The main point of this post is not performance, however – it is meant as an introduction to the next few parts in this mini-series that will continue to explore scans and seeks in detail. When is a seek not a seek? When it is 63 seeks © Paul White 2011 email: [email protected] twitter: @SQL_kiwi

Read the article

Oracle Index Skip Scan

- by jchang

There is a feature, called index skip scan that has been in Oracle since version 9i. When I across this, it seemed like a very clever trick, but not a critical capability. More recently, I have been advocating DW on SSD in approrpiate situations, and I am thinking this is now a valuable feature in keeping the number of nonclustered indexes to a minimum. Briefly, suppose we have an index with key columns: Col1 , Col2 , in that order. Obviously, a query with a search argument (SARG) on Col1 can use...(read more)

Read the article

The updated Survey pattern for Power Pivot and Tabular #powerpivot #tabular #ssas #dax

- by Marco Russo (SQLBI)

One of the first models I created for the many-to-many revolution white paper was the Survey one. At the time, it was in Analysis Services Multidimensional, and then we implemented it in Analysis Services Tabular and in Power Pivot, using the DAX language. I recently reviewed the data model and published it in the Survey article on DAX Patterns site. The Survey pattern is the foundation for others, such as the Basket Analysis, and it is widely used in many different business scenario. I was particularly happy to know it has been using to perform data analysis for cancer research! In this article I did some maintenance on the DAX formulas, checking that the proper error handling is part of the formulas, and highlighting some differences in slicers behavior between Excel 2010 and Excel 2013, which could be particularly important for the Survey scenario. As usual, we provide sample workbooks for both Excel 2010 and Excel 2013, and we use DAX Formatter to make the DAX code easier to read. Any feedback will be appreciated!

Read the article

Ola Hallengren adds STATISTICS support to his solution

- by AaronBertrand

Last week, Ola published a very useful update to his Backup, Integrity Check and Index Optimization scripts : the solution now supports updating statistics. There are several options, such as only updating when the data has been modified and using the RESAMPLE and NORECOMPUTE options. An example call: EXEC dbo.IndexOptimize @Databases = 'USER_DATABASES' , @FragmentationHigh_LOB = 'INDEX_REBUILD_OFFLINE' , @FragmentationHigh_NonLOB = 'INDEX_REBUILD_ONLINE' , @FragmentationMedium_LOB = 'INDEX_REORGANIZE_STATISTICS_UPDATE'...(read more)

Read the article

Managed code and the Shell – Do?

Back in 2006 I wrote a blog post titled: Managed code and the Shell – Don't!. Please visit that post to see why that advice was given.The crux of the issue has been addressed in the latest CLR via In-Process Side-by-Side Execution. In addition to the MSDN documentation I just linked, there is also an MSDN article on the topic: In-Process Side-by-Side.Now, even though the major technical impediment seems to be removed, I don’t know if Microsoft is now officially supporting managed extensions to the shell. Either way, I noticed a CodePlex project that is marching ahead to enable exactly that: Managed Mini Shell Extension Framework. Not much activity there, but maybe it will grow once .NET 4 is released... Comments about this post welcome at the original blog.

Read the article

Reproducing a Conversion Deadlock

- by Alexander Kuznetsov

Even if two processes compete on only one resource, they still can embrace in a deadlock. The following scripts reproduce such a scenario. In one tab, run this: CREATE TABLE dbo.Test ( i INT ) ; GO INSERT INTO dbo.Test ( i ) VALUES ( 1 ) ; GO SET TRANSACTION ISOLATION LEVEL SERIALIZABLE ; BEGIN TRAN SELECT i FROM dbo.Test ; --UPDATE dbo.Test SET i=2 ; After this script has completed, we have an outstanding transaction holding a shared lock. In another tab, let us have that another connection have...(read more)

Read the article

OT: Thank You, Microsoft

- by andyleonard

cross-posted from AndyLeonard.me … Each April 1st for the past five years, I have been honored to receive an email from Microsoft informing me I have been recognized as a SQL Server MVP. Tomorrow will be different. Back in January – when I wrote this – I requested Microsoft not consider me for renewal. I have enjoyed serving as a Microsoft MVP. I only got to see what it is like to be a SQL Server MVP, and I think we are part of a special community that makes being an MVP even more special. I have...(read more)

Read the article

Data Education: Great Classes Coming to a City Near You

- by Adam Machanic

In case you haven't noticed, Data Education (the training company I started a couple of years ago) has expanded beyond the US northeast; we're currently offering courses with top trainers in both St. Louis and Chicago , as well as the Boston area. The courses are starting to fill up fast—not surprising when you consider we’re talking about experienced instructors like Kalen Delaney , Rob Farley , and Allan Hirt —but we have still have some room. We’re very excited about bringing the highest quality...(read more)

Read the article

Dynamic Ranking with Excel and PowerPivot

- by AlbertoFerrari

Ranking is useful and, in our book , I and Marco provide a lot of information about how to perform ranking with PowerPivot. Nevertheless, there is an interesting scenario where ranking can be performed without complex DAX formulas, but with just some creative Excel usage. I would like to describe it here. Let us start with some words about the scenario: we want to rank products based on sales in a year (e.g. 2002) and see how the top 10 of these products performed in the following or preceding years....(read more)

Read the article

24 Hours of PASS: 15 Powerful Dynamic Management Objects - Deck and Demos

- by Adam Machanic

Thank you to everyone who attended today's 24 Hours of PASS webcast on Dynamic Management Objects! I was shocked, awed, and somewhat scared when I saw the attendee number peak at over 800. I really appreciate your taking time out of your day to listen to me talk. It's always interesting presenting to people I can't see or hear, so I relied on Twitter for a form of nearly real-time feedback. I would like to especially thank everyone who left me tweets both during and after the presentation. Your feedback...(read more)

Read the article

Oracle Product Leader Named a Leader in Gartner MQ for MDM of Product Data Solutions

- by Mala Narasimharajan

Gartner recently Oracle as a leader in the MQ report for MDM of Product Data Solutions. They named Oracle as a leader with the following key points: Strong MDM portfolio covering multiple data domains, industries and use cases Oracle PDH can be a good fit for Oracle EBS customers and can form part of a multidomain solution: Deep MDM of product data functionality Evolving support for information stewardship For more information on the report visit Oracle's Analyst Relations blog at http://blog.us.oracle.com/dimdmar/. To learn more about Oracle's product information solutions for master data management click here.

Read the article

Fun with Aggregates

- by Paul White

There are interesting things to be learned from even the simplest queries. For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join. The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units. The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too? To answer that, let’s look at some other ways to formulate this query. This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones. The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above. It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join! The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join. In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting. The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set. In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL). To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709; -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case. The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate. This is something I would like to see exposed in show plan so I suggested it on Connect. Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all. Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans. Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :) The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier. What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too). Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places. It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates. As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

Read the article

Write DAX queries in Report Builder #ssrs #dax #ssas #tabular

- by Marco Russo (SQLBI)

If you use Report Builder with Reporting Services, you can use DAX queries even if the editor for Analysis Services provider does not support DAX syntax. In fact, the DMX editor that you can use in Visual Studio editor of Reporting Services (see a previous post on that), is not available in Report Builder. However, as Sagar Salvi commented in this Microsoft Connect entry, you can use the DAX query text in the query of a Dataset by using the OLE DB provider instead of the Analysis Services one. I think it’s a good idea to show the steps required. First, create a DataSet using the OLE DB connection type, and provide the connection string the provider (Provider), the server name (Data Source) and the database name (Initial Catalog), such as: Provider=MSOLAP;Data Source=SERVERNAME\\TABULAR;Initial Catalog=AdventureWorks Tabular Model SQL 2012 Then, create a Dataset using the data source previously defined, select the Text query type, and write the DAX code in the Query pane: You can also use the Query Designer window, that doesn’t provide any particular help in writing the DAX query, but at least can show a preview of the result of the query execution. I hope DAX will get better editors in the future… in the meantime, remember you can use DAX Studio to write and test your DAX queries, and DAX Formatter to improve their readability!If you want to learn the DAX Query Language, I suggest you watching my video Data Analysis Expressions as a Query Language on Project Botticelli!

Read the article

Possible SWITCH Optimization in DAX – #powerpivot #dax #tabular

- by Marco Russo (SQLBI)

In one of the Advanced DAX Workshop I taught this year, I had an interesting discussion about how to optimize a SWITCH statement (which could be frequently used checking a slicer, like in the Parameter Table pattern). Let’s start with the problem. What happen when you have such a statement? Sales := SWITCH ( VALUES ( Period[Period] ), "Current", [Internet Total Sales], "MTD", [MTD Sales], "QTD", [QTD Sales], "YTD", [YTD Sales], BLANK () ) The SWITCH statement is in reality just syntax sugar for a nested IF statement. When you place such a measure in a pivot table, for every cell of the pivot table the IF options are evaluated. In order to optimize performance, the DAX engine usually does not compute cell-by-cell, but tries to compute the values in bulk-mode. However, if a measure contains an IF statement, every cell might have a different execution path, so the current implementation might evaluate all the possible IF branches in bulk-mode, so that for every cell the result from one of the branches will be already available in a pre-calculated dataset. The price for that could be high. If you consider the previous Sales measure, the YTD Sales measure could be evaluated for all the cells where it’s not required, and also when YTD is not selected at all in a Pivot Table. The actual optimization made by the DAX engine could be different in every build, and I expect newer builds of Tabular and Power Pivot to be better than older ones. However, we still don’t live in an ideal world, so it could be better trying to help the engine finding a better execution plan. One student (Niek de Wit) proposed this approach: Selection := IF ( HASONEVALUE ( Period[Period] ), VALUES ( Period[Period] ) ) Sales := CALCULATE ( [Internet Total Sales], FILTER ( VALUES ( 'Internet Sales'[Order Quantity] ), 'Internet Sales'[Order Quantity] = IF ( [Selection] = "Current", 'Internet Sales'[Order Quantity], -1 ) ) ) + CALCULATE ( [MTD Sales], FILTER ( VALUES ( 'Internet Sales'[Order Quantity] ), 'Internet Sales'[Order Quantity] = IF ( [Selection] = "MTD", 'Internet Sales'[Order Quantity], -1 ) ) ) + CALCULATE ( [QTD Sales], FILTER ( VALUES ( 'Internet Sales'[Order Quantity] ), 'Internet Sales'[Order Quantity] = IF ( [Selection] = "QTD", 'Internet Sales'[Order Quantity], -1 ) ) ) + CALCULATE ( [YTD Sales], FILTER ( VALUES ( 'Internet Sales'[Order Quantity] ), 'Internet Sales'[Order Quantity] = IF ( [Selection] = "YTD", 'Internet Sales'[Order Quantity], -1 ) ) ) At first sight, you might think it’s impossible that this approach could be faster. However, if you examine with the profiler what happens, there is a different story. Every original IF’s execution branch is now a separate CALCULATE statement, which applies a filter that does not execute the required measure calculation if the result of the FILTER is empty. I used the ‘Internet Sales’[Order Quantity] column in this example just because in Adventure Works it has only one value (every row has 1): in the real world, you should use a column that has a very low number of distinct values, or use a column that has always the same value for every row (so it will be compressed very well!). Because the value –1 is never used in this column, the IF comparison in the filter discharge all the values iterated in the filter if the selection does not match with the desired value. I hope to have time in the future to write a longer article about this optimization technique, but in the meantime I’ve seen this optimization has been useful in many other implementations. Please write your feedback if you find scenarios (in both Power Pivot and Tabular) where you obtain performance improvements using this technique!

Read the article

SQL Rally Presentations

- by AllenMWhite

As I drove to Dallas for this year's SQL Rally conference (yes, I like to drive) I got a call asking if I could step in for another presenter who had to cancel at the last minute. Life happens, and it's best to be flexible, and I said sure, I can do that. Which presentation would you like me to do? (I'd submitted a few presentations, so it wasn't a problem.) So yesterday I presented "Gathering Performance Metrics With PowerShell" at 8:45AM, and my newest presentation, "Manage SQL Server 2012 on Windows...(read more)

Read the article

MicroTraining: Executing SSIS 2012 Packages 22 May 10:00 AM EDT (Free!)

- by andyleonard

I am pleased to announce the latest (free!) Linchpin People microtraining event will be held Tuesday 22 May 2012 at 10:00 AM EDT. The topic will be Executing SSIS 2012 Packages. In this presentation, I will be demonstrating several ways to execute SSIS 2012 packages. Register here ! Interested in learning about more microtraining from Linchpin People – before anyone else? Sign up for our newsletter ! :{>...(read more)

Read the article

Extracting GPS Data from JPG files

- by Peter W. DeBetta

I have been very remiss in posting lately. Unfortunately, much of what I do now involves client work that I cannot post. Fortunately, someone asked me how he could get a formatted list (e.g. tab-delimited) of files with GPS data from those files. He also added the constraint that this could not be a new piece of software (company security) and had to be scriptable. I did some searching around, and found some techniques for extracting GPS data, but was unable to find a complete solution. So, I did...(read more)

Read the article

Silverlight as a Transmedia Platform (Silverlight TV #33)

In this mini episode Jesse Liberty explains Transmedia Storytelling and why he believes that Silverlight may be the ideal platform for creating Transmedia applications on the web, Windows Phone 7 and eventually set-top boxes. Relevant links: John's Blog and on Twitter (@john_papa) Jesse's blog and on Twitter (@jesseliberty) Jesses mini-tutorial on Silverlight and Transmedia Follow us on Twitter @SilverlightTV or on the web at http://silverlight.tv/ ...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Search Results

Search found 10711 results on 429 pages for 'blog spotlight'.

Page 115/429 | < Previous Page | 111 112 113 114 115 116 117 118 119 120 121 122 | Next Page >

- by RickHeiges

- by andyleonard

- by andyleonard

- by Rob Farley

- by BuckWoody

- by Saad

- by Rob Farley

- by Paul White

- by jchang

- by Marco Russo (SQLBI)

- by AaronBertrand

- by Alexander Kuznetsov

- by andyleonard

- by Adam Machanic

- by AlbertoFerrari

- by Adam Machanic

- by Mala Narasimharajan

- by Paul White

- by Marco Russo (SQLBI)

- by Marco Russo (SQLBI)

- by AllenMWhite

- by andyleonard

- by Peter W. DeBetta

< Previous Page | 111 112 113 114 115 116 117 118 119 120 121 122 | Next Page >