SQL SERVER – Introduction to CUME_DIST – Analytic Functions Introduced in SQL Server 2012

Posted by pinaldave on SQL Authority See other posts from SQL Authority or by pinaldave
Published on Tue, 08 Nov 2011 01:30:07 +0000 Indexed on 2011/11/11 18:04 UTC
Read the original article Hit count: 465

This blog post is written in response to the T-SQL Tuesday post of Prox ‘n’ Funx. This is a very interesting subject. By the way Brad Schulz is my favorite guy when it is about blogging. I respect him as well learn a lot from him. Everybody is writing something new his subject, I decided to start SQL Server 2012 analytic functions series.

SQL Server 2012 introduces new analytical function CUME_DIST(). This function provides cumulative distribution value. It will be very difficult to explain this in words so I will attempt small example to explain you this function. Instead of creating new table, I will be using AdventureWorks sample database as most of the developer uses that for experiment.

Let us fun following query.
USE AdventureWorks
GO
SELECT SalesOrderID, OrderQty,
CUME_DIST() OVER(ORDER BY SalesOrderID) AS CDist
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY CDist DESC
GO

Above query will give us following result.

Now let us understand what is the formula behind CUME_DIST and why the values in SalesOrderID = 43670 are 1.

Let us take more example and be clear about why the values in SalesOrderID = 43667 are 0.5.

Now let us enhence the same example and use PARTITION BY into the OVER clause and see the results. Run following query in SQL Server 2012.

USE AdventureWorks
GO
SELECT SalesOrderID, OrderQty, ProductID,
CUME_DIST() OVER(PARTITION BY SalesOrderID
ORDER BY ProductID ) AS CDist
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID DESC, CDist DESC
GO

Now let us see the result of this query. We are have changed the ORDER BY clause as well partitioning by SalesOrderID.

You can see that CUME_DIST() function provides us different results. Additionally now we see value 1 multiple times. As we are using partitioning for each group of SalesOrderID we get the CUME_DIST() value.

CUME_DIST() was long awaited Analytical function and I am glad to see it in SQL Server 2012.

Reference: Pinal Dave (http://blog.SQLAuthority.com)


Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Function, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

© SQL Authority or respective owner

Related posts about Pinal Dave

Related posts about PostADay