SQL Server – "Denali" – Using Analytic Function CUME_DIST()

CUME_DIST() calculates relative position of a value relative to a group of values. The value returned by CUME_DIST() is > 0 and <= 1, which represents percentage of number of rows with value less than (for ascending order) or equal to current row.

For example, consider below data:

Year Month Amount

2010 1 5000.00

2010 2 6000.00

2010 3 7000.00

2010 4 2000.00

2011 1 1000.00

2011 2 2000.00

2011 3 3000.00

2011 4 4000.00

Cumulative distribution of row 4 (year = 2010, month = 4) will be 37.50%, as number of rows which has value <= row 4 are 3 rows, and the value returned by CUME_DIST() will be 0.375.

Using CUME_DIST():

CUME_DIST() takes an ORDER BY clause as argument, ORDER BY clause determines the order in which the operation is performed:

SELECT [Year], [Month], [Amount],

CUME_DIST() OVER (ORDER BY [Amount]) CUME_DIST

–CAST(CUME_DIST() OVER (ORDER BY [Amount]) * 100 AS VARCHAR) + '%' CUME_DIST

FROM dbo.SalesData

ORDER BY [Amount]

Result Set:

Year Month Amount CUME_DIST

2011 1 1000.00 0.125

2011 2 2000.00 0.375

2010 4 2000.00 0.375

2011 3 3000.00 0.5

2011 4 4000.00 0.625

2010 1 5000.00 0.75

2010 2 6000.00 0.875

2010 3 7000.00 1

CUME_DUST() with PARTITION BY:

PARTITION BY clause divides the result set into partition to which CUME_DIST() is applied:

SELECT [Year], [Month], [Amount],

CUME_DIST() OVER (PARTITION BY [Year] ORDER BY [Amount])

CUME_DIST

FROM dbo.SalesData

ORDER BY [Year], [Month], [Amount]

Result Set:

Year Month Amount CUME_DIST

2010 1 5000.00 0.5

2010 2 6000.00 0.75

2010 3 7000.00 1

2010 4 2000.00 0.25

2011 1 1000.00 0.25

2011 2 2000.00 0.5

2011 3 3000.00 0.75

2011 4 4000.00 1

CUME_DIST() can also be used to find "TOP" rows, check below queries which selects 50% of data and the different result sets generated by them:

;WITH tempTable

(

SELECT [Year], [Month], [Amount],

CUME_DIST() OVER (ORDER BY [Amount]) CUME_DIST

FROM dbo.SalesData

)

SELECT [Year], [Month], [Amount]

FROM tempTable

WHERE CUME_DIST < 0.50

SELECT TOP 50 PERCENT [Year], [Month], [Amount]

FROM dbo.SalesData

Result Sets:

Year Month Amount

———– ———– ———————

2011 1 1000.00

2011 2 2000.00

2010 4 2000.00

(3 row(s) affected)

Year Month Amount

———– ———– ———————

2011 1 1000.00

2011 2 2000.00

2011 3 3000.00

2011 4 4000.00

(4 row(s) affected)

CUME_DIST() has been available in Oracle since 8i, and can also be used as an aggregate function in Oracle.

Hope This Helps! Cheers!

Reference : Vishal (http://SqlAndMe.com)

Filed under: SQLServer, SQLServer "Denali"

SQL Server – "Denali" – Using Analytic Function CUME_DIST()

Rate

Share

Share

Rate

SQL Server – "Denali" – Using Analytic Function CUME_DIST()

Rate

Share

Share

Rate

Related content

Book Review: Big Red - Voyage of a Trident Submarine

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Inserting Markup into a String with SQL

Networking - Part 4

Speaking at Community Events - More Thoughts