Calculating Year to Date total

Question

Calculating Year to Date total

chrisvolny

SSC Journeyman

Points: 89
More actions
February 3, 2012 at 6:52 am

#254748

I have the following two table definitions. The query I have here is sort of what I need to display, except it's not working the way I want it to.
SELECT [intSupplierID]
,[intDistributionCenterID]
,[intCASSDivisionID]
,[intProductID]
,NVD.[intCustomerID]
,[intMonthKey]
,[intPlanID]
,[decRebatableVolumeMTH]
,[decRebatableVolumeYTD]
,SUM([decRebatableVolumeMTH]) over (PARTITION by [intSupplierID],[intDistributionCenterID],[intCASSDivisionID],[intProductID],NVD.[intCustomerID],[intMonthKey],[intPlanID])
FROM [CI_Temp].[CISummary].[NVD_Fact] NVD
, [CI_Temp].[CIMaster].[Customer] Cust
where NVD.intCustomerID = Cust.intCustomerID
and intCASSDivisionID not in (6, 10)
and intSupplierID = 74
order by 1,2,3,5,6,7
What I want is to display the sum(decRebatableVolumeMTH) incrementally; like this. Hopefully, this makes sense.
74611413711820249902252
74611413711820250900252
74831090801767149907867
7483109080176715090186272193
74831090991767149902292
74831090991767150908611588
7483180281767149905635
748318028176715090126188131
I'm at a loss, I've tried all I can think of with Group By, etc. Someone else suggested this use of the 'OVER' clause which I've never used before. I'm not even sure this sort of thing is possible in a single 'SELECT' statement.
Chris
CREATE TABLE [CISummary].[NVD_Fact](
[intProductID] [int] NOT NULL,
[intCustomerID] [int] NOT NULL,
[intMonthKey] [int] NOT NULL,
[intDistributionCenterID] [int] NULL,
[intSupplierID] [int] NOT NULL,
[intPlanID] [int] NOT NULL,
[decRebatableVolumeYTD] [decimal](18, 0) NULL,
[decRebatableVolumeMTH] [decimal](18, 0) NULL,
[decRebateAMT] [decimal](18, 0) NULL,
[blnMultiTier] [bit] NULL,
[blnClosed] [bit] NULL
)
CREATE TABLE [CIMaster].[Customer](
[intCustomerID] [int] IDENTITY(1,1) NOT NULL,
[strSourceCustomerID] [varchar](10) NULL,
[strName] [varchar](50) NULL,
[intCASSDivisionID] [int] NOT NULL,
[strClientNumber] [varchar](50) NOT NULL,
[strLOB] [varchar](25) NULL,
[strMGL] [varchar](25) NULL,
[intAccountManagerSK] [int] NULL,
[intRegionalManagerSK] [int] NULL,
[decPerDiemRate] [decimal](10, 2) NULL,
[intCapacity] [int] NULL,
[blnOccupied] [bit] NULL,
[decClientBudget] [decimal](10, 2) NULL,
[intARACustomerID] [int] NULL,
[intDataSourceID] [smallint] NOT NULL
)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Jason Selburg SSC-Insane Points: 24641 More actions · Answer 1

I believe you're looking for a running total. If so, Mr. Jeff Moden has a wonderful article explaining the process. It can be found --> here[/url]. <-- Enjoy! 😀

______________________________________________________________________

Personal Motto: Why push the envelope when you can just open it?

If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

Jason L. Selburg

Cadavre SSC-Forever Points: 41690 More actions · Answer 2

Make sure you read Jeff Moden's running total article, what I'm about to show you is not even nearly as good.

BEGIN TRAN

IF object_id('tempdb..#testEnvironment') IS NOT NULL

BEGIN

DROP TABLE #testEnvironment

END

--100 Random rows of data

SELECT TOP 100 IDENTITY(INT,1,1) AS ID,

RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2000','2002') + CAST('2000' AS DATETIME) AS randomDate,

RAND(CHECKSUM(NEWID())) * 100 AS randomSmallDec

INTO #testEnvironment

FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

PRINT '========== QUERY =========='

SET STATISTICS TIME ON

SELECT *

FROM #testEnvironment a

CROSS APPLY (SELECT SUM(randomSmallDec)

FROM #testEnvironment cnt

WHERE cnt.randomDate <= a.randomDate

AND DATEADD(yy, DATEDIFF(yy,0,cnt.randomDate), 0) = DATEADD(yy, DATEDIFF(yy,0,a.randomDate), 0)) b(new)

ORDER BY randomDate

SET STATISTICS TIME OFF

PRINT REPLICATE('=',80)

ROLLBACK

The above is to show you that the query works.

Here is perf testing: -

BEGIN TRAN

--Standard TestEnvironment of 1,000,000 rows of random-ish data

IF object_id('tempdb..#testEnvironment') IS NOT NULL

BEGIN

DROP TABLE #testEnvironment

END

--1,000,000 Random rows of data

SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,

RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2000','2002') + CAST('2000' AS DATETIME) AS randomDate,

RAND(CHECKSUM(NEWID())) * 100 AS randomSmallDec

INTO #testEnvironment

FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

DECLARE @HOLDER INT --USED TO TAKE DISPLAY TIME OUT OF THE EQUATION

PRINT '========== QUERY =========='

SET STATISTICS TIME ON

SELECT @HOLDER = new

FROM #testEnvironment a

CROSS APPLY (SELECT SUM(randomSmallDec)

FROM #testEnvironment cnt

WHERE cnt.randomDate <= a.randomDate

AND DATEADD(yy, DATEDIFF(yy,0,cnt.randomDate), 0) = DATEADD(yy, DATEDIFF(yy,0,a.randomDate), 0)) b(new)

ORDER BY randomDate

SET STATISTICS TIME OFF

PRINT REPLICATE('=',80)

ROLLBACK

After 8 minutes of running the test, I cancelled it. Goes to show you exactly what you're getting into as your data-set expands, and that's only for 1 million rows.

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Jeff Moden SSC Guru Points: 1004318 More actions · Answer 3

chrisvolny (2/3/2012)
What I want is to display the sum(decRebatableVolumeMTH) incrementally; like this. Hopefully, this makes sense.
74611413711820249902252
74611413711820250900252
74831090801767149907867
7483109080176715090186272193
74831090991767149902292
74831090991767150908611588
7483180281767149905635
748318028176715090126188131

If you'd be kind enough to provide a "CREATE TABLE" statment and some 2005 compatible INSERT statements (so everyone can play) to populate it with the data above, I'd be happy to show you exactly how to do this.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

johnitech.itech Hall of Fame Points: 3252 More actions · Answer 4

check these about ur problem

http://www.sqlhacks.com/Summary/Running_Total

http://www.databasejournal.com/features/mssql/article.php/3112381/SQL-Server-Calculating-Running-Totals-Subtotals-and-Grand-Total-Without-a-Cursor.htm

Jeff Moden SSC Guru Points: 1004318 More actions · Answer 5

johnitech.itech (2/6/2012)

check these about ur problem

http://www.sqlhacks.com/Summary/Running_Total

http://www.databasejournal.com/features/mssql/article.php/3112381/SQL-Server-Calculating-Running-Totals-Subtotals-and-Grand-Total-Without-a-Cursor.htm%5B/quote%5D

I'd recommend NOT using either of those methods because they both employ a "Triangular Join" which will crush your server even for relatively small sets of data to do a running total on. Please see the following article for what a "Triangular Join" is and how it can generate literally millions and billions of internal rows in trying to solve such a simple problem as a running total.

http://www.sqlservercentral.com/articles/T-SQL/61539/

Prior to SQL Server 2012, there are two ways to do running totals in T-SQL... cursor/While loop or "Quirky Update". A cursor/While loop may take a couple of minutes to do a running total on a million rows but it's easy to understand and supported. The "Quirky Update" will work on a million rows in scant seconds (2-4). Although it can be made to be quite safe, it's not supported by MS so some folks tend to shy away from it. It even beats the methods in 2012 but, again, it's not supported. All 3 methods work in a single pass of the table.

The "Triangular Join" is an I/O monster that will generate literally more than 500 BILLION internal rows to do a running total on a lousy million rows. Consider that a "Triangular Join" will generate (x²+x)/2+x internal rows where x is the number of rows in the running total. That means for just 100 rows, the "Triangular Join" will generate and scan 5,150 internal rows to process and the number of rows very quickly grows in an exponential fashion.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)