February 3, 2012 at 6:52 am
I have the following two table definitions. The query I have here is sort of what I need to display, except it's not working the way I want it to.
SELECT NVD.[intSupplierID]
      ,NVD.[intDistributionCenterID]
      ,Cust.[intCASSDivisionID]
      ,NVD.[intProductID]
      ,NVD.[intCustomerID]
      ,NVD.[intMonthKey]
      ,NVD.[intPlanID]
      ,NVD.[decRebatableVolumeMTH]
      ,NVD.[decRebatableVolumeYTD]
      ,SUM(NVD.[decRebatableVolumeMTH]) OVER (PARTITION BY NVD.[intSupplierID], NVD.[intDistributionCenterID], Cust.[intCASSDivisionID], NVD.[intProductID], NVD.[intCustomerID], NVD.[intMonthKey], NVD.[intPlanID]) AS SumMTH
FROM [CI_Temp].[CISummary].[NVD_Fact] NVD
INNER JOIN [CI_Temp].[CIMaster].[Customer] Cust
        ON NVD.intCustomerID = Cust.intCustomerID
WHERE Cust.[intCASSDivisionID] NOT IN (6, 10)
  AND NVD.[intSupplierID] = 74
ORDER BY NVD.[intSupplierID], NVD.[intDistributionCenterID], Cust.[intCASSDivisionID], NVD.[intCustomerID], NVD.[intMonthKey], NVD.[intPlanID]
What I want is to display the sum(decRebatableVolumeMTH) incrementally, like this (hopefully this makes sense):
74611413711820249902252
74611413711820250900252
74831090801767149907867
7483109080176715090186272193
74831090991767149902292
74831090991767150908611588
7483180281767149905635
748318028176715090126188131
I'm at a loss. I've tried everything I can think of with GROUP BY, etc. Someone else suggested this use of the OVER clause, which I've never used before. I'm not even sure this sort of thing is possible in a single SELECT statement.
Chris
CREATE TABLE [CISummary].[NVD_Fact](
[intProductID] [int] NOT NULL,
[intCustomerID] [int] NOT NULL,
[intMonthKey] [int] NOT NULL,
[intDistributionCenterID] [int] NULL,
[intSupplierID] [int] NOT NULL,
[intPlanID] [int] NOT NULL,
[decRebatableVolumeYTD] [decimal](18, 0) NULL,
[decRebatableVolumeMTH] [decimal](18, 0) NULL,
[decRebateAMT] [decimal](18, 0) NULL,
[blnMultiTier] [bit] NULL,
[blnClosed] [bit] NULL
)
CREATE TABLE [CIMaster].[Customer](
[intCustomerID] [int] IDENTITY(1,1) NOT NULL,
[strSourceCustomerID] [varchar](10) NULL,
[strName] [varchar](50) NULL,
[intCASSDivisionID] [int] NOT NULL,
[strClientNumber] [varchar](50) NOT NULL,
[strLOB] [varchar](25) NULL,
[strMGL] [varchar](25) NULL,
[intAccountManagerSK] [int] NULL,
[intRegionalManagerSK] [int] NULL,
[decPerDiemRate] [decimal](10, 2) NULL,
[intCapacity] [int] NULL,
[blnOccupied] [bit] NULL,
[decClientBudget] [decimal](10, 2) NULL,
[intARACustomerID] [int] NULL,
[intDataSourceID] [smallint] NOT NULL
)
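As an aside for readers on SQL Server 2012 or later: the OVER clause also accepts an ORDER BY and a ROWS frame there, which produces exactly this kind of running total in a single SELECT. Here is a sketch against the tables above; the PARTITION BY and ORDER BY column choices are my assumptions about the desired grouping, not something confirmed in the thread:

```sql
-- Hedged sketch: requires SQL Server 2012+.
-- Partitioning/ordering columns below are guesses at the intended grouping.
SELECT NVD.[intSupplierID]
      ,NVD.[intCustomerID]
      ,NVD.[intMonthKey]
      ,NVD.[decRebatableVolumeMTH]
      ,SUM(NVD.[decRebatableVolumeMTH]) OVER
           (PARTITION BY NVD.[intSupplierID], NVD.[intCustomerID], NVD.[intPlanID]
            ORDER BY NVD.[intMonthKey]
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningMTH
FROM [CISummary].[NVD_Fact] NVD
INNER JOIN [CIMaster].[Customer] Cust
        ON NVD.intCustomerID = Cust.intCustomerID
WHERE Cust.intCASSDivisionID NOT IN (6, 10)
  AND NVD.intSupplierID = 74
ORDER BY NVD.intSupplierID, NVD.intCustomerID, NVD.intMonthKey;
```

Without the ROWS frame, the windowed SUM returns the partition total on every row rather than a running total, which is why the original query above doesn't accumulate.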
February 3, 2012 at 7:01 am
I believe you're looking for a running total. If so, Jeff Moden has a wonderful article explaining the process. It can be found here. Enjoy! 😀
______________________________________________________________________
Personal Motto: Why push the envelope when you can just open it?
If you follow the direction given HERE, you'll likely increase the number and quality of responses you get to your question.
Jason L. Selburg
February 3, 2012 at 8:12 am
Make sure you read Jeff Moden's running total article; what I'm about to show you is not nearly as good.
BEGIN TRAN
IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
DROP TABLE #testEnvironment
END
--100 Random rows of data
SELECT TOP 100 IDENTITY(INT,1,1) AS ID,
RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2000','2002') + CAST('2000' AS DATETIME) AS randomDate,
RAND(CHECKSUM(NEWID())) * 100 AS randomSmallDec
INTO #testEnvironment
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3
PRINT '========== QUERY =========='
SET STATISTICS TIME ON
SELECT *
FROM #testEnvironment a
CROSS APPLY (SELECT SUM(randomSmallDec)
FROM #testEnvironment cnt
WHERE cnt.randomDate <= a.randomDate
AND DATEADD(yy, DATEDIFF(yy,0,cnt.randomDate), 0) = DATEADD(yy, DATEDIFF(yy,0,a.randomDate), 0)) b(new)
ORDER BY randomDate
SET STATISTICS TIME OFF
PRINT REPLICATE('=',80)
ROLLBACK
The above shows that the query works. Here is the performance test:
BEGIN TRAN
--Standard TestEnvironment of 1,000,000 rows of random-ish data
IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
DROP TABLE #testEnvironment
END
--1,000,000 Random rows of data
SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,
RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2000','2002') + CAST('2000' AS DATETIME) AS randomDate,
RAND(CHECKSUM(NEWID())) * 100 AS randomSmallDec
INTO #testEnvironment
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3
DECLARE @HOLDER INT --USED TO TAKE DISPLAY TIME OUT OF THE EQUATION
PRINT '========== QUERY =========='
SET STATISTICS TIME ON
SELECT @HOLDER = new
FROM #testEnvironment a
CROSS APPLY (SELECT SUM(randomSmallDec)
FROM #testEnvironment cnt
WHERE cnt.randomDate <= a.randomDate
AND DATEADD(yy, DATEDIFF(yy,0,cnt.randomDate), 0) = DATEADD(yy, DATEDIFF(yy,0,a.randomDate), 0)) b(new)
ORDER BY randomDate
SET STATISTICS TIME OFF
PRINT REPLICATE('=',80)
ROLLBACK
After 8 minutes of running the test, I cancelled it. That goes to show exactly what you're getting into as your data set expands, and that's only 1 million rows.
February 5, 2012 at 1:39 pm
chrisvolny (2/3/2012)
What I want is to display the sum(decRebatableVolumeMTH) incrementally; like this. Hopefully, this makes sense.
74611413711820249902252
74611413711820250900252
74831090801767149907867
7483109080176715090186272193
74831090991767149902292
74831090991767150908611588
7483180281767149905635
748318028176715090126188131
If you'd be kind enough to provide a "CREATE TABLE" statement and some 2005-compatible INSERT statements (so everyone can play) to populate it with the data above, I'd be happy to show you exactly how to do this.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 6, 2012 at 3:59 am
http://www.sqlhacks.com/Summary/Running_Total
February 6, 2012 at 7:36 am
johnitech.itech (2/6/2012)
http://www.sqlhacks.com/Summary/Running_Total
I'd recommend NOT using either of those methods, because they both employ a "Triangular Join", which will crush your server even for relatively small data sets. Please see the following article for what a "Triangular Join" is and how it can generate literally millions or billions of internal rows while trying to solve a problem as simple as a running total.
http://www.sqlservercentral.com/articles/T-SQL/61539/
Prior to SQL Server 2012, there are only two ways to do running totals in T-SQL: a cursor/WHILE loop or the "Quirky Update". A cursor/WHILE loop may take a couple of minutes to do a running total on a million rows, but it's easy to understand and supported. The "Quirky Update" will work on a million rows in scant seconds (2-4). Although it can be made quite safe, it's not supported by MS, so some folks tend to shy away from it. It even beats the new methods in 2012 but, again, it's not supported. All three methods work in a single pass of the table.
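For completeness, here is a minimal sketch of the cursor/WHILE-loop approach mentioned above. The table and column names (dbo.SomeFact, Amount, RunningTotal) are invented purely for illustration; adapt them to your own schema:

```sql
-- Hedged sketch of a cursor-based running total (pre-2012, supported method).
-- Assumes a hypothetical table dbo.SomeFact(ID INT PRIMARY KEY,
--     Amount DECIMAL(18,0), RunningTotal DECIMAL(18,0)).
DECLARE @Running DECIMAL(18,0) = 0,
        @ID INT,
        @Amt DECIMAL(18,0);

DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT ID, Amount
    FROM dbo.SomeFact
    ORDER BY ID;            -- the running-total order

OPEN cur;
FETCH NEXT FROM cur INTO @ID, @Amt;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @Running = @Running + @Amt;                       -- accumulate
    UPDATE dbo.SomeFact
       SET RunningTotal = @Running
     WHERE ID = @ID;                                      -- write back per row
    FETCH NEXT FROM cur INTO @ID, @Amt;
END;
CLOSE cur;
DEALLOCATE cur;
```

Unlike the triangular join, this touches each row a fixed number of times, which is why it stays in minutes rather than collapsing as the row count grows.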
The "Triangular Join" is an I/O monster that will generate literally more than 500 billion internal rows to do a running total on a lousy million rows. Consider that a "Triangular Join" will generate (x^2 + x)/2 + x internal rows, where x is the number of rows in the running total. That means for just 100 rows, the "Triangular Join" will generate and scan 5,150 internal rows, and the number of rows grows quadratically from there.
--Jeff Moden
Change is inevitable... Change for the better is not.