Hi all,
I would like a running total that restarts after a certain limit. Let's set the limit to 30, so each time the running sum reaches 30 a new group starts.
Cat Value Running_total
101 10 10
102 20 30
103 30 30
104 12 12
105 18 30
106 10 10
107 10 20
becomes
Category Value Running_total GroupID
101 10 10 1
102 20 30 1
103 30 30 2
104 12 12 3
105 18 30 3
106 10 10 4
107 10 20 4
I am using the following for the running total. It restarts the counter, but it adds the remainder onto the previous total instead of restarting with the exact value (assuming 105 has a value of 20):
sum([Value]) over (order by [Category] rows unbounded preceding) % 30 as Running_Total
this results in:
Category Value Running_total
101 10 10
102 20 30
103 30 30
104 12 12
105 20 2
106 10 12
107 10 27
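A quick sketch (in Python rather than T-SQL, since it's easier to step through) of why the modulo approach drifts: `% 30` keeps one ever-growing cumulative sum and only reports the remainder, while a true reset has to start the total over at the current row's value. Values here assume 105 = 20, as above; the exact figures posted may differ slightly, but the drift after a reset is the point:

```python
values = [10, 20, 30, 12, 20, 10, 10]  # Categories 101-107, with 105 = 20

# Modulo approach: one ever-growing cumulative sum, reported % 30.
# The remainder carries forward, so it never truly restarts.
cum, modulo = 0, []
for v in values:
    cum += v
    modulo.append(cum % 30)

# True reset: when adding a value would exceed 30,
# the running total restarts at that row's value.
total, reset = 0, []
for v in values:
    total = v if total + v > 30 else total + v
    reset.append(total)

print(modulo)  # the remainders drift once a group ends mid-value
print(reset)   # restarts cleanly at each row that overflows
```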
Many thanks
DECLARE @T TABLE (Category VARCHAR(5), Value INT)
INSERT INTO @T VALUES
('101', 10),
('102', 20),
('103', 30),
('104', 12),
('105', 19),
('106', 10),
('107', 10)
September 11, 2017 at 8:49 am
How are you getting a running total when 104, 105, etc don't add to the total? Do you mean after a reset you stop?
September 11, 2017 at 9:05 am
Say I can only fit a max of 30 values on one page.
101, 102 = page 1
103 = page 2
104, 105 = page 3
106, 107 = page 4
The running total resets when it reaches 30, then restarts at zero for the next line, checks whether the running total is <= 30, and assigns a page. So what I really want as output is:
Category Value Running_total Page
101 10 10 1
102 20 30 1
103 30 30 2
104 12 12 3
105 18 30 3
106 10 10 4
107 10 20 4
September 11, 2017 at 11:08 am
What happens if the sum is not exactly 30? Would you need to divide it? or should you move the whole row to a new page?
September 11, 2017 at 11:47 am
Just move it onto the next row, which resets and starts a new running total.
Category Value Running_total Page
101 10 10 1
102 20 30 1
103 30 30 2
104 12 12 3
105 20 20 4
106 5 25 4
107 10 10 5
Above, I changed rows 105 onwards to illustrate your point.
At 104 the running total is 12; 105's value of 20 would take it over 30, so 104 closes group 3.
At 105 the running total is 20; adding 106's value of 5 gives 25, and 107's value of 10 would go over to 35, so assign 105 and 106 group ID 4.
And so on.
September 11, 2017 at 1:06 pm
K, rolled into a proc since it's easier to test. Put your data into a table called RTSource. This lets me add other data and play.
Proc:
CREATE OR ALTER PROCEDURE RunningTotalQueries
AS
WITH lagCTE
AS (SELECT
Category,
SomeValue,
LagValue1 = LAG(SomeValue, 1, 0) OVER (ORDER BY Category),
LagValue2 = LAG(SomeValue, 2, 0) OVER (ORDER BY Category)
FROM RTSource
)
SELECT
lagCTE.Category,
lagCTE.SomeValue,
Sums = CASE
WHEN lagCTE.SomeValue + lagCTE.LagValue1 > 30 THEN
lagCTE.SomeValue
ELSE
lagCTE.SomeValue + lagCTE.LagValue1
END
FROM lagCTE;
GO
Test. Uses tSQLt. If I wanted to check other data, I'd either add it into the test or I'd create duplicate tests.
EXEC tSQLt.NewTestClass @ClassName = N'tTSQLTests';
GO
CREATE PROCEDURE tTSQLTests.[test running total reset]
AS
-----------------------------------
------- Assemble
-----------------------------------
EXEC tsqlt.FakeTable
@TableName = N'RTSource';
INSERT RTSource
VALUES ('101', 10),
('102', 20),
('103', 30),
('104', 12),
('105', 19),
('106', 10),
('107', 10);
CREATE TABLE tTSQLTests.Expected
( Category VARCHAR(5),
SomeValue INT,
RunningTotal INT
);
INSERT INTO tTSQLTests.Expected
VALUES
('101', 10, 10),
('102', 20, 30),
('103', 30, 30),
('104', 12, 12),
('105', 19, 19),
('106', 10, 29),
('107', 10, 10);
SELECT
Category,
SomeValue,
RunningTotal
INTO tTSQLTests.Actual
FROM tTSQLTests.Expected
WHERE 1 = 0;
-----------------------------------
------- Act
-----------------------------------
INSERT tTSQLTests.Actual EXEC RunningTotalQueries
-----------------------------------
------- Assert
-----------------------------------
EXEC tsqlt.AssertEqualsTable
@Expected = N'tTSQLTests.Expected',
@Actual = N'tTSQLTests.Actual',
@Message = N'incorrect query'
GO
September 11, 2017 at 10:42 pm
Talvin Singh - Monday, September 11, 2017 11:47 AM
just move onto next row, which would reset and start a new running total.
Category Value Running_total Page
101 10 10 1
102 20 30 1
103 30 30 2
104 12 12 3
105 20 20 4
106 5 25 4
107 10 10 5
above i change rows 105 onwards to explain your statement.
at 104 the running total is 12, 105 value is 20, which goes over 30, so assign 104 with 3.
at 105 the running total is 20, plus, 106 value of 5, is 25, 107 value of 10 goes over to 35. so assign 105 and 106 with groupid 4
and so on
One final question. If your limit is 30 and you have a row that contains something larger on that one row (Value = 92, for example), what then?
--Jeff Moden
Change is inevitable... Change for the better is not.
September 11, 2017 at 11:45 pm
I will give this a go!
In the large dataset I have, no single row is higher than the limit. I will be testing this on a larger dataset that has over 50k rows, a max value of 30, but a running-total max of 500.
Do you think this will work with the above code?
September 12, 2017 at 2:08 am
With the procedure, is there a way to create a column that assigns a group ID?
For example, for the values that reach the limit of 30 you assign an ID, so in our example the first two rows would be group ID 1; see below:
Category Value Running_total groupId
101 10 10 1
102 20 30 1
103 30 30 2
104 12 12 3
105 20 20 4
106 5 25 4
107 10 10 5
September 12, 2017 at 6:50 am
Are there other values in the set? You don't really have anything that creates a group, other than the sum, and for that, you have a variable set of values.
September 12, 2017 at 6:54 am
Here's a version that works in all versions of SQL Server from 2005 and up. And, yes... it calculates the GroupID in a fashion that Steve identified above.
First, the test data... read the comment... it's critical.
--===== The table MUST have a UNIQUE CLUSTERED INDEX
-- on the Category column AND the Category column
-- MUST preserve the order that you want the data
-- to appear in (might be tough with VARCHAR()).
CREATE TABLE #TestTable
(
Category VARCHAR(5) NOT NULL PRIMARY KEY CLUSTERED
,Value INT NOT NULL
,RunningTotal INT
,GroupID INT
)
;
INSERT INTO #TestTable
(Category, Value)
SELECT '101', 10 UNION ALL
SELECT '102', 20 UNION ALL
SELECT '103', 30 UNION ALL
SELECT '104', 12 UNION ALL
SELECT '105', 20 UNION ALL
SELECT '106', 5 UNION ALL
SELECT '107', 10
;
After that, we do what is affectionately known as the "Quirky Update". It works in all versions of SQL Server from 2005 and up. It's also nasty fast and will process a million rows in just a couple of seconds. Do read the comments; they're important. So is the order of processing: the GroupID MUST be calculated before the RunningTotal changes, and the "Safety" must be calculated last (or at least after the CASE that does the safety check).
--===== Declare and preset the support variables.
-- Did it this way so that it works for all versions 2005 and up.
-- The variables have obvious names as to what they're for.
DECLARE @RunningTotal INT
,@GroupID INT
,@Safety INT
,@MaxValue INT
;
SELECT @RunningTotal = 0
,@GroupID = 1
,@Safety = 1
,@MaxValue = 30
;
--===== This "Quirky Update" works very similar to how you
-- would solve the problem with a loop except it uses
-- the behind the scenes natural loops (pseudo-cursor)
-- of both the SELECT and the UPDATE.
WITH cteQU AS
(--==== This cte adds a row number as a "safety counter"
-- that we'll use to ensure that the proper order
-- of processing occurred. It also forces the order
-- so that the safety check is also the order guarantee.
SELECT Safety = ROW_NUMBER() OVER (ORDER BY Category)
,Value
,RunningTotal
,GroupID
FROM #TestTable
) --=== This is just like any managed code except it uses 3-part updates
UPDATE qu
SET @GroupID = GroupID = CASE
WHEN @RunningTotal + Value <= @MaxValue
THEN @GroupID
ELSE @GroupID + 1
END
,@RunningTotal = RunningTotal = CASE
WHEN Safety = @Safety
THEN
CASE
WHEN @RunningTotal + Value <= @MaxValue
THEN @RunningTotal + Value
ELSE Value
END
ELSE 'Process Order Failed.' +@Safety/0
END
,@Safety = @Safety + 1
FROM cteQU qu WITH (TABLOCKX, INDEX(1)) --Must have these hints
OPTION (MAXDOP 1) --Must guarantee no parallelism
;
SELECT * FROM #TestTable
;
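As the comments say, the Quirky Update is the set-based equivalent of a row-by-row loop. Here's a sketch of that loop (in Python, purely for illustration; the function and variable names are mine, not from the proc) with the same ordering rule — GroupID decided from the previous row's running total first, then the total reset — and the same safety counter:

```python
def quirky_update(rows, max_value=30):
    """Row-by-row equivalent of the Quirky Update: walk the rows in
    Category order, carrying @RunningTotal, @GroupID and @Safety."""
    running, group, safety = 0, 1, 1
    out = []
    for position, (category, value) in enumerate(rows, start=1):
        # Safety check: blow up if rows arrive out of order,
        # mirroring the deliberate divide-by-zero in the UPDATE.
        if position != safety:
            raise RuntimeError("Process Order Failed.")
        # GroupID first, using the running total from the previous row...
        if running + value > max_value:
            group += 1
        # ...then the running total, resetting to this row's value on overflow.
        running = running + value if running + value <= max_value else value
        safety += 1
        out.append((category, value, running, group))
    return out

rows = [('101', 10), ('102', 20), ('103', 30), ('104', 12),
        ('105', 20), ('106', 5), ('107', 10)]
for row in quirky_update(rows):
    print(row)
```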
The old "Black Arts" of T-SQL rule. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
September 12, 2017 at 8:05 am
LOL, nice, Jeff. I was hoping to avoid some variable in there, but that looks nice.
September 12, 2017 at 8:23 am
Steve Jones - SSC Editor - Tuesday, September 12, 2017 8:05 AM
LOL, nice, Jeff. I was hoping to avoid some variable in there, but that looks nice.
🙂😀
--Jeff Moden
Change is inevitable... Change for the better is not.
This works very well!
I was able to test it on a large dataset; it works blindingly fast!
Really appreciate this.