grouped running totals

Question

Post reply

grouped running totals

greg.bull

SSCrazy

Points: 2093
More actions
April 20, 2012 at 9:23 am

#267746

Hi all. Firstly, here's some background and then some test data. I have a table that has lots of staff (described by unique staffids) and hours worked as a metric for each int week number.
I'm looking for an update statement that'll wizz thru the table and insert running totals for each staff member (in weekno order asc). I can do this for 1 staff member, but am struggling where there are lots of the little devils in there.
Test data, results and desired resuts follow:
declare @work table(
runtot float, staffid int, weekno int, variancehours float )
--- little table def
insert @work (runtot, staffid, weekno, variancehours)
--- insert some test data
select null,1,1,1.5 union
select null,1,2,-2 union
select null,1,3,-1 union
select null,1,4,5 union
select null,2,1,4 union
select null,2,2,-1
---
select * from @work
--- results would be:
NULL111.5
NULL12-2
NULL13-1
NULL145
NULL214
NULL22-1
--results I'd like to see post update...
1.5111.5
-0.512-2
-1.513-1
-2.5145
4214
322-1
Thanks for any help, regards Greg.

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 1

Check out this article http://www.sqlservercentral.com/articles/T-SQL/68467/ AND the discussion that goes with it.

greg.bull SSCrazy Points: 2093 More actions · Answer 2

Thanks. I;d read this already but it didn't help - I'd implemented the triangular join type mechanism (I think) to create a running total - but this part I'm stuck on specifically is that I now need to UPDATE what's there, considering seperate running totals for differing staffids. I know i am being dumb, as it'll just need some grouping on staffid, I guess...

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 3

CELKO (4/20/2012)
>> I have a table that has lots of staff (described by unique staff_ids) and hours worked as a metric for each int week number. <<
You have not ever worked with database before and never did any research. I would give you 1-2 years before you have an epiphany and can write good SQL.
1. Nobody would use integers for staff. There is no validation or verification. How about the SSN?
2. There is an ISO-8601 for weeks within a year. The format is yyyyWww. We use standards. Bad programmers invent and re-invent.
3. Why are using FLOAT? Are those hours really keep to laboratory precision? Have you ever seen a time clock that has more than two decimal places? You office has an atomic clock?
4. Where is the key to table? No constraints, etc. Files and punch cards are like this! This is not SQL yet.
5. How do you work negative hours, as shown in your sample data?
6. Why are you storing a running total? SQL people will compute such things and perhaps put it in a VIEW. You are a punch card programmer who has to PHYSICALLY store computations in the card.
Here is how we would do this in SQL
CREATE TABLE Timecards
(staff_id CHAR(9) NOT NULL PRIMARY KEY
REFERENCES Personnel (staff_id)
work_week CHAR(10) NOT NULL
CHECK (work_week LIKE '[12][0-9][0-9][0-9]W[0-5][0-9]'),
PRIMARY KEY (staff_id, work_week),
work_hrs DECIMAL (5,2) DEFAULT 0.00 NOT NULL);
Now we use the window clause on a SUM() aggregate function.
SELECT staff_id, work_week,
SUM(work_hrs)
OVER (PARTITION BY staff_id
ORDER BY work_week
ROWS UNBOUNDED PRECEDING BETWEEN CURRENT ROW)
AS work_hrs_run_tot
FROM Timecards
WHERE ..;

Well Mr. Celko, your solution would work if this were SQL Server 2012 (if I remember coorectly, this is the version where these get expanded), unfortunately your solution will not work in SQL Server 2008R2 and older. How about providing a working solution to a problem.

Chuck Hottle SSCarpal Tunnel Points: 4777 More actions · Answer 4

CREATE TABLE Timecards

(staff_id CHAR(9) NOT NULL PRIMARY KEY

REFERENCES Personnel (staff_id)

work_week CHAR(10) NOT NULL

CHECK (work_week LIKE '[12][0-9][0-9][0-9]W[0-5][0-9]'),

PRIMARY KEY (staff_id, work_week),

work_hrs DECIMAL (5,2) DEFAULT 0.00 NOT NULL);

Aren't there multiple primary keys in this definition or am I missing something?

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 5

CELKO (4/20/2012)
>> I have a table that has lots of staff (described by unique staff_ids) and hours worked as a metric for each int week number. <<
You have not ever worked with database before and never did any research. I would give you 1-2 years before you have an epiphany and can write good SQL.
1. Nobody would use integers for staff. There is no validation or verification. How about the SSN?
2. There is an ISO-8601 for weeks within a year. The format is yyyyWww. We use standards. Bad programmers invent and re-invent.
3. Why are using FLOAT? Are those hours really keep to laboratory precision? Have you ever seen a time clock that has more than two decimal places? You office has an atomic clock?
4. Where is the key to table? No constraints, etc. Files and punch cards are like this! This is not SQL yet.
5. How do you work negative hours, as shown in your sample data?
6. Why are you storing a running total? SQL people will compute such things and perhaps put it in a VIEW. You are a punch card programmer who has to PHYSICALLY store computations in the card.
Here is how we would do this in SQL
CREATE TABLE Timecards
(staff_id CHAR(9) NOT NULL PRIMARY KEY
REFERENCES Personnel (staff_id)
work_week CHAR(10) NOT NULL
CHECK (work_week LIKE '[12][0-9][0-9][0-9]W[0-5][0-9]'),
PRIMARY KEY (staff_id, work_week),
work_hrs DECIMAL (5,2) DEFAULT 0.00 NOT NULL);
Now we use the window clause on a SUM() aggregate function.
SELECT staff_id, work_week,
SUM(work_hrs)
OVER (PARTITION BY staff_id
ORDER BY work_week
ROWS UNBOUNDED PRECEDING BETWEEN CURRENT ROW)
AS work_hrs_run_tot
FROM Timecards
WHERE ..;

Let's see your SQL Server 2008 solution, Joe. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 6

greg.bull (4/20/2012)
Thanks. I;d read this already but it didn't help - I'd implemented the triangular join type mechanism (I think) to create a running total - but this part I'm stuck on specifically is that I now need to UPDATE what's there, considering seperate running totals for differing staffids. I know i am being dumb, as it'll just need some grouping on staffid, I guess...

Actually, the whole purpose of the article was to get people away from RBAR methods and the Triangular Join is a form of "Hidden" RBAR. The article was actually about the Quirky Update method to replace Triangular Joins, etc.

Here's your solution. As normal, the details are in the comments in the code.

--===== Conditionally drop the work table to

-- make reruns in SSMS easier.

IF OBJECT_ID('Tempdb..#Work','U') IS NOT NULL

DROP TABLE #Work

;

--===== Create the working table

CREATE TABLE #Work

(

RunTot FLOAT,

StaffID INT,

WeekNo INT,

VarianceHours FLOAT

)

;

--===== Populate the working table.

INSERT INTO #Work

(RunTot,StaffID,WeekNo,VarianceHours)

SELECT NULL,1,1,1.5 UNION ALL

SELECT NULL,1,2,-2 UNION ALL

SELECT NULL,1,3,-1 UNION ALL

SELECT NULL,1,4,5 UNION ALL

SELECT NULL,2,1,4 UNION ALL

SELECT NULL,2,2,-1

;

--===== Add the quintessential Clustered Index to

-- the columns the Quirky Update will rely on.

CREATE UNIQUE CLUSTERED INDEX IX_#Work_QU

ON #Work (StaffID, WeekNo)

;

--===== Declare some obviously name variables for

-- the Quirky Update to use.

DECLARE @RunTot FLOAT,

@PrevStaffID INT,

@Counter INT

;

SELECT @Counter = 1

;

--===== Do the Running Total using the Quirky Update with a

-- built in safety mechanism to force an error if the

-- order is disturbed.

-- This will do a million rows in just a couple of seconds

-- which blows the 2012 method out of the water. Still,

-- the 2012 method is faster than a cursor and it's supported.

WITH

cteEnumerate AS

(

SELECT Counter = ROW_NUMBER() OVER (ORDER BY StaffID, WeekNo),

RunTot,StaffID,WeekNo,VarianceHours

FROM #Work WITH(TABLOCKX)

)

UPDATE cteEnumerate

SET @RunTot = RunTot

= CASE

WHEN Counter = @Counter

THEN

CASE

WHEN StaffID = @PrevStaffID

THEN @RunTot + VarianceHours

ELSE VarianceHours

END

ELSE 1/0 --Force Error

END,

@PrevStaffID = StaffID,

@Counter = @Counter + 1

FROM cteEnumerate

OPTION(MAXDOP 1)

;

--===== Show the results.

SELECT * FROM #Work ORDER BY StaffID, WeekNo

;

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

greg.bull SSCrazy Points: 2093 More actions · Answer 7

Thanks to you all for your comments and feedback. Jeff, I understood the purpose of your excellent article - but I'd probably have to use a very simple syntax (that I could understand and support). Will try to fathom out how your solution works and give it a try.

Regards, Greg.

greg.bull SSCrazy Points: 2093 More actions · Answer 8

Thanks Lynn, for your direction and advice too ! I'm very new to SQL and the basic syntax is about all I can get my head around at present. As for the table I'm working on, that's just the existing structure which I need to analyse...

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 9

greg.bull (4/22/2012)
Thanks to you all for your comments and feedback. Jeff, I understood the purpose of your excellent article - but I'd probably have to use a very simple syntax (that I could understand and support). Will try to fathom out how your solution works and give it a try.
Regards, Greg.

Thank you for the feedback.

The way it works is just exactly the same as if you were to write some C# code to do it all except you that you don't have to write the Read/Write code... UPDATE does that for you.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)