August 6, 2009 at 4:46 pm
Hi,
Maybe I'm missing something here, but why bother with a script?
Simply use the unpivot function and then query as normal:
WITH P
AS (SELECT EMPLOYID,
PERIOD,
SALARY
FROM (SELECT EMPLOYID,
GROSWAGS_1,
GROSWAGS_2,
GROSWAGS_3,
GROSWAGS_4,
GROSWAGS_5,
GROSWAGS_6,
GROSWAGS_7,
GROSWAGS_8,
GROSWAGS_9,
GROSWAGS_10,
GROSWAGS_11,
GROSWAGS_12
FROM UPR00900) P
UNPIVOT
(SALARY
FOR PERIOD IN ( GROSWAGS_1,GROSWAGS_2,GROSWAGS_3,GROSWAGS_4,GROSWAGS_5,GROSWAGS_6,GROSWAGS_7,GROSWAGS_8,GROSWAGS_9,GROSWAGS_10,GROSWAGS_11,GROSWAGS_12 ) ) AS UNPVT),
P1
AS (SELECT EMPLOYID,
CASE
WHEN Len(PERIOD) = 10
THEN Right(PERIOD,1) * 1
ELSE Right(PERIOD,2) * 1
END AS PERIOD,
SALARY
FROM P)
SELECT EMPLOYID,
Sum(SALARY) AS TOTAL_SALARY
FROM P1
WHERE PERIOD <= 3 /*Criteria for period */
GROUP BY EMPLOYID
No need to start creating extra work.
Hope it helps
August 6, 2009 at 7:30 pm
gaj104 (8/6/2009)
Hi,Maybe I'm missing something here, but why bother with a script?
Simply use the unpivot function and then query as normal:
WITH P
AS (SELECT EMPLOYID,
PERIOD,
SALARY
FROM (SELECT EMPLOYID,
GROSWAGS_1,
GROSWAGS_2,
GROSWAGS_3,
GROSWAGS_4,
GROSWAGS_5,
GROSWAGS_6,
GROSWAGS_7,
GROSWAGS_8,
GROSWAGS_9,
GROSWAGS_10,
GROSWAGS_11,
GROSWAGS_12
FROM UPR00900) P
UNPIVOT
(SALARY
FOR PERIOD IN ( GROSWAGS_1,GROSWAGS_2,GROSWAGS_3,GROSWAGS_4,GROSWAGS_5,GROSWAGS_6,GROSWAGS_7,GROSWAGS_8,GROSWAGS_9,GROSWAGS_10,GROSWAGS_11,GROSWAGS_12 ) ) AS UNPVT),
P1
AS (SELECT EMPLOYID,
CASE
WHEN Len(PERIOD) = 10
THEN Right(PERIOD,1) * 1
ELSE Right(PERIOD,2) * 1
END AS PERIOD,
SALARY
FROM P)
SELECT EMPLOYID,
Sum(SALARY) AS TOTAL_SALARY
FROM P1
WHERE PERIOD <= 3 /*Criteria for period */
GROUP BY EMPLOYID
No need to start creating extra work.
Hope it helps
That's about the same as what the other folks did to unpivot the data except they used the traditional method instead of UNPIVOT. End result is exactly the same idea and both methods can be in a sproc.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 6, 2009 at 8:26 pm
This may not be as elegant as an UNPIVOT but it should be faster:
CREATE FUNCTION dbo.RangeWages (@Month1 AS INT, @Month2 AS INT)
RETURNS TABLE AS
RETURN SELECT EMPLOYID,
CASE WHEN 1 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_1 ELSE 0 END
+ CASE WHEN 2 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_2 ELSE 0 END
+ CASE WHEN 3 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_3 ELSE 0 END
+ CASE WHEN 4 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_4 ELSE 0 END
+ CASE WHEN 5 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_5 ELSE 0 END
+ CASE WHEN 6 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_6 ELSE 0 END
+ CASE WHEN 7 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_7 ELSE 0 END
+ CASE WHEN 8 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_8 ELSE 0 END
+ CASE WHEN 9 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_9 ELSE 0 END
+ CASE WHEN 10 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_10 ELSE 0 END
+ CASE WHEN 11 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_11 ELSE 0 END
+ CASE WHEN 12 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_12 ELSE 0 END
FROM dbo.UPR00900
It's been my experience that you can use incredibly complex formulas and CASE statements out the wazoo (to compute output fields, not in the WHERE clause) and they all collapse into one tiny Compute Scalar step in the execution plan, while using UNIONs or anything else that may add I/O has a much greater impact on performance. On the other hand a pretty solution like the UNPIVOT may be more readable and easier to maintain. (And not leave you feeling like you're writing COBOL.)
August 7, 2009 at 7:20 am
Personally, I find the Pivot and Unpivot syntax for SQL a bit cryptic. I also think the Pivot command is about as useless as a refrigerator in Antarctica. That's why I don't use them.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 7, 2009 at 7:23 am
Ok, so I think this project is getting harder than it needs to be; however, I had another hick-up in the plan. I had this working perfectly, using this code:
DECLARE @StartMonth INT,
@EndMonth INT
SET @StartMonth = 1
SET @EndMonth = 3;
WITH EmployeeCte(EMPLOYID,MONTH,GROSWAGS)
AS (SELECT EMPLOYID,
1,
GROSWAGS_1
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
2,
GROSWAGS_2
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
3,
GROSWAGS_3
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
4,
GROSWAGS_4
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
5,
GROSWAGS_5
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
6,
GROSWAGS_6
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
7,
GROSWAGS_7
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
8,
GROSWAGS_8
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
9,
GROSWAGS_9
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
10,
GROSWAGS_10
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
11,
GROSWAGS_11
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
12,
GROSWAGS_12
FROM [UPR00900])
SELECT EMPLOYID, cast(sum(GROSWAGS) AS MONEY) AS TotalWage
FROM EmployeeCte
WHERE [Month] BETWEEN @StartMonth AND @EndMonth
GROUP BY EMPLOYID;
The problem that I ran into is that I have since then found out EMPLOYID isn't unique. Gotta love Microsoft products, huh? Anyway, I was getting large salary amounts and when I researched it a little more, I found out that one employeeID might be in there 5 or 6 times and this query is grouping them all together and adding them up.
I need someway to treat each record as its own and not worry about what's in the EMPLOYID. Is that possible?
Jordon
August 7, 2009 at 7:31 am
What's the definition for the primary key for the table? (Assuming there is one.)
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 7, 2009 at 7:33 am
I'm sure someone will something more elegant to work this out, but here's my shot at it:
USE tempdb
GO
IF EXISTS (SELECT 1
FROM sys.objects
WHERE object_Name(object_ID) = 'UPR00900')
BEGIN
DROP TABLE UPR00900
END
GO
CREATE TABLE [dbo].[UPR00900] (
[EMPLOYID] [CHAR](15) NOT NULL,
[GROSWAGS_1] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_2] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_3] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_4] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_5] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_6] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_7] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_8] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_9] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_10] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_11] [NUMERIC](19,5) NOT NULL,
[GROSWAGS_12] [NUMERIC](19,5) NOT NULL)
ON [PRIMARY]
GO
INSERT INTO [dbo].[UPR00900]
([EMPLOYID],
[GROSWAGS_1],
[GROSWAGS_2],
[GROSWAGS_3],
[GROSWAGS_4],
[GROSWAGS_5],
[GROSWAGS_6],
[GROSWAGS_7],
[GROSWAGS_8],
[GROSWAGS_9],
[GROSWAGS_10],
[GROSWAGS_11],
[GROSWAGS_12])
VALUES (1718,
350.75000,
475.82000,
736.76000,
383.52000,
383.52000,
671.16000,
584.41000,
767.04000,
715.06000,
735.02000,
234.38000,
149.25000)
INSERT INTO [dbo].[UPR00900]
([EMPLOYID],
[GROSWAGS_1],
[GROSWAGS_2],
[GROSWAGS_3],
[GROSWAGS_4],
[GROSWAGS_5],
[GROSWAGS_6],
[GROSWAGS_7],
[GROSWAGS_8],
[GROSWAGS_9],
[GROSWAGS_10],
[GROSWAGS_11],
[GROSWAGS_12])
VALUES (1718,
350.75000,
475.82000,
736.76000,
383.52000,
383.52000,
671.16000,
584.41000,
767.04000,
715.06000,
735.02000,
234.38000,
149.25000)
GO
INSERT INTO [dbo].[UPR00900]
([EMPLOYID],
[GROSWAGS_1],
[GROSWAGS_2],
[GROSWAGS_3],
[GROSWAGS_4],
[GROSWAGS_5],
[GROSWAGS_6],
[GROSWAGS_7],
[GROSWAGS_8],
[GROSWAGS_9],
[GROSWAGS_10],
[GROSWAGS_11],
[GROSWAGS_12])
VALUES (1709,
350.75000,
475.82000,
7336.76000,
38323.52000,
3853.52000,
671.16000,
58164.41000,
7567.04000,
715.06000,
7365.02000,
2734.38000,
1249.25000)
INSERT INTO [dbo].[UPR00900]
([EMPLOYID],
[GROSWAGS_1],
[GROSWAGS_2],
[GROSWAGS_3],
[GROSWAGS_4],
[GROSWAGS_5],
[GROSWAGS_6],
[GROSWAGS_7],
[GROSWAGS_8],
[GROSWAGS_9],
[GROSWAGS_10],
[GROSWAGS_11],
[GROSWAGS_12])
VALUES (1703,
1350.75000,
1475.82000,
7236.76000,
3833.52000,
3813.52000,
6711.16000,
5824.41000,
7637.04000,
7145.06000,
73521.02000,
2354.38000,
14629.25000)
GO
DECLARE @StartMonth INT,
@EndMonth INT
SET @StartMonth = 1
SET @EndMonth = 8;
WITH EmployeeCte(EMPLOYID,MONTH,GROSWAGS,ID)
AS (SELECT EMPLOYID,
1,
GROSWAGS_1,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
2,
GROSWAGS_2,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
3,
GROSWAGS_3,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
4,
GROSWAGS_4,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
5,
GROSWAGS_5,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
6,
GROSWAGS_6,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
7,
GROSWAGS_7,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
8,
GROSWAGS_8,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
9,
GROSWAGS_9,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
10,
GROSWAGS_10,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
11,
GROSWAGS_11,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
12,
GROSWAGS_12,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900])
SELECT min(EmployID) AS EmployID,
cast(sum(GROSWAGS) AS MONEY) AS TotalWage
FROM EmployeeCte
WHERE [Month] BETWEEN @StartMonth AND @EndMonth
GROUP BY ID;
Hope that helps,
Cheers,
J-F
August 7, 2009 at 8:03 am
Actually, the primary key is the EMPLOYID, but its not unique. I know, I was like "What????" Anyway, what you did worked perfectly!!!
Thanks,
Jordon
August 7, 2009 at 8:09 am
jordon.shaw (8/7/2009)
Actually, the primary key is the EMPLOYID, but its not unique. I know, I was like "What????" Anyway, what you did worked perfectly!!!Thanks,
Jordon
You are welcome, I'm happy it suits your needs.
Cheers,
J-F
August 7, 2009 at 9:02 am
The solution I posted earlier handles each record once and so is not affected by duplicate employee ids. And I'm still convinced it will run much faster than eleven UNIONs.
CREATE FUNCTION dbo.RangeWages (@Month1 AS INT, @Month2 AS INT)
RETURNS TABLE AS
RETURN SELECT EMPLOYID,
CASE WHEN 1 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_1 ELSE 0 END
+ CASE WHEN 2 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_2 ELSE 0 END
+ CASE WHEN 3 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_3 ELSE 0 END
+ CASE WHEN 4 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_4 ELSE 0 END
+ CASE WHEN 5 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_5 ELSE 0 END
+ CASE WHEN 6 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_6 ELSE 0 END
+ CASE WHEN 7 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_7 ELSE 0 END
+ CASE WHEN 8 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_8 ELSE 0 END
+ CASE WHEN 9 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_9 ELSE 0 END
+ CASE WHEN 10 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_10 ELSE 0 END
+ CASE WHEN 11 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_11 ELSE 0 END
+ CASE WHEN 12 BETWEEN @Month1 AND @Month2 THEN GROSWAGS_12 ELSE 0 END
FROM dbo.UPR00900
August 7, 2009 at 9:59 am
Just tested the speed. Scott's right.
Here's the test set-up:
USE [ProofOfConcept]
GO
/****** Object: Table [dbo].[UPR00900] Script Date: 08/07/2009 11:32:44 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING OFF
GO
CREATE TABLE [dbo].[UPR00900](
[EMPLOYID] [char](15) NOT NULL,
[GROSWAGS_1] [numeric](19, 5) NOT NULL,
[GROSWAGS_2] [numeric](19, 5) NOT NULL,
[GROSWAGS_3] [numeric](19, 5) NOT NULL,
[GROSWAGS_4] [numeric](19, 5) NOT NULL,
[GROSWAGS_5] [numeric](19, 5) NOT NULL,
[GROSWAGS_6] [numeric](19, 5) NOT NULL,
[GROSWAGS_7] [numeric](19, 5) NOT NULL,
[GROSWAGS_8] [numeric](19, 5) NOT NULL,
[GROSWAGS_9] [numeric](19, 5) NOT NULL,
[GROSWAGS_10] [numeric](19, 5) NOT NULL,
[GROSWAGS_11] [numeric](19, 5) NOT NULL,
[GROSWAGS_12] [numeric](19, 5) NOT NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
set nocount on;
go
insert into dbo.UPR00900 (EMPLOYID, GROSWAGS_1, GROSWAGS_2, GROSWAGS_3, GROSWAGS_4, GROSWAGS_5, GROSWAGS_6, GROSWAGS_7, GROSWAGS_8, GROSWAGS_9, GROSWAGS_10, GROSWAGS_11, GROSWAGS_12)
select left(cast(newid() as varchar(100)), 15),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float),
cast(abs(checksum(newid())) as float)/cast(abs(checksum(newid())) as float)
from dbo.Numbers;
go 50
insert into dbo.UPR00900 (EMPLOYID, GROSWAGS_1, GROSWAGS_2, GROSWAGS_3, GROSWAGS_4, GROSWAGS_5, GROSWAGS_6, GROSWAGS_7, GROSWAGS_8, GROSWAGS_9, GROSWAGS_10, GROSWAGS_11, GROSWAGS_12)
select EMPLOYID, GROSWAGS_1, GROSWAGS_2, GROSWAGS_3, GROSWAGS_4, GROSWAGS_5, GROSWAGS_6, GROSWAGS_7, GROSWAGS_8, GROSWAGS_9, GROSWAGS_10, GROSWAGS_11, GROSWAGS_12
from dbo.UPR00900;
That created 1-million rows of test data, with at least one duplicate for each, and possibly many more.
Here's the test:
if object_id(N'tempdb..#T_Unions') is not null
drop table #T_Unions;
if object_id(N'tempdb..#T_Cases') is not null
drop table #T_Cases;
set nocount on;
DECLARE @StartMonth INT,
@EndMonth INT
SET @StartMonth = 1
SET @EndMonth = 8;
set statistics io on;
set statistics time on;
WITH EmployeeCte(EMPLOYID,MONTH,GROSWAGS,ID)
AS (SELECT EMPLOYID,
1,
GROSWAGS_1,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
2,
GROSWAGS_2,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
3,
GROSWAGS_3,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
4,
GROSWAGS_4,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
5,
GROSWAGS_5,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
6,
GROSWAGS_6,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
7,
GROSWAGS_7,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
8,
GROSWAGS_8,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
9,
GROSWAGS_9,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
10,
GROSWAGS_10,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
11,
GROSWAGS_11,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900]
UNION ALL
SELECT EMPLOYID,
12,
GROSWAGS_12,
row_Number()
OVER(ORDER BY EmployID ASC) AS ID
FROM [UPR00900])
SELECT min(EmployID) AS EmployID,
cast(sum(GROSWAGS) AS MONEY) AS TotalWage
INTO #T_Unions
FROM EmployeeCte
WHERE [Month] BETWEEN @StartMonth AND @EndMonth
GROUP BY ID;
SELECT EMPLOYID,
CASE WHEN 1 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_1 ELSE 0 END
+ CASE WHEN 2 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_2 ELSE 0 END
+ CASE WHEN 3 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_3 ELSE 0 END
+ CASE WHEN 4 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_4 ELSE 0 END
+ CASE WHEN 5 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_5 ELSE 0 END
+ CASE WHEN 6 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_6 ELSE 0 END
+ CASE WHEN 7 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_7 ELSE 0 END
+ CASE WHEN 8 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_8 ELSE 0 END
+ CASE WHEN 9 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_9 ELSE 0 END
+ CASE WHEN 10 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_10 ELSE 0 END
+ CASE WHEN 11 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_11 ELSE 0 END
+ CASE WHEN 12 BETWEEN @StartMonth AND @EndMonth THEN GROSWAGS_12 ELSE 0 END as TotalWages
INTO #T_Cases
FROM dbo.UPR00900;
set statistics time off;
set statistics io off;
Selecting into temp tables eliminates the delays that would be caused by trying to display that many rows of data in the client app. I ran it twice, to make sure neither suffered from having to auto-grow tempdb (which would unfairly slow it down).
Here are the results:
Table 'UPR00900'. Scan count 36, logical reads 200004, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 87827 ms, elapsed time = 148017 ms.
Table 'UPR00900'. Scan count 1, logical reads 16667, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 4484 ms, elapsed time = 4476 ms.
The Union version took 148 seconds total (and change), and 36 scans of the table. The Case version took just under 4.5 seconds total, and 1 scan of the table. Definitely much better.
On the assumption that the EMPLOYID column was an actual primary key, I then modified the table by taking out all the duplicates and making that column the PK, including making it the leading edge of the clustered index. Then I ran my original query, and Scott's query on that table. Scott's still was faster, but the difference was just over 2 seconds for his and just under 6 for a Unions version.
On the other hand, if duplicate EMPLYIDs mean one employee actually had two different wages for the same month (perhaps based on wage category, like hourly vs commissions), which would make sense in a table like this, then the Case statement one won't give you the actual totals, and the Unions one will. If that's the situation, then you have to add another step to the Cases version, to aggregate the final results. I tested that and it added about 2 seconds to the execution time of that one, after adding the duplicates back into the original table (and dropping the PK, of course).
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 7, 2009 at 11:33 am
Well, thanks for the times Gus.
I'm surprised to see a huge select case solution beats the group by, but it makes sense, when you think about all the union all I had to build to cover the query (12 times).
I'm curious to know, though, how the query actually got 36 scans.. which is 3 times more then the actual number of scans I thought of. Anyone can shed a light on why the (12) scans have been done 3 times (3 times 12) to get 36 scans?
Does the number of scans reduce when you use the clustered primary key on employeeId?
Cheers,
J-F
August 7, 2009 at 12:14 pm
When I added a PK, the scans went way down. When I added an index to each of the wages columns, on top of the PK (12 indexes + clustered), it went down to one scan per union clause.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 7, 2009 at 1:44 pm
All those CASE functions look like a lot of work when you read them, but when you look at the execution plan you realize that they only represent a little math to be done on each row. The important thing is that it only has to go through each data page once. Moving data around, even if it is only logical reads from the buffer cache, takes much more work than doing arithmetic on each row.
August 7, 2009 at 2:02 pm
CREATE TABLE [dbo].[UPR00900](
Hmmm with names like those... let me guess.. Great Plains?
I find it sad that Microsoft makes very poor design mistakes like this stuff.. I'm getting used to it with the developers around here... but sheesh...
Viewing 15 posts - 16 through 30 (of 31 total)
You must be logged in to reply to this topic. Login to reply