August 21, 2012 at 9:58 am
I have a tentative solution below, it's just really inefficient. So I thought I would check if anyone had suggestions to improve it. Basically, I'm using several queries to populate a new table where reports will aggregate the stats. I need to check if no results were returned, and return the values that would have appeared if the conditions would have been met. Maybe an example will help. Here's some data:
CREATE TABLE Enroll
(
Stud_IDVARCHAR(20)NOT NULL,
Enroll_ID VARCHAR(20)NOT NULL,
Prog_Code VARCHAR(25) NOT NULL,
CampusVARCHAR(20)NOT NULL,
EStatus VARCHAR(20) NOT NULL
)
INSERT INTO Enroll (Stud_ID, Enroll_ID, Prog_Code, Campus) VALUES ('11111', '22222', 'AR-278', 'Kingston', 'Entering')
CREATE TABLE Stu_Chgs
(
Enroll_IDVARCHAR(20)NOT NULL,
New_StatVARCHAR(25)NOT NULL,
Start_StatDate NOT NULL,
End_StatDate NOT NULL
)
INSERT INTO Stu_Chgs (Enroll_ID, New_Stat, Start_Stat, End_Stat) VALUES ('22222', 'Start', '2011-12-01', '2012-03-01')
CREATE TABLE Stud
(
Stud_IDVARCHAR(20)NOT NULL,
GenderVARCHAR(20)NOT NULL
)
INSERT INTO Stud (Stud_ID, Gender) VALUES ('11111', 'Female')
Here's something that works, although it's very resource intensive:
DECLARE @PV_Tdy VARCHAR(25)
SET @PV_Tdy = 'AR-190'
DECLARE @PV_TdyLstYr VARCHAR(25)
SET @PV_TdyLstYr = 'AR-189'
DECLARE @TdLstYear AS Date
SET @TdLstYear = (SELECT dateadd(yy,-1,getdate()))
;
WITH cec AS
(
/*Today - Current Entering Class*/
SELECT GETDATE() AS Runtime
,Enroll.Prog_Code AS PVCode
,COUNT(DISTINCT Enroll.Stud_ID) AS StatCount
,Enroll.Campus AS Campus
,'Current Entering Class' AS StatCategory
,'Today' AS StatPeriod
,'AppCycle' AS StatType
FROM Enroll
WHERE Prog_Code = @PV_Tdy
AND EStatus = 'Entering'
GROUP BY Enroll.Prog_Code
,Enroll.EStatus
,Enroll.Campus
),
ig AS
(
/*TodayLstYr - Incoming Gender*/
SELECT GETDATE() AS Runtime
,Enroll.Prog_Code AS PVCode
,COUNT(DISTINCT Enroll.Stud_ID) AS StatCount
,Enroll.Campus AS Campus
,Stud.Gender AS StatCategory
,'TodayLstYr' AS StatPeriod
,'EnteringDemo' AS StatType
FROM Enroll
JOIN Stud ON Enroll.Stud_ID = Stud.Stud_ID
JOIN Stu_Chgs ON Enroll.Enroll_ID = Stu_Chgs.Enroll_ID
WHERE Prog_Code = @PV_TdyLstYr
AND New_Stat = 'Entering'
AND Start_Stat <= @TdLstYear
AND End_Stat >= @TdLstYear
GROUP BY Enroll.Prog_Code
,Stud.Gender
,Enroll.Campus
)
SELECT Runtime, PVCode, StatCount, Campus, StatCategory, StatPeriod, StatType FROM cec
UNION ALL
SELECT Runtime, PVCode, StatCount, Campus, StatCategory, StatPeriod, StatType FROM cec
--cec NULL check
UNION ALL
SELECT GETDATE() AS Runtime, @PV_Tdy, '0', 'ALCOM', 'Current Entering Class', 'Today', 'AppCyc' WHERE NOT EXISTS (SELECT * FROM cec)
--ig NULL check for Females
UNION ALL
SELECT GETDATE() AS Runtime, @PV_TdyLstYr, '0', 'ALCOM', 'Female', 'TodayLstYr', 'EnteringDemo' WHERE NOT EXISTS (SELECT * FROM ig WHERE StatCategory = 'Female')
--ig NULL check for Males
UNION ALL
SELECT GETDATE() AS Runtime, @PV_TdyLstYr, '0', 'ALCOM', 'Male', 'TodayLstYr', 'EnteringDemo' WHERE NOT EXISTS (SELECT * FROM ig WHERE StatCategory = 'Male')
For instance, if the conditions would have been met in COUNTing Stud_ID's for Gender, the StatCategories of Female and Male would be returned with the count. However, if they aren't met, I still need everything else returned with a '0' in the stat count.
And maybe there isn't a quicker way to do this. The job will run around 4 a.m., so losing some time wouldn't be the end of the world. Either way, any thoughts?
August 21, 2012 at 9:49 pm
I don't see the EStatus column defined in the Enroll table:
SELECT GETDATE() AS Runtime
,Enroll.Prog_Code AS PVCode
,COUNT(DISTINCT Enroll.Stud_ID) AS StatCount
,Enroll.Campus AS Campus
,'Current Entering Class' AS StatCategory
,'Today' AS StatPeriod
,'AppCycle' AS StatType
FROM Enroll
WHERE Prog_Code = @PV_Tdy
AND EStatus = 'Entering'
GROUP BY Enroll.Prog_Code
,Enroll.EStatus
,Enroll.Campus
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
August 22, 2012 at 8:37 am
Thanks for pointing that out. I fixed the script.
August 22, 2012 at 7:38 pm
Can't tell if this is more efficient or not (it may not be according to the execution plan) because there's not enough records in the sample data, but you may want to try it like this:
CREATE TABLE #Enroll
(
Stud_IDVARCHAR(20)NOT NULL,
Enroll_ID VARCHAR(20)NOT NULL,
Prog_Code VARCHAR(25) NOT NULL,
CampusVARCHAR(20)NOT NULL,
EStatus VARCHAR(20) NOT NULL
)
INSERT INTO #Enroll (Stud_ID, Enroll_ID, Prog_Code, Campus, EStatus) VALUES ('11111', '22222', 'AR-278', 'Kingston', 'Entering')
CREATE TABLE #Stu_Chgs
(
Enroll_IDVARCHAR(20)NOT NULL,
New_StatVARCHAR(25)NOT NULL,
Start_StatDate NOT NULL,
End_StatDate NOT NULL
)
INSERT INTO #Stu_Chgs (Enroll_ID, New_Stat, Start_Stat, End_Stat) VALUES ('22222', 'Start', '2011-12-01', '2012-03-01')
CREATE TABLE #Stud
(
Stud_IDVARCHAR(20)NOT NULL,
GenderVARCHAR(20)NOT NULL
)
INSERT INTO #Stud (Stud_ID, Gender) VALUES ('11111', 'Female')
DECLARE @PV_Tdy VARCHAR(25)
SET @PV_Tdy = 'AR-190'
DECLARE @PV_TdyLstYr VARCHAR(25)
SET @PV_TdyLstYr = 'AR-189'
DECLARE @TdLstYear AS Date
SET @TdLstYear = (SELECT dateadd(yy,-1,getdate()))
;
WITH cec AS
(
/*Today - Current Entering Class*/
SELECT Runtime=MAX(Runtime), PVCode
,StatCount=COUNT(DISTINCT Stud_ID)-1
,Campus, StatCategory=MAX(StatCategory), StatPeriod=MAX(StatPeriod)
,StatType=MAX(StatType)
FROM (
SELECT GETDATE() AS Runtime
,Enroll.Prog_Code AS PVCode
,Stud_ID
,Enroll.Campus AS Campus
,'Current Entering Class' AS StatCategory
,'Today' AS StatPeriod
,'AppCycle' AS StatType
FROM #Enroll Enroll
WHERE Prog_Code = @PV_Tdy AND EStatus = 'Entering'
UNION ALL
SELECT GETDATE(), @PV_Tdy, '0', 'ALCOM', 'Current Entering Class', 'Today', 'AppCyc'
) a
GROUP BY PVCode, Campus
),
ig AS
(
/*TodayLstYr - Incoming Gender*/
SELECT Runtime=MAX(Runtime), PVCode, StatCount=COUNT(DISTINCT Stud_ID)-1
,Campus, StatCategory, StatPeriod=MAX(StatPeriod), StatType=MAX(StatType)
FROM (
SELECT GETDATE() AS Runtime
,Enroll.Prog_Code AS PVCode
,Enroll.Stud_ID
,Enroll.Campus AS Campus
,Stud.Gender AS StatCategory
,'TodayLstYr' AS StatPeriod
,'EnteringDemo' AS StatType
FROM #Enroll Enroll
JOIN #Stud Stud ON Enroll.Stud_ID = Stud.Stud_ID
JOIN #Stu_Chgs Stu_Chgs ON Enroll.Enroll_ID = Stu_Chgs.Enroll_ID
WHERE Prog_Code = @PV_TdyLstYr
AND New_Stat = 'Entering'
AND Start_Stat <= @TdLstYear
AND End_Stat >= @TdLstYear
UNION ALL
SELECT GETDATE(), @PV_TdyLstYr, '0', 'ALCOM', 'Female', 'TodayLstYr', 'EnteringDemo'
UNION ALL
SELECT GETDATE(), @PV_TdyLstYr, '0', 'ALCOM', 'Male', 'TodayLstYr', 'EnteringDemo'
) a
GROUP BY PVCode, StatCategory, Campus
)
SELECT Runtime, PVCode, StatCount, Campus, StatCategory, StatPeriod, StatType FROM cec
UNION ALL
SELECT Runtime, PVCode, StatCount, Campus, StatCategory, StatPeriod, StatType FROM ig
--cec NULL check
--UNION ALL
--SELECT GETDATE() AS Runtime, @PV_Tdy, '0', 'ALCOM', 'Current Entering Class', 'Today', 'AppCyc'
--WHERE NOT EXISTS (SELECT * FROM cec)
--ig NULL check for Females
--UNION ALL
--SELECT GETDATE() AS Runtime, @PV_TdyLstYr, '0', 'ALCOM', 'Female', 'TodayLstYr', 'EnteringDemo' WHERE NOT EXISTS (SELECT * FROM ig WHERE StatCategory = 'Female')
----ig NULL check for Males
--UNION ALL
--SELECT GETDATE() AS Runtime, @PV_TdyLstYr, '0', 'ALCOM', 'Male', 'TodayLstYr', 'EnteringDemo' WHERE NOT EXISTS (SELECT * FROM ig WHERE StatCategory = 'Male')
DROP TABLE #Enroll
DROP TABLE #Stu_Chgs
DROP TABLE #Stud
The idea is to move your "empty" cases up into the CTEs and then subtract 1 from the associated counts.
If there's a way to combine all of this into a single pass on the tables, it would probably be more efficient yet. I just can't see a way to do that at this time.
Edit: And you may want to try adding
OPTION(RECOMPILE)
to both queries when you run (try it with and without) to see if that helps the actual run time.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply