May 13, 2015 at 1:49 pm
Hello everyone,
Please help to get the desired results for the following sample data set. I was able to come up with a query that returns the the expected results however only for a given day, so I'd need to union several select statements the get the desired results which is definitely not ideal. I'd like to pass a parameter in (number of days) instead of doing a unions for each select. Thank you for your help.
DECLARE @T TABLE (Id INT, Category VARCHAR(1), [Date] DATE)
INSERT INTO @T
SELECT 1 AS Id, 'A' AS Category, '2015-5-13' AS ActivationDate UNION ALL
SELECT 1, 'A', NULL UNION ALL
SELECT 1, 'A', '2015-5-13' UNION ALL
SELECT 1, 'A', NULL UNION ALL
SELECT 1, 'A', NULL UNION ALL
SELECT 1, 'A', '2015-5-13' UNION ALL
SELECT 1, 'A', '2015-5-12' UNION ALL
SELECT 1, 'A', '2015-5-12' UNION ALL
SELECT 1, 'A', NULL UNION ALL
SELECT 1, 'A', '2015-5-12' UNION ALL
SELECT 1, 'A', '2015-5-12' UNION ALL
SELECT 1, 'A', '2015-5-12' UNION ALL
SELECT 1, 'A', '2015-5-11' UNION ALL
SELECT 1, 'A', '2015-5-11' UNION ALL
SELECT 1, 'A', '2015-5-11' UNION ALL
SELECT 2, 'X', '2015-5-13' UNION ALL
SELECT 2, 'X', NULL UNION ALL
SELECT 2, 'X', NULL UNION ALL
SELECT 2, 'X', '2015-5-13' UNION ALL
SELECT 2, 'X', '2015-5-12' UNION ALL
SELECT 2, 'X', '2015-5-12' UNION ALL
SELECT 2, 'X', NULL UNION ALL
SELECT 2, 'X', '2015-5-12' UNION ALL
SELECT 2, 'X', '2015-5-12' UNION ALL
SELECT 2, 'X', NULL UNION ALL
SELECT 2, 'X', '2015-5-11' UNION ALL
SELECT 2, 'Y', '2015-5-13' UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Y', '2015-5-13' UNION ALL
SELECT 2, 'Y', '2015-5-12' UNION ALL
SELECT 2, 'Y', '2015-5-12' UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Y', '2015-5-11' UNION ALL
SELECT 2, 'Z', '2015-5-13' UNION ALL
SELECT 2, 'Z', '2015-5-13' UNION ALL
SELECT 2, 'Z', NULL UNION ALL
SELECT 2, 'Z', NULL UNION ALL
SELECT 2, 'Z', NULL UNION ALL
SELECT 2, 'Z', '2015-5-12' UNION ALL
SELECT 2, 'Z', '2015-5-12' UNION ALL
SELECT 2, 'Z', NULL UNION ALL
SELECT 2, 'Z', NULL UNION ALL
SELECT 2, 'Z', '2015-5-11'
SELECT * FROM @T
-- expected results
SELECT 1,'A',15,11,CAST('5/13/2015' AS DATE) UNION ALL
SELECT 1,'A',15,8,CAST('5/12/2015' AS DATE) UNION ALL
SELECT 1,'A',15,3,CAST('5/11/2015' AS DATE) UNION ALL
SELECT 2,'X',11,6,CAST('5/13/2015' AS DATE) UNION ALL
SELECT 2,'X',11,4,CAST('5/12/2015' AS DATE) UNION ALL
SELECT 2,'X',11,1,CAST('5/11/2015' AS DATE) UNION ALL
SELECT 2,'Y',9,5,CAST('5/13/2015' AS DATE) UNION ALL
SELECT 2,'Y',9,3,CAST('5/12/2015' AS DATE) UNION ALL
SELECT 2,'Y',9,1,CAST('5/11/2015' AS DATE) UNION ALL
SELECT 2,'Z',10,5,CAST('5/13/2015' AS DATE) UNION ALL
SELECT 2,'Z',10,3,CAST('5/12/2015' AS DATE) UNION ALL
SELECT 2,'Z',10,1,CAST('5/11/2015' AS DATE)
May 13, 2015 at 2:26 pm
How do you get from your raw data to the result?
SELECT 1,'A',15,11,CAST('5/13/2015' AS DATE) UNION ALL
SELECT 1,'A',15,8,CAST('5/12/2015' AS DATE) UNION ALL
SELECT 1,'A',15,3,CAST('5/11/2015' AS DATE) UNION ALL
I see where the first two columns and the date column are coming from, but where are the (15,11),(15,8),(15,3) coming from? is that some kind of running total?
May 13, 2015 at 2:35 pm
Thank you for your response. Yes, as the title suggests I'm trying to calculate the running totals.
So for the first row:
15 is the Total count
11 is the Activated count
May 13, 2015 at 2:37 pm
This seems to do the trick. Please be sure to understand what it is doing and ask any questions that you have.
Running totals became a lot easier on SQL Server 2012.
WITH CTE AS(
SELECT Id, Category, ISNULL( Date, '99991231') Date, COUNT(*) rec_count
FROM @T
GROUP BY Id, Category, Date
),
CTE2 AS(
SELECT Id,
Category,
Date,
SUM(rec_count) OVER(PARTITION BY Id, Category
ORDER BY Date
ROWS UNBOUNDED PRECEDING) running_total,
SUM(rec_count) OVER( PARTITION BY Id, Category) grand_total
FROM CTE
)
SELECT Id,
Category,
grand_total,
running_total,
Date
FROM CTE2
WHERE Date < '99991231'
ORDER BY Id, Category, Date DESC
May 13, 2015 at 2:58 pm
Luis beat me too it but I took the time to write it so I'm posting it... 😉
SELECT
a.Id,
a.Category,
MAX(a.CountID) AS OverallTotal,
SUM(a.CountCat) OVER (PARTITION BY a.Id, a.Category ORDER BY a.Date) AS RunningTotal,
a.Date
FROM (
SELECT
t.Id,
t.Category,
t.Date,
COUNT(*) OVER (PARTITION BY t.id, t.Category) AS CountID,
COUNT(*) OVER (PARTITION BY t.id, t.Category, t.Date) AS CountCat
FROM @T t
) a
WHERE
a.Date IS NOT NULL
GROUP BY
a.Id,
a.Category,
a.Date,
a.CountCat
ORDER BY
a.id,
a.Category,
a.Date DESC
May 13, 2015 at 2:59 pm
Luis Cazares (5/13/2015)
This seems to do the trick. Please be sure to understand what it is doing and ask any questions that you have.Running totals became a lot easier on SQL Server 2012.
WITH CTE AS(
SELECT Id, Category, ISNULL( Date, '99991231') Date, COUNT(*) rec_count
FROM @T
GROUP BY Id, Category, Date
),
CTE2 AS(
SELECT Id,
Category,
Date,
SUM(rec_count) OVER(PARTITION BY Id, Category
ORDER BY Date
ROWS UNBOUNDED PRECEDING) running_total,
SUM(rec_count) OVER( PARTITION BY Id, Category) grand_total
FROM CTE
)
SELECT Id,
Category,
grand_total,
running_total,
Date
FROM CTE2
WHERE Date < '99991231'
ORDER BY Id, Category, Date DESC
May or may not answer the question. How do you want to handle the null date values? What would the results look like if you only wanted the from 2015-05-12 forward? In your sample data is the "positioning" of the data with null dates significant to the results?
May 13, 2015 at 3:08 pm
Lynn Pettis (5/13/2015)
May or may not answer the question. How do you want to handle the null date values? What would the results look like if you only wanted the from 2015-05-12 forward? In your sample data is the "positioning" of the data with null dates significant to the results?
1. The null date values are included in the grand total as implied by the expected results posted.
2. All values in the grand_total and running_total columns would have the value from 2015-05-11 substracted.
3. It is because the final filter excludes the value that replaced the null dates and allows me to use the SUM() OVER() directly instead of using a CASE to exclude the null dates values. So, even if it's not shown in the end, it's essential for the running total calculation.
May 13, 2015 at 3:16 pm
Luis Cazares (5/13/2015)
Lynn Pettis (5/13/2015)
May or may not answer the question. How do you want to handle the null date values? What would the results look like if you only wanted the from 2015-05-12 forward? In your sample data is the "positioning" of the data with null dates significant to the results?
1. The null date values are included in the grand total as implied by the expected results posted.
2. All values in the grand_total and running_total columns would have the value from 2015-05-11 substracted.
3. It is because the final filter excludes the value that replaced the null dates and allows me to use the SUM() OVER() directly instead of using a CASE to exclude the null dates values. So, even if it's not shown in the end, it's essential for the running total calculation.
Missing the point. The OP also said something about passing a parameter for the number of days. Based on the provided sample data, what would the OP's expected results be if he only wanted the data from 2015-05-12 forward knowing there is data for 2015-05-11. For the data with null dates, is their position in the sample data significant?
Yes, Luis, your solution solves the problem based on the given sample data and expected results, but is that really the question that needs to be solved?
May 13, 2015 at 3:37 pm
Now I get the point. It just didn't hit me because I thought that a filter would be the easy part. Of course is easier to get the logic if we refer to the date routines that you shared: http://www.sqlservercentral.com/blogs/lynnpettis/2009/03/25/some-common-date-routines/
SELECT dateadd(dd, datediff(dd, 0, GETDATE()) - @NumberOfDays, 0)
And to keep the null dates, an OR clause should be used to handle independently from the non null dates.
May 13, 2015 at 4:31 pm
Thank you both for your help, the query does make sense, thank you.
If I run this for a slightly different sample data set, I get:
DECLARE @T TABLE (Id INT, Category VARCHAR(1), [Date] DATE)
INSERT INTO @T
SELECT 1 AS Id, 'A' AS Category, '2015-5-13' AS ActivationDate UNION ALL
SELECT 1, 'A', NULL UNION ALL
SELECT 1, 'A', '2015-5-13' UNION ALL
SELECT 1, 'A', NULL UNION ALL
SELECT 1, 'A', NULL UNION ALL
SELECT 1, 'A', '2015-5-13' UNION ALL
SELECT 1, 'A', '2015-5-12' UNION ALL
SELECT 1, 'A', '2015-5-12' UNION ALL
SELECT 1, 'A', NULL UNION ALL
SELECT 1, 'A', '2015-5-12' UNION ALL
SELECT 1, 'A', '2015-5-12' UNION ALL
SELECT 1, 'A', '2015-5-11' UNION ALL
SELECT 1, 'A', '2015-5-11' UNION ALL
SELECT 1, 'A', '2015-5-11' UNION ALL
SELECT 2, 'X', '2015-5-12' UNION ALL
SELECT 2, 'X', '2015-5-12' UNION ALL
SELECT 2, 'X', NULL UNION ALL
SELECT 2, 'X', '2015-5-12' UNION ALL
SELECT 2, 'X', '2015-5-12' UNION ALL
SELECT 2, 'X', NULL UNION ALL
SELECT 2, 'X', '2015-5-11' UNION ALL
SELECT 2, 'Y', '2015-5-13' UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Y', '2015-5-13' UNION ALL
SELECT 2, 'Y', '2015-5-12' UNION ALL
SELECT 2, 'Y', '2015-5-12' UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Z', '2015-5-13' UNION ALL
SELECT 2, 'Z', '2015-5-13' UNION ALL
SELECT 2, 'Z', NULL UNION ALL
SELECT 2, 'Z', NULL UNION ALL
SELECT 2, 'Z', NULL UNION ALL
SELECT 2, 'Z', '2015-5-11'
;WITH CTE AS(
SELECT Id, Category, ISNULL( Date, '99991231') Date, COUNT(*) rec_count
FROM @T
GROUP BY Id, Category, Date
),
CTE2 AS(
SELECT Id,
Category,
Date,
SUM(rec_count) OVER(PARTITION BY Id, Category
ORDER BY Date
ROWS UNBOUNDED PRECEDING) running_total,
SUM(rec_count) OVER( PARTITION BY Id, Category) grand_total
FROM CTE
)
SELECT Id,
Category,
grand_total,
running_total,
Date
FROM CTE2
WHERE Date < '99991231'
ORDER BY Id, Category, Date DESC
I want missing dates to show up so I do a Cross apply to a function that returns every date for a given range and then left join it back to the cte, finally I tried to use lag/lead to populate the running total values with no luck. Any help would be appreciated.
Missing records highlighted:
Id Category grand_total running_total Date
----------- -------- ----------- ------------- ----------
1 A 14 10 2015-05-13
1 A 14 7 2015-05-12
1 A 14 3 2015-05-11
2 X 7 5 2015-05-13
2 X 7 5 2015-05-12
2 X 7 1 2015-05-11
2 Y 8 4 2015-05-13
2 Y 8 2 2015-05-12
2 Y 8 2 2015-05-11
2 Z 6 3 2015-05-13
2 Z 6 1 2015-05-12
2 Z 6 1 2015-05-11
May 13, 2015 at 6:52 pm
You can try this but I won't guarantee it will work with your real data.
WITH DistinctDates AS
(
SELECT [Date]
FROM @T
WHERE [Date] IS NOT NULL
GROUP BY [Date]
),
GrandTotals AS
(
SELECT ID, Category, grand_total=COUNT(*)
FROM @T
GROUP BY ID, Category
)
SELECT ID, Category, grand_total
,running_total=CASE a.running_total
WHEN 0
THEN LEAD(a.running_total) OVER
(
PARTITION BY ID, Category
ORDER BY a.[Date]
)
ELSE a.running_total
END
,[Date]
FROM
(
SELECT a.ID, a.Category
,grand_total=MAX(a.grand_total)
,running_total=MAX(a.running_total)
,a.[Date]
FROM
(
SELECT a.ID, a.Category, a.grand_total, b.[Date]
,running_total=COUNT(c.[Date]) OVER
(
PARTITION BY a.ID, a.Category
ORDER BY b.[Date]
ROWS UNBOUNDED PRECEDING
)
FROM GrandTotals a
CROSS JOIN DistinctDates b
LEFT JOIN @T c
ON a.ID = c.ID AND a.Category = c.Category AND b.[Date] = c.[Date]
) a
GROUP BY a.ID, a.Category, a.[Date]
) a
ORDER BY a.ID, a.Category, a.[Date] DESC;
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
May 13, 2015 at 7:07 pm
Hi again. Forgot to mention something.
ID/Category 2/Y was quite problematic because there was no data for 2015-05-11. You can see where I used LEAD to pull that data from 2015-05-12.
The issue is that if 2 days are missing, using LEAD won't work. In that case, I'd take the entire derived table in the main query and put it into another CTE, and use either a sub-query or a CROSS APPLY to pull the data from 2015-05-13.
If that doesn't make any sense to you (very well could be due to caffeine deficit on my part), let me know and I can show you.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
May 14, 2015 at 3:23 am
What is expected result for the sample data like this ?
INSERT INTO @T
SELECT 1 AS Id, 'A' AS Category, NULL AS ActivationDate UNION ALL
SELECT 2, 'X', NULL UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Y', NULL UNION ALL
SELECT 2, 'Z', NULL UNION ALL
SELECT 2, 'Z', NULL UNION ALL
SELECT 2, 'Z', NULL UNION ALL
May 14, 2015 at 12:10 pm
Thank you so much for your help.
I'd appreciate if you could show me how to overcome the issue when there are +2 missing records. So if there is no change in the inventory for a given Id/Category, just display the previous known inventory.
I agree that my data is no fun to work with 🙁
Thank again!!
May 14, 2015 at 12:53 pm
Maybe this got overcomplicated and Dwain's solution is better. But I just thought it was good to give this a shot.
DECLARE @DaysBefore int = 3; --Parameter?
DECLARE @StartDate Date = dateadd(dd, datediff(dd, 0, GETDATE()) - @DaysBefore, 0);
WITH Counts AS(
SELECT Id, Category, ISNULL( Date, '99991231') Date, COUNT(*) rec_count
FROM @T
GROUP BY Id, Category, Date
),
IdsCategories AS(
SELECT *
FROM (
SELECT DISTINCT Id, Category
FROM @T) t
CROSS
JOIN (
SELECT datekey Date
FROM Calendar c
WHERE c.datekey BETWEEN @StartDate AND GETDATE()
UNION ALL
SELECT '99991231') c
),
RunningTotals AS(
SELECT ic.Id,
ic.Category,
ic.Date,
SUM(ISNULL( rec_count, 0)) OVER(PARTITION BY ic.Id, ic.Category
ORDER BY ic.Date
ROWS UNBOUNDED PRECEDING) running_total,
SUM(rec_count) OVER( PARTITION BY ic.Id, ic.Category) grand_total
FROM Counts c
RIGHT JOIN IdsCategories ic ON c.Id = ic.Id AND c.Category = ic.Category AND c.Date = ic.Date
)
SELECT Id,
Category,
grand_total,
running_total,
Date
FROM RunningTotals
WHERE Date < '99991231'
ORDER BY Id, Category, Date DESC
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply