January 31, 2017 at 5:25 pm
Hello All,
I'm trying to fine the most efficient way of doing a running total of COUNT(DISTINCT ID's). I need to run it as of year, as of the month during a year. I looked at WINDOWS functions but can't do a COUNT(DISTINCT). I have some small test data. But i'll be running this over around 15 million row table. I have complete schema control as well. but here is the bare basics of the data. Any help or pointing in the right direction is very appreciated.
CREATE TABLE #Test ( VisitID INT, VisitDate DATE )
INSERT INTO #Test (VisitID, VisitDate) VALUES
(295591 ,'2015-10-02'),
(1101206 ,'2014-04-01'),
(1104675 ,'2014-04-11'),
(1116085 ,'2014-05-14'),
(291111 ,'2015-09-20'),
(1108068 ,'2014-04-22'),
(1116085 ,'2014-05-14'),
(291111 ,'2015-09-20')
SELECT * FROM #Test
DROP TABLE #Test
***SQL born on date Spring 2013:-)
February 1, 2017 at 12:10 am
Not entirely certain what you are after but here is an example that
😎
IF OBJECT_ID(N'tempdb..#Test') IS NOT NULL DROP TABLE #Test;
CREATE TABLE #Test ( VisitID INT, VisitDate DATE )
INSERT INTO #Test (VisitID, VisitDate) VALUES
(295591 ,'2015-10-02'),
(1101206 ,'2014-04-01'),
(1104675 ,'2014-04-11'),
(1116085 ,'2014-05-14'),
(291111 ,'2015-09-20'),
(1108068 ,'2014-04-22'),
(1116085 ,'2014-05-14'),
(291111 ,'2015-09-20');
CREATE NONCLUSTERED INDEX NCLIDX_#TEST_VISITID_VISITDATE ON #Test (VisitID ASC, VisitDate ASC);
;WITH FIRST_VISITS AS
(
SELECT
T.VisitID
,MIN(T.VisitDate) AS FIRST_VISIT
FROM #Test T
GROUP BY T.VisitID
)
,VISIT_BY_YEAR_MONTH AS
(
SELECT
YEAR(FV.FIRST_VISIT) AS V_YEAR
,MONTH(FV.FIRST_VISIT) AS V_MONTH
,COUNT(FV.VisitID) AS V_COUNT
FROM FIRST_VISITS FV
GROUP BY YEAR(FV.FIRST_VISIT)
,MONTH(FV.FIRST_VISIT)
)
SELECT
VYM.V_YEAR
,VYM.V_MONTH
,VYM.V_COUNT
,SUM(VYM.V_COUNT) OVER
(
ORDER BY VYM.V_YEAR
,VYM.V_MONTH
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS RT_VISIT_COUNT
FROM VISIT_BY_YEAR_MONTH VYM;
Output
V_YEAR V_MONTH V_COUNT RT_VISIT_COUNT
----------- ----------- ----------- --------------
2014 4 3 3
2014 5 1 4
2015 9 1 5
2015 10 1 6
February 1, 2017 at 7:38 am
Hi Eirikur,
Thanks again for your input, those results are very close to what I need. Except the Year counts would start over vs be continous
So the end total for 2014 would be 4 and the end total for 2015 would. The months are dead on. I have used the quirky update method before but i'm not sure how i can get that to work for COUNT(DISTINCT).
***SQL born on date Spring 2013:-)
February 1, 2017 at 7:53 am
This might help you, restarts the count on a new year. Subject X from year n-1 will be counted again on year n.
😎
IF OBJECT_ID(N'tempdb..#Test') IS NOT NULL DROP TABLE #Test;
CREATE TABLE #Test ( VisitID INT, VisitDate DATE )
INSERT INTO #Test (VisitID, VisitDate) VALUES
(295591 ,'2015-10-02'),
(1101206 ,'2014-04-01'),
(1104675 ,'2014-04-11'),
(1116085 ,'2014-05-14'),
(291111 ,'2015-09-20'),
(1108068 ,'2014-04-22'),
(1116085 ,'2014-05-14'),
(291111 ,'2015-09-20');
CREATE NONCLUSTERED INDEX NCLIDX_#TEST_VISITID_VISITDATE ON #Test (VisitID ASC, VisitDate ASC);
;WITH FIRST_VISITS AS
(
SELECT
T.VisitID
,YEAR(T.VisitDate) AS V_YEAR
,MIN(T.VisitDate) AS FIRST_VISIT
FROM #Test T
GROUP BY T.VisitID
,YEAR(T.VisitDate)
)
,VISIT_BY_YEAR_MONTH AS
(
SELECT
FV.V_YEAR AS V_YEAR
,MONTH(FV.FIRST_VISIT) AS V_MONTH
,COUNT(FV.VisitID) AS V_COUNT
FROM FIRST_VISITS FV
GROUP BY FV.V_YEAR
,MONTH(FV.FIRST_VISIT)
)
SELECT
VYM.V_YEAR
,VYM.V_MONTH
,VYM.V_COUNT
,SUM(VYM.V_COUNT) OVER
(
PARTITION BY VYM.V_YEAR
ORDER BY VYM.V_YEAR
,VYM.V_MONTH
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS RT_VISIT_COUNT
FROM VISIT_BY_YEAR_MONTH VYM;
OutputV_YEAR V_MONTH V_COUNT RT_VISIT_COUNT
----------- ----------- ----------- --------------
2014 4 3 3
2014 5 1 4
2015 9 1 1
2015 10 1 2
February 1, 2017 at 7:57 am
Awesome going to give it a try !. Thanks !
***SQL born on date Spring 2013:-)
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply