June 30, 2018 at 4:04 am
Comments posted to this topic are about the item Aggregate Data for the Last Day of the Month
August 16, 2018 at 2:11 am
Hello
Firstly, I appreciate anyone who takes the time to write helpful articles on SQL as I am mostly self-taught. I'm still a bit of an SQL novice so apologies if this question has an obvious answer. If all you need to run the last day of the month CTE is a list of IDs from 1-12, why do we need to go through the bother and processing time/power of having to create the list from an arbitrary source like the 1st CTE?
Thanks
Drew
August 16, 2018 at 4:40 am
Thank you Adam for this effort but I must criticize the inefficiency of the query which can easily be vastly improved.
😎
Here is a version that is more than 10 more efficient than the one posted in the article
;WITH T(N) AS (SELECT X.N FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) X(N))
,EOM(EOM_DATE) AS
(
SELECT
EOMONTH(DATEFROMPARTS(2016,TD.N,1)) AS EOM_DATE
FROM T TD
)
SELECT
ED.EOM_DATE AS LastDayDate
,SUM(DS.TotalSalePrice) AS TotalDailySales
FROM EOM ED
INNER JOIN Data.Sales DS
ON ED.EOM_DATE = CONVERT(DATE,DS.SaleDate,0)
GROUP BY ED.EOM_DATE
ORDER BY ED.EOM_DATE ASC;
August 16, 2018 at 6:41 am
Apart from technical inefficiency, there is an issue with the logical shortcomings.
Sale amounts on the last days of a month don't say much, unless they are compared with sales on any other day of a month.
Here is my version of the query, much shorter, more informative and not using highly inefficient (and actually useless) functions DATEFROMPARTS and EOMONTH:SELECT DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate), 0) MonthBeginning,
SUM(CASE WHEN SLS.SaleDate >= DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate)+1, 0)-1 THEN SLS.SalePrice ELSE 0 END) AS TotalSales_LastDay,
SUM(CASE WHEN SLS.SaleDate < DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate)+1, 0)-1 THEN SLS.SalePrice ELSE 0 END) AS TotalSales_AnyOtherDay
FROM Data.SalesByCountry SLS
GROUP BY DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate), 0)
ORDER BY MonthBegin
_____________
Code for TallyGenerator
August 16, 2018 at 7:53 am
Sale amounts on the last days of a month don't say much, unless they are compared with sales on any other day of a month.
This may be the case but taking a requirement and producing something you think they might want is somewhat arrogant 🙂
August 16, 2018 at 11:25 am
Sergiy - Thursday, August 16, 2018 6:41 AMApart from technical inefficiency, there is an issue with the logical shortcomings.Sale amounts on the last days of a month don't say much, unless they are compared with sales on any other day of a month.
Here is my version of the query, much shorter, more informative and not using highly inefficient (and actually useless) functions DATEFROMPARTS and EOMONTH:
SELECT DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate), 0) MonthBeginning,
SUM(CASE WHEN SLS.SaleDate >= DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate)+1, 0)-1 THEN SLS.SalePrice ELSE 0 END) AS TotalSales_LastDay,
SUM(CASE WHEN SLS.SaleDate < DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate)+1, 0)-1 THEN SLS.SalePrice ELSE 0 END) AS TotalSales_AnyOtherDay
FROM Data.SalesByCountry SLS
GROUP BY DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate), 0)
ORDER BY MonthBegin
You may want to thing again about the DATEFROMPARTS and the EOMONTH functions, those are some of the fastest options and certainly faster than cascading DATEADD and DATEDIFF
😎
Here is a quick example
USE TEEST;
GO
SET NOCOUNT ON;
DECLARE @SAMPLE_SIZE BIGINT = 1000000;
DECLARE @DATE_RANGE INT = 126144000;
DECLARE @VALUE_RANGE INT = 1000000;
DECLARE @FIRST_DATE DATETIME = '2015-01-01';
--/*
IF OBJECT_ID(N'dbo.TBL_TEST_DATE_AGGREGATE') IS NOT NULL DROP TABLE dbo.TBL_TEST_DATE_AGGREGATE;
CREATE TABLE dbo.TBL_TEST_DATE_AGGREGATE
(
TDA_ID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_DBO_TBL_TEST_DATE_AGGREGATE_TDA_ID PRIMARY KEY CLUSTERED (TDA_ID ASC)
,SALE_DATE DATETIME NOT NULL
,SALE_VALUE NUMERIC(12,2) NOT NULL
)
;WITH T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))
, NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)
INSERT INTO dbo.TBL_TEST_DATE_AGGREGATE WITH (TABLOCKX) (SALE_DATE,SALE_VALUE)
SELECT
DATEADD(SECOND,ABS(CHECKSUM(NEWID())) % @DATE_RANGE,@FIRST_DATE) AS SALE_DATE
,CONVERT(NUMERIC(12,2),((ABS(CHECKSUM(NEWID())) % @VALUE_RANGE) + 0.0) / 100.0,0) AS SALE_VALUE
FROM NUMS NM
ORDER BY SALE_DATE ASC;
CREATE NONCLUSTERED INDEX NCLIDX_DBO_TBL_TEST_DATE_AGGREGATE_SALE_DATE_INCLUDE_SALE_VALUE_INCL_SALE_VALUE ON dbo.TBL_TEST_DATE_AGGREGATE(SALE_DATE ASC) INCLUDE (SALE_VALUE);
-- */
DECLARE @timer TABLE (T_TXT VARCHAR(50) NOT NULL, T_TD DATETIME2(7) NOT NULL DEFAULT (SYSDATETIME()));
DECLARE @DATETIME_BUCKET DATETIME = 0;
DECLARE @NUMERIC_BUCKET NUMERIC(12,2) = 0.0;
INSERT INTO @timer (T_TXT) VALUES ('DRY RUN');
SELECT
@DATETIME_BUCKET = TDA.SALE_DATE
,@NUMERIC_BUCKET = TDA.SALE_VALUE
FROM dbo.TBL_TEST_DATE_AGGREGATE TDA
OPTION (MAXDOP 1);
INSERT INTO @timer (T_TXT) VALUES ('DRY RUN');
INSERT INTO @timer (T_TXT) VALUES ('DATEDIFF');
SELECT
@DATETIME_BUCKET = DATEADD(mm, DATEDIFF(MM, 0, SLS.SALE_DATE), 0)
FROM dbo.TBL_TEST_DATE_AGGREGATE SLS
OPTION (MAXDOP 1);
INSERT INTO @timer (T_TXT) VALUES ('DATEDIFF');
INSERT INTO @timer (T_TXT) VALUES ('EOMONTH');
SELECT
@DATETIME_BUCKET = EOMONTH(SLS.SALE_DATE)
FROM dbo.TBL_TEST_DATE_AGGREGATE SLS
OPTION (MAXDOP 1);
INSERT INTO @timer (T_TXT) VALUES ('EOMONTH');
SELECT
T.T_TXT
,DATEDIFF(MICROSECOND,MIN(T.T_TD),MAX(T.T_TD)) AS DURATION
FROM @timer T
GROUP BY T.T_TXT
ORDER BY DURATION ASC;
Results on a latest gen i5 laptop
T_TXT DURATION
--------- --------
DRY RUN 179125
EOMONTH 232185
DATEDIFF 247174
August 16, 2018 at 11:27 am
Also, the author did say that this is just one solution. And the article also stated that the requirement was for sales on the last day of each month.
August 16, 2018 at 12:46 pm
Eirikur Eiriksson - Thursday, August 16, 2018 4:40 AMThank you Adam for this effort but I must criticize the inefficiency of the query which can easily be vastly improved.
😎Here is a version that is more than 10 more efficient than the one posted in the article
;WITH T(N) AS (SELECT X.N FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) X(N))
,EOM(EOM_DATE) AS
(
SELECT
EOMONTH(DATEFROMPARTS(2016,TD.N,1)) AS EOM_DATE
FROM T TD
)
SELECT
ED.EOM_DATE AS LastDayDate
,SUM(DS.TotalSalePrice) AS TotalDailySales
FROM EOM ED
INNER JOIN Data.Sales DS
ON ED.EOM_DATE = CONVERT(DATE,DS.SaleDate,0)
GROUP BY ED.EOM_DATE
ORDER BY ED.EOM_DATE ASC;
With a couple of minor changes - you can eliminate the datefromparts altogether and allow for the past 12 months of data from any given date.
Declare @endDate date = '2016-12-31';
;WITH T(N) AS (SELECT X.N FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11)) X(N))
,EOM(EOM_DATE) AS
(
SELECT
EOMONTH(@endDate,-TD.N) AS EOM_DATE
FROM T TD
)
SELECT
ED.EOM_DATE AS LastDayDate
FROM EOM ED
INNER JOIN Data.Sales DS
ON ED.EOM_DATE = CONVERT(DATE,DS.SaleDate,0)
GROUP BY ED.EOM_DATE
ORDER BY ED.EOM_DATE ASC;
Or better yet...
Declare @endDate date = '2017-01-15';
;WITH T(N) AS (SELECT X.N FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) X(N))
,EOM(EOM_DATE) AS
(
SELECT
EOMONTH(@endDate,-TD.N) AS EOM_DATE
FROM T TD
)
SELECT
ED.EOM_DATE AS LastDayDate
FROM EOM ED
INNER JOIN Data.Sales DS
ON ED.EOM_DATE = CONVERT(DATE,DS.SaleDate,0)
GROUP BY ED.EOM_DATE
ORDER BY ED.EOM_DATE ASC;
The second version gives you the prior 12 months from the given date - not including the month of the given date.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
August 16, 2018 at 7:22 pm
Lynn Pettis - Thursday, August 16, 2018 11:27 AMAlso, the author did say that this is just one solution. And the article also stated that the requirement was for sales on the last day of each month.
yes, but why should it be the worst possible one which gets published (apart from a cursor approach)?
_____________
Code for TallyGenerator
August 16, 2018 at 11:25 pm
Sergiy - Thursday, August 16, 2018 7:22 PMLynn Pettis - Thursday, August 16, 2018 11:27 AMAlso, the author did say that this is just one solution. And the article also stated that the requirement was for sales on the last day of each month.yes, but why should it be the worst possible one which gets published (apart from a cursor approach)?
Why isn't the grass blue and the sky green?
Because that is the one the author published in the article.
August 17, 2018 at 6:54 pm
Lynn Pettis - Thursday, August 16, 2018 11:25 PMSergiy - Thursday, August 16, 2018 7:22 PMLynn Pettis - Thursday, August 16, 2018 11:27 AMAlso, the author did say that this is just one solution. And the article also stated that the requirement was for sales on the last day of each month.yes, but why should it be the worst possible one which gets published (apart from a cursor approach)?
Why isn't the grass blue and the sky green?
Because that is the one the author published in the article.
Heh... yeah, tell that to the users when the code grows or someone uses the same techniques on something much larger. 😉
One of the things that this community is VERY good about is coming up with better ways in the discussions regardless of what the author states in the article. Would you be so quick with your comment if the author had actually used a cursor or recursive CTE or While Loop? Of course not. You'd have explained a better way and that's exactly what's happening on this thread.
A bigger and very real problem is that this article was derived from a book that the author wrote and is advertising on every article he's posted in this series. Considering that the code does need some help and a book is being pushed in the process, I think it's especially important in this case.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 18, 2018 at 8:42 pm
Eirikur Eiriksson - Thursday, August 16, 2018 11:25 AMSergiy - Thursday, August 16, 2018 6:41 AMApart from technical inefficiency, there is an issue with the logical shortcomings.Sale amounts on the last days of a month don't say much, unless they are compared with sales on any other day of a month.
Here is my version of the query, much shorter, more informative and not using highly inefficient (and actually useless) functions DATEFROMPARTS and EOMONTH:
SELECT DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate), 0) MonthBeginning,
SUM(CASE WHEN SLS.SaleDate >= DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate)+1, 0)-1 THEN SLS.SalePrice ELSE 0 END) AS TotalSales_LastDay,
SUM(CASE WHEN SLS.SaleDate < DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate)+1, 0)-1 THEN SLS.SalePrice ELSE 0 END) AS TotalSales_AnyOtherDay
FROM Data.SalesByCountry SLS
GROUP BY DATEADD(mm, DATEDIFF(MM, 0, SLS.SaleDate), 0)
ORDER BY MonthBeginYou may want to thing again about the DATEFROMPARTS and the EOMONTH functions, those are some of the fastest options and certainly faster than cascading DATEADD and DATEDIFF
😎Here is a quick example
USE TEEST;
GO
SET NOCOUNT ON;
DECLARE @SAMPLE_SIZE BIGINT = 1000000;
DECLARE @DATE_RANGE INT = 126144000;
DECLARE @VALUE_RANGE INT = 1000000;
DECLARE @FIRST_DATE DATETIME = '2015-01-01';--/*
IF OBJECT_ID(N'dbo.TBL_TEST_DATE_AGGREGATE') IS NOT NULL DROP TABLE dbo.TBL_TEST_DATE_AGGREGATE;
CREATE TABLE dbo.TBL_TEST_DATE_AGGREGATE
(
TDA_ID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_DBO_TBL_TEST_DATE_AGGREGATE_TDA_ID PRIMARY KEY CLUSTERED (TDA_ID ASC)
,SALE_DATE DATETIME NOT NULL
,SALE_VALUE NUMERIC(12,2) NOT NULL
);WITH T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))
, NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)
INSERT INTO dbo.TBL_TEST_DATE_AGGREGATE WITH (TABLOCKX) (SALE_DATE,SALE_VALUE)
SELECT
DATEADD(SECOND,ABS(CHECKSUM(NEWID())) % @DATE_RANGE,@FIRST_DATE) AS SALE_DATE
,CONVERT(NUMERIC(12,2),((ABS(CHECKSUM(NEWID())) % @VALUE_RANGE) + 0.0) / 100.0,0) AS SALE_VALUE
FROM NUMS NM
ORDER BY SALE_DATE ASC;CREATE NONCLUSTERED INDEX NCLIDX_DBO_TBL_TEST_DATE_AGGREGATE_SALE_DATE_INCLUDE_SALE_VALUE_INCL_SALE_VALUE ON dbo.TBL_TEST_DATE_AGGREGATE(SALE_DATE ASC) INCLUDE (SALE_VALUE);
-- */
DECLARE @timer TABLE (T_TXT VARCHAR(50) NOT NULL, T_TD DATETIME2(7) NOT NULL DEFAULT (SYSDATETIME()));
DECLARE @DATETIME_BUCKET DATETIME = 0;
DECLARE @NUMERIC_BUCKET NUMERIC(12,2) = 0.0;INSERT INTO @timer (T_TXT) VALUES ('DRY RUN');
SELECT
@DATETIME_BUCKET = TDA.SALE_DATE
,@NUMERIC_BUCKET = TDA.SALE_VALUE
FROM dbo.TBL_TEST_DATE_AGGREGATE TDA
OPTION (MAXDOP 1);
INSERT INTO @timer (T_TXT) VALUES ('DRY RUN');INSERT INTO @timer (T_TXT) VALUES ('DATEDIFF');
SELECT
@DATETIME_BUCKET = DATEADD(mm, DATEDIFF(MM, 0, SLS.SALE_DATE), 0)
FROM dbo.TBL_TEST_DATE_AGGREGATE SLS
OPTION (MAXDOP 1);
INSERT INTO @timer (T_TXT) VALUES ('DATEDIFF');INSERT INTO @timer (T_TXT) VALUES ('EOMONTH');
SELECT
@DATETIME_BUCKET = EOMONTH(SLS.SALE_DATE)
FROM dbo.TBL_TEST_DATE_AGGREGATE SLS
OPTION (MAXDOP 1);
INSERT INTO @timer (T_TXT) VALUES ('EOMONTH');SELECT
T.T_TXT
,DATEDIFF(MICROSECOND,MIN(T.T_TD),MAX(T.T_TD)) AS DURATION
FROM @timer T
GROUP BY T.T_TXT
ORDER BY DURATION ASC;Results on a latest gen i5 laptop
T_TXT DURATION
--------- --------
DRY RUN 179125
EOMONTH 232185
DATEDIFF 247174
You see, Eirikur, to me ineffectiveness of EOMONTH is defined not by those 5 or so percent in the speed of computation, but by all the stuff which comes with it.
The function is pretty much useless for all of the scenarios when event date+time recorded in the same column. Except, probably that one case which was used in the article, when you need data for that single last day of the month.
In every other case you need rather beginning of the next ,month, and EOMONTH has no use case.
And if you have date and time stored separately, the overall overhead of such approach will be so big that you'll forget about that tiny difference in date computations.
_____________
Code for TallyGenerator
August 20, 2018 at 3:39 am
I am getting the same output with this query
Select convert(date,saledate),sum(saleprice) from Data.SalesbyCountry
where convert(date,saledate) = eomonth(saledate) and year(saledate)='2016'
group by convert(date,saledate)
Am I missing something ???
April 3, 2020 at 3:47 pm
I am getting the same output with this query
Select convert(date,saledate),sum(saleprice) from Data.SalesbyCountry
where convert(date,saledate) = eomonth(saledate) and year(saledate)='2016'
group by convert(date,saledate)Am I missing something ???
Yes, Sir. You're missing the point of the entire article.
The whole point was not to merely get the sum of what's there, which is easy as you have demonstrated, but how to get the sum of the last day of every month to demonstrate how to present the fact that not all months in the range of dates had sales.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply