June 20, 2022 at 7:59 am
Hi all,
I have been using FORMAT()
FORMAT(Tab.date, 'yyyy-MM-dd') AS 'date_begin',
but been told it would be slow when it comes to big queries.
How can I still want the result to be a date (NOT STRING)
But I want to 'discard' the Days, Hours, etc (2020-11-01 00:00:00.0000000)
Leaving a year/month only date field (2020-11)
Thanks
June 20, 2022 at 10:25 am
SQL does not have a Year-Month only data type
June 20, 2022 at 11:58 am
2020-11 is not a date. 202011 as an INT is, however, easy enough.
You say that you don't want to return a string, but that is exactly what FORMAT returns, which you appear to be somewhat fond of.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
June 20, 2022 at 1:04 pm
replacement for format to return just yyyy-mm-dd (or yyyy-mm) is easy - convert(char(10), datefield, 121) - change the char(10) to char(7) to return just yyyy-mm - but as mentioned all these functions return a string.
or if you just wish to remove the time portion of a datetime field you can do convert(date, datefield) - this will give you just the date portion as a date datatype (instead of string as the other examples, including FORMAT)
June 20, 2022 at 1:08 pm
replacement for format to return just yyyy-mm-dd (or yyyy-mm) is easy - convert(char(10), datefield, 121) - change the char(10) to char(7) to return just yyyy-mm - but as mentioned all these functions return a string.
or if you just wish to remove the time portion of a datetime field you can do convert(date, datefield) - this will give you just the date portion as a date datatype (instead of string as the other examples, including FORMAT)
Thanks
June 20, 2022 at 3:31 pm
You could just use the 1st for the day, which would mean it was still a valid date:
CONVERT(varchar(10), DATEADD(MONTH, DATEDIFF(MONTH, 0, Tab.date), 0), 120)
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
June 20, 2022 at 3:42 pm
And, just to make you feel better about abandoning FORMAT, I've not yet seen it where it's not at least 20 times slower than code like what Frederico suggested even on modern machines with nasty fast SSDs and a shedload of memory, etc.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 21, 2022 at 11:05 am
The test is simple and straight forward
set statistics time , io on
go
select FORMAT(datecol, 'yyyy-MM-dd') as Datetext
from #TallyDate
--order by DateCol
/*
(10000 rows affected)
Table '#TallyDate__________________________________________________________________________________________________________000000000008'. Scan count 1, logical reads 22, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 156 ms, elapsed time = 226 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
*/
select convert(char(10), dateadd(dd, datediff(dd,0,datecol), 0), 121 ) as Datetext
from #TallyDate
/*
(10000 rows affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#TallyDate__________________________________________________________________________________________________________000000000008'. Scan count 1, logical reads 22, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 136 ms.
*/
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
June 21, 2022 at 9:12 pm
@Johan ... Do you have the code where you created and populated the #TallyDate table so that newbies can run your test?
Also, your good code doesn't need the DATEADD/DATEDIFF thing. And change the CHAR(10) to CHAR(7) to give the op the YYYY-MM format he wanted.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 21, 2022 at 9:24 pm
Hi all,
I have been using FORMAT()
FORMAT(Tab.date, 'yyyy-MM-dd') AS 'date_begin',but been told it would be slow when it comes to big queries.
How can I still want the result to be a date (NOT STRING) But I want to 'discard' the Days, Hours, etc (2020-11-01 00:00:00.0000000) Leaving a year/month only date field (2020-11)
Thanks
Just to remind people of what the OP is really asking for...
--Jeff Moden
Change is inevitable... Change for the better is not.
June 22, 2022 at 3:31 am
Hi all,
I have been using FORMAT()
FORMAT(Tab.date, 'yyyy-MM-dd') AS 'date_begin',but been told it would be slow when it comes to BIG QUERIES.
How can I still want the result to be a date (NOT STRING) But I want to 'discard' the Days, Hours, etc (2020-11-01 00:00:00.0000000) Leaving a year/month only date field (2020-11)
Thanks
As the others have said, there is no temporal data type to hold just the Year and Month.
Considering the "big queries" you speak of, if you're trying to do aggregates by year and month, the usual recommendation is to reduce each date down to being the first of the month (year included) as a DATE datatype for the given date of each row OR do a "Range Grouping". I've even made separate columns on some tables that keep both the original date and the first of the month date if the reporting requirements demand such an optimization.
"Formatted" dates should never be used to form such aggregations because of all the bloody implicit casts (performance) and the fact that it will certainly make the criteria for you queries "Non-SARGable" resulting in full scans all the time.
So, what I recommend is posting one of the queries that is "slow because of FORMAT". Yes, FORMAT absolutely sucks for performance but so does converting things to the character based YYYY-MM format using ANY method if you use that method in the wrong place. What's the "wrong place"? Anything (like aggregations) that should be done purely in the "data layer". Only the results should be formatted and there are many that will tell you that shouldn't EVER be done in T-SQL. While I agree with the premise, I don't agree with the absolute nature of that recommendation.
So let's see one of the queries where you used FORMAT before and let's see if we can show you a way to "double-prong" the performance using techniques like "pre-aggregation" to do the calculations and then format them for the output you're directed to provide.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 22, 2022 at 6:37 am
@Johan ... Do you have the code where you created and populated the #TallyDate table so that newbies can run your test?
Also, your good code doesn't need the DATEADD/DATEDIFF thing. And change the CHAR(10) to CHAR(7) to give the op the YYYY-MM format he wanted.
ref: "The "Numbers" or "Tally" Table: What it is and how it replaces a loop"
Formatting "yyyy-MM" must be done front end. Non date data cannot be stored in date/datetime(2) data types.
Date values must be stored in date/datetime(2) data types with a single exception: your proprietary DateSeries table.
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
June 22, 2022 at 6:49 am
TheNewbie wrote:Hi all,
I have been using FORMAT()
FORMAT(Tab.date, 'yyyy-MM-dd') AS 'date_begin',but been told it would be slow when it comes to BIG QUERIES.
How can I still want the result to be a date (NOT STRING) But I want to 'discard' the Days, Hours, etc (2020-11-01 00:00:00.0000000) Leaving a year/month only date field (2020-11)
Thanks
As the others have said, there is no temporal data type to hold just the Year and Month.
Considering the "big queries" you speak of, if you're trying to do aggregates by year and month, the usual recommendation is to reduce each date down to being the first of the month (year included) as a DATE datatype for the given date of each row OR do a "Range Grouping". I've even made separate columns on some tables that keep both the original date and the first of the month date if the reporting requirements demand such an optimization.
"Formatted" dates should never be used to form such aggregations because of all the bloody implicit casts (performance) and the fact that it will certainly make the criteria for you queries "Non-SARGable" resulting in full scans all the time.
So, what I recommend is posting one of the queries that is "slow because of FORMAT". Yes, FORMAT absolutely sucks for performance but so does converting things to the character based YYYY-MM format using ANY method if you use that method in the wrong place. What's the "wrong place"? Anything (like aggregations) that should be done purely in the "data layer". Only the results should be formatted and there are many that will tell you that shouldn't EVER be done in T-SQL. While I agree with the premise, I don't agree with the absolute nature of that recommendation.
So let's see one of the queries where you used FORMAT before and let's see if we can show you a way to "double-prong" the performance using techniques like "pre-aggregation" to do the calculations and then format them for the output you're directed to provide.
Hi,
Thanks for all this information. I really appreciate and found that helpful.
here is an (simplified) example below
SELECT tab1.a
tab1.b,
tab2.c,
tab3.d,
tab3.e,
tab1.f,
FORMAT(tab1.g, 'yyyy-MM-dd'), -- date
FORMAT(tab1.h, 'yyyy-MM-dd'), -- date
FORMAT(tab1.i, 'yyyy-MM-dd'), -- date
FORMAT(tab2.j, 'yyyy-MM-dd'), -- date
FORMAT(tab2.k, 'yyyy-MM-dd'), -- date
FORMAT(tab2.l, 'yyyy-MM-dd'), -- date
tab1.m,
tab1.n,
tab1.o,
tab1.p,
tab1.q,
tab1.r,
tab1.s,
tab1.t
FROM table1 tab1
RIGHT JOIN table2 tab2
ON tab1.a = tab2.c
INNER JOIN table3 tab3
ON tab1.b= tab3.e
WHERE tab1.g >= '2012-04-01' -- or '2012-04'
and I have to convert them all into date format in PowerBI
Viewing 15 posts - 1 through 15 (of 33 total)
You must be logged in to reply to this topic. Login to reply