September 14, 2023 at 2:46 pm
Hello,
Converting file name MMDDYYYY.txt to a Date Field but CONVERT or CAST do not seem to work perhaps because I also have a REPLACE? Can someone help me understand how to do this if it is possible? See sample code below and many thanks in advance!
CREATE TABLE #t (getFileName varchar(100))
INSERT INTO #T (getFileName) VALUES ('09142023.txt')
SELECT REPLACE(getFileName, '.txt','') FROM #t
September 14, 2023 at 2:53 pm
Put the date into YYYYMMDD format before casting and it should work.
DECLARE @x VARCHAR(100) = '09142023';
SELECT @x
,SomeDate = CAST (CONCAT (RIGHT(@x, 4), LEFT(@x, 2), SUBSTRING (@x, 3, 2)) AS DATE);
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
September 14, 2023 at 2:59 pm
SELECT CAST(SUBSTRING((getFileName, 5, 4) +
LEFT(getFileName, 4) AS date) AS FileNameDate
FROM #t
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 14, 2023 at 4:07 pm
Make the filename use the international ISO format YYYYMMDD instead of that quaint american date format and you'll have no trouble at all. 🙂
September 14, 2023 at 4:42 pm
Make the filename use the international ISO format YYYYMMDD instead of that quaint american date format and you'll have no trouble at all. 🙂
+1, this has the added benefit of being sortable in Windows File Explorer.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
September 14, 2023 at 4:42 pm
... but if you can't change the date format in the file name...
This would also be a possible solution:
DECLARE @x VARCHAR(100) = '09142023';
SELECT @x as [@x]
, CONVERT(date,STUFF(STUFF(@x,3,0,'/'),6,0,'/'),101) as SomeDate
I use STUFF to insert a slash in the right places so that the string now conforms to the 101 style used by CONVERT.
September 14, 2023 at 8:40 pm
Using the OP's original example:
SELECT *
, file_date = datefromparts(f.string_date % 10000, f.string_date / 1000000, f.string_date / 10000 % 100)
FROM #t
CROSS APPLY (VALUES (substring(getFileName, 1, charindex('.', getFileName, 1) - 1))) AS f(string_date);
Assumptions: the filename is formatted as MMDDYYYY.txt. Using CROSS APPLY to remove the extension it is then just math to parse out month, day and year.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
September 15, 2023 at 1:59 am
Here's a quick question for you because we have to enforce such things where I work... do you want the conversion to fail if the date contains less than 8 characters?
--Jeff Moden
Change is inevitable... Change for the better is not.
September 15, 2023 at 2:03 am
Deleted. Made a mistake and didn't check before I posted. 🙁
--Jeff Moden
Change is inevitable... Change for the better is not.
September 15, 2023 at 4:38 am
IF the MMDDYYYY formatted string date is guaranteed to always be 8 characters, the following works (most of the posted code doesn't consider the inclusion of the ".txt") according to the 'MMDDYYYY.txt' string format and, because it does not use CHARINDEX, it's more than twice as fast as all the other code that works against the OPs data so far (with the exception of Scott Pletcher's code, which I missed on the first pass). It will produce an error if it's shorter and probably an error if it's longer but no guarantee there. If you want errors to convert to NULLs, use TRY_CONVERT or TRY_CAST on systems that support it (2016+).
DECLARE @BitBucket DATE;
SET STATISTICS TIME ON;
SELECT @BitBucket = CONVERT(DATE,SUBSTRING(getFileName,5,4)+SUBSTRING(getFileName,1,4)) --SWAPS MMDD and YYYY to make ISO date.
FROM #T
SET STATISTICS TIME OFF;
GO 5
Now, full disclosure says the time difference is only about 4.5 seconds faster on 10 million rows so ... who cares?
Consider this... if you wrote everything on your server to run a little more than twice as fast, your server workload would run twice as fast. The trouble is that few look at that big picture when they're writing their code and so their server is dying from the "SQL Death by a Million Cuts".
If you'd like to test, here's some test code to build the 10 Million rows according to the spec the OP wrote.
--=====================================================================================================================
-- Presets (This creates 10 million rows in about 5 seconds on my laptop)
--=====================================================================================================================
--===== If the test table already exists, drop it to make reruns in SSMS easier.
-- Using "old" method for pre-2012 folks.
IF OBJECT_ID('TempDB.dbo.#T') IS NOT NULL
DROP TABLE #T
;
GO
--===== Parameters to control the desired range of dates and and number of rows.
DECLARE @LoDateLimit DATETIME = '2020' --Inclusive, same as 2020-01-01
,@HiDateLimit DATETIME = '2030' --Exclusive, same as 2030-01-01
,@RowCount INT = 10000000 --Currently, 10 million rows
;
--===== Local constants to simplify formulas and increase performance.
DECLARE @DayCount INT = DATEDIFF(dd,@LoDateLimit,@HiDateLimit)
;
--=====================================================================================================================
-- Create and populate the table (HEAP) on the fly using given parameters.
--=====================================================================================================================
SELECT TOP (@RowCount) --Needs to be hardocded for 2005 or less.
getFileName = SUBSTRING(ca.ISODateTxt,5,8)+SUBSTRING(ca.ISODateTxt,1,4)+'.txt' --SWAPS YYYY and MMDD to make MMDDYYYY date.
INTO #T
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
CROSS APPLY (VALUES (CONVERT(CHAR(8),DATEADD(dd,ABS(CHECKSUM(NEWID())%@DayCount),@LoDateLimit),112)))ca(ISODateTxt)
;
--===== Display a sample of the contents of the Temp Table for verification.
SELECT TOP (100) * FROM #T
After finding an old post of mine from 2018, if you want to scrape off slightly more than an extra 100 ms (a suggestion on my part but written by Eirikur Eiriksson), the following will to the trick. It's similar to a lot of the other code but uses DATEFROMPARTS with explicit conversions to INT. Here's that bit of code...
DECLARE @BitBucket DATE;
SET STATISTICS TIME ON;
SELECT @BitBucket = DATEFROMPARTS(
CONVERT(INT,SUBSTRING(getFileName,5,4),0)
,CONVERT(INT,SUBSTRING(getFileName,1,2),0)
,CONVERT(INT,SUBSTRING(getFileName,3,2),0)
)
FROM #T;
SET STATISTICS TIME OFF;
GO 5
And, finally, you'd think that Integer Math would be the fastest but the LEFT(8) or SUBSTRING(1,8) in the CROSS APPLY slows it down by about 600 ms.
DECLARE @BitBucket DATE;
SET STATISTICS TIME ON;
SELECT @BitBucket = DATEFROMPARTS(ca.IntDate%10000, ca.IntDate/1000000, ca.IntDate/10000%100)
FROM dbo.TestIntDateTime t
CROSS APPLY (VALUES (CONVERT(INT,SUBSTRING(t.getFileName,1,8))))ca(IntDate)
SET STATISTICS TIME OFF;
GO 5
--Jeff Moden
Change is inevitable... Change for the better is not.
September 15, 2023 at 1:56 pm
For the love of whatever you hold dear, stop using “MMDDYYYY” formatting!
It doesn't sort chronology as text. You should be using “YYYYMMDD”.
September 15, 2023 at 3:15 pm
IF the MMDDYYYY formatted string date is guaranteed to always be 8 characters, the following works (most of the posted code doesn't consider the inclusion of the ".txt") according to the 'MMDDYYYY.txt' string format and, because it does not use CHARINDEX, it's more than twice as fast as all the other code that works against the OPs data so far. It will produce an error if it's shorter and probably an error if it's longer but no guarantee there. If you want errors to convert to NULLs, use TRY_CONVERT or TRY_CAST on systems that support it (2016+).
DECLARE @BitBucket DATE;
SET STATISTICS TIME ON;
SELECT @BitBucket = CONVERT(DATE,SUBSTRING(getFileName,5,4)+SUBSTRING(getFileName,1,4)) --SWAPS MMDD and YYYY to make ISO date.
FROM #T
SET STATISTICS TIME OFF;
GO 5Now, full disclosure says the time difference is only about 4.5 seconds faster on 10 million rows so ... who cares?
Consider this... if you wrote everything on your server to run a little more than twice as fast, your server workload would run twice as fast. The trouble is that few look at that big picture when they're writing their code and so their server is dying from the "SQL Death by a Million Cuts".
If you'd like to test, here's some test code to build the 10 Million rows according to the spec the OP wrote.
--=====================================================================================================================
-- Presets (This creates 10 million rows in about 5 seconds on my laptop)
--=====================================================================================================================
--===== If the test table already exists, drop it to make reruns in SSMS easier.
-- Using "old" method for pre-2012 folks.
IF OBJECT_ID('TempDB.dbo.#T') IS NOT NULL
DROP TABLE #T
;
GO
--===== Parameters to control the desired range of dates and and number of rows.
DECLARE @LoDateLimit DATETIME = '2020' --Inclusive, same as 2020-01-01
,@HiDateLimit DATETIME = '2030' --Exclusive, same as 2030-01-01
,@RowCount INT = 10000000 --Currently, 10 million rows
;
--===== Local constants to simplify formulas and increase performance.
DECLARE @DayCount INT = DATEDIFF(dd,@LoDateLimit,@HiDateLimit)
;
--=====================================================================================================================
-- Create and populate the table (HEAP) on the fly using given parameters.
--=====================================================================================================================
SELECT TOP (@RowCount) --Needs to be hardocded for 2005 or less.
getFileName = SUBSTRING(ca.ISODateTxt,5,8)+SUBSTRING(ca.ISODateTxt,1,4)+'.txt' --SWAPS YYYY and MMDD to make MMDDYYYY date.
INTO #T
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
CROSS APPLY (VALUES (CONVERT(CHAR(8),DATEADD(dd,ABS(CHECKSUM(NEWID())%@DayCount),@LoDateLimit),112)))ca(ISODateTxt)
;
--===== Display a sample of the contents of the Temp Table for verification.
SELECT TOP (100) * FROM #TAfter finding an old post of mine from 2018, if you want to scrape off slightly more than an extra 100 ms (a suggestion on my part but written by Eirikur Eiriksson), the following will to the trick. It's similar to a lot of the other code but uses DATEFROMPARTS with explicit conversions to INT. Here's that bit of code...
DECLARE @BitBucket DATE;
SET STATISTICS TIME ON;
SELECT @BitBucket = DATEFROMPARTS(
CONVERT(INT,SUBSTRING(getFileName,5,4),0)
,CONVERT(INT,SUBSTRING(getFileName,1,2),0)
,CONVERT(INT,SUBSTRING(getFileName,3,2),0)
)
FROM #T;
SET STATISTICS TIME OFF;
GO 5And, finally, you'd think that Integer Math would be the fastest but the LEFT(8) or SUBSTRING(1,8) in the CROSS APPLY slows it down by about 600 ms.
DECLARE @BitBucket DATE;
SET STATISTICS TIME ON;
SELECT @BitBucket = DATEFROMPARTS(ca.IntDate%10000, ca.IntDate/1000000, ca.IntDate/10000%100)
FROM dbo.TestIntDateTime t
CROSS APPLY (VALUES (CONVERT(INT,SUBSTRING(t.getFileName,1,8))))ca(IntDate)
SET STATISTICS TIME OFF;
GO 5
The only difference I see between that code and mine is CONVERT rather than CAST ... I didn't realize CAST had that "big overhead". (Presumably using LEFT rather than SUBSTRING couldn't have that much overhead.)
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 15, 2023 at 4:38 pm
The only difference I see between that code and mine is CONVERT rather than CAST ... I didn't realize CAST had that "big overhead". (Presumably using LEFT rather than SUBSTRING couldn't have that much overhead.)
Crud. My sincere apologies, Scott. I missed your code. You're code is faster that the other code, as well. I've updated my response to point that out.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 15, 2023 at 11:07 pm
Many thanks all for all the replies!
September 16, 2023 at 2:56 am
Many thanks all for all the replies!
Thanks for the feedback.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply