December 19, 2012 at 3:40 pm
Comments posted to this topic are about the item GetDateInString
December 21, 2012 at 12:01 am
hi
let me know below query satisfy requirement or not .
thanks
Jayanth
CREATE FUNCTION [Global].[fn_GetDateInString]
(
@InputString AS NVARCHAR(500)
)
RETURNS NVARCHAR(50)
AS
BEGIN
DECLARE @output AS nvarchar(100)
SELECT @output = Convert(varchar,convert(Date, @InputString, 107),107)
return @output
END
December 21, 2012 at 7:04 am
That would work if all you had was the date in different formats. However, the main purpose that I had to write the scripts was because we have a database that is varchar(500) and the user enters with the narrative a data value. I'm extracting if found this value.
January 17, 2013 at 7:03 am
-- Convert to an inline table-valued function: more efficient.
-- minimise the work done to identify a date within a string.
DECLARE @InputString NVARCHAR(500) = 'ABC01/01/2013XYZ'
SELECT
DateString = CASE WHEN ISDATE(PatternString) = 1 THEN PatternString ELSE NULL END
FROM (
-- look for a date pattern in the string, stop if found.
SELECT PatternString = CASE
-- date format as 01/01/2012
WHEN PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',@InputString) > 0
THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',@InputString),StringLen),10)
-- date format as 1/01/2012
WHEN PATINDEX('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',@InputString) > 0
THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',@InputString),StringLen),9)
-- date format as 01/1/2012
WHEN PATINDEX('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9]%',@InputString) > 0
THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9]%',@InputString),StringLen),9)
-- date format as 1/1/2012
WHEN PATINDEX('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9]%',@InputString) > 0
THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9]%',@InputString),StringLen),8)
-- date format as 01/11/12
WHEN PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9]%',@InputString) > 0
THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9]%',@InputString),StringLen),8)
-- date format as 01/1/12
WHEN PATINDEX('%[0-9][0-9]/[0-9]/[0-9][0-9]%',@InputString) > 0
THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9][0-9]/[0-9]/[0-9][0-9]%',@InputString),StringLen),8)
-- date format as 1/11/12
WHEN PATINDEX('%[0-9]/[0-9][0-9]/[0-9][0-9]%',@InputString) > 0
THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9]/[0-9][0-9]/[0-9][0-9]%',@InputString),StringLen),7)
-- date format as 1/1/12
WHEN PATINDEX('%[0-9]/[0-9]/[0-9][0-9]%',@InputString) > 0
THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9]/[0-9]/[0-9][0-9]%',@InputString),StringLen),6)
-- another 80 or so options...
ELSE NULL END
FROM (SELECT StringLen = LEN(@InputString)) s
) d
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 17, 2013 at 8:02 am
I would never write a so complicated script for this simple functionality. It's too hard to maintain.
January 17, 2013 at 8:04 am
bli-963763 (1/17/2013)
I would never write a so complicated script for this simple functionality. It's too hard to maintain.
I'd very much like to see your simpler version.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 17, 2013 at 8:13 am
Hard to maintain it's really not an option when it comes to working with databases, you got to do what you got to do. Data entered by users can be very tricky and whether you like it or not as a DBA your job is to maintain it, easy or difficult it's part of the job.
What you have to consider is what's the simplest and better out of all the options. ChrisM's version believe it or not is easier and simpler to maintain than the version I wrote.
BTW thanks ChrisM.
January 17, 2013 at 8:17 am
HildaJ (1/17/2013)
Hard to maintain it's really not an option when it comes to working with databases, you got to do what you got to do. Data entered by users can be very tricky and whether you like it or not as a DBA your job is to maintain it, easy or difficult it's part of the job.What you have to consider is what's the simplest and better out of all the options. ChrisM's version believe it or not is easier and simpler to maintain than the version I wrote.
BTW thanks ChrisM.
You're welcome HildaJ. I can easily understand why you wrote this piece of code. Finding a date within a string is trivial if it's in a fixed format.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 17, 2013 at 8:53 am
ChrisM@Work (1/17/2013)
bli-963763 (1/17/2013)
I would never write a so complicated script for this simple functionality. It's too hard to maintain.I'd very much like to see your simpler version.
I might try something like this (trying to taking advantage of SQL Server's IsDate function):
-- Find all possible dates in a string column
;with Tally as (
-- Celko-style derived numbers table
select a.n
+ b.n * 10
+ c.n * 100
+ 1 as n
from (select 0 as n union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) a
, (select 0 as n union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) b
, (select 0 as n union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) c
)
select i.n as SubstringIndex
, l.n as SubstringLen
, substring(t.MyColumn, i.n, l.n) as CandidateSubstring
, cast(substring(t.MyColumn, i.n, l.n) as datetime) as InterpretedDateTime
from Tally i
cross join Tally l
cross join MyTable t
where 1 = 1
and i.n + l.n - 1 <= len(t.MyColumn)
and isdate(substring(t.MyColumn, i.n, l.n)) = 1
order by l.n desc, i.n
This is close to the simplest form that I can come up with, though it could be optimized with smarter where/join clauses (what is the max len of a date, etc.). Also, using replace for month names/"st"/"th" might be smart as well, as would using Jeff Moden's Tally table instead of the CTE above. And again, we're at the mercy of IsDate, so would have to massage the data (possibly using replace and such) to fit the mold.
That said, the OP has a great function. Fun stuff. Thanks!
January 17, 2013 at 9:08 am
Nice work Hilda - I have fought that dragon too with embedded dates in large text fields.... I like your solution. Since I'm also a dot.net guy my approach is via a CLR. I appreciate your work, thank you 🙂
Tom in Sacramento - For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
January 17, 2013 at 9:29 am
Nice work on the first script and for the follow ons. I have usually managed this on the .net c# side with no considerations for doing it in the SQL world. This adds another tool to the arsenal when we seek to slay the dragon. Thanks.
M.
Not all gray hairs are Dinosaurs!
January 18, 2013 at 1:40 am
Nice one Hilda,
More like a braintrainer, than for any actual purpose I tried to come up with something that could do the same task with a little less elaborate code. I haven't managed to completely mimick your functionality in the 1,5 hours spent. For example this doesn't support formats with month names or abbreviations in it, but I ran out of time before I could do that. Still I did come up with something I'd like to share with you, maybe it gives you some ideas to further improve your function.
declare @InputString nvarchar(500);
select @InputString = '123456789012/31/20132/1/201312345678902013-01-182004-02-29';
with cteNumbers as (
select row_number() over (order by (select null)) as n
from sys.syscolumns sc1
cross join sys.syscolumns sc2
),
ctePatterns as (
select N'[mm]/[dd]/[yyyy]' as pattern, 10 as len, 1 as ms, 2 as ml, 4 as ds, 2 as dl, 7 as ys, 4 as yl, convert(smallint, 0x07) as checks
union all
select N'[m]/[dd]/[yyyy]', 9, 1, 1, 3, 2, 6, 4, 0x07 as checks
union all
select N'[m]/[d]/[yyyy]', 8, 1, 1, 3, 1, 5, 4, 0x07 as checks
union all
select N'[mm]/[d]/[yyyy]', 9, 1, 2, 4, 1, 6, 4, 0x07 as checks
union all
select N'[mm]/[dd]/[yy]', 8, 1, 2, 4, 2, 7, 2, 0x0B as checks
union all
select N'[m]/[dd]/[yy]', 7, 1, 1, 3, 2, 6, 2, 0x0B as checks
union all
select N'[m]/[d]/[yy]', 6, 1, 1, 3, 1, 5, 2, 0x0B as checks
union all
select N'[mm]/[d]/[yy]', 7, 1, 2, 4, 1, 6, 2, 0x0B as checks
union all
select N'[yyyy]-[mm]-[dd]', 10, 6, 2, 9, 2, 1, 4, 0x07 as checks
union all
select N'[yyyy]-[m]-[dd]', 9, 6, 1, 8, 2, 1, 4, 0x07 as checks
union all
select N'[yyyy]-[m]-[d]', 8, 6, 1, 8, 1, 1, 4, 0x07 as checks
union all
select N'[yyyy]-[mm]-[d]', 9, 6, 2, 9, 1, 1, 4, 0x07 as checks
)
select n.n as position,
pat.len,
row_number() over (partition by n.n + pat.len order by pat.len desc),
substring(inp.txt, n.n, pat.len),
pat.pattern,
-- inp.txt,
x.[year],
x.[month],
x.[day]
from (
select @inputString as txt
) inp
cross join cteNumbers n
cross join ctePatterns pat
cross apply (
select replace(replace(replace(replace(replace(replace(
pat.pattern,
'[yyyy]', '[0-9][0-9][0-9][0-9]'),
'[yy]', '[0-9][0-9]'),
'[dd]', '[0-9][0-9]'),
'[d]', '[0-9]'),
'[mm]', '[0-9][0-9]'),
'[m]', '[0-9]') as pattern,
convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) as [month],
convert(int, substring(inp.txt, -1 + n.n + pat.ds, pat.dl)) as [day],
convert(int, substring(inp.txt, -1 + n.n + pat.ys, pat.yl)) as [year]
) x
where n.n < datalength(inp.txt) / 2
and patindex(x.pattern, substring(inp.txt, n.n, pat.len)) > 0
and not exists (
select 1 where pat.checks & 0x01 <> 0x00 and not convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) between 1 and 12
union all
select 1 where pat.checks & 0x02 <> 0x00 and (
convert(int, substring(inp.txt, -1 + n.n + pat.ds, pat.dl)) < 1
or convert(int, substring(inp.txt, -1 + n.n + pat.ds, pat.dl)) >
case
when convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) in (1,3,5,7,8,10,12) then 31
when convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) in (4,6,9,11) then 30
when convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) = 2 then
case when convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) % 4 = 0 then 29
else 28
end
end)
union all
select 1 where pat.checks & 0x04 <> 0x00 and not convert(int, substring(inp.txt, -1 + n.n + pat.ys, pat.yl)) between 1900 and 9999
union all
select 1 where pat.checks & 0x08 <> 0x00 and not convert(int, substring(inp.txt, -1 + n.n + pat.ys, pat.yl)) between 0 and 99
)
order by n.n + pat.len,
3;
The output is as follows:
positionlen(No column name)(No column name)patternyearmonthday
118112/31/20[mm]/[dd]/[yy]201231
1110112/31/2013[mm]/[dd]/[yyyy]20131231
21612/1/20[m]/[d]/[yy]2021
21812/1/2013[m]/[d]/[yyyy]201321
39912013-01-1[yyyy]-[mm]-[d]201311
391012013-01-18[yyyy]-[mm]-[dd]2013118
49912004-02-2[yyyy]-[mm]-[d]200422
January 18, 2013 at 7:12 am
Very cool, and I must ask, did you do this for fun? 😀 Pretty amazing. The more I learn the more I feel there's so much out there that I don't know. I'll have to make time to incorporate everyones suggestions into mine.
Thanks.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply