PatIndex

  • Problem: Trying to pick out two digit day from a string in a Desc column and concatenating with a month column and a year column from another table.

    Here is the scenario...

    Table 1 (No day column in this table)

    --------

    YEAR(Decimal) | MONTH(Decimal)

    2011 | 02

    Table 2

    --------

    DESCRIPTION(Varchar)

    020511 Covers

    123110 Pulling

    So, I am trying to take the '05' or '31' from the Desc column in Table 2 and concatenate with Month and Year column from Table 1.

    Inital attempt (Didnt work...Got The conversion of a varchar data type to a datetime data type resulted in an out-of-range value) below...

    Select case when (select PatIndex('%[0-9]%',left(Table2.DESC,6))) = 1 then cast(rtrim(Table1.Year*10000+Table1.Month*100+convert(nvarchar,right(left(Table2.DESC,4),2)))

    as datetime) else 0 end

    Thoughts?

    Thanks!

  • I'm not sure I would do it this way, but with mixed types you have challenges.

    Try this:

    Select case when PatIndex('%[^0-9]%',left(Table2.[DESC], 6)) = 0 then

    CONVERT(DATETIME, CAST(Table1.YEAR * 10000 + Table1.Month * 100 + CAST(SUBSTRING(Table2.[DESC], 3, 2) AS INT) AS CHAR(8)), 112)

    else 0 end

    Note: I changed your PatIndex function call to essentially be a NOT LIKE not a number. Essentially it will return the position of the first character that is not a number, so if it returns anything other than 0 the first 6 characters are not all numbers. (Since your version didn't work like I thought you wanted it.)

  • Thanks for the attempt UMG Developer. Still coming back with the same error. Getting some of the records back, but the majority are falling out of the result set because of the The conversion of a varchar data type to a datetime data type resulted in an out-of-range value error.

  • Do the select without the cast to datetime and see what data comes back. If you convert everything to varchar at first, you can start to look for places where the pattern breaks down and then try to see if you can solve those with a WHERE clause or some other error handling.

    If the value always a 2 digit month, then a day? If so, do you need patindex? Can you use use substring?

  • thamatrix1436 (4/7/2011)


    Thanks for the attempt UMG Developer. Still coming back with the same error. Getting some of the records back, but the majority are falling out of the result set because of the The conversion of a varchar data type to a datetime data type resulted in an out-of-range value error.

    What I provided should work as long as the data is always in the format you specified. The most likely problem is there is something wrong in your data such that characters 3-4 aren't a valid day for the Year and Month specified.

    Say the Month in your table is February, but you are pull the day from 033110. There is no February 31st so that will always fail. This is probably your problem, what do you want as a result when you try to combine day 31 with month February?

    Modify this to work with your data and then run it, that should help show you where some of the problems are:

    Select case when PatIndex('%[^0-9]%',left(Table2.[DESC], 6)) = 0 then

    CAST(Table1.YEAR * 10000 + Table1.Month * 100 + CAST(SUBSTRING(Table2.[DESC], 3, 2) AS INT) AS CHAR(8))

    else null END

    FROM

    <your tables/joins>

    WHERE

    ISDATE(CASE WHEN PatIndex('%[^0-9]%',left(Table2.[DESC], 6)) = 0 THEN

    CAST(Table1.YEAR * 10000 + Table1.Month * 100 + CAST(SUBSTRING(Table2.[DESC], 3, 2) AS INT) AS CHAR(8))

    ELSE NULL END) = 0;

    Note: ISDATE isn't perfect, so it is possible you still won't see the problem, so you could take the WHERE out of the query and run it to see what date is invalid and then figure out how you want to deal with it.

    If you need further help we would need to see the data that is causing the problem.

  • Steve Jones - SSC Editor (4/7/2011)


    If the value always a 2 digit month, then a day? If so, do you need patindex? Can you use use substring?

    Steve,

    The way I saw the PatIndex being used was to verify that the firsth 6 characters are all numbers, it isn't being used to figure out where to pull stuff from the string.

  • If this question is a joke -> heheh <= I'm god on FEBRUARY 30th+ (leapy).

    Well, this code will work (hey it's tested 😉

    declare @t1 table(y decimal(4), m decimal(2))

    declare @t2 table(d varchar(max))

    set nocount on

    insert @t1(y,m)

    values (2011,2)

    insert @t2(d)

    values ('020511'),

    ('123110'),

    ('bc2conquest')

    Select CASE WHEN PatIndex('%[0-9]%',left(t2.d,6)) = 1

    THEN dateadd(dd,cast(substring(t2.d,3,2) as int)-1,cast(cast(t1.y*100+t1.m as varchar)+'01' as datetime))

    ELSE NULL END

    FROM @t1 t1,@t2 t2

  • Flexdog (4/7/2011)


    Well, this code will work (hey it's tested 😉

    That is an interesting solution, but we don't have specification for how the situation should be handled, maybe they want it to be the last day of the month instead of stepping in to the next month...

  • Agree, boundaries cases always tricky and usually obmitted in specs ...

  • So the main problem with this situation is that the data is stored in a DESC column that is a varchar field. And I really only care about the first 6 characters being numbers because those 6 characters are being added to the DESC entry to signify the actual date the entry was done and entered into the system. The problem is when the other table was created, it was created with a YEAR and MONTH column, but no date column, thus why after verifying that the first 6 characters are in fact numbers, I was attempting to pull out the 2 digit DAY and concatenate with the other 2 columns. After doing some investigate of the data this weekend, I have just determined that I am going to use the first 6 characters and not even worry about the concatenation piece. However, as I mentioned above, the problem is that the data is in a DESC, so the data is entered in whatever format they choose. I will give some data here (with all the info changed for privacy but the format is how it is in the table) to show the various ways its entered. I have noted with ** which rows I would successfully select for the dates. All other rows would be given a static date of "year+month+01" to signify that it doesnt have a real date. There are other rows that contain dates, like the row with "03/16/11" or the row that has "121910" end the end of the string, but I think that will just be too many various formats to try to search for just to get the date. So, my overall concern is still the same, using PatIndex to verify the first 6 numbers, take those 6 characters and convert them to datetime.

    DESC How I want Date

    abcdefgher 10020110101

    abcdefgher 10020110101

    abcdefgher 10020110101

    abcdefgher 1020110101

    **020811 abcdef20110208

    **022411 hgysjekn20110224

    jusjejsmt 83/1720110101

    justjensn 83/1720110101

    paper 200020110101

    paper 200020110101

    abdece DEC10Pmt20101201

    abcjdej DEC10Pmt20101201

    03/16/11 andoshe20110301

    **031711 usnehtnsu20110317

    testing 20110301

    testing 20110301

    data/data20110301

    data/data20110301

    data/data20110301

    data/data20110301

    data/data20110301

    data/data20110301

    data/data20110301

    frtuensht 12191020101201

    testing 1222stol20101201

    **021011 Truthserul20110210

    I think, using a piece of UMG Developer code, that I was able to get the first 6 characters correctly (thank you btw) but now cant convert them to a date...

    Select CASE WHEN PatIndex('%[^0-9]%',left(DESC, 6)) = 0 THEN

    cast(YEAR*10000 + MNTH*100 + CAST(SUBSTRING(DESC, 3, 2) AS int) AS varchar(10))

    else cast(YEAR*10000 + MNTH*100 + '01' as varchar(8)) end

    Thanks for all your help commenters. Not the cleanest data situation but when is it really!

  • thamatrix1436 (4/11/2011)


    I think, using a piece of UMG Developer code, that I was able to get the first 6 characters correctly (thank you btw) but now cant convert them to a date...

    Select CASE WHEN PatIndex('%[^0-9]%',left(DESC, 6)) = 0 THEN

    cast(YEAR*10000 + MNTH*100 + CAST(SUBSTRING(DESC, 3, 2) AS int) AS varchar(10))

    else cast(YEAR*10000 + MNTH*100 + '01' as varchar(8)) end

    That is easy, you just have to add some separators and CONVERT it, here is a sample:

    SELECT

    [Desc],

    CASE WHEN PatIndex('%[^0-9]%',left([DESC], 6)) = 0 THEN

    CONVERT(DATETIME, STUFF(STUFF([DESC], 3, 0, '/') , 6, 0, '/'), 1)

    END AS DatePulledFromFirst6

    FROM (SELECT '031611' AS [Desc] UNION ALL SELECT 'Bad Data') a;

    You probably want to include the PatIndex in the WHERE clause so you only return records with the first 6 characters being a date.

    Thanks for all your help commenters. Not the cleanest data situation but when is it really!

    Your welcome! Bad data is always a PIA, I often spend more time cleaning data up than actually running the report.

  • Very odd. Got an error trying to do the conversion...

    Conversion failed when converting date and/or time from character string.

    And thats w/ and w/o the PatIndex in the WHERE clause.

  • thamatrix1436 (4/11/2011)


    Very odd. Got an error trying to do the conversion...

    Conversion failed when converting date and/or time from character string.

    And thats w/ and w/o the PatIndex in the WHERE clause.

    Not really, it just means that you have cases where the first 6 characters are all numbers, but it still isn't a valid date. It can be difficult to identify the offending rows, but this might help:

    WITH CTE AS (SELECT

    [Desc],

    CASE WHEN PatIndex('%[^0-9]%',left([DESC], 6)) = 0 THEN

    STUFF(STUFF([DESC], 3, 0, '/') , 6, 0, '/')

    END AS DateStringFromFirst6

    FROM (SELECT '021511' AS [Desc]

    UNION ALL SELECT '023111'

    UNION ALL SELECT 'Bad Data') a)

    SELECT

    *

    FROM CTE

    WHERE

    ISDATE(DateStringFromFirst6) = 0;

  • UMG Developer (4/11/2011)


    thamatrix1436 (4/11/2011)


    Very odd. Got an error trying to do the conversion...

    Conversion failed when converting date and/or time from character string.

    And thats w/ and w/o the PatIndex in the WHERE clause.

    Not really, it just means that you have cases where the first 6 characters are all numbers, but it still isn't a valid date. It can be difficult to identify the offending rows, but this might help:

    WITH CTE AS (SELECT

    [Desc],

    CASE WHEN PatIndex('%[^0-9]%',left([DESC], 6)) = 0 THEN

    STUFF(STUFF([DESC], 3, 0, '/') , 6, 0, '/')

    END AS DateStringFromFirst6

    FROM (SELECT '021511' AS [Desc]

    UNION ALL SELECT '023111'

    UNION ALL SELECT 'Bad Data') a)

    SELECT

    *

    FROM CTE

    WHERE

    ISDATE(DateStringFromFirst6) = 0;

    Edited to remove incorrect statement as I missed the ^ :blush:

    However, I think the following is a bit easier to follow

    CASE WHEN [DESC] LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]%' THEN

    Far away is close at hand in the images of elsewhere.
    Anon.

  • David Burrows (4/12/2011)


    However, I think the following is a bit easier to follow

    CASE WHEN [DESC] LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]%' THEN

    True, I was just following his example, and the thread is called PatIndex. 😉

    Even better would be to add more logic to help filter out more invalid dates:

    CASE WHEN [DESC] LIKE '[0-1][0-2][0-3][0-9][0-9][0-9]%' THEN

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply