CASE with LEN Evaluation

  • I'm struggling with this bit of code within a case Statement:

    CASE
        WHEN (LEN(([Col1]))) = 1 AND [Col1] = '0' THEN '99991231'
        WHEN (LEN(([Col1]))) = 3 AND [Col1] = '200' THEN '99991231' 
        WHEN (LEN(([Col1]))) = 3 AND [Col1] != '200' THEN '20000'+ [Col1]
        WHEN (LEN(([Col1]))) = 4 AND [Col1] != '0' OR [Col1] != '200' THEN '2000' + [Col1]
        WHEN (LEN(([Col1]))) = 5 AND [Col1] != '0' OR [Col1] != '200' THEN '200' + [Col1]
        WHEN (LEN((Col1]))) = 6 AND LEFT([Col1],1) = '9' THEN '19' + [Col1] ELSE '20' + [Col1]

    I'm struggling with the last case when the Column has a length of 6, check the first value of the string if there is a '9'

    what is a good approach for this?

    Cheers

  • DinoRS - Friday, February 1, 2019 6:37 AM

    I'm struggling with this bit of code within a case Statement:

    CASE
        WHEN (LEN(([Col1]))) = 1 AND [Col1] = '0' THEN '99991231'
        WHEN (LEN(([Col1]))) = 3 AND [Col1] = '200' THEN '99991231' 
        WHEN (LEN(([Col1]))) = 3 AND [Col1] != '200' THEN '20000'+ [Col1]
        WHEN (LEN(([Col1]))) = 4 AND [Col1] != '0' OR [Col1] != '200' THEN '2000' + [Col1]
        WHEN (LEN(([Col1]))) = 5 AND [Col1] != '0' OR [Col1] != '200' THEN '200' + [Col1]
        WHEN (LEN((Col1]))) = 6 AND LEFT([Col1],1) = '9' THEN '19' + [Col1] ELSE '20' + [Col1]

    I'm struggling with the last case when the Column has a length of 6, check the first value of the string if there is a '9'

    what is a good approach for this?

    Cheers

    Can you post the DDL (create table) script, sample data as an insert statement and the expected results please?
    😎

  • It appears to be some sort of string manipulation to standardize a date.

    In the final case:
    - missing a bracket:  WHEN (LEN(([Col1])))
    - Appears to make the assumption that if the column length is 6 characters and starts with a 9, then the date would be in the 1990's or else 2000 or later.
    But it could potentially have a flaw depending on the data, see example three below.
    ex1: 951231 => 19951231
    ex2: 051231 => 20051231
    ex3: 851231 => 20851231

    Is this what you are looking for or is it something else?

  • there is not much of a DDL to post here, it's a nchar(8) field and it stores either:

    0, 200, or some 3 - 6 String length combinations of what once was a date in YYYYMMDD format, I want to restore that format essentially. 0 and 200 translate to '99991231' in any case.
    special case with LEN == 6 is dates can start as far back as 1990 so If I get a value like '901212' this once did represent '19901212' on the other hand a '101212' means it's '20101212'
    LEN == 5 Example: '91212' equals '20091212'
    LEN == 4 Example: '1212' equals '20001212'
    LEN == 3 Example: '212' equals '20000212'

    exeptions are as said 200 and 0. LEN == 1 should always be delivered as 0 so only '99991212' is in question.

    Hope this makes it more clear.

    Edit: indeed Rob it is and it simply was that leading bracket which I missed 🙂 However results are not as I would expect, I see a lot of dates like '2000190116' so something seems still off.

  • Just realized you may be having troubles with the ELSE clause where it may need to be another WHEN statement i.e.: 
    WHEN (LEN(([Col1]))) = 6 AND LEFT([Col1],1) <> '9' THEN '20' + [Col1]

  • I do have still troubles, as far as I see he can't evaluate properly for some reason,

     I mainly see LEN ==6 values off like: '2000190116' which should be translated to '20190116' instead, not sure why he thinks it's a LEN == 4 value

  • I'm not sure what the problem is, but you're certainly over complicating the conditions. This is a simplified version of your original code.

    CASE
      WHEN [Col1] = '0' THEN '99991231'
      WHEN [Col1] = '200' THEN '99991231'
      WHEN LEN([Col1]) = 3 THEN '20000'+ [Col1]
      WHEN LEN([Col1]) = 4 THEN '2000' + [Col1]
      WHEN LEN([Col1]) = 5 THEN '200' + [Col1]
      WHEN LEN([Col1]) = 6 AND [Col1] LIKE '9%' THEN '19' + [Col1]
      ELSE '20' + [Col1] END

    DinoRS - Friday, February 1, 2019 7:16 AM

    I do have still troubles, as far as I see he can't evaluate properly for some reason,

     I mainly see LEN ==6 values off like: '2000190116' which should be translated to '20190116' instead, not sure why he thinks it's a LEN == 4 value

    Do you have any leading spaces?

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares - Friday, February 1, 2019 7:29 AM

    I'm not sure what the problem is, but you're certainly over complicating the conditions. This is a simplified version of your original code.

    CASE
      WHEN [Col1] = '0' THEN '99991231'
      WHEN [Col1] = '200' THEN '99991231'
      WHEN LEN([Col1]) = 3 THEN '20000'+ [Col1]
      WHEN LEN([Col1]) = 4 THEN '2000' + [Col1]
      WHEN LEN([Col1]) = 5 THEN '200' + [Col1]
      WHEN LEN([Col1]) = 6 AND [Col1] LIKE '9%' THEN '19' + [Col1]
      ELSE '20' + [Col1] END

    DinoRS - Friday, February 1, 2019 7:16 AM

    I do have still troubles, as far as I see he can't evaluate properly for some reason,

     I mainly see LEN ==6 values off like: '2000190116' which should be translated to '20190116' instead, not sure why he thinks it's a LEN == 4 value

    Do you have any leading spaces?

    To Luis's point, here is an example
    😎


    USE TEEST;
    GO
    SET NOCOUNT ON;

    ;WITH SAMPLE_DATA AS
    (
      SELECT 1 AS XID ,'123' AS XSTR , 3 AS ACTLEN UNION ALL
      SELECT 2   ,' 123'   ,4     UNION ALL
      SELECT 3   ,'123 '   ,4     UNION ALL
      SELECT 4   ,' 123 '  ,5     UNION ALL
      SELECT 5   ,' 123'   ,4     UNION ALL
      SELECT 6   ,'123 '   ,6
    )
    SELECT
      SD.XID
     ,CONCAT('"',SD.XSTR,'"'") AS XSTR
     ,SD.ACTLEN
     ,LEN(SD.XSTR) AS XLEN
    FROM SAMPLE_DATA  SD;

    Output

    XID    XSTR     ACTLEN  XLEN
    ------ -------- ------- -----
    1         "123"       3    3
    2        " 123"       4    4
    3        "123 "       4    3
    4       " 123 "       5    4
    5        " 123"       4    5
    6      "123   "       6    3

  • Sorry - don't have a much time to explain further at the moment; but hope the examples may show revisions/use of LTRIM & RTRIM if/when needed.

    DECLARE @date table (rowid int IDENTITY(1,1), Col1 nchar(8))
    ;
    INSERT @date (Col1)
    VALUES ('0'),('200'),('91212'),('1212'),('212'),(' 190116'),('050505 ')
    ;

    SELECT    d.rowid, d.Col1, LEN(d.Col1)
    FROM    @date AS d
    ;

    UPDATE d
    SET    d.Col1 =
            CASE
                WHEN LEN(d.Col1) = 1 AND d.Col1 = '0' THEN '99991231'
                WHEN LEN(d.Col1) = 3 AND d.Col1 = '200' THEN '99991231'
                WHEN LEN(d.Col1) = 3 AND d.Col1 <> '200' THEN '20000' + d.Col1
                WHEN LEN(d.Col1) = 4 AND ( d.Col1 <> '0' OR d.Col1 <> '200') THEN '2000' + d.Col1
                WHEN LEN(d.Col1) = 5 AND ( d.Col1 <> '0' OR d.Col1 <> '200') THEN '200' + d.Col1
                WHEN LEN(d.Col1) = 6 AND LEFT(d.Col1, 1) = '9' THEN '19' + d.Col1
                WHEN LEN(d.Col1) = 6 AND LEFT(d.Col1, 1) <> '9' THEN '20' + d.Col1
                ELSE d.Col1
            END
    FROM    @date AS d
    ;

    SELECT    d.rowid, d.Col1, LEN(d.Col1)
    FROM    @date AS d
    ;

    UPDATE d
    SET    d.Col1 =
            CASE
                WHEN LEN(LTRIM(RTRIM(d.Col1))) = 1 AND d.Col1 = '0' THEN '99991231'
                WHEN LEN(LTRIM(RTRIM(d.Col1))) = 3 AND d.Col1 = '200' THEN '99991231'
                WHEN LEN(LTRIM(RTRIM(d.Col1))) = 3 AND d.Col1 <> '200' THEN '20000' + LTRIM(RTRIM(d.Col1))
                WHEN LEN(LTRIM(RTRIM(d.Col1))) = 4 AND ( d.Col1 <> '0' OR d.Col1 <> '200') THEN '2000' + LTRIM(RTRIM(d.Col1))
                WHEN LEN(LTRIM(RTRIM(d.Col1))) = 5 AND ( d.Col1 <> '0' OR d.Col1 <> '200') THEN '200' + LTRIM(RTRIM(d.Col1))
                WHEN LEN(LTRIM(RTRIM(d.Col1))) = 6 AND LEFT(d.Col1, 1) = '9' THEN '19' + LTRIM(RTRIM(d.Col1))
                WHEN LEN(LTRIM(RTRIM(d.Col1))) = 6 AND LEFT(d.Col1, 1) <> '9' THEN '20' + LTRIM(RTRIM(d.Col1))
                ELSE d.Col1
            END
    FROM    @date AS d
    WHERE    LTRIM(RTRIM(d.Col1)) <> 8
    ;

    SELECT    d.rowid, d.Col1, LEN(d.Col1)
    FROM    @date AS d
    ;

  • Luis Cazares - Friday, February 1, 2019 7:29 AM

    I'm not sure what the problem is, but you're certainly over complicating the conditions. This is a simplified version of your original code.

    CASE
      WHEN [Col1] = '0' THEN '99991231'
      WHEN [Col1] = '200' THEN '99991231'
      WHEN LEN([Col1]) = 3 THEN '20000'+ [Col1]
      WHEN LEN([Col1]) = 4 THEN '2000' + [Col1]
      WHEN LEN([Col1]) = 5 THEN '200' + [Col1]
      WHEN LEN([Col1]) = 6 AND [Col1] LIKE '9%' THEN '19' + [Col1]
      ELSE '20' + [Col1] END

    DinoRS - Friday, February 1, 2019 7:16 AM

    I do have still troubles, as far as I see he can't evaluate properly for some reason,

     I mainly see LEN ==6 values off like: '2000190116' which should be translated to '20190116' instead, not sure why he thinks it's a LEN == 4 value

    Do you have any leading spaces?

    I was going to say the same thing as Luis, but I came up with an ever so slightly more concise approach.  More importantly, I wanted to explain why the simplifications work.
    CASE
        WHEN [Col1] = '0' THEN '99991231'
            /*  If [Col1] = '0' then the length MUST be 1 and you don't have to test for it.  */
        WHEN [Col1] = '200' THEN '99991231' 
            /*  If [Col1] = '200' then the length MUST be 3 and you don't have to test for it.  */
        WHEN (LEN(([Col1]))) = 3 THEN '20000'+ [Col1]
            /*  If you've gotten to this point in the CASE expression, the first two tests have failed.
                Col1 CANNOT be equal to 0 or 200, so you don't have to test for this.  */
        WHEN (LEN(([Col1]))) = 4 THEN '2000' + [Col1]
            /*  If you've gotten to this point in the CASE expression, the first three tests have failed.
                Col1 CANNOT be equal to 0 or 200, so you don't have to test for this.  */
        WHEN (LEN(([Col1]))) = 5 THEN '200' + [Col1]
        WHEN [Col1] NOT LIKE '9%' THEN '20' + [Col1]
            /*  Switching the order of the last WHEN clause and the ELSE clause means you can simplify the condition. */
        ELSE '19' + [Col1]
    END

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • DinoRS - Friday, February 1, 2019 7:16 AM

    I do have still troubles, as far as I see he can't evaluate properly for some reason,

     I mainly see LEN ==6 values off like: '2000190116' which should be translated to '20190116' instead, not sure why he thinks it's a LEN == 4 value

    If you know the range and the formats of these strings, why not just build a lookup table? I'm assuming that the range of these dates is less than 500 years. The last column could be a DATE data type and save you the trouble of casting..

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • A shorter alternative:


    SELECT [Col1],
        CASE
            WHEN [Col1] = '0' OR [Col1] = '200' THEN '99991231'
            WHEN LEN([Col1]) = 6 THEN CASE WHEN LEFT([Col1], 1) = '9' THEN '19' ELSE '20' END + [Col1]
            ELSE LEFT('2000', 8 - LEN([Col1])) + CASE WHEN LEN([Col1]) = 3 THEN '0' ELSE '' END + [Col1]
        END AS [Col1_Date]
    FROM ( VALUES
        ('0'), ('200'),
        ('20170201'), ('0170201'), /*testing just in case longer dates sneak in somehow*/
        ('901212'), ('101212'),
        ('91212'), ('1212'), ('212') ) AS test_dates([Col1])

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Dear thank you everyone who's participating.

    First I want to say I also had potential leading / trailing spaces in suspicion as the base data type was nchar(8) but (correct me if I'm wrong here) that part only kicks in / requires trimming if you would want to compare the values against some other values but as we're only looking at content and "meta data" and evaluate it against itself, TRIM takes no effect nor does it make any difference. I ran what we've figured out before with and without L/RTRIMming things, did not make any difference at all.

    Yeah I'm definitely overcomplicating things at times, thats why all your inputs are so valuable to me, thank you!

    Needless to say Drews' approach works like a charm and so does Luis' approach, the only difference I see is that Luis' approach seems to be 1 second faster @ 330k rows on SQL 2014 than Drews'

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply