How do we avoid junk in the VARCHAR fields?

  • This is my update statement.

    UPDATE [MEMBER].[IMP_MEMBER_INT]
    SET
     [ContactLastName]      = CASE WHEN LEN(ISNULL(LNAME,'')) > 1 THEN LEFT(LTRIM(RTRIM(LNAME)),60) ELSE @DefOfNull END
    ,[ContactFirstName]     = CASE WHEN LEN(ISNULL(FNAME,'')) > 1 THEN LEFT(LTRIM(RTRIM(FNAME)),35) ELSE @DefOfNull END
    ,[ContactMiddleInitial] = CASE WHEN LEN(ISNULL(MINTL,'')) > 1 THEN LEFT(LTRIM(RTRIM(MINTL)),1) ELSE @DefOfNull END
    ,ContactAddress1        = CASE WHEN ISNULL(MLAD1,'') > '' THEN LEFT(LTRIM(RTRIM(MLAD1)),55) ELSE @DefOfNull END
    ,ContactAddress2        = CASE WHEN ISNULL(MLAD2,'') > '' THEN LEFT(LTRIM(RTRIM(MLAD2)),55) ELSE @DefOfNull END
    ,ContactCity            = CASE WHEN ISNULL(MLCTY,'') > '' THEN LEFT(LTRIM(RTRIM(MLCTY)),30) ELSE @DefOfNull END
    ,ContactState           = CASE WHEN ISNULL(MLSTAT,'') > '' THEN LEFT(LTRIM(RTRIM(MLSTAT)),02) ELSE @DefOfNull END
    ,ContactZipCode         = CASE WHEN ISNULL(MLZIP,'') > '' THEN LEFT(LTRIM(RTRIM(MLZIP)),15) ELSE @DefOfNull END
    FROM [MEMBER].[IMP_MEMBER_INT]
    INNER JOIN [mhpdw].TransferDB.[DBO].MHPFamily ON
    [HIC_NUMBER] COLLATE DATABASE_DEFAULT = LTRIM(RTRIM(EMPNO)) COLLATE DATABASE_DEFAULT

    However, take a look at the attached image.

    When I do a SELECT, why do I get blank fields? (I have circled them in the image.)

    I guess those fields contain something other than plain space characters (perhaps control characters that are not visible).

    So how do we write the update statement so that ContactFirstName ends up with either a readable value or NULL?
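
    One way to check the guess about invisible characters is to look at the raw bytes of the source column. The following is only a sketch: the table variable and its sample rows are made up for illustration, and FNAME stands in for the real source column.

    ```sql
    -- Hypothetical sample data; replace @Sample with the real source table.
    DECLARE @Sample TABLE (FNAME VARCHAR(35) NULL);

    INSERT INTO @Sample (FNAME)
    VALUES ('John'),
           ('   '),                 -- spaces only
           (CHAR(9) + CHAR(9)),     -- two tabs
           (CHAR(13) + CHAR(10)),   -- carriage return + line feed
           (NULL);

    SELECT FNAME,
           LEN(FNAME)                    AS LenValue,     -- LEN ignores trailing spaces
           DATALENGTH(FNAME)             AS BytesStored,  -- counts every stored byte
           CONVERT(VARBINARY(35), FNAME) AS RawBytes,     -- 0x20 = space, 0x09 = tab, 0x0D0A = CR/LF
           CASE WHEN FNAME LIKE '%[A-Za-z0-9]%'
                THEN 'readable'
                ELSE 'blank, junk or NULL'
           END                           AS Verdict
    FROM @Sample;
    ```

    Tabs and line breaks are not spaces, so LTRIM and RTRIM leave them alone and LEN counts them. A value made of two tabs therefore passes a `LEN(...) > 1` test and is written to the target column, where it looks blank in the results grid.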

  • What are the values of LNAME, FNAME and MINTL for those two rows?

    John

  • Maybe like this? Why do you have different validations for names and other values?

    UPDATE [MEMBER].[IMP_MEMBER_INT]
    SET
     [ContactLastName]      = CASE WHEN LEN(LNAME) > 1 THEN LEFT(LTRIM(LNAME),60) ELSE @DefOfNull END
    ,[ContactFirstName]     = CASE WHEN LEN(FNAME) > 1 THEN LEFT(LTRIM(FNAME),35) ELSE @DefOfNull END
    ,[ContactMiddleInitial] = CASE WHEN LEN(MINTL) > 1 THEN LEFT(LTRIM(MINTL),1) ELSE @DefOfNull END
    ,ContactAddress1        = CASE WHEN MLAD1 > '' THEN LEFT(LTRIM(MLAD1),55) ELSE @DefOfNull END
    ,ContactAddress2        = CASE WHEN MLAD2 > '' THEN LEFT(LTRIM(MLAD2),55) ELSE @DefOfNull END
    ,ContactCity            = CASE WHEN MLCTY > '' THEN LEFT(LTRIM(MLCTY),30) ELSE @DefOfNull END
    ,ContactState           = CASE WHEN MLSTAT > '' THEN LEFT(LTRIM(MLSTAT),02) ELSE @DefOfNull END
    ,ContactZipCode         = CASE WHEN MLZIP > '' THEN LEFT(LTRIM(MLZIP),15) ELSE @DefOfNull END
    FROM [MEMBER].[IMP_MEMBER_INT]
    INNER JOIN [mhpdw].TransferDB.[DBO].MHPFamily ON
    [HIC_NUMBER] COLLATE DATABASE_DEFAULT = LTRIM(EMPNO) COLLATE DATABASE_DEFAULT
    WHERE LNAME LIKE '%[A-Za-z]%'
    OR FNAME LIKE '%[A-Za-z]%'
    OR MINTL LIKE '%[A-Za-z]%';

    I eliminated unnecessary functions.
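
    To see why the ISNULL wrapper can go, the two forms can be compared side by side. This is a small sketch on made-up rows, and it assumes @DefOfNull is simply NULL, which the thread does not state.

    ```sql
    -- Assumption: the default used for "no value" is NULL.
    DECLARE @DefOfNull VARCHAR(10) = NULL;

    -- Hypothetical sample rows standing in for the source column.
    DECLARE @Sample TABLE (LNAME VARCHAR(60) NULL);
    INSERT INTO @Sample (LNAME)
    VALUES ('Smith   '), (NULL), (''), ('  ');

    SELECT LNAME,
           -- Original form: ISNULL turns NULL into '' before LEN is applied.
           CASE WHEN LEN(ISNULL(LNAME,'')) > 1
                THEN LEFT(LTRIM(RTRIM(LNAME)),60) ELSE @DefOfNull END AS OriginalForm,
           -- Simplified form: LEN(NULL) is NULL, the comparison is UNKNOWN,
           -- so the CASE falls through to ELSE just the same.
           CASE WHEN LEN(LNAME) > 1
                THEN LEFT(LTRIM(LNAME),60) ELSE @DefOfNull END        AS SimplifiedForm
    FROM @Sample;
    ```

    Both expressions return NULL for the NULL, empty and all-space rows. The only difference is on 'Smith   ': without RTRIM the trailing spaces are kept in the stored value, even though the two results still compare as equal.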

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (8/1/2016)


    Maybe like this? Why do you have different validations for names and other values?

    I eliminated unnecessary functions.

    How about this, repeated for each column with its own width?

    SET [ContactAddress1] = NULLIF(LEFT(LTRIM(MLAD1),55),'')
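
    A quick way to try the NULLIF pattern on a few edge cases is sketched below. The sample rows are invented, the width is set to 55 to match ContactAddress1 in the original statement, and the second column is an assumed extension (not from this thread) that swaps tabs and line breaks for spaces before trimming.

    ```sql
    -- Hypothetical sample rows standing in for the source column.
    DECLARE @Sample TABLE (MLAD1 VARCHAR(100) NULL);
    INSERT INTO @Sample (MLAD1)
    VALUES ('  12 Main St'), ('     '), (''), (NULL), (CHAR(9));

    SELECT MLAD1,
           -- Pattern from the post: empty or all-space values become NULL.
           NULLIF(LEFT(LTRIM(MLAD1), 55), '') AS ContactAddress1,
           -- Assumed extension: turn tab, CR and LF into spaces first,
           -- so values made only of those characters also become NULL.
           NULLIF(LEFT(LTRIM(RTRIM(
               REPLACE(REPLACE(REPLACE(MLAD1, CHAR(9), ' '), CHAR(13), ' '), CHAR(10), ' ')
           )), 55), '') AS ContactAddress1Cleaned
    FROM @Sample;
    ```

    NULLIF returns NULL rather than @DefOfNull, so this form fits only when the default is meant to be NULL. A lone tab is expected to survive the first expression under typical collations, which is the case the second expression is meant to cover.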

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

  • mw112009 (8/1/2016)


    This is my update statement.

    Can you post your table definition please....

    Also if you are doing this data "fixing" on a regular basis I would look at your data import process and fix the issues at that end...

  • I am pleased with the reply from Luis Cazares (8/1/2016).

    That works fine for me.

    No need to reply to this post anymore.

    Thanks for stopping by

  • mw112009 (8/3/2016)


    I am pleased with the reply from Luis Cazares (8/1/2016).

    That works fine for me.

    No need to reply to this post anymore.

    Thanks for stopping by

    You should also try Chris' suggestion.

    Luis C.
  • Luis Cazares (8/1/2016)


    Maybe like this? Why do you have different validations for names and other values?

    I eliminated unnecessary functions.

    Actually, depending on the source of the data, RTRIM may be necessary. Although trailing spaces are ignored by most comparisons and by some functions such as LEN, they still exist and still take up space wherever the value is stored. With a lot of rows, the savings can be substantial, not only in space used but also in query performance, because the rows are smaller.

    In this case, any indexes on the affected columns would need to be rebuilt to take advantage of the savings, and that would, of course, include the clustered index.

    I also agree that this type of thing should be done right up front during initial loads.
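
    The point about trailing spaces can be checked with LEN and DATALENGTH. This is a minimal sketch on an invented table variable, assuming the default ANSI_PADDING ON setting so that VARCHAR keeps the trailing spaces it is given.

    ```sql
    -- Hypothetical sample rows; MLCTY stands in for any VARCHAR source column.
    DECLARE @Sample TABLE (MLCTY VARCHAR(30) NULL);
    INSERT INTO @Sample (MLCTY)
    VALUES ('Boston' + SPACE(10)),   -- padded value, as a fixed-width feed might deliver
           ('Boston');               -- already trimmed

    SELECT LEN(MLCTY)               AS LenValue,         -- ignores trailing spaces
           DATALENGTH(MLCTY)        AS BytesStored,      -- includes trailing spaces
           DATALENGTH(RTRIM(MLCTY)) AS BytesAfterRtrim,  -- what would be stored after RTRIM
           CASE WHEN MLCTY = 'Boston' THEN 'equal' ELSE 'different' END AS ComparedToBoston
    FROM @Sample;
    ```

    Both rows report the same LEN and both compare as equal to 'Boston', which is why the padding is easy to miss. Only DATALENGTH shows the extra bytes the untrimmed value occupies.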

    --Jeff Moden

