A real doozie!

  • Microsoft SQL Server 2016 (SP1-GDR) (KB4019089) - 13.0.4206.0 (X64)

    I am performing a very simple table insert and seeing the dreaded
    Msg 8152, Level 16, State 2, Line 669
    String or binary data would be truncated.
    The statement has been terminated.

    I check all my lengths and everything is fine. This is a super simple insert. Only 33 rows! I can see everything, I even do a LEN on every column. Data types match as well.I get the idea to run this through a while loop and, amazingly, it works! Just for kicks I delete my inserts I made with the while loop and try again. Nope! It simply will not insert with the set based approach. What gives?

    This works:

    select 1 --just to set initial count so it runs reliably
    while @@ROWCOUNT > 0
    BEGIN
        INSERT INTO BIOGRAPH_MASTER
        (ID_NUM, GENDER, BIRTH_DTE
        , user_name, job_name, job_time)

        SELECT n.ID_NUM, t.Gender, t.Birth_date
        ,'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME
        from NAME_MASTER n join temp_ad_act_20190208 t
        on n.EMAIL_ADDRESS = t.EMAIL
            left join BIOGRAPH_MASTER b on b.ID_NUM = n.ID_NUM
        where b.ID_NUM is null
            and t.SEQ_NUM =
            (
            select MIN(b.SEQ_NUM)
            from NAME_MASTER a join temp_ad_act_20190208 b
            on a.EMAIL_ADDRESS = b.EMAIL
            left join BIOGRAPH_MASTER c on c.ID_NUM = a.ID_NUM
            where c.ID_NUM is null
            )
    END

    This does NOT work!:

        INSERT INTO BIOGRAPH_MASTER
        (ID_NUM, GENDER, BIRTH_DTE
        , user_name, job_name, job_time)

        SELECT n.ID_NUM, t.Gender, t.Birth_date
        ,'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME
        from NAME_MASTER n join temp_ad_act_20190208 t
        on n.EMAIL_ADDRESS = t.EMAIL
            left join BIOGRAPH_MASTER b on b.ID_NUM = n.ID_NUM
        where b.ID_NUM is null

    I've even run it row by row, without the while loop, for 33 times, and every row inserts just fine. It's driving me nuts! Why won't the good ol' set based insert work? Where am I going wrong here?

  • Could you post one or two "row-by-row" inserts that worked as well as the table definition of BIOGRAPH_MASTER table? There's not enough information here to figure out what's wrong.

  • Please post the table definitions for NAME_MASTER, temp_ad_act_20190208 and BIOGRAPH_MASTER. Or at least post the definitions of the columns used.
    This has probably nothing to do with your problem, but I recommend that you use NOT EXISTS instead of the LEFT JOIN.

    INSERT INTO BIOGRAPH_MASTER
    (
        ID_NUM,
        GENDER,
        BIRTH_DTE,
        user_name,
        job_name,
        job_time
    )
    SELECT n.ID_NUM,
       t.Gender,
       t.Birth_date,
       'TE_ADMIN' as USER_NAME,
       'recruitment_data_insert_v99' as JOB_NAME,
       GETDATE() as JOB_TIME
    FROM NAME_MASTER n
    JOIN temp_ad_act_20190208 t ON n.EMAIL_ADDRESS = t.EMAIL
    WHERE NOT EXISTS( SELECT *
               FROM BIOGRAPH_MASTER b
               WHERE b.ID_NUM = n.ID_NUM);

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Here are the column definitions of anything I believe is relevant.

    NAME_MASTER

    Column_nameTypeComputedLengthPrecScaleNullableTrimTrailingBlanksFixedLenNullInSourceCollation
    ID_NUMintno4100no(n/a)(n/a)NULL
    EMAIL_ADDRESScharno60          yesyesyesSQL_Latin1_General_CP1_CI_AS

    BIOGRAPH_MASTER

    Column_nameTypeComputedLengthPrecScaleNullableTrimTrailingBlanksFixedLenNullInSourceCollation
    ID_NUMintno4100no(n/a)(n/a)NULL
    BIRTH_DTEdatetimeno8          yes(n/a)(n/a)NULL
    GENDERcharno1          yesyesyesSQL_Latin1_General_CP1_CI_AS
    USER_NAMEvarcharno513          yesyesyesSQL_Latin1_General_CP1_CI_AS
    JOB_NAMEvarcharno30          yesyesyesSQL_Latin1_General_CP1_CI_AS
    JOB_TIMEdatetimeno8          yes(n/a)(n/a)NULL

    temp_ad_act_20190208

    Column_nameTypeComputedLengthPrecScaleNullableTrimTrailingBlanksFixedLenNullInSourceCollation
    Last_Namenvarcharno510          yes(n/a)(n/a)SQL_Latin1_General_CP1_CI_AS
    First_Namenvarcharno510          yes(n/a)(n/a)SQL_Latin1_General_CP1_CI_AS
    Addr_Line_1nvarcharno510          yes(n/a)(n/a)SQL_Latin1_General_CP1_CI_AS
    Citynvarcharno510          yes(n/a)(n/a)SQL_Latin1_General_CP1_CI_AS
    Statenvarcharno510          yes(n/a)(n/a)SQL_Latin1_General_CP1_CI_AS
    Zipfloatno853NULLyes(n/a)(n/a)NULL
    Emailnvarcharno510          yes(n/a)(n/a)SQL_Latin1_General_CP1_CI_AS
    Gendernvarcharno510          yes(n/a)(n/a)SQL_Latin1_General_CP1_CI_AS
    Birth_datedatetimeno8          yes(n/a)(n/a)NULL
    SEQ_NUMintno4100no(n/a)(n/a)NULL

    I'm showing 90% of the temp file columns so you could see that it is a very simple file. What I am doing (for starters) is matching them on obvious fields like e-mail address. In this case, I have 33 matches, that I have verified to match on name as well. I've done this many, many times, (much to my chagrin!).

    As much as I try to get a "clean" file, there are always issues with column length, etc. That's why I just import most columns in as a large nvarchar. Yes, I HAVE tried LTRIM and RTRIM around all of these columns and run extensive LEN testing. It's all the same.The set based fails and the loop (or one by one) works.

    Could it be that SSMS is looking at those large nvarchars and freaking out? If so, why doesn't it also short-circuit when I run them one by one?

    pietlinden - Monday, February 11, 2019 7:41 PM

    Could you post one or two "row-by-row" inserts that worked as well as the table definition of BIOGRAPH_MASTER table? There's not enough information here to figure out what's wrong.

    Sure, just leave the while loop out in the example above. Like this:


       INSERT INTO BIOGRAPH_MASTER
        (ID_NUM, GENDER, BIRTH_DTE
        , user_name, job_name, job_time)

        SELECT n.ID_NUM, t.Gender, t.Birth_date
        ,'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME
        from NAME_MASTER n join temp_ad_act_20190208 t
        on n.EMAIL_ADDRESS = t.EMAIL
            left join BIOGRAPH_MASTER b on b.ID_NUM = n.ID_NUM
        where b.ID_NUM is null
            and t.SEQ_NUM =
            (
            select MIN(b.SEQ_NUM)
            from NAME_MASTER a join temp_ad_act_20190208 b
            on a.EMAIL_ADDRESS = b.EMAIL
            left join BIOGRAPH_MASTER c on c.ID_NUM = a.ID_NUM
            where c.ID_NUM is null
            )

    This section specifically is what restricts the output to one row at a time:

    and t.SEQ_NUM =
            (
            select MIN(b.SEQ_NUM)
            from NAME_MASTER a join temp_ad_act_20190208 b
            on a.EMAIL_ADDRESS = b.EMAIL
            left join BIOGRAPH_MASTER c on c.ID_NUM = a.ID_NUM
            where c.ID_NUM is null
            )

    Luis Cazares - Tuesday, February 12, 2019 7:16 AM

    ...This has probably nothing to do with your problem, but I recommend that you use NOT EXISTS instead of the LEFT JOIN.

    Thank You Luis. I'm assuming it's more efficient to do so?

    /***********************/

    Some screenshots for y'all:

    I randomized the data here, don't want to show the true ID_NUMs of course. They are the same length, though. A sample of what I am trying to insert:

  • The only thing that I could be happening is with the large gender column on your temp table.  Maybe you have some non-visible characters in there.
    Here's an example:

    CREATE TABLE NAME_MASTER(
      ID_NUM    int    NOT NULL,
      EMAIL_ADDRESS    char(60)
    )

    CREATE TABLE BIOGRAPH_MASTER(
      ID_NUM    int    NOT NULL,
      BIRTH_DTE    datetime NULL,
      GENDER    char(1) NULL ,
      USER_NAME    varchar(513) NULL,
      JOB_NAME    varchar(30) NULL,
      JOB_TIME    datetime NULL
    )

    CREATE TABLE temp_ad_act_20190208 (
    --Column_name    Type    Computed    Length    Prec    Scale    Nullable    TrimTrailingBlanks    FixedLenNullInSource    Collation
      Last_Name    nvarchar(255),
      First_Name    nvarchar(255),
      Addr_Line_1    nvarchar(255),
      City    nvarchar(255),
      State    nvarchar(255),
      Zip    float,
      Email    nvarchar(255),
      Gender    nvarchar(255),
      Birth_date    datetime,
      SEQ_NUM    int
    )
    INSERT INTO NAME_MASTER VALUES(1, 'luis@hotmail.com'),(2, 'lcazares@hotmail.com')
    INSERT INTO temp_ad_act_20190208 VALUES( 'Cazares', 'Luis', 'Unknown', 'Atlanta', 'GA', 30302, 'luis@hotmail.com', 'M', '19000101', 1)
    INSERT INTO temp_ad_act_20190208 VALUES( 'Cazares', 'Luis', 'Unknown', 'Atlanta', 'GA', 30302, 'luisverylongemailaddresthatwontfitonsixtycharacters@hotmail.com', 'M', '19000101', 2)
    INSERT INTO temp_ad_act_20190208 VALUES( 'Cazares', 'Luis', 'Unknown', 'Atlanta', 'GA', 30302, 'lcazares@hotmail.com', 'M' + CHAR(0), '19000101', 3)

    SELECT n.ID_NUM,
      t.Gender,
      t.Birth_date,
      'TE_ADMIN' as USER_NAME,
      'recruitment_data_insert_v99' as JOB_NAME,
      GETDATE() as JOB_TIME
    FROM NAME_MASTER n
    JOIN temp_ad_act_20190208 t ON n.EMAIL_ADDRESS = t.EMAIL
    WHERE NOT EXISTS( SELECT *
        FROM BIOGRAPH_MASTER b
        WHERE b.ID_NUM = n.ID_NUM);

    INSERT INTO BIOGRAPH_MASTER
    (
      ID_NUM,
      GENDER,
      BIRTH_DTE,
      user_name,
      job_name,
      job_time
    )
    SELECT n.ID_NUM,
      t.Gender,
      t.Birth_date,
      'TE_ADMIN' as USER_NAME,
      'recruitment_data_insert_v99' as JOB_NAME,
      GETDATE() as JOB_TIME
    FROM NAME_MASTER n
    JOIN temp_ad_act_20190208 t ON n.EMAIL_ADDRESS = t.EMAIL
    WHERE NOT EXISTS( SELECT *
        FROM BIOGRAPH_MASTER b
        WHERE b.ID_NUM = n.ID_NUM);

    GO
    --DROP TABLE BIOGRAPH_MASTER, NAME_MASTER, temp_ad_act_20190208

    EDIT: Yes, NOT EXISTS is more efficient and clearer on what it does. This article explains it with examples: https://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Ok. Just for kicks, I will move that column somewhere else, drop it, recreate it to char(1), restore, and give it a go again. Still, it doesn't make sense that it DOES work when I run them one by one. :\

  • Check for INSERT triggers on the table being INSERTed into, just to be sure.  The error could be occurring in a trigger rather than for the main INSERT.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Success! Yet, the mystery continues as I do not feel satisfied with WHY this happens in T-SQL.

    When I ran this, in an attempt to modify the Gender column:
    alter table temp_ad_act_20190208
    add Gender_Temp varchar(1)

    BEGIN
    update a
    set Gender_Temp = a.Gender
    from temp_ad_act_20190208 a
    END

    I received the EXACT same error as before. So there, it's definitely that column that is the culprit. Let me be clear, I did check for LEN before, however, I was only checking in my results set, NOT the entire temp file. I updated the two values (out of 3,000+ rows) with a long string to (null), then the code below worked just fine.
       INSERT INTO BIOGRAPH_MASTER
        (ID_NUM, GENDER, BIRTH_DTE
        , USER_NAME, JOB_NAME, JOB_TIME)

        SELECT n.ID_NUM, t.Gender, t.Birth_date
        ,'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME
        from NAME_MASTER n join temp_ad_act_20190208 t
        on n.EMAIL_ADDRESS = t.EMAIL and t.Gender is not null
        where NOT EXISTS
                (SELECT *
                FROM BIOGRAPH_MASTER b
                WHERE b.ID_NUM = n.ID_NUM)

    The remaining BURNING question
    What explains the inconsistency in how the SQL engine is evaluating column values? It seems that it evaluates ALL (distinct?) column values when using a set based approach, yet, when running a single row insert, it gives you a free pass.

  • zuma-1122451 - Tuesday, February 12, 2019 5:07 PM

    Success! Yet, the mystery continues as I do not feel satisfied with WHY this happens in T-SQL.

    When I ran this, in an attempt to modify the Gender column:
    alter table temp_ad_act_20190208
    add Gender_Temp varchar(1)

    BEGIN
    update a
    set Gender_Temp = a.Gender
    from temp_ad_act_20190208 a
    END

    I received the EXACT same error as before. So there, it's definitely that column that is the culprit. Let me be clear, I did check for LEN before, however, I was only checking in my results set, NOT the entire temp file. I updated the two values (out of 3,000+ rows) with a long string to (null), then the code below worked just fine.
       INSERT INTO BIOGRAPH_MASTER
        (ID_NUM, GENDER, BIRTH_DTE
        , USER_NAME, JOB_NAME, JOB_TIME)

        SELECT n.ID_NUM, t.Gender, t.Birth_date
        ,'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME
        from NAME_MASTER n join temp_ad_act_20190208 t
        on n.EMAIL_ADDRESS = t.EMAIL and t.Gender is not null
        where NOT EXISTS
                (SELECT *
                FROM BIOGRAPH_MASTER b
                WHERE b.ID_NUM = n.ID_NUM)

    The remaining BURNING question
    What explains the inconsistency in how the SQL engine is evaluating column values? It seems that it evaluates ALL (distinct?) column values when using a set based approach, yet, when running a single row insert, it gives you a free pass.

    Checking for LEN() will NOT check for trailing spaces.  Make the check using DATALENGTH().  If the data you're checking is NVARCHAR or NCHAR, don't forget to divide the result of DATALENGTH() by 2 to get the number of characters.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Good point Jeff. In this case, I didn't have any trailing spaces but your point is well taken. LEN indeed doesn't count trailing spaces. In this file, the gender column actually said 'No specification' in a couple of the rows. Any idea what explains the behavior I witnessed? Why would it look at column values that were not even present in my result set, yet conveniently ignore them when updating one row at a time?

  • I think why it is failing on the set-based vs the RBAR approach is because of where the subquery exists in the statement. In order to evaluate the subquery, it has to evaluate the join conditions first, thus generating a set which includes rows that overflow the Gender column. (The way the engine works, it first builds a super-set based on the join criteria, and then whittles it down based on conditions in the WHERE clause. I've probably grossly over-simplified things, but you get the general idea.)

    If you can find a way to pre-filter temp_ad_act_20190208 first before the insert, this should avoid the error on the set-based approach. (I assume by the naming convention that this is some sort of perma-temp table?)

  • That sounds plausible and it's what I have been thinking of as well. The execution plan is simply different for each approach. It's different enough that, as you state, the set-based approach makes it evaluate the whole table somehow. Perhaps I could poke around in the execution plans and try to make sense of them. I'm importing students from a list our Admissions department purchases, students who have taken their ACT tests. They use them for leads and, of course, a lot of them are already in our database from other sources. It's a lot of fun. :\

    I'm kind of  obsessing about this behavior now. When I get a moment, I'm going to attempt to recreate this scenario with just a couple of tables and very few rows. In the end, it all comes back to garbage in, garbage out. I know these files are not the greatest (even though we've paid for them!). Getting other departments to help in the cleanup of the file is like pulling teeth, though. Lesson learned. An ounce of prevention is better than...blah blah blah. You know the saying. You never know what SQL is gonna do sometimes! It doesn't even annoy me. Somehow I find it incredibly interesting. I guess that is just one more reason I really like being a DBA!

  • You can always select all that data into a new table and see what the datatypes are.  It may clue you in on which column and row

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Mike01, we're past that now. It was two rouge rows in the Gender column that were making SQL unhappy. The grand mystery was that SQL was evaluating the entire column when deciding there would be a truncation error. The rows I was inserting into the table did NOT contain anything longer than char(1).

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply