String or Binary Truncation Error Strange Behavior

  • Ok... this is driving me absolutely nuts!

We have an insert query that is throwing a "string or binary truncation error" (SQL Server 2017 CU16 (Dev) and CU20 (Prod)). One of the destination fields (AccountNo) is configured as CHAR(6) NULL, and the "source" table has the field as a varchar(20).

Turning on trace flag 460 indicates the problem is the AccountNo field, with a reported truncated value of "" (seriously?).
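In case it helps anyone reproduce the message, here is a minimal sketch of what the trace flag changes (the tables and value below are made up for illustration; they are not our real objects):

    -- Minimal sketch (made-up objects): trace flag 460 swaps the generic msg 8152
    -- for msg 2628, which names the table, column, and truncated value.
    CREATE TABLE #Source      (AccountNo varchar(20) NULL);
    CREATE TABLE #Destination (AccountNo char(6) NULL);

    INSERT INTO #Source (AccountNo) VALUES ('1234567');   -- 7 characters, too long for char(6)

    DBCC TRACEON(460);   -- session scope; OPTION (QUERYTRACEON 460) on the INSERT also works

    INSERT INTO #Destination (AccountNo)
    SELECT AccountNo
    FROM #Source;        -- fails, and the error message now includes the offending value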

    • Inserting the data into a temp table where the AccountNo field is a varchar(10) works fine. Anything less than a varchar(10) throws an error.

    • Querying the data in the temp table, nothing exceeds 6 characters in the AccountNo field. There are no special characters, embedded CRLF (Char(13)/Char(10)), etc.

• The query is very simple, no cross applies, only one inner join and one left outer join (with a filter/where clause using "InvoiceNo is NULL" for the LEFT OUTER JOIN table ... let's not go there with "NOT EXISTS" or "COALESCE"... this is a very old query! LOL).

• One of the source tables (88 columns wide), the source of the "AccountNo" data, contains tens of millions of records (we are only pulling about 6,000) and has a 36-column non-clustered columnstore index, which has shown dramatic improvements in various queries/reports.

• Remove the non-clustered columnstore index and the truncation error goes away.

    • Add the non-clustered columnstore index back and the truncation error returns.

• Using RIGHT(AccountNo,6) also resolves the issue, as does setting ANSI_WARNINGS OFF (rough sketches of both follow this list).
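For anyone curious, the two workarounds look roughly like this (the statement is simplified and the object names are placeholders; the real query has the joins and filters described above):

    -- Workaround 1: trim explicitly to the destination width in the select list.
    INSERT INTO dbo.Destination (AccountNo)
    SELECT RIGHT(s.AccountNo, 6)
    FROM dbo.Source AS s;

    -- Workaround 2: suppress the ANSI truncation error for the statement
    -- (data that is too long is silently truncated instead of raising an error).
    SET ANSI_WARNINGS OFF;
    INSERT INTO dbo.Destination (AccountNo)
    SELECT s.AccountNo
    FROM dbo.Source AS s;
    SET ANSI_WARNINGS ON;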

    Other than patching the system to the latest CU, any ideas or suggestions?


    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

We just discovered another process that is having a similar issue. It appears to happen where the field size in the "source" table (which contains the non-clustered columnstore index) is larger than the field size in the "destination" table, even when the actual data being inserted would fit.


    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

There could be some string-terminating characters in the account number. See if this returns anything from your source table:

-- shows the ASCII code of the 6th character for any AccountNo at least 6 characters long
    select ASCII(SUBSTRING(AccountNo,6,1)), AccountNo
    from mySourceTable
    where ASCII(SUBSTRING(AccountNo,6,1)) is not null
  • Jonathan,

    Thanks for the suggestion. We did think of that early on and tried a few different combinations, i.e.,

AccountNo LIKE '%[^a-zA-Z0-9]%'                                     -- any non-alphanumeric character
    AccountNo LIKE '%'+CHAR(13)+'%' OR AccountNo LIKE '%'+CHAR(10)+'%' --- CR LF

but had no success. I tried your suggestion as well, and nothing other than standard ASCII characters was returned.

At this point, since we have seen this problem on 2 different fields (AccountNo, InvoiceNo) where the "destination" table has a field size smaller than the "source" table, and removing the non-clustered columnstore index (NCCI) "corrects" the string truncation error, we have high confidence that there is a bug related to the NCCI.

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

  • DB_Newbie2007 wrote:

    Jonathan,

    Thanks for the suggestion. We did think of that early on and tried a few different combinations, i.e.,

Sorry, I meant to check the 7th character, not the 6th:

    -- returns rows where AccountNo is longer than 6 characters (i.e. has a 7th character)
    select ASCII(SUBSTRING(AccountNo,7,1)), AccountNo
    from mySourceTable
    where ASCII(SUBSTRING(AccountNo,7,1)) is not null

  • You say you tested and nothing exceeded 6 characters in the source table.  If you use LEN() to do the test, you did it wrong.  Use DATALENGTH() instead and see what you get.
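For example (quick sketch with a made-up value):

    -- LEN() ignores trailing spaces; DATALENGTH() counts every byte actually stored.
    DECLARE @AccountNo varchar(20) = '123456    ';       -- 6 digits plus 4 trailing spaces
    SELECT LEN(@AccountNo)        AS LenCharacters,      -- returns 6
           DATALENGTH(@AccountNo) AS DataLengthBytes;    -- returns 10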

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

Thanks for the update. For the initial problem, we spent over 5 hours looking at the data, trying to figure out what was going wrong:

• Checked with both LEN() and DATALENGTH().

• Different methods to try to look for "non-alphanumeric" (A-Z, 0-9) characters, including exporting the data to view it in a hex editor (see the sketch after this list for an in-query equivalent).

    • Broke the query down to test all individual account sequences (until there were no more accounts to test).

    • Looked at the query plan to make sure nothing strange was going on (literally, this was one inner join and one left outer join).

    • Tested on different servers (all SQL Server 2017).
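For the hex-editor check, something like this shows the stored bytes directly in the results grid without exporting; the object name below is a placeholder:

    -- Shows the raw stored bytes next to each value (object name is a placeholder).
    SELECT AccountNo,
           CAST(AccountNo AS varbinary(40)) AS AccountNoBytes
    FROM dbo.SourceTable
    WHERE AccountNo IS NOT NULL;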

    I can say we exhausted all of our knowledge in trying to figure out what might be wrong with the "AccountNo".

    After looking at the various cumulative update bug fixes, we noticed one (CU16? CU17?) mentioned a problem with queries and non-clustered columnstore indexes (NCCI). As a desperate try, we removed the index and suddenly the truncation error went away. Adding the NCCI back resulted in the truncation error again. We also noticed that doing a RIGHT(AccountNo,6) also fixed the problem.

Afterwards, we noticed another scheduled process had been failing for some time (months... please don't ask why) with a similar issue (but a different field). As with the AccountNo, removing the NCCI or using a LEFT(InvoiceNo,26) also corrected the problem.

The strange part was that trace flag 460 reported a "" (blank) value in both cases (AccountNo and InvoiceNo). Unlike the AccountNo issue (which comes from one data source format and multiple companies), the InvoiceNo comes from over 6 different data source formats (CSV, XML, EDI, etc.) and a variety of companies into one table (and this table exists across hundreds of databases, each one representing a different customer), so the likelihood that they would all contain a non-alphanumeric character or exceed 26 characters is VERY, VERY small.

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

  • If the source definition for the column is varchar(20) - why are you trying to force that into a char(6) column?  It sounds like the process to update the columnstore index is validating the data type and sees the size difference and that is generating the error.

    I guess I don't understand why you wouldn't define the column with the correct data type.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • DB_Newbie2007 wrote:

    As a desperate try, we removed the index and suddenly the truncation error went away.

    Lordy.  Thanks for the feedback on that.  It's very timely as I was going to actually try column store on a table or two and then put them into production.  I don't need "glitches'n'gotyas" like that.  Heh... I thought I had waited long enough for other folks to shake the carpet out on column store.

     

    As a bit of a sidebar, you should write an article about all the things you tried.  People love to hear such war stories and it helps make them better troubleshooters.  Who would have ever guessed that an NCCI could have caused such a problem?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Jeffrey Williams wrote:

    If the source definition for the column is varchar(20) - why are you trying to force that into a char(6) column?

Long story... but in a nutshell, the "source" table contains data from multiple suppliers, while the "destination" table contains some summary data (plus other info) from just one of the suppliers. That one supplier's account numbers are 6 characters long and are being inserted into a table created over 8 years ago.

Jeffrey Williams wrote:

    It sounds like the process to update the columnstore index is validating the data type and sees the size difference and that is generating the error.

Actually, we are not updating the columnstore index, but yes, something in the process (query parser? algebrizer? query optimizer?) does appear to be comparing the size difference and then throwing the error, even though the actual data being extracted/inserted is fine.

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

Jeff Moden wrote:

    Lordy.  Thanks for the feedback on that.  It's very timely as I was going to actually try column store on a table or two and then put them into production.  I don't need "glitches'n'gotyas" like that.  Heh... I thought I had waited long enough for other folks to shake the carpet out on column store.

We did quite a bit of testing, read lots of articles (Niko Neugebauer as well as others) and, overall, we are very happy with the benefits (report speed) of the non-clustered columnstore index. We did try a clustered columnstore index, but (I can't remember the details at this moment) we discovered some problems with it. If using a non-clustered columnstore index means we have to use a LEFT() or RIGHT() function here and there, the benefits we have seen far outweigh that inconvenience. BTW, a lot of our tables are 10s of millions or 100s of millions of records (rolling 15-17 months of data) on tables that are anywhere from 30 to 90 columns wide.

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

  • DB_Newbie2007 wrote:

    Lordy.  Thanks for the feedback on that.  It's very timely as I was going to actually try column store on a table or two and then put them into production.  I don't need "glitches'n'gotyas" like that.  Heh... I thought I had waited long enough for other folks to shake the carpet out on column store.

We did quite a bit of testing, read lots of articles (Niko Neugebauer as well as others) and, overall, we are very happy with the benefits (report speed) of the non-clustered columnstore index. We did try a clustered columnstore index, but (I can't remember the details at this moment) we discovered some problems with it. If using a non-clustered columnstore index means we have to use a LEFT() or RIGHT() function here and there, the benefits we have seen far outweigh that inconvenience. BTW, a lot of our tables are 10s of millions or 100s of millions of records (rolling 15-17 months of data) on tables that are anywhere from 30 to 90 columns wide.

Normally, if you use such a function on a column in a WHERE or JOIN, it destroys SARGability.  I don't yet know if that's a consideration to worry about with columnstore, but I can't imagine it's not.

    With that, let me ask a question to clarify what you mean because I'm genuinely interested.  Are you saying that even with wrapping a function around a column in a WHERE or JOIN that column store is so bloody fast that it's still worthwhile to use columnstore even with such non-SARGable predicates???
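Just so we're talking about the same thing, here's the kind of difference I mean (purely illustrative; the names and values are made up and this isn't your query):

    DECLARE @Acct   char(6)     = '123456';   -- made-up values
    DECLARE @Prefix varchar(10) = 'ACME';

    -- Non-SARGable: the function wrapped around the column blocks an index seek on AccountNo.
    SELECT AccountNo
    FROM dbo.SourceTable
    WHERE RIGHT(AccountNo, 6) = @Acct;

    -- SARGable: a filter written so the column stays bare can still use an index seek.
    SELECT AccountNo
    FROM dbo.SourceTable
    WHERE AccountNo LIKE @Prefix + '%';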

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • DB_Newbie2007 wrote:

We just discovered another process that is having a similar issue. It appears to happen where the field size in the "source" table (which contains the non-clustered columnstore index) is larger than the field size in the "destination" table, even when the actual data being inserted would fit.

Does the error go away if you cast the source field to your desired size (6, if I'm not mistaken) as part of the source select?

    If so, why not do it?
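Something along these lines (object names are placeholders for your real ones):

    -- Explicit CAST to the destination width inside the source select; an explicit
    -- convert to a shorter string type truncates without raising the error.
    INSERT INTO dbo.Destination (AccountNo)
    SELECT CAST(s.AccountNo AS char(6))
    FROM dbo.Source AS s;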

     

    regards

     

    Frederico
