A real doozie!

Question

A real doozie!

zuma-1122451

SSC Enthusiast

Points: 146
More actions
February 11, 2019 at 7:04 pm

#409504

Microsoft SQL Server 2016 (SP1-GDR) (KB4019089) - 13.0.4206.0 (X64)
I am performing a very simple table insert and seeing the dreaded
Msg 8152, Level 16, State 2, Line 669
String or binary data would be truncated.
The statement has been terminated.

I check all my lengths and everything is fine. This is a super simple insert. Only 33 rows! I can see everything, I even do a LEN on every column. Data types match as well.I get the idea to run this through a while loop and, amazingly, it works! Just for kicks I delete my inserts I made with the while loop and try again. Nope! It simply will not insert with the set based approach. What gives?
This works:
select 1 --just to set initial count so it runs reliably while @@ROWCOUNT > 0 BEGIN     INSERT INTO BIOGRAPH_MASTER     (ID_NUM, GENDER, BIRTH_DTE     , user_name, job_name, job_time)
    SELECT n.ID_NUM, t.Gender, t.Birth_date     ,'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME     from NAME_MASTER n join temp_ad_act_20190208 t     on n.EMAIL_ADDRESS = t.EMAIL         left join BIOGRAPH_MASTER b on b.ID_NUM = n.ID_NUM     where b.ID_NUM is null         and t.SEQ_NUM =         (         select MIN(b.SEQ_NUM)         from NAME_MASTER a join temp_ad_act_20190208 b         on a.EMAIL_ADDRESS = b.EMAIL         left join BIOGRAPH_MASTER c on c.ID_NUM = a.ID_NUM         where c.ID_NUM is null         ) END
This does NOT work!:
    INSERT INTO BIOGRAPH_MASTER     (ID_NUM, GENDER, BIRTH_DTE     , user_name, job_name, job_time)
    SELECT n.ID_NUM, t.Gender, t.Birth_date     ,'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME     from NAME_MASTER n join temp_ad_act_20190208 t     on n.EMAIL_ADDRESS = t.EMAIL         left join BIOGRAPH_MASTER b on b.ID_NUM = n.ID_NUM     where b.ID_NUM is null
I've even run it row by row, without the while loop, for 33 times, and every row inserts just fine. It's driving me nuts! Why won't the good ol' set based insert work? Where am I going wrong here?

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

pietlinden SSC Guru Points: 63633 More actions · Answer 1

Could you post one or two "row-by-row" inserts that worked as well as the table definition of BIOGRAPH_MASTER table? There's not enough information here to figure out what's wrong.

Luis Cazares SSC Guru Points: 183706 More actions · Answer 2

Please post the table definitions for NAME_MASTER, temp_ad_act_20190208 and BIOGRAPH_MASTER. Or at least post the definitions of the columns used.
This has probably nothing to do with your problem, but I recommend that you use NOT EXISTS instead of the LEFT JOIN.
INSERT INTO BIOGRAPH_MASTER ( ID_NUM, GENDER, BIRTH_DTE, user_name, job_name, job_time ) SELECT n.ID_NUM, t.Gender, t.Birth_date, 'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME FROM NAME_MASTER n JOIN temp_ad_act_20190208 t ON n.EMAIL_ADDRESS = t.EMAIL WHERE NOT EXISTS( SELECT * FROM BIOGRAPH_MASTER b WHERE b.ID_NUM = n.ID_NUM);

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

zuma-1122451 SSC Enthusiast Points: 146 More actions · Answer 3

Here are the column definitions of anything I believe is relevant.

NAME_MASTER

Column_name	Type	Computed	Length	Prec	Scale	Nullable	TrimTrailingBlanks	FixedLenNullInSource	Collation
ID_NUM	int	no	4	10	0	no	(n/a)	(n/a)	NULL
EMAIL_ADDRESS	char	no	60			yes	yes	yes	SQL_Latin1_General_CP1_CI_AS

BIOGRAPH_MASTER

Column_name	Type	Computed	Length	Prec	Scale	Nullable	TrimTrailingBlanks	FixedLenNullInSource	Collation
ID_NUM	int	no	4	10	0	no	(n/a)	(n/a)	NULL
BIRTH_DTE	datetime	no	8			yes	(n/a)	(n/a)	NULL
GENDER	char	no	1			yes	yes	yes	SQL_Latin1_General_CP1_CI_AS
USER_NAME	varchar	no	513			yes	yes	yes	SQL_Latin1_General_CP1_CI_AS
JOB_NAME	varchar	no	30			yes	yes	yes	SQL_Latin1_General_CP1_CI_AS
JOB_TIME	datetime	no	8			yes	(n/a)	(n/a)	NULL

temp_ad_act_20190208

Column_name	Type	Computed	Length	Prec	Scale	Nullable	TrimTrailingBlanks	FixedLenNullInSource	Collation
Last_Name	nvarchar	no	510			yes	(n/a)	(n/a)	SQL_Latin1_General_CP1_CI_AS
First_Name	nvarchar	no	510			yes	(n/a)	(n/a)	SQL_Latin1_General_CP1_CI_AS
Addr_Line_1	nvarchar	no	510			yes	(n/a)	(n/a)	SQL_Latin1_General_CP1_CI_AS
City	nvarchar	no	510			yes	(n/a)	(n/a)	SQL_Latin1_General_CP1_CI_AS
State	nvarchar	no	510			yes	(n/a)	(n/a)	SQL_Latin1_General_CP1_CI_AS
Zip	float	no	8	53	NULL	yes	(n/a)	(n/a)	NULL
Email	nvarchar	no	510			yes	(n/a)	(n/a)	SQL_Latin1_General_CP1_CI_AS
Gender	nvarchar	no	510			yes	(n/a)	(n/a)	SQL_Latin1_General_CP1_CI_AS
Birth_date	datetime	no	8			yes	(n/a)	(n/a)	NULL
SEQ_NUM	int	no	4	10	0	no	(n/a)	(n/a)	NULL

I'm showing 90% of the temp file columns so you could see that it is a very simple file. What I am doing (for starters) is matching them on obvious fields like e-mail address. In this case, I have 33 matches, that I have verified to match on name as well. I've done this many, many times, (much to my chagrin!).

As much as I try to get a "clean" file, there are always issues with column length, etc. That's why I just import most columns in as a large nvarchar. Yes, I HAVE tried LTRIM and RTRIM around all of these columns and run extensive LEN testing. It's all the same.The set based fails and the loop (or one by one) works.

Could it be that SSMS is looking at those large nvarchars and freaking out? If so, why doesn't it also short-circuit when I run them one by one?

pietlinden - Monday, February 11, 2019 7:41 PM
Could you post one or two "row-by-row" inserts that worked as well as the table definition of BIOGRAPH_MASTER table? There's not enough information here to figure out what's wrong.

Sure, just leave the while loop out in the example above. Like this:

INSERT INTO BIOGRAPH_MASTER (ID_NUM, GENDER, BIRTH_DTE , user_name, job_name, job_time)

SELECT n.ID_NUM, t.Gender, t.Birth_date ,'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME from NAME_MASTER n join temp_ad_act_20190208 t on n.EMAIL_ADDRESS = t.EMAIL left join BIOGRAPH_MASTER b on b.ID_NUM = n.ID_NUM where b.ID_NUM is null and t.SEQ_NUM = ( select MIN(b.SEQ_NUM) from NAME_MASTER a join temp_ad_act_20190208 b on a.EMAIL_ADDRESS = b.EMAIL left join BIOGRAPH_MASTER c on c.ID_NUM = a.ID_NUM where c.ID_NUM is null )

This section specifically is what restricts the output to one row at a time:
and t.SEQ_NUM = ( select MIN(b.SEQ_NUM) from NAME_MASTER a join temp_ad_act_20190208 b on a.EMAIL_ADDRESS = b.EMAIL left join BIOGRAPH_MASTER c on c.ID_NUM = a.ID_NUM where c.ID_NUM is null )

Luis Cazares - Tuesday, February 12, 2019 7:16 AM
...This has probably nothing to do with your problem, but I recommend that you use NOT EXISTS instead of the LEFT JOIN.

Thank You Luis. I'm assuming it's more efficient to do so?

/***********************/

Some screenshots for y'all:

I randomized the data here, don't want to show the true ID_NUMs of course. They are the same length, though. A sample of what I am trying to insert:

Luis Cazares SSC Guru Points: 183706 More actions · Answer 4

The only thing that I could be happening is with the large gender column on your temp table. Maybe you have some non-visible characters in there.
Here's an example:
CREATE TABLE NAME_MASTER( ID_NUM int NOT NULL, EMAIL_ADDRESS char(60) )

CREATE TABLE BIOGRAPH_MASTER( ID_NUM int NOT NULL, BIRTH_DTE datetime NULL, GENDER char(1) NULL , USER_NAME varchar(513) NULL, JOB_NAME varchar(30) NULL, JOB_TIME datetime NULL )

CREATE TABLE temp_ad_act_20190208 ( --Column_name Type Computed Length Prec Scale Nullable TrimTrailingBlanks FixedLenNullInSource Collation Last_Name nvarchar(255), First_Name nvarchar(255), Addr_Line_1 nvarchar(255), City nvarchar(255), State nvarchar(255), Zip float, Email nvarchar(255), Gender nvarchar(255), Birth_date datetime, SEQ_NUM int ) INSERT INTO NAME_MASTER VALUES(1, 'luis@hotmail.com'),(2, 'lcazares@hotmail.com') INSERT INTO temp_ad_act_20190208 VALUES( 'Cazares', 'Luis', 'Unknown', 'Atlanta', 'GA', 30302, 'luis@hotmail.com', 'M', '19000101', 1) INSERT INTO temp_ad_act_20190208 VALUES( 'Cazares', 'Luis', 'Unknown', 'Atlanta', 'GA', 30302, 'luisverylongemailaddresthatwontfitonsixtycharacters@hotmail.com', 'M', '19000101', 2) INSERT INTO temp_ad_act_20190208 VALUES( 'Cazares', 'Luis', 'Unknown', 'Atlanta', 'GA', 30302, 'lcazares@hotmail.com', 'M' + CHAR(0), '19000101', 3)

SELECT n.ID_NUM, t.Gender, t.Birth_date, 'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME FROM NAME_MASTER n JOIN temp_ad_act_20190208 t ON n.EMAIL_ADDRESS = t.EMAIL WHERE NOT EXISTS( SELECT * FROM BIOGRAPH_MASTER b WHERE b.ID_NUM = n.ID_NUM);

INSERT INTO BIOGRAPH_MASTER ( ID_NUM, GENDER, BIRTH_DTE, user_name, job_name, job_time ) SELECT n.ID_NUM, t.Gender, t.Birth_date, 'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME FROM NAME_MASTER n JOIN temp_ad_act_20190208 t ON n.EMAIL_ADDRESS = t.EMAIL WHERE NOT EXISTS( SELECT * FROM BIOGRAPH_MASTER b WHERE b.ID_NUM = n.ID_NUM);

GO --DROP TABLE BIOGRAPH_MASTER, NAME_MASTER, temp_ad_act_20190208

EDIT: Yes, NOT EXISTS is more efficient and clearer on what it does. This article explains it with examples: https://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

zuma-1122451 SSC Enthusiast Points: 146 More actions · Answer 5

Ok. Just for kicks, I will move that column somewhere else, drop it, recreate it to char(1), restore, and give it a go again. Still, it doesn't make sense that it DOES work when I run them one by one. :\

ScottPletcher SSC Guru Points: 101235 More actions · Answer 6

Check for INSERT triggers on the table being INSERTed into, just to be sure. The error could be occurring in a trigger rather than for the main INSERT.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

zuma-1122451 SSC Enthusiast Points: 146 More actions · Answer 7

Success! Yet, the mystery continues as I do not feel satisfied with WHY this happens in T-SQL.

When I ran this, in an attempt to modify the Gender column:
alter table temp_ad_act_20190208 add Gender_Temp varchar(1)

BEGIN update a set Gender_Temp = a.Gender from temp_ad_act_20190208 a END
I received the EXACT same error as before. So there, it's definitely that column that is the culprit. Let me be clear, I did check for LEN before, however, I was only checking in my results set, NOT the entire temp file. I updated the two values (out of 3,000+ rows) with a long string to (null), then the code below worked just fine.
INSERT INTO BIOGRAPH_MASTER (ID_NUM, GENDER, BIRTH_DTE , USER_NAME, JOB_NAME, JOB_TIME)

SELECT n.ID_NUM, t.Gender, t.Birth_date ,'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME from NAME_MASTER n join temp_ad_act_20190208 t on n.EMAIL_ADDRESS = t.EMAIL and t.Gender is not null where NOT EXISTS (SELECT * FROM BIOGRAPH_MASTER b WHERE b.ID_NUM = n.ID_NUM)

The remaining BURNING question
What explains the inconsistency in how the SQL engine is evaluating column values? It seems that it evaluates ALL (distinct?) column values when using a set based approach, yet, when running a single row insert, it gives you a free pass.

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 8

zuma-1122451 - Tuesday, February 12, 2019 5:07 PM
Success! Yet, the mystery continues as I do not feel satisfied with WHY this happens in T-SQL.
When I ran this, in an attempt to modify the Gender column:
alter table temp_ad_act_20190208 add Gender_Temp varchar(1)
BEGIN update a set Gender_Temp = a.Gender from temp_ad_act_20190208 a END
I received the EXACT same error as before. So there, it's definitely that column that is the culprit. Let me be clear, I did check for LEN before, however, I was only checking in my results set, NOT the entire temp file. I updated the two values (out of 3,000+ rows) with a long string to (null), then the code below worked just fine.
INSERT INTO BIOGRAPH_MASTER (ID_NUM, GENDER, BIRTH_DTE , USER_NAME, JOB_NAME, JOB_TIME)
SELECT n.ID_NUM, t.Gender, t.Birth_date ,'TE_ADMIN' as USER_NAME, 'recruitment_data_insert_v99' as JOB_NAME, GETDATE() as JOB_TIME from NAME_MASTER n join temp_ad_act_20190208 t on n.EMAIL_ADDRESS = t.EMAIL and t.Gender is not null where NOT EXISTS (SELECT * FROM BIOGRAPH_MASTER b WHERE b.ID_NUM = n.ID_NUM)
The remaining BURNING question
What explains the inconsistency in how the SQL engine is evaluating column values? It seems that it evaluates ALL (distinct?) column values when using a set based approach, yet, when running a single row insert, it gives you a free pass.

Checking for LEN() will NOT check for trailing spaces. Make the check using DATALENGTH(). If the data you're checking is NVARCHAR or NCHAR, don't forget to divide the result of DATALENGTH() by 2 to get the number of characters.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

zuma-1122451 SSC Enthusiast Points: 146 More actions · Answer 9

Good point Jeff. In this case, I didn't have any trailing spaces but your point is well taken. LEN indeed doesn't count trailing spaces. In this file, the gender column actually said 'No specification' in a couple of the rows. Any idea what explains the behavior I witnessed? Why would it look at column values that were not even present in my result set, yet conveniently ignore them when updating one row at a time?

Jonathan Szeto SSC Enthusiast Points: 120 More actions · Answer 10

I think why it is failing on the set-based vs the RBAR approach is because of where the subquery exists in the statement. In order to evaluate the subquery, it has to evaluate the join conditions first, thus generating a set which includes rows that overflow the Gender column. (The way the engine works, it first builds a super-set based on the join criteria, and then whittles it down based on conditions in the WHERE clause. I've probably grossly over-simplified things, but you get the general idea.)

If you can find a way to pre-filter temp_ad_act_20190208 first before the insert, this should avoid the error on the set-based approach. (I assume by the naming convention that this is some sort of perma-temp table?)

zuma-1122451 SSC Enthusiast Points: 146 More actions · Answer 11

That sounds plausible and it's what I have been thinking of as well. The execution plan is simply different for each approach. It's different enough that, as you state, the set-based approach makes it evaluate the whole table somehow. Perhaps I could poke around in the execution plans and try to make sense of them. I'm importing students from a list our Admissions department purchases, students who have taken their ACT tests. They use them for leads and, of course, a lot of them are already in our database from other sources. It's a lot of fun. :\

I'm kind of obsessing about this behavior now. When I get a moment, I'm going to attempt to recreate this scenario with just a couple of tables and very few rows. In the end, it all comes back to garbage in, garbage out. I know these files are not the greatest (even though we've paid for them!). Getting other departments to help in the cleanup of the file is like pulling teeth, though. Lesson learned. An ounce of prevention is better than...blah blah blah. You know the saying. You never know what SQL is gonna do sometimes! It doesn't even annoy me. Somehow I find it incredibly interesting. I guess that is just one more reason I really like being a DBA!

Mike01 SSChampion Points: 11420 More actions · Answer 12

You can always select all that data into a new table and see what the datatypes are. It may clue you in on which column and row

For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

zuma-1122451 SSC Enthusiast Points: 146 More actions · Answer 13

Mike01, we're past that now. It was two rouge rows in the Gender column that were making SQL unhappy. The grand mystery was that SQL was evaluating the entire column when deciding there would be a truncation error. The rows I was inserting into the table did NOT contain anything longer than char(1).