December 14, 2017 at 1:02 pm
Hi,
declare @Paper varchar(100), @Value_date datetime2;
select @Paper = 'Kotimaiset osakaslainat, kiint.', @Value_date = '2017-11-30';
select CAST(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))) AS VARBINARY(40)) dv_key ;
0xB54E921E9BF81A19E76D8D15D16AD000
But when inserted to varbinary(40) column in is modified to 0xB54E921E9BF81A19E76D8D15D16AD0
Two trailing zeroes are removed
When I compare the incoming 0xB54E921E9BF81A19E76D8D15D16AD000 to 0xB54E921E9BF81A19E76D8D15D16AD0 in Informatica lookup, it considers them different and the row is inserted to table and the causing an error due duplicate value in a primary key column...
Why does the HAshBytes function return those two trailing zeroes?
Can I get rid of them?
Ville
December 14, 2017 at 4:20 pm
In my case the result is consistent, I used this script:
declare @Paper varchar(100), @Value_date datetime2;
declare @result VARBINARY(40);
select @Paper = 'Kotimaiset osakaslainat, kiint.', @Value_date = '2017-11-30';
declare @tst table(result VARBINARY(40));
create table dbo.tst(result VARBINARY(40));
select CAST(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))) AS VARBINARY(40)) dv_key;
set @result = HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper)));
select @result;
insert into @tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));
select * from @tst;
insert into dbo.tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));
select * from dbo.tst;
drop table dbo.tst;
December 14, 2017 at 10:21 pm
Evgeny - Thursday, December 14, 2017 4:20 PMIn my case the result is consistent, I used this script:
declare @Paper varchar(100), @Value_date datetime2;
declare @result VARBINARY(40);
select @Paper = 'Kotimaiset osakaslainat, kiint.', @Value_date = '2017-11-30';
declare @tst table(result VARBINARY(40));
create table dbo.tst(result VARBINARY(40));
select CAST(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))) AS VARBINARY(40)) dv_key;
set @result = HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper)));
select @result;
insert into @tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));
select * from @tst;
insert into dbo.tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));
select * from dbo.tst;
drop table dbo.tst;
Hi,
yes so it seems so.
The difference is that in my case the insert goes through Informatica ETL tool. There is something with the Unix ODBC API or in the tool itself that causes this. I know I must try to investigate that.
But
December 14, 2017 at 10:29 pm
Hi,
yes so it seems...
The difference is that in my case the insert goes through Informatica ETL tool. There is something with the Unix ODBC API or in the tool itself that causes this. I know I must try to investigate that.
But since 0xB54E921E9BF81A19E76D8D15D16AD000 and 0xB54E921E9BF81A19E76D8D15D16AD0 are the same value atleast when comparing in Where clause
select 'Hello' where 0xB54E921E9BF81A19E76D8D15D16AD000 = 0xB54E921E9BF81A19E76D8D15D16AD0 ==> 'Hello'
but
select 'Hello' where 0xB54E921E9BF81A19E76D8D15D16AD0000 = 0xB54E921E9BF81A19E76D8D15D16AD0 ==> Nothing
and then again
select 'Hello' where 0xB54E921E9BF81A19E76D8D15D16AD00000 = 0xB54E921E9BF81A19E76D8D15D16AD0 ==> 'Hello'
So a even number of trailing zeroes does not change the value.
Suppose I don't find any explanation/reason from Informatica, is there anything I can do with T-SQL?
Ville
December 20, 2017 at 1:18 pm
I suspect you're getting an implicit conversion issue. Somewhere in your processing, that data type of VARBINARY(40) is getting implicitly converted, and that succeeds because the resulting data type is likely some kind of character string, where an additional 0x00 worth of binary data is irrelevant, because most string data types using a binary 0x00 to terminate the string, down at the lowest level. Thus the comparison only processes up to the string terminator. Figure out why you need varbinary() or figure out where the implicit conversion occurs. The only question that remains is understanding how varbinary values are terminated. I don't know that answer...
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
December 20, 2017 at 2:02 pm
Also, the MD5 hash is a 16 byte value. Don't count the leading 0x. You can change the VARBINARY(40) to VARBINARY(16) and hashbytes returns the same value. So, there may be a truncation occurring going into your Informatica process. Sorry, I haven't touched Informatica in over 12 years.
December 28, 2017 at 11:14 am
Hi,
OK. Binary(16) would solve the problem. There is only one minor problem...there is a big DataVault solely based on varbinary(40) fields - too big a job to start changing data types and Informatica loads...
Truncation in the load is not the case, all fports are varbinaru(40)
Thnaks, Ville
December 28, 2017 at 11:27 am
WilburSmith - Thursday, December 28, 2017 11:14 AMHi,OK. Binary(16) would solve the problem. There is only one minor problem...there is a big DataVault solely based on varbinary(40) fields - too big a job to start changing data types and Informatica loads...
Truncation in the load is not the case, all fports are varbinaru(40)
Thnaks, Ville
This value: 0xB54E921E9BF81A19E76D8D15D16AD000, is 16 bytes. HASHBYTES did NOT add trailing bytes to the value. If the final two zeros are being dropped, it is not HASHBYTES causing the problem. I get the same value whether I run HASHBYTES alone or cast it VARBINARY(40).
December 28, 2017 at 11:48 am
Hi,
yes I agree, it is not a hashbytes problem. instead Informatica removes the two trailing zeroes when inserting the row to db. Why..? Remains a mystery.
ville
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply