July 6, 2017 at 4:35 am
I'm trying to perform SHA256 hashing, using the standard HASHBYTES function (2012). The goal is to join 2 tables: one hashed internally in SQLServer, the other imported (already hashed in a different platform). Sort of this:
TABLE 1 (SQLserver) -> Hash it SHA256 <= INNER JOIN => TABLE 2 (imported into same SQLServer from .CSV file)
I've done some testing before the matching, but the output of the HASHBYTES function is unexpected. E.g.
DECLARE @HashThis nvarchar(4000);
SELECT @HashThis = CONVERT(nvarchar(4000), '123');
SELECT HASHBYTES('SHA2_256', @HashThis) ;
GO
produces as output: 0x26D6A8AD97C75FFC548F6873E5E93CE475479E3E1A1097381E54221FB53EC1D2
However, if I run SHA256 in R, or in any of the free SHA256 websites, the correct output for '123' is:
a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3
Initially I thought they don't match because my output is a binary var, but no: even if you convert to nvarchar, you never get the same output. My questions:
- What is going on here? Is this because of the coding language in the server (UTF, UNICODE, other. sorry not familiar with language codes)?
- How can I make sure that, when using HASHBYTES-SHA256, the output matches the 'standard' output shown above?
- Even if I fix the previous issue, the 0x at the start baffles me a bit. Is is a blank char? Could this cause trouble when trying to match the 2 tables?
Thanks in advance,
July 6, 2017 at 4:46 am
a_ud - Thursday, July 6, 2017 4:35 AMI'm trying to perform SHA256 hashing, using the standard HASHBYTES function (2012). The goal is to join 2 tables: one hashed internally in SQLServer, the other imported (already hashed in a different platform). Sort of this:
TABLE 1 (SQLserver) -> Hash it SHA256 <= INNER JOIN => TABLE 2 (imported into same SQLServer from .CSV file)
I've done some testing before the matching, but the output of the HASHBYTES function is unexpected. E.g.
DECLARE @HashThis nvarchar(4000);
SELECT @HashThis = CONVERT(nvarchar(4000), '123');
SELECT HASHBYTES('SHA2_256', @HashThis) ;
GOproduces as output:
0x26D6A8AD97C75FFC548F6873E5E93CE475479E3E1A1097381E54221FB53EC1D2
However, if I run SHA256 in R, or in any of the free SHA256 websites, the correct output for '123' is:
a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3
Initially I thought they don't match because my output is a binary var, but no: even if you convert to nvarchar, you never get the same output. My questions:
- What is going on here? Is this because of the coding language in the server (UTF, UNICODE, other. sorry not familiar with language codes)?
- How can I make sure that, when using HASHBYTES-SHA256, the output matches the 'standard' output shown above?
- Even if I fix the previous issue, the 0x at the start baffles me a bit. Is is a blank char? Could this cause trouble when trying to match the 2 tables?Thanks in advance,
The SHA2 hash you posted as correct is the ascii string hash, not the unicode hash
😎
DECLARE @HashThis varchar(4000);
SELECT @HashThis = '123';
SELECT HASHBYTES('SHA2_256', @HashThis) ;
Produces0xA665A45920422F9D417E4867EFDC4FB8A04A1F3FFF1FA07E998E86F7F7A27AE3
which is the same as in R, or in any SHA256 websites
July 6, 2017 at 4:58 am
Thanks for the prompt reply. As said not very familiar with code chars, or how to switch from one to another.
My interest is to perform a JOIN, therefore if I can't switch the charset of both ends (particularly the SQLServer one), it will not work.
July 6, 2017 at 5:08 am
a_ud - Thursday, July 6, 2017 4:58 AMThanks for the prompt reply. As said not very familiar with code chars, or how to switch from to another.
- Does SQL Server then use UNICODE by default (while R & websites use ASCII )? Is this set by the server OS, or the client OS?
- Do other popular databases, e.g. mySQL, also use UNICODE?
My interest is to perform a JOIN, therefore if I can't switch the charset of both ends (particularly the SQLServer one), it will not work.
1) SQL Server uses which ever type you decide to use, i.e. NVARCHAR for unicode and VARCHAR for ascii strings.
2) Most RDBMS support both unicode and ascii
😎
If you cannot change the type in SQL Server then you can use convert inside the hash functionDECLARE @HashThis NVARCHAR(4000);
SELECT @HashThis = N'123';
SELECT HASHBYTES('SHA2_256', CONVERT(VARCHAR(4000),@HashThis,0)) ;
July 6, 2017 at 5:20 am
a_ud - Thursday, July 6, 2017 4:58 AMDoes SQL Server then use UNICODE by default (while R & websites use ASCII )? Is this set by the server OS, or the client OS?
No, no and no.
It was using unicode because you declared the variable to be hashed as a unicode string.
DECLARE @HashThis nvarchar(4000);
And the 0x at the start indicates that it's a binary string (hexadecimal characters)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 6, 2017 at 5:46 am
That worked fine, and I learnt something new !!
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply