February 19, 2013 at 12:01 am
Comments posted to this topic are about the item An in-depth look at change detection in SQL Server - Part 02
Thanks & Regards,
Nakul Vachhrajani.
http://nakulvachhrajani.com
Follow me on
Twitter: @sqltwins
February 19, 2013 at 4:42 am
About HASHBYTES (from http://msdn.microsoft.com/en-us/library/ms174415.aspx):
1. The input size is limited to 8000 bytes.
2. The output size depends on the algorithm used; for MD5 you can use CAST(HASHBYTES('MD5', (EmployeeName + CityName)) AS BINARY(16)).
3. Beware of NULL values: concatenating a NULL column makes the whole input (and therefore the hash) NULL.
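A quick sketch of the NULL pitfall and one common guard (the variable names are just illustrative):

```sql
DECLARE @EmployeeName VARCHAR(50) = 'John',
        @CityName     VARCHAR(50) = NULL;

-- Plain + concatenation: the NULL propagates, so the hash itself is NULL
-- and a change in the other column can go undetected.
SELECT HASHBYTES('MD5', @EmployeeName + @CityName) AS NullHash;

-- Guarding each operand with ISNULL (or COALESCE) keeps the hash non-NULL.
SELECT HASHBYTES('MD5', ISNULL(@EmployeeName, '') + ISNULL(@CityName, '')) AS SafeHash;
```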
February 19, 2013 at 4:42 am
While the return type of HASHBYTES is declared as varbinary(8000), that is more for future-proofing; the actual value length will be:
Allowed input values are limited to 8000 bytes. The output conforms to the algorithm standard: 128 bits (16 bytes) for MD2, MD4, and MD5; 160 bits (20 bytes) for SHA and SHA1; 256 bits (32 bytes) for SHA2_256, and 512 bits (64 bytes) for SHA2_512.
from http://msdn.microsoft.com/en-us/library/ms174415.aspx
So with one of the currently supported hashes, your longest possible value is 64 bytes, meaning your actual storage would be 66 bytes as a varbinary. You can save those two bytes of overhead by casting to a fixed-length binary, since you already know which algorithm you will be using.
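You can confirm the fixed output lengths yourself with DATALENGTH (these sizes come from the algorithm standards, as quoted above):

```sql
-- Actual output lengths per algorithm, regardless of input size
SELECT DATALENGTH(HASHBYTES('MD5',      'abc')) AS MD5_Bytes,      -- 16
       DATALENGTH(HASHBYTES('SHA1',     'abc')) AS SHA1_Bytes,     -- 20
       DATALENGTH(HASHBYTES('SHA2_256', 'abc')) AS SHA2_256_Bytes, -- 32
       DATALENGTH(HASHBYTES('SHA2_512', 'abc')) AS SHA2_512_Bytes; -- 64

-- Knowing the algorithm up front, declare a fixed-length column instead:
-- DataHash BINARY(16)   -- for MD5; avoids the varbinary 2-byte overhead
```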
February 21, 2013 at 1:32 am
Hi
It is dangerous to use hashbytes this way:
DataHashBytes AS HASHBYTES('MD5', (EmployeeName + CityName))
When EmployeeName = '' and CityName = 'Paris' and someone changes it to EmployeeName = 'Paris' and CityName = '' the hash stays the same. The solution (not very elegant but still a solution) is to use a delimiter, e.g.:
DataHashBytes AS HASHBYTES('MD5', (EmployeeName + char(0) + CityName))
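A small demonstration of why the delimiter matters (literal values stand in for the columns):

```sql
-- Both expressions concatenate to the same string 'Paris',
-- so the two hashes are identical and the swap goes undetected.
SELECT HASHBYTES('MD5', '' + 'Paris')  AS Hash1,
       HASHBYTES('MD5', 'Paris' + '') AS Hash2;

-- With a delimiter the concatenated strings differ
-- ('\0Paris' vs 'Paris\0'), so the hashes differ as well.
SELECT HASHBYTES('MD5', '' + CHAR(0) + 'Paris')  AS Hash1,
       HASHBYTES('MD5', 'Paris' + CHAR(0) + '') AS Hash2;
```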
The article is very useful, thank you.
April 28, 2015 at 6:30 pm
I read this excellent article and have to confess there is one case which I don't fully comprehend.
"BINARY_CHECKSUM fails to detect symmetric changes"
What is a symmetric change?
What other strings could be used in your example?
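In this context a "symmetric change" is an edit where values move between columns (or change in mirrored ways) such that BINARY_CHECKSUM's internal rotate-and-XOR arithmetic still produces the same result. One way to probe it yourself is to compare the checksum before and after swapping values between columns; the values below are only illustrative, and whether a given pair collides depends on the exact data:

```sql
-- If these two checksums match, a swap of the two column values
-- would go undetected by a BINARY_CHECKSUM-based comparison.
SELECT BINARY_CHECKSUM('Paris', '')  AS BeforeSwap,
       BINARY_CHECKSUM('', 'Paris') AS AfterSwap;
```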
thx
December 6, 2017 at 12:52 pm
Nakul,
I have one question. I'm using SQL Server 2016 and I have a table with 150 columns of multiple data types.
I implemented HASHBYTES (SHA1) for a MERGE statement (insert/update).
I notice that sometimes it does not detect changes. I have around 3 million rows;
for example, it did not detect 7 rows that needed to be updated out of the 3 million.
Is there any limit on how many columns you can pass to the HASHBYTES function?
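There is no documented limit on the number of columns as such; HASHBYTES only sees the single concatenated value you build from them, so missed changes usually come from how that value is assembled (NULL propagation, missing delimiters, or non-deterministic implicit conversions) rather than from the column count. A hedged sketch of a more robust input, with illustrative table and column names:

```sql
-- CONCAT (SQL Server 2012+) treats NULL as '' and converts each argument
-- to a string, but without delimiters adjacent columns can still blur
-- together ('AB' + 'C' vs 'A' + 'BC'), so insert an explicit separator.
SELECT HASHBYTES('SHA2_256',
           CONCAT(Col1, '|', Col2, '|', Col3 /* , '|', ... all columns */)
       ) AS RowHash
FROM dbo.MyWideTable;

-- For date/time and float columns, consider CONVERT with an explicit
-- style so the string form is deterministic across sessions.
```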
Thanks,
Oded Dror
December 6, 2017 at 9:42 pm
Is there any particular reason to use SHA1 instead of MD5? I would recommend changing the algorithm and then validating whether the changes are being detected correctly. SQL Server 2016 marks a lot of the older algorithms as deprecated; they will not be supported in a future version of SQL Server.
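One way to run that validation is to compare hashes side by side under the new algorithm and count the disagreements against the rows you know should differ; the table and column names here are only placeholders:

```sql
-- Rows whose SHA2_256 row-hashes disagree between source and target;
-- this count should match the number of genuinely changed rows.
SELECT COUNT(*) AS MismatchedRows
FROM   dbo.SourceTable s
JOIN   dbo.TargetTable t
       ON t.BusinessKey = s.BusinessKey
WHERE  HASHBYTES('SHA2_256', CONCAT(s.Col1, '|', s.Col2))
    <> HASHBYTES('SHA2_256', CONCAT(t.Col1, '|', t.Col2));
```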
Thanks & Regards,
Nakul Vachhrajani.
http://nakulvachhrajani.com
Follow me on
Twitter: @sqltwins
December 6, 2017 at 10:04 pm
Nakul Vachhrajani - Wednesday, December 6, 2017 9:42 PM: Is there any particular reason to use SHA1 instead of MD5? I would recommend changing the algorithm and then validating if the changes are being detected correctly or not. SQL 2016 comes with a lot of the older algorithms marked as "Deprecated" and will not be supported in a future version of SQL Server.
Nakul,
Thank you
Oded Dror
May 6, 2024 at 9:17 pm
It's an old article that I hadn't seen before. I like it a lot and just used it instead of having to give someone a long-winded explanation as to the what and why of using the HASHBYTES part of it all.
Thank you for the time you spent researching and writing it up Nakul.
--Jeff Moden
Change is inevitable... Change for the better is not.