February 2, 2010 at 2:37 pm
john.arnott (2/2/2010)
When the HASHBYTES() function fails with the input truncation error, it does not return NULL, but rather doesn't return anything.
But usually SQL Server considers "nothing" as a NULL value.
Here is the example:
declare @tab table (a int)
select a from @tab
if (select a from @tab) is null
print 'null'
else
print 'not null'
The result is:
a
-----------
(0 row(s) affected)
null
As we can see, "nothing" = "null" in this case.
I think that the warning (or error, or exception, or whatever) plays its important role.
Here is the modified example:
declare @tab table (a int)
select a from @tab where 1 = 1/0
if (select a from @tab where 1 = 1/0) is null
print 'null'
else
print 'not null'
The result is:
a
-----------
Msg 8134, Level 16, State 1, Line 3
Divide by zero error encountered.
Msg 8134, Level 16, State 1, Line 5
Divide by zero error encountered.
not null
Does it look like "nothing" <> "null"? I don't think so, because of the error message. I think that SQL Server executes "select a from @tab where 1 = 1/0", gets an error, and does not even try to compare this with NULL. Since there's no TRY..CATCH block, batch execution continues and... steps into the ELSE block. But why is the ELSE block not skipped? Maybe it's "by design" 🙂
February 2, 2010 at 3:32 pm
The "inconsistencies" I point to in this post may actually be controlled by the use of settings as pointed out by Paul White in a post later in this thread. Here's that post:
Paul White NZ (3/30/2010)
Everything that has been observed, regarding overflows, truncations, ELSE clauses executing and so on...is all documented in Books Online, under the following topics:The various behaviours described in this thread can all be varied, and explained, by reading those references.
The other thing is that a scalar sub-query like in the example (SELECT a FROM @tab) always returns a value - since it returns a column reference. If no rows are produced, the scalar sub-query presents a NULL.
vk-kirov,
Your example is consistent with the HASHBYTES example from Hugo. And it DOES look as though SQL is not treating "no result" as the same thing as "value undetermined" or NULL. But then, I suppose it makes a certain amount of logical sense that a test for NULL return FALSE when the thing being tested is not defined; and "not defined" is not the same as "has an undetermined value".
TSQL is inconsistent. A simple conversion error stops the processing.
Declare @x datetime
set @x = convert(datetime,'2010-0518')
If @x is NULL
print 'Null found'
else print '@x is not NULL'
--------------- Returns:
--Msg 241, Level 16, State 1, Line 3
--Conversion failed when converting datetime from character string.
But the inconsistency is that the HASHBYTES overflow allows the IF to continue, which finds that the local variable has an undefined value, expressed as NULL:
Declare @b-2 varbinary
Set @b-2 = HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 9000))
If @b-2 is NULL
print 'Null found'
else print '@b is not NULL'
--------------- Returns:
--Msg 8152, Level 16, State 10, Line 2
--String or binary data would be truncated.
--Null found
Finallly, testing the HASHBYTES function directly gets back to the original form. Again the error doesn't stop further processing, but this time, the execution result itself is undefined, so the test fails and the ELSE branch is followed. At this point, it doesn't matter whether you test for IS NULL or for IS NOT NULL.
If HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 9000)) is NULL
print 'Null found'
else print 'Result is not NULL'
If HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 9000)) is NOT NULL
print 'Null found'
else print 'Result is not NULL'
-- --------------------- Returns:
--Msg 8152, Level 16, State 10, Line 2
--String or binary data would be truncated.
--Result is not NULL
--Msg 8152, Level 16, State 10, Line 6
--String or binary data would be truncated.
--Result is not NULL
------
edited to add preface with post from Paul.
------
February 3, 2010 at 9:51 am
"Inconsistent" - indeed. I thought this should generate an error, so I don't get the 1 point.
But this is a bug in SQL Server! Given one often uses hashing to verify things are identical when confirming data integrity, a hash which treats two different strings as identical is a a serious failure. The string size limit for HASHBYTES may well be 8,000 (4,000 for nvarchar), but larger strings HASHBYTES should NOT result in the hash of a truncated string - it should produce an ERROR. Or allow long strings when the data type permits them.
Has anyone raised this as a bug yet?
February 3, 2010 at 12:38 pm
a hash which treats two different strings as identical is a a serious failure.
No, it's not. Collisions in a hash are by design. After all, the possible number of different values in a 8000-character string is much larger than the possible number of 160-but hash values.
That's not what causes the problem here, though. As I'm sure you understand after reading all the other replies. The real problem is that REPLICATE uses the same input as output data type, which in this case has an 8000-byte limit, and that longer values are truncated. Raising a bug is pointless, as this is all both by design and well documented.
February 3, 2010 at 12:53 pm
The real problem is that REPLICATE uses the same input as output data type, which in this case has an 8000-byte limit, and that longer values are truncated.
I would have thought that although this reasonable (if not obvious) behavior of REPLICATE led to the two HASHBYTES calls evaluating identical strings in the original QOD, the real problem is that when HASHBYTES fails due to input-overflow (when we feed it a varchar(max), it apparently does not present a result of NULL that may be tested. Rather, as I showed in my previous post, an IF statement testing the output of HASHBYTES goes to the FALSE branch regardless of whether you test for "IS NULL" or for "IS NOT NULL". That would seem more of a candidate for bug status.
February 3, 2010 at 1:07 pm
john.arnott (2/3/2010)
The real problem is that REPLICATE uses the same input as output data type, which in this case has an 8000-byte limit, and that longer values are truncated.
I would have thought that although this reasonable (if not obvious) behavior of REPLICATE led to the two HASHBYTES calls evaluating identical strings in the original QOD, the real problem is that when HASHBYTES fails due to input-overflow (when we feed it a varchar(max), it apparently does not present a result of NULL that may be tested. Rather, as I showed in my previous post, an IF statement testing the output of HASHBYTES goes to the FALSE branch regardless of whether you test for "IS NULL" or for "IS NOT NULL". That would seem more of a candidate for bug status.
That's the real problem that surfaxced in this discussion; the problem exposed by the question (which I believed David to be responding to) is not related to the input-overflow behaviour.
And I agree that the overflow behaviour is a lot more bug-worthy than the 8000-character limit on varchar data.
February 4, 2010 at 5:18 am
Good Question ... there is always something new to learn
March 30, 2010 at 10:07 am
Everything that has been observed, regarding overflows, truncations, ELSE clauses executing and so on...is all documented in Books Online, under the following topics:
The various behaviours described in this thread can all be varied, and explained, by reading those references.
The other thing is that a scalar sub-query like in the example (SELECT a FROM @tab) always returns a value - since it returns a column reference. If no rows are produced, the scalar sub-query presents a NULL.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
Viewing 8 posts - 16 through 22 (of 22 total)
You must be logged in to reply to this topic. Login to reply