In my last post, I noted that one of the biggest differences between ISNULL and COALESCE was the fact that ISNULL attempted to convert the second parameter to the data type of the first parameter where as COALESCE converted according to the Data Type Precedence table. A reader requested that I go into more detail on what that means. At first I wasn't sure what I was going to explain, there didn't seem like a lot to talk about once I linked the BOL article on Data Type Precedence(which I meant to do in my initial post but apparently never did). After thinking about it for a while, I realized that one thing that isn't really pointed out in the BOL page is what these implicit conversions can do to performance if you aren't paying attention. This post got a bit long.
This isn't a topic that I'm real familiar with, so I had to do some research / tests of my own to write this. I've had to fix the varchar/nvarchar one several times, but others I can't truly explain. Why does a char->varchar comparison not trigger an implicit conversion? Honestly, I'm not sure. My guess would be that the optimizer is simply smart enough to not do it, but as I said, that's just a guess. While attempting to find the answer to this online, I stumbled across a brilliant script written by Jonathan Kehayias that focuses on finding implicit conversions in the plan cache.
The examples below focus on non-numeric conversions. I did a good amount of testing on different numeric conversions, and although I've read that SQL 2000 had specific issues, I was not able to easily duplicate this with the numeric types in any compatibility level with my 2K8 installation (so I left those examples out). If anyone has any good examples of this behavior with numeric data types, I’d be happy to add them.
Test Setup: (Note that because I am dropping/creating a real table and user defined types, you should be careful which database you execute this against. I would suggest creating a new one and executing it there)
--- Drop and Re-create User Defined Type
IF EXISTS (SELECT * FROM sys.types where name = 'UDVC') DROP TYPE UDVC
IF EXISTS (SELECT * FROM sys.types where name = 'UDNVC') DROP TYPE UDNVC
CREATE TYPE UDVC FROM varchar(60)
CREATE TYPE UDNVC FROM nvarchar(60)
--- Drop and Re-Create Test Table
IF EXISTS(SELECT * FROM sys.objects WHERE name = 'DTP' AND type = 'U') DROP TABLE DTP
SELECT TOP 100000 -- If you use too few rows, the performance differences aren't as apparent.
CAST(NEWID() as nvarchar(60)) NVCCol,
CAST(NEWID() as varchar(60)) VCCol,
CAST(NEWID() as char(60)) CCol,
CAST(NEWID() as sql_variant) SQLVCVarCol
INTO DTP
FROM Util..Tally --I keep a Tally table in a Utility Database named Util.
--Either change to the location of your tally table or use the other FROM statement below.
--FROM master..spt_values A CROSS JOIN master..spt_values B CROSS JOIN master..spt_values C
CREATE INDEX IX_NVCCol ON DTP(NVCCol)
CREATE INDEX IX_VCCol ON DTP(VCCol)
CREATE INDEX IX_CCol ON DTP(CCol)
CREATE INDEX IX_SQLVCVarCol ON DTP(SQLVCVarCol)
GO
Tests
-- Test 1: Compare varchar and nvarchar against varchar column.
-- These will show a massive difference because it must convert the varchar column VCCol to nvarchar to compare them
-- In the execution Plan, you will see that the first uses an index scan and the second uses an index seek.
SELECT VCCol FROM DTP WHERE VCCol = 'A'
SELECT VCCol FROM DTP WHERE VCCol = N'A'
If you hover over the index scan, you will see a CONVERT_IMPLICIT on your varchar column:
-- Test 2: Compare varchar and nvarchar against nvarchar column.
-- These also show a very small difference because nvarchar is higher in the precedence list and so it only has to convert 'A'
-- to an nvarchar rather than the entire column.
SELECT NVCCol FROM DTP WHERE NVCCol = 'A'
SELECT NVCCol FROM DTP WHERE NVCCol = N'A'
Here you will see the CONVERT_IMPLICIT again,but this time it only applies to the scalar.
-- Test 3: Compare varchar and nvarchar against SQLVariant column.
-- Although this may seem very similar to Test 1, you won't see much of a difference here. This is because sql_variant
-- is near the top of the DTP table (higher than varchar/nvarchar), so you only have to convert the single value.
SELECT SQLVCVarCol FROM DTP WHERE SQLVCVarCol = 'A'
SELECT SQLVCVarCol FROM DTP WHERE SQLVCVarCol = N'A'
-- Test 4: Compare varchar and sql_variant against varchar column.
-- Here you get the massive difference you would expect because you must convert the entire column to a sql_variant.
-- This one actually has a different plan all together and not just an index scan.
DECLARE @a sql_variant
SET @a = 'A'
SELECT VCCol FROM DTP WHERE VCCol = 'A'
SELECT VCCol FROM DTP WHERE VCCol = @a
-- Test 5: Compare User Defined Types (with bases of varchar and nvarchar) against varchar column.
-- This behaves exactly as Test 1 did. Conversions seem to be handled like they would be if the data types were the base types.
DECLARE @a UDVC
DECLARE @b UDNVC
SET @a = 'A'
SET @b = 'B'
SELECT VCCol FROM DTP WHERE VCCol =@a
SELECT VCCol FROM DTP WHERE VCCol = @b
--Test 6: Compare char and varchar against char column
-- There is not any performance loss here. It seems like there should be, but at least in my tests there is not.
DECLARE @a varchar(60), @b char(60)
SET @a = 'A'
SET @b = 'A'
SELECT CCol FROM DTP WHERE CCol = @b
SELECT CCol FROM DTP WHERE CCol = @a
The varchar/nvarchar conversions can be especially painful / tricky when the code is being passed in from elsewhere. One of the places that I've seen issues with this is LINQ to SQL. It is entirely possible that it was just our setup that was mismatched (I wasn't involved in that end of it), but I figured I'd throw it out there anyways.