I recently encountered an interesting performance issue (due to implicit conversions) that I was able to solve using a lesser known technique and I wanted to share it with you here.
A customer environment was suddenly receiving high CPU alarms and I was called in to take a look.
Having identified the query responsible for the high CPU consumption (courtesy of sp_WhoIsActive), I then inspected it’s execution plan (using SQL Sentry’s Plan Explorer) in an effort to establish a possible cause for the poor performance.
From the execution plan for the query we can see that the majority of the cost (83%) is incurred by the Index Scan operator.
Reviewing the properties for this operator showed that an Implicit Conversion was being performed. The query had a parameter with a data type of NVARCHAR and was performing a comparison to a column with data type of VARCHAR.
What To Do When You Cannot Edit the T-SQL Source
Implicit conversions are a well known source of performance issues. No problem you say, let’s just edit the T-SQL:
- Find the stored procedure and cast the parameter – No can do, the code is called directly from the application as an ad-hoc batch.
- Change the parameter data type on the application side – Sure but we have a problem in production right now! The instance is pegged and you’re customer is loosing money. Where is the code within the application? What if you don’t have access to the application source, maybe it’s a third party product.
What we need here is a way to change the data type of the column and index, without touching the existing T-SQL code or adversely affecting the current schema configuration. In other words, the changes need to be completely transparent to the application.
Fortunately these requirements can be met by using an Indexed Computed Column.
Using an Indexed Computed Column to Avoid Implicit Conversions
Let’s go through an example to see this behavior in action.
(You can use the following build script to create a simple test database with a single table and some data should you wish.)
Consider the following simple table schema and query:
CREATE TABle TableA ( IDINT identity(1,1) not null, SomeDatavarchar(50) null, SomeDatedatetime ); declare @p1 nvarchar(50) set @p1 = N'Username 4' select A.ID, A.SomeData, A.SomeDate from TableA A Where A.SomeData = @p1
The table TableA has a column SomeData with a data type of VARCHAR(50) and a parameter @p1 of NVARCHAR(50). This data type mismatch comes in to play during a comparison in the WHERE clause.
The execution plan looks like this:
As you can see, the Index Seek operator on TableA.ncl_SomeData performs an implicit conversion.
Now we’ll create a computed column SomeData_comp and add a non-clustered index to it.
--Create a computed column using the preferred data type ALTER TABLE TableA ADD SomeData_comp AS (convert(nvarchar(50),SomeData)); --Create a non-clustered index on the computed column CREATE NONCLUSTERED INDEX ncl_SomeData_comp on TableA(SomeData_comp);
Let’s now execute our example query once more:
As you can see, SQL Server is able to determine that although we have requested the column SomeData (a varchar) be compared to @p1 (an nvarchar), there is now an index available that provides the same data but in the desired data type, via the computed column.
Implementing this change means that SQL Server is now able to take advantage of this new Index, that is of the preferred data type for the query, without the need to modify the query itself.
(Note: If the application is performing inserts without an explicit field list, they will fail under this solution. A hat-tip to Brent Ozar(Blog|Twitter) for reminding me of this.)
Business As Usual is “the” Priority
In the production scenario I described earlier, this solution meant that a temporary fix could be deployed immediately in order to remove the excessive CPU consumption. Addressing the implicit conversion meant that an Index Seek could be used, rather than the expensive Scan operation, and subsequently a more suitable join operation was chosen by optimizer (Nested Loops rather than Merge) for the query.
The indexed computed column solution enabled business operations to return an acceptable level swiftly, allowing for a more suitable permanent fix to be pursued in a timely fashion.