Introduction
Have you ever experienced a T-SQL query, a stored procedure, view or a trigger
returning unexpected results? As you investigate, you get a suspicion that it
might have something to do with ANSI DEFAULTS? ANSI DEFAULTS is probably the
last thing you would think of. In such cases, you are probably thinking that
either it’s a SQL Server bug or a bug in your code or simply the data in the
tables is bad. You would be surprised by the manner in which ANSI settings
impact how SQL Server stores data, how NULL comparisons are evaluated and how
T-SQL queries search and process data. Well, this article attempts to resolve
some of the mystery surrounding ANSI DEFAULTS options in SQL Server and how it
can impact your application behavior.
First, lets talk about what ANSI DEFAULTS options are all about. ANSI X3.135-1992
(also referred to as SQL-92 and ANSI SQL) is the industry standard for Database
Language SQL. T-SQL in Microsoft SQL Server 2000 provides the SET command to
manipulate ANSI DEFAULTS settings that specify some SQL-92 standard behavior.
Together, the SQL-92 standard SET options, including ANSI_NULLS,
ANSI_NULL_DFLT_ON, ANSI_PADDING, ANSI_WARNING, CURSOR_CLOSE_ON_COMMIT,
IMPLICIT_TRANSACTIONS AND QUOTED_IDENTIFIER, define the query-processing
environment of a SQL Server session. These options, however, do not include all
of the options required to conform to the SQL-92 standard.
The options can be manipulated by the T-SQL SET command and can have a paramount
affect on how SQL Server stores and retrieves data. For those interested in
digging deeper in to ANSI SQL specs I
would recommend checking out
http://web.ansi.org/public/std_info.html.
Often, developers and DBAs underestimate the impact of these settings on SELECT
statements, stored procedures, triggers and views. The impact can lead to
unexpected application behavior and hours of research and troubleshooting. The
troubleshooting task can become challenging since ANSI options are set at
several levels within Microsoft SQL Server 2000, as well as at the connection
level by the client application. For example,
the SQL Server ODBC driver and Microsoft OLE
DB Provider for SQL Server automatically set ANSI_DEFAULTS to ON. Similarly,
some ANSI options can be configured using ODBC data sources, ODBC connection
attributes, or OLE DB connection properties.
Further
in the article, I will explain how ANSI DEFAULTS settings work in SQL Server. In
addition, through problem scenarios, I will explain the behavior of two very
common ANSI options namely; ANSI PADDING and ANSI NULLS and demonstrate their
impact on T-SQL queries. I will discuss some T-SQL functions and commands that
are available to identify and recover from such scenarios.
How does
Microsoft SQL Server manage ANSI SQL-92 SET options?
First it’s important to understand how SQL Server
handles SQL-92 standard SET options. In Microsoft SQL Server these SET options
are defined at three levels – server, database and session.
The database setting overwrites the server setting, and
the session setting overwrites the server and database setting. Let’s take a
look at these three levels independently and understand how ANSI settings are
handled at each of these levels. The sp_configure system stored procedure has
the ‘user options’ setting that allows manipulation of server-wide ANSI settings
and it works across databases. For example, the following script sets
server-wide ANSI_NULL settings:
USE master
go
EXEC
sp_configure 'user options', '32'
go
RECONFIGURE
WITH OVERRIDE
go
The sp_dboption system stored procedure allows
manipulation of database settings and works at the database level only. For
example, the following script will cause all comparisons to a null value made in
the Northwind database to evaluate to unknown.
USE master
go
sp_dboption 'Northwind',
'ANSI nulls', TRUE
go
And finally, the SET statement allows manipulation of
ANSI settings at the session level. It overrides database option settings (set
by using sp_dboption). For example, the following statement will cause the
statement - “SELECT * from table_name WHERE column_name
= NULL” to return zero rows even if there are null values in column_name.
SET
ANSI_NULLS ON
Having discussed how
ANSI_DEFAULTS SET options work in SQL Server 2000 now lets take a look at two
specific scenarios. Scenario 1 demonstrates ANSI PADDING behavior
and Scenario 2 explains the impact of ANSI NULLS.
Scenario 1 – ANSI_PADDING
Client applications often access data at the backend by utilizing some sort of
search criteria to return result sets. In most of these queries, when searching
a char or varchar data type, significant spaces padded at the end of the
characters can affect the final outcome of the query. The ANSI PADDING option
directly impacts how SQL Server stores char, binary, varchar and varbinary data.
In this scenario we will focus on character values only. Depending on whether
this option is set to ON or OFF at the time of creating the table, SQL Server
may or may not trim trailing blanks while storing the table’s values.
Script A demonstrates this behavior. It creates the table called Members
in four different ways with following combinations of ANSI_PADING and column
data types:
- ANSI_PADDING ON and columns defined as char
- With ANSI_PADDING OFF and columns defined as char
- With ANSI_PADDING ON and columns defined as varchar
- With ANSI_PADDING OFF and columns defined as varchar<
Two records are inserted in the Members table. Both
records are similar except that the column LastName in the first record has a
value of ‘Doe’ and in the second record has a value of ‘Doe ‘ (padded with two
blank spaces). A SELECT statement is then run against the Members table using
search criteria on column LastName. For demonstration purposes I have used the
LIKE operator to implement the search as – “WHERE LastName LIKE ‘Doe %’ “(Doe
followed by a space and % sign which acts as a wild card).
The results of Script A are interesting.
From Results A, notice that the output returned by the SELECT
statement when the ANSI PADDING is OFF, irrespective of whether LastName is
defined as CHAR or VARCHAR datatype are the same (zero rows returned). What this
means is ‘Doe ‘ (with padded spaces) gets stored as ‘Doe’ (no spaces padded).
Therefore, WHERE LIKE = ‘Doe %’ evaluates to zero rows found the in the table.
But when ANSI PADDING is ON the results are much different. Two rows are
returned if LastName is defined as char and one row is returned if LastName is
defined as varchar. Looking at Results A a bit closer, we can see
that when ANSI_PADDING is ON and the column is defined as a varchar column,
trailing blanks are not trimmed and therefore impact results returned by search
based on a padded string. In addition, it is important to note that the value
of ANSI_PADDING at the time of table creation will be in effect for the life of
the table. Once the table has been created, setting this option to ON or OFF
will not alter how character values are stored. So, you can see that its
important for DBAs and developers to clearly understand this behavior when
defining their schema and writing code to search through char, varchar, binary
and varbinary columns.
SCRIPT A
SET NOCOUNT ON
go
Print 'With
ANSI_PADDING ON and column type char'
SET ANSI_PADDING
ON
GO
CREATE TABLE
Members (Firstname char(20), LastName char(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', 'Doe ')
GO
SELECT FirstName,
LastName FROM Members WHERE LastName like 'Doe %'
SELECT 'LastName
column length' = datalength(lastname) FROM Members
GO
Drop table
Members
go
Print 'With
ANSI_PADDING OFF and column type char'
SET ANSI_PADDING
OFF
GO
CREATE TABLE
Members (Firstname char(20), LastName char(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', 'Doe ')
GO
SELECT FirstName,
LastName FROM Members WHERE LastName like 'Doe %'
SELECT 'LastName
column length' = datalength(lastname) FROM Members
GO
Drop table
Members
go
Print 'With
ANSI_PADDING ON and column type varchar'
SET ANSI_PADDING
ON
GO
CREATE TABLE
Members (Firstname varchar(20), LastName varchar(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', 'Doe ')
GO
SELECT FirstName,
LastName FROM Members WHERE LastName like 'Doe %'
SELECT 'LastName
column length' = datalength(lastname) FROM Members
GO
Drop table
Members
go
Print 'With
ANSI_PADDING OFF and column type varchar'
SET ANSI_PADDING
OFF
GO
CREATE TABLE
Members (Firstname varchar(20), LastName varchar(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', 'Doe ')
GO
SELECT FirstName,
LastName FROM Members WHERE LastName like 'Doe %'
SELECT 'LastName
column length' = datalength(lastname) FROM Members
GO
Drop table
Members
Go
RESULTS A
With
ANSI_PADDING ON and column type char
FirstName
LastName
-------------------- --------------------
John Doe
John Doe
LastName
column length
----------------------
20
20
With
ANSI_PADDING OFF and column type char
FirstName
LastName
-------------------- --------------------
LastName
column length
----------------------
3
3
With
ANSI_PADDING ON and column type varchar
FirstName
LastName
-------------------- --------------------
John
Doe
LastName
column length
----------------------
3
5
With
ANSI_PADDING OFF and column type varchar
FirstName
LastName
-------------------- ------------- -------
LastName
column length
----------------------
3
3
Another important aspect of the behavior demonstrated by
Script A is that ANSI PADDING affects how SQL Server internally
stores the LastName values. The behavior is determined by the DATALENGTH
function, which returns the number of bytes for a given expression. DATALENGTH
returns a value of 20 for both rows when ANSI_PADDING is ON and LastName is
defined as char. Whereas, it returns 3 and 5 for the two rows when ANSI_PADDING
is ON and LastName is defined as varchar. This is not within the scope of this
article but its worth noting.
How to recover from the affects of ANSI_PADDING?
Regardless of the ANSI_PADDING setting (whether its ON or OFF), essentially, you
need to be aware of two time frames – creation of the
tables and execution of the SET statement. It is the combination of the two that
determine how the values will be stored in the table and how search conditions
involving padded strings will be evaluated. To know the value of the
ANSI_PADDING setting at the time of execution of the CREATE TABLE statement you
can use the COLUMNPROPERTY function.
SELECT
COLUMNPROPERTY (OBJECT_ID (‘Members’), ‘LastName’, ‘UsesAnsiTrim’)
A
returned value of 1 means ANSI_PADDING was ON when the table was created and a
value of 0 means it was OFF.
Since most Transact-SQL SET statements are interpreted
during execution it may be important to identify the exact SET options that the
user connection inherits. DBCC USEROPTIONS can be used to determine the SET
options inherited by a session.
DBCC USEROPTIONS
For
example, DBCC USERIPTIONS can tell us what ANSI SET options are in affect for a
particular SQL server user session. Below is a sample output of this command:
Set
Option
Value
----------------------- -----------------
textsize
64512
language us_english
dateformat
mdy
datefirst
7
quoted_identifier SET
arithabort
SET
nocount
SET
ansi_null_dflt_on SET
ansi_defaults
SET
ansi_warnings
SET
ansi_padding
SET
ansi_nulls
SET
concat_null_yields_null SET
Developers commonly use the RTRIM function to shield their application against
problems arising from padded strings. Script AA below demonstrates
how the use of RTRIM in Scenario-I can resolve the search problem.
SCRIPT AA
SET NOCOUNT ON
go
Print 'With
ANSI_PADDING ON and column type char'
SET ANSI_PADDING
ON
GO
CREATE TABLE
Members (Firstname char(20), LastName char(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', 'Doe ')
GO
declare @string
char (5)
select @string
='Doe '
SELECT FirstName,
LastName FROM Members WHERE LastName like RTRIM(@string)+'%'
SELECT 'LastName
column length' = datalength(lastname) FROM Members
GO
Drop table
Members
go
Print 'With
ANSI_PADDING OFF and column type char'
SET ANSI_PADDING
OFF
GO
CREATE TABLE
Members (Firstname char(20), LastName char(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', 'Doe ')
GO
declare @string
char (5)
select @string
='Doe '
SELECT FirstName,
LastName FROM Members WHERE LastName like RTRIM(@string)+'%'
SELECT 'LastName
column length' = datalength(lastname) FROM Members
GO
Drop table
Members
go
Print 'With
ANSI_PADDING ON and column type varchar'
SET ANSI_PADDING
ON
GO
CREATE TABLE
Members (Firstname varchar(20), LastName varchar(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', 'Doe ')
GO
declare @string
char (5)
select @string
='Doe '
SELECT FirstName,
LastName FROM Members WHERE LastName like RTRIM(@string)+'%'
SELECT 'LastName
column length' = datalength(lastname) FROM Members
GO
Drop table
Members
go
Print 'With
ANSI_PADDING OFF and column type varchar'
SET ANSI_PADDING
OFF
GO
CREATE TABLE
Members (Firstname varchar(20), LastName varchar(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', 'Doe ')
GO
declare @string
char (5)
select @string
='Doe '
SELECT FirstName,
LastName FROM Members WHERE LastName like RTRIM(@string)+'%'
SELECT 'LastName
column length' = datalength(lastname) FROM Members
GO
Drop table
Members
Go
Scenario II – ANSI NULLS
ANSI_NULLS SET option controls comparisons against NULL values. Similar to
ANSI_PADDING, ANSI_NULLS can affect results returned by a SELECT statement with
a WHERE condition on NULL values. When this option is set ON NULLs evaluate to
unknown. It is often necessary when implementing business logic to do this type
of comparison and if ANSI_NULLS is not set appropriately then it can lead to
unexpected results and affect your application’s behavior. Script B
demonstrates this scenario. It inserts two rows in the Members table, one with
LastName ‘Doe’ and second with LastName having a value of NULL. When ANSI_NULLS
is set to ON, the SELECT statement returns zero rows since the search condition
‘WHERE LastName=NULL’ evaluates to unknown. When ANSI_NULLS is set to OFF, it
returns one row.
SCRIPT B
SET NOCOUNT ON
go
Print 'With
ANSI_NULLS ON '
SET ANSI_NULLS
ON
GO
CREATE TABLE
Members (Firstname char(20), LastName char(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', NULL)
GO
SELECT FirstName,
LastName FROM Members WHERE LastName=NULL
GO
Drop table
Members
go
Print 'With
ANSI_NULLS OFF '
SET ANSI_NULLS
OFF
GO
CREATE TABLE
Members (Firstname char(20), LastName char(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', NULL)
GO
SELECT FirstName,
LastName FROM Members WHERE LastName=NULL
GO
Drop table
Members
Go
RESULTS B
With ANSI_NULLS ON
FirstName
LastName
-------------------- --------------------
With
ANSI_NULLS OFF
FirstName
LastName
-------------------- --------------------
John NULL
How to recover from the affects of ANSI_NULL?
In order to avoid the unwanted implications of ANSI_NULL setting on your
application, it is best to use the IS NULL or IS NOT NULL clause to test for a
NULL value. Script BB demonstrates how the IS NULL clause resolves
the problem shown in Scenario II.
SCRIPT BB
SET NOCOUNT ON
go
Print 'With
ANSI_NULLS ON '
SET ANSI_NULLS
ON
GO
CREATE TABLE
Members (Firstname char(20), LastName char(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', NULL)
GO
SELECT FirstName,
LastName FROM Members WHERE LastName IS NULL
GO
Drop table
Members
go
Print 'With
ANSI_NULLS OFF '
SET ANSI_NULLS
OFF
GO
CREATE TABLE
Members (Firstname char(20), LastName char(20))
GO
INSERT Members
values ('John', 'Doe')
INSERT Members
values ('John', NULL)
GO
SELECT FirstName,
LastName FROM Members WHERE LastName IS NULL
GO
Drop table
Members
go
It is slightly more
complicated to resolve ANSI_NULL problems when dealing with stored procedures,
triggers and views as the ANSI_NULLS setting is saved when these objects are
created or altered and is enabled (used) upon their execution. In other words,
these objects inherit the ANSI settings of the user session that is used to
creates them. Script C below demonstrates this behavior. It
creates the stored procedure, ‘test1’, first with ANSI_NULLS ON and then with
ANSI_NULLS OFF. Before the execution of ‘test1’ the ANSI_NULLS setting is
switched. So if it was created with ANSI_NULLS ON then it’s executed with
ANSI_NULLS OFF and vice versa. The procedure initializes a variable @var to NULL
and then does a simple NULL comparison of the variable as “IF @var = NULL”. In
this case the IF statement should always evaluate to TRUE. But looking at
Results C, you will notice that the execution of the stored procedure
‘test1’ seem to ignore the ANSI_NULL setting of the user session that issues the
execute statement. The stored procedure inherits the ANSI_NULL setting that
existed at object creation time, and ignores the current setting. I have seen
many times where this behavior is cast off as a bug. This is expected behavior
within the SQL Server product and all DBAs and developers need to be familiar
with it. The same behavior applies to triggers and views as well.
SCRIPT C
Print 'Object created with
ANSI_NULLS ON '
SET ANSI_NULLS ON
GO
create proc test1 as
declare @var int
select @var= null
If @var= NUll
select ' it is null'
else select ' it is not null'
go
exec test1
set ansi_nulls off
go
exec test1
SELECT OBJECTPROPERTY (OBJECT_ID('test1'),
'ExecIsAnsiNullsOn')
go
drop proc test1
go
Print 'Object created with
ANSI_NULLS OFF '
SET ANSI_NULLS OFF
GO
create proc test1 as
declare @var int
select @var= null
If @var= NUll
select ' it is null'
else select ' it is not null'
go
exec test1
set ansi_nulls on
go
exec test1
SELECT OBJECTPROPERTY (OBJECT_ID('test1'),
'ExecIsAnsiNullsOn')
go
drop proc test1
go
RESULTS
C
Object
created with ANSI_NULLS ON
---------------
it is not
null
---------------
it is not
null
-----------
1
Object
created with ANSI_NULLS OFF
-----------
it is null
-----------
it is null
-----------
0
Just like we saw in Scenario I, the
situation described by Script C warrants the need to know the
ANSI_NULLS settings at the time of object creation. Again, you can use the
OBJECTPROPERTY function to determine this. For example:
SELECT
OBJECTPROPERTY (OBJECT_ID('procname'), 'ExecIsAnsiNullsOn')
In the above statement, a
returned value of 1 means ANSI NULL was ON when the stored procedure was created
and a value of 0 means it was OFF.
Conclusion
As you can see from
Scenario I and II, it can be a bit puzzling to clearly understand odd query
behaviors, which are impacted by changes in ANSI SET options. It is important
to understand how SQL Server implements the SET options. Since the SET options
can be turned on and off at any time by users and applications, the developers
and DBAs should test their code with the ANSI_DEFAULTS SET options turned both
ON and OFF. When set to ON, this option enables the following SQL-92 settings:
SET ANSI_NULLS
SET CURSOR_CLOSE_ON_COMMIT
SET ANSI_NULL_DFLT_ON
SET IMPLICIT_TRANSACTIONS
SET ANSI_PADDING
SET QUOTED_IDENTIFIER
SET ANSI_WARNINGS.
This ensures that the behavior of T-SQL scripts, stored procedures, triggers and
views can be verified and corrected regardless of what options a particular
connection may have turned on using the SET statement prior to invoking the
object. In my next article I will discuss scenarios on
ARITHABORT, ANSI_WARNING, IMPLICIT_TRANSACTIONS
AND QUOTED_IDENTIFIER options. We will discuss how these options can affect your
application behavior and ways to avoid spending hours tracking down the problem.
Additional Resources
http://web.ansi.org/public/std_info.html
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/acdata/ac_8_qd_03_35df.asp
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/qryanlzr/qryanlzr_8qlu.asp
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbcsql/od_6_015_0tf7.asp