ANSI PADDING, Trailing Whitespace, and Variable Length Character Columns

A few months ago I was involved in a couple of threads on SQLServerCentral regarding how SQL Server deals with trailing spaces/whitespace in variable length character columns (varchar/nvarchar). The two threads were:

Help with Trim
String comparison with trailing spaces

Based on these two threads and my own prior ignorance of the subject, I thought that compiling what I learned from them and from other research would help others better understand how it works. Much of this is also on my blog (www.wiseman-wiseguy.blogspot.com).

What is ANSI_PADDING?

ANSI_PADDING is a session/connection setting in SQL Server that determines whether SQL Server handles variable length character data (varchar/nvarchar) according to the ANSI standard, under which trailing blanks are stored in varchar/nvarchar columns. SSMS defaults to ANSI_PADDING ON, as do other .NET applications and ODBC connections. This is important to remember because Microsoft states that the ANSI_PADDING setting in effect when a column is created is what matters; keep this in mind when we run our tests. Note also that Microsoft's documentation says nchar and nvarchar columns always behave as if ANSI_PADDING were ON, so the OFF column below applies in practice to varchar. The following is a chart outlining how SQL Server deals with several objects and operations based on the ANSI_PADDING setting (thanks to Matt Miller in the String comparison with trailing spaces thread):

SQL Object/Operation	Result (ANSI Padding ON)	Result (ANSI Padding OFF)
Primary Key	Trim it. Duplicate key error	Trim it
Unique Index/Constraint	Trim it. Duplicate key error	Trim it
Comparison ("=", "!=")	Trim it. 'a' = 'a '	Trim it
Comparison ("like using _")	Don't trim it	Trim it
Concatenation	Don't trim it. 'a ' + 'a' = 'a a'	Trim it. 'a ' + 'a' = 'aa'
Storage	Don't trim it. 'a' = 1 byte (2 Unicode); 'a ' = 2 bytes (4 Unicode)	Trim it. 'a' and 'a ' = 1 byte (2 Unicode)
Group By	Trim it	Trim it
Order By	Trim it	Trim it

The rows where behavior differs between the two settings are Comparison (LIKE using _), Concatenation and Storage.
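To make the ON column of the chart concrete, here is a minimal T-SQL sketch. The temp table name #padding_demo and its single column are hypothetical, chosen only for illustration, and the expected results in the comments assume ANSI_PADDING ON and a typical default collation.

```sql
SET ANSI_PADDING ON;  -- the default for SSMS, ODBC and .NET connections

-- Hypothetical demo table: one varchar column created with ANSI_PADDING ON.
CREATE TABLE #padding_demo (val varchar(10) NOT NULL);
INSERT INTO #padding_demo (val) VALUES ('a');
INSERT INTO #padding_demo (val) VALUES ('a ');

-- Storage: LEN ignores trailing blanks, DATALENGTH counts the bytes kept.
SELECT LEN(val) AS len_chars, DATALENGTH(val) AS bytes_stored
FROM #padding_demo;                      -- expected: (1, 1) and (1, 2)

-- Comparison with "=": trailing blanks are ignored, so both rows match.
SELECT COUNT(*) AS matches
FROM #padding_demo
WHERE val = 'a';                         -- expected: 2

-- Group By: 'a' and 'a ' fall into the same group.
SELECT COUNT(*) AS rows_in_group
FROM #padding_demo
GROUP BY val;                            -- expected: one group of 2

-- Concatenation: the stored trailing blank is kept.
SELECT '[' + val + 'a]' AS concatenated
FROM #padding_demo;                      -- expected: [aa] and [a a]

DROP TABLE #padding_demo;
```

The brackets in the last query are there only to make the trailing blank visible in the result grid.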
Now what does this mean to you and me when it comes to development? I had to rethink how I expect varchar/nvarchar to behave when dealing with trailing blanks, since I had always thought that trailing blanks were removed from varchar/nvarchar, and that is not the standard behavior. Here are the ways this has changed how I work:

Use the RTRIM function when inserting data into varchar/nvarchar columns to reduce storage.
Use the RTRIM function on the left-side operand in concatenation.
How does this affect .NET and other languages? When I use a lookup component in SSIS does it use the trailing blanks so that 'a' != 'a ' or is it like SQL Server where 'a' = 'a '? I know comparison with SSIS and .NET is case-sensitive, but now I need to trim as well.
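The first two habits in the list above can be sketched as follows. The #person table, its columns and the sample values are hypothetical and only serve the illustration.

```sql
-- Hypothetical table and sample values, for illustration only.
CREATE TABLE #person (first_name varchar(20) NOT NULL,
                      last_name  varchar(20) NOT NULL);

DECLARE @first_name varchar(20), @last_name varchar(20);
SET @first_name = 'Ann   ';   -- arrives with three trailing blanks
SET @last_name  = 'Lee';

-- 1. Trim on the way in, so the trailing blanks are never stored.
INSERT INTO #person (first_name, last_name)
VALUES (RTRIM(@first_name), RTRIM(@last_name));

SELECT DATALENGTH(first_name) AS bytes_stored
FROM #person;                                    -- expected: 3, not 6

-- 2. Trim the left operand when concatenating values that may be padded.
SELECT RTRIM(@first_name) + ' ' + @last_name AS full_name;  -- 'Ann Lee'

DROP TABLE #person;
```

Trimming at insert time means later queries do not have to remember to do it, which is why I prefer it over trimming only at read time.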

The second issue is that SQL Server does not treat what I call special characters (tab, carriage return, line feed) as blanks. So while 'a' = 'a ' (trailing space), 'a' != 'a' followed by a trailing special character. The same holds for Primary Keys, Unique Indexes/Constraints, Order By and Group By. It also affects the trim functions: LTRIM and RTRIM remove only blanks, so special characters are left in place, and you need to take this into account when accepting data from sources where they are possible. In the Help with Trim thread, this issue arose when importing data from an Access database that had special characters at the end of values; correcting it was difficult because removing the special characters caused unique constraint errors.
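A minimal sketch of the special character problem, and of one way to spot the unique constraint collisions before cleaning the data. The #import staging table is hypothetical, and the nested REPLACE calls are just one possible cleanup; note that they remove every tab, carriage return and line feed in the value, not only trailing ones.

```sql
-- Hypothetical staging table holding imported values.
CREATE TABLE #import (code varchar(10) NOT NULL);
INSERT INTO #import (code) VALUES ('a');
INSERT INTO #import (code) VALUES ('a' + CHAR(9));              -- trailing tab
INSERT INTO #import (code) VALUES ('a' + CHAR(13) + CHAR(10));  -- trailing CR/LF

-- Tab, CR and LF are not treated as blanks, so only the first row matches.
SELECT COUNT(*) AS matches
FROM #import
WHERE code = 'a';                        -- expected: 1

-- RTRIM removes only blanks, so the special characters survive it.
SELECT DATALENGTH(RTRIM(code)) AS bytes_after_rtrim
FROM #import;                            -- expected: 1, 2 and 3 (row order not guaranteed)

-- Before cleaning, list the values that would collide once the
-- special characters are stripped (these would violate a unique constraint).
SELECT cleaned, COUNT(*) AS copies
FROM (SELECT REPLACE(REPLACE(REPLACE(code, CHAR(9), ''), CHAR(13), ''), CHAR(10), '') AS cleaned
      FROM #import) AS c
GROUP BY cleaned
HAVING COUNT(*) > 1;                     -- expected: 'a' with 3 copies

DROP TABLE #import;
```

Running the last query first tells you which rows need to be merged or removed before the cleanup UPDATE can succeed.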

Attached are some tests that verify the behaviors explained. Please take the time to read the comments included in the code.
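As a minimal sketch of one such test (not the attached script itself), the following checks the claim that the setting at column creation is what matters. The temp table names are hypothetical. Microsoft documents SET ANSI_PADDING OFF as deprecated, so treat this as an experiment rather than something to use in production code.

```sql
-- Create one column under each setting.
SET ANSI_PADDING OFF;
CREATE TABLE #created_off (val varchar(10) NULL);

SET ANSI_PADDING ON;
CREATE TABLE #created_on (val varchar(10) NULL);

-- The session is ON for both inserts; only the creation-time setting differs.
INSERT INTO #created_off (val) VALUES ('a ');
INSERT INTO #created_on  (val) VALUES ('a ');

SELECT 'created OFF' AS column_created, DATALENGTH(val) AS bytes_stored
FROM #created_off                        -- expected: 1 (trailing blank trimmed)
UNION ALL
SELECT 'created ON', DATALENGTH(val)
FROM #created_on;                        -- expected: 2 (trailing blank kept)

-- The current session setting (1 = ON).
SELECT SESSIONPROPERTY('ANSI_PADDING') AS session_ansi_padding;

-- The setting recorded for each column when it was created.
SELECT '#created_off' AS table_name, is_ansi_padded
FROM tempdb.sys.columns
WHERE object_id = OBJECT_ID('tempdb..#created_off')   -- expected: 0
UNION ALL
SELECT '#created_on', is_ansi_padded
FROM tempdb.sys.columns
WHERE object_id = OBJECT_ID('tempdb..#created_on');   -- expected: 1

DROP TABLE #created_off;
DROP TABLE #created_on;
```

For a permanent table, the same is_ansi_padded check can be run against sys.columns in the table's own database.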
