June 10, 2010 at 9:47 am
Maybe I am missing something, but I do not understand the result set from the following code.
Can someone explain this to me?
DECLARE @tbl TABLE
(
testdata VARCHAR(1024)
)
INSERT INTO @tbl ( testdata )
VALUES ( 'a')--no spaces
INSERT INTO @tbl ( testdata )
VALUES ( 'a ')--1 space
INSERT INTO @tbl ( testdata )
VALUES ( 'a ')--2 spaces
INSERT INTO @tbl ( testdata )
VALUES ( 'a ')--3 spaces
SELECT * FROM @tbl t WHERE t.testdata='a'--no spaces(actually it does not matter how many spaces i put after the 'a)
returns 4 rows.
Steve
June 10, 2010 at 10:24 am
Read about ANSI_PADDING. You can insert and store trailing spaces if you really wish to do so. Note, it will not affect the SQL string comparison rules (eg. WHERE and HAVING clauses).
June 10, 2010 at 10:46 am
it will not affect the SQL string comparison rules (eg. WHERE and HAVING clauses).
Why are trailing spaces considered insignificant in relation to comparison rules?
And I guess more important that that, Is this behavior generally accepted or is there configuration to control it?
Steve
June 10, 2010 at 10:49 am
You can insert and store trailing spaces if you really wish to do so.
it appears to be storing the spaces
when i change the select statement to the following
SELECT '>' + t.testdata + '<' FROM @tbl t WHERE t.testdata='a'
I get this result set
>a< --no spaces
>a < --one space
>a < --two spaces
>a < --three spaces
Steve
June 10, 2010 at 11:00 am
Edited.
Steve, when I ran your query I saw all columns with spaces having the same length when you wrapped them in ><. Without the wrapper, the LEN function always returns a length of 1 for the various combinations being stored, although BOL says it should include trailing spaces in the count.
When I doubled the number of spaces in the constant text being inserted, the '>column<' lengths were different, regardless of the ansi_padding setting. I can't resolve the results with what I'm reading in BOL. If y'all haven't resolved this by this evening, I will spend some serious time researching it.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
June 10, 2010 at 11:04 am
The Dixie Flatline (6/10/2010)
Despite the BOL language that suggests ANSI_PADDING can allow you to store trailing blanks in a column with VARCHAR datatype, it appears that the actual behavior is different.I think the results of the queries below may surprise you. Looking at the LEN() value, the trailing blanks are not being stored at all, regardless of the ANSI_PADDING setting.
Well that is becasue the LEN function, like the comparisons, ignores trailing spaces. The spaces are still there. DATALENGTH will count trailing spaced though:
DECLARE @Foo VARCHAR(100)
SET @Foo = 'a '
SELECT LEN(@foo), DATALENGTH(@Foo)
June 10, 2010 at 11:08 am
June 10, 2010 at 11:21 am
Lamprey,
Thanks for the cite. I knew there had to be something out there on point. 🙂
I was about to challenge your statement about the LEN() but stopped to think about it and realized I agreed with you.
I'm tossing this in because it may save someone else some confusion in the future:
When I used the help index from SSMS to go to the LEN() function, it showed me only the LEN() for SSIS, which (quite annoyingly) behaves differently from LEN() for T-SQL. To get to the LEN() function for T-SQL, I had to change the filtering to show only SQL Database engine results. This was a wake-up call for me.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
June 10, 2010 at 11:25 am
GREAT ARTICLE.
It explains a lot, however, it seems to indicate that the LIKE operator would allow you to find the non padded a.
When I tested it
SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE t.testdata LIKE 'a'
SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE Patindex ('a',t.testdata) != 0
SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE t.testdata='a'
All three of these select statements return 4 rows of data.
Steve
June 10, 2010 at 11:39 am
Like is a bit funky when it comes to trailing spaces. You need to switch things around for a proper comparison:SELECT testdata
FROM @tbl
WHERE 'a' LIKE testdata
June 10, 2010 at 11:55 am
Thanks to all
So the summarized information is here
DECLARE @tbl TABLE
(
testdata VARCHAR(1024)
)
INSERT INTO @tbl ( testdata )
VALUES ( 'a') --0 spaces
INSERT INTO @tbl ( testdata )
VALUES ( 'a ') --1 space
INSERT INTO @tbl ( testdata )
VALUES ( 'a ') --2 spaces
INSERT INTO @tbl ( testdata )
VALUES ( 'a ') --3 spaces
Because of Ansi-92 specification the following selects return all 4 rows
SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE t.testdata='a'
SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE t.testdata LIKE 'a'
Because of the curious behavior of LIKE and PATINDEX the following selects return the 1 desired row.
SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE 'a' LIKE t.testdata
SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE Patindex (t.testdata,'a') != 0
If you want to know the actual Length of data, DATALENGTH() is your friend. Throw Len() away.
Steve
June 10, 2010 at 12:46 pm
I wouldn't say there is anything curious about LIKE's behavior.
It allows for the building of different kinds of strings for a variety of "like" comparisons. In the absence of a wildcard in the string, it is basically an equality test. To get all strings starting with "A", the standard string for comparison is 'A%', not 'A'.
Just to be a devil's advocate, I will now ask why trailing spaces need ever be considered significant? Why not just routinely scrub them from your data?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
June 10, 2010 at 1:49 pm
It's also worth noting that when using fixed width data types such as char and nchar, any values shorter than the maximum length will be padded with trailing spaces regardless of whether the insert included them:
CREATE TABLE #test (char_col CHAR(10), varchar_col VARCHAR(10))
INSERT INTO #test
SELECT 'testing123', 'testing123' UNION ALL
SELECT 'test','test'
SELECT * FROM #test
WHERE RIGHT(char_col,1)=' '
It's a common gotcha if using cached lookups in SSIS as by default, this doesn't consider a strings to be equal when there is a mismatch in trailing spaces.
June 10, 2010 at 2:02 pm
I wouldn't say there is anything curious about LIKE's behavior.
The curious behavior to which I was referring is the subtle nuance between the following statements
SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE t.testdata LIKE 'a'
SELECT DATALENGTH(t.testdata) actualLength,t.testdata FROM @tbl t WHERE 'a' LIKE t.testdata
It is not intuitive, at least not to me, that the two statements would return different result sets.
To me that is curious. It may not be to others.
Steve
June 10, 2010 at 2:07 pm
I will now ask why trailing spaces need ever be considered significant
It is not significant, unless you have data that came from an outside source, say migrated data.
If you did not know that trailing spaces have this effect, you would not know to code for it.
Also, (I have not tested this),I will bet that if I were to put a unique index on the field that the constraint would not be violated.
In that case If I ran a select on a value that was expected to be unique and I got multiple rows... well you can guess from there.
Steve
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply