issues with blank spaces

  • one of my colleagues asked me why this query doesn't do what he thinks it should

    I ran a few tests and the blank spaces seem to be ignored

    SELECT TOP (1000) 
    LEN(Area) length_of_area -- Should be 10, because there is a space at the end of Marketing
    ,*
    ,LEN(Area)
    FROM [MyStore].[dbo].WebAreaAccess
    WHERE AccountCode = 'marshorsecare'
    AND Area LIKE '% %'
    AND Area = 'Marketing ' -- This line and the next line should contradict each other
    AND Area = 'Marketing' -- This line and the previous line should contradict each other
    ORDER BY LEN(Area)

    so - the blank spaces don't seem to be counted in the where clause - the 2 conflicting area statements surely should break each other..

    I can put as many spaces as I want in the query and it still returns the same recordset

    the field is a varchar(100) and collation is latin1 general ci as

    anyone fancy educating me?

     

    MVDBA

  • I remembered coming across this article once upon a time which lead me here.  If I recall correctly I was fighting with a similar sort of thing in a WHERE clause.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • Spaces at the end of strings are usually ignored by SQL Server. Have a read here for some more detail.

    --Neil beat me to it!

    • This reply was modified 4 years, 9 months ago by  Phil Parkin.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • For varchar, LEN(Area) will give the trimmed length while DATALENGTH(Area) will give the actual length.

    For nvarchar, LEN(Area) will give the trimmed length while DATALENGTH(Area) will give the actual length times two.

    So for varchar DATALENGTH(Area) > LEN(Area) will show areas with trailing spaces.

    • This reply was modified 4 years, 9 months ago by  Ken McKelvey.
  • one of the ideas I looked at is reversing the string and checking the ascii value of the first character - please someone tell me that is a stupid idea

    MVDBA

  • Ken McKelvey wrote:

    For varchar, LEN(Area) will give the trimmed length while DATALENGTH(Area) will give the actual length.

    For nvarchar, LEN(Area) will give the trimmed length while DATALENGTH(Area) will give the actual length times two.

    So for varchar DATALENGTH(Area) > LEN(Area) will show areas with trailing spaces.

    I tried this and did some research - len and datalength gave the same value.

    my understanding is that len gives you the number of characters, datalength gives you the number of bytes (since 1 char =1 byte then same thing) - datalength seems to be designed for the limitations of the size of field you can "len" - datalength can go much higher

    MVDBA

  • SELECT LEN(Area + 'x') - 1
  • Jonathan AC Roberts wrote:

    SELECT LEN(Area + 'x') - 1

    worth a try - good idea

    MVDBA

  • The behaviour of DATALENGTH seems to depend on ANSI_PADDING when the table was created.

    All my connections have ANSI_PADDING ON so I have never noticed the difference.

    SET ANSI_PADDING ON;

    IF SESSIONPROPERTY('ANSI_PADDING') = 1
    PRINT 'ANSI_PADDING ON'

    CREATE TABLE #tON
    (
    id int NOT NULL PRIMARY KEY
    ,Test varchar(20) NOT NULL
    );
    INSERT INTO #tON
    VALUES(1,'test'),(2,'test '),(3,'test ');

    SELECT *
    ,LEN(Test) AS LenTest
    ,DATALENGTH(Test) AS DataLengthTest
    FROM #tON

    SET ANSI_PADDING OFF;

    IF SESSIONPROPERTY('ANSI_PADDING') = 0
    PRINT 'ANSI_PADDING OFF'

    CREATE TABLE #tOFF
    (
    id int NOT NULL PRIMARY KEY
    ,Test varchar(20) NOT NULL
    );
    INSERT INTO #tOFF
    VALUES(1,'test'),(2,'test '),(3,'test ');

    SELECT *
    ,LEN(Test) AS LenTest
    ,DATALENGTH(Test) AS DataLengthTest
    FROM #tOFF

    SET ANSI_PADDING ON;
  • interesting - I had another comment that is was the session variable when the ROW was created rather than table creation. there is some "ambiguous" documentation around this area. (create/insert/select)

    if it's a create table thing then I need to drop all tables and recreate, plus re-insert the old data...not a pleasant job

    and if the correct ansi settings are on, will the BCP data conform? or will I have the same issues for legacy data?

    MVDBA

  • The following suggests it is the ANSI_PADDING setting with CREATE TABLE.

    This is interesting. In future I think I am going to alter the first line of my CREATE TABLE scripts from:

    SET ANSI_NULLS, QUOTED_IDENTIFIER ON;

    to

    SET ANSI_NULLS, QUOTED_IDENTIFIER, ANSI_PADDING ON;

    SET ANSI_PADDING ON;

    IF SESSIONPROPERTY('ANSI_PADDING') = 1
    PRINT 'ANSI_PADDING ON'

    CREATE TABLE #tON
    (
    id int NOT NULL PRIMARY KEY
    ,Test varchar(20) NOT NULL
    );
    INSERT INTO #tON VALUES (1,'test ');
    GO
    SET ANSI_PADDING OFF;
    GO
    INSERT INTO #tON VALUES (2,'test ');

    SELECT *
    ,LEN(Test) AS LenTest
    ,DATALENGTH(Test) AS DataLengthTest
    FROM #tON

    SET ANSI_PADDING OFF;

    IF SESSIONPROPERTY('ANSI_PADDING') = 0
    PRINT 'ANSI_PADDING OFF'

    CREATE TABLE #tOFF
    (
    id int NOT NULL PRIMARY KEY
    ,Test varchar(20) NOT NULL
    );
    INSERT INTO #tOFF VALUES (1,'test ');
    GO
    SET ANSI_PADDING ON;
    GO
    INSERT INTO #tOFF VALUES (2,'test ');

    SELECT *
    ,LEN(Test) AS LenTest
    ,DATALENGTH(Test) AS DataLengthTest
    FROM #tOFF

    SET ANSI_PADDING ON;
  • One of today's articles has a script to show columns created with ANSI_PADDING OFF

    https://www.red-gate.com/hub/product-learning/sql-prompt/never-create-columns-with-ansi_padding-set-to-off

     

  • There could be some ugly consequences in some cases if existing legacy data has trailing spaces contained in varchar data type columns, as both tables and indexes could get a difference in page splits (and any attendant nastiness) as a result of a change in the overall data length in a given row or set of rows that result from doing cleanup, such as RTRIM(column) being used as part of an INSERT statement.   Mind you, we're talking about reducing the length of varchar data, but if done via copying a table into tempdb, and then inserting the smaller rows, the smaller row lengths now fit more rows in a page, and ya never know ahead of time what new kind of funky can occur when there's that kind of change, and especially when you already have existing page splits...  You'd like to think that such might get better, but there's no assurance of that, and the more fragmented the available disk space within the MDF file is, the more likely that a problem can occur.   I'd also not be surprised if an in-place UPDATE could cause similar issues.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply