does index also change the output?

  • opc.three (3/1/2013)


    GilaMonster (3/1/2013)


    opc.three (3/1/2013)


    My comment was that the yield from the FROM, WHERE, GROUP BY and HAVING clauses is not altered based on which index the engine picks.

    I can probably produce you an example where it is (row number over a non-unique column and filter on that row number comes to mind)

    I am not picturing it. Wouldn't that require a derived table, in which case the ORDER BY in the ranking function would guarantee the correct result?

    Was thinking something like this:

    SELECT name,

    row_num

    FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY name ),

    name

    FROM sys.tables

    ) tbls ( row_num, name )

    WHERE row_num = 1;

    name is unique.

    When you do a row number over a column that is not unique, the order in which those row numbers are assigned to 'duplicate' values is not guaranteed, change the index that SQL's using and you can change the order that the row numbers are assigned.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I realized that nuance after I posted.

    Looking to mock-up an example...

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Found one. I was never under the impression that adding or changing an index was completely harmless for various other reasons but not for this particular one. I am still faulting over the fact that it is something in a SELECT-column-list that is altering the outcome, although it is being incorporated into a resultset so all bets are off. This has been added to my list of things to consider when tuning, reviewing and writing code.

    Setup:

    USE tempdb;

    CREATE TABLE dbo.test

    (

    name VARCHAR(100),

    type_desc VARCHAR(100)

    );

    INSERT INTO dbo.test

    ( name, type_desc )

    VALUES ( 'xyz', 'user' );

    INSERT INTO dbo.test

    ( name, type_desc )

    VALUES ( 'abc', 'user' );

    CREATE CLUSTERED INDEX [cx] ON dbo.test (name);

    Query 1:

    SELECT name,

    row_num

    FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY type_desc ),

    name,

    type_desc

    FROM dbo.test

    ) tbls ( row_num, name, type_desc )

    WHERE row_num = 1;

    Add another index:

    -- add nc index to change support for ROW_NUMBER

    CREATE NONCLUSTERED INDEX [ix] ON dbo.test (type_desc DESC);

    Query again, same as above but different result:

    SELECT name,

    row_num

    FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY type_desc ),

    name,

    type_desc

    FROM dbo.test

    ) tbls ( row_num, name, type_desc )

    WHERE row_num = 1;

    Unsetup:

    DROP TABLE dbo.test;

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • @Gail, In thinking about this a little more, I think the example I gave is just a roundabout way of asking for TOP 1 without an ORDER BY, which I say is nonsensical.

    USE tempdb;

    CREATE TABLE dbo.test

    (

    name VARCHAR(100),

    type_desc VARCHAR(100)

    );

    INSERT INTO dbo.test

    ( name, type_desc )

    VALUES ( 'xyz', 'user' );

    INSERT INTO dbo.test

    ( name, type_desc )

    VALUES ( 'abc', 'user' );

    CREATE CLUSTERED INDEX [cx] ON dbo.test (name);

    GO

    ------------------------------------------------------------------------------

    SELECT TOP 1

    name,

    type_desc

    FROM dbo.test

    WHERE type_desc = 'user';

    GO

    -- add nc index to change support for ROW_NUMBER

    CREATE NONCLUSTERED INDEX [ix] ON dbo.test (type_desc, name desc);

    -- same query as above

    SELECT TOP 1

    name,

    type_desc

    FROM dbo.test

    WHERE type_desc = 'user';

    DROP TABLE dbo.test;

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (3/1/2013)


    @Gail, In thinking about this a little more, I think the example I gave is just a roundabout way of asking for TOP 1 without an ORDER BY, which I say is nonsensical.

    Or a TOP 1 (without specifying WITH TIES) where the column you're ordering by has duplicate values

    USE tempdb;

    CREATE TABLE dbo.test

    (

    name VARCHAR(100),

    type_desc VARCHAR(100)

    );

    INSERT INTO dbo.test

    ( name, type_desc)

    VALUES ( 'xyz', 'user');

    INSERT INTO dbo.test

    ( name, type_desc)

    VALUES ( 'abc', 'user');

    CREATE CLUSTERED INDEX [cx] ON dbo.test (name);

    GO

    ------------------------------------------------------------------------------

    SELECT TOP 1

    name,

    type_desc

    FROM dbo.test

    WHERE type_desc = 'user'

    ORDER BY type_desc;

    GO

    -- add nc index to change support for filter and order

    CREATE NONCLUSTERED INDEX [ix] ON dbo.test (type_desc, name desc);

    -- same query as above

    SELECT TOP 1

    name,

    type_desc

    FROM dbo.test

    WHERE type_desc = 'user'

    ORDER BY type_desc;

    DROP TABLE dbo.test;

    Moral of the story, be careful when ordering by a non-unique column and then restricting the rows based on that ordering (via row_number, rank, etc or top)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (3/1/2013)


    Or a TOP 1 (without specifying WITH TIES) where the column you're ordering by has duplicate values

    Nice. That is a tricky example that might slip by even a good static code analysis rule because it depends on the uniqueness of the column.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I think i found the root cause here. The query which we were using brings top 15 records in asc order based on a column which has NULL or 0 0r 1. This was also one of the main columns in the index which did allow nulls, however if i change the index and not include this column then we the null column data would not be in the top 15.

    Bottom line is that it was data issue, however i would still like to know from other folks that should i include my column which allows NULL values in my index or it just depends on business reason?

    Thanks for your help.

  • sqldba_newbie (3/4/2013)


    I think i found the root cause here. The query which we were using brings top 15 records in asc order based on a column which has NULL or 0 0r 1.

    Right, so ordering by a non-unique column and limiting the results. That's the case I showed to opc. Because that column's not unique, when you order by just that you have no guarantee of what order the rows will be within the 'null, 0 or 1' groupings, hence the index changes the assumed order (because it was never guaranteed) and appears to change the results.

    The solution here would be to order by a second column that does define the 'correct' order of rows as you want to see them. Removing the column from the index is hiding the symptom, nothing more. There are lots of other things that can and will cause the order of rows to differ from what you assume they should be.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply