March 1, 2013 at 10:44 am
opc.three (3/1/2013)
GilaMonster (3/1/2013)
opc.three (3/1/2013)
My comment was that the yield from the FROM, WHERE, GROUP BY and HAVING clauses is not altered based on which index the engine picks.I can probably produce you an example where it is (row number over a non-unique column and filter on that row number comes to mind)
I am not picturing it. Wouldn't that require a derived table, in which case the ORDER BY in the ranking function would guarantee the correct result?
Was thinking something like this:
SELECT name,
row_num
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY name ),
name
FROM sys.tables
) tbls ( row_num, name )
WHERE row_num = 1;
name is unique.
When you do a row number over a column that is not unique, the order in which those row numbers are assigned to 'duplicate' values is not guaranteed, change the index that SQL's using and you can change the order that the row numbers are assigned.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 1, 2013 at 10:47 am
I realized that nuance after I posted.
Looking to mock-up an example...
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
March 1, 2013 at 11:18 am
Found one. I was never under the impression that adding or changing an index was completely harmless for various other reasons but not for this particular one. I am still faulting over the fact that it is something in a SELECT-column-list that is altering the outcome, although it is being incorporated into a resultset so all bets are off. This has been added to my list of things to consider when tuning, reviewing and writing code.
Setup:
USE tempdb;
CREATE TABLE dbo.test
(
name VARCHAR(100),
type_desc VARCHAR(100)
);
INSERT INTO dbo.test
( name, type_desc )
VALUES ( 'xyz', 'user' );
INSERT INTO dbo.test
( name, type_desc )
VALUES ( 'abc', 'user' );
CREATE CLUSTERED INDEX [cx] ON dbo.test (name);
Query 1:
SELECT name,
row_num
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY type_desc ),
name,
type_desc
FROM dbo.test
) tbls ( row_num, name, type_desc )
WHERE row_num = 1;
Add another index:
-- add nc index to change support for ROW_NUMBER
CREATE NONCLUSTERED INDEX [ix] ON dbo.test (type_desc DESC);
Query again, same as above but different result:
SELECT name,
row_num
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY type_desc ),
name,
type_desc
FROM dbo.test
) tbls ( row_num, name, type_desc )
WHERE row_num = 1;
Unsetup:
DROP TABLE dbo.test;
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
March 1, 2013 at 11:31 am
@Gail, In thinking about this a little more, I think the example I gave is just a roundabout way of asking for TOP 1 without an ORDER BY, which I say is nonsensical.
USE tempdb;
CREATE TABLE dbo.test
(
name VARCHAR(100),
type_desc VARCHAR(100)
);
INSERT INTO dbo.test
( name, type_desc )
VALUES ( 'xyz', 'user' );
INSERT INTO dbo.test
( name, type_desc )
VALUES ( 'abc', 'user' );
CREATE CLUSTERED INDEX [cx] ON dbo.test (name);
GO
------------------------------------------------------------------------------
SELECT TOP 1
name,
type_desc
FROM dbo.test
WHERE type_desc = 'user';
GO
-- add nc index to change support for ROW_NUMBER
CREATE NONCLUSTERED INDEX [ix] ON dbo.test (type_desc, name desc);
-- same query as above
SELECT TOP 1
name,
type_desc
FROM dbo.test
WHERE type_desc = 'user';
DROP TABLE dbo.test;
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
March 1, 2013 at 11:48 am
opc.three (3/1/2013)
@Gail, In thinking about this a little more, I think the example I gave is just a roundabout way of asking for TOP 1 without an ORDER BY, which I say is nonsensical.
Or a TOP 1 (without specifying WITH TIES) where the column you're ordering by has duplicate values
USE tempdb;
CREATE TABLE dbo.test
(
name VARCHAR(100),
type_desc VARCHAR(100)
);
INSERT INTO dbo.test
( name, type_desc)
VALUES ( 'xyz', 'user');
INSERT INTO dbo.test
( name, type_desc)
VALUES ( 'abc', 'user');
CREATE CLUSTERED INDEX [cx] ON dbo.test (name);
GO
------------------------------------------------------------------------------
SELECT TOP 1
name,
type_desc
FROM dbo.test
WHERE type_desc = 'user'
ORDER BY type_desc;
GO
-- add nc index to change support for filter and order
CREATE NONCLUSTERED INDEX [ix] ON dbo.test (type_desc, name desc);
-- same query as above
SELECT TOP 1
name,
type_desc
FROM dbo.test
WHERE type_desc = 'user'
ORDER BY type_desc;
DROP TABLE dbo.test;
Moral of the story, be careful when ordering by a non-unique column and then restricting the rows based on that ordering (via row_number, rank, etc or top)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 1, 2013 at 11:54 am
GilaMonster (3/1/2013)
Or a TOP 1 (without specifying WITH TIES) where the column you're ordering by has duplicate values
Nice. That is a tricky example that might slip by even a good static code analysis rule because it depends on the uniqueness of the column.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
March 4, 2013 at 9:00 am
I think i found the root cause here. The query which we were using brings top 15 records in asc order based on a column which has NULL or 0 0r 1. This was also one of the main columns in the index which did allow nulls, however if i change the index and not include this column then we the null column data would not be in the top 15.
Bottom line is that it was data issue, however i would still like to know from other folks that should i include my column which allows NULL values in my index or it just depends on business reason?
Thanks for your help.
March 4, 2013 at 9:24 am
sqldba_newbie (3/4/2013)
I think i found the root cause here. The query which we were using brings top 15 records in asc order based on a column which has NULL or 0 0r 1.
Right, so ordering by a non-unique column and limiting the results. That's the case I showed to opc. Because that column's not unique, when you order by just that you have no guarantee of what order the rows will be within the 'null, 0 or 1' groupings, hence the index changes the assumed order (because it was never guaranteed) and appears to change the results.
The solution here would be to order by a second column that does define the 'correct' order of rows as you want to see them. Removing the column from the index is hiding the symptom, nothing more. There are lots of other things that can and will cause the order of rows to differ from what you assume they should be.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 8 posts - 16 through 22 (of 22 total)
You must be logged in to reply to this topic. Login to reply