October 20, 2012 at 3:24 pm
Lee Crain (10/20/2012)
Gail, if the original poster creates an index on just the column with the 4 values, will this not force the Optimizer to use that index?
"SELECT <column names> FROM <table name> WITH (INDEX(<index name>))"
I've done it in production and it's prevented a clustered index scan.
I hope it won't, because 12.5 million lookups in an order unrelated to the clustered index are probably going to cost more than a clustered index scan of 50 million records would.
If the optimiser is working properly, and the column being indexed really does have only 4 distinct values over the 50 million rows, the only way to make it use that index should be to eliminate the lookups by making it a covering index.
Tom
October 20, 2012 at 3:26 pm
Lee Crain (10/20/2012)
Gail, if the original poster creates an index on just the column with the 4 values, will this not force the Optimizer to use that index?
"SELECT <column names> FROM <table name> WITH (INDEX(<index name>))"
I've done it in production and it's prevented a clustered index scan.
Yes, that will force a seek and tonnes and tonnes of key lookups. It'll be horridly inefficient, do millions of reads and take forever, but it'll prevent a (more optimal) clustered index scan.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 20, 2012 at 8:39 pm
L' Eomot Inversé, Gail,
I understand what you're saying but please follow me on this.
Maybe there are only a few records with one of the values in them so there might be some efficiency associated with having a single index for this one column.
But let's look at the worst case. Let's say that:
> the data in the column is evenly distributed. There are exactly 12,500,000 rows for each of the 4 values.
> the non-clustered index for the column is a 4 byte INT.
> the only search criterion is the data in the INT column index so a covering index is not a viable search option.
> each row is 1000 bytes long.
> the index statistics histogram is useless. The entire non-clustered index will need to be read into RAM to locate the correct 12,500,000 rows.
Which would be preferable?
1. Have the server read into RAM: 50,000,000 rows x 4 bytes = 200,000,000 bytes, plus a read for every row with the correct value: 12,500,000 x 1000 bytes = 12,500,000,000 bytes, the sum of the two being: 200,000,000 bytes + 12,500,000,000 bytes = 12,700,000,000 bytes?
or,
2. Have the server read into RAM: 50,000,000 rows x 1000 bytes = 50,000,000,000 bytes?
This scenario assumes that the entire non-clustered index would need to be read, which I doubt because the index statistics histogram is likely to have at least some useful data distribution information in it.
Is my math wrong or have I missed something? Honest question.
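For concreteness, the two row-based totals above work out as follows (a plain arithmetic check, expressed in Python only so it can be run; all figures are the assumptions stated above):

```python
# Row-based totals from the worst-case scenario above:
# 50M rows, 4-byte indexed column, 1000-byte rows, 4 evenly distributed values.
ROWS = 50_000_000
ROW_BYTES = 1000
KEY_BYTES = 4
MATCHES = ROWS // 4  # 12,500,000 rows per value

# Option 1: read the whole NC index, then one row-read per matching row
option1 = ROWS * KEY_BYTES + MATCHES * ROW_BYTES

# Option 2: read every row via a clustered index scan
option2 = ROWS * ROW_BYTES

print(option1)  # 12700000000
print(option2)  # 50000000000
```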
Edit: Is the issue that individual record seeks would be more inefficient than a full index scan, likely a serial read operation?
Second Edit: I had to force this once when writing a big report query that had to execute in an OLTP environment. The situation was not identical to the original poster's but similar enough that I thought I'd contribute how I did it.
I had a 700,000,000+ row table (very wide rows) that was part of an INNER JOIN on its ID column in a SELECT statement. The clustered index was on the ID column, an INT IDENTITY. There was also a non-clustered index on just the ID column.
The report query needed to produce a result of several thousand rows. I wrote it, looked at the estimated execution plan, then ran it. Page Life Expectancy on the server soon went to almost zero. I stopped the query and examined the actual execution plan. The plan required a full index scan on the large table's clustered index, 700,000,000+ rows. It completely ignored the non-clustered index on the ID column.
After some experimentation, I forced the query to use the non-clustered index on the ID column. The query executed in a few seconds with no harm to the OLTP environment.
The cardinality of the ID column was excellent: unique. This is not the situation the original poster faces but it is similar enough that I thought the concept might be useful, at least worthy of some experimentation.
October 21, 2012 at 4:24 am
Lee Crain (10/20/2012)
But let's look at the worst case. Let's say that:
> the data in the column is evenly distributed. There are exactly 12,500,000 rows for each of the 4 values.
> the non-clustered index for the column is a 4 byte INT.
> the only search criterion is the data in the INT column index so a covering index is not a viable search option.
> each row is 1000 bytes long.
> the index statistics histogram is useless. The entire non-clustered index will need to be read into RAM to locate the correct 12,500,000 rows.
Which would be preferable?
1. Have the server read into RAM: 50,000,000 rows x 4 bytes = 200,000,000 bytes plus a read for every row with the correct value: 12,500,000 x 1000 bytes = 12,500,000,000 bytes, the sum of the two being: 200,000,000 bytes + 12,500,000,000 bytes = 12,700,000,000 bytes?
or,
2. Have the server read into RAM: 50,000,000 rows x 1000 bytes = 50,000,000,000 bytes?
Except that SQL does not read rows, it reads pages, and you've ignored the way key lookups work: they cannot just magically read the correct value.
So, let's see how that plays out, using the same assumptions you made: rows 1000 bytes wide, a nonclustered index on an int, and an int clustered index key as well.
Roughly 6 250 000 pages at the leaf of the clustered index (50 million rows at 1000 bytes each, 8 rows per page)
3125 pages in the intermediate level
2 pages in the next intermediate level
1 page in the root
So a 4-level-deep clustered index (in reality there might be more pages; I left out things like the null bitmaps, row headers, etc.)
For the nonclustered index, the rows are 8 bytes wide (ignoring row headers and other stuff), so 50000 pages at the leaf level
50 pages at the intermediate level
1 page at the root
So a 3-level-deep nonclustered index.
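Those level counts can be sanity-checked with a quick script (Python purely as a calculator; the 2000- and 1000-entries-per-page fan-outs are inferred from the figures above, not measured):

```python
# Page count per B-tree level, leaf first, given the leaf page count
# and an assumed fan-out (entries per non-leaf page).
import math

def btree_levels(leaf_pages: int, fanout: int) -> list[int]:
    """Return pages per level, from leaf up to a single root page."""
    levels = [leaf_pages]
    while levels[-1] > 1:
        levels.append(math.ceil(levels[-1] / fanout))
    return levels

# Clustered index: 50M rows / 8 rows per page; ~2000 entries per non-leaf page
clustered = btree_levels(50_000_000 // 8, fanout=2000)
# Nonclustered index: 8-byte rows, ~1000 per page at every level
nonclustered = btree_levels(50_000_000 // 1000, fanout=1000)

print(clustered)     # [6250000, 3125, 2, 1] -> 4 levels
print(nonclustered)  # [50000, 50, 1]        -> 3 levels
```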
A clustered index scan will read 6 253 128 pages (SQL will use the intermediate levels to drive the read-ahead process, doing it in the fewest IO operations possible). So: physical reads 6 253 128, logical reads 6 253 128.
Now, let's try the case where we're retrieving 12500000 rows from a non-covering non-clustered index, because someone forced the usage of the index.
First, the scan of the NC index
50051 reads (physical and logical)
Now SQL needs to execute 12500000 key lookups. That means navigating the clustered index tree 12500000 times. The clustered index tree is 4 levels deep, so each and every one of those key lookups will incur 4 logical reads. The first time a page is read, it's read physically as well.
Hence 50 000 000 logical reads need to be done against the clustered index. Depending on how the values are distributed in the cluster, we're doing anything from 1 563 283 physical reads (assuming we only read 1/4 of the cluster because the rows we want are contiguous) to 6 253 128 (if we end up reading all the pages).
In summary:
Clustered index scan: physical reads 6 253 128, logical reads 6 253 128.
Nonclustered index seek + lookups: anything between 1 613 334 and 6 303 179 physical reads, and 50 050 051 logical reads (assuming a buffer pool large enough to hold all of that, so that pages are physically read at most once).
I'll take the clustered index scan any day. If you want, I'll put a test together and show you just how badly the one with the forced index will perform.
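Tallying those read counts is plain arithmetic, using the page counts derived above (sketched in Python just so the totals can be checked):

```python
# Page counts from the derivation above.
CI_PAGES = 6_253_128   # whole clustered index (4 levels)
NC_PAGES = 50_051      # whole nonclustered index (3 levels)
LOOKUPS = 12_500_000   # rows to retrieve for one of the 4 values
CI_DEPTH = 4           # levels traversed per key lookup

# Plan 1: clustered index scan reads every page once.
scan_logical = scan_physical = CI_PAGES

# Plan 2: forced NC seek, then one 4-level tree traversal per lookup.
seek_logical = NC_PAGES + LOOKUPS * CI_DEPTH   # 50,050,051
# Physical reads: between reading a quarter of the cluster (contiguous
# rows) and reading every page of it (scattered rows).
seek_physical_best = NC_PAGES + CI_PAGES // 4  # ~1,613,334 (within a page)
seek_physical_worst = NC_PAGES + CI_PAGES      # 6,303,179
```

Even in the best physical-IO case, the lookup plan does eight times the logical reads of the scan.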
Index hints are for when you know exactly why the optimiser hasn't chosen the index you think is optimal, you know absolutely that the index you want is and will always be optimal, and you have absolutely no other way of encouraging the optimiser to use it (which we do: covering indexes, updated statistics and a whole bunch of other tools).
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 21, 2012 at 4:31 am
Lee Crain (10/20/2012)
But let's look at the worst case. Let's say that:
> the data in the column is evenly distributed. There are exactly 12,500,000 rows for each of the 4 values.
> the non-clustered index for the column is a 4 byte INT.
> the only search criterion is the data in the INT column index so a covering index is not a viable search option.
> each row is 1000 bytes long.
> the index statistics histogram is useless. The entire non-clustered index will need to be read into RAM to locate the correct 12,500,000 rows.
Which would be preferable?
1. Have the server read into RAM: 50,000,000 rows x 4 bytes = 200,000,000 bytes, plus a read for every row with the correct value: 12,500,000 x 1000 bytes = 12,500,000,000 bytes, the sum of the two being: 200,000,000 bytes + 12,500,000,000 bytes = 12,700,000,000 bytes?
or,
2. Have the server read into RAM: 50,000,000 rows x 1000 bytes = 50,000,000,000 bytes?
This scenario assumes that the entire non-clustered index would need to be read, which I doubt because the index statistics histogram is likely to have at least some useful data distribution information in it.
Is my math wrong or have I missed something? Honest question.
You've missed something. The server doesn't read single rows into RAM, it reads pages. With 8 rows per page, and 25% of rows hit with a random distribution through the table, the chance of a page being missed entirely is about one in ten (3/4 to the power 8), so the expected number of pages missed, out of 6,250,000, is about 625,000. The pages being brought into RAM therefore contain roughly 45,000,000 rows, not the 12,500,000 you assume. In addition, unless the buffer pool is bigger than 50 GB (it may be, we don't know, but it seems unlikely) each page will be brought in not once but several times, because pages will have been displaced by other pages being brought in. Also, the cluster key is probably a lot less than 1000 bytes; let's guess at 100 bytes. Now the probability of a non-leaf page of the clustered index not being pulled in by one of those lookups is 0.75 to the power 80, i.e. 0 for practical purposes, so we know that the lookups bring in every non-leaf page of the clustered index as well as about nine-tenths of the leaf pages, and these too risk being brought in more than once. And the non-clustered index is far bigger than your 4-bytes-per-entry estimate: each leaf entry also carries the 100-byte cluster key, so there are a few more than 101 bytes per entry, giving roughly 625,000 leaf pages, of which we will read about 156,250, as well as higher-level pages. So we can conclude that using the nonclustered index saves little physical IO over the clustered index scan at best, uses more RAM, and unless there is a vast quantity of RAM available it requires several times as much physical IO.
So the clustered index scan is better on physical IO than the multiple seeks, uses no more RAM, and it's 100% better on lookups (12,500,000 key lookups saved by the clustered index scan).
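The one-in-ten page-miss figure checks out numerically (a quick sketch; 8 rows per leaf page and an independent 1-in-4 hit per row are the assumptions above):

```python
# A leaf page is missed only if all 8 of its rows miss the 1-in-4 filter.
p_miss = 0.75 ** 8            # probability a given leaf page is untouched
leaf_pages = 6_250_000
expected_missed = leaf_pages * p_miss

print(round(p_miss, 4))       # 0.1001 -> about one in ten
print(round(expected_missed)) # 625706 leaf pages expected to be skipped
```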
Second Edit: I had to force this once when writing a big report query that had to execute in an OLTP environment. The situation was not identical to the original poster's but similar enough that I thought I'd contribute how I did it.
I had a 700,000,000+ row table (very wide rows) that was part of an INNER JOIN on its ID column in a SELECT statement. The clustered index was on the ID column, an INT IDENTITY. There was also a non-clustered index on just the ID column.
The report query needed to produce a result of several thousand rows. I wrote it, looked at the estimated execution plan, then ran it. Page Life Expectancy on the server soon went to almost zero. I stopped the query and examined the actual execution plan. The plan required a full index scan on the large table's clustered index, 700,000,000+ rows. It completely ignored the non-clustered index on the ID column.
After some experimentation, I forced the query to use the non-clustered index on the ID column. The query executed in a few seconds with no harm to the OLTP environment.
The cardinality of the ID column was excellent: unique. This is not the situation the original poster faces but it is similar enough that I thought the concept might be useful, at least worthy of some experimentation.
The reason this was useful instead of a catastrophic mistake was that you were selecting only 1 row in about 100,000 rather than 1 row in 4, so you could guarantee to miss a significant number of pages instead of a trivial number. With selectivity as low as 1 in 4 it could have been nasty.
Did you try using a FORCESEEK table hint specifying the clustered index, which you say was on the same column as the non-clustered index? Wouldn't that have given just as good a result? What is the point of a non-clustered index in this case?
edit: I see Gail posted read counts while I was typing, with a logical read count included to hammer the nail yet more firmly in.
Tom
October 21, 2012 at 5:04 am
L' Eomot Inversé (10/21/2012)
The reason this was useful instead of a catastrophic mistake was that you were selecting only 1 row in about 100,000 rather than 1 row in 4, so you could guarantee to miss a significant number of pages instead of a trivial number. With selectivity as low as 1 in 4 it could have been nasty.
Did you try using a FORCESEEK table hint specifying the clustered index, which you say was on the same column as the non-clustered index? Wouldn't that have given just as good a result? What is the point of a non-clustered index in this case?
The predicate was probably non-SARGable, otherwise SQL would have just done a seek on the cluster. Forcing the NC index (a redundant NC index) changed that to a scan of the NC plus lots of lookups. Less efficient than fixing the predicate to be SARGable would have been.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 21, 2012 at 5:54 am
GilaMonster (10/21/2012)
L' Eomot Inversé (10/21/2012)
The reason this was useful instead of a catastrophic mistake was that you were selecting only 1 row in about 100,000 rather than 1 row in 4, so you could guarantee to miss a significant number of pages instead of a trivial number. With selectivity as low as 1 in 4 it could have been nasty.
Did you try using a FORCESEEK table hint specifying the clustered index, which you say was on the same column as the non-clustered index? Wouldn't that have given just as good a result? What is the point of a non-clustered index in this case?
The predicate was probably non-SARGable, otherwise SQL would have just done a seek on the cluster. Forcing the NC index (a redundant NC index) changed that to a scan of the NC plus lots of lookups. Less efficient than fixing the predicate to be SARGable would have been.
Yes, that makes sense.
I find it difficult to envisage writing a non-SARGable predicate on an identity column (surely only appalling data design could lead to the need for one), but over the years I have become quite used to people doing things I can't envisage myself doing.
Tom
October 21, 2012 at 11:33 am
L' Eomot Inversé and Gail,
Thank you both for your detailed responses. They've been very educational.
________
L' Eomot Inversé,
In answer to your question:
"Did you try using a FORCESEEK table hint specifying the clustered index, which you say was on the same column as the non-clustered index? Wouldn't that have given just as good a result? What is the point of a non-clustered index in this case?"
The point of using the non-clustered index was to perform an efficient INNER JOIN using just the ID column, only 4 bytes wide, instead of reading entire records from the clustered index using the ID column for the INNER JOIN.
No, I didn't implement a FORCESEEK. Would it have produced as good a result? I don't know; I didn't try it. The results I obtained were excellent. A complex query involving several joins on some very large tables executed in under 10 seconds without impacting the production OLTP environment.
_________
Gail, to my knowledge, the predicate was SARGable:
....FROM table1 as t1
INNER JOIN table2 as t2 ON t1.t2ID = t2.ID...
This was the first time I had encountered this problem and never did understand why this particular report query ignored the non-clustered index on the ID column, since it was used by many stored procedures for the same type of INNER JOIN.
_________
Regarding the database design, it had many flaws. It was something I inherited and had to work around.
October 21, 2012 at 11:52 am
Several valid reasons, the root one being that the nonclustered index wasn't covering and therefore wasn't efficient. Probably a hash or merge join; those don't use indexes. (By the way, that predicate is technically not considered SARGable because it's a column compared to a column.)
In short, you got away with it that time, but there's a fair chance that at some future point that index hint will become a problem and hopefully will get removed.
Just to re-emphasise the point (from your reply I suspect you missed it), SQL does not read records from an index, it reads entire pages. Only once the page is in memory is it cracked and the requested rows retrieved from it.
For what it's worth, in 7 years of doing SQL performance tuning I have never had to use an index hint and only had to use a join hint once (and I know exactly why I had to use that join hint)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 21, 2012 at 2:13 pm
Gail, thanks again for your responses.
In my 12 years of working with SQL Server, both as a software engineer and as a DBA, I have never before or since had to use a table or query hint to make a query execute efficiently.