December 21, 2016 at 9:37 am
I've seen articles that say you shouldn't make the leading column in your index a BIT column. There are, of course, other articles pointing out the eternal truth of SQL Server "It Depends".
However, Microsoft's Missing Index suggestion continually leads the indexes it suggests with BIT columns. I would like to push the BIT column further down in the index column order, but I need to be able to argue against what's seen as Microsoft suggestions.
For example, the column it suggests is Obsolete. On some tables it can be a 50/50 split between 1 & 0, on others it can be 90/10. On still others it is nullable. It suggests the Obsolete column first on all of these.
Any suggestions, philosophies, dictates, or musings are welcome.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
December 21, 2016 at 10:06 am
I don't see any problem with using a leading bit column for an index. I can think of many more examples where it would be appropriate as part of a non-clustered index. I sometimes work with data with an IsActive bit column and will create a covering index with IsActive as the first key for supporting queries where we only need active or inactive clients, products, etc.
-- Itzik Ben-Gan 2001
December 21, 2016 at 10:10 am
Going back a few employers I combined Current and Historical data into the same tables when downloading and importing data from our ISAM databases into SQL Server. Doing this I had to flag the data as Current (0) or Historical (1). Knowing that the ratio between Current and Historical was 10% Current and 90% Historical, I put the flag as the leading column in the index since 90% of my queries were against the 10% of the data. This limited what data had to be searched.
I would probably do it if the distribution was 50/50 and 90% of my queries were still against just Current data.
What does this mean? Just what you said, it depends. It isn't just the distribution of the data, it also depends on the distribution of the data accessed and how it is accessed.
December 21, 2016 at 10:46 am
If the flag is significant enough, make it the first column in the clustered index. For example, in a 90%-10% situation where you mostly process the 10%.
Be aware that SQL will (almost) always prefer equality matches, such as bit_column = 0|1, rather than range conditions, such as int_column >= value1 and int_column < value2, even if sometimes the latter condition is more selective. In those cases, you could cluster first on int_column, no matter what the index suggestion was. Remember: it is ultimately just a suggestion, and even MS themselves recommend thoroughly vetting the suggestions before adding/altering/dropping indexes.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 26, 2016 at 8:30 pm
Sioban Krzywicki (12/21/2016)
I've seen articles that say you shouldn't make the leading column in your index a BIT column. There are, of course, other articles pointing out the eternal truth of SQL Server "It Depends".However, Microsoft's Missing Index suggestion continually leads the indexes it suggests with BIT columns. I would like to push the BIT column further down in the index column order, but I need to be able to argue against what's seen as Microsoft suggestions.
For example, the column it suggests is Obsolete. On some tables it can be a 50/50 split between 1 & 0, on others it can be 90/10. On still others it is nullable. It suggests the Obsolete column first on all of these.
Any suggestions, philosophies, dictates, or musings are welcome.
Your looking at the guy that accidentally paralyzed Expedia.com for two minutes by adding an NCI that had a bit column for the leading column. The pages splits on that index during the INSERTs took performance and threw it on the flow.
I don't care what Microsoft or anyone else says, don't use bit columns or other very low cardinality columns as the leading column of any index. Use a filtered index instead. Just make sure that any SQL Agent jobs that run a script against the table also include SET QUOTED_IDENTIFIER ON or the jobs will fail with a message reminding you of that fact. It doesn't bother jobs that use stored procedures that were created in SSMS if that's the default for whatever SSMS connections are made.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 28, 2016 at 2:14 pm
Jeff Moden (12/26/2016)
Sioban Krzywicki (12/21/2016)
I've seen articles that say you shouldn't make the leading column in your index a BIT column. There are, of course, other articles pointing out the eternal truth of SQL Server "It Depends".However, Microsoft's Missing Index suggestion continually leads the indexes it suggests with BIT columns. I would like to push the BIT column further down in the index column order, but I need to be able to argue against what's seen as Microsoft suggestions.
For example, the column it suggests is Obsolete. On some tables it can be a 50/50 split between 1 & 0, on others it can be 90/10. On still others it is nullable. It suggests the Obsolete column first on all of these.
Any suggestions, philosophies, dictates, or musings are welcome.
Your looking at the guy that accidentally paralyzed Expedia.com for two minutes by adding an NCI that had a bit column for the leading column. The pages splits on that index during the INSERTs took performance and threw it on the flow.
I don't care what Microsoft or anyone else says, don't use bit columns or other very low cardinality columns as the leading column of any index. Use a filtered index instead. Just make sure that any SQL Agent jobs that run a script against the table also include SET QUOTED_IDENTIFIER ON or the jobs will fail with a message reminding you of that fact. It doesn't bother jobs that use stored procedures that were created in SSMS if that's the default for whatever SSMS connections are made.
I'd love to hear more about that. What data was in that bit column, why it lead to page splits. It feels like the full context might be an important consideration.
Also what the connection between filtered indexes, and ensuring quoted identifiers are on in SQL agent jobs. The connection is not immediately clear
December 29, 2016 at 7:03 am
Sioban Krzywicki (12/21/2016)
I've seen articles that say you shouldn't make the leading column in your index a BIT column. There are, of course, other articles pointing out the eternal truth of SQL Server "It Depends".However, Microsoft's Missing Index suggestion continually leads the indexes it suggests with BIT columns. I would like to push the BIT column further down in the index column order, but I need to be able to argue against what's seen as Microsoft suggestions.
For example, the column it suggests is Obsolete. On some tables it can be a 50/50 split between 1 & 0, on others it can be 90/10. On still others it is nullable. It suggests the Obsolete column first on all of these.
Any suggestions, philosophies, dictates, or musings are welcome.
The first thing to recognize is that the missing index suggestions you see in an execution plan or in the dynamic management views DO NOT specify an order for the columns:
https://technet.microsoft.com/en-us/library/ms345485(v=sql.105).aspx
What is the best thing to do is consider how the most frequently used queries against this table join to it or filter data in the WHERE clause. The columns that are most frequently used and are the most selective are the best candidates to be the first columns in the index.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply