Question related to indexes?

  • I have a SQL statement, which has around 9 columns and 4 columns in a where clause.

    Question regarding index.

    Should I create a composite index (include all the columns) or have 4 different indexes on 4 different columns in a same table. I just need to know the difference, is it good to have 4 indexes or bad?

    SELECT top 1001 T657 . C1 , C813000363 , C900000501 , C900000502 , C900000503 , C900000504 , C814000425 , C900000407 , C7

    FROM T657

    WHERE

    ( ( T657 . C1000020534 = @0 ) AND

    ( T657 . C1000020525 = @1 ) AND

    ( T657 . C814000429 ! = @2 ) AND

    ( T657 . C1000020523 = @3 ) )

    ORDER BY 1 asc

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

  • to satisfy that specific query, i think an index on the four columns involved int he WHERE clause is needed, and INCLUDE all the columns in the SELECT, right?

    CREATE INDEX IX_T657_Example

    ON T657(C1000020534,C1000020525,C814000429,C1000020523)

    INCLUDE(C1,

    C813000363,

    C900000501,

    C900000502,

    C900000503,

    C900000504,

    C814000425,

    C900000407,

    C7)

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (2/4/2015)


    to satisfy that specific query, i think an index on the four columns involved int he WHERE clause is needed, and INCLUDE all the columns in the SELECT, right?

    CREATE INDEX IX_T657_Example

    ON T657(C1000020534,C1000020525,C814000429,C1000020523)

    INCLUDE(C1,

    C813000363,

    C900000501,

    C900000502,

    C900000503,

    C900000504,

    C814000425,

    C900000407,

    C7)

    That's what I was thinking as well, but why is this better versus having four different indexes? I am sure it's a silly question, but need some clarification, that's all.

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

  • All non-clustered indexes are a duplication of data and, along with that realization come concerns about disk space, backup times, and restore times especially where wide covering indexes with INCLUDES are concerned.

    My question here is, is there an performance problem associated with the idea that you need an index here and, if so, how MUCH of a performance problem is there? It may be than no INCLUDES would be the way to go and, owing to the inequality for @2, would suggest that at best it should only be included as a part of an INCLUDE rather than as a part of the index columns.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • New Born DBA (2/4/2015)


    Lowell (2/4/2015)


    to satisfy that specific query, i think an index on the four columns involved int he WHERE clause is needed, and INCLUDE all the columns in the SELECT, right?

    CREATE INDEX IX_T657_Example

    ON T657(C1000020534,C1000020525,C814000429,C1000020523)

    INCLUDE(C1,

    C813000363,

    C900000501,

    C900000502,

    C900000503,

    C900000504,

    C814000425,

    C900000407,

    C7)

    That's what I was thinking as well, but why is this better versus having four different indexes? I am sure it's a silly question, but need some clarification, that's all.

    Mostly because there's no guarantee that the 4 indexes will be used and because, if they are used, there will be 4 B-Trees that need to be gone through which can easily more than quadruple the amount of memory and memory I/O used for queries.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This particular SQL Statement runs around 200,000 times a day, goes through millions of logical reads and when I look at the execution plan, it shows that creating a non-clustered index on the table will improve the performance by 98%.

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

  • New Born DBA (2/4/2015)


    Lowell (2/4/2015)


    to satisfy that specific query, i think an index on the four columns involved int he WHERE clause is needed, and INCLUDE all the columns in the SELECT, right?

    CREATE INDEX IX_T657_Example

    ON T657(C1000020534,C1000020525,C814000429,C1000020523)

    INCLUDE(C1,

    C813000363,

    C900000501,

    C900000502,

    C900000503,

    C900000504,

    C814000425,

    C900000407,

    C7)

    That's what I was thinking as well, but why is this better versus having four different indexes? I am sure it's a silly question, but need some clarification, that's all.

    For the four indexes to be used the optimizer has to recognize that it can do joins between the indexes based on the PK. It does happen, but it's rare. You're better off with an index that satisfies all the columns because it will be much more selective and easier for the optimizer to spot.

    One point worth noting, you may need to experiment a little with the leading edge of the index, the first column. This is the only column that goes into the histogram of the statistics. That's a major driver on choice of the index by the optimizer (as well as decisions within the query plan itself). Usually the most selective column is the best, but not always. Testing and experimentation will be your friends here.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • FYI--When you run this SQL Statement, there are 207,716 records. 1 more thing I also have to mention is that column C1000020534 has only 1 value "Case" and C1000020523 has '0' value.

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

  • New Born DBA (2/4/2015)


    FYI--When you run this SQL Statement, there are 207,716 records. 1 more thing I also have to mention is that column C1000020534 has only 1 value "Case" and C1000020523 has '0' value.

    That's a big factor. That would leave only:

    C1000020525

    for the lead column of the clustered index.

    Edit: And only column.

    You could add:

    C814000429

    after it, but most of the time you wouldn't gain anything from that.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • New Born DBA (2/4/2015)


    FYI--When you run this SQL Statement, there are 207,716 records. 1 more thing I also have to mention is that column C1000020534 has only 1 value "Case" and C1000020523 has '0' value.

    If both columns have a single value for each one, why do you include them in your WHERE clause?

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • ScottPletcher (2/4/2015)


    New Born DBA (2/4/2015)


    FYI--When you run this SQL Statement, there are 207,716 records. 1 more thing I also have to mention is that column C1000020534 has only 1 value "Case" and C1000020523 has '0' value.

    That's a big factor. That would leave only:

    C1000020525

    for the lead column of the clustered index.

    Edit: And only column.

    You could add:

    C814000429

    after it, but most of the time you wouldn't gain anything from that.

    So this should help?

    CREATE INDEX IX_T657

    ON T657(C1000020525,C814000429,)

    INCLUDE(C1,

    C813000363,

    C900000501,

    C900000502,

    C900000503,

    C900000504,

    C814000425,

    C900000407,

    C7)

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

  • The key thing for performance is the clustered index. Yes, you can instead create gazillions of nonclustered covering indexes, but that actually doesn't help overall performance much, and could even hurt it, possibly dramatically.

    I have no idea of whether C1000020525 is generally sequential or not. That is certainly best for a clustered index. But if you always report based on it, that could be the best clustered index regardless. We really should review the missing index and index usage stats at least to fill in more details about best indexing.

    CREATE CLUSTERED INDEX [T657_INDEX_CL]

    ON dbo.T657 ( C1000020525, C814000429 )

    WITH ( FILLFACTOR = 95 ) --set proper fillfactor as needed

    ON [PRIMARY] --set filegroup name on which the table is to be created

    Edit: Added keyword "INDEX".

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (2/4/2015)


    The key thing for performance is the clustered index. Yes, you can instead create gazillions of nonclustered covering indexes, but that actually doesn't help overall performance much, and could even hurt it, possibly dramatically.

    I have no idea of whether C1000020525 is generally sequential or not. That is certainly best for a clustered index. But if you always report based on it, that could be the best clustered index regardless. We really should review the missing index and index usage stats at least to fill in more details about best indexing.

    CREATE CLUSTERED INDEX [T657_INDEX_CL]

    ON dbo.T657 ( C1000020525, C814000429 )

    WITH ( FILLFACTOR = 95 ) --set proper fillfactor as needed

    ON [PRIMARY] --set filegroup name on which the table is to be created

    Edit: Added keyword "INDEX".

    We already have a clustered index on the table.

    It's not sequential. Look at the attachment plz

    1 more thing.

    This is the view, so should I create an index on the view or on the underlying table?

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

  • New Born DBA (2/4/2015)


    ScottPletcher (2/4/2015)


    The key thing for performance is the clustered index. Yes, you can instead create gazillions of nonclustered covering indexes, but that actually doesn't help overall performance much, and could even hurt it, possibly dramatically.

    I have no idea of whether C1000020525 is generally sequential or not. That is certainly best for a clustered index. But if you always report based on it, that could be the best clustered index regardless. We really should review the missing index and index usage stats at least to fill in more details about best indexing.

    CREATE CLUSTERED INDEX [T657_INDEX_CL]

    ON dbo.T657 ( C1000020525, C814000429 )

    WITH ( FILLFACTOR = 95 ) --set proper fillfactor as needed

    ON [PRIMARY] --set filegroup name on which the table is to be created

    Edit: Added keyword "INDEX".

    We already have a clustered index on the table.

    It's not sequential. Look at the attachment plz

    1 more thing.

    This is the view, so should I create an index on the view or on the underlying table?

    It looks the C1000020525 is mostly single values. In that case, create a nonclus index with the C1000020525 column as the key and include the <> column. Let SQL go back to the main table to lookup the rest, as those values are always the same anyway.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • New Born DBA (2/4/2015)


    This particular SQL Statement runs around 200,000 times a day, goes through millions of logical reads and when I look at the execution plan, it shows that creating a non-clustered index on the table will improve the performance by 98%.

    Although that's sometimes a good place to start looking and is sometimes the answer, it's not always true. Caveat Emptor for such automatic recommendations especially if the lead column doesn't have much selectivity.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply