Index Question -Why the optimizer use ..

  • Ola!

    First i want excuse for my bad english but i hope you will understand me.

    My Problem: I created a huge database with 2 tables.

    Table1 (20.000.000 rows):

    id [PK] (clustered index)

    col1 (non clustered index)

    col2

    Table2 (2.000 rows)

    id [PK] (clustered Index)

    col1

    Now i create the Statement:

    Select a.col1, a.col2, b.col1 FROM Table1 a inner join Table2 b on a.col1 = b.id

    This Statement use the correct index in Table2 but in Table1 it use the clustered index on id-column.

    My Questions:

    Why the optimizers chooses the clustered index ob Table1 and not the index on col1?

    How can i change this manuelly?

    Any other comments?

    Thnx for your help,

    teLLy

  • Because the index doesn't include all the columns being used, so going with the clustered index (which has all the columns) will be faster and easier for the server.

    You can force index use. Check out query hints in Books Online for how to do so. It is almost always a bad idea to do so, but it is possible. In almost 9 years of database development and tuning, I've used an index hint once, because the rest of the time it would have made things worse.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • In addition to what GSquared stated, how many rows out of the 20,000,000 rows have a value in col2?

    If most of those are populated, then SQL Server is going to perform a clustered index scan on that table because it has to read every row.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Ola!

    Tnx for your (very fast) support!

    I try all day and find for me new and interesting things!

    Within the attached picture (sql.JPG) you can see the complete scenario!

    MainTable = 20.000.000 rows

    Arzt = 2.000 rows

    Diagnose = 15.000 rows

    Person = 1.000.000 rows

    All Information are uniformly distributed! (i.e for each row in Person exist nearly 20 rows in MainTable) !

    With these indexes i get good speed and the Sql server uses both (ie. arzt_key- and id-Index in MainTable).

    My Questions:

    1) If i add all columns of MainTable to the arzt_key-index on Maintable (as included column) , the speed is better, but of course the index need more space! Should i do this if i have enoung space on harddisk?

    2) Is this in general a good design, any advantage hints?

    3) Can i increse performance (read)?

    Tnx for you support!

    teLLy

  • prinztelly (2/25/2009)


    1) If i add all columns of MainTable to the arzt_key-index on Maintable (as included column) , the speed is better, but of course the index need more space! Should i do this if i have enoung space on harddisk?

    No. Creating covering indexes that cover everything is a complete waste of space and is going to degrade performance on inserts and updates. Create covering indexes only when you need to, when the non-covering is too slow, and when the query is important.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ok, tnx for this advise!

    I tried today with partitions, but i dont understand my Server. I post the result here (because here is the DB-Schemata Picture), if it contains to another forum please copy it.

    I use the wizard and creted so this script:

    --------------------

    USE [Schema6]

    GO

    BEGIN TRANSACTION

    CREATE PARTITION FUNCTION [InsertDatePartitionFunc](datetime) AS RANGE LEFT FOR VALUES (N'2007-01-01T00:00:00', N'2007-04-01T00:00:00', N'2007-07-01T00:00:00', N'2007-10-01T00:00:00', N'2008-01-01T00:00:00', N'2008-04-01T00:00:00', N'2008-07-01T00:00:00', N'2008-10-01T00:00:00', N'2009-01-01T00:00:00')

    CREATE PARTITION SCHEME [InsertDatePartitionScheme] AS PARTITION [InsertDatePartitionFunc] TO ([PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY])

    ALTER TABLE [dbo].[MainTable] DROP CONSTRAINT [PK_MainTable]

    ALTER TABLE [dbo].[MainTable] ADD CONSTRAINT [PK_MainTable] PRIMARY KEY NONCLUSTERED

    (

    [id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    CREATE CLUSTERED INDEX [ClusteredIndex_on_InsertDatePartitionScheme_633712506014843750] ON [dbo].[MainTable]

    (

    [Eintragsdatum]

    )WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [InsertDatePartitionScheme]([Eintragsdatum])

    DROP INDEX [ClusteredIndex_on_InsertDatePartitionScheme_633712506014843750] ON [dbo].[MainTable] WITH ( ONLINE = OFF )

    CREATE NONCLUSTERED INDEX [Arzt_key] ON [dbo].[MainTable]

    (

    [Arzt_key] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [InsertDatePartitionScheme]([Eintragsdatum])

    CREATE NONCLUSTERED INDEX [Diagnose_key] ON [dbo].[MainTable]

    (

    [Diagnose_key] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [InsertDatePartitionScheme]([Eintragsdatum])

    CREATE NONCLUSTERED INDEX [Diagnose_Zeit] ON [dbo].[MainTable]

    (

    [Diagnose_Zeitpunkt] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [InsertDatePartitionScheme]([Eintragsdatum])

    CREATE NONCLUSTERED INDEX [Einfüge_Index] ON [dbo].[MainTable]

    (

    [Eintragsdatum] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [InsertDatePartitionScheme]([Eintragsdatum])

    CREATE NONCLUSTERED INDEX [Person_key] ON [dbo].[MainTable]

    (

    [Person_key] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [InsertDatePartitionScheme]([Eintragsdatum])

    COMMIT TRANSACTION

    --------------------

    Non i have two questions:

    1) Why the Index ClusteredIndex_on_InsertDatePartitionScheme_633712506014843750 will created and in next line dropt?

    2) I test the performance (select all columns with a where condition for each column) and espacially the partition-column-test has a much worser result than without the partition. Why?

    Greetz

    teLLy

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply