Need some help with this query

  • New Born DBA (5/6/2015)


    I have this query which runs every night pulls 800,000+ records. I just wanted to know if there is a way to make it run better.

    SELECT Audit_Object_Type AS [Audit_Object_Type_Int] ,

    Field ,

    Field_Type AS [Field_Type_Int] ,

    From_Value ,

    To_Value ,

    Updated_By_Login ,

    Updated_By ,

    Parent_Hierarchy_Key ,

    DATEADD(S, Create_Date, '1970-01-01 00:00:00') AS [Create Date] ,

    Request_ID

    FROM COL_CORE_Audit_Data

    WHERE Modified_Date >= (DATEDIFF(s, '19700101', GETDATE())- 2678400);

    just for clarification and going back to basics......

    have this query which runs every night pulls 800,000+ records........what is the total number of rows in the underlying table?

    if there is a way to make it run better....what are you expecting as "better"....faster?...if so what are you current timings?

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • J Livingston SQL (5/8/2015)


    New Born DBA (5/6/2015)


    I have this query which runs every night pulls 800,000+ records. I just wanted to know if there is a way to make it run better.

    SELECT Audit_Object_Type AS [Audit_Object_Type_Int] ,

    Field ,

    Field_Type AS [Field_Type_Int] ,

    From_Value ,

    To_Value ,

    Updated_By_Login ,

    Updated_By ,

    Parent_Hierarchy_Key ,

    DATEADD(S, Create_Date, '1970-01-01 00:00:00') AS [Create Date] ,

    Request_ID

    FROM COL_CORE_Audit_Data

    WHERE Modified_Date >= (DATEDIFF(s, '19700101', GETDATE())- 2678400);

    just for clarification and going back to basics......

    have this query which runs every night pulls 800,000+ records........what is the total number of rows in the underlying table?

    if there is a way to make it run better....what are you expecting as "better"....faster?...if so what are you current timings?

    Well, there are around 2.8 million records. It takes about 40+ seconds to run. When I look at the execution plan, its not even using the Index I created and offocurse there is an Index Scan and not Seek.

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

  • My question is who says you can't drop and recreate the primary key constraint as a nonclustered index on C1?

    My Manager and people who have developed this application.

    Putting a clustered index on C6 would improve this query, and possibly others that do range searches on the same column. Your other choice would be adding the necessary columns to support this query to the current index on C6 as included columns, but that may duplicate the data in the clustered index if you need all of the data columns.

    It did improve a lot. I tried that. It went from 40 seconds to 13 seconds execution time, but it's not something we can change. :angry:

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

  • J Livingston SQL (5/8/2015)


    New Born DBA (5/6/2015)


    I have this query which runs every night pulls 800,000+ records. I just wanted to know if there is a way to make it run better.

    SELECT Audit_Object_Type AS [Audit_Object_Type_Int] ,

    Field ,

    Field_Type AS [Field_Type_Int] ,

    From_Value ,

    To_Value ,

    Updated_By_Login ,

    Updated_By ,

    Parent_Hierarchy_Key ,

    DATEADD(S, Create_Date, '1970-01-01 00:00:00') AS [Create Date] ,

    Request_ID

    FROM COL_CORE_Audit_Data

    WHERE Modified_Date >= (DATEDIFF(s, '19700101', GETDATE())- 2678400);

    just for clarification and going back to basics......

    have this query which runs every night pulls 800,000+ records........what is the total number of rows in the underlying table?

    if there is a way to make it run better....what are you expecting as "better"....faster?...if so what are you current timings?

    ok...so you are selecting 800K from 2800K...approx 30%...yes?

    and having selected these rows....what are you doing with them?......I assume you are not just opening the results in SSMS

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Lynn Pettis (5/8/2015)


    MMartin1 (5/8/2015)


    New Born DBA (5/7/2015)


    Lynn Pettis (5/7/2015)


    New Born DBA (5/7/2015)


    Sorry, but I actually spit soda out my nose on this comment:

    Well I hope you are OK. 😀

    It isn't a system field, the column c1is a user defined column on the table, and the index can be dropped and recreated as a nonclustered index and a clustered index created on the column c6.

    So this is what I was given.

    http://theremedyforit.com/2012/03/understanding-the-request-id-field-in-bmc-remedy/

    I don't see anything in that information that would prevent you from changing the primary key from a clustered index to a nonclustered index.

    For others, here is the link made clickable:

    http://theremedyforit.com/2012/03/understanding-the-request-id-field-in-bmc-remedy/

    Thanks Lynn for helping me out. I really appreciate that 🙂

    From your last comment, I can't decipher if Lynn helped you. Her advise is right on. The SQL engine was scanning the table as it saw that most of the data returned happen in the last 31 days. Ordering the table by this C6/modified_date field will help, probably tremendously. If data inserts to this table later become a issue, then the answer is not to drop the clustered index, but instead look to partition (again by this C6/modified_date field) the data in the base table.

    First, I am not a her but a him. Second, partitioning is not a performance tuning tool.

    i) Apologies on the confusion ... ii) Yes partitioning is not for performance in the sense of retrieving data, hence why I mentioned inserts.

    ----------------------------------------------------

  • New Born DBA (5/8/2015)


    J Livingston SQL (5/8/2015)


    New Born DBA (5/6/2015)


    I have this query which runs every night pulls 800,000+ records. I just wanted to know if there is a way to make it run better.

    SELECT Audit_Object_Type AS [Audit_Object_Type_Int] ,

    Field ,

    Field_Type AS [Field_Type_Int] ,

    From_Value ,

    To_Value ,

    Updated_By_Login ,

    Updated_By ,

    Parent_Hierarchy_Key ,

    DATEADD(S, Create_Date, '1970-01-01 00:00:00') AS [Create Date] ,

    Request_ID

    FROM COL_CORE_Audit_Data

    WHERE Modified_Date >= (DATEDIFF(s, '19700101', GETDATE())- 2678400);

    just for clarification and going back to basics......

    have this query which runs every night pulls 800,000+ records........what is the total number of rows in the underlying table?

    if there is a way to make it run better....what are you expecting as "better"....faster?...if so what are you current timings?

    Well, there are around 2.8 million records. It takes about 40+ seconds to run. When I look at the execution plan, its not even using the Index I created and offocurse there is an Index Scan and not Seek.

    The engine, as I understand it, is quite good at developing a query plan... as long as your stats are fairly accurate. For example> If your query is returning, say 99% of the rows, why not use a table scan? In your case since you are returning about 33%... Can you check that your stats do automatically update and if so are they fairly up to date?

    ----------------------------------------------------

  • MMartin1 (5/8/2015)


    New Born DBA (5/8/2015)


    J Livingston SQL (5/8/2015)


    New Born DBA (5/6/2015)


    I have this query which runs every night pulls 800,000+ records. I just wanted to know if there is a way to make it run better.

    SELECT Audit_Object_Type AS [Audit_Object_Type_Int] ,

    Field ,

    Field_Type AS [Field_Type_Int] ,

    From_Value ,

    To_Value ,

    Updated_By_Login ,

    Updated_By ,

    Parent_Hierarchy_Key ,

    DATEADD(S, Create_Date, '1970-01-01 00:00:00') AS [Create Date] ,

    Request_ID

    FROM COL_CORE_Audit_Data

    WHERE Modified_Date >= (DATEDIFF(s, '19700101', GETDATE())- 2678400);

    just for clarification and going back to basics......

    have this query which runs every night pulls 800,000+ records........what is the total number of rows in the underlying table?

    if there is a way to make it run better....what are you expecting as "better"....faster?...if so what are you current timings?

    Well, there are around 2.8 million records. It takes about 40+ seconds to run. When I look at the execution plan, its not even using the Index I created and offocurse there is an Index Scan and not Seek.

    The engine, as I understand it, is quite good at developing a query plan... as long as your stats are fairly accurate. For example> If your query is returning, say 99% of the rows, why not use a table scan? In your case since you are returning about 33%... Can you check that your stats do automatically update and if so are they fairly up to date?

    Again, it is doing a clustered index scan (also known as a table scan) instead of using the nonclustered index you created on C6 (with no included columns) because the SQL Server Optimizer has determined that it is easier to scan the index and determine which rows of data fit the criteria of the query than use the nonclustered index and do bookmark lookups to the clustered index for the other data.

  • MMartin1 (5/8/2015)


    Lynn Pettis (5/8/2015)


    MMartin1 (5/8/2015)


    New Born DBA (5/7/2015)


    Lynn Pettis (5/7/2015)


    New Born DBA (5/7/2015)


    Sorry, but I actually spit soda out my nose on this comment:

    Well I hope you are OK. 😀

    It isn't a system field, the column c1is a user defined column on the table, and the index can be dropped and recreated as a nonclustered index and a clustered index created on the column c6.

    So this is what I was given.

    http://theremedyforit.com/2012/03/understanding-the-request-id-field-in-bmc-remedy/

    I don't see anything in that information that would prevent you from changing the primary key from a clustered index to a nonclustered index.

    For others, here is the link made clickable:

    http://theremedyforit.com/2012/03/understanding-the-request-id-field-in-bmc-remedy/

    Thanks Lynn for helping me out. I really appreciate that 🙂

    From your last comment, I can't decipher if Lynn helped you. Her advise is right on. The SQL engine was scanning the table as it saw that most of the data returned happen in the last 31 days. Ordering the table by this C6/modified_date field will help, probably tremendously. If data inserts to this table later become a issue, then the answer is not to drop the clustered index, but instead look to partition (again by this C6/modified_date field) the data in the base table.

    First, I am not a her but a him. Second, partitioning is not a performance tuning tool.

    i) Apologies on the confusion ... ii) Yes partitioning is not for performance in the sense of retrieving data, hence why I mentioned inserts.

    Also, before considering partitioning, we would need to know what edition of SQL Server is being used as partitioning is only available in the Enterprise Edition of SQL Server. If they are using Enterprise Edition, the other issue would be that the clustered index would need to be removed from the column C1 and placed on column C6. We have already been told that this cannot be done per the powers that be.

  • Again, it is doing a clustered index scan (also known as a table scan) instead of using the nonclustered index you created on C6 (with no included columns) because the SQL Server Optimizer has determined that it is easier to scan the index and determine which rows of data fit the criteria of the query than use the nonclustered index and do bookmark lookups to the clustered index for the other data.

    Agreed, I think we are saying the same thing. I used table scan and clustered index scan interchangeably. Regardless of stats, the engine is still picking the best plan. It would be wise for the OP to see for himself and rule out stats since he has the concern of why the scan was chosen,where he thinks there is a faster method. IE ... updated stats may make the query faster, but can still maintain the same query plan ( clustered index scan.)

    ----------------------------------------------------

  • Again, it is doing a clustered index scan (also known as a table scan) instead of using the nonclustered index you created on C6 (with no included columns) because the SQL Server Optimizer has determined that it is easier to scan the index and determine which rows of data fit the criteria of the query than use the nonclustered index and do bookmark lookups to the clustered index for the other data.

    I think you may find this particular response from Lynn probably explains what is going on here.

    You can experiment by "including" the other columns in your index and see how it does...

  • CKX (5/11/2015)

    I think you may find this particular response from Lynn probably explains what is going on here.

    You can experiment by "including" the other columns in your index and see how it does...

    I did create a clustered index on C6 and it worked perfectly fine, but we can't drop clustered index from C1. Won't help in our case, because there are tons of other queries we have in our environment where we need clustered index on C1 column.

    I also created a non clustered index, included all the columns in an index, but SQL Server doesn't use that index. Even when I ran the query and the force SQL Server to use the index I created but no luck.

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

  • New Born DBA (5/11/2015)


    CKX (5/11/2015)

    I think you may find this particular response from Lynn probably explains what is going on here.

    You can experiment by "including" the other columns in your index and see how it does...

    I did create a clustered index on C6 and it worked perfectly fine, but we can't drop clustered index from C1. Won't help in our case, because there are tons of other queries we have in our environment where we need clustered index on C1 column.

    I also created a non clustered index, included all the columns in an index, but SQL Server doesn't use that index. Even when I ran the query and the force SQL Server to use the index I created but no luck.

    Show us that index that you had created.

  • Lynn Pettis (5/11/2015)Show us that index that you had created.

    ALTER TABLE [dbo].[T387] ADD PRIMARY KEY CLUSTERED

    (

    [C6] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    I also created this index as well.

    CREATE NONCLUSTERED INDEX [Test] ON [dbo].[T387]

    (

    [C6] ASC

    )

    INCLUDE

    ( All the columns)

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    Edit: "All the column" is not in a create non clustered Index syntax, I don't have index anymore because I ended up deleting it, so I am just showing how the index was created.

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

  • New Born DBA (5/11/2015)


    Lynn Pettis (5/11/2015)Show us that index that you had created.

    ALTER TABLE [dbo].[T387] ADD PRIMARY KEY CLUSTERED

    (

    [C6] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    I also created this index as well.

    CREATE NONCLUSTERED INDEX [Test] ON [dbo].[T387]

    (

    [C6] ASC

    )

    INCLUDE

    ( All the columns)

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    Edit: "All the column" is not in a create non clustered Index syntax, I don't have index anymore because I ended up deleting it, so I am just showing how the index was created.

    So the following (hopefully I mapped everything correctly using the VIEW definition) didn't help:

    CREATE NONCLUSTERED INDEX [Test] ON [dbo].[T387]

    (

    [C6] ASC

    )

    INCLUDE

    (

    C1,

    C3,

    C67001003,

    C875156001,

    C875156006,

    C875156003,

    C875156004,

    C875156005,

    C875156007,

    C875000001

    )

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

  • Lynn Pettis (5/11/2015)So the following (hopefully I mapped everything correctly using the VIEW definition) didn't help:

    CREATE NONCLUSTERED INDEX [Test] ON [dbo].[T387]

    (

    [C6] ASC

    )

    INCLUDE

    (

    C1,

    C3,

    C67001003,

    C875156001,

    C875156006,

    C875156003,

    C875156004,

    C875156005,

    C875156007,

    C875000001

    )

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    No Sir, it didn't.

    "He who learns for the sake of haughtiness, dies ignorant. He who learns only to talk, rather than to act, dies a hyprocite. He who learns for the mere sake of debating, dies irreligious. He who learns only to accumulate wealth, dies an atheist. And he who learns for the sake of action, dies a mystic."[/i]

Viewing 15 posts - 31 through 45 (of 50 total)

You must be logged in to reply to this topic. Login to reply