Tuning help

  • Hi,

    I have a table with appr. 10 million records.(history data)

    While trying to retrieve records with a max query it takes around 4 minutes (appr. 20 lakh records)

    I have included indexes ; clustered index on the data but so far no partioning has been done

    Please let me know Will retieving data be faster if partioning is done?

    And also steps for the same ...

    Thanks!

  • Hi

    Is your query using the clustered index?

    Does your query have a where clause? If it does and you partition your data according to the condition in the where clause AND your query does not need to access more than 1 partition then i think partition might be helpfull. Whew:D

    Other opinions most welcome..

    "Keep Trying"

  • Yes teh indexes are used in the where clause, a glimpse of the query. Highlighted are the indexes used:

    Createddatetime uses clustered index and sid uses non clustered

    SELECT * FROM Ex A WHERE

    ( A.CreatedDateTime in (select max(CS.CreatedDateTime) from Ex CS where CS.conId = A.ConId group by Conid))

    AND ( A.SId in (Select Max(CS.SId) from EX CS where CS.ConId = A.ConId and CS.CreatedDateTime = A.CreatedDateTime group by ConId))

  • There's "used" and there's "used"

    Do the indexes get scanned or is there a seek operation against them? If scanned, how many rows are reported as being scanned?

    Also, the code snippet you showed was using IN clauses with select statements. These should be converted into JOIN statements. They will perform better.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Am using 'IN' since its a self join

  • Not a problem. Just alias the table. Self-join or not, you're looking at the equivalent of a cursor.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • jananik (6/5/2008)


    Yes teh indexes are used in the where clause, a glimpse of the query. Highlighted are the indexes used:

    Createddatetime uses clustered index and sid uses non clustered

    SELECT * FROM Ex A WHERE

    ( A.CreatedDateTime in (select max(CS.CreatedDateTime) from Ex CS where CS.conId = A.ConId group by Conid))

    AND ( A.SId in (Select Max(CS.SId) from EX CS where CS.ConId = A.ConId and CS.CreatedDateTime = A.CreatedDateTime group by ConId))

    you may want to try:

    With CTE

    AS

    ( SELECT col1,.... -- Use column names not "*"

    ROW_NUMBER() OVER ( Partition by CondId, order by createdDateTime DESC, Sid DESC) rn

    )

    SELECT col1,.... -- Use column names not "*"

    FROM CTE

    where rn = 1


    * Noel

  • Actually, the only thing that I see wrong with your query is that you are using GROUP BY in your subSelects when you do not need to.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Try this, see if it doesn't speed up the query a bit:

    ;with

    CTE_CreateTime (ConID, Created) as

    (select conid, max(createddatetime)

    from Ex

    group by conid),

    CTE_SID (ConID, Created, [SID]) as

    (select ex.conid, ex.createddatetime, max([sid])

    from Ex

    inner join CTE_CreateTime cte1

    on ex.conid = cte1.conid

    and ex.createddatetime = cte1.created

    group by ex.conid, ex.createddatetime)

    SELECT a.*

    FROM Ex A

    inner join CTE_SID

    on A.conid = cte_sid.conid

    and a.createddt = cte_sid.created

    and a.[sid] = cte_sid.[sid]

    I recommend replacing the "select *" with a specific list of columns. Even if it's the whole table, it's a good idea, for various reasons.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (6/5/2008)


    Try this, see if it doesn't speed up the query a bit:

    ;with

    CTE_CreateTime (ConID, Created) as

    (select conid, max(createddatetime)

    from Ex

    group by conid),

    CTE_SID (ConID, Created, [SID]) as

    (select ex.conid, ex.createddatetime, max([sid])

    from Ex

    inner join CTE_CreateTime cte1

    on ex.conid = cte1.conid

    and ex.createddatetime = cte1.created

    group by ex.conid, ex.createddatetime)

    SELECT a.*

    FROM Ex A

    inner join CTE_SID

    on A.conid = cte_sid.conid

    and a.createddt = cte_sid.created

    and a.[sid] = cte_sid.[sid]

    Actually, this gives almost identical performance and query plans as the "No Group By" version:

    SELECT *

    FROM Ex A

    WHERE A.CreatedDateTime in (select max(CS.CreatedDateTime) from Ex CS

    where CS.conId = A.ConId

    --group by Conid --dont need these

    )

    AND A.SId in (Select Max(CS.SId) from EX CS

    where CS.ConId = A.ConId

    and CS.CreatedDateTime = A.CreatedDateTime

    --group by ConId --dont need these

    )

    The "No Group By" version is about 00.01% faster according to the query plans because it eliminates an insignificant Sort step. Both are about 10% faster than the original.

    To do any better, we really need to see your table DDL and your query plan.

    Both are about

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Hi Everyone,

    Just to confuse the issue, given that you are working with historical (and therefore I assume relatively static) data, it sounds like you may have a good case for a Data Warehouse solution and might wish to concider the use of Analysis Services as a more robust approach to working with your data.

    Cheers,

    John

  • jananik (6/5/2008)


    While trying to retrieve records with a max query it takes around 4 minutes (appr. 20 lakh records)

    That's 2,000,000 rows that you're returning to the screen... don't measure how long that takes... measure how long it takes to insert into a table with no indexes using SELECT/INTO... see how long that takes... better measure would just be to select into variables that throw the data away... that would be a closer measure of how the server is going to handle things

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply