Query Optimization problem(Heavy cost in Top n Sort)

  • I have following table structure.

    RecordID | CustomerTicket | Other details ....

    1 TK01

    2 TK02

    3 TK03

    4 TK04

    Now status of these tickets is placed in another table(maybe its a mistake but...)

    ID | CustomerTicket | StatusID

    1 TK01 1

    2 TK01 2

    3 TK03 3

    Now I have to return latest status against a single ticket so what I did was

    select ...... , (select Top 1 StatusID from TicketStatus where TicketID = 1)

    when I do this and a run an estimated execution plan the heaviest cost is in top n Sort. Is there any way to avoid this, through sql ?

    It is giving me 46 rows in 7 seconds and data is around 200000 or few k's more.

  • To proceed to troubleshoot this firstly, should we assume that TicketID has at least a NON CLUSTERED Index on it?


    Kindest Regards,

  • Could you please detail a little bit more your query? I suppose I haven't understood exactly what you need, because I would have written something like :

    SELECT ID , MAX(StatusID) FROM TicketStatus GROUP BY ID

    Or, if you need to bring MAX(StatusID) in the first table just use the above statement as a virtual table ?

    ------------------

    Virgil Rucsandescu

  • Can you post the entire query please, the table structure and all the indexes on it? Also, please save the execution plan as a .sqlplan file (right-click the exec plan, save as), zip it and attach to your post.

    Thanks

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Without everything that Gail asked for, answering this is tough, but it looks like you've got that SELECT statement as part of your initial SELECT statement. Move it into your table list either as a derived table that you join against, or as part of the join definition of the table you're selecting from, like this:

    SELECT...

    FROM TableA a

    JOIN TableB b

    ON a.Id = b.Id

    AND b.StatusId = (SELECT TOP(1) b2.StatusId

    FROM TableB b2

    WHERE b.Id = b2.Id

    ORDER BY b2.StatusId DESC)

    With good indexing, this should scream.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Following is the query:

    Declare @FromDate varchar(15),@ToDate varchar(15)

    set @FromDate = '17-Apr-2008'

    set @ToDate = '18-Apr-2008'

    select a.OrderID, a.UniqueID, a.CustomerId, a.LocationID, a.OutletID, a.PunchDate, a.ReceivedDate, a.DelieverdDate, b.CustName, c.LocationName, d.OutletName,

    (select top 1 StatusID from CustomerOrderStatus where CustomerOrderStatus.OrderID = a.OrderID order by StatusID DESC) [OrderStatus]

    into #temp

    from Orders a

    inner join CustomerInformation b on a.CustomerID = b.CustomerId

    inner join Locations c on a.LocationId = c.LocationId

    inner join OutletInformation d on a.OutletID = d.OutletID

    where Cast(Left(PunchDate,11) AS DateTime) between @FromDate AND @ToDate

    Now Table Structures:

    Orders:

    UniqueIdvarchar(100)Checked

    CustomerIdintChecked

    LocationIdintChecked

    OrderIdbigintUnchecked

    OutletIdintChecked

    PunchDatedatetimeChecked

    ReceivedDatedatetimeChecked

    DelieverdDatedatetimeChecked

    FinishDatedatetimeChecked

    UserIDintChecked

    ProcessModebitChecked

    PaymentModevarchar(50)Checked

    SPInstruction2varchar(500)Checked

    Index Information of Orders table

    Clustered Index in on OrderID

    Non clustered Index on LocationId, OutletID

    CustomerInformation:

    CustomerIDintUnchecked

    CustNamevarchar(300)Checked

    CustPhonevarchar(200)Checked

    CustMobile1varchar(150)Checked

    CustMobile2varchar(50)Checked

    CustAddressvarchar(500)Checked

    LocationIDintChecked

    OutletIdintChecked

    SaveDatedatetimeChecked

    LastUpdateDatedatetimeChecked

    CustomerCommentsvarchar(200)Checked

    Index Information of CustomerInformation table:

    Clustered Index on following fields:

    CustomerID,CustName

    NonClustered Index on following fields:

    CustomerID

    Locations:

    LocationIdint

    OutletIdint

    LocationNamevarchar(50)

    CityIDint

    Index information of Locations table:

    Clustered Index: CityID

    NonClustered Index: LocationID

    OutletInformation

    OutletIdint

    OutletNamevarchar(50)

    CityIDintChecked

    ManagerNamevarchar(50)

    ResturantTelvarchar(50)

    ContactInfovarchar(50)

    isActivebit

    Clustered Index:OutletID

    Non Clustered Index:OutletID

    CustomerOrderStatus

    CustOrderIDint

    Orderidbigint

    StatusIdint

    StatusDatedatetime

    Non Clustered Index:[CustOrderID] ASC,

    [Orderid] ASC,

    [StatusId] ASC

    ----------------------------------------------------

    I have given SQL Statement, Table Structure, Index Information and finally execution plan in attached rar file.

    I hope to learn more from you guys, please do post helpful comment on this lengthy text.

    Thanks

  • your top N query

    (select top 1 StatusID from CustomerOrderStatus

    where CustomerOrderStatus.OrderID = a.OrderID order by StatusID DESC)

    can be replaced with

    (select max(StatusID) from CustomerOrderStatus

    where CustomerOrderStatus.OrderID = a.OrderID)

    the max() should perform better.

    also, an index on Orders.PunchDate will help things if you change @FromDate and @ToDate to real dates.

    Declare @FromDate datetime, @ToDate datetime

    set @FromDate = '17-Apr-2008'

    set @ToDate = '18-Apr-2008'

    ...

    where PunchDate >= @FromDate and PunchDate < dateadd(day,1,@ToDate)

    lastly, a local table may be helpful if CustomerOrderStatus or Orders are very large.

    Declare @FromDate datetime, @ToDate datetime

    set @FromDate = '17-Apr-2008'

    set @ToDate = '18-Apr-2008'

    declare @tOrders table (OrderId bigint)

    insert into @tOrders

    select OrderId

    from Orders

    where PunchDate >= @FromDate and PunchDate < dateadd(day,1,@ToDate)

    select a.OrderID, a.UniqueID, a.CustomerId, a.LocationID,

    a.OutletID, a.PunchDate, a.ReceivedDate, a.DelieverdDate,

    b.CustName, c.LocationName, d.OutletName, STAT.latestStatusID

    into #temp

    from @tOrders as ORD

    inner join Orders a on a.OrderID = ORD.OrderID

    inner join CustomerInformation b on a.CustomerID = b.CustomerId

    inner join Locations c on a.LocationId = c.LocationId

    inner join OutletInformation d on a.OutletID = d.OutletID

    inner join (select ST.OrderID, max(ST.StatusID) as latestStatusID

    from CustomerOrderStatus ST join @tOrders T on ST.OrderID = T.OrderID

    group by ST.OrderID) as STAT on STAT.OrderID = a.OrderID

  • You've effectively got a cursor going on with that query running in the SELECT statement. I'd try it like this:

    SELECT a.OrderID

    ,a.UniqueID

    ,a.CustomerId

    ,a.LocationID

    ,a.OutletID

    ,a.PunchDate

    ,a.ReceivedDate

    ,a.DelieverdDate

    ,b.CustName

    ,c.LocationName

    ,d.OutletName

    ,co.StatusId

    INTO #temp

    FROM Orders a

    INNER JOIN CustomerInformation b

    ON a.CustomerID = b.CustomerId

    INNER JOIN Locations c

    ON a.LocationId = c.LocationId

    INNER JOIN OutletInformation d

    ON a.OutletID = d.OutletID

    INNER JOIN CustomerOrderStatus co

    ON a.OrderID = co.OrderId

    AND co.StatusId = ( SELECT TOP 1

    StatusID

    FROM CustomerOrderStatus

    WHERE CustomerOrderStatus.OrderID = a.OrderID

    ORDER BY StatusID DESC

    )

    WHERE CAST(LEFT(PunchDate, 11) AS DATETIME) BETWEEN @FromDate

    AND @ToDate

    MAX might work better. I'm actually working on a series of tests between MAX, TOP and ROW_NUMBER to determine which one works better when because I've seen different methods work better in different situations. Test each one in your own environment.

    You should probably make sure you've got a clustered index on CustomerOrderStatus. You don't show one in your description. OrderId should probably be the leading edge on that index too. Again, test it all in your own environment.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • To paraphrase what has been said before me, there are definitely some index adjustments needed. Orders could use an index on PunchDate. CustomerOrderStatus could use an index with OrderId and more.

    Certainly, the main WHERE clause of your query needs to be rewritten, something along the lines of what Antonio has suggested, otherwise it will not use any indexes.

    I have one important question though. You stated:

    Now I have to return latest status against a single ticket so what I did was

    However, unless your business rules specifically support this, selecting the MAX/Highest/Top 1 StatusId does not guarantee that you have selected the "latest", rather you have selected the one with the greatest ID. If you truly want the "latest", then we need to look at ordering based on StatusDate or possibly CustOrderId, depending on your business rules.

    So, what StatusId do you really need?

  • I want latest statusId to be returned.

    I think one way to do this is to store latest status id in the orders table and avoid any joins with customerstatus table.What do u say ?

    I did felt that MAX is better than top cuz in few tests it took 4 - 6 secs less than top

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply