Last-Date performance issues

  • Hi,

    We have a problem with a query used very often. We want to calculate the last operation for a "set of utlities", and it takes about a second if the row contains 80,000 rows. I wander if it can be done faster...

    I have converted my problem to the Northwind database and it would be more or less equivallent to the question. "What are the last order dates of all french customers". The result is the set:

    Blondel père et fils1998-01-12 00:00:00.000

    Bon app'1998-05-06 00:00:00.000

    Du monde entier1998-02-16 00:00:00.000

    Folies gourmande1997-12-22 00:00:00.000

    France restauration1998-03-24 00:00:00.000

    La corne d'abondance1998-03-24 00:00:00.000

    La maison d'Asie1998-04-27 00:00:00.000

    Spécialités du monde1998-04-22 00:00:00.000

    Victuailles en stock1998-01-23 00:00:00.000

    Vins et alcools Chevalier1997-11-12 00:00:00.000

    An obvious way to do this is:

    SELECT Customers.CompanyName, MAX(Orders.OrderDate) AS LastOrderDate

    FROM Orders INNER JOIN

    Customers ON Orders.CustomerID = Customers.CustomerID

    WHERE (Customers.Country = 'France')

    GROUP BY Customers.CompanyName

    The generated statistics give:

    Table 'Orders'. Scan count 11, logical reads 233, physical 0, read-ahead 0.

    Table 'Customers'. Scan count 1, logical reads 183, physical 0, read-ahead 0.

    An alternative query could be:

    SELECT distinct cus.CompanyName, ord.OrderDate

    FROM Customers cus, Orders ord

    WHERE cus.Country= 'france'

    AND cus.CustomerID = ord.CustomerID

    AND ord.OrderDate =

    (SELECT MAX(ordsub.OrderDate)

    FROM Orders ordsub

    WHERE ordsub.CustomerID = cus.CustomerID)

    The generated statistics for this one give:

    Table 'Customers'. Scan count 90, logical reads 182, physical 0, read-ahead 0.

    Table 'Orders'. Scan count 1, logical reads 21, physical 0, read-ahead 0.

    I like the latter more as it scans more on the "smaller" table. However, in our case they run still equally slow (read about a second) so I consider now a backpointer "to the last order date" to boost performance)

    Question:

    - Which of the two should theoretically give the fastest result?

    - Is there a faster query / manner / technique?

    Thx

    Kurt

    X


    X

  • The first query should be faster , if you had the appropriate indexes on

    Customers.Country and composite on Orders.CustomerId + Orderdate (or clustered on Customerid)

  • Thanx GRN,

    A composite index indeed might help!

    About putting an index on "Country" in table "Customer" (in our case "Type" in table "Utility"), we don't do that really if the table only contains 50 rows or so. Could that intervene in this case?

    Cheers

    Kurt

    X


    X

  • Ok for those numbers the lower the better all the way across. The higher the more work is having to be done. But this still does not mean the lowest one is the best choice with regards to what query to use just helps.

    Since I don't know a lote about your data I will offer this.

    Query 1 is the simplest and works best as opposed to query 2. The reason is the data is joined then filtered in query 1, in query 2 each record from the customer table is joined to the order table and as the sets enter the buffer they are validated in the subquery each time. So if you have 1000 orders then 1000 validations occurr with the subquery which impacts the server exponentially as the datasize grows.

    Now to offer 2 other potential canidate query concepts that although do not bennifit in the Northwind database may help you.

    
    
    SELECT
    cus.CompanyName,
    ord.OrderDate
    FROM
    Customers cus
    INNER JOIN
    (
    SELECT
    ordsub.CustomerID,
    MAX(ordsub.OrderDate) OrderDate
    FROM
    Orders ordsub
    GROUP BY
    ordsub.CustomerID
    ) ord
    ON
    cus.CustomerID = ord.CustomerID
    WHERE
    cus.Country = 'france'
    ORDER BY
    cus.CompanyName

    The above works best when you have a medium sized Customers table (couple of thousand unique records) and a large Orders table (hundreds of thousands into the millions).

    Or try this

    
    
    SELECT cus.CompanyName, ord.OrderDate
    FROM
    (
    SELECT
    cussub.CompanyName,
    cussub.CustomerID
    FROM
    Customers cussub
    WHERE
    cussub.Country = 'france'
    ) cus
    INNER JOIN
    (
    SELECT
    ordsub.CustomerID,
    MAX(ordsub.OrderDate) OrderDate
    FROM
    Orders ordsub
    GROUP BY
    ordsub.CustomerID
    ) ord
    ON
    cus.CustomerID = ord.CustomerID
    ORDER BY
    cus.CompanyName

    This works best when you have a large Customers table and a large Orders table.

    But the trick is trying as many solutions as possible and seeing which does the work the fastest and most efficient for the speed.

    But indexing is essential.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply