December 2, 2003 at 4:50 am
Hi,
We have a problem with a query used very often. We want to calculate the last operation for a "set of utlities", and it takes about a second if the row contains 80,000 rows. I wander if it can be done faster...
I have converted my problem to the Northwind database and it would be more or less equivallent to the question. "What are the last order dates of all french customers". The result is the set:
Blondel père et fils1998-01-12 00:00:00.000
Bon app'1998-05-06 00:00:00.000
Du monde entier1998-02-16 00:00:00.000
Folies gourmande1997-12-22 00:00:00.000
France restauration1998-03-24 00:00:00.000
La corne d'abondance1998-03-24 00:00:00.000
La maison d'Asie1998-04-27 00:00:00.000
Spécialités du monde1998-04-22 00:00:00.000
Victuailles en stock1998-01-23 00:00:00.000
Vins et alcools Chevalier1997-11-12 00:00:00.000
An obvious way to do this is:
SELECT Customers.CompanyName, MAX(Orders.OrderDate) AS LastOrderDate
FROM Orders INNER JOIN
Customers ON Orders.CustomerID = Customers.CustomerID
WHERE (Customers.Country = 'France')
GROUP BY Customers.CompanyName
The generated statistics give:
Table 'Orders'. Scan count 11, logical reads 233, physical 0, read-ahead 0.
Table 'Customers'. Scan count 1, logical reads 183, physical 0, read-ahead 0.
An alternative query could be:
SELECT distinct cus.CompanyName, ord.OrderDate
FROM Customers cus, Orders ord
WHERE cus.Country= 'france'
AND cus.CustomerID = ord.CustomerID
AND ord.OrderDate =
(SELECT MAX(ordsub.OrderDate)
FROM Orders ordsub
WHERE ordsub.CustomerID = cus.CustomerID)
The generated statistics for this one give:
Table 'Customers'. Scan count 90, logical reads 182, physical 0, read-ahead 0.
Table 'Orders'. Scan count 1, logical reads 21, physical 0, read-ahead 0.
I like the latter more as it scans more on the "smaller" table. However, in our case they run still equally slow (read about a second) so I consider now a backpointer "to the last order date" to boost performance)
Question:
- Which of the two should theoretically give the fastest result?
- Is there a faster query / manner / technique?
Thx
Kurt
X
X
December 2, 2003 at 5:24 am
The first query should be faster , if you had the appropriate indexes on
Customers.Country and composite on Orders.CustomerId + Orderdate (or clustered on Customerid)
December 2, 2003 at 6:56 am
Thanx GRN,
A composite index indeed might help!
About putting an index on "Country" in table "Customer" (in our case "Type" in table "Utility"), we don't do that really if the table only contains 50 rows or so. Could that intervene in this case?
Cheers
Kurt
X
X
December 2, 2003 at 7:19 am
Ok for those numbers the lower the better all the way across. The higher the more work is having to be done. But this still does not mean the lowest one is the best choice with regards to what query to use just helps.
Since I don't know a lote about your data I will offer this.
Query 1 is the simplest and works best as opposed to query 2. The reason is the data is joined then filtered in query 1, in query 2 each record from the customer table is joined to the order table and as the sets enter the buffer they are validated in the subquery each time. So if you have 1000 orders then 1000 validations occurr with the subquery which impacts the server exponentially as the datasize grows.
Now to offer 2 other potential canidate query concepts that although do not bennifit in the Northwind database may help you.
SELECT
cus.CompanyName,
ord.OrderDate
FROM
Customers cus
INNER JOIN
(
SELECT
ordsub.CustomerID,
MAX(ordsub.OrderDate) OrderDate
FROM
Orders ordsub
GROUP BY
ordsub.CustomerID
) ord
ON
cus.CustomerID = ord.CustomerID
WHERE
cus.Country = 'france'
ORDER BY
cus.CompanyName
The above works best when you have a medium sized Customers table (couple of thousand unique records) and a large Orders table (hundreds of thousands into the millions).
Or try this
SELECT cus.CompanyName, ord.OrderDate
FROM
(
SELECT
cussub.CompanyName,
cussub.CustomerID
FROM
Customers cussub
WHERE
cussub.Country = 'france'
) cus
INNER JOIN
(
SELECT
ordsub.CustomerID,
MAX(ordsub.OrderDate) OrderDate
FROM
Orders ordsub
GROUP BY
ordsub.CustomerID
) ord
ON
cus.CustomerID = ord.CustomerID
ORDER BY
cus.CompanyName
This works best when you have a large Customers table and a large Orders table.
But the trick is trying as many solutions as possible and seeing which does the work the fastest and most efficient for the speed.
But indexing is essential.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply