March 28, 2003 at 7:52 pm
Hi:
I'm having difficulty with a query that takes much longer than I think it should, but maybe not.
Here is the situation
I have five tables, two of which are temporary.
Table_1: Licenses (temp)
------------------------
Description: A list of unique serial numbers, each represents single product license
Fields:
Serial_No
LastVersionShipped
Table_2: SNHistory (temp)
-------------------
Description: Linked to the Licenses table by the Serial_No fields. The OldSerial_Nos field has serial numbers that have, in the past, represented the same product license.
Fields:
Serial_No
OldSerial_Nos
Table_3: Transact (perm)
-------------------------
Description: Line item details from an invoice. Each record represents a product shipment for a specific license.
Fields:
Serial_No
Product_ID
InvoiceKey
Table_4: Prodkits (perm)
------------------------
Description: Product definitions
Fields:
Product_ID
ProductType
Version
Table_5: ShipLog (perm)
-----------------------
Description: Invoice line Items
Fields:
InvoiceKey
Ship_Date
The goal
---------
determine the most recent product version that has shipped for a specific license. Shipments for a specific product can have been done under a number of different serial numbers. No serial number, active or inactive, ever represents more than one distinct license.
So we have something along the lines of
Licenses
--------
Serial_No
345678
LicenseHistory
-----------------------
Serial_No / OldSerial_Nos
345678 / 123456
345678 / 234567
345678 / 345678
Transact
-------------------------------
Serial_No / Product_ID / InvoiceKey
123456 / 1 / 111
123456 / 7 / 251
234567 / 20 / 491
345678 / 152 / 5031
Prodkits
---------------------------
Product_ID / ProdType / Version
1 / Release / 1.000
7 / Upgrade / 2.500
20 / Update / 2.750
152 / Update / 5.200
Shiplog
---------------------------------------
InvoiceKey / Ship_Date
111 / 01-01-1991
251 / 07-25-1993
491 / 10-31-1995
5031 / 03-31-2003
There are more than 12,000 records in the License table and I'm looking to capture the last version shipped for each one of them.
For various reasons, the only way I can come up with to identify the LAST version is to order by the Ship_Date of the invoice. The Version field is a text field. To date, we have not had version numbers with alpha characters, but I don't think it is something that can be ruled out.
So the query I created is:
UPDATE Licenses
SET LastVersionShipped =
(SELECT TOP 1 T5.Version
FROM Licenses AS T2 INNER JOIN LicenseHistory AS T3 ON T2.Serial_No = T3.Serial_No
INNER JOIN Transact AS T4 ON T3.OldSerial_Nos = T4.Serial_No
INNER JOIN Prodkits AS T5 ON T4.Product_ID = T5.Product_ID
INNER JOIN Shiplog AS T6 ON T4.InvoiceKey = T6.InvoiceKey
WHERE T2.Serial_No = T1.Serial_No AND
T5.ProdType IN ('Release', 'Update', 'Upgrade')
ORDER BY T6.Ship_Date DESC)
FROM Licenses AS T1
But this seems to take far too long to complete, if it does complete. I've never waited long enought to find out.
In trying to simplify it, most of the licenses (>80%)have only been represented by a single serial number. So, if I modify the query to restrict it to such licenses (There is a way to identify such licenses, I excluded it from the descriptions above to keep things simple), the query finishes in a little over a minute.
Taking that a step further, modifying the query to update only those records (~2500) that have been represented by multiple serial numbers. It runs for 30 minutes and then TEMPDB runs out of disk space. Obviously, I'm not doing something right.
I'm not asking for anyone to write the query for me, but can anyone tell me why is this approach fails? I've tried to think of other approaches to the problem, but this is the only one that I can come up with, and it seems like a reasonable answer.
One last observation:
If I remove the ORDER BY T6.Ship_Date DESC from the query that eats all the disk space, the query completes in less than a minute. The answers are wrong, but the query completes.
Thanks ,
JK
March 31, 2003 at 8:00 am
This was removed by the editor as SPAM
March 31, 2003 at 8:44 am
I don't know if it will help, but try to modify the "T5.ProdType IN ('Release', 'Update', 'Upgrade')
" part to a JOIN. With other words: put the 'Release' and 'Upgrade' values in a table and join that table with table T5 (ProdKits).
Something like this:
UPDATE Licenses
SET LastVersionShipped =
(SELECT TOP 1 T5.Version
FROM Licenses AS T2 INNER JOIN LicenseHistory AS T3 ON T2.Serial_No = T3.Serial_No
INNER JOIN Transact AS T4 ON T3.OldSerial_Nos = T4.Serial_No
INNER JOIN Prodkits AS T5 ON T4.Product_ID = T5.Product_ID
INNER JOIN Shiplog AS T6 ON T4.InvoiceKey = T6.InvoiceKey
-- Modify BEGIN
INNER JOIN ReleaseTypes AS T7 ON T5.ProdType = T7.TypeID
-- Modify END
WHERE T2.Serial_No = T1.Serial_No
ORDER BY T6.Ship_Date DESC)
FROM Licenses AS T1
where ReleaseTypes table would be something like this:
ReleaseTypes
------------
TypeId/TypeDescription
Release/Release version
Update/Product update
I noticed that in T-SQL on MS SQL Server using character comparisons in SELECT statements, reduces performance/increases the SELECT's running time.
Hope it will help!
Best regards,
Boti
🙂
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply