April 24, 2007 at 3:09 pm
I have an invoice table containing AccountNumber,InvoiceNumber,InvoiceDate and RowID. InvoiceNumber and RowID are always unique values in the whole table.
Invoices are always created in ascending RowID order meaning that the invoice for an account with the highest RowID will be for the most recent billing period.
I want to write a SELECT query to retrieve the most recent and 2nd most recent invoice for each account (TOP 2 per account). Retrieving the most recent invoice is easy. I can't think of an elegant way to retrieve the second most recent invoice.
What are some options?
April 24, 2007 at 3:16 pm
DECLARE @InvoicesToReturn
SET @InvoicesToReturn = 2 -- ideally this value must be stored in database
SET ROWCOUNT @InvoicesToReturn
SELECT RowID
INTO #SelectedInvoices
FROM dbo.Invoice
ORDER BY ....
SET ROWCOUNT 0
SELECT I.*
FROM dbo.Invoice I
INNER JOIN #SelectedInvoices S ON I.RowId = S.RowId
_____________
Code for TallyGenerator
April 25, 2007 at 9:02 am
I don't know whether I'd call it elegant but this works:
DECLARE @invoices TABLE (AccountNumber INT, InvoiceNumber INT, InvoiceDate DATETIME, RowID INT)
INSERT INTO @invoices VALUES (1, 1001, '2007-01-01', 1)
INSERT INTO @invoices VALUES (3, 3002, '2007-01-01', 2)
INSERT INTO @invoices VALUES (1, 1003, '2007-01-02', 3)
INSERT INTO @invoices VALUES (3, 3004, '2007-01-02', 4)
INSERT INTO @invoices VALUES (2, 2005, '2007-01-03', 5)
INSERT INTO @invoices VALUES (3, 3006, '2007-01-04', 6)
-- SELECT * FROM @invoices
SELECT DISTINCT accountNumber
FROM @invoices
DECLARE @top2Invoices TABLE (AccountNumber INT, InvoiceNumber INT, InvoiceDate DATETIME, RowID INT)
-- First Invoice Per Account
INSERT INTO @top2Invoices
SELECT orig.*
FROM @invoices orig
INNER JOIN (
SELECT accountNumber, MAX(invoiceDate) as firstInv
FROM @invoices
GROUP BY accountNumber
)fst
ON fst.accountNumber = orig.accountNumber
AND fst.firstInv = orig.invoiceDate
ORDER BY orig.accountNumber
-- Second Invoice Per Account
INSERT INTO @top2Invoices
SELECT orig.*
FROM @invoices orig
INNER JOIN (
SELECT accountNumber, MAX(invoiceDate) as firstInv
FROM @invoices
WHERE rowID NOT IN (SELECT rowID FROM @top2Invoices)
GROUP BY accountNumber
)fst
ON fst.accountNumber = orig.accountNumber
AND fst.firstInv = orig.invoiceDate
ORDER BY orig.accountNumber
SELECT *
FROM @top2Invoices
ORDER BY accountNumber, invoiceDate
April 25, 2007 at 9:36 am
SELECT a.AccountNumber,a.InvoiceNumber,a.InvoiceDate,a.RowID
FROM [InvoiceTable] a
WHERE a.RowID IN (SELECT TOP 2 b.RowID
FROM [InvoiceTable] b
WHERE b.AccountNumber = a.AccountNumber
ORDER BY b.RowID DESC)
Far away is close at hand in the images of elsewhere.
Anon.
April 25, 2007 at 9:44 am
April 25, 2007 at 11:47 am
This is a little longer but overall the performance should be better than David's the larger the dataset is you are working with.
The reason is that in Davids example (which is commonly used) must evaluate all records in the dataset individually as they enter the processing buffers and executes the TOP 2 subquery each time.
This method builds a derived table for the values aliased as i then joins to the dataset evaluating in fewer steps.
Up to you to determine what works best for your needs but thought I would include as yet another option.
SELECT
X.AccountNumber,
X.InvoiceNumber,
X.InvoiceDate,
X.RowID
FROM
@invoices X
INNER JOIN
(
SELECT
a.AccountNumber,
a.RowID,
COUNT(*) Cnt
FROM
@invoices A
INNER JOIN
@invoices B
ON
A.AccountNumber = B.AccountNumber AND
A.RowID <= B.RowID
GROUP BY
a.AccountNumber,
a.RowID
HAVING
COUNT(*) <= 2
) i
ON
X.AccountNumber = i.AccountNumber AND
X.RowID = i.RowID
April 26, 2007 at 2:09 am
..overall the performance should be better... |
Yes I was thinking of adding another post to indicate that what I posted may not be that efficient for large datasets
and I have to admit that it is not my solution but one that had been posted on this forum many times
Another method I sometimes use is to put the values in a temp table with an identity column and select appropriately
All the methods have merit but as you say 'determine what works best'
Far away is close at hand in the images of elsewhere.
Anon.
April 26, 2007 at 6:23 am
The query below works well enough and is nice and tight. It takes about 20 seconds on a 250,000 row table on a 2 CPU/4 GB server. That's OK for a report that is run a few times per day at most.
SELECT a.AccountNumber,a.InvoiceNumber,a.InvoiceDate,a.RowID
FROM [InvoiceTable] a
WHERE a.RowID IN (SELECT TOP 2 b.RowID
FROM [InvoiceTable] b
WHERE b.AccountNumber = a.AccountNumber
ORDER BY b.RowID DESC)
April 26, 2007 at 6:34 am
Thanks for the feedback Tom
Far away is close at hand in the images of elsewhere.
Anon.
April 26, 2007 at 6:54 am
Revision. So, I found an invoice table with 3.5 million invoices and discovered the query was taking too long - 11 minutes as of when I killed it.
So, I switched to a different approach using temp tables. First, I get a list of the most recent approved invoice for each account and store that in a temp table.
Then, I repeat the query to get the 2nd most recent invoice by excluding those invoices (uniquely identified by InvoiceDirKey) returned by the first query.
This runs in 50 seconds on the 3.5 million row table.
IF OBJECT_ID('tempdb..#LATEST_INVX') IS NOT NULL
DROP TABLE [#LATEST_INVX]
IF OBJECT_ID('tempdb..#2ND_LATEST_INVX') IS NOT NULL
DROP TABLE [#2ND_LATEST_INVX]
DECLARE @InvStatus INT
SELECT @InvStatus=TablDKey FROM WTABLE WHERE TablNumber=13 AND TablName='APPROVED'
SELECT a.AccountNumber,MAX(a.InvoiceDirKey) AS InvoiceDirKey
INTO #LATEST_INVX FROM INVX a
WHERE InvStatus=@InvStatus
GROUP BY a.AccountNumber
CREATE UNIQUE CLUSTERED INDEX [#LATEST_INVX_001] ON [#LATEST_INVX]([InvoiceDirKey]) ON [PRIMARY]
SELECT a.AccountNumber,MAX(a.InvoiceDirKey) AS InvoiceDirKey
INTO #2ND_LATEST_INVX FROM INVX a
WHERE NOT EXISTS (SELECT 1 FROM #LATEST_INVX b WHERE a.InvoiceDirKey=b.InvoiceDirKey)
GROUP BY a.AccountNumber
CREATE UNIQUE CLUSTERED INDEX [#2ND_LATEST_INVX_001] ON [#2ND_LATEST_INVX]([InvoiceDirKey]) ON [PRIMARY]
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply