Getting the nth record for each group

Question

Getting the nth record for each group

Mark Firth-200666

Old Hand

Points: 363
More actions
July 11, 2006 at 6:54 pm

#191264

Candidate for QOD
I have a table with 3.3m rows, the table is well indexed to support the queries.
There are 500 records for each IssuerID.
CREATE TABLE [dbo].[FactPV](
[ID] [int] NOT NULL,
[ScenarioDate] [datetime] NOT NULL,
[IssuerID] [int] NOT NULL,
[PortfolioID] [int] NULL,
[PV01h] [numeric](38, 10) NULL,
[PV01l] [numeric](38, 10) NULL,
[CSh] [numeric](25, 10) NULL,
[CSl] [numeric](25, 10) NULL,
[PVTotal] [numeric](25, 10) NULL,
) ON [PRIMARY]
For each Issuerid in a subset I need to get
6th highest PVTotal
6th lowest PVTotal
6th highest [CSh] + [CSl]
I currently load all the records for the issuers into a table var
I create a second table var for the results
I use the following within a cursor for each IssuerID
1 SET ROWCOUNT 6
Insert into @tblFcVaR
Select DataDate, F.IssuerID, PVTotal, 0
From @tbltemp F
where F.IssuerId= @IssuerID
Order by PVTotal
2 Select top 1 @DwnPV = PVTotal from @tblFcVaR order by PVTotal desc
delete from @tblFcVaR
This is repeated for lowest PVTotal and highest [CSh] + [CSl]
and the 3 vars are inserted into the result table.
This is repeated for each Issuer (6-200 times)
The Question!
Will I get better performance from selecting directly against the table and therefore getting the benefit of the indexes or selecting against the table var with less data (500k vs 3.3m) and no IO.
I presume there is no way to get the 6th record in 1 step.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Phil Factor SSC-Insane Points: 20244 More actions · Answer 1

Well, to get the sixth highest (or lowest with a very slight mod..) in every group, a simple way would be to skim off the keys for the top five of each group into a table variable and use a NOT IN () to get the sixth. This doesn't really work too well if there is a tie but otherwise would do the trick. I've done a sketch using dummy data to try to illustrate what I mean.....

SET nocount ON
DECLARE  @DummyData TABLE(MyID INT IDENTITY(1,1), name VARCHAR(20), score INT, Team VARCHAR(5))
INSERT INTO @DummyData SELECT 'fred',5,'blue' INSERT INTO @DummyData SELECT 'mary',2,'blue'
INSERT INTO @DummyData SELECT 'jane',1,'blue' INSERT INTO @DummyData SELECT 'paul',3,'blue'
INSERT INTO @DummyData SELECT 'bill',2,'blue' INSERT INTO @DummyData SELECT 'barry',4,'blue'
INSERT INTO @DummyData SELECT 'garry',6,'blue' INSERT INTO @DummyData SELECT 'Raj',2,'blue'
INSERT INTO @DummyData SELECT 'brad',5,'blue' INSERT INTO @DummyData SELECT 'Mo',7,'blue'
INSERT INTO @DummyData SELECT 'joe',12,'blue' INSERT INTO @DummyData SELECT 'mell',8,'blue'
INSERT INTO @DummyData SELECT 'keith',13,'blue' INSERT INTO @DummyData SELECT 'Shane',9,'blue'
INSERT INTO @DummyData SELECT 'ian',7,'red' INSERT INTO @DummyData SELECT 'mary',5,'red'
INSERT INTO @DummyData SELECT 'jane',12,'red' INSERT INTO @DummyData SELECT 'paul',13,'red'
INSERT INTO @DummyData SELECT 'damian',5,'red' INSERT INTO @DummyData SELECT 'barry',7,'red'
INSERT INTO @DummyData SELECT 'garry',8,'red' INSERT INTO @DummyData SELECT 'pete',2,'red'
INSERT INTO @DummyData SELECT 'keith',6,'red' INSERT INTO @DummyData SELECT 'sanji',9,'red'
INSERT INTO @DummyData SELECT 'joe',15,'red' INSERT INTO @DummyData SELECT 'mell',18,'red'
INSERT INTO @DummyData SELECT 'keith',13,'red' INSERT INTO @DummyData SELECT 'Shane',19,'red'
INSERT INTO @DummyData SELECT 'andy',4,'red' INSERT INTO @DummyData SELECT 'Saun',5,'red'

DECLARE  @topRankers TABLE (TheirID INT, iteration INT)
DECLARE @ii INT
DECLARE @iiMax INT

SELECT @ii=6
WHILE @ii>0
       BEGIN
       INSERT INTO @TopRankers(TheirID, iteration)
       SELECT MyID,@ii FROM @DummyData g INNER JOIN
               (
               SELECT [highscore]=MAX(score), team FROM @DummyData
               WHERE myID NOT IN 
                       (
                       SELECT TheirID FROM @TopRankers
                       )
                GROUP BY team
               )f
       ON f.highScore=Score AND f.team=g.team
       SELECT @ii=@ii-1
       END
SELECT name, score, team FROM @dummyData INNER JOIN @TopRankers ON theirID=myID 
        WHERE iteration=1

Best wishes,
Phil Factor

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 2

Mark

In the absence of any sample data, I've used the Northwind database. This query finds the second highest value of Freight for each CustomerID in the Orders table. You should be able to tweak it fairly easily to suit your needs. No guarantees given on performance - although it doesn't use an explicit cursor or WHILE loop, I think it will have to run the TOP query once for each CustomerID.

John

select distinct o1.customerid,

(select top 1 oo.freight from

(select top 2 o2.freight from Orders o2

where o2.CustomerID = o1.CustomerID

order by o2.Freight desc) oo

order by oo.Freight asc)

as SecondFreight

from orders o1

Ward Pond SSC Eights! Points: 999 More actions · Answer 3

You should be able to write a version of this query using the new ROW_NUMBER intrinsic as well..

Take care..

- Ward Pond
blogs.technet.com/wardpond

Mark Firth-200666 Old Hand Points: 363 More actions · Answer 4

Outcome from the ideas - thanks.

Stage 1 was to select directly against the heavily indexed FactPV table. This reduced the processing from 6:14 to 3:24

I then created a summary table which reduced the row count from 3.3m to 1.25m and selected directly against the indexed summary table. This reduced the processing from 3:24 to 1:55

I then implemented the Row_Number method and reduced the processing to 1:24. I had to create a view to do the sum outside the Row_Number method. While it probably could have been done using another select I dislike too many nested select statements.

WITH PVData AS

(SELECT PVTotal,

ROW_NUMBER() OVER (order by PVTotal)as RowNumber

FROM vwFactPVSum01

where IssuerId= @IssuerID

&nbsp

SELECT @DwnPV = PVTotal

FROM PVData

WHERE RowNumber = 6;

Thanks you all and especially Ward Pond for the Row_Number hint.

Mark Firth