July 11, 2006 at 6:54 pm
Candidate for QOD
I have a table with 3.3m rows, the table is well indexed to support the queries.
There are 500 records for each IssuerID.
CREATE TABLE [dbo].[FactPV](
[ID] [int] NOT NULL,
[ScenarioDate] [datetime] NOT NULL,
[IssuerID] [int] NOT NULL,
[PortfolioID] [int] NULL,
[PV01h] [numeric](38, 10) NULL,
[PV01l] [numeric](38, 10) NULL,
[CSh] [numeric](25, 10) NULL,
[CSl] [numeric](25, 10) NULL,
[PVTotal] [numeric](25, 10) NULL,
) ON [PRIMARY]
For each Issuerid in a subset I need to get
6th highest PVTotal
6th lowest PVTotal
6th highest [CSh] + [CSl]
I currently load all the records for the issuers into a table var
I create a second table var for the results
I use the following within a cursor for each IssuerID
1 SET ROWCOUNT 6
Insert into @tblFcVaR
Select DataDate, F.IssuerID, PVTotal, 0
From @tbltemp F
where F.IssuerId= @IssuerID
Order by PVTotal
2 Select top 1 @DwnPV = PVTotal from @tblFcVaR order by PVTotal desc
delete from @tblFcVaR
This is repeated for lowest PVTotal and highest [CSh] + [CSl]
and the 3 vars are inserted into the result table.
This is repeated for each Issuer (6-200 times)
The Question!
Will I get better performance from selecting directly against the table and therefore getting the benefit of the indexes or selecting against the table var with less data (500k vs 3.3m) and no IO.
I presume there is no way to get the 6th record in 1 step.
July 12, 2006 at 3:06 am
Well, to get the sixth highest (or lowest with a very slight mod..) in every group, a simple way would be to skim off the keys for the top five of each group into a table variable and use a NOT IN () to get the sixth. This doesn't really work too well if there is a tie but otherwise would do the trick. I've done a sketch using dummy data to try to illustrate what I mean.....
SET nocount ON DECLARE @DummyData TABLE(MyID INT IDENTITY(1,1), name VARCHAR(20), score INT, Team VARCHAR(5)) INSERT INTO @DummyData SELECT 'fred',5,'blue' INSERT INTO @DummyData SELECT 'mary',2,'blue' INSERT INTO @DummyData SELECT 'jane',1,'blue' INSERT INTO @DummyData SELECT 'paul',3,'blue' INSERT INTO @DummyData SELECT 'bill',2,'blue' INSERT INTO @DummyData SELECT 'barry',4,'blue' INSERT INTO @DummyData SELECT 'garry',6,'blue' INSERT INTO @DummyData SELECT 'Raj',2,'blue' INSERT INTO @DummyData SELECT 'brad',5,'blue' INSERT INTO @DummyData SELECT 'Mo',7,'blue' INSERT INTO @DummyData SELECT 'joe',12,'blue' INSERT INTO @DummyData SELECT 'mell',8,'blue' INSERT INTO @DummyData SELECT 'keith',13,'blue' INSERT INTO @DummyData SELECT 'Shane',9,'blue' INSERT INTO @DummyData SELECT 'ian',7,'red' INSERT INTO @DummyData SELECT 'mary',5,'red' INSERT INTO @DummyData SELECT 'jane',12,'red' INSERT INTO @DummyData SELECT 'paul',13,'red' INSERT INTO @DummyData SELECT 'damian',5,'red' INSERT INTO @DummyData SELECT 'barry',7,'red' INSERT INTO @DummyData SELECT 'garry',8,'red' INSERT INTO @DummyData SELECT 'pete',2,'red' INSERT INTO @DummyData SELECT 'keith',6,'red' INSERT INTO @DummyData SELECT 'sanji',9,'red' INSERT INTO @DummyData SELECT 'joe',15,'red' INSERT INTO @DummyData SELECT 'mell',18,'red' INSERT INTO @DummyData SELECT 'keith',13,'red' INSERT INTO @DummyData SELECT 'Shane',19,'red' INSERT INTO @DummyData SELECT 'andy',4,'red' INSERT INTO @DummyData SELECT 'Saun',5,'red' DECLARE @topRankers TABLE (TheirID INT, iteration INT) DECLARE @ii INT DECLARE @iiMax INT SELECT @ii=6 WHILE @ii>0 BEGIN INSERT INTO @TopRankers(TheirID, iteration) SELECT MyID,@ii FROM @DummyData g INNER JOIN ( SELECT [highscore]=MAX(score), team FROM @DummyData WHERE myID NOT IN ( SELECT TheirID FROM @TopRankers ) GROUP BY team )f ON f.highScore=Score AND f.team=g.team SELECT @ii=@ii-1 END SELECT name, score, team FROM @dummyData INNER JOIN @TopRankers ON theirID=myID WHERE iteration=1
Best wishes,
Phil Factor
July 12, 2006 at 3:29 am
Mark
In the absence of any sample data, I've used the Northwind database. This query finds the second highest value of Freight for each CustomerID in the Orders table. You should be able to tweak it fairly easily to suit your needs. No guarantees given on performance - although it doesn't use an explicit cursor or WHILE loop, I think it will have to run the TOP query once for each CustomerID.
John
select distinct o1.customerid,
(select top 1 oo.freight from
(select top 2 o2.freight from Orders o2
where o2.CustomerID = o1.CustomerID
order by o2.Freight desc) oo
order by oo.Freight asc)
as SecondFreight
from orders o1
July 12, 2006 at 12:14 pm
You should be able to write a version of this query using the new ROW_NUMBER intrinsic as well..
- Ward Pond
blogs.technet.com/wardpond
July 12, 2006 at 9:36 pm
Outcome from the ideas - thanks.
Stage 1 was to select directly against the heavily indexed FactPV table. This reduced the processing from 6:14 to 3:24
I then created a summary table which reduced the row count from 3.3m to 1.25m and selected directly against the indexed summary table. This reduced the processing from 3:24 to 1:55
I then implemented the Row_Number method and reduced the processing to 1:24. I had to create a view to do the sum outside the Row_Number method. While it probably could have been done using another select I dislike too many nested select statements.
WITH PVData AS
(SELECT PVTotal,
ROW_NUMBER() OVER (order by PVTotal)as RowNumber
FROM vwFactPVSum01
where IssuerId= @IssuerID
 
SELECT @DwnPV = PVTotal
FROM PVData
WHERE RowNumber = 6;
Thanks you all and especially Ward Pond for the Row_Number hint.
Mark Firth
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply