May 21, 2008 at 9:55 am
If I have a simple query like this:
SELECT d.[DocumentName]
,d.[DocumentId]
,v.[VersionDescription]
,v.[VersionId]
FROM dbo.[Document] d
JOIN dbo.[Version] v
ON d.[DocumentId] = v.[DocumentId]
AND v.[VersionId] = (SELECT TOP (1) v2.VersionId
FROM dbo.[Version] v2
WHERE v2.DocumentId = v.DocumentId
ORDER BY v2.DocumentId, v2.VersionId DESC
)
WHERE d.[DocumentId] = 9729
I can convert it to use ROW_NUMBER like this:
SELECT x.*
FROM ( SELECT d.[DocumentName],
d.[DocumentId],
v.[VersionDescription],
v.[VersionId],
ROW_NUMBER()OVER (ORDER BY v.VersionId DESC) AS RowNum
FROM dbo.[Document] d
JOIN dbo.[Version] v ON d.[DocumentId] = v.[DocumentId]
WHERE d.[DocumentId] = 9729
) AS x
WHERE x.RowNum = 1
But when I have a slightly more complicated query like this:
SELECT d.[DocumentName],
d.[DocumentId],
v.[VersionDescription],
pu.[VersionId],
p.[PublisherName],
pu.[PublicationDate],
pu.[PublicationNumber]
FROM dbo.[Document] d
CROSS APPLY ( SELECT TOP ( 1 )
v2.VersionId,
v2.DocumentId,
v2.VersionDescription
FROM dbo.[Version] v2
WHERE v2.DocumentId = d.DocumentId
ORDER BY v2.DocumentId,
v2.VersionId DESC
) AS v
JOIN dbo.[Publication] pu
ON pu.[DocumentId] = d.[DocumentId]
AND pu.[VersionId] = ( SELECT TOP ( 1 )
pu2.versionid
FROM dbo.Publication pu2
WHERE pu2.DocumentId = d.DocumentId
AND pu2.VersionId <= v.[VersionId]
AND pu2.PublisherId = pu.PublisherId
ORDER BY pu2.DocumentId,
pu2.VersionId DESC
)
JOIN dbo.[Publisher] p
ON pu.[PublisherId] = p.[PublisherId]
WHERE d.[DocumentId] = 10432
AND p.[PublisherId] = 4813
I'm stumped. Anyone have a good shot at this? So far the failures not pretty.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 21, 2008 at 10:12 am
Hmm...Trying to work this one out.
The first one (in the CROSS APPLY), returns the Highest version of a given document in the VERSION TABLE.
The second one seems to be taking an awfully hard road to get to the high version of a document in the DOCUMENT table. Are there times when the DOCUMENT table wouldn't have a row matching the highest version in the VERSION table?
If "no" - then this might do it:
SELECT d.[DocumentName],
d.[DocumentId],
v.[VersionDescription],
pu.[VersionId],
p.[PublisherName],
pu.[PublicationDate],
pu.[PublicationNumber]
FROM dbo.[Document] d
inner join ( SELECTRow_number() over (Partition by v2.documentID order by v2.versionID DESC) RN,
v2.VersionId,
v2.DocumentId,
v2.VersionDescription
FROM dbo.[Version] v2
) AS v on d.documentID=v.documentID
JOIN dbo.[Publication] pu
ON pu.[DocumentId] = d.[DocumentId]
AND pu.[VersionId] = v.versionID --just go for the max version at all times.
JOIN dbo.[Publisher] p
ON pu.[PublisherId] = p.[PublisherId]
WHERE d.[DocumentId] = 10432
AND p.[PublisherId] = 4813
and v2.rn=1
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
May 21, 2008 at 10:20 am
hmm... never mind earlier comment. I think this one's a little better anyway.
SELECT d.[DocumentName],
d.[DocumentId],
v.[VersionDescription],
pu.[VersionId],
p.[PublisherName],
pu.[PublicationDate],
pu.[PublicationNumber]
FROM dbo.[Document] d
inner join
( SELECTRow_number() over (Partition by v2.documentID order by v2.versionID DESC) RN,
v2.VersionId,
v2.DocumentId,
v2.VersionDescription
FROM dbo.[Version] v2
) AS v ON d.documentID=v.documentID
LEFT OUTER JOIN
( SELECT Row_number() over (Partition by pu2.documentID, publisherID order by pu2.versionID DESC) RN,
pu2.versionid,
pu2.documentID,
pu2.publicationdate,
pu2.publicationnumber,
pu2.publisherID
FROM dbo.Publication pu2
) puON pu.[DocumentId] = d.[DocumentId]
JOIN dbo.[Publisher] p
ON pu.[PublisherId] = p.[PublisherId]
WHERE d.[DocumentId] = 10432
AND p.[PublisherId] = 4813
and v2.rn=1
and pu.rn=1
Edit: I think you need the second one to partition by documentID AND publisherID
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
May 21, 2008 at 12:44 pm
Ah, partition by... crud. I was bouncing all around it and I just couldn't figure it out. That looks great. I tested it against the single row and large set and it worked fine.
Thanks for the help.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 21, 2008 at 12:58 pm
Grant Fritchey (5/21/2008)
Ah, partition by... crud. I was bouncing all around it and I just couldn't figure it out. That looks great. I tested it against the single row and large set and it worked fine.Thanks for the help.
Any time!
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
May 21, 2008 at 1:12 pm
Have you tried turning those row_number sub-queries into aggregates? I've tried both, and sometime the aggregates are faster.
Something like this:
;with
-- Publication Version CTE, used for Pub CTE
PubVersion (DocID, PubID, VerID) as
(SELECT pu2.documentID, publisherID, max(pu2.versionID)
FROM dbo.Publication pu2
GROUP BY pu2.documentID, publisherID),
-- Publication data CTE, used in final query
Pub (DocID, PubID, VerID, PublisherName, PublicationDate, PublicationNumber) as
(select DocID, PubID, VerID, PublisherName, PublicationDate,
PublicationNumber
from PubVersion
inner join dbo.Publication p
on PubVersion.DocID = p.DocumentID
and PubVersion.PubID = p.PublisherID
and PubVersion.VerID = p.VersionID
inner join dbo.[Publisher] p2
on PubVersion.PubID = p2.PublisherID),
-- Document Version CTE, used in final query
DocVersion (DocID, VerID) as
(SELECT v2.documentID, max(v2.versionID)
FROM dbo.[Version] v2)
SELECT
d.[DocumentName],
d.[DocumentId],
v.[VersionDescription],
v.VerID as [VersionId],
pub.[PublisherName],
pub.[PublicationDate],
pub.[PublicationNumber]
FROM
dbo.[Document] d
INNER JOIN dbo.[Version] v
on d.DocumentID = v.DocumentID
INNER JOIN DocVersion
on v.DocumentID = DocVersion.DocID
and v.VersionID = DocVersion.VerID
LEFT OUTER JOIN Pub
on d.DocumentID = Pub.DocID
I find CTEs more readable than derived tables in the From clause, so I set it out that way. See if that'll do what you need.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
May 21, 2008 at 1:18 pm
Yeah. I'm actually working on comparing MAX, TOP and ROW_NUMBER. I was hoping to outline which might be better when.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 21, 2008 at 1:22 pm
I'm definitely interested in your test results on that.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
May 21, 2008 at 1:30 pm
Grant Fritchey (5/21/2008)
Yeah. I'm actually working on comparing MAX, TOP and ROW_NUMBER. I was hoping to outline which might be better when.
A worthy investigation. I look forward to your results.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 21, 2008 at 1:33 pm
Another thing you might test on it is using an indexed view with the aggregates already in it. That's one of the things they're good for.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
May 21, 2008 at 1:38 pm
Grant Fritchey (5/21/2008)
Yeah. I'm actually working on comparing MAX, TOP and ROW_NUMBER. I was hoping to outline which might be better when.
If you want to see some REALLY interesting things - throw in the MAX() OVER () into your testing. Careful though - it might not return what you'd expect.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
May 21, 2008 at 2:13 pm
Now, now... we're stepping outside the bounds of the experiment.
I'll have to see what I can do with a materialized view though. That's a good idea just to try out.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 21, 2008 at 2:49 pm
Matt Miller (5/21/2008)
If you want to see some REALLY interesting things - throw in the MAX() OVER () into your testing.
Max () OVER ()???? :blink:
< wanders off in search of the Tome of All Knowledge >
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply