December 3, 2011 at 9:24 am
tmccar (12/3/2011)
...The "first occurrence" will be the lowest value of "Project_" in "Large". I hope that's clear!Tom
Note: In my code above I'm assuming that you want the first value of Project_ when Project_ is sorted in alpha order ascending. If you're actually using a different rule to determine "lowest value of Project_" then you'll need to alter the query accordingly.
December 3, 2011 at 9:34 am
Yes, that's looking good. The actual file has 51 columns and 117,000 rows but it should be possible to apply the same procedures to it.
December 3, 2011 at 9:37 am
tmccar (12/3/2011)
Yes, that's looking good. The actual file has 51 columns and 117,000 rows but it should be possible to apply the same procedures to it.
Yep, the general concept should stay the same. It all boils down to figuring out how to identify the one row you are interested in for each group, and using that in your join predicates.
December 3, 2011 at 9:45 am
If I'd only know the expected result based on the sample data I might be able to provide an alternative solution (using ROW_NUMBER() and/or CROSS APPLY)...
December 3, 2011 at 10:24 am
LutzM (12/3/2011)
If I'd only know the expected result based on the sample data I might be able to provide an alternative solution (using ROW_NUMBER() and/or CROSS APPLY)...
I played with those approaches too, but didn't really find anything that performed better than what I originally posted. Using ROW_NUMBER alone I got almost identical performance results to the MIN()/GROUP BY version, although possibly slower...really hard to tell on the small sample size though. CROSS APPLY seemed to perform much worse, due to it causing multiple scans of the table inside the ITVF. Admittedly I may have been missing a better way to write it though. Here's what I tried using ROW_NUMBER():
SELECT drv.*, Small.Path
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY [Drawing Number] ORDER BY Project_) AS RowNum, Large.*
FROM Large
) AS drv INNER JOIN
Small ON Small.[Drawing Number] = drv.[Drawing Number]
WHERE RowNum = 1
Here's what I tried using CROSS APPLY:
SELECT s.Path, l.*
FROM Small AS s CROSS APPLY
(
SELECT TOP 1 Large.*
FROM Large -- This results in Large getting scanned once for every distinct value of Drawing Number in Small
WHERE Large.[Drawing Number] = s.[Drawing Number]
ORDER BY Project_ ASC
) as l
Anything you would suggest doing differently to get either of those to perform better?
Viewing 5 posts - 16 through 19 (of 19 total)
You must be logged in to reply to this topic. Login to reply