April 1, 2005 at 3:02 pm
Remi, I've actually been thinking about doing something very similar... interesting approach - thanks!
April 1, 2005 at 3:07 pm
Unless you're really short on disk space, I don't see the problem with putting all the data in a temp table, especially if you'd doing this just once and then dropping it.
If I understand this problem correctly... I might use an undocumented trick I learned, which is to use a clustered index to order the rows, then use an UPDATE to "rank" them, and divide the ranks by the number of rows in the group:
SELECT * INTO #t FROM tblStateCensus_top10percent
ALTER TABLE #t ADD pct FLOAT
CREATE CLUSTERED INDEX #IX_t ON #t(state, population desc)
SELECT state, COUNT(*) AS region_count
INTO #tstates
GROUP BY state
DECLARE @rank INT
DECLARE @state CHAR(2)
SET @state = ''
UPDATE #t
FROM #t JOIN #tstates ON #t.state = #tstate.state
SET @rank = CASE WHEN @state = #t.state THEN @rank + 1 ELSE 1 END,
pct = CAST(@rank AS FLOAT) / #tstate.region_count,
@state = state
SELECT *
FROM #t
WHERE pct < 0.90
-- or is it WHERE pct < 0.10 ?
April 1, 2005 at 10:27 pm
I also thaught of something similar but I didn't pursue it because I wanted a set based approach. This temp table solution would however be much faster than the set solution I presented (the in clause of the exists will absolutely destroy the performance of this query on a large table).
April 1, 2005 at 10:37 pm
Here's another thaught: the in clause will kill the performance because the whole list will be scanned untill a match is found. Since
where 1 in (3,1,2)
is the same as
where 1 = 3 or 1 = 1 or 1 = 2
and that sql server uses shortcircuiting, this should improve the performance greatly (can't test this theory here) :
SELECT O.XType
-- , count(*) AS TotalHits_Found
, O.name
, (SELECT CEILING(COUNT(*) * 0.9) FROM dbo.SysObjects O4 WHERE O4.XType = O.XType) as [90%]
, (SELECT COUNT(*) FROM dbo.SysObjects O5 WHERE O5.XType = O.XType) AS [100%]
FROM dbo.SysObjects O
WHERE EXISTS (SELECT * FROM dbo.SysObjects O2 WHERE O2.XType = O.XType AND O2.id = O.id and O2.id IN (SELECT TOP 90 PERCENT id FROM dbo.SysObjects O3 WHERE O3.XType = O.XType ORDER BY /*send the wanted item at the front of the list*/ CASE WHEN O3.id = O2.id then 0 ELSE 1 END))
--GROUP BY O.XType
ORDER BY O.XType
, O.Name
April 4, 2005 at 12:55 pm
I finally had the opportunity to check my theory.
1 - it doesn't work, as you ALWAYS move the searched item in the top 90% group which yields false results.
2 - it doesn't seem to work any faster. I assume that since you the order by case must evaluate all rows and then the in only 1, it takes more than than the in never evaluating all the rows unless no hit is found or the item is at the end of the list. But I don't know enough about the servers internals to be certain of that.
April 5, 2005 at 12:52 am
Remi,
interesting thread. Look like a one-man show here. Wasn't there recently a remark on posting multiple times when none or one time is enough?
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
April 5, 2005 at 6:27 am
Yes but this time I was replying to myself as new ideas came to me. And it wasn't pointless chatting like this .
April 5, 2005 at 8:32 am
I appreciated all the responses. Got a lot out of the answers. Thank you to everyone who took the time to help me out!
-Heather
April 5, 2005 at 8:36 am
Have you tried running my statemement on the 8M rows table?.. I was wondering how long it would take to run it (if it doesn't interferre with production).
April 5, 2005 at 8:40 am
Yes but this time I was replying to myself as new ideas came to me. And it wasn't pointless chatting like this
Just another definition for the same thing.
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
April 5, 2005 at 8:42 am
Obviously, but I look better that way .
April 5, 2005 at 8:44 am
Okay, boys - behave! I know I'm cute, but I won't have you fighting over me. hehe.
April 5, 2005 at 8:50 am
Hehe... it's not over you we're arguing and as far as I know Frank is married with children (new born included).
Have you tested my query on a large table... I have nothing to test it on a table that remotely compares to your environement?
April 5, 2005 at 8:52 am
I know, it was a joke. Totally kidding.
I'll try that query out today and let you know how it goes.
-heather
April 5, 2005 at 8:55 am
Three to be precise here.
Obviously, but I look better that way .
Now if I recall this correct, you say in English:
Beauty lies in the eyes of the beholder.
So, if you think you look better this way, who am I to argue against
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
Viewing 15 posts - 16 through 30 (of 32 total)
You must be logged in to reply to this topic. Login to reply