December 16, 2010 at 12:42 am
How to select random (10 or 20) records from table?
December 16, 2010 at 1:38 am
Use Top clause without order by clause.
Select top 10 columnname from yourtable.
--------------------------------------------------------------------------------------------------
I am just an another naive wannabe DBA trying to learn SQL Server
December 16, 2010 at 2:05 am
Sachin Nandanwar (12/16/2010)
Use Top clause without order by clause.Select top 10 columnname from yourtable.
This would mostly give the same result again and again. Use NEWID() in the ORDER BY Clause
SELECT TOP 10 * FROM YourTable ORDER BY NEWID()
How to post data/code on a forum to get the best help - Jeff Moden
http://www.sqlservercentral.com/articles/Best+Practices/61537/
December 16, 2010 at 4:17 pm
Here is another method I've seen/used that is pretty similar to the ORDER BY NEWID(), but much faster:
SELECT TOP 10 * FROM MyTable
WHERE 0.01 >= CAST(CHECKSUM(NEWID(), SomeColumnName) & 0x7fffffff AS float) / CAST (0x7fffffff AS int)
NOTE: Repalce "SomeColumnName" with some columns or set of columns from your table, PKs work well.
December 16, 2010 at 5:10 pm
Using ORDER BY NEWID() with huge files is slow, because of the sort time.
Generating TOP (X) WHERE (random number test) is fast, but tends to skew towards the front of the table, whatever that may be.
If you have an indexed IDENTITY (ID) column in your primary table, I would generate a temp table of 10-20 distinct random numbers between min(ID) and max(ID) and join it to the primary.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
December 17, 2010 at 11:53 am
Here is an example of pulling 20 random rows from a million row table.
Note to Jeff Moden: It may not be "Nasty Fast" but it's "Wicked Quick" 😀
Had to make this an attachment. For some reason, SSC is blocking my posts.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
December 17, 2010 at 6:57 pm
The Dixie Flatline (12/17/2010)
Note to Jeff Moden: It may not be "Nasty Fast" but it's "Wicked Quick"
Heh... too funny, Bob. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
March 1, 2014 at 10:34 pm
The Dixie Flatline (12/16/2010)
Using ORDER BY NEWID() with huge files is slow, because of the sort time.Generating TOP (X) WHERE (random number test) is fast, but tends to skew towards the front of the table, whatever that may be.
If you have an indexed IDENTITY (ID) column in your primary table, I would generate a temp table of 10-20 distinct random numbers between min(ID) and max(ID) and join it to the primary.
I am in a place where I also need to select random rows. I like the NEWID() solution because it's simple. Can you say when a file is too huge to use it on?
Is the temp table of random numbers between min(ID) and max(ID) etc....really the superior one?
Thanks.
March 2, 2014 at 9:49 am
Would this work? DECLARE @Rand INT
SET @Rand = (RAND() * 20) + 10
SELECT TOP @Rand FROM MYTable WHERE SomeIntColumn > @Rand
______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience
March 2, 2014 at 12:04 pm
KoldCoffee (3/1/2014)
Is the temp table of random numbers between min(ID) and max(ID) etc....really the superior one?
For performance, absolutely. For accuracy in returning the proper number of rows, probably not because the IDENTITY column can have gaps in it.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 3, 2014 at 5:03 pm
Here is another way to improve performance while ensuring the desired number of rows. (see attachment) Whether or not you really need the performance gain depends on the size of the gain, how often you are going to run the query, etc.
One of the problems with the TOP/ORDER BY technique and large tables is that it can takes huge sorts in order to select a small number of rows. The more rows you throw into a sort, the slower it runs. The code below avoids this by selecting a random subset of rows prior to the sort (with a cushion to make sure that more than the desired number of rows is available.) It still uses the ORDER BY technique to defeat the front-end loading problem, but runs consistently faster simply because fewer rows have to be sorted. For larger number of rows it is outperforming the original solution. Good hunting.
P.S. to Jeff: The subject table now deliberately includes gaps, but its a moot point since the newer technique doesn't generate the table of random N values.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 4, 2014 at 3:01 am
Hi Dixie Flatline, I saw this on another post and was trying to understand it so I appreciate that you explained the code more in this version. I will study it and thanks.
March 4, 2014 at 6:55 am
I know Bob (Dixie) has done a masterful job with this script.
Here is a slightly different route I have taken in the past that was also quite efficient and performed well.
http://jasonbrimhall.info/2010/12/13/sql-confessions/
Yes it is similar to other methods mentioned.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
March 4, 2014 at 10:31 am
Masterful? Do you need to borrow money or something, Jason?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Viewing 15 posts - 1 through 15 (of 24 total)
You must be logged in to reply to this topic. Login to reply