November 1, 2007 at 1:33 pm
Hi everyone-
Prob an easy question, is it possible to sample every nth record in SQL 2005? I have a database with over 4 million records and wish to sample every 100th record? Thanks!
November 1, 2007 at 2:46 pm
Assuming you want a random sampling:
declare @n int
set @n=100
select t.* from
(select *, row_number() over (Order by newID() ) as 'ranknumber' from table1) t
where ranknumber%@n=0
Or...(in the case one out of 100)
select top 1 percent * from
table1
order by newID()
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
November 2, 2007 at 6:55 am
This is unlikely to run very fast, 43 seconds on my machine against 54 million rows, but it'll work.
WITH x AS (SELECT b.[ID]
,b.[Date]
,b.[Value]
,ROW_NUMBER() OVER(ORDER BY b.[ID]) AS RowNum
FROM [BigTable] b
)
SELECT * FROM x
WHERE (x.rownum % 100 = 0)
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 2, 2007 at 11:24 am
Try this:
SELECT *
FROM your_table
TABLESAMPLE SYSTEM (1 PERCENT) ;
November 7, 2007 at 2:39 pm
Thanks everyone!
November 8, 2007 at 2:16 am
Suppose i have 250 records. If i tried by the above suggested queries i am getting 3 .All fine. But how to confirm those are the recoeds in 0th , 100th and 200th positions. I tried but getting different records. Please clear my confusion
November 12, 2007 at 5:32 am
deb (11/8/2007)
Suppose i have 250 records. If i tried by the above suggested queries i am getting 3 .All fine. But how to confirm those are the recoeds in 0th , 100th and 200th positions. I tried but getting different records. Please clear my confusion
I can't address all the examples, but from the example I supplied, with no ORDER BY statement, you can't guarantee getting the same values every time. Adding the ORDER BY would guarantee that.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 12, 2007 at 11:53 am
I came across this in my notes, but haven't tried it lately ... (too tired & busy right now)
+++++++++
--This selects every 500 th record, using an identity field
SELECT fielda, fieldb, fieldc
FROM Table_Name
WHERE Identityfield % 500 = 0
November 12, 2007 at 12:22 pm
That's right there with my take on it, but I tried supplying the ID field in case the table didn't have one, or there were gaps or whatever. "Every 'X'" pretty much equated to modulo in my mind.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 13, 2007 at 6:57 am
If you have an identity column then % operator should be the quickest mechanism to get every N records back.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
November 13, 2007 at 7:01 am
Except if there are gaps (which there usually are) and you really need exactly every nth record from some particular order. Then you still have to do some kind of processing to arrive at the solution.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 13, 2007 at 7:25 am
That was the same line of thinking I used when I threw in the ROW_NUMBER() solution. At least you know you won't get a gap that way.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
November 13, 2007 at 7:43 am
I took it that if there was an identity column and the Nth record was absent then that Nth record shouldn't be returned. 😛
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
November 13, 2007 at 8:40 am
True - but if records with ID 40-60 are gone, then record with ID=100 isn't the 100th record anymore...
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply