September 16, 2019 at 1:33 pm
Hello,
I have the following dataset:
Family Room subfamily
1 5 6
2 5 7
3 5 7
4 2 3
I would like to extract the data from this table randomly. It would take in the extraction a piece taken at random by couple family / sub-family.
For this dataset for example the extraction would be:
Family Room subfamily
1 5 6
3 5 7
4 2 3
I do not know how to attack the request to have the expected result ... If someone has clues I am a taker.
Thanks in advance.
September 16, 2019 at 4:19 pm
You can use TOP and ORDER BY NEWID() like this:
-- Sample data
DECLARE @yourdata TABLE
(
Family INT,
Room INT,
subfamily INT
);
INSERT @yourdata VALUES (1,5,6),(2,5,7),(3,5,7),(4,2,3),(4,2,4);
-- Number of Rows
DECLARE @rows INT = 3;
SELECT TOP (@rows) y.Family, y.Room, y.subfamily
FROM @yourdata AS y
ORDER BY NEWID();
You can return a random number of rows using CHECKSUM and NEWID like this:
DECLARE @yourdata TABLE
(
Family INT,
Room INT,
subfamily INT
);
INSERT @yourdata VALUES (1,5,6),(2,5,7),(3,5,7),(4,2,3),(4,2,4);
-- number of rows to return (e.g. 2 to 4 rows)
DECLARE @low INT = 2, @high INT = 4;
SELECT TOP (ABS(CHECKSUM(NEWID())%(@high-@low+1))+@low) y.Family, y.Room, y.subfamily
FROM @yourdata AS y
ORDER BY NEWID();
-- Itzik Ben-Gan 2001
September 16, 2019 at 6:37 pm
>> I have the following dataset: <<
And thanks to your lack of netiquette, we have no DDL. When you're at work do you ask other people to do the basic things that an SQL programmer should do because you can't do them yourself? we have no datatypes, no keys no constraints. And even column names are vague.
>> I would like to extract the data from this table randomly. It would take in the extraction a piece taken at random by couple family / sub-family. I do not know how to attack the request to have the expected result ... If someone has clues I am a taker. <<
What sampling method do you want to use? Random? Stratified? SQL was not meant for this and the attempts to paste it into SQL have been disappointing. How big a sample do you want to get from your population? To do this right, you're going to have to know some basic statistics and it doesn't sound like you do.
You might want to look at an old, but still, a very good introductory book entitled "A Sampler on Sampling" by Bill Williams (ISBN 0471030368).
Please post DDL and follow ANSI/ISO standards when asking for help.
September 16, 2019 at 7:01 pm
And thanks to your lack of netiquette, we have no DDL.
I think what I posted should answer the OP's question as well as yours Joe.
-- Itzik Ben-Gan 2001
September 16, 2019 at 9:22 pm
> You can return a random number of rows using CHECKSUM and NEWID <<
Is this method a uniform random thing? I think I remember that a lot of these SQL Server trick skew toward the rows that are at the front of a data page ...
Please post DDL and follow ANSI/ISO standards when asking for help.
September 16, 2019 at 10:12 pm
Why not use TABLESAMPLE?
SELECT * FROM YourTable TABLESAMPLE(100 ROWS)
SELECT * FROM YourTable TABLESAMPLE(10 PERCENT)
--Vadim R.
September 18, 2019 at 12:08 am
This can be done using
SELECT TOP (@rows) y.Family, y.Room, y.subfamily
FROM @Table AS y
ORDER BY NEWID();
OR
SELECT y.Family, y.Room, y.subfamily
FROM @Table AS y TABLESAMPLE(3 ROWS)
But, the TABLESAMPLE clause cannot be applied to derived tables, tables from linked servers, tables derived from table-valued functions, rowset functions or OPENXML or from views.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply