July 14, 2009 at 7:27 am
Hi,
Best appraoch to get the records randomly, which will give better performance.
thanx.
🙂
July 14, 2009 at 7:33 am
randomizing data returned doesn't always perform well, so I always have to compromise and select fewer rows.
SET STATISTICS TIME ON
SELECT TOP 3 object_name(object_id) As TBName,* from sys.columns order by NEWID()
will produce 3 random rows each time...change that to top 1000 , then top 10000 and you'll see the query will take a bit longer each time.
Lowell
July 14, 2009 at 7:54 am
I think in Lowells example , most of the delay is returning the data to the client
if you look at the query plans for
SELECT TOP 10 object_id from sys.columns order by NEWID()
go
SELECT TOP 100 object_id from sys.columns order by NEWID()
go
SELECT TOP 1000 object_id from sys.columns order by NEWID()
they all scan sys.columns and produce a newid for each row ( 20,000 in my case)
The expensive operation (other than the initial scan) is ordering those newids. The extra expense of the top X rows is small as it still has to order those 20,000 rows to find the top X.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply