I recently had the basic need to retrieve a record from the database at random. What seemed to be an easy task quickly became a complex one. This case showed an interesting quirk with T-SQL that was resolved in an equally quirky way. This quick article shows you a method to retrieve random data or randomize the display of data.
Why would you ever want to retrieve random data?
- In my case, I wanted to pull a random article to display on this site’s homepage
- Choose a random user to receive a prize
- Choose a random employee for a drug test
The problem with retrieving random data using the RAND() function is how it’s actually used in the query. For example, if you run the below query against the Northwind database, you can see that you will see the same random value and date for each row in the results.
SELECT TOP 3 RAND(), GETDATE(), ProductID, ProductName FROM Products Results: 0.544292737664158642003-03-19 15:06:27.32717Alice Mutton 0.544292737664158642003-03-19 15:06:27.3273Aniseed Syrup 0.544292737664158642003-03-19 15:06:27.32740Boston Crab Meat
This behavior prohibits the obvious way to retrieve random data by using a query like this:
SELECT TOP 3 ProductID, ProductName FROM products ORDER BY RAND() Results in: ProductID ProductName ----------- ---------------------------------------- 17 Alice Mutton 3 Aniseed Syrup 40 Boston Crab Meat
If you execute this query over and over again, you should see the same results each time. The trick then is to use a system function that doesn’t use this type of behavior. The newid() function is a system function used in replication that produces a Global Unique Identifier (GUID). You can see in the following query that it produces unique records at the row-level.
SELECT TOP 3 newid(), ProductID, ProductName FROM Products Results in: ProductID ProductName ------------------------------------ ----------- ---------------------------------------- 8D0A4758-0C90-49DC-AF3A-3FC949540B45 17 Alice Mutton E6460D00-A5D1-4ADC-86D5-DE8A08C2DCF0 3 Aniseed Syrup FC0D00BF-F3A2-4341-A584-728DC8DDA513 40 Boston Crab MeatYou can also execute the following query to randomize your data (TOP clause optional):
SELECT TOP 1 ProductID, ProductName FROM products ORDER BY NEWID() Results in: ProductID ProductName ----------- ---------------------------------------- 7 Uncle Bob's Organic Dried Pears
Each time you fire it off, you should retrieve a different result. There’s also an additional way to actually use the rand() function that Itzik Ben-Gan has discovered using user defined functions and views as a workaround. The secret there is to produce a view that uses the rand() function as shown below:
CREATE VIEW VRand AS SELECT RAND() AS rnd GO
Then create a user defined function (only works in SQL Server 2000) that selects from the view and returns the random value.
CREATE FUNCTION dbo.fn_row_rand() RETURNS FLOAT AS BEGIN RETURN (SELECT rnd FROM VRand) END To use the function, you can use syntax as shown below to retrieve random records. SELECT TOP 1 ProductID, ProductName FROM Products ORDER BY dbo.fn_row_rand() GO
This is also handy if you wish to use the getdate() function at the record level to display data. I have found that this method has slight performance enhancement but it is negligible. Make sure you test between the two methods before you use either.