By: Greg Robidoux | Comments (1) | Related: > Functions System
Problem
Have you ever had the need to randomly retrieve a set of records from a database table? So for each time the query runs it returns a totally random set of data.
Solution
We will look at two ways you might think about about doing this and show the approach that is much easier to implement and does what we need.
Using RAND() to return random data
You might think one way of doing this is to use the RAND() function to generate a random number and then tie this back to your original record set.
The RAND() function works as follows:
SELECT RAND()
So if we generate a random number for each row and then sort by those random numbers, maybe this will solve our problem. The problem with using the RAND() function is that it gives the same value for the entire recordset.
So if we run this query we get the following result with each record having the same RandomValue.
SELECT TOP 5 ProductName, RAND() as RandomValue FROM dbo.products ORDER BY RandomValue
The problem with using the RAND() function is that it gives the same value for the entire recordset. So if we run the above query over and over again, the ProductName will always be the same list and just the RandomValue will change but it will be the same for all rows which doesn't help solve the problem.
ProductName | RandomValue |
---|---|
Alice Mutton | 0.80020779549998178 |
Aniseed Syrup | 0.80020779549998178 |
Boston Crab Meat | 0.80020779549998178 |
Camembert Pierrot | 0.80020779549998178 |
Carnarvon Tigers | 0.80020779549998178 |
One option would be to create a User Defined Function (UDF) to return a random number for each row. The random number would then be calculated for each record in the record set and therefore you could sort on that random number since each value would be unique. This approach would work, but seems a bit complex for what we are trying to achieve and will probably not be the fastest option either.
Using NEWID() to return random data
A simple solution would be to use the NEWID() function. The NEWID() function returns a uniqueidentifier for each row of data that is returned and each value is different.
So to simply retrieve the first 5 rows from the products table we would issue the following statement.
SELECT TOP 5 productName FROM dbo.products ORDER BY NEWID()
The first run returns the following recordset:
- Guaraná Fantástica
- Tunnbröd
- Ikura
- Filo Mix
- Genen Shouyu
The second run returns a totally different recordset:
- Mozzarella di Giovanni
- Scottish Longbreads
- Gustaf's Knäckebröd
- Louisiana Fiery Hot Pepper Sauce
- Filo Mix
So you can see achieving a totally random list of records is very straight forward using the NEWID() function.
NEWID() Performance Overhead
Be careful when using this especially on large tables to make sure this does not become a performance issue, since each row has to get a value and then just a certain number would be returned based on the TOP option. So in the above example, the query would scan then entire table, apply a NEWID for each row, sort the data and then return the TOP 5 rows based on the sort order.
One thing you could do is to use a WHERE clause to limit the scope of records that you are working with and therefore the query will be much faster since it will work with a subset of data instead of the entire table.
Next Steps
- Learn more about the NEWID() function
- If you need to retrieve a random list of records use this function in your SELECT statements
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips