Randomly Retrieve SQL Server Records

By:   |   Comments (1)   |   Related: > Functions System


Problem

Have you ever had the need to randomly retrieve a set of records from a database table?  So for each time the query runs it returns a totally random set of data.

Solution

We will look at two ways you might think about about doing this and show the approach that is much easier to implement and does what we need.

Using RAND() to return random data

You might think one way of doing this is to use the RAND() function to generate a random number and then tie this back to your original record set.

The RAND() function works as follows:

SELECT RAND()
This returns a value like this: 0.7154366573674853

So if we generate a random number for each row and then sort by those random numbers, maybe this will solve our problem. The problem with using the RAND() function is that it gives the same value for the entire recordset.

So if we run this query we get the following result with each record having the same RandomValue.

SELECT TOP 5 ProductName, RAND() as RandomValue 
FROM dbo.products 
ORDER BY RandomValue

The problem with using the RAND() function is that it gives the same value for the entire recordset.  So if we run the above query over and over again, the ProductName will always be the same list and just the RandomValue will change but it will be the same for all rows which doesn't help solve the problem.

ProductName RandomValue
Alice Mutton  0.80020779549998178
Aniseed Syrup  0.80020779549998178
Boston Crab Meat  0.80020779549998178
Camembert Pierrot  0.80020779549998178
Carnarvon Tigers  0.80020779549998178

One option would be to create a User Defined Function (UDF) to return a random number for each row.  The random number would then be calculated for each record in the record set and therefore you could sort on that random number since each value would be unique. This approach would work, but seems a bit complex for what we are trying to achieve and will probably not be the fastest option either.

Using NEWID() to return random data

A simple solution would be to use the NEWID() function.  The NEWID() function returns a uniqueidentifier for each row of data that is returned and each value is different.

So to simply retrieve the first 5 rows from the products table we would issue the following statement.

SELECT TOP 5 productName 
FROM dbo.products 
ORDER BY NEWID()

The first run returns the following recordset:

  • Guaraná Fantástica
  • Tunnbröd
  • Ikura
  • Filo Mix
  • Genen Shouyu

The second run returns a totally different recordset:

  • Mozzarella di Giovanni
  • Scottish Longbreads
  • Gustaf's Knäckebröd
  • Louisiana Fiery Hot Pepper Sauce
  • Filo Mix

So you can see achieving a totally random list of records is very straight forward using the NEWID() function.

NEWID() Performance Overhead

Be careful when using this especially on large tables to make sure this does not become a performance issue, since each row has to get a value and then just a certain number would be returned based on the TOP option.  So in the above example, the query would scan then entire table, apply a NEWID for each row, sort the data and then return the TOP 5 rows based on the sort order.

One thing you could do is to use a WHERE clause to limit the scope of records that you are working with and therefore the query will be much faster since it will work with a subset of data instead of the entire table.

Next Steps
  • Learn more about the NEWID() function
  • If you need to retrieve a random list of records use this function in your SELECT statements


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Greg Robidoux Greg Robidoux is the President and founder of Edgewood Solutions, a technology services company delivering services and solutions for Microsoft SQL Server. He is also one of the co-founders of MSSQLTips.com. Greg has been working with SQL Server since 1999, has authored numerous database-related articles, and delivered several presentations related to SQL Server. Before SQL Server, he worked on many data platforms such as DB2, Oracle, Sybase, and Informix.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Monday, November 17, 2008 - 6:23:48 AM - Rohit Madhok Back To Top (2215)

When selecting rows randomly from a large table one may refer to http://msdn.microsoft.com/en-us/library/cc441928.aspx

e.g. SELECT top 100 pk_column_id FROM MyTable WHERE (ABS(CAST((BINARY_CHECKSUM(pk_column_id, NEWID())) as bigint)) % 100) < 10

where, 'pk_column_id' is the name of the primary key column (defined as bigint)















get free sql tips
agree to terms