Learn more about SQL Server tools

mssqltips logo
giveaway
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

SQL Server Enterprise Advanced Scan aka Merry-Go-Round Scan


By:   |   Read Comments   |   Related Tips: More > Performance Tuning

Attend a SQL Server Conference for FREE >> click to learn more


Problem

A rarely known feature of SQL Server Enterprise Edition is the Advanced Scan (aka Merry-Go-Round Scan). Microsoft does not seem to mention this feature much, so there is a lack of documentation about how the Advanced Scan works. Also, there doesn't seem to be a trace flag or Extended Events which indicate if queries are using the Advanced Scan.  In this tip we will discuss the Advance Scan feature and demonstrate its characteristics. 

Solution

In SQL Server Enterprise Edition, the Advanced Scan feature allows multiple tasks to share full table scans. This means that SQL Server can read a data page once and provide the same page across different executing sessions. This feature only activates when the table size scanned is larger than the SQL Server buffer pool committed target pages.

Here is an explanation of how Advanced Scan works:

  • Let's say there is a table named TableA that contains 200,000 data pages.
  • Query 1 performs a table scan on TableA and the table scan reads the first 100,000 pages.
  • Query 2 is started which also requires a table scan on TableA.
  • SQL Server can schedule another set of read requests for the pages after 100,000 and pass the rows to both scans that are occurring.
  • When Query 1 and Query 2 reach page 200,000, Query 1 will complete, but Query 2 will wrap back to the first data page and continue to scan until it reaches page 100,000 and then completes.
  • If other queries are started while Query 1 and Query 2 are running that require a table scan on TableA, SQL Server can join the new query to the executing table scan the same way. Hence, this how the Advanced Scan feature works which is also referred to as a merry-go-round scan.

The significance of this feature is multiple tasks are not requesting different data pages individually when performing a scan on the same table. For example, if data page 100,000 is read by Query 1 into the buffer pool, but this data page is already flushed out due to a limited buffer pool size then Query 2 comes along later to request the same data page which will need be read back into memory again.  If you have a busy system, this can easily overwhelm the SQL Server buffer pool and cause disk arm contention on very large table scans.

SQL Server Table Scan Exhibiting Advanced Scan Behavior

In this demonstration, SQL Server 2016 Developer Edition RTM is running on a Windows Server 2012 R2 Standard virtual machine on a laptop. The virtual machine is allocated 1.5GB of server memory and the SQL Server maximum memory is set to 1GB. The underlying storage is a 1TB Solid State Disk without any RAID.

A test table will be created with each row taking a full 8K data page. We will insert 2 million rows into the test table which will occupy 15.6GB of space. The test table size is 15 times the size of the SQL Server buffer pool, so SQL Server will be reading data pages mostly from disk and quickly flushing out data pages from the buffer cache during this test.

First we create a database, a table and insert some data.

CREATE DATABASE [AdvancedScan]
 CONTAINMENT = NONE
 ON  PRIMARY 
( NAME = N'AdvancedScan', FILENAME = N'D:\SQLDATA\MSSQL13.MSSQLSERVER\MSSQL\DATA\AdvancedScan.mdf' , SIZE = 20480000KB , FILEGROWTH = 65536KB )
 LOG ON 
( NAME = N'AdvancedScan_log', FILENAME = N'E:\SQLDATA\MSSQL13.MSSQLSERVER\MSSQL\Log\AdvancedScan_log.ldf' , SIZE = 20480000KB , FILEGROWTH = 65536KB )
GO
ALTER DATABASE [AdvancedScan] SET RECOVERY SIMPLE 
GO

CREATE TABLE dbo.AdvScanTest 
(ID INT IDENTITY(1,1)
, FixedCol CHAR(6000) DEFAULT 'a')
GO

SET NOCOUNT ON;
DECLARE @c INT = 1
BEGIN TRAN

WHILE @c <= 2000000
BEGIN
 INSERT INTO dbo.AdvScanTest DEFAULT VALUES

 SET @c += 1
END

COMMIT TRAN

We will be testing using 3 similarly constructed queries with different predicates, but each query will result in a full table scan.

-- Query 1
SELECT COUNT(*) FROM dbo.AdvScanTest
WHERE ID BETWEEN 64565 AND 546412
OPTION (MAXDOP 1)

-- Query 2
SELECT COUNT(*) FROM dbo.AdvScanTest
WHERE ID >= 1200000
OPTION (MAXDOP 1)

-- Query 3
SELECT COUNT(*) FROM dbo.AdvScanTest 
OPTION (MAXDOP 1)

When one of the queries is executed for the first time, it took 33 seconds to complete as shown below.

SQL Server Quesy Execution Time is 33 Seconds

The three queries are executed around the same time in three different query windows and the execution duration is recorded. To completely clear out everything from memory, the SQL Server service is restarted before each round of execution. This means no data page will remain in the SQL Server buffer cache when the next set of test queries is executed.

Each query is executed 5 times and the duration is measured in seconds in the table below. We seem to be getting pretty consistent results and durations when just a single table scan is occurring. This is probably because the test is run on a laptop with no other users hitting the database server.

  Query 1 Query 2

Query 3

Run 1 42 seconds 42 seconds 41 seconds
Run 2 34 seconds 33 seconds 33 seconds
Run 3 34 seconds 34 seconds 33 seconds
Run 4 34 seconds 35 seconds 36 seconds
Run 5 33 seconds 33 seconds 32 seconds


Three Test SQL Server Queries Running for Apprioximately 40 seconds

Checking the SQL Server DMV sys.dm_os_waiting_tasks which returns information about the wait queues for tasks that are waiting on some resource. All three sessions are requesting and waiting on the same data page at any given point-in-time, albeit the queries are scheduled on different SQLOS schedulers. This seems to conform to the Advanced Scan behavior described on MSDN where a single data page read is passed to multiple tasks.

The following T-SQL script should be run in a different query window while the above queries are running.

SELECT  rq.start_time, ot.task_state, wt.session_id, ot.task_state
, wt.wait_type, wt.resource_description, ot.scheduler_id
FROM  sys.dm_os_waiting_tasks  wt  
INNER  JOIN sys.dm_os_tasks ot ON ot.task_address = wt.waiting_task_address 
INNER JOIN sys.dm_exec_sessions es ON es.session_id = wt.session_id 
INNER JOIN sys.dm_exec_requests rq ON rq.session_id = es.session_id
WHERE es.is_user_process =  1 AND es.session_id <> @@SPID
ORDER BY es.session_id

If you look at the resource_description column you can see that all three queries are going after the same data page.

Validate all three SQL Server Queries are access the same data page

Table Scan Reading Different Data Pages

For this test, we will execute the same set of queries again in different query windows, but each query is executed 5 seconds after the previous query is started. So, Query 1 is started, then 5 seconds later Query 2 is started and then 10 seconds later Query 3 is started.

Before each set of executions, the SQL Server service is restarted to make sure the SQL Server buffer pool does not contain any cached data pages.

The queries are executed 5 times and the query duration is recorded below.

  Query 1 Query 2

Query 3

Run 1 1 minute 18 seconds 1 minute 28 seconds 1 minute 22 seconds
Run 2 1 minute 23 seconds 1 minute 27 seconds 1 minute 25 seconds
Run 3 1 minute 20 seconds 1 minute 23 seconds 1 minute 21 seconds
Run 4 1 minute 22 seconds 1 minute 24 seconds 1 minute 22 seconds
Run 5 1 minute 19 seconds 1 minute 23 seconds 1 minute 20 seconds


SQL Server Table Scans Accessing Different Pages

When checking the SQL Server DMV sys.dm_os_waiting_tasks using the query above, the data pages requested by each query is different. This indicates each session is requesting different data pages individually throughout each query execution to completion.

If you look at the resource_description column you can see that all three queries are going after different data pages, which shows these queries are not using the Advanced Scan and also explains why these queries took longer to complete.

Validate all three SQL Server Queries are accessing different data pages

Summary

The concept of the Advanced Scan to synchronize reads across different executing sessions is a simple yet brilliant way to avoid unnecessary data page reads and thrashing of the SQL Server buffer pool. You can see from this simple test how large table scans can easily overwhelm the SQL Server buffer pool and disk I/O.

I consider the test table size created in this demonstration relatively small, but the apparent difference in execution duration is likely due to running the test on a laptop with no cache on the storage tier. My assumption would be such a small table scan would manifest a smaller difference if tested on enterprise grade hardware where the storage tier typically has GBs of cache. In an enterprise where multiple table scans occur on table sizes in hundreds on GBs, the Advanced Scan feature could be a good bang for the buck when using SQL Server Enterprise Edition.

Next Steps


Last Update:


signup button

next tip button



About the author
MSSQLTips author Simon Liew Simon Liew is an independent SQL Server Consultant in Sydney, Australia. He is a Microsoft Certified Master for SQL Server 2008 and holds a Masterís Degree in Distributed Computing.

View all my tips
Related Resources





Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Notify for updates 


SQL tips:

*Enter Code refresh code     



Learn more about SQL Server tools