SQL Server Enterprise Advanced Scan aka Merry-Go-Round Scan
By: Simon Liew | Updated: 2016-10-05 | Comments | Related: More > Performance Tuning
A rarely known feature of SQL Server Enterprise Edition is the Advanced Scan (aka Merry-Go-Round Scan). Microsoft does not seem to mention this feature much, so there is a lack of documentation about how the Advanced Scan works. Also, there doesn't seem to be a trace flag or Extended Events which indicate if queries are using the Advanced Scan. In this tip we will discuss the Advance Scan feature and demonstrate its characteristics.
In SQL Server Enterprise Edition, the Advanced Scan feature allows multiple tasks to share full table scans. This means that SQL Server can read a data page once and provide the same page across different executing sessions. This feature only activates when the table size scanned is larger than the SQL Server buffer pool committed target pages.
Here is an explanation of how Advanced Scan works:
- Let's say there is a table named TableA that contains 200,000 data pages.
- Query 1 performs a table scan on TableA and the table scan reads the first 100,000 pages.
- Query 2 is started which also requires a table scan on TableA.
- SQL Server can schedule another set of read requests for the pages after 100,000 and pass the rows to both scans that are occurring.
- When Query 1 and Query 2 reach page 200,000, Query 1 will complete, but Query 2 will wrap back to the first data page and continue to scan until it reaches page 100,000 and then completes.
- If other queries are started while Query 1 and Query 2 are running that require a table scan on TableA, SQL Server can join the new query to the executing table scan the same way. Hence, this how the Advanced Scan feature works which is also referred to as a merry-go-round scan.
The significance of this feature is multiple tasks are not requesting different data pages individually when performing a scan on the same table. For example, if data page 100,000 is read by Query 1 into the buffer pool, but this data page is already flushed out due to a limited buffer pool size then Query 2 comes along later to request the same data page which will need be read back into memory again. If you have a busy system, this can easily overwhelm the SQL Server buffer pool and cause disk arm contention on very large table scans.
SQL Server Table Scan Exhibiting Advanced Scan Behavior
In this demonstration, SQL Server 2016 Developer Edition RTM is running on a Windows Server 2012 R2 Standard virtual machine on a laptop. The virtual machine is allocated 1.5GB of server memory and the SQL Server maximum memory is set to 1GB. The underlying storage is a 1TB Solid State Disk without any RAID.
A test table will be created with each row taking a full 8K data page. We will insert 2 million rows into the test table which will occupy 15.6GB of space. The test table size is 15 times the size of the SQL Server buffer pool, so SQL Server will be reading data pages mostly from disk and quickly flushing out data pages from the buffer cache during this test.
First we create a database, a table and insert some data.
CREATE DATABASE [AdvancedScan] CONTAINMENT = NONE ON PRIMARY ( NAME = N'AdvancedScan', FILENAME = N'D:\SQLDATA\MSSQL13.MSSQLSERVER\MSSQL\DATA\AdvancedScan.mdf' , SIZE = 20480000KB , FILEGROWTH = 65536KB ) LOG ON ( NAME = N'AdvancedScan_log', FILENAME = N'E:\SQLDATA\MSSQL13.MSSQLSERVER\MSSQL\Log\AdvancedScan_log.ldf' , SIZE = 20480000KB , FILEGROWTH = 65536KB ) GO ALTER DATABASE [AdvancedScan] SET RECOVERY SIMPLE GO CREATE TABLE dbo.AdvScanTest (ID INT IDENTITY(1,1) , FixedCol CHAR(6000) DEFAULT 'a') GO SET NOCOUNT ON; DECLARE @c INT = 1 BEGIN TRAN WHILE @c <= 2000000 BEGIN INSERT INTO dbo.AdvScanTest DEFAULT VALUES SET @c += 1 END COMMIT TRAN
We will be testing using 3 similarly constructed queries with different predicates, but each query will result in a full table scan.
-- Query 1 SELECT COUNT(*) FROM dbo.AdvScanTest WHERE ID BETWEEN 64565 AND 546412 OPTION (MAXDOP 1) -- Query 2 SELECT COUNT(*) FROM dbo.AdvScanTest WHERE ID >= 1200000 OPTION (MAXDOP 1) -- Query 3 SELECT COUNT(*) FROM dbo.AdvScanTest OPTION (MAXDOP 1)
When one of the queries is executed for the first time, it took 33 seconds to complete as shown below.
The three queries are executed around the same time in three different query windows and the execution duration is recorded. To completely clear out everything from memory, the SQL Server service is restarted before each round of execution. This means no data page will remain in the SQL Server buffer cache when the next set of test queries is executed.
Each query is executed 5 times and the duration is measured in seconds in the table below. We seem to be getting pretty consistent results and durations when just a single table scan is occurring. This is probably because the test is run on a laptop with no other users hitting the database server.
|Query 1||Query 2||
|Run 1||42 seconds||42 seconds||41 seconds|
|Run 2||34 seconds||33 seconds||33 seconds|
|Run 3||34 seconds||34 seconds||33 seconds|
|Run 4||34 seconds||35 seconds||36 seconds|
|Run 5||33 seconds||33 seconds||32 seconds|
Checking the SQL Server DMV sys.dm_os_waiting_tasks which returns information about the wait queues for tasks that are waiting on some resource. All three sessions are requesting and waiting on the same data page at any given point-in-time, albeit the queries are scheduled on different SQLOS schedulers. This seems to conform to the Advanced Scan behavior described on MSDN where a single data page read is passed to multiple tasks.
The following T-SQL script should be run in a different query window while the above queries are running.
SELECT rq.start_time, ot.task_state, wt.session_id, ot.task_state , wt.wait_type, wt.resource_description, ot.scheduler_id FROM sys.dm_os_waiting_tasks wt INNER JOIN sys.dm_os_tasks ot ON ot.task_address = wt.waiting_task_address INNER JOIN sys.dm_exec_sessions es ON es.session_id = wt.session_id INNER JOIN sys.dm_exec_requests rq ON rq.session_id = es.session_id WHERE es.is_user_process = 1 AND es.session_id <> @@SPID ORDER BY es.session_id
If you look at the resource_description column you can see that all three
queries are going after the same data page.
Table Scan Reading Different Data Pages
For this test, we will execute the same set of queries again in different query windows, but each query is executed 5 seconds after the previous query is started. So, Query 1 is started, then 5 seconds later Query 2 is started and then 10 seconds later Query 3 is started.
Before each set of executions, the SQL Server service is restarted to make sure the SQL Server buffer pool does not contain any cached data pages.
The queries are executed 5 times and the query duration is recorded below.
|Query 1||Query 2||
|Run 1||1 minute 18 seconds||1 minute 28 seconds||1 minute 22 seconds|
|Run 2||1 minute 23 seconds||1 minute 27 seconds||1 minute 25 seconds|
|Run 3||1 minute 20 seconds||1 minute 23 seconds||1 minute 21 seconds|
|Run 4||1 minute 22 seconds||1 minute 24 seconds||1 minute 22 seconds|
|Run 5||1 minute 19 seconds||1 minute 23 seconds||1 minute 20 seconds|
When checking the SQL Server DMV sys.dm_os_waiting_tasks using the query above, the data pages requested by each query is different. This indicates each session is requesting different data pages individually throughout each query execution to completion.
If you look at the resource_description column you can see that all three queries are going after different data pages, which shows these queries are not using the Advanced Scan and also explains why these queries took longer to complete.
The concept of the Advanced Scan to synchronize reads across different executing sessions is a simple yet brilliant way to avoid unnecessary data page reads and thrashing of the SQL Server buffer pool. You can see from this simple test how large table scans can easily overwhelm the SQL Server buffer pool and disk I/O.
I consider the test table size created in this demonstration relatively small, but the apparent difference in execution duration is likely due to running the test on a laptop with no cache on the storage tier. My assumption would be such a small table scan would manifest a smaller difference if tested on enterprise grade hardware where the storage tier typically has GBs of cache. In an enterprise where multiple table scans occur on table sizes in hundreds on GBs, the Advanced Scan feature could be a good bang for the buck when using SQL Server Enterprise Edition.
- Here is some information about how SQL Server Reads Pages
- Check out the following resources:
About the author
View all my tips
Article Last Updated: 2016-10-05