Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Scripts          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

SQL Server Performance Comparison - Storing XML data in-row versus out-of-row


By:   |   Last Updated: 2013-12-18   |   Comments (1)   |   Related Tips: More > XML

Problem

Over the last few years I have seen more and more cases where we need to store XML data in the database. This tip will take a look at the performance impact, if any, of storing XML data in-row with the rest of the data that make up the table record vs. out-of-row where only a pointer to the XML data is stored with the row data and the actual XML data is stored in a separate space.

Solution

In order to control the behavior of how XML data is stored in the table we have 2 options, the "text in row" and the "large value types out of row" table options. Since the "text in row" option is going to be removed from a future version of SQL Server we won't discuss it in this tip but you can find more details about it here. As with the "text in row" option the "large value types out of row" option can be set using the sp_tableoption system stored procedure. By default this option is set to 0 which means all large UDT values (including XML) are stored in the data row up to 8000 bytes. If the value does not fit in 8000 bytes or there is not enough free space on that page, a pointer is stored in the row and the rest of the data is stored in the LOB storage space. If this option is updated to 1 then all large UDT values are stored out of the row and only a 16-bit pointer to the LOB storage space is stored with the row data.

Sample table and data setup

In order to check the performance of this option we will set up two identical tables with the only difference being one of them will have the "large value types out of row" option set. We will also create a separate filegroup for the LOB storage. Here is the script to create this filegroup and tables.

-- Filegroup and table creation logic
ALTER DATABASE [TEST] ADD FILEGROUP [XMLDATA]
GO
ALTER DATABASE [TEST] ADD FILE 
  (NAME = N'XML_Data', FILENAME = N'C:\Data\XML_Data.ndf' , 
                       SIZE = 51200KB , FILEGROWTH = 1024KB ) 
   TO FILEGROUP [XMLDATA]
GO
CREATE TABLE dbo.XMLTest_largevalueinrow
  (Col1 int NOT NULL,
   XMLData xml NOT NULL) ON [PRIMARY]
  TEXTIMAGE_ON XMLDATA
GO
ALTER TABLE dbo.XMLTest_largevalueinrow 
  ADD CONSTRAINT PK_XMLTest_largevalueinrow PRIMARY KEY CLUSTERED 
    (Col1) 
  WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
        ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
  ON [PRIMARY]
GO
CREATE TABLE dbo.XMLTest_largevalueoutofrow
  (Col1 int NOT NULL,
   XMLData xml NOT NULL) ON [PRIMARY]
  TEXTIMAGE_ON XMLDATA
GO
ALTER TABLE dbo.XMLTest_largevalueoutofrow
  ADD CONSTRAINT PK_XMLTest_largevalueoutofrow PRIMARY KEY CLUSTERED 
    (Col1) 
  WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
        ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
  ON [PRIMARY]
GO
sp_tableoption 'XMLTest_largevalueoutofrow','large value types out of row',1
GO

After running this script you can check the "large value types out of row" was set correctly using the following script.

select name,large_value_types_out_of_row from sys.tables
 where name like 'XMLTest%'

 

name large_value_types_out_of_row
XMLTest_largevalueinrow 0
XMLTest_largevalueoutofrow 1

To complete the setup we'll need to load some data into these two tables. For this example I chose an XML data size of about 2kb. This means that for the table that has the default value for "large value types out of row", all the XML data will be stored with the rest of the row data. For the table with "large value types out of row" set to 1, all the XML data will be stored in the LOB storage space. Here is a script that can be used to load the tables. Note: You'll have to update ##INSERT XML HERE## with valid XML data.

DECLARE @i INTEGER
SELECT @i=1
WHILE @i < 1000000
BEGIN
   INSERT INTO dbo.XMLTest_largevalueinrow (Col1,XMLData) VALUES (@i,'##INSERT XML HERE##')
   INSERT INTO dbo.XMLTest_largevalueoutofrow (Col1,XMLData) VALUES (@i,'##INSERT XML HERE##')
   SELECT @[email protected]+1
END 

Once this script has been run we can take a look at the how the table storage is organized using the following script.

SELECT OBJECT_NAME(p.object_id) AS Object_Name
       , i.name AS Index_Name
       , ps.in_row_used_page_count AS IN_ROW_DATA
       , ps.lob_used_page_count AS LOB_DATA
FROM sys.dm_db_partition_stats ps
JOIN sys.partitions p ON ps.partition_id = p.partition_id
JOIN sys.indexes i ON p.index_id = i.index_id AND p.object_id = i.object_id
WHERE OBJECTPROPERTY (p.[object_id], 'IsUserTable') = 1

Here is the output of that query. We can see that in the case of the table with the default setting no data is stored in the LOB storage space. Another interesting thing to note with this output is that the table storing the XML data exclusively out-of-row is using more space than the table storing all the data in-row.

Object_Name Index_Name IN_ROW_DATA LOB_DATA

XMLTest_largevalueinrow

PK_XMLTest_largevalueinrow

250934

0

XMLTest_largevalueoutofrow

PK_XMLTest_largevalueoutofrow

4098

347832

Test execution and results

In order to test this table option we will run some simple SELECT, INSERT, UPDATE and DELETE statements. Since these tables contain the exact same data we will run the exact same statements against both of them and compare the results. To gather the results of each set of queries I am going to use SQL Profiler and monitor the "SQL Batch Completed" event gathering the usual performance statistics: Read, Writes, CPU and Duration. Below is our test script. Please note that as with the initial data load script you'll have to update the INSERT and UPDATE queries with valid XML data. Also note that I have run DBCC DROPCLEANBUFFERS before each set of statements to ensure the cache is empty.

-- Test SELECT statements
DBCC DROPCLEANBUFFERS
GO
select * from XMLTest_largevalueinrow where Col1=34232
select * from XMLTest_largevalueinrow where Col1=753423
select * from XMLTest_largevalueinrow where Col1=176755
select * from XMLTest_largevalueinrow where Col1=323233
select * from XMLTest_largevalueinrow where Col1=534232
go
select * from XMLTest_largevalueoutofrow where Col1=34232
select * from XMLTest_largevalueoutofrow where Col1=753423
select * from XMLTest_largevalueoutofrow where Col1=176755
select * from XMLTest_largevalueoutofrow where Col1=323233
select * from XMLTest_largevalueoutofrow where Col1=534232
go
-- Test INSERT statements
DBCC DROPCLEANBUFFERS
GO
insert into XMLTest_largevalueinrow (Col1,XMLData) values (1000000,'##INSERT XML HERE##')
insert into XMLTest_largevalueinrow (Col1,XMLData) values (1000001,'##INSERT XML HERE##')
insert into XMLTest_largevalueinrow (Col1,XMLData) values (1000002,'##INSERT XML HERE##')
insert into XMLTest_largevalueinrow (Col1,XMLData) values (1000003,'##INSERT XML HERE##')
insert into XMLTest_largevalueinrow (Col1,XMLData) values (1000004,'##INSERT XML HERE##')
go
insert into XMLTest_largevalueoutofrow (Col1,XMLData) values (1000000,'##INSERT XML HERE##')
insert into XMLTest_largevalueoutofrow (Col1,XMLData) values (1000001,'##INSERT XML HERE##')
insert into XMLTest_largevalueoutofrow (Col1,XMLData) values (1000002,'##INSERT XML HERE##')
insert into XMLTest_largevalueoutofrow (Col1,XMLData) values (1000003,'##INSERT XML HERE##')
insert into XMLTest_largevalueoutofrow (Col1,XMLData) values (1000004,'##INSERT XML HERE##')
go
-- Test UPDATE statements
DBCC DROPCLEANBUFFERS
GO
update XMLTest_largevalueinrow set XMLData = '##INSERT XML HERE##' where Col1=34232
update XMLTest_largevalueinrow set XMLData = '##INSERT XML HERE##' where Col1=753423
update XMLTest_largevalueinrow set XMLData = '##INSERT XML HERE##' where Col1=176755
update XMLTest_largevalueinrow set XMLData = '##INSERT XML HERE##' where Col1=323233
update XMLTest_largevalueinrow set XMLData = '##INSERT XML HERE##' where Col1=534232
go
update XMLTest_largevalueoutofrow set XMLData = '##INSERT XML HERE##' where Col1=34232
update XMLTest_largevalueoutofrow set XMLData = '##INSERT XML HERE##' where Col1=753423
update XMLTest_largevalueoutofrow set XMLData = '##INSERT XML HERE##' where Col1=176755
update XMLTest_largevalueoutofrow set XMLData = '##INSERT XML HERE##' where Col1=323233
update XMLTest_largevalueoutofrow set XMLData = '##INSERT XML HERE##' where Col1=534232
go
-- Test DELETE statements
DBCC DROPCLEANBUFFERS
GO
delete from XMLTest_largevalueinrow where Col1=44232
delete from XMLTest_largevalueinrow where Col1=653423
delete from XMLTest_largevalueinrow where Col1=276755
delete from XMLTest_largevalueinrow where Col1=923233
delete from XMLTest_largevalueinrow where Col1=334232
go
delete from XMLTest_largevalueoutofrow where Col1=44232
delete from XMLTest_largevalueoutofrow where Col1=653423
delete from XMLTest_largevalueoutofrow where Col1=276755
delete from XMLTest_largevalueoutofrow where Col1=923233
delete from XMLTest_largevalueoutofrow where Col1=334232
go

And here are the results.

Statement Type Reads Writes CPU Duration
SELECT (in-row)

20

0

0

145

SELECT (out-of-row)

25

0

0

240

INSERT (in-row)

43

4

15

105

INSERT (out-of-row)

61

5

16

66

UPDATE (in-row)

31

5

0

493

UPDATE (out-of-row)

61

10

0

582

DELETE (in-row)

46

10

0

282

DELETE (out-of-row)

71

20

0

345

We can see from these results that in this case storing the XML data in-row provides a little better performance than storing it out-of-row. One thing you'll want to consider though is the trade off between storage and performance. It might make more sense from an administration point of view to have this data segregated in its own filegroup even if you have to give up a bit of performance. As always it's best to test in your own environment to confirm these results and decide on what's best for your particular situation.

Next Steps
  • Run similar test with different size XML data to see if the size of the XML data has any effect on performance
  • Read more information on the "text in row"option
  • Other XML tips


Last Updated: 2013-12-18


get scripts

next tip button



About the author
MSSQLTips author Ben Snaidero Ben Snaidero has been a SQL Server and Oracle DBA for over 10 years and focuses on performance tuning.

View all my tips




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Email me updates 


Signup for our newsletter
 I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



    



Friday, February 28, 2014 - 10:33:08 AM - Jeff Moden Back To Top

To be honest, I'm still fairly well amazed that anyone would store XML in their databases at all.  It's denormalized.  It's bloated.  It requires special handling.  The blobs that it's stored in make it impossible to rebuild an index in an online fashion for anything pre-2014.  To really be quick, it requires special indexing which is an extreme duplication of data.

Folks turn their noses up at storing the likes of CSV data in a database.  Why are they so quick to accept storing XML in a database?  Why not shred it and store it properly?


Learn more about SQL Server tools