By: Chad Boyd
The "larger" features introduced with SQL Server 2008 are getting lots of coverage and attention (understandably so), but quite a few "smaller" features are also included that will provide a great deal of benefit to SQL Server users everywhere. One that hasn't been getting as much attention is the set of improvements made in the database engine for minimal logging (bulk logging) of standard INSERT INTO statements, and of the new MERGE statement as well.
Prior to this functionality, getting minimal logging for an operation that pushed data into an existing table with existing data required partitioned tables/indexes and a merge/split/switch type of operation: the data would be bulk-loaded from a source into an empty staging table on your server, then switched into an empty partition of your pre-existing table. Naturally, this necessitates partitioning the table, which in turn requires the Enterprise edition of SQL Server (the only edition that supports partitioning). If you either didn't want to (or couldn't) partition your existing table, or ran a non-Enterprise edition of the server, you really didn't have any options for bulk-loading into existing tables with existing data (barring a partitioned view configuration, perhaps). This new enhancement in 2008 allows bulk-loading / minimally-logged operations in many more scenarios than are possible today.
Similar to the existing minimally-logged operations, there are some prerequisites for these statements to actually be minimally-logged - you can find a full and detailed list in SQL 2008 Books Online, and also a discussion about the different operations on the SQL Server Storage Engine team's blog.
Sunil, a PM on the SQL Server Storage Engine Team, has a great 3-part series covering the enhancements, so I won't bother repeating what he has already described extremely well; instead, I'll simply point you to each of the posts:
I'll also leave a very simple sample script you can run to see some of the performance differences between the fully-logged operation in 2008 and the same statement in a minimally-logged execution. Sunil's posts referenced above have additional samples that go into much greater detail and cover a wide variety of possible scenarios.
Enjoy!
-------------------------------------------------------------------------------------------------------------
CODE ONLY BELOW
-------------------------------------------------------------------------------------------------------------
use AdventureWorks;
go
-- Ensure full recovery...
alter database AdventureWorks set recovery full;
go
-- Create a simple table...
-- Drop the table if it already exists (object_id() returns NULL when it does not)...
if object_id('dbo.insertLoadTest') is not null
    drop table dbo.insertLoadTest;
go
create table dbo.insertLoadTest (id int, charval char(36), filler char(250));
go
-- Fully logged insert...
use AdventureWorks;
go
truncate table dbo.insertLoadTest;
go
declare @d datetime2;
select @d = sysdatetime();
insert dbo.insertLoadTest with(tablock) (id, charval, filler)
select top 500000
row_number() over (order by a.object_id), newid(), 'filler'
from sys.columns a with(tablock)
cross join sys.columns b with(tablock);
-- Get the time difference...
select datediff(millisecond, @d, sysdatetime());
go
-- Minimally logged insert...
use master;
go
-- Switch to bulk-logged so the insert can be minimally logged. Simple recovery
-- also allows minimal logging and avoids having to perform log backups...
alter database AdventureWorks set recovery bulk_logged;
go
-- Rerun the same tests as above again...should notice a significant
-- improvement in not only run-time, but also a large difference in
-- log-space usage as well...
use AdventureWorks;
go
truncate table dbo.insertLoadTest;
go
declare @d datetime2;
select @d = sysdatetime();
insert dbo.insertLoadTest with(tablock) (id, charval, filler)
select top 500000
row_number() over (order by a.object_id), newid(), 'filler'
from sys.columns a with(tablock)
cross join sys.columns b with(tablock);
-- Get the time difference...
select datediff(millisecond, @d, sysdatetime());
go
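The comments in the script mention a difference in log-space usage as well as run-time, but the script itself only reports elapsed time. One way to see the log side is sketched below: it wraps the same insert in an explicit transaction and reads that transaction's log usage from the sys.dm_tran_database_transactions DMV before committing. This is an illustration rather than part of the original script; it assumes dbo.insertLoadTest already exists from the script above and that your login has VIEW SERVER STATE permission.

```sql
-- Sketch only: assumes dbo.insertLoadTest was created by the script above.
use AdventureWorks;
go
-- Truncate in its own batch so only the insert is measured...
truncate table dbo.insertLoadTest;
go
begin transaction;

insert dbo.insertLoadTest with(tablock) (id, charval, filler)
select top 500000
    row_number() over (order by a.object_id), newid(), 'filler'
from sys.columns a
cross join sys.columns b;

-- Log usage of the current (still open) transaction in this database...
select  dt.database_transaction_log_record_count as log_records,
        dt.database_transaction_log_bytes_used   as log_bytes_used
from    sys.dm_tran_current_transaction ct
join    sys.dm_tran_database_transactions dt
        on dt.transaction_id = ct.transaction_id
where   dt.database_id = db_id();

commit transaction;
go
```

Run it once with the database in full recovery and once after switching the recovery model, then compare the two result sets. The query has to run before the commit, because the DMV only reports active transactions.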