.NET Bulk Insert into SQL Server


By:   |   Updated: 2011-05-13   |   Comments (3)   |   Related: More > Application Development

Problem

Knowing about the SqlBulkCopy class is most helpful if you need to load large amounts of data into your database from within a .NET application. For example you may need to integrate a .NET custom step in order to do the initial data import in a Visual Studio application installer. Thus you benefit from the advanced error handling offered by the .NET platform. Another situation when SqlBulkCopy is useful is the data migration to SQL Azure. In this tip I'll walk you through a simple console application which will demonstrate the basic usage of this class and some options you should be aware of.

Solution

The SqlBulkCopy class functionality is similar to the BULK INSERT statement and to the bcp utility with the "in" argument. As always, when choosing a tool to work with it's useful to know its limitations. You can only use SqlBulkCopy to import data into SQL tables from any source which can be loaded into a DataTable or read with an IDataReader - for example you can import data directly from a SqlDataReader

The sample application I'll talk about uses as a csv source file with about 42000 lines containing product price data. The data will be loaded into a DataTable, processed and finally bulk inserted into a SQL table. The underlying database (let's call it simply TestBC) uses the bulk logged recovery model and contains the Product_bc table in which we'll bulk insert the data from the csv file. For the purposes of this example Product_bc is a heap.

CREATE TABLE [dbo].[Product_bc](
    [Item] [varchar](50) NOT NULL,
    [ProductID] [int] NOT NULL,
    [ListPrice] [money] NULL,
    [SalePrice] [money] NULL
) ON [PRIMARY]

Please see the Next Steps section for a link to download the Visual Studio project.

The first step is to load the csv file data into a DataTable. The below code creates the DataTable (notice that the appropriate data type is used for each column), reads the data from the csv file using a Microsoft.Jet.OLEDB.4.0 connection string and fills the DataTable. The configurable pieces of information, such as connection strings and file paths, are stored into the App.config file.

public static DataTable LoadCSVFile(){
    string selectFromFile = ConfigurationManager.AppSettings["selectFromFile"];
    DataTable dt = new DataTable();
    using (OleDbConnection jetConn = 
    new OleDbConnection(ConfigurationManager.AppSettings["JETConn"]))
    {
        using (OleDbCommand oleCmd = new OleDbCommand(selectFromFile, jetConn)){
            using(OleDbDataAdapter oleAdapter = new OleDbDataAdapter(oleCmd)){
                dt.Columns.Add("Item", System.Type.GetType("System.String"));
                dt.Columns.Add("ProductID", System.Type.GetType("System.Int32"));
                dt.Columns.Add("ListPrice", System.Type.GetType("System.Decimal"));
                dt.Columns.Add("SalePrice", System.Type.GetType("System.Decimal"));
                oleAdapter.Fill(dt);
            }
        }
    }
    return dt;
}

To bulk insert the DataTable's content into the SQL table I'm using the SqlBulkCopy class, as you can see in the next piece of code. There are a few things to pay attention to here. First of all, notice the constructor of the SqlBulkCopy class, which includes the TableLock option. As you most probably already know, there are a few conditions necessary to make the bulk import a minimally logged operation. One of them is to specify table locking - that is what the TableLock option does. The DestinationTableName property is held in the App.config file. In our case the data is copied into the Product_bc table. I use the SqlRowsCopied event in conjunction with the NotifyAfter property to write a message to the console after every 1000 rows have been copied. The WriteToServer method will copy the DataTable's content to the destination table.

public static void CopyData(string connStr, DataTable dt)
    {
        using (SqlBulkCopy bulkCopy = 
        new SqlBulkCopy(connStr, SqlBulkCopyOptions.TableLock)){
            bulkCopy.DestinationTableName = 
            ConfigurationManager.AppSettings["DestinationTable"];
            bulkCopy.SqlRowsCopied += 
            new SqlRowsCopiedEventHandler(bulkCopy_SqlRowsCopied);
            bulkCopy.NotifyAfter = 1000;
            bulkCopy.WriteToServer(dt);
        }
    }

To see that the table lock is acquired, I've set up a server side trace containing the Sql:BatchStarting and SqlBatchCompleted events. The application has 2 bulk copy methods, one of them ("CopyData") specifies the table locking, the other ("CopyData_NoTableLock") not. Here is the trace result in the first case:

using the sql bilk copy class

Prior to the bulk import SQL Server checks the table structure by issuing a SELECT with SET FMTONLY ON. It also checks the table and column collation executing the sys.sp_tablecollations_100 procedure. As you can see, the bulk import is done WITH TABLOCK. In the second case, the trace result looks like below:

sql server will check the table stucture

As you can see, the duration is higher when no table lock is held - 503ms versus 419ms with table lock.

Furthermore, you can check if the bulk insertion is minimally logged or not following these steps. First take a backup of the database, so that it will be under the "bulk logged" recovery model, not under the "pseudo-simple" one. Run the CopyData method (this method uses table lock) and run this afterwards:

--obtaining a list of page ids
DECLARE @dbid int = DB_ID(), @objid int = OBJECT_ID('dbo.Product_bc')
DBCC EXTENTINFO(@dbid, @objid)
--choose a page id (on my machine for example 8520) 
--and see the details of that page; 
DBCC TRACEON(3604)
DBCC PAGE(@dbid, 1, 8520, 3)

The Allocation status section of the page detail should mention "MIN_LOGGED":

the allocation status section

If you run the other method (CopyData_NoTableLock) the corresponding page detail will look like "NOT MIN_LOGGED":

make sure to review the sqlbulkcopy options

Make sure that you review the other SqlBulkCopy options enumerated here.

This example features a very simple case, in which the number of columns and their order is the same, both in the csv file and in the table. If the data source "does not match" the destination table, i.e. if the number of the columns and/or their ordinal position is not the same, the SqlBulkCopy.ColumnMappings property lets you specify the column mapping. Remember the "format file" you use with bcp or with BULK INSERT - this property fulfills a part of the "format file" job. Another property you may want to be aware of is the BatchSizeProperty.

Next Steps

Please feel free to download the .Net sample application.

  • After downloading the zip file, run the script which creates the database (TestBC.sql), open the solution in VS 2010 and change the connection string and log file location values in the App.config file according to your environment.
  • Program.cs is the "main" file. Trace.cs contains code regarding the error log file. Worker.cs contains the LoadCSVFile() method, which loads the SourceFile.csv file into a DataTable and 2 methods which bulk copy the data into the SQL table, one for each case I discussed (with TABLOCK and without TABLOCK).
  • I used VS 2010; therefore you won't be able to open the solution with VS 2008. If you need to work with VS 2008, create a new "console application" project, copy the files I've sent in the new directory (Program.cs will be replaced) and open the solution. In the "solution explorer" pane right click the project name, choose Add -> Existing Item and add Worker.cs, Trace.cs and App.Config. Make sure that the namespace is OK (i.e. is the same as project name) in each .cs file.
  • Here I tried to keep this example simple enough; therefore I did not mentioned that you can perform multiple bulk copy operations using a SqlBulkCopy instance.
  • Don't miss this complete Data Loading Performance Guide
  • Those who purchased SQL Server MVP Deep Dives should definitely read the 8th chapter ("What makes a bulk insert a minimally logged operation?") written by Denis Gobo.
  • Finally, you can find here one of the best descriptions of the class, from a .NET developer point of view.


Last Updated: 2011-05-13


get scripts

next tip button



About the author
MSSQLTips author Diana Moldovan Diana Moldovan is a DBA and data centric applications developer with 6 years of experience covering SQL 2000, SQL 2005 and SQL 2008.

View all my tips




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name
*Email
Email me updates

Signup for our newsletter

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.





Friday, May 13, 2011 - 1:49:50 PM - Jeremy Kadlec Back To Top

Henry and Diana,

The download URL has been corrected.

Thank you,
Jeremy Kadlec


Friday, May 13, 2011 - 12:51:00 PM - Diana Moldovan Back To Top

Henry, thank you for letting me know. The link will be fixed soon.


Friday, May 13, 2011 - 11:12:11 AM - Henry Minute Back To Top

The link to download the source code is broken.

Even when copied and an initial 'h' is added, it still does not work.



download

























get free sql tips

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



Learn more about SQL Server tools