Making data imports into SQL Server as fast as possible
When bulk loading data into SQL Server, in most cases you want to have this process run as fast as possible. Depending on the options you select and how indexes are setup on the table, the time to load the data could differ quite drastically. In this tip we will take a look at different options for bulk loading along with examples and the time it takes for each of these operations to complete.
There are several ways to load data into SQL Server and export data from SQL Server here are a few of those previous tips. Refer to these tips if you are not familiar with this process.
In our tests we took six different approaches for loading the data into the table.
- Table has a clustered and 2 non-clustered indexes
- Table has only a clustered index
- Table has no indexes
- Table has no indexes and used "TABLOCK" hint
- Table has only a clustered index and used "TABLOCK" hint
- Table has a clustered and 2 non-clustered indexex
In addition, the database was set to Bulk-Logged recovery.
For these tests after each iteration the database and log were backed up to clear the log file. Also, before each step the table was dropped and recreated.
The first run was done using a dataset of 121,317 rows and an empty table. Before each run the table was recreated using the specifics for that test.
As you can see from the results below there is a difference depending on the process that you use to load the data.
|1||with clustered index and 2 non-clustered indexes||5.1||5.3|
|2||with clustered index only||3.2||3.0|
|3||with no indexes||1.4||1.4|
|4||with no indexes and tablock||1.2||1.3|
|5||with clustered index only and tablock||2.8||2.5|
|6||with clustered index and 2 non-clustered indexes and tablock||4.1||3.9|
The next run was done using a dataset of 242,634 rows and an empty table. Before each run the table was recreated using the specifics for that test.
Again, you can see there are quite a few differences for these runs.
|1||with clustered index and 2 non-clustered indexes||14.0||13.8|
|2||with clustered index only||6.9||7.3|
|3||with no indexes||2.7||2.7|
|4||with no indexes and tablock||2.5||2.5|
|5||with clustered index only and tablock||5.6||5.5|
|6||with clustered index and 2 non-clustered indexes and tablock||8.4||8.7|
In this test we ran two loads. The first load was done using a dataset of 121,317 rows and an empty table and then we loaded another 121,317 records into the same table. Before each run the table was recreated using the specifics for that test.
|1||with clustered index and 2 non-clustered indexes||8.4||7.7||16.1|
|2||with clustered index only||3.5||3.2||6.7|
|3||with no indexes||1.5||1.4||2.9|
|4||with no indexes and tablock||1.3||1.3||2.6|
|5||with clustered index only and tablock||3.1||4.0||7.1|
|6||with clustered index and 2 non-clustered indexes and tablock||4.0||8.5||12.5|
As you can see from these results and these small loads the various options do effect the time it takes the process to complete. Therefore for faster loads you may want to load your tables without any indexes and then later create the indexes after the load. The one step we did not do is to see how much time the index creation would take if this was done after the data load.
In addition, make sure your data and log files have sufficient space to complete the operation without having to autogrow these files. The autogrow process can significantly effect the overall load time.
Lastly, when using the tablock option your operations will be minimally logged which is a good thing, but take a look at this previous tip, Minimally Logging Bulk Load Inserts into SQL Server, as to how this works depending on the state of the table. This is one of the reasons why in Test 3 the second load took longer to complete since there was already data in the table.
- Next time you need to import data into SQL Server keep this tip in mind as to other options you may have to make the process run faster.
- Take a look at these other tips:
About the author
View all my tips