Import multiple Excel Worksheets into multiple SQL Server Database Tables

By:   |   Updated: 2022-01-13   |   Comments   |   Related: More > Import and Export


Problem

In a series of tutorials (see references at the bottom) I presented how you can export data from an SQL database to an Excel file. In this article I will present how we can import data from an Excel file into a destination table in a SQL database. I will use Microsoft SQL Server 2019, Visual Studio 2019 and Microsoft Excel 365. Some of the C# code will not work if you don’t use .NET 4.8 or later.

Solution

Importing data (or loading data) into a SQL database is part of an ETL process. Sometimes this is accomplished with an SSIS package, the SQL Server Import and Export Wizard, OPENROWSET or a Linked Server.  For this process Excel files are widely used, first because their structure is similar to a database (for example Excel sheets can be seen as database tables, and columns in a sheet can be seen as table columns), and second because it is very easy and convenient to collect data in an Excel file and load it into the database.

In this tutorial I will show how you can import an Excel file with multiple sheets into a new database with SQL tables.

The database I use for this test can be created by the following statements:

IF NOT EXISTS (SELECT [name] FROM sys.databases WHERE [name] = 'dbtest')
    create database dbtest
go

IF EXISTS (SELECT [name] FROM sys.databases WHERE [name] = 'dbtest')
begin
    exec('CREATE TABLE dbtest.dbo.[Company]
    (
        [CompanyCode] varchar(8) NOT NULL,
        [CompanyName] varchar(24) NULL,
        [BIC] varchar(11) NULL,
        [IBAN] varchar(35) NULL,
        CONSTRAINT [PK_Company] PRIMARY KEY CLUSTERED 
        (
            [CompanyCode]
        )
    )')

    exec('CREATE TABLE dbtest.dbo.[EmployeeDetails]
    (
        [EmployeeCode] varchar(8) NOT NULL,
        [Surname] varchar(25),
        [FirstName] varchar(50),
        [Address] varchar(50),
        CONSTRAINT [PK_EmployeeDetails] PRIMARY KEY CLUSTERED 
        (
            [EmployeeCode] 
        )
    )')

    exec('CREATE TABLE dbtest.dbo.[Employees]
    (
        [EmployeeCode] varchar(8) NOT NULL,
        [Director] bit,
        [GrossIncome] float,
        [BIC] varchar(11),
        [IBAN] varchar(34),
        CONSTRAINT [PK_Employees] PRIMARY KEY CLUSTERED 
        (
            [EmployeeCode] 
        )
    )')
end

We have two connections: one to the SQL Server and another to an Excel file.

We can use an ODBC connection to connect to the Excel file. For this, the connection string looks like:

string extension = Path.GetExtension(_excelFile);
string connectionString = "";
switch(extension)
{
  // HDR=YES ==> we have headers and do not want them (ignore first row)
  case ".xls": //Excel 97 to 03
    connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties='Excel 8.0;HDR=YES'";
    break;
  case ".xlsx": //Excel 2007 ->
    connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties='Excel 12.0;HDR=YES'";
    break;
}

connectionString = string.Format(connectionString, _excelFile); 

As you can see, we have different providers to connect to an Excel file, depending on the Excel file version. Older Excel files created with Excel up to version 2003 use the Jet engine and newer Excel files use the ACE engine. I assume that currently most Excel files are created using the newer version of Excel.

Another thing to mention is HDR=YES as an extended property in the connection string. This says the first row in the Excel file is used as header for columns, so we can give each column a header, as in the following picture:

sample excel data

It is not necessary that the column header names are the same as the table columns in the database but is it useful.

Reading Excel sheets is the next step, which can be done with the following code:

DataTable dt = null;
using OleDbConnection conn = new OleDbConnection(connectionString);
conn.Open();
dt = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
List<string> excelSheets = new List<string>();

// Add the sheet name to the string array.
foreach(DataRow row in dt.Rows)
{
    string sheet = row["TABLE_NAME"].ToString();
    if(sheet.StartsWith("'"))
        sheet = sheet.Substring(1);
    // Get rid of $ in the sheet name
    if(sheet.EndsWith("'"))
        sheet = sheet.Substring(0, sheet.Length - 2);
    else
        sheet = sheet.Substring(0, sheet.Length - 1);
    excelSheets.Add(sheet.ToLower());
}

An Excel sheet name ends in $. If we use a space in the sheet name, it will be surrounded by quotes. For example, if we have a sheet called Employee Details (with a space in the name), the name we read using OleDB is 'Employee Details$', and if the sheet is called EmployeeDetails (without a space in the name) the name we read is EmployeeDetails$. We need only the name of the sheet, so we need to get rid of the extra characters in the name we read from Excel. The purpose of the code above is to make sure we read the correct Excel sheet.

We can, for example, give the Excel sheet the name of the corresponding table in the database, and we can check if that table exists before reading data from the sheet.

There are basically two ways of reading data from the Excel file. The first way is to read all sheets into memory and then write to the corresponding tables. A second way is to read one row from a sheet and write it immediately to the corresponding table without keeping the data in memory. This is very useful if the Excel file is large. We will use the first approach in this article.

The Excel file we will use has three sheets, one of them being called Company. This sheet can be read using the following code:

private void ReadCompany(OleDbConnection conn)
{
    string query = "SELECT * FROM [Company$]'";
    OleDbCommand cmdCompany = new OleDbCommand(query, conn);
    using OleDbDataReader dr = cmdCompany.ExecuteReader();

    // Get only the first company found
    if(dr.Read())
    {
        List<string> cols = new List<string>();
        for(int i = 0; i < dr.FieldCount; ++i)
        {
            cols.Add(dr[i].ToString().Replace("'", "''"));
        }
        _company = cols.ToArray();
    }
}

All of the Code and Sample Excel Data

Copy data to an existing table in the database is easy and can be found in the attached C# project. The article contains the full C# project, the Excel file used as an example, and the SQL script used to create the database (this database script is at the beginning of this article).

Download the code for this article.

Next Steps



Get Started Now - Click here to get your free 14 day trial of SolarWinds Database Insights




get scripts

next tip button



About the author
MSSQLTips author Mircea Dragan Mircea Dragan is a Senior Computer Scientist with a strong mathematical background with over 18 years of hands-on experience.

View all my tips


Article Last Updated: 2022-01-13

Comments For This Article





download














get free sql tips
agree to terms