Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Scripts          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

Importing JSON Files Using SQL Server Integration Services


By:   |   Last Updated: 2018-03-23   |   Comments (1)   |   Related Tips: More > Integration Services Development

Problem

I would like to import JSON data using SQL Server Integration Services. Are there any default control flow tasks and data flow source adapters available to manage JSON data files in SSIS?

Solution

SQL Server Integration Services does not support JSON natively, so we need to develop a custom source for a data flow task to load JSON file. In this tip, I will walkthrough a method to develop a bespoke source to load JSON files using .Net libraries.

Basic JSON Format

A JSON document must have an object or an array at its root. A JSON object is represented by {} and a JSON array is represented by []. A JSON object can contain another JSON object, a JSON array or string.

The JSON property details will be represented as key/value pairs. These key value sets are separated by using a colon “:” and multiple sets are separated using a comma, “,”.

Understanding JSON file format

The first important step in the process is to understand the format of the JSON. If you are using Notepad++, then the JSON plugins are a good starting point to learn about different JSON formats.

You can install the JSON viewer from here and install JSToolAPP from here. The JSON viewer plugin will help you to understand the format of the JSON. The second plugin JSToolApp will help you to format the JSON. In this tip, I will use Notepad++ and these plugins to showcase the example.

Sample JSON file

Let us assume that we have been supplied a JSON file for orders. This JSON file contain two order details and each order has the details such as order id, customer id and order status.

Now let us open this file in the Notepad++ and select the data contents and press Ctrl + Alt+ Shift + J. This will open the JSON viewer and will display the JSON object model.

From the JSON viewer, it is evident that the JSON file has an array of objects. Each object represents an order. The object attributes are representing the details of an order.

Order File Content - Description: Order File Content

It is observed that the supplied JSON file has an object array within the JSON document. An order has three attributes namely OrderID, CustomerID and OrderStatus.

Developing Custom Data Source for JSON File

As Microsoft has not supplied the default data source for JSON files, we have to develop a custom/bespoke data source using the script component. In this tip, we will use a script component to build a data source for JSON.

Let's add a script component to the data flow task. This script component will read the JSON file and generate the output records with the help of .Net libraries. So, let's configure this script component as a source.

Data Flow Task - Description: Data Flow Task

As a first step, let's prepare the script component to generate an order dataset. Let's add the order attributes (OrderID, CustomerID & OrderStatus) as output columns with suitable datatypes as mentioned in the image below.

Output Columns - Description: Output Columns

Now we are ready to write C# code to read the JSON file. We will be using the function in the .Net library System.Web.Extension to read the JSON file. So, to access the function, we need to add this library as a reference. This can be achieved by selecting the project and by right clicking on Add Reference.

Adding Reference - Description: Adding Reference

In the dialog box, under the .NET tab, look for "System.Web.Extension" library and select it to add it as a reference.

System.Web.Extensions - Description: System.Web.Extensions

Once done, the library will be visible under the reference section.

References Added - Description: References Added

To enable us to use the library, we need to add that in the namespace region as shown in the below image.

Namespace Updates - Description: Namespace Updates
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Collections.Generic;
using System.Text;
using System.Web.Script.Serialization;
using System.IO;
using OrderNamespace;			

Before we start coding on the C#, let us learn some basics.

Deserialization

Deserialization is a process that helps to transform JSON document to a runtime object. Once the data is available as a runtime object, then it can be parsed by using the .Net libraries.

Now we need to read the JSON file content and deserialize it to convert into runtime object.

Creating an Order Class

We need to create object that can hold the JSON content. So, let's create a class in the C#. This class must have the same structure and properties as the JSON content.

A C# class can be created by selecting the project and click Add > Select Class as mentioned in the image below.

Adding new class - Description: Adding new class

Type the name of the class as "Order".

Adding new class - Description: Adding new class

Now we need to rename the namespace as "OrderNamespace" in the class as mentioned below. Also, we need to create an Order class with three properties (OrderID, CustomerID and OrderStatus).

Adding Order class - Description: Adding Order class
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace OrderNamespace
{
   public class Order
   {
      public int orderid { get; set; }
      public int customerID { get; set; }
      public string orderstatus { get; set; }
   }
}		

The name of the properties has to match the name of the attributes mentioned in the JSON file.

Let's assume that the Orders.JSON file is available in the local folder and let's read the file content into a string using File.ReadallText function. The JavaScriptSeralizaer is an object defined in the System.Web.extension assembly that will be used to deserialize the JSON file content. So, create an instance of the object JavaScriptSerializer. The "Deserialize" function in the JavaScriptSerializer object will deserialize and return a runtime object of type "Order". As the Orders JSON file contains an array the Deserialize function will return a List of type "Order".

Once the JSON file content has been read, the list can be iterated through using a for each loop as shown in the picture below.

As there are two order details in the JSON file, the loop will iterate through two times. In the for each loop the order attributes can be read and provide the output to output buffer.

public override void CreateNewOutputRows()
{
   /*
   Add rows by calling the AddRow method on the member variable named "<Output Name>Buffer".
   For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput".
   */
   String jsonFileContent = File.ReadAllText(@"E:\WorkArea\Analysis\JSON\Data\Orders.JSON");
   JavaScriptSerializer js = new JavaScriptSerializer();
   List<Order> orders = js.Deserialize<List<Order>>(jsonFileContent);
   foreach (Order order in orders)
   {
      Output0Buffer.AddRow();
      Output0Buffer.OrderID = order.orderid;
      Output0Buffer.CustomerID = order.customerID;
      Output0Buffer.OrderStatus = order.orderstatus;
   }
}			
Deserialize JSON data - Description: Deserialize JSON data

Let's add a union all after the script component and add a data viewer to see the actual data. Now let's execute the package and you can see there are two order details available in the data pipeline.

Data Pipieline - Description: Data Pipieline

Summary

In this tip, we have learned about importing JSON data using SQL Server Integration Services. Also, we have learned about deserializing JSON content into a JSON runtime object.

Next Steps
  • Learn JSON basics with this tip.
  • Challenge your JSON knowledge with this tip.


Last Updated: 2018-03-23


get scripts

next tip button



About the author
MSSQLTips author Nat Sundar Nat Sundar is working as an independent SQL BI consultant in the UK with a Bachelors Degree in Engineering.

View all my tips




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Email me updates 


Signup for our newsletter
 I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



    



Friday, July 27, 2018 - 3:45:57 AM - Jeya Back To Top

Hi,

Very helpful. But, I received below error. Did you come across error like this? do we know what would be the cause?

"there is no current row in the buffer. A row may need to be added using the AddRow method."


Learn more about SQL Server tools