SQL Server Function Examples with JSON Files

By:   |   Updated: 2023-01-12   |   Comments   |   Related: More > TSQL


Problem

I notice the increasing popularity of JSON content for exchanging data over Internet as well as for a lightweight way of storing log and measurement data. I also notice the growing functionality of SQL Server to accept, store, and process JSON data. Please provide several examples of SQL Server functions that illustrate how to accept, store, and process files with JSON content.

Solution

JSON (JavaScript Object Notation) is a text-based format for representing data via key-value pairs. The JSON format is more compact than XML, but JSON provides many of the same advantages. Its compact size makes it a fast format for exchanging data. Just like XML, JSON is easily read by humans. JSON's ability to hierarchically represent data and to nest hierarchical data within other key-value pairs facilitates its ability to represent complex hierarchical relationships across data elements.

Microsoft SQL Server recognizes the strengths of JSON by progressively adding more features for importing, parsing to relational tables, and exporting JSON data from relational tables to text files for use in NoSQL (Not only SQL) applications. This tip illustrates several T-SQL examples for parsing JSON data files for a file from a commercial application.

Several prior MSSQLTips.com articles introduce JSON for SQL professionals. For example, Aaron Bertrand authored a three-part series (here, here, and here) on JSON features in SQL Server. MSSQLTips.com also offers coverage of how to open JSON data in SQL Server as well as how to open, process and display JSON data with Python.

This tip drills down on how to process JSON files with SQL Server while building on previously covered techniques as well as demonstrating a novel approach for extracting key values from JSON nested key-value pairs. The processing is performed by selected built-in SQL Server functions. The JSON file used in examples within this tip can be retrieved from this tip's download file as well as from a link to a Restful API.

The sample JSON file for this tip

The Alpha Vantage API is a freemium service with free and premium series available for download via the internet. Among the historical data series are stock price and volume data, company revenue and income, foreign currency and crypto currency exchange rates, and economic indicators. The documentation includes a link for downloading free daily prices for the IBM ticker that does not require a special key and is available without charge. The link permits harvesting of the most recent 100 trading days to a browser on your computer. Alternatively, you can retrieve the exact same JSON file used in demonstrations within this tip from its download file.

Here are a couple of screen shots from the Notepad++ application showing lines of JSON data from the beginning and end of the file available from this tip's download file. Notepad++ is a free application similar to Microsoft Notepad but with more advanced features for working with text files, including JSON data files. The Notepad++ output highlights the relationships between key-value pairs in a JSON file.

  • There are two main JSON objects in the file – one named Meta Data and the other named Time Series (Daily).
  • Within the two main objects are other nested key-value pairs
    • The Meta Data object contains key-value pairs describing the types of data available from the JSON file.
    • The Time Series (Daily) object contains a series of dates beginning on July 7, 2022 and running through November 25, 2022.
    • The dates appear in descending order from the most recent date through the least recent date.
    • Each date serves as a key for five nested key-value pairs starting with the open price on a trading date and running through the volume of shares exchanged on a trading date.
json data
json data

Opening and displaying the sample JSON file in SQL Server

You can use the openrowset function to access the JSON text in either varchar or nvarchar format for import into a SQL Server table column. Then, you can assign the table column values to a local variable. Next, you can open and display with the openjson function top-level key-value pairs from the JSON file copied to a local variable. The following script implements these steps.

  • The openrowset function appears in the from clause of the first select statement.
    • The bulk operator is required
      • An argument to the bulk operator designates the path to the file with the JSON data. The account from which the script runs must have read access permission for the JSON file
      • The second argument indicates the character format in the JSON file
        • SINGLE_CLOB is for text formatted as varchar characters, which is used in the examples within this tip
        • SINGLE_NCLOB is an alternate argument value for text formatted as nvarchar characters, which is not demonstrated in this tip
    • The results set from the openrowset function is returned as bulkcolumn, which is assigned to a local variable named @json
    • A couple of additional local variables (@jsonMetaDataObject and @jsonTimeSeriesDailyObject) are also declared for use in subsequent examples within the same T-SQL script file. These examples appear later in this tip
  • The openjson function parses JSON data file contents within the @json local variable. This example extracts JSON data from a file. It is also possible to extract JSON data from a local variable declared as a string. See this prior tip for more extensive coverage of the openjson function with a variety of different JSON data formats. Reviewing this earlier tip might be especially beneficial if you have limited prior experience with JSON formatted data
-- import a json file to @json local variable
-- display outer key-value pairs with default format
declare 
 @json varchar(max)
,@jsonMetaDataObject varchar(max)
,@jsonTimeSeriesDailyObject varchar(max)

select @json= bulkcolumn
from openrowset(bulk 'C:\DataScienceSamples\json_files_SS_19\daily_prices_ibm.json', SINGLE_CLOB) importfile
select *
from openjson(@json)

The next screen shot shows the output from the preceding script.

The bottom border of the two screen shots in the preceding section shows that there are 711 lines in the NotePad++ portrayal of the JSON file. However, the following screen shot highlights that there are just two outer-level keys named Meta Data and Time Series (Daily) in the JSON file. All the other key-value pairs within the JSON file appear within one of these two outer keys or an object within one of these keys. These keys appear in the default format for output from the openjson function.

  • The default format displays data in three columns for the
    • key name
    • key value
    • key type
  • The first outer-level key has a name of Meta Data
    • If you examine the characters in the key's value column, you can see that they correspond to the beginning characters of the JSON text file excerpted in the two screen shots within the preceding section.
    • JSON supports up to six unique data types. The fifth data type (represented by 5 in the type column) denotes an object data type, which can contain one or more nested key-value pairs. There are five key-value pairs nested within the Meta Data key
      • The first nested key has the name 1. Information
      • The second nested key has the name 2. Symbol
      • and so forth for the remaining three nested keys
  • The second outer-level key has a name of Time Series (Daily). This outer-level key has two levels of nested keys within it
    • The first nested level is for trading date identifiers. For example, the most recent trading date is 2022-11-25. The least recent trading date is 2022-07-07
    • Within each trading date key, there are five nested key-value pairs. These nested keys have names of
      • 1. open
      • 2. high
      • 3. low
      • 4. close
      • 5. volume
  • Authors and applications have substantial freedom to name and arrange data within their JSON files in whatever ways best serve their purposes. An essential JSON requirement is that all data must appear in key-value pairs
query results

Extracting key-value pairs from the Meta Data object

The next script segment is meant to run immediately after the preceding script segment in the same script file. There are three select statements in the following segment.

  • The first select statement populates the @jsonMetaDataObject local variable. This local variable is assigned the key-value pairs nested within the Meta Data object. Recall that this object is one of two top-level objects in the source JSON file, which is referenced as daily_prices_ibm.json in the preceding section
    • There are two keys in the @json local variable, which is the argument for the openjson function in the following script segment
    • An alias name of x is assigned to the output from the openjson function in the first select statement
      • x.[Key] refers to the key names in @json
      • The where clause criterion explicitly references the Meta Data key name
      • x.[Value] refers to the key values in @json, which are from the Meta Data object
      • x.[Value] in the select list returns the value from the Meta Data key-value pair; the return value is a nested set of key-value pairs
    • The assignment in the first select statement assigns the Meta Data nested key-value pairs to the @jsonMetaDataObject local variable
  • The second select statement displays the key-value pairs in the @jsonMetaDataObject local variable
  • The third select statement rearranges the Meta Data key-value pairs in a traditional relational dataset format with key names (y.key) in the first column and key values (y.value) in matching rows of the second column
-- extract just the json data values for the Meta Data object
-- in the @jsonMetaDataObject
select @jsonMetaDataObject = x.[Value]
FROM openjson(@Json) x 
where x.[Key] = 'Meta Data'
 
-- echo @jsonMetaDataObject
select @jsonMetaDataObject [Meta Data object key-name pairs]
 
-- display key name and key value pairs within @jsonMetaDataObject
select y.[key] [Meta Data key name], y.value  [Meta Data key value] 
from openjson(@jsonMetaDataObject) y

Here is the output from the script in this segment; the output from the preceding script trails the output from the script in the preceding segment.

  • The first pane shows the two keys in the @json local variable with the default format from the openjson function
  • The second pane shows the nested key-value pairs as a string inside the @jsonMetaDataObject local variable. JSON formatting embraces the nested key value pairs in curly braces ({})
  • The third pane shows the nested key-value pairs in a traditional relational dataset format
query results

Extracting key-value pairs from the Time Series (Daily) object

Just as the prior section showed how to copy the Meta Data object from the @json local variable, you can also copy the Time Series (Daily) object from the @json local variable. While there is just one instance of the Meta Data object, there are multiple instances of the Time Series (Daily) object. The key names for the instances of the Time Series (Daily) object instances are unique by their trading date value. Furthermore, each Time Series (Daily) object instance is associated with its own set of nested key-value pairs. Therefore, if you want to track close values across trading dates, you need to extract the close key-value pair from each Time Series object instance.

The following script segment shows one approach to this task. This script needs to run in the same script file as the prior two script segments. Two important reasons for this are that

  • This segment depends on the @json local variable being populated in a prior segment
  • This segment populates a local variable (@jsonTimeSeriesDailyObject) previously declared in a prior segment

Here is the new script segment for parsing the close values for each trading date in the Time Series (Daily) object.

  • The first select statement extracts the Time Series (Daily) object from the @json local variable; the extracted object is stored in the @jsonTimeSeriesDailyObject local variable
  • The second select statement displays the nested key-value pairs for each trading date in the @jsonTimeSeriesDailyObject local variable
    • Within the context of this tip, there are 100 trading dates
    • The select list references both z.[key] and z.[value], which are column values for each of the 100 trading rows in the results set
    • The source for the select statement is @jsonTimeSeriesDailyObject, which has an alias of z
  • The third select statement illustrates how to use a SQL Server substring function to extract close values for successive trading dates. There are specialized JSON functions for extracting a value from a key-value pair. However, the substring function is particularly easy to use in this circumstance. It is legitimate to use the @substring function since JSON files are composed of text values
-- extract just the json data values for the Time Series (Daily) object
-- into the @jsonTimeSeriesDailyObject
select @jsonTimeSeriesDailyObject = z.[Value]
FROM openjson(@Json) z
where z.[Key] = 'Time Series (Daily)'
 
-- display key name and key value pairs within the Time Series (Daily) object
select z.[key] [Time Series (Daily) key name], z.value  [Time Series (Daily) key value]
from openjson(@jsonTimeSeriesDailyObject) z
 
-- key and key value, but key value is a json object
-- use substring function to extract the close price (starting in col 136 with a len of 8 colls)
select 
 z.[key] [Time Series (Daily) key name]
,substring(z.[value],136,8) [close]
from openjson(@jsonTimeSeriesDailyObject) z

The following screen shot shows an excerpt from the output for the second select statement in the preceding script segment.

  • The first column has the name Time Series (Daily) key name
    • The column values start at 2022-11-25
    • The excerpt shows the output for the next four rows
    • The column value for the one hundredth row of output, which does not appear in the excerpt, is 2022-07-07
  • The second column has the name Time Series (Daily) key value
    • There are five sets of key-values for each column value
    • The close values start in column 136 and extend from that position for 8 columns in total; these parameters designate the substring arguments for extracting close values
query results

The next screen shot shows an analogous excerpt from the third select statement in the preceding script. This results set excerpt is also for the first five successive trading dates.

  • The close value for 2022-11-25 matches the corresponding close value in the preceding results set excerpt
  • The close values for the second through the fifth rows below also match the corresponding close values in the preceding results set excerpt
query results
Next Steps

There are two files in the download for this tip. First, there is a JSON data file with the same values as those shown in the tip. Second, there is a SQL script file that references the downloaded JSON file and performs the extract operations demonstrated in the tip.

After saving the JSON file and configuring the SQL script file to point at your saved version of the JSON data file, you can see if you get the same results as presented in this tip.

Next, you can use the link in the "The sample JSON file for this tip" section for downloading a fresh version of the data. Then, you can copy the data from your browser to a text file and save the JSON file to a path of your choice. Finally, you can point the SQL script file at the fresh text file that you downloaded from the link.

If you have some JSON files in your organization, you can adapt the code in this tip to process them for use with SQL Server.






get scripts

next tip button



About the author
MSSQLTips author Rick Dobson Rick Dobson is a Microsoft Certified Technical Specialist and well accomplished SQL Server and Access author.

View all my tips


Article Last Updated: 2023-01-12

Comments For This Article

















get free sql tips
agree to terms