Reading and Writing JSON Files with Python

By:   |   Updated: 2022-11-04   |   Comments   |   Related: > Python


Problem

I've become interested in the JSON file format and what it can do. Can you briefly highlight the key benefits of using JSON files with the Python programming language? I'm also interested in how to create JSON files with a text editor, load the created JSON files into Python, parse JSON and display the contents. Lastly, I could benefit from a demonstration showing the difference between simple JSON and nested JSON.

Solution

One popular use case for JSON files with Python is data exchange. JSON (JavaScript Object Notation) is a lightweight format for exchanging data between computers. Because a JSON file consists of key-value pairs in a text format, humans can easily read its contents. A JSON file is language-independent; therefore, you do not need JavaScript to use JSON files, and you can process JSON files in other languages besides JavaScript, such as Python or T-SQL. The structure of JSON files is more compact than XML files. Therefore, JSON files can be sent over the internet and loaded faster than XML files.

One typical JSON file use case is when a client computer requests information from a server, which is passed from the server computer via a JSON file to the client computer.  When a web-based server runs a RESTful application programming interface (API), it commonly satisfies requests via JSON files.  A previous MSSQLTips.com article titled Overview of How to Collect Securities Prices for SQL Server devotes a section to introducing the Alpha Vantage RESTful API. The Alpha Vantage offers both free and paid tiers of service. The securities prices tutorial showed excerpts from a downloaded JSON file from Alpha Vantage. However, the prior tutorial has narrowly focused coverage on how to generate JSON files and exchange them between computers.

With regards to JSON files and Python scripts, this Python tutorial is at more of a beginner's level than the previous article. The tutorial includes a couple of examples that start with creating a JSON file in a text editor, then loading and displaying the file's contents in Python. One example is for a file with simple JSON, and the other is for a file with nested JSON.

  • In simple JSON, each key-value pair has the JSON file as its parent.
  • In nested JSON, one or more key-value pairs are embedded within another parent key. Nesting can extend through multiple levels of key-value pairs, such as child, parent, grandparent, and great-grandparent levels.

If you can open Python and load a Python script file, and you have previously used a text editor, such as Notepad++ or Notepad, then you will be able to run the examples in this tutorial. If you do not already have Python installed on your computer, you can learn more about it and obtain a free copy of Python from Python.org. By running the samples in this script, you will receive an introduction to the interoperation of Python and JSON files that will grow your Python programming capabilities and improve your understanding of JSON files.

Key-value Pairs in Simple and Nested Formats

The following screenshot shows a JSON file with the basic JSON types of string, number, object, array, boolean, and null in a Notepad++ session. The syntax for a basic JSON type is a key name followed by a colon, a space, and a value. A key's value can be any of the six basic JSON types. A comma follows all key-value pairs except the last one in a simple JSON file. Two basic JSON types appear twice in the screenshot below with two different values.

  • The name of a key must appear in double quotes. The key name is always a string value designated by the JSON file developer.
    • The first key name is String_value
    • All number type values are JavaScript's double-precision floating-point format. Therefore, within JSON, 1 and 1.0 have the same value. For the sake of familiarity to SQL Server professionals, this tutorial shows number type values in either of two sub-variants:
      • The second key name (Number_value_int) is for integral number values that have no decimal point
      • The third key name (Number_value_float) is for number values with a decimal point
  • The last key name is Object_value. The syntax for this key-value pair is unique in that it includes within braces a key-value pair as its value.
  • The screenshot below is for a simple JSON format.
  • Within a JSON file formatted as a set of simple JSON key-value pairs, opening and closing braces ({}) denote the start and end of the key-value pairs within a JSON file.
    • Therefore, the open brace ({) on the first line and the closing brace (}) in the last line denote the beginning and end, respectively, of the parent object for the key-value pairs in the file.
    • Although object data type syntax has braces, object data types are considered basic JSON data types.
Simple JSON format

The next screenshot shows an example with nested JSON. This screenshot displays the json_types_with_nesting.json file in a Notepad++ session.

  • Key-value pairs with key names of Number Types and Boolean Types show examples of nested JSON.
  • The other key-value pairs have key names of StringType, Null_value, and Array_value.
    • These key-value pairs follow simple JSON format conventions as described in the commentary for the preceding screenshot.
    • The Array_value pair in the following screen image is slightly different from the Array_value pair in the preceding screen image. This difference and its implications for JSON arrays are briefly discussed towards the end of this section.

The syntax for nested JSON has these conventions:

  • One or more inner key-value pairs embedded in one or more outer key-value pairs.
  • The outer key name is for a set of nested JSON.
    • After the outer key name, colon, and space, one or more sets of inner key-value pairs appear within brackets ([]).
      • An opening bracket ([) precedes the first set of inner key-value pairs
      • A closing bracket (]) trails the last set of inner key-value pairs; the closing bracket needs to be followed by a comma.
    • Each set of inner key-value pairs is delimited by braces.
      • When one set of inner key-value pairs within nested JSON is followed by another set of inner key-value pairs, the braces at the end of that set must be followed by a comma.
      • When a set of inner key-value pairs is the last set following the key name, there should not be a comma at the end of the braces.
    • Key-value pairs within an inner set should be followed by a comma, except for the last key-value pair in a set.

A JSON array can contain any ordered sequence of basic JSON type values. The type values must be comma delimited.

  • The following screen image shows number type values as the beginning and ending elements in an array. The intermediate array elements have string types.
  • By including object data types within an array, you can develop sets of key-value pairs within an array.
Example of nested JSON

Python Code for Loading and Reading JSON File

The process for loading and displaying a JSON file in Python is the same whether your source data file is in simple JSON format or nested JSON format.

The following screenshot shows a sample Python script for loading and displaying a JSON file in Python. The screenshot is from the IDLE application, which is an integrated development environment for Python scripts. Recall that IDLE downloads and installs automatically along with Python from Python.org. This script runs the 3.10.4 version of Python available from Python.org. This Python version also ships with the JSON and pprint package libraries. The two import statements at the top of the following script reference these packages.

  • The Python open function returns a handle (f) to the json_types.json file generated in the preceding section.
  • The json.load function copies the JSON text from the file referenced by the file handle to a Python dictionary object named data. A Python dictionary object resembles a JSON file, but distinct differences exist. One critical way they are similar is that they both represent data via key-value pairs.
  • After storing a copy of the JSON file in the data dictionary, a series of print statements highlight different aspects of the source and copied data.
Sample Python script for loading and displaying a JSON file in Python

The first print statement in the preceding script shows the content of the data dictionary as an unformatted string of key-value pairs. Notice that it is hard to follow. If you carefully examine the output, you might note that the formatting of content in the data dictionary is slightly different from the contents of the JSON text file. However, the most obvious difference is that the structure of the JSON file in Notepad++ makes the key-value pairs relatively easy to read. In contrast, the text values from the data dictionary are relatively hard to read because the displayed contents do not reflect the structure of the JSON data.

Here is a copied portion of the first print statement's output from the IDLE shell window. This output begins with content from the Python operating system. Next, a brief description of the output appears before the actual contents.

Python 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
 
======== RESTART: C:\my_py_scripts\load_simple_json_file_and_display.py ========
display contents of data
{'String_value': 'Json key types and sample values', 'Number_value_int': 12, 'Number_value_float': 12.3456, 'Boolean_value_true': True, 'Boolean_value_false': False, 'Null_value': None, 'Array_value': ['a', 'r', 'r', 'a', 'y'], 'Object_value': {'obj': 'ect'}}

After displaying the whole dictionary object as a string, the script displays just the key values in the data dictionary. Here is a portion of the output from the preceding script showing the key values from the data dictionary. A brief descriptive label precedes the output.

display keys in data
dict_keys(['String_value', 'Number_value_int', 'Number_value_float', 'Boolean_value_true', 'Boolean_value_false', 'Null_value', 'Array_value', 'Object_value'])

The next portion of the output is for what is typically called a "pretty print" of the text values in the data dictionary. The custom label for this screen excerpt is pretty print contents of data. The pretty print output is generated from the pprint function of the pprint package. With the output from the pprint function, it is easier to discern differences between the Notepad++ image of the JSON file and the data dictionary object.

Here are several differences that stood out to me between the underlying JSON data and the pretty print output:

  • Key names appear in single quotes in the data dictionary, but they are in double quotes in the original JSON file.
  • Rows are alphabetically ordered by key name in the pretty print output from the data dictionary, but rows appear in Notepad++ in an order specified by the developer. If the developer chooses a specific non-alphabetical order because of internal standards (or some other reason), that order is lost in the pprint output for the data dictionary. By the way, the unformatted string representation of the data dictionary does preserve the original order of the key-value pairs.
  • The capitalization of selected values for key values is automatically changed in the data dictionary. For example, the value "true" in the original JSON file for a Boolean data type is changed to "True" in the data dictionary.
pretty print contents of data
{'Array_value': ['a', 'r', 'r', 'a', 'y'],
 'Boolean_value_false': False,
 'Boolean_value_true': True,
 'Null_value': None,
 'Number_value_float': 12.3456,
 'Number_value_int': 12,
 'Object_value': {'obj': 'ect'},
 'String_value': 'Json key types and sample values'}

The last section of output is from the Python script for the content from the original source JSON file that appears next. The original source data is derived with the dumps function. The data dictionary is passed as a parameter to the dumps function. The dumps function pulls source content from the JSON file underlying the data dictionary. The output below is generated by the Python print function with an indent parameter for designating the number of blank spaces at the beginning of indented lines. The indent parameter causes the output to appear like pretty print output as opposed to the output for raw text. This output is closer in appearance to the original source data in the Notepad++ file than the output from the pprint function in the pprint package.

  • Key names in this output appear in double quotes, just as in the original Notepad++ example from the preceding section.
  • Capitalization for selected key values, such as true, conform to the underlying JSON file as opposed to the transformed data dictionary object.
  • The elements of the array data type appear in a vertical orientation as opposed to the horizontal orientation in the underlying JSON file. This is a minor deviation because the array element values and their order in the array are identical between the output from the Python print function and the underlying JSON file displayed by Notepad++.
pretty print contents of json file
{
  "String_value": "Json key types and sample values",
  "Number_value_int": 12,
  "Number_value_float": 12.3456,
  "Boolean_value_true": true,
  "Boolean_value_false": false,
  "Null_value": null,
  "Array_value": [
    "a",
    "r",
    "r",
    "a",
    "y"
  ],
  "Object_value": {
    "obj": "ect"
  }
}

Here is the Python script for displaying the contents of the JSON file with a nested JSON format. If you compare this script with the preceding script for the loading and displaying the contents of a JSON file with simple JSON formatting, you will discover that the scripts are the same, except for the name of the JSON file. The following script references a file with the name "json_types_with_nesting.json" in its open statement, and the preceding script references a file with the name "json_types.json". In other words, the same Python code can load and display the contents of JSON files, whether the file is one with a simple JSON format or one with a nested JSON format. As a result, the remaining text in this section reviews the output from Python and compares it to the source JSON content in the referenced JSON file.

file with the name of json_types_with_nesting.json in its open statement, and the preceding script references a file with the name of json_types.json

To process the original JSON file with nesting in Python, it must first be copied to a Python dictionary object. The following excerpt displays a segment of the output that shows the JSON in a dictionary without pretty print formatting. The results are copied from the IDLE shell window. Note: It is not easy to read this output because the content is not formatted based on the structure of the underlying JSON data. Nevertheless, if you know precisely what you want to review, sufficient attention to detail may permit you to find what you want in the output.

Python 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
 
======== RESTART: C:\my_py_scripts\load_nested_json_file_and_display.py ========
display contents of data
{'StringType': 'This is a string', 'Number Types': [{'Number_value': 12, 'Number_type': 'int', 'key_value': 'Number'}, {'Number_value': 12.3456, 'Number_type': 'float', 'key_value': 'Number'}], 'Boolean Types': [{'Boolean_value_true': True, 'Boolean_type': True, 'key_value': 'Boolean'}, {'Boolean_value_false': False, 'Boolean_type': False, 'key_value': 'Boolean'}], 'Null_value': None, 'Array_value': [0, 'a', 'r', 'r', 'a', 'y', 1]}

The next excerpt displays the parent-level key names from the data dictionary. The Python code does not return any key names from the child level for nested key-value pairs. Reviewing the preceding results excerpt can verify that child-level key names are in the raw dictionary key-value pairs. However, the code to display data keys from the data dictionary only returns parent-level keys. For example:

  • The Number Types key appears in the following results output.
  • However, neither of the two child-level keys named Number_value appear in the following results. You can see from the immediately preceding results excerpt that the first set Number_value key has a value of 12 and the second set Number_value key has a value of 12.3456.
display keys in data
dict_keys(['StringType', 'Number Types', 'Boolean Types', 'Null_value', 'Array_value'])

The following results excerpt displays the pretty print version of the data dictionary. The pretty print version sorts parent- and child-level key names in alphabetical order.

The pretty print version output from the pprint function also transforms the capitalization for selected values, such as True and False.

  • In the pprint output, the first letter of these values is capitalized.
  • In the underlying file, the first letter of these values is not capitalized.
pretty print contents of data
{'Array_value': [0, 'a', 'r', 'r', 'a', 'y', 1],
 'Boolean Types': [{'Boolean_type': True,
                    'Boolean_value_true': True,
                    'key_value': 'Boolean'},
                   {'Boolean_type': False,
                    'Boolean_value_false': False,
                    'key_value': 'Boolean'}],
 'Null_value': None,
 'Number Types': [{'Number_type': 'int',
                   'Number_value': 12,
                   'key_value': 'Number'},
                  {'Number_type': 'float',
                   'Number_value': 12.3456,
                   'key_value': 'Number'}],
 'StringType': 'This is a string'}

The last results excerpt is for the underlying JSON file; recall that this output depends on the dumps function from the JSON package. The output is generated with the Python print function having an indent parameter. This results output excerpt displays the key-value pairs in the same order as in the underlying JSON file because the dumps function references the underlying JSON file.

  • The orientation of the array element values is vertical in the output below and horizontal in the underlying JSON file.
  • However, the number and order of the array elements are identical between the Python print output and the underlying JSON file displayed by Notepad++.
pretty print contents of json file
{
"StringType": "This is a string",
"Number Types": [   {
  "Number_value": 12,
  "Number_type": "int",
  "key_value": "Number"
},
{
  "Number_value": 12.3456,
  "Number_type": "float",
  "key_value": "Number"
}
],
"Boolean Types": [
{
  "Boolean_value_true": true,
  "Boolean_type": true,
  "key_value": "Boolean"
},
{
  "Boolean_value_false": false,
  "Boolean_type": false,
  "key_value": "Boolean"
}
],
"Null_value": null,
"Array_value": [
0,
"a",
"r",
"r",
"a",
"y",
1
]
}
Next Steps

For previous alternative articles on Python and/or JSON for SQL Server professionals, you can examine these other tips:

This article has four resource files in its download:

  • A JSON file in simple JSON format
  • A second JSON file in nested JSON format
  • A Python script file to load and display the first JSON file
  • A second Python script file to load and display the second JSON file

An initial next step after reading the tip is to load both JSON files into a text editor, such as Notepad++. Next, use the python script files to confirm that you can load and display the JSON file contents as demonstrated in the tip's examples.

Then, carefully review the instructions for creating a JSON file as a basis for modifying the JSON files provided in the download with this tip. For example, instead of using key names for basic JSON types, use selected products and/or services provided by your enterprise as key names and values. Then, load and display the modified JSON files with the python script files.

Finally, search the internet and/or your company's internal web resources for an API with content that you would like to download to your computer. Next, follow the API's instructions for downloading content and specify that you want your content in a JSON file in case the API supports multiple output formats. Then, load and display via Python the downloaded JSON file in Python from the API you reference.



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Rick Dobson Rick Dobson is an author and an individual trader. He is also a SQL Server professional with decades of T-SQL experience that includes authoring books, running a national seminar practice, working for businesses on finance and healthcare development projects, and serving as a regular contributor to MSSQLTips.com. He has been growing his Python skills for more than the past half decade -- especially for data visualization and ETL tasks with JSON and CSV files. His most recent professional passions include financial time series data and analyses, AI models, and statistics. He believes the proper application of these skills can help traders and investors to make more profitable decisions.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2022-11-04

Comments For This Article

















get free sql tips
agree to terms