PostgreSQL JSON and JSONB Data Types for Non-Structured or Semi-Structured Data

By:   |   Updated: 2024-07-11   |   Comments   |   Related: > PostgreSQL


Problem

We have all encountered the need to store non-structured or semi-structured data in an RDBMS; XML or JSON data in particular. This can be complicated, especially in the past with limited technical options, and even more complicated if we want to query this data efficiently.

Solution

PostgreSQL offers two different data types, JSON and JSONB, as well as many operators in order to query, index, and work efficiently with JSON. It is probably the best JSON management of all RDBMSs.

This tip will review the most important syntaxes, features, and possibilities of the PostgreSQL JSON and JSONB data types.

JSON and JSONB Data Types

PostgreSQL has two different data types to manage JSON data:

  1. JSON - The oldest and less feature-rich, and
  2. JSONB - Offers the possibility to apply indexes to the data type and other specific operator

In terms of differences, the JSON data type stores an exact copy of the data that needs to be parsed when accessed while JSONB stores a decomposed binary version of the data which is slower to insert, but faster to process because it is not reprocessed. There is also a third data type in PostgreSQL related to JSON, JSONPATH, but this tip will not analyze it.

So, let's begin by showing an example of adding a JSON column in a table, as well as the operators we can use, and then query these data types. As always, we will use the Chinook database, a GitHub free downloadable database available in multiple RDBMS formats. It is a simulation of a digital media store with some sample data. All you need to do is download the required version, which includes all the scripts for data structure and inserts for data.

-- MSSQLTips.com
 
create table if not exists orders 
(order_id bigint generated always as identity (start with 1 increment by 1) not null primary key,
order_details jsonb not null);
query message

We can take a look at this newly created table in pgAdmin and then insert some data in it:

newly created table
-- MSSQLTips.com
 
insert into orders (order_details)
values ('{"customer": "Astrid Gruber", "track":"For Those About To Rock (We Salute You)"}'),
      ('{"customer": "Astrid Gruber", "track":"Love In An Elevator"}'),
      ('{"customer": "Alexandre Rocha", "track":"Whole lotta love"}');
insert data

All is fine so far; we have inserted some correctly parsed JSON data. Let's try now to insert some not correctly formed JSON:

-- MSSQLTips.com
 
insert into orders (order_details)
values ('{"customer": "Alexandre Rocha", "track":"For Those About To Rock (We Salute You)"'); 
error message

We receive an error as the string is not correctly ended for a JSON! That is amazing! Without relying on any complicated parsing function, we have a "free" check of our data!

JSON Operators

But it's not over! There are many JSON operators available to query this data!

Let's introduce the first: ->

This operator extracts the JSON object field with the given key. Here's an example:

-- MSSQLTips.com
 
select order_details -> 'track' as track
from orders;
->JSON operator

As you can see, it is very easy to extract an object from the JSON. Notice that, in this case, the resulting data is still in JSONB data type. Anyway, we can do more. We can use the object to filter our query, in this case, extracting it as a text data type with the operator ->>:

-- MSSQLTips.com
 
select order_details -> 'track' as track_json, order_details ->> 'track' as track_text 
from orders
where order_details ->> 'customer' like 'Astrid%';
->>JSON operator

As you can see, I used both operators to compare the results with different data types.

These first two operators are valid in both JSON and JSONB data types. But, there are some specific ones valid only with JSONB data type. An example of this is the ?, which is interesting in that it returns a Boolean if the specific text is a key inside our JSON. Let's see it in action.

-- MSSQLTips.com
 
insert into orders (order_details)
values ('{"track":"For Those About To Rock (We Salute You)"}');
? JSON Operator

We first inserted a new row in the table without the customer key in the JSON. Now, we can query the table again and check only the rows where this key is present:

-- MSSQLTips.com
 
select order_details ->> 'customer' as customer
from orders
where order_details ? 'customer';
? JSON Operator

As expected, the row that we just inserted is not returned by the query. This is just an example of the operators that I mostly used. I will share a complete list of the links with the official PostgreSQL documentation.

JSON Processing and Testing Functions

Obviously, there are some functions used to test and process JSON and JSONB data types. Let's make a couple of examples of two interesting processing functions like jsonb_each (and the equivalent for JSON data type named json_each):

-- MSSQLTips.com
 
select jsonb_each(order_details)
from orders;
jsonb_each

As we can see with this function, we have the JSON broken down with all keys and values, which is a nice feature for exploring what's in it. As the result set is a record type, we can represent it as a recordset result, maintaining the same data type JSONB for the values or converting it to a TEXT data type using jsonb_each_text:

-- MSSQLTips.com
 
select key,value
   from orders,
jsonb_each_text(order_details);
jsonb_each_text

We need to use this particular syntax to select directly from the jsonb_each_text output, and then the result set will be formatted nicely.

But there are more functions like jsonb_object_keys, which will return all the keys of a JSON:

-- MSSQLTips.com
 
select order_id, jsonb_object_keys(order_details)
from orders;
jsonb_object_keys

As expected, the query returns only seven rows since the last row inserted has only the "track" key in the JSON.

Starting with version 16 of PostgreSQL, we also have a testing function, IS JSON, similar to the function found in many other RDBMS (including SQL Server and Oracle). This function helps recognize if a string is JSON:

-- MSSQLTips.com
 
select order_details is json as ISJSON
from orders;
IS JSON

As expected, the result is always true as the order_details column is a JSONB. But, below is the result when we run the function against the string we tried to insert in one of the first examples above, receiving an error:

-- MSSQLTips.com
 
select ('{"customer": "Alexandre Rocha", "track":"For Those About To Rock (We Salute You)"') is json as ISJSON;
ISJSON

As expected, it returns as false since it is not JSON.

JSONB Indexing

Last but not least, a JSONB data type can be indexed, greatly helping performance when using all the JSON operators. The index type that is used is GIN. (I will write more in-depth about indexes in PostgreSQL in another tip series.) Let's check out an example by first inserting some additional rows:

-- MSSQLTips.com
 
insert into orders (order_details)
select cast('{"customer":'||'"'||"FirstName"||' '||"LastName"||'", "track":'||'"'||"Name"||'"}' as jsonb) 
from "Finance"."Customer"
cross join "Track"
   where "GenreId" in (1,3,4,13) and "Name" not like '%"%';
JSON Indexing

As you can see, I filtered out the track names containing the JSON escape character (") to simplify the insert, and I explicitly cast the text resulting from the query into a JSONB data type. With a separate query, I added some more rows using the same method to have a table of 168K rows, which is not big, but is at least significant.

We can now query our table using the JSONB operator @> in the filter. This operator, specific to a JSONB data type, returns true if the correct JSON is contained in the left one, thus perfect for a filter:

-- MSSQLTips.com
 
select order_details -> 'track' as track 
from orders 
where order_details @> '{"customer": "Astrid Gruber"}';

Let's run the Explain Analysis of this query to obtain the plan used:

@>JSON operator

As expected, since there are no indexes, we have a sequential scan of the entire table.

Now, we can create an index on the JSONB column:

-- MSSQLTips.com
 
create index idx_test_jsonb on orders using gin (order_details);
Index

We can now check again the Explain Analysis to see the plan used by the optimizer to return the query:

-- MSSQLTips.com
 
select order_details -> 'track' as track 
from orders 
where order_details @> '{"customer": "Astrid Gruber"}';
Explain Analysis

The query now uses the index created for the filter, making the query run a little bit faster.

Next Steps

In this tip, we have reviewed some of the most interesting features of JSON data type management in PostgreSQL.

Here are links to the official PostgreSQL documentation:

Here are some other tips regarding JSON:



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Andrea Gnemmi Andrea Gnemmi is a Database and Data Warehouse professional with almost 20 years of experience, having started his career in Database Administration with SQL Server 2000.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2024-07-11

Comments For This Article

















get free sql tips
agree to terms