SQL Server Insert Parent and Child Records with One Statement

Problem

A few days ago, one of the developers asked me if it was possible to generate test data with multiple nested INSERT statements, each inserting new rows into several parent tables and, in the same statement, reusing the autogenerated primary keys for the foreign key columns in the child table. The developer was working with PostgreSQL, so I tried to find a solution for both PostgreSQL and SQL Server to learn how the two database systems differ in their implementation of ANSI features.

Solution

Disclaimers

The fastest data generation solution is as follows:
1. Insert the required number of rows into each parent table
2. Get the ids according to the data generation logic and use them to add rows to the child table
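
The two steps above could be sketched as follows in PostgreSQL. This is only an illustration, not the author's script: it assumes the three tables created later in this tip, and the row count of 1000 and the pairing of parents by row number are arbitrary choices standing in for the real data generation logic.

```sql
-- Step 1: bulk-insert the parent rows (1000 is an arbitrary, assumed row count)
INSERT INTO public.product (product_name, color, listprice)
SELECT 'product ' || g::text, 'green', 560
FROM generate_series(1, 1000) AS g;

INSERT INTO public.salesperson (territoryid, salesquota, bonus)
SELECT 56, 5000, 100
FROM generate_series(1, 1000) AS g;

-- Step 2: read the generated ids back and pair them up for the child table.
-- Pairing the n-th product with the n-th salesperson is an assumption made
-- for this sketch; any other matching rule can be substituted.
INSERT INTO public.salesorderheader (salespersonid, productid, orderdate, shipdate, status)
SELECT s.salespersonid, p.productid, now(), now() + INTERVAL '7 day', 1
FROM (SELECT productid, row_number() OVER (ORDER BY productid) AS rn
      FROM public.product) AS p
JOIN (SELECT salespersonid, row_number() OVER (ORDER BY salespersonid) AS rn
      FROM public.salesperson) AS s
  ON s.rn = p.rn;
```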

In this tip I will not use the technique above. Instead, I will try to do it all with just one statement.

On my laptop, I generated 100,000 rows in SQL Server using the single-statement technique described in this tip. It took 86 seconds, compared to approximately 5 minutes for the three-statement logic outlined here:

insert into 1st parent table + store output into the variable
insert into 2nd parent table + store output into the variable
insert into a child table
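
For reference, the three-statement logic might look roughly like this in T-SQL. It is a sketch under assumptions: it uses the dbo tables created later in this tip, and the table variables @prod and @pers are hypothetical names chosen here to hold the generated ids.

```sql
-- Table variables that capture the generated identity values (hypothetical names)
DECLARE @prod TABLE (productid int);
DECLARE @pers TABLE (salespersonid int);

-- 1st parent table: store the new id in @prod
INSERT INTO dbo.product (product_name, color, listprice)
OUTPUT inserted.productid INTO @prod (productid)
VALUES ('3D printer', 'green', 560);

-- 2nd parent table: store the new id in @pers
INSERT INTO dbo.salesperson (territoryid, salesquota, bonus)
OUTPUT inserted.salespersonid INTO @pers (salespersonid)
VALUES (56, 5000, 100);

-- Child table: reuse both captured ids as foreign keys
INSERT INTO dbo.salesorderheader (salespersonid, productid, orderdate, shipdate, status)
SELECT pers.salespersonid, prod.productid, GETDATE(), DATEADD(day, 7, GETDATE()), 1
FROM @prod AS prod
CROSS JOIN @pers AS pers;
```

Because the OUTPUT INTO targets here are table variables rather than tables with foreign keys, no constraints need to be disabled for this variant.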

ANSI Solution

There is a great feature in the ANSI standard that fits this challenge exactly: the common table expression, or CTE. A CTE is a temporary named result set, created from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE or DELETE statement.
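
As a reminder of the basic form, here is a minimal read-only CTE. It is only a sketch: it assumes the product table created later in this tip, and the name green_products is made up for the example. This form should run in both PostgreSQL and SQL Server.

```sql
-- A CTE that only reads data: the named result set exists
-- for the duration of the single statement that follows it
WITH green_products AS (
    SELECT productid, product_name
    FROM product
    WHERE color = 'green'
)
SELECT productid, product_name
FROM green_products
ORDER BY productid;
```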

In PostgreSQL, the CTE implementation allows data-modifying statements inside the CTE. In SQL Server, however, a CTE's query definition must meet the requirements for a view, which means we cannot modify data inside a CTE.

In this tip I will show you the single-statement solutions I came up with for PostgreSQL and SQL Server, and I would love to hear your comments on how you would solve this challenge.

PostgreSQL Approach to Load Data into Parent and Child Tables at the Same Time

Before we get started, here is the syntax for creating the three tables.

create database nested_inserts;
 
CREATE TABLE public.product (
productid serial NOT NULL,
product_name varchar(256) NOT NULL,
color varchar(30) NULL,
listprice money NOT null,
CONSTRAINT pk_product_productid PRIMARY KEY (productid)
);
CREATE TABLE public.salesperson (
salespersonid serial NOT NULL,
territoryid int4 NULL,
salesquota money NULL,
bonus money NOT null,
CONSTRAINT pk_salespersonid PRIMARY KEY (salespersonid)
);
CREATE TABLE public.salesorderheader (
salesorderid serial not null,
salespersonid int4 NOT NULL,
productid int4 NOT NULL,
orderdate timestamp NOT NULL,
shipdate timestamp NULL,
status int2 NOT null,
CONSTRAINT pk_salesorderheader_salesorderid 
PRIMARY KEY (salesorderid),
CONSTRAINT fk_salesorderheader__salesperson_salespersonid 
FOREIGN KEY (salespersonid) 
REFERENCES public.salesperson(salespersonid),
CONSTRAINT fk_salesorderheader__product_productid
FOREIGN KEY (productid) 
REFERENCES public.product(productid)
);

In PostgreSQL, the solution to this challenge is quite simple: we insert into the two parent tables inside CTEs and use the generated ids as foreign key values in the third table.

with prod as (
    -- first parent: return the generated product id
    INSERT INTO public.product(product_name,color,listprice)
    VALUES('3D printer','green',560)
    RETURNING productid
    ),
 pers as (
    -- second parent: return the generated salesperson id
    INSERT INTO public.salesperson(territoryid,salesquota,bonus)
    VALUES(56,5000,100)
    RETURNING salespersonid
    )
INSERT INTO public.salesorderheader (salespersonid,productid,orderdate,shipdate,status)
SELECT   pers.salespersonid,   -- select order must match the column list above
         prod.productid,
         now() as orderdate,
         now() + INTERVAL '7 day' as shipdate,   -- ship a week after the order
         1 as status
FROM prod,pers;

select * from public.salesorderheader;
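
To confirm that each generated id landed in the right foreign key column, the child row can be joined back to both parents. This check is an addition for illustration and uses only the tables defined above.

```sql
-- Each order should resolve to exactly one product and one salesperson
SELECT o.salesorderid,
       p.product_name,
       s.territoryid,
       o.orderdate,
       o.shipdate
FROM public.salesorderheader AS o
JOIN public.product     AS p ON p.productid     = o.productid
JOIN public.salesperson AS s ON s.salespersonid = o.salespersonid
ORDER BY o.salesorderid;
```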

SQL Server Approach to Load Data into Parent and Child Tables at the Same Time

Before we get started, here is the syntax for creating the three tables.

create database nested_inserts;
GO
use nested_inserts;
GO
CREATE TABLE dbo.product (
productid int IDENTITY(1,1) NOT NULL,
product_name varchar(256) NOT NULL,
color varchar(30) NULL,
listprice money NOT null,
CONSTRAINT pk_product_productid PRIMARY KEY (productid)
);
CREATE TABLE dbo.salesperson (
salespersonid int IDENTITY(1,1) NOT NULL,
territoryid int NULL,
salesquota money NULL,
bonus money NOT null,
CONSTRAINT pk_salespersonid PRIMARY KEY (salespersonid)
);
CREATE TABLE dbo.salesorderheader (
salesorderid int IDENTITY(1,1) not null,
salespersonid int NOT NULL,
productid int NOT NULL,
orderdate datetime NOT NULL,
shipdate datetime NULL,
status smallint NOT null,
CONSTRAINT pk_salesorderheader_salesorderid 
PRIMARY KEY (salesorderid),
CONSTRAINT fk_salesorderheader__salesperson_salespersonid 
FOREIGN KEY (salespersonid) 
REFERENCES dbo.salesperson(salespersonid),
CONSTRAINT fk_salesorderheader__product_productid
FOREIGN KEY (productid) 
REFERENCES dbo.product(productid)
);

As I mentioned earlier, in SQL Server a CTE's query definition must meet the requirements for a view, which means we cannot modify data inside the CTE.

We can use the INSERT…OUTPUT construction, but SQL Server has another limitation: when the results of an OUTPUT clause are captured into a table, either with OUTPUT INTO or through a nested INSERT, UPDATE, DELETE or MERGE statement, the target table cannot participate on either side of a FOREIGN KEY constraint.

 INSERT INTO dbo.product(product_name,color,listprice)
OUTPUT 1 AS salespersonid, -- at first I am trying to do it with only one parent table
   inserted.productid, 
   getdate() as orderdate, 
   getdate()+7 as shipdate, 
   1 as status 
INTO dbo.salesorderheader (salespersonid,productid,orderdate,shipdate,status) 
VALUES('3D printer','green',560)

Query execution failed: The target table 'dbo.salesorderheader'
of the INSERT statement cannot be on either side of a (primary key, foreign
key) relationship when the FROM clause contains a nested INSERT, UPDATE, DELETE,
or MERGE statement.

Since we are generating test data and this is not a production system, we can temporarily disable the foreign keys as follows:

ALTER TABLE dbo.salesorderheader NOCHECK CONSTRAINT fk_salesorderheader__salesperson_salespersonid; 
ALTER TABLE dbo.salesorderheader NOCHECK CONSTRAINT fk_salesorderheader__product_productid;
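
To see the effect of the NOCHECK statements, the state of the constraints can be read from the sys.foreign_keys catalog view. This check is an addition to the original walkthrough.

```sql
-- is_disabled = 1 means the constraint is currently not enforced
SELECT name, is_disabled, is_not_trusted
FROM sys.foreign_keys
WHERE parent_object_id = OBJECT_ID('dbo.salesorderheader');
```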

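We still cannot have two layers of nested INSERTs, because the OUTPUT INTO clause is not allowed when the FROM clause contains a nested INSERT, UPDATE, DELETE or MERGE statement. (The lastsoldproduct column used in this attempt is a helper column that is added to dbo.salesperson further down.)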

INSERT INTO dbo.salesperson (territoryid,salesquota,bonus,lastsoldproduct) 
OUTPUT inserted.salespersonid,inserted.lastsoldproduct,getdate(),getdate() +7,1 
INTO dbo.salesorderheader (salespersonid,productid,orderdate,shipdate,status) 
SELECT 56,5000,100, prod.productid
FROM (
      INSERT INTO dbo.product(product_name,color,listprice)
      OUTPUT inserted.productid
      VALUES('3D printer','green',560)
) prod

SQL Error [10717] [S0001]: The OUTPUT INTO clause
is not allowed when the FROM clause contains a nested INSERT, UPDATE, DELETE,
or MERGE statement.

To overcome the issue, I used the INSERT…EXEC construction. However, in order to return both autogenerated keys from the OUTPUT clause, I need to have both of them in the inserted table, so I added a new column to the salesperson table to store the id generated during product creation.

ALTER TABLE dbo.salesperson ADD lastsoldproduct int;

And here is my final statement:

INSERT INTO dbo.salesorderheader (salespersonid,productid,orderdate,shipdate,status) 
EXEC (' 
       INSERT INTO dbo.salesperson (territoryid,salesquota,bonus,lastsoldproduct) 
       OUTPUT inserted.salespersonid,inserted.lastsoldproduct,getdate(),getdate() +7,1 
       SELECT 56,5000,100,prod.productid 
       FROM ( 
             INSERT INTO dbo.product(product_name,color,listprice) 
             OUTPUT inserted.productid 
             VALUES(''3D printer'',''green'',560) 
       ) prod 
     ')
select * from dbo.salesorderheader

We have succeeded in inserting two separate rows into two tables, generating two ids and using them in the third insert, all in one statement. Keep in mind that this solution required disabling the foreign key constraints, which is not recommended for production environments.
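
Once the test data is loaded, the constraints can be switched back on. The sketch below is a suggested cleanup rather than part of the original solution. WITH CHECK asks SQL Server to validate the existing rows, so the statements will fail if any generated row violates a foreign key.

```sql
-- Re-enable both foreign keys and validate the rows that are already there
ALTER TABLE dbo.salesorderheader
    WITH CHECK CHECK CONSTRAINT fk_salesorderheader__salesperson_salespersonid;
ALTER TABLE dbo.salesorderheader
    WITH CHECK CHECK CONSTRAINT fk_salesorderheader__product_productid;

-- Optional: remove the helper column if it is no longer needed
-- ALTER TABLE dbo.salesperson DROP COLUMN lastsoldproduct;
```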

If you come up with another way to implement the above query, without disabling the keys and without an additional column – I would love to see it. Please enter feedback in the comments section below.

Maria Zakourdaev

Maria Zakourdaev is a Data Platform Microsoft MVP and a technology expert with more than 20 years of experience, a community leader with a profound knowledge of Microsoft products and services, while also being able to bring together diverse platforms, products, and solutions to solve real-world problems.

Maria has hands-on experience managing various data management technologies in Azure, AWS and Google public cloud platforms, including SQL Server, Azure CosmosDB, AWS DynamoDB, MemSQL, MySQL, PostgreSQL, Snowflake, Redis, Amazon Redshift, Couchbase and Elasticsearch, Spark, Databricks, Big Query.

Maria is an organiser of the annual conference “Data TLV” (formerly “SQL Saturday Israel”), a free training event for data professionals: http://datatlv.com.

MSSQLTips Awards: Trendsetter (25+ tips) – 2022 | Author Contender – 2022 | Rookie Contender – 2018

2 Comments

T Owens
February 16, 2022 / 11:52 am
This is another reason why we should never use an identity column as a surrogate PK alone. There should be another candidate key; put a unique constraint on it and use that for inserts and updates.
Mourad Karib
January 11, 2021 / 4:59 am
Hi, thank you for the postgres solution, it’s really ingenious! Unfortunately my company is using MSSQL and I was hoping for a solution inside a production database, is there a way to overcome this problem?
Thank you in advance,
