Flight Plan API to load data into SQL Server using Python

By:   |   Updated: 2023-03-23   |   Comments   |   Related: > Python


Problem

There are so many APIs available to get various types of information. One goal a lot of developers have is how to load data into SQL Server by using an API. In this article, we look at an API from RapidAPI that allows us to retrieve flight information which we will use to load into a SQL Server table.

Solution

You can use one of RapidAPI's API to get global airport information and use Python to store the data in your SQL Server database for later use to analyze or just read the data from a table. The API we will look at has a lot of different information about flights and you can use this data to see which airports to avoid, flight delays, or just general flight information.

In this tutorial, we will look at how to retrieve data for flight delays from different airports globally. The delays are calculated relative to the time specified for the request and the median of the delays, this will help see which airports are more reliable than other airports.

Prerequisites

  • SQL Server instance, you can create one following this guide
  • Python, Version 3.11.0
  • Visual Studio Code, Version 1.72.1
  • Windows 10 PC or Windows Server 2019/2022
  • RapidAPI account, you can use a free subscription

Creating SQL Server Table

Before getting the data, create a SQL Server table to store the data.

We will use SQLCMD, but you can use SSMS or any other method to create a database and table.

Open a Windows command prompt or use the Python Console Terminal and enter the following using your server and credentials.

sqlcmd -S PC_NAME\SQL_INSTANCE -E

The following code will create a database named AirportDelaysDB.

CREATE DATABASE AirportDelaysDB;
GO

This code changes the context to the newly created database and creates a table with just the ID column. The other columns will be created in the Python script below.

USE AirportDelaysDB;
GO

CREATE TABLE AirportDelaysTable(ID int NOT NULL PRIMARY KEY);
GO

Here is the console sequence.

Creating database table sequence.

Great, now you have a table to write your flight delay data.

Creating SQL engine for Python

Now that you have created a database and a table, you need to create a connection string between your SQL instance and Python code.

Use the pip command pip install sqlalchemy to install sqlalchemy to your environment. SQLAlchemy creates an engine for Python to execute SQL statements. Also, urllib comes pre-installed with Python, so there's no need to install it.

using the pip command to install SqlAlchemy

Create a Python file named FlightDbConn.py and add the following code updating the information for your environment:

import sqlalchemy as sa
from sqlalchemy import create_engine
import urllib

conn = urllib.parse.quote_plus( 'Data Source Name=KNITNETDS;' 'Driver={SQL Server};' 'Server=PC_NAME\SQL_INSTANCE;' 'Database=AirportDelaysDB;' 'Trusted_connection=yes;' ) try: FlightEngine = create_engine('mssql+pyodbc:///?odbc_connect={}'.format(conn)) print("Passed") except: print("failed!")

Getting the Flight Data

We will use the AeroDataBox API to get Flight delays data per airport. Make sure to login and get your API keys. At the moment AeroDataBox API is provided exclusively via the RapidAPI – a platform where you can find and connect to thousands of various APIs with a single API key. You can test API for free with the Basic plan.

This tutorial uses ICAO codes instead of airport names. The ICAO airport code or location indicator is a four-letter code designating airports around the world. These codes, are defined by the International Civil Aviation Organization.

The airport delays API endpoints allow answering questions, like "How many flights are delayed or canceled right now or in past?" or "What are the delay statistics and delay index at an airport now or back then?" These endpoints are designed to get a statistical insight into the overall "smoothness" of flight operations. Information returned from the endpoint includes departures and arrivals, a median delay of recent flight movements, number of canceled flights, flight delay index, and validity periods of the batch.

To pull data, use the pip command pip install pandas. Pandas will be used to manage the data structure.

Create a file named FlightAPI.py and add the following. You will need to update with your keys.

import requests
import pandas as pd
from FlightDbConn import FlightEngine
 
delay_url                   = "https://aerodatabox.p.rapidapi.com/airports/delays/2023-03-06T00:18"
 
querystring                 = {"withAircraftImage":"false","withLocation":"false"}
 
headers                     = {
    "X-RapidAPI-Key": "046c3b4981msha4789c0d3484c16p1bc8d3jsna2f831b3710c",
    "X-RapidAPI-Host": "aerodatabox.p.rapidapi.com"
}
 
delay_response              = requests.request("GET", delay_url, headers=headers, params=querystring)

delay_response_Json         = delay_response.json()

Icao_array                  = []
initialTime_array           = []
finalTime_array             = []
departureDelayTotal_array   = []
arrivalDelayTotal_array     = []
cancelled_array             = []

for i in delay_response_Json:
    Icao                    = i['airportIcao']
    Icao_array.append(Icao)
    initialTime             = i['fromUtc']
    initialTime_array.append(initialTime)
    finalTime               = i['toUtc']
    finalTime_array.append(finalTime)
    departureDelay          = i['departuresDelayInformation']['numTotal']
    departureDelayTotal_array.append(departureDelay)
    arrivalDelays           = i['arrivalsDelayInformation']['numTotal']
    arrivalDelayTotal_array.append(arrivalDelays)
    cancelled               = i['arrivalsDelayInformation']['numCancelled']
    cancelled_array.append(cancelled)

delay_df                    = pd.DataFrame({"Icao":Icao_array,"FromTime":initialTime_array,"ToTime":finalTime_array,"DepartureTimeDelay":departureDelayTotal_array,"ArrivalTimeDelay":arrivalDelayTotal_array,"Cancelled":cancelled_array})

print(delay_df) 

The script above fetches data from the API, defines variables and stores the data in a pandas dataframe. The output shows the arrival and departure airport delays and number of canceled flights per airport. Keep in mind, this output is in the time frame shown.

Dataframe output data

Now we connect the pandas dataframe to the SQL database table created above. Modify the FlightAPI.py file as follows:

import requests
import pandas as pd
from FlightDbConn import FlightEngine

delay_url                   = "https://aerodatabox.p.rapidapi.com/airports/delays/2023-03-06T00:18"

querystring                 = {"withAircraftImage":"false","withLocation":"false"}

headers                     = {
    "X-RapidAPI-Key": "046c3b4981msha4789c0d3484c16p1bc8d3jsna2f831b3710c",
    "X-RapidAPI-Host": "aerodatabox.p.rapidapi.com"
}
 
delay_response              = requests.request("GET", delay_url, headers=headers, params=querystring)

delay_response_Json         = delay_response.json()
 
Icao_array                  = []
initialTime_array           = []
finalTime_array             = []
departureDelayTotal_array   = []
arrivalDelayTotal_array     = []
cancelled_array             = []

for i in delay_response_Json:
    Icao                    = i['airportIcao']
    Icao_array.append(Icao)
    initialTime             = i['fromUtc']
    initialTime_array.append(initialTime)
    finalTime               = i['toUtc']
    finalTime_array.append(finalTime)
    departureDelay          = i['departuresDelayInformation']['numTotal']
    departureDelayTotal_array.append(departureDelay)
    arrivalDelays           = i['arrivalsDelayInformation']['numTotal']
    arrivalDelayTotal_array.append(arrivalDelays)
    cancelled               = i['arrivalsDelayInformation']['numCancelled']
    cancelled_array.append(cancelled)
 
delay_df                    = pd.DataFrame({"Icao":Icao_array,"FromTime":initialTime_array,"ToTime":finalTime_array,"DepartureTimeDelay":departureDelayTotal_array,"ArrivalTimeDelay":arrivalDelayTotal_array,"Cancelled":cancelled_array})

print(delay_df)

try: 
    delay_df.to_sql('AirportDelaysTable',con=FlightEngine,if_exists='replace',index=True) 
    print("data added to the database")

except Exception as e: 
    print("Failed!") 
    print(e) 

The script above will write the API data to the SQL Server table which you can now query.

Conclusion

Great! Now you can utilize this API's flight data to capture and analyze the data. Check out the API to see all of the flight and airport related data you can retrieve.

Next Steps

Check out these related articles:



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Levi Masonde Levi Masonde is a developer passionate about analyzing large datasets and creating useful information from these data. He is proficient in Python, ReactJS, and Power Platform applications. He is responsible for creating applications and managing databases as well as a lifetime student of programming and enjoys learning new technologies and how to utilize and share what he learns.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2023-03-23

Comments For This Article

















get free sql tips
agree to terms