SQL Server Linked Server to Access Amazon Athena Tables

By:   |   Updated: 2023-10-06   |   Comments (1)   |   Related: > Amazon AWS


Problem

In the previous tips, An Introduction to Amazon Athena and Working with the Amazon Athena, we explored the use cases of Amazon Athena for querying data stored in the S3 bucket using standard SQL statements. SQL Server enables accessing external data (Oracle, MySQL, etc.) via linked servers. Can we use Linked Servers for Amazon Athena databases as well?

Let's explore leveraging Linked Servers to access Amazon Athena tables.

Solution

Data is spread across multiple locations and formats; integrating data is essential for any organization. Usually, we import data into database tables before querying it from SQL Server. The data import can use the SQL Server integration package (SSIS), Import\Export wizard, and BCP. You can directly query data stored in S3 bucket storage using Amazon Athena. This data can be in the following formats: Apache Parquet (.parquet), Apache ORC (.orc), CSV (Comma-Separated values), and JSON (JavaScript object notation).

Now, think of a solution: you receive the data from various sources regularly in different formats. These files are stored in the S3 bucket. Do you need to download files from the S3 bucket, import them into SQL Server tables, and query them? No, you can use Athena to query data from the files stored in the S3 bucket. Additionally, SQL Server can also connect to Athena using Linked Servers. You can directly use a SQL query for your data stored in the S3 bucket from the SQL Server using the Amazon Athena Linked Server.

ODBC Driver for Amazon Athena

To query Athena tables from SQL Server Linked Server, download the Amazon Athena ODBC Driver. It provides a standard ODBC driver interface to query the Amazon Athena interactive query services. It supports data aggregations and complex joins with compressive ANSI SQL-92 standard. There are two Amazon Athena ODBC drivers available for Windows, Linux, and macOS:

  • ODBC 2.x: supports only Windows.
  • ODBC 1.x: supports Windows, Linux, and macOS.

For this tip, we will use the ODBC 2.x driver. Note the following requirements before using the ODBC 2.x driver:

  • Athena uses port 444 for streaming query results. Therefore, you must keep this port open for outbound traffic.
  • You can use the Athena:GetQueryResultsStream policy in the IAM user to support the ODBC driver. This role supports the ODBC driver for streaming results.

Click the Windows download link (Windows 64-bit ODBC driver 2.0.1.0).

ODBC Driver for Amazon Athena

Launch the Amazon Athena ODBC driver setup wizard, as shown below.

Amazon Athena ODBC driver setup wizard

Click Next to review the license agreement and accept it.

license agreement

Choose the installation directory. Usually, it is good to install the ODBC driver on the OS drive.

custom setup

Click Install to begin the installation.

install

The Amazon Athena ODBC driver is installed successfully, as shown below.

installed successfully

ODBC Configuration

Launch ODBC 64-bit from the start menu and click System DSN -> Add.

Launch ODBC 64-bit

It opens a window to create a new data source. Choose Amazon Athena ODBC (x64) version 2.00.01.00, as shown below.

create a new data source

Click Finish and fill out the following information in the Amazon Athena ODBC configuration:

  • Data Source Name: Enter a data source name of your choice.
  • Region: Choose the AWS region for your AWS Athena database. You must use the correct region, or it won't connect to the Athena database. You can verify the region from the top-right corner of the web page, as shown below.
  • Catalog: Specify the Amazon Athena Data source name you created in the previous tip. As shown below, my catalog name is AwsDataCatalog.
data source name
  • Database: If you have a single database in Athena, you can use the default option or specify the database for which you create the ODBC.
  • S3 Output Locations: It requires a S3 directory to store the query results. You can specify the same S3 bucket that we used in the previous tip. The directory format is s3://[bucket]/folder.
  • Authentication Options: Choose IAM credentials as Auth schema from the drop-down menu. It asks you to enter the IAM user's AWS Access and Secret key. The IAM credentials must have access to the S3 bucket and Athena. Get the keys from the AWS IAM service and put them into the ODBC configuration page.
configuration
authentication options

Click Test Connection to see if the ODBC connection to Amazon Athena is successful. It gives the message "Successfully connected to Athena Engine version 3" for a successful connection.

connection test

Create a SQL Server Linked Server to Amazon Athena

Once we have the ODBC connection setup from the SQL Server machine to the Amazon Athena database, we can use the ODBC to create the Linked Server in SQL Server.

To create the Linked Server, we can run the following T-SQL using the stored procedures sp_addlinkedserver and sp_addlinkedserverlogin.

In the below query, specify the ODBC data source name in the @server, @datasrc, and @rmtsvrname parameters. You need to put the AWS access key in the @rmtuser and the secret key in the @rmtpassword parameter of the sp_addlinkedsvrlogin stored procedure.

EXEC master.dbo.sp_addlinkedserver 
   @server =N'AthenaQuery', 
   @srvproduct=N'', 
   @provider=N'MSDASQL', 
   @datasrc=N'AthenaQuery'
GO
 
EXEC master.dbo.sp_addlinkedsrvlogin 
   @rmtsrvname=N'AthenaQuery', 
   @useself=N'False', 
   @locallogin=NULL, 
   @rmtuser=N'AKIASEPWAFS6UVVVXMP7', 
   @rmtpassword='ON+T6J4/Zo92uCvZ/q4mZ1x6PUG9yVGxanFHwWQ9'
GO

Executing the query creates a Linked Server from the local SQL Server to Amazon Athena. You can verify the Linked Server by connecting to SQL Server in SSMS and expanding Server Objects -> Linked Servers.

linked server

Right-click on the Linked Server and Test Connection to verify the Linked Server can connect to the Amazon Athena database.

linked server test connection

Expand the Linked Server Catalogs to view all tables in your Amazon Athena setup. You can query these tables as a regular Linked Server table.

Linked Server Catalogs
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Rajendra Gupta Rajendra Gupta is a Consultant DBA with 14+ years of extensive experience in database administration including large critical OLAP, OLTP, Reporting and SharePoint databases.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2023-10-06

Comments For This Article




Thursday, October 19, 2023 - 9:04:41 PM - Harish Mekwana Back To Top (91685)
Hi Rajendra,
Thanks for the article , it helped me a lot.
However, while I am getting the below error, any idea what is wrong here.

Regards

-- [AmazonAthena].[AwsDataCatalog].[datasmart_2ea034c2-70b0-4f0f-889f-314e8052624b].[t876_os_user] contains no columns that can be selected or the current user does not have permissions on that object.
GO
















get free sql tips
agree to terms