How to Query PostgreSQL with Athena

Data Engineering Academy
5 min read · Dec 29, 2020

Amazon Athena makes it easy to analyze semi-structured and unstructured data such as JSON and CSV directly in Amazon S3 using SQL. It can also query a number of relational databases hosted in AWS, such as MySQL and PostgreSQL. For any data scientist, this opens up a world of potential: a single SQL query can now combine relational data with the non-relational data stored in your AWS data lake!
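To make the idea of a combined query concrete, here is a minimal Python sketch of a federated query that joins a table in the S3 data lake with a table in PostgreSQL. The database, table, and column names (sales_lake, orders, customers, and so on) and the catalog name passed in are hypothetical placeholders, not names from this walkthrough. The second function assumes boto3 is installed and AWS credentials are configured.

```python
def build_federated_query(pg_catalog: str) -> str:
    """Build a query joining an S3-backed table with a PostgreSQL table.

    "AwsDataCatalog" is Athena's default catalog for S3-backed tables.
    pg_catalog is whatever name you give the PostgreSQL data source.
    All database, table, and column names here are hypothetical.
    """
    return f'''
SELECT o.order_id, o.amount, c.customer_name
FROM "AwsDataCatalog"."sales_lake"."orders" AS o
JOIN "{pg_catalog}"."public"."customers" AS c
  ON o.customer_id = c.customer_id
LIMIT 10
'''.strip()


def run_query(sql: str, output_location: str) -> str:
    """Submit the query to Athena and return its execution ID.

    output_location is an S3 URI for the query results (an assumption:
    you need a bucket you can write to).
    """
    import boto3  # imported lazily so the builder works without AWS access

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Print the query only; nothing is sent to AWS here.
    print(build_federated_query("postgres_catalog"))
```

The point of the sketch is the three-part table names: the catalog prefix is what tells Athena whether a table lives in S3 or behind the PostgreSQL connector.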

In this post, I’ll walk you through all the steps (and all the gotchas!) to get up and running with the Athena federated query service.

Create a new Data Source

The first step is to open the Athena service and click Connect data source to set up a new connection.

Choose Query a data source, then select PostgreSQL as the data source that you want to query.

Athena uses data source connectors that run on AWS Lambda to run federated queries. Luckily, AWS provides a number of prebuilt Athena data source connectors for JDBC-compliant relational data sources. In this next step, we will choose Configure a new AWS Lambda function to run our federated queries.
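When configuring the connector's Lambda function, you will be asked for a JDBC connection string. The sketch below shows the shape I understand the prebuilt PostgreSQL connector to expect: a "postgres://" prefix followed by a standard JDBC URL, with credentials pulled from AWS Secrets Manager through a ${secret_name} placeholder. Treat the exact format as an assumption and check it against the connector's documentation. The host, database, and secret names are hypothetical.

```python
def build_connection_string(host: str, port: int, database: str, secret_name: str) -> str:
    """Assemble a connection string for the Athena PostgreSQL connector.

    Assumed format: postgres://jdbc:postgresql://HOST:PORT/DB?${SECRET}
    where ${SECRET} names a Secrets Manager secret holding the username
    and password. Verify this against the connector docs before relying on it.
    """
    # In an f-string, "{{" and "}}" produce literal braces, so this
    # renders the placeholder as ${secret_name}.
    return f"postgres://jdbc:postgresql://{host}:{port}/{database}?${{{secret_name}}}"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(build_connection_string("db.example.internal", 5432, "analytics", "athena/postgres"))
```

Keeping the credentials in a secret rather than in the connection string itself means they do not appear in the Lambda function's configuration.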
