Querying AWS CloudFront and WAF Logs using AWS Athena — Part I

Upendra Kumarage
Jun 10, 2021

A simple example of CloudFront and WAF logs forwarded to S3, where they can be queried with Athena

This article assumes that the CloudFront distribution and the WAF (AWS Web Application Firewall) resources already exist and that their logs are being forwarded to the relevant S3 buckets. It therefore does not cover the CloudFront and WAF setup or how the logs are pushed to S3. It also assumes that the steps are carried out by an IAM user with privileges to work with Athena, CloudFront, S3, and WAF.

Introduction

To start, here is a short introduction to CloudFront, WAF, and Athena. The descriptions below are taken from the AWS documentation, as it gives the simplest overview at a glance.

Amazon CloudFront is a fast content delivery network (CDN) service that can be seamlessly integrated with AWS services such as WAF and Shield. AWS WAF is a web application firewall that helps protect web applications or APIs against common web exploits and bots that may affect availability, compromise security, and so forth.

Athena is an interactive query service that provides the ability to analyze data in Amazon S3 using standard SQL. With Athena, there’s no need for complex ETL jobs to prepare the data for analysis. This makes it easy for anyone with SQL skills to quickly analyze large-scale datasets.

Setting things up

As a first step, I recommend setting up an S3 bucket to store the query results. I won’t explain how to create an S3 bucket, as it is a straightforward task.
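
For readers who would rather script this step than use the console, here is a minimal sketch in Python. It assumes the boto3 SDK, whose S3 client provides a create_bucket call; the helper names and the region used in the comment are hypothetical. The S3 client is passed in as an argument, so nothing here contacts AWS until you call it yourself.

```python
def build_create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for the boto3 S3 client's create_bucket call."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location, so it is not passed as a
    # LocationConstraint; every other region has to be named explicitly.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_results_bucket(s3_client, bucket_name, region):
    """Create the bucket that will hold the Athena query results."""
    return s3_client.create_bucket(**build_create_bucket_kwargs(bucket_name, region))


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   s3 = boto3.client("s3", region_name="eu-west-1")  # hypothetical region
#   create_results_bucket(s3, "test-s3-bucket-for-athena-queries", "eu-west-1")
```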

Once the S3 bucket is created, choose the preferred region for the Athena deployment and navigate to the Athena service in the AWS Console.

The next step is to create a workgroup in Athena. By default, each account has a primary workgroup, and the default permissions allow all authenticated users access to it. The primary workgroup cannot be deleted. Each workgroup that you create shows saved queries and query history only for queries that ran in it, and not for all queries in the account.

To create one, go to the Workgroup tab and select Create Workgroup.

Workgroup Tab

Enter a Workgroup name and Description of your choice, and set the Query result location to the path of the S3 bucket created earlier (for example, s3://test-s3-bucket-for-athena-queries). The remaining settings, such as encrypting query results and adding tags, can be configured as you prefer.

Create Workgroup
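
The same workgroup settings can also be expressed in code. The sketch below assumes the boto3 Athena client and its create_work_group call; the helper names, the workgroup name cf-waf-logs, and the tag are hypothetical, and SSE_S3 is only one of the encryption options Athena accepts. The client is passed in as an argument, so the code makes no AWS call on its own.

```python
def build_workgroup_request(name, output_location, description="",
                            encrypt_results=False, tags=None):
    """Build keyword arguments for the boto3 Athena client's create_work_group call."""
    # The query result location points at the S3 bucket created earlier.
    result_config = {"OutputLocation": output_location}
    if encrypt_results:
        # Assumption: S3-managed keys; other options need extra settings.
        result_config["EncryptionConfiguration"] = {"EncryptionOption": "SSE_S3"}

    request = {
        "Name": name,
        "Configuration": {"ResultConfiguration": result_config},
    }
    if description:
        request["Description"] = description
    if tags:
        request["Tags"] = [{"Key": key, "Value": value} for key, value in tags.items()]
    return request


def create_workgroup(athena_client, **settings):
    """Create the workgroup using a boto3 Athena client supplied by the caller."""
    return athena_client.create_work_group(**build_workgroup_request(**settings))


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   athena = boto3.client("athena")
#   create_workgroup(
#       athena,
#       name="cf-waf-logs",  # hypothetical workgroup name
#       output_location="s3://test-s3-bucket-for-athena-queries",
#       description="CloudFront and WAF log queries",
#   )
```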

You can switch to the new Workgroup by selecting the workgroup you created and clicking the Switch workgroup button.

Switch Workgroup

Create a Database

Before creating Athena tables, you first need a database in Athena. To create a database named cf_logs, navigate to the Query editor tab and run the following SQL query:

CREATE DATABASE cf_logs
Run Query
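
If you prefer to run the query from a script instead of the Query editor, the sketch below shows one way to do it. It assumes the boto3 Athena client with its start_query_execution and get_query_execution calls; the run_query helper, the polling limits, and the workgroup name in the comment are hypothetical. It also assumes the workgroup already has a query result location configured, so no output location is passed with the query.

```python
import time


def run_query(athena_client, sql, workgroup, poll_seconds=1.0, max_polls=60):
    """Start an Athena query in a workgroup and wait until it finishes.

    Returns a (query_execution_id, final_state) tuple.
    """
    started = athena_client.start_query_execution(QueryString=sql, WorkGroup=workgroup)
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    for _ in range(max_polls):
        response = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = response["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return query_id, state
        time.sleep(poll_seconds)

    raise TimeoutError(f"Query {query_id} did not finish after {max_polls} polls")


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   athena = boto3.client("athena")
#   run_query(athena, "CREATE DATABASE cf_logs", workgroup="cf-waf-logs")
```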

Now we have a database in which to create Athena tables. Creating the Athena tables for CloudFront and WAF is covered in the next part, since including it here would make this article unnecessarily long.

As a final note, there is plenty of documentation you can refer to in order to learn more about Athena, CloudFront, and WAF; the AWS documentation is by far the best source. Also, you do not have to follow these steps in this exact order: once you are familiar with these services, there are other methods and additional enhancements you can try out.

References

[1] https://aws.amazon.com/cloudfront/

[2] https://aws.amazon.com/waf/

[3] https://aws.amazon.com/athena/

[4] https://docs.aws.amazon.com/athena/
