A Guide to Deploying a Web App on AWS with Terraform and GitHub Actions
In this guide, I will show you how to deploy a very basic Express.js web application on AWS ECS using the Fargate launch type.
We will use Terraform to provision and manage all the resources needed, like the ECS cluster, VPC, ECR repository, IAM roles, etc. Then we'll use GitHub Actions to build the project and automatically deploy the application and infrastructure changes with the push of a new git tag.
Be aware that some AWS resources created throughout this guide are not free. The costs will depend on your region and the resources allocated. The initial costs will be lower if you qualify for the Free Tier.
Prerequisites
This guide assumes that you already have the following:
Some knowledge of Git, Terraform, AWS, and Docker
Terraform v1.4.1 (latest version at the time of writing)
AWS account
AWS credentials configured on your local machine
GitHub account and a git repository
Node.js installed on your machine
A Brief Introduction to AWS ECS and Fargate
ECS is a fully managed service for running and orchestrating Docker containers. It allows you to create a cluster and run tasks on it, where each task is a containerized application; with the EC2 launch type, those tasks run on EC2 instances that you manage.
Fargate is a serverless compute engine for containers built on top of ECS. When you launch a task or a service on ECS with Fargate, AWS provisions all the necessary infrastructure to run your containers, so you don't have to provision and manage the underlying EC2 instances.
You can read more about ECS and Fargate here and here.
Hello World! A very basic Express.js web application
In an empty git repository, create an app.js file and add the following:
const express = require('express')
const app = express()
const port = 3000

app.get('/', (req, res) => {
  res.send('Hello World!')
})

app.listen(port, () => {
  console.log(`Example app listening on port ${port}`)
})
Next, create a Dockerfile with the following:
FROM node:18
WORKDIR /usr/src/app
COPY package*.json app.js ./
RUN npm install
EXPOSE 3000
CMD ["node", "app.js"]
And finally, create a .gitignore file with the following:
node_modules
terraform/.terraform
Now, inside the project, run npm init -y to auto-generate a package.json file and then run npm install express.
That's it! You can run the app with node app.js and then go to http://localhost:3000. You should see "Hello World!".
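You can also verify the Dockerfile locally. A quick sketch, assuming Docker is installed (the image name my-app is arbitrary):
# Build the image from the Dockerfile in the current directory
docker build -t my-app .
# Run it, mapping the container's port 3000 to localhost:3000
docker run --rm -p 3000:3000 my-app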
Infrastructure Provisioning with Terraform
Before starting, I must mention that some examples in this guide will use the open-source "Cloud Posse" terraform modules. These modules simplify the provisioning of AWS services and the underlying resources. They are well-maintained and have very good documentation. You can find more about the modules and what properties each module supports on their GitHub.
Ok, without further ado, let's jump to the Terraform code.
First, create a folder named terraform in your project and add the following files: backend.tf, main.tf, outputs.tf, provider.tf, variables.tf, and versions.tf.
Let's set up the backend where Terraform will store its state. For this, you must first create an S3 bucket and give it any name, e.g., "tfstate-a123" (the name must be globally unique). This is the only resource we'll create without Terraform.
Tip: You should enable Bucket Versioning on the S3 bucket to allow for state recovery if something goes wrong.
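If you prefer the CLI over the console, here is a minimal sketch using the AWS CLI, assuming the example bucket name tfstate-a123 and the eu-west-1 region used throughout this guide:
# Create the state bucket (for us-east-1, omit --create-bucket-configuration)
aws s3api create-bucket \
  --bucket tfstate-a123 \
  --region eu-west-1 \
  --create-bucket-configuration LocationConstraint=eu-west-1

# Enable versioning to allow for state recovery
aws s3api put-bucket-versioning \
  --bucket tfstate-a123 \
  --versioning-configuration Status=Enabled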
Add the following to backend.tf:
# backend.tf
terraform {
  backend "s3" {
    bucket = "tfstate-a123" # name of the s3 bucket you created
    key = "production/terraform.tfstate"
    region = "eu-west-1" # change to your region
    encrypt = true
  }
}
Now let's add the rest of the Terraform configuration. Add the following to provider.tf:
# provider.tf
provider "aws" {
  region = var.region # value will be set later in variables.tf
  default_tags {
    tags = {
      ManagedBy = "Terraform"
    }
  }
}
This tells Terraform which provider to use and the region where the resources will be created. Here we also set the default_tags property to automatically tag all resources that support tagging.
Next, let's set the required Terraform version and providers. Add the following to versions.tf:
# versions.tf
terraform {
  required_version = ">= 1.0.0"
  required_providers {
    aws = {
      source = "hashicorp/aws"
      version = ">= 4.0"
    }
  }
}
Now let's set the variables that we'll use throughout the code. Add the following to variables.tf:
# variables.tf
variable "region" {
  type = string
  default = "eu-west-1"
  description = "AWS Region"
}

variable "namespace" {
  type = string
  default = "app"
  description = "Usually an abbreviation of your organization name, e.g. 'eg' or 'cp'"
}

variable "stage" {
  type = string
  default = "prod"
  description = "Usually used to indicate role, e.g. 'prod', 'staging', 'source', 'build', 'test', 'deploy', 'release'"
}

variable "name" {
  type = string
  default = "myproject"
  description = "Project name"
}

variable "image_tag" {
  type = string
  default = "latest"
  description = "Docker image tag"
}

variable "container_port_mappings" {
  type = list(object({
    containerPort = number
    hostPort = number
    protocol = string
  }))
  default = [
    {
      containerPort = 3000
      hostPort = 3000
      protocol = "tcp"
    }
  ]
  description = "The port mappings to configure for the container. This is a list of maps. Each map should contain \"containerPort\", \"hostPort\", and \"protocol\", where \"protocol\" is one of \"tcp\" or \"udp\". If using containers in a task with the awsvpc or host network mode, the hostPort can either be left blank or set to the same value as the containerPort"
}

variable "desired_count" {
  type = number
  description = "The number of instances of the task definition to place and keep running"
  default = 1
}
The values for namespace, stage, and name will dictate how the resources are named. Change them to your needs. With the default values in this example, the resources will be named or prefixed with "app-prod-myproject".
And now comes the fun part: provisioning the services needed to run a web application. Let's start by adding the configuration for the VPC and subnets.
Add the following to main.tf:
# main.tf
module "vpc" {
  source = "cloudposse/vpc/aws"
  version = "2.0.0"
  namespace = var.namespace
  stage = var.stage
  name = var.name
  ipv4_primary_cidr_block = "10.0.0.0/16"
}

module "subnets" {
  source = "cloudposse/dynamic-subnets/aws"
  version = "2.0.4"
  namespace = var.namespace
  stage = var.stage
  name = var.name
  availability_zones = ["eu-west-1a", "eu-west-1b", "eu-west-1c"] # change to your AZs
  vpc_id = module.vpc.vpc_id
  igw_id = [module.vpc.igw_id]
  ipv4_cidr_block = [module.vpc.vpc_cidr_block]
  nat_gateway_enabled = true
  max_nats = 1
}
Here, max_nats is set to 1 to save costs at the expense of availability, as NAT Gateways are fairly expensive. If you remove or comment out the max_nats property, a NAT Gateway will be created for each Availability Zone (the number depends on your region). Alternatively, you can set nat_gateway_enabled to false if you don't want to create any NAT Gateways, and then place resources that need access to services outside the VPC in the public subnets.
From AWS documentation:
A NAT gateway is a Network Address Translation (NAT) service. You can use a NAT gateway so that instances in a private subnet can connect to services outside your VPC but external services cannot initiate a connection with those instances.
With that said, if you choose not to use a NAT Gateway, don't worry. The Security Group attached to the ECS Service will only allow incoming requests from the ALB.
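To summarize the no-NAT-Gateway variant in one place, these are the only attributes that change, sketched here as comments (the ecs_alb_service_task module is defined later in this section):
# No-NAT-Gateway variant (sketch only):
#
# module "subnets":
#   nat_gateway_enabled = false
#
# module "ecs_alb_service_task" (defined later):
#   subnet_ids       = module.subnets.public_subnet_ids
#   assign_public_ip = true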
Now let's add the configuration for the Application Load Balancer (ALB). This will be the entry point to the application and, as the name suggests, it will automatically distribute the incoming traffic across multiple targets, i.e., the ECS Tasks (Docker containers).
# main.tf
module "alb" {
  source = "cloudposse/alb/aws"
  version = "1.7.0"
  namespace = var.namespace
  stage = var.stage
  name = var.name
  access_logs_enabled = false
  vpc_id = module.vpc.vpc_id
  ip_address_type = "ipv4"
  subnet_ids = module.subnets.public_subnet_ids
  security_group_ids = [module.vpc.vpc_default_security_group_id]
  # https_enabled = true
  # certificate_arn = aws_acm_certificate.cert.arn
  # http_redirect = true
  health_check_interval = 60
}
The https-related properties are commented out, as HTTPS is out of scope for this guide. There will be a follow-up article on how to create an SSL certificate with AWS Certificate Manager and enable HTTPS.
Next comes the ECS configuration. This will create the ECS Cluster, the ECR repository, the CloudWatch log group, a container definition (which will be used to create the ECS task definition), and the Service that will run and manage the ECS task definition.
# main.tf
# The ECS Cluster that the tasks will run in. It is referenced further down
# by the "ecs_alb_service_task" module as aws_ecs_cluster.ecs_cluster.arn.
resource "aws_ecs_cluster" "ecs_cluster" {
  name = "${var.namespace}-${var.stage}-${var.name}"
}

module "ecr" {
  source = "cloudposse/ecr/aws"
  version = "0.35.0"
  namespace = var.namespace
  stage = var.stage
  name = var.name
  max_image_count = 100
  protected_tags = ["latest"]
  image_tag_mutability = "MUTABLE"
  enable_lifecycle_policy = true
  # Whether to delete the repository even if it contains images
  force_delete = true
}
module "cloudwatch_logs" {
source = "cloudposse/cloudwatch-logs/aws"
version = "0.6.6"
namespace = var.namespace
stage = var.stage
name = var.name
retention_in_days = 7
}
module "container_definition" {
source = "cloudposse/ecs-container-definition/aws"
version = "0.58.1"
container_name = "${var.namespace}-${var.stage}-${var.name}"
container_image = "${module.ecr.repository_url}:${var.image_tag}"
container_memory = 512 # optional for FARGATE launch type
container_cpu = 256 # optional for FARGATE launch type
essential = true
port_mappings = var.container_port_mappings
# The environment variables to pass to the container.
environment = [
{
name = "ENV_NAME"
value = "ENV_VALUE"
},
]
# Pull secrets from AWS Parameter Store.
# "name" is the name of the env var.
# "valueFrom" is the name of the secret in PS.
secrets = [
# {
# name = "SECRET_ENV_NAME"
# valueFrom = "SECRET_ENV_NAME"
# },
]
log_configuration = {
logDriver = "awslogs"
options = {
"awslogs-region" = var.region
"awslogs-group" = module.cloudwatch_logs.log_group_name
"awslogs-stream-prefix" = var.name
}
secretOptions = null
}
}
module "ecs_alb_service_task" {
source = "cloudposse/ecs-alb-service-task/aws"
version = "0.66.4"
namespace = var.namespace
stage = var.stage
name = var.name
use_alb_security_group = true
alb_security_group = module.alb.security_group_id
container_definition_json = module.container_definition.json_map_encoded_list
ecs_cluster_arn = aws_ecs_cluster.ecs_cluster.arn
launch_type = "FARGATE"
vpc_id = module.vpc.vpc_id
security_group_ids = [module.vpc.vpc_default_security_group_id]
subnet_ids = module.subnets.private_subnet_ids # change to "module.subnets.public_subnet_ids" if "nat_gateway_enabled" is false
ignore_changes_task_definition = false
network_mode = "awsvpc"
assign_public_ip = false # change to true if "nat_gateway_enabled" is false
propagate_tags = "TASK_DEFINITION"
desired_count = var.desired_count
task_memory = 512
task_cpu = 256
force_new_deployment = true
container_port = var.container_port_mappings[0].containerPort
ecs_load_balancers = [{
container_name = "${var.namespace}-${var.stage}-${var.name}"
container_port = var.container_port_mappings[0].containerPort
elb_name = ""
target_group_arn = module.alb.default_target_group_arn
}]
}
There are two parameters that I want to mention here, ignore_changes_task_definition and force_new_deployment, as they control how a new version of the app is deployed live. With the first set to false, Terraform tracks changes to the ECS task definition, such as the container_image value, and creates a new task definition revision. With the second, we tell the Service to deploy the latest revision immediately.
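As an aside, if you later want to use the secrets block in the container definition above, a minimal sketch could look like this. The parameter name is purely illustrative, and the task execution role will also need permission (ssm:GetParameters) to read it:
# Example only: look up an existing SecureString parameter in the Parameter Store...
data "aws_ssm_parameter" "example_secret" {
  name = "/app-prod-myproject/SECRET_ENV_NAME"
}

# ...and reference it in the container definition's "secrets" list:
# secrets = [
#   {
#     name      = "SECRET_ENV_NAME"
#     valueFrom = data.aws_ssm_parameter.example_secret.arn
#   },
# ]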
And last, we will create the IAM Role that GitHub Actions will assume and the OpenID Connect (OIDC) identity provider. This will allow GitHub Actions to request temporary security credentials for access to AWS resources.
# main.tf
resource "aws_iam_openid_connect_provider" "github_actions_oidc" {
  url = "https://token.actions.githubusercontent.com"
  client_id_list = [
    "sts.amazonaws.com",
  ]
  thumbprint_list = [
    "6938fd4d98bab03faadb97b34396831e3780aea1"
  ]
  tags = {
    Namespace = var.namespace
    Stage = var.stage
    Name = var.name
  }
}

resource "aws_iam_role" "github_actions_role" {
  name = "github_actions"
  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect = "Allow",
        Principal = {
          Federated = aws_iam_openid_connect_provider.github_actions_oidc.arn
        },
        Action = "sts:AssumeRoleWithWebIdentity",
        Condition = {
          StringLike = {
            "token.actions.githubusercontent.com:sub" : "repo:EXAMPLE_ORG/REPO_NAME:*"
          },
          StringEquals = {
            "token.actions.githubusercontent.com:aud" : "sts.amazonaws.com"
          }
        }
      }
    ]
  })
  managed_policy_arns = ["arn:aws:iam::aws:policy/AdministratorAccess"]
  tags = {
    Namespace = var.namespace
    Stage = var.stage
    Name = var.name
  }
}
The StringLike condition will match and allow any branch, pull request merge branch, or environment from EXAMPLE_ORG/REPO_NAME to assume the github_actions IAM Role. If you want to limit it to a specific branch like main, replace the wildcard (*) with the branch reference, e.g. ref:refs/heads/main. You can read more here.
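For illustration, a condition scoped to the main branch only would look roughly like this (a sketch; with an exact match you can use StringEquals instead of StringLike):
Condition = {
  StringEquals = {
    "token.actions.githubusercontent.com:sub" : "repo:EXAMPLE_ORG/REPO_NAME:ref:refs/heads/main",
    "token.actions.githubusercontent.com:aud" : "sts.amazonaws.com"
  }
}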
For the purpose of this guide, we'll attach the AWS managed AdministratorAccess policy to the role. As the name suggests, the role will have administrator permissions.
It's important to mention that the policies attached to the role determine what GitHub Actions is allowed to do in AWS. As a security best practice, it's highly recommended to grant a role only the least privileges it needs.
And at last, add the following to outputs.tf:
# outputs.tf
output "alb_dns_name" {
  description = "DNS name of ALB"
  value = module.alb.alb_dns_name
}

output "github_actions_role_arn" {
  description = "The ARN of the role to be assumed by the GitHub Actions"
  value = aws_iam_role.github_actions_role.arn
}

output "ecr_repository_name" {
  description = "The name of the ECR Repository"
  value = module.ecr.repository_name
}
Now, inside the terraform folder, run terraform init. Then run terraform plan and check the output to see the resources that will be created. At the end you should see this:
...
Plan: 54 to add, 0 to change, 0 to destroy.
If everything looks good, go on and run terraform apply. After all the resources have been created, copy the alb_dns_name value from the outputs, paste it in the browser and... (drum roll), you get a "503 Service Temporarily Unavailable" error. This is expected, as the Task Definition is configured to use a docker image tag ("latest" or a "1.x.x" version) which does not exist in the ECR yet. That's where GitHub Actions comes into play. Before moving to the next section, copy the github_actions_role_arn value. We'll need it for the next step.
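If you ever need these values again, terraform output prints them without re-applying, for example:
# Run inside the terraform folder
terraform output alb_dns_name
terraform output github_actions_role_arn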
And that's it for the Infrastructure Provisioning part. Congrats for making it this far!
Automation with GitHub Actions
Now that we have the infrastructure ready, let's create the GitHub Actions workflow and automate the deployment process.
Create the folder structure .github/workflows at the root of your project. Inside the workflows folder, create a .yaml file, name it buildAndDeploy.yaml (or any name you want), and copy the following configuration.
# buildAndDeploy.yaml
name: 'Build and deploy with terraform'

on:
  push:
    tags:
      - '*'

env:
  AWS_REGION: eu-west-1 # Change to your region
  IAM_ROLE_ARN: arn:aws:iam::xxxxxxxxxxxx:role/github_actions # Change to github action role arn

permissions:
  id-token: write # This is required for requesting the JWT
  contents: read # This is required for actions/checkout

jobs:
  build:
    name: Build Docker Image
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1-node16
        with:
          role-to-assume: ${{ env.IAM_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build, tag, and push image to Amazon ECR
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          ECR_REPOSITORY: app-prod-myproject # namespace-stage-name
        run: |
          docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$GITHUB_REF_NAME -t $ECR_REGISTRY/$ECR_REPOSITORY:latest .
          docker image push -a $ECR_REGISTRY/$ECR_REPOSITORY

  terraform:
    name: Terraform Apply
    needs: build
    runs-on: ubuntu-latest
    environment: production
    defaults:
      run:
        shell: bash
    steps:
      - name: Check out code
        uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1-node16
        with:
          role-to-assume: ${{ env.IAM_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.4.1

      - name: Terraform Init
        working-directory: ./terraform
        run: terraform init

      - name: Terraform Plan
        id: plan
        working-directory: ./terraform
        run: terraform plan -var="image_tag=$GITHUB_REF_NAME"
        continue-on-error: true

      - name: Terraform Plan Status
        if: steps.plan.outcome == 'failure'
        run: exit 1

      - name: Terraform Apply
        working-directory: ./terraform
        run: terraform apply -var="image_tag=$GITHUB_REF_NAME" -auto-approve
I believe the pipeline jobs and the steps within them are quite self-explanatory, so I will not go into details, but I will mention the important parts. Replace the AWS_REGION and IAM_ROLE_ARN environment values with your AWS region and the github_actions_role_arn value that you copied earlier from the terraform output.
In the build job, under the steps, change the ECR_REPOSITORY value to match the values you set for namespace, stage, and name in the variables.tf file.
That's it! Commit everything, create a tag, and push to GitHub (see the example below). Then go to your repository on GitHub and click the "Actions" tab. If everything was set up correctly, you should see that the "Build and deploy with terraform" workflow has started or is starting.
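For reference, a typical sequence looks like this (the tag name 1.0.0 and the commit message are just examples):
git add .
git commit -m "Initial version"
git tag 1.0.0
# Push the branch (assuming it's named main) and the tag; the tag push triggers the workflow
git push origin main --tags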
After the deployment pipeline has finished successfully, try again to access the ALB DNS name in the browser. You should now see "Hello World!". If you still get a 503, give it a few minutes; it's possible that the Task has not started yet. Otherwise, for troubleshooting, go to ECS > Clusters > {cluster name} > Services in the AWS Console and check the following (a couple of helpful CLI commands are also sketched after this list):
At least 1 Task is in running state
Unhealthy targets in the Target Groups
The "Deployments and events" tab
The logs under the "Logs" tab or in CloudWatch
The application is starting successfully and responds within the 200-399 http code range
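If you prefer the command line, these AWS CLI commands can help with the checks above. The cluster, service, and log group names below assume the default "app-prod-myproject" naming from this guide; adjust them to your setup (you can confirm the exact names in the AWS Console):
# Check the service's deployments, events, and running task count
aws ecs describe-services --cluster app-prod-myproject --services app-prod-myproject
# Tail the application logs (requires AWS CLI v2)
aws logs tail app-prod-myproject --follow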
Awesome, you made it! I hope it was easy to follow along. I tried to keep it as simple and to the point as possible.
If, at this point, you want to delete the AWS resources created by Terraform, run terraform destroy.
Additionally, if you want your website to be accessible under your custom domain, go to your DNS provider, e.g. GoDaddy, and create a CNAME record pointing to the ALB DNS name. If your DNS is managed by Route 53, create an Alias (A) record pointing to the ALB (a Terraform sketch follows).
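If the hosted zone lives in Route 53 and you want to manage the record with Terraform too, a rough sketch (the zone and domain names are hypothetical; alb_zone_id is an output of the Cloud Posse ALB module):
# Example only: assumes a Route 53 hosted zone for example.com already exists
data "aws_route53_zone" "main" {
  name = "example.com"
}

resource "aws_route53_record" "app" {
  zone_id = data.aws_route53_zone.main.zone_id
  name    = "app.example.com"
  type    = "A"

  alias {
    name                   = module.alb.alb_dns_name
    zone_id                = module.alb.alb_zone_id
    evaluate_target_health = true
  }
}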
And finally, you can check the full code example in this repo: deploy-web-app-on-aws-with-terraform-and-github-actions
The following follow-up articles will come soon:
Create an SSL certificate with AWS Certificate Manager and enable HTTPS
Setting up AWS CloudFront CDN for your website
ECS autoscaling