On Building a Blog - CloudFormation

My previous post about starting a blog finished with an outline of my plans for building it, but didn’t go into any detail. Now that I’ve got everything up and running, I’m ready to write about the steps I went through, the dead ends I ran into, and what I ended up with. This post is about building the infrastructure behind it.

Getting Started

I mentioned that I planned to host the blog on AWS, and my favourite thing about AWS is CloudFormation. CloudFormation allows you to define all the pieces that will make up your cloud environment. In my case, the environment consists of:

  • An S3 Bucket for storing the files that make up the site
  • A CloudFront Distribution for serving the files over HTTPS, with the extra benefits of a CDN
  • A TLS Certificate (also known as an SSL certificate) to enable HTTPS
  • A Route53 Hosted Zone for managing the DNS records for my domain (cbax.tech)
  • A couple of Route53 records to get to my site via my domain
  • An IAM User for deployment access

With this knowledge in hand, I created a new template and started laying it out. There are lots of tools out there for building a CloudFormation template, including a great browser-based editor within the AWS Console itself, but I’m old-school and just enjoy writing it manually. You’ve got two language options: JSON and YAML. I prefer YAML as I find it nice and succinct. Others I know prefer the explicit structure you get from JSON.

One of the things I find great about defining your resources in CloudFormation is that it highlights dependencies before you’ve deployed anything at all. I started with a skeleton, which is not dissimilar to the above list of dot points:

AWSTemplateFormatVersion: "2010-09-09"
Description: |-
  Template for creating all the AWS resources for cbax.tech
Resources:
  SiteBucket:
    Type: "AWS::S3::Bucket"
    Properties: 
      ...
  SiteDistribution:
    Type: "AWS::CloudFront::Distribution"
    Properties:
      ...
  SiteCertificate:
    Type: "AWS::CertificateManager::Certificate"
    Properties:
      ...
  SiteHostedZone:
    Type: "AWS::Route53::HostedZone"
    Properties:
      ...
  SiteRecordSetRoot:
    Type: "AWS::Route53::RecordSet"
    Properties:
      ...
  SiteRecordSetWWW:
    Type: "AWS::Route53::RecordSet"
    Properties:
      ...
  SiteDeployUser:
    Type: "AWS::IAM::User"
    Properties:
      ...

With the skeleton laid out, I started to flesh out the properties, starting with the things I expected to be simple and then moving to the more complex…and pretty quickly found a problem. The S3 bucket was fine, but the CloudFront Distribution was going to depend on both the bucket and the TLS certificate.

Certificate Dramas

In theory, the TLS certificate should have been easy, but unfortunately CloudFormation doesn’t provide a way to fully automate the certificate creation. For these certificates to be secure, you need to prove that you control the related domain when you ask for one. The standard ways of proving it are email (click this link we just emailed to your domain), and DNS (put this value in a specific DNS record on your domain). My hope was that I could use DNS validation, and get CloudFormation to create the validation records for me. Alas - my hope was dashed.

My next thought was that I’d just go ahead and create a pending certificate, and attach it to CloudFront anyway. This also wasn’t possible. CloudFront requires the certificate to be ‘valid’ before it’s attached.

It’s worth pointing out before I continue that there are other ways I could have addressed this problem I ran into. I’m fairly sure that with a bit more work in CloudFormation I could have ‘paused’ the creation of the resources that needed the certificate while I went off and created the requisite records. In the interest of expediency I instead chose to throw the toys out of the pram.

I decided that the fastest way to resolve the issue was to just pull the creation of the certificate out of the CloudFormation template. This meant that I’d also need to pull out the Hosted Zone, as I needed that in place to create the DNS records for the certificate validation. Given there are other resources that depend on the certificate and hosted zone, I had to add a couple of parameters to allow me to pass the requisite values into the template:

AWSTemplateFormatVersion: "2010-09-09"
Description: |-
  Template for creating all the AWS resources for cbax.tech
Parameters:
  SiteCertificateArn:
    Type: String
    Description: Provide an ARN for a certificate from ACM. Other certificates not supported.
  SiteHostedZoneId:
    Type: String
    Description: Provide the Hosted Zone ID for cbax.tech
Resources:
  ...

Lovely. So I created my Hosted Zone manually in the AWS Console, then provisioned and validated the certificate. Now I just need to pass in the ARN for the certificate and the Zone ID for the Hosted Zone when I create the Stack. Not too bad.
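To make the parameter plumbing concrete, here is a rough sketch of how the two parameters might be consumed elsewhere in the template. It is not the actual template: the www alias and the record name are assumptions based on the resource names in the skeleton, and most required properties are left out.

```yaml
Resources:
  SiteDistribution:
    Type: "AWS::CloudFront::Distribution"
    Properties:
      DistributionConfig:
        # Assumed aliases, inferred from the two record sets in the skeleton
        Aliases:
          - cbax.tech
          - www.cbax.tech
        ViewerCertificate:
          # The certificate created outside the stack, passed in as a parameter
          AcmCertificateArn: !Ref SiteCertificateArn
          SslSupportMethod: sni-only
        # Origins, DefaultCacheBehavior and Enabled omitted from this sketch
  SiteRecordSetRoot:
    Type: "AWS::Route53::RecordSet"
    Properties:
      # The hosted zone created outside the stack, passed in as a parameter
      HostedZoneId: !Ref SiteHostedZoneId
      Name: cbax.tech.
      Type: A
      AliasTarget:
        DNSName: !GetAtt SiteDistribution.DomainName
        # Fixed hosted zone ID that AWS documents for CloudFront alias targets
        HostedZoneId: Z2FDTNDATAQYW2
```

The useful part is that !Ref works on parameters the same way it does on resources, so the rest of the template doesn't care whether the certificate and hosted zone live inside the stack or outside it.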

Keys and Secrets

The next bump in the road came with setting up the IAM User for me to use in my build pipeline when uploading files into my S3 Bucket. I wasn’t going to be running said pipeline inside my AWS Account (in which case I’d just use an EC2 Role), so I’d need to get the Access Key and Secret Key out somehow. The Secret Key is like a password, and you only get to see that when you create your Access Key - you can’t retrieve it later. Fortunately the fix was easy to find, and I just had to tell CloudFormation to spit the Secret Key out as an Output:

  ...
Outputs:
  DeployKey:
    Description: Access Key ID for deploying assets to the S3 Bucket
    Value: !Ref SiteDeployKey
  DeploySecret:
    Description: Secret Key for deploying assets to the S3 Bucket
    Value: !GetAtt SiteDeployKey.SecretAccessKey

This is less secure, as it means the secret remains visible in the Stack’s Outputs to anyone who can view the Stack, but it won’t be shared with anyone in this case. If I were doing this in a shared or organisational account, I’d need to take a different approach.
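The Outputs above reference a SiteDeployKey resource that isn't in the skeleton. A minimal sketch of what it and the user might look like follows, assuming the deploy pipeline only needs to list, upload and delete objects in the bucket. The policy name and the exact set of actions are hypothetical, not taken from the real template.

```yaml
Resources:
  SiteDeployUser:
    Type: "AWS::IAM::User"
    Properties:
      Policies:
        - PolicyName: deploy-to-site-bucket   # hypothetical name
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              # Object-level actions apply to the keys inside the bucket
              - Effect: Allow
                Action:
                  - "s3:PutObject"
                  - "s3:DeleteObject"
                Resource: !Sub "${SiteBucket.Arn}/*"
              # Listing applies to the bucket itself
              - Effect: Allow
                Action: "s3:ListBucket"
                Resource: !GetAtt SiteBucket.Arn
  SiteDeployKey:
    Type: "AWS::IAM::AccessKey"
    Properties:
      UserName: !Ref SiteDeployUser
```

With an AWS::IAM::AccessKey resource, !Ref returns the Access Key ID and !GetAtt with SecretAccessKey returns the secret, which is what the two Outputs rely on.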

Origin Access Identities

The last issue I encountered was one that I didn’t discover until I was trying out my new site. In reading up on the S3 and CloudFront integration while building the template, I’d discovered this cool thing called an “Origin Access Identity”, which is like a special IAM User that doesn’t come under IAM - it instead lives in the CloudFront domain - and can only be used to grant CloudFront access to S3 origins (an origin being the place CloudFront sources its files from). This was cool.

Previously, when I’d served a website via S3, I’d configured the Bucket as an S3 Website, which meant that the Bucket had to be Public. So rather than only interacting with my website the way I intended, via my HTTPS CloudFront distribution, you could also go straight to the S3 Bucket via an HTTP connection. Not a big deal, but it wasn’t as pure as I’d like.

By using an OAI I could avoid this. I’d keep the Bucket private, and grant access only to the CloudFront OAI, and then you’d have to access the site via CloudFront. So I went ahead and configured it all, and went to visit the site. Everything looked great. I tried going into my first blog post. ACCESS DENIED. Wat.
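For reference, the OAI arrangement described above might look roughly like the following. This is a sketch of the general pattern rather than the configuration I actually deployed; the logical names SiteOriginAccessIdentity and SiteBucketPolicy are made up for illustration.

```yaml
Resources:
  SiteOriginAccessIdentity:
    Type: "AWS::CloudFront::CloudFrontOriginAccessIdentity"
    Properties:
      CloudFrontOriginAccessIdentityConfig:
        Comment: Access identity for the site bucket
  SiteBucketPolicy:
    Type: "AWS::S3::BucketPolicy"
    Properties:
      Bucket: !Ref SiteBucket
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          # Only the OAI may read objects; the bucket itself stays private
          - Effect: Allow
            Action: "s3:GetObject"
            Resource: !Sub "${SiteBucket.Arn}/*"
            Principal:
              CanonicalUser: !GetAtt SiteOriginAccessIdentity.S3CanonicalUserId
```

The distribution's origin would then point at the bucket using an S3OriginConfig whose OriginAccessIdentity is set to origin-access-identity/cloudfront/ followed by the OAI's ID.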

After a lot of frustrated investigation, I found that the problem was a combination of how Hugo works and how S3 works. When I visit a page like this one, the file is actually called index.html and it lives inside folders (/2019/03/on-building-a-blog-cloudformation/), and Hugo just uses the folder location to make the path in the address bar nice and pretty. This has worked on the web for a while now, and web servers know to serve the index.html when asked for a folder. S3 doesn’t…unless you configure your bucket as a Website. Without that, it treats the folder path as an object key, finds nothing there, and denies the request. 😧

So again, I capitulated to the technology, disposed of my glorious plans for using an OAI, and configured the bucket as a website endpoint. This created a new drama for me: the CloudFront origin needs the bare hostname of the website endpoint, and there’s apparently no attribute exposed on an S3 Bucket resource within CloudFormation for getting that with !GetAtt (the WebsiteURL attribute includes the http:// prefix). It wasn’t too hard to roll my own though, which looks like this:

!Join [ "", [ !Ref SiteBucket, ".s3-website-", !Ref "AWS::Region", ".amazonaws.com" ] ]
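To show where that expression ends up, here is a hedged sketch of the bucket's website configuration and the matching CloudFront origin. The origin ID and cache settings are placeholders of my own, and the certificate, aliases and the bucket policy that makes the objects publicly readable are left out.

```yaml
Resources:
  SiteBucket:
    Type: "AWS::S3::Bucket"
    Properties:
      WebsiteConfiguration:
        # Lets S3 serve index.html when a folder path is requested
        IndexDocument: index.html
  SiteDistribution:
    Type: "AWS::CloudFront::Distribution"
    Properties:
      DistributionConfig:
        Enabled: true
        Origins:
          - Id: SiteBucketWebsite   # hypothetical origin ID
            DomainName: !Join [ "", [ !Ref SiteBucket, ".s3-website-", !Ref "AWS::Region", ".amazonaws.com" ] ]
            CustomOriginConfig:
              # S3 website endpoints only speak plain HTTP
              OriginProtocolPolicy: http-only
        DefaultCacheBehavior:
          TargetOriginId: SiteBucketWebsite
          ViewerProtocolPolicy: redirect-to-https
          ForwardedValues:
            QueryString: false
```

One thing to watch: some regions use a dot rather than a dash after s3-website in the endpoint hostname, so it's worth checking the join against the documented endpoint for whichever region the bucket lives in.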

That sums up my adventures with CloudFormation this time around. Hopefully it helps save you (or my future-self) some time.

Thanks for reading. Toodles!