Serving a Middleman blog via S3 and CloudFront

08 May 2013   aws, middleman

This blog is a static website generated using the Middleman framework. It’s hosted in an S3 bucket and served via the CloudFront CDN, which are both from Amazon Web Services. The S3 integration uses the middleman-s3-sync extension. The middleman-cloudfront extension is used to invalidate URLs from the CDN cache.

The site is stored in an S3 bucket. It’s possible to serve websites directly from S3 using the bucket’s website end-point. However, serving directly from S3 means all users are served from a single AWS region. Error rates from S3 are also often higher than from CloudFront.

Serving users from CloudFront means the files are downloaded from the nearest edge location in Amazon’s worldwide network. Since it’s static, this entire site can be served by CloudFront, reducing page load times and increasing reliability.

A CloudFront end-point is called a distribution. Using the AWS web console, it’s easy to set an S3 bucket as the origin of a distribution.

CloudFront distribution with S3 origin

Here you can see the rest of the configuration for the distribution.

CloudFront distribution config

Below is an extract from my Middleman config.rb file. All of the AWS configuration is loaded from a separate file called aws.yml. I can then exclude this file from Git since it includes my AWS access keys.

require 'yaml'

# Load the AWS settings from aws.yml, which is kept out of Git.
aws_config = YAML.load_file('aws.yml')

activate :s3_sync do |s3_sync|
  s3_sync.bucket                = aws_config['s3_bucket']
  s3_sync.region                = aws_config['aws_region']
  s3_sync.aws_access_key_id     = aws_config['access_key_id']
  s3_sync.aws_secret_access_key = aws_config['secret_access_key']
  s3_sync.delete                = true   # remove files from S3 that no longer exist locally
  s3_sync.after_build           = false  # sync only when run explicitly
end

activate :cloudfront do |cf|
  cf.access_key_id              = aws_config['access_key_id']
  cf.secret_access_key          = aws_config['secret_access_key']
  cf.distribution_id            = aws_config['cloud_front_dist_id']
  cf.filter                     = /\.(html|xml)$/  # only invalidate pages and feeds
  cf.after_build                = false            # invalidate only when run explicitly
end

The s3_sync block shows the configuration for middleman-s3-sync. I used middleman-s3-sync instead of middleman-sync because it only updates changed files rather than all files in the bucket.
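To make the "only updates changed files" idea concrete, here is a minimal sketch of how such a comparison could work. It is not the extension's actual code: the plan_sync helper and the sample file names are hypothetical, and it assumes the remote side exposes an MD5 digest for each object.

```ruby
require 'digest'

# Hypothetical helper: compare local file bodies against the MD5 digests
# known for the remote copies and work out the smallest set of changes.
def plan_sync(local_files, remote_digests)
  local_digests = local_files.transform_values { |body| Digest::MD5.hexdigest(body) }

  changed = local_digests.select do |path, md5|
    remote_digests.key?(path) && remote_digests[path] != md5
  end

  {
    create: (local_digests.keys - remote_digests.keys).sort,  # only exists locally
    update: changed.keys.sort,                                # exists on both sides, content differs
    delete: (remote_digests.keys - local_digests.keys).sort   # only exists remotely
  }
end

# Made-up sample data standing in for the build directory and the bucket.
local = {
  'index.html'    => '<h1>new</h1>',
  'about.html'    => '<h1>about</h1>',
  'tags/aws.html' => '<h1>aws</h1>'
}
remote = {
  'index.html'    => Digest::MD5.hexdigest('<h1>old</h1>'),
  'about.html'    => Digest::MD5.hexdigest('<h1>about</h1>'),
  'old-post.html' => Digest::MD5.hexdigest('<h1>gone</h1>')
}

plan = plan_sync(local, remote)
puts plan.inspect
```

Unchanged files such as about.html appear in none of the three lists, so they are never uploaded again.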

The next block is for middleman-cloudfront. With a CDN the site may be cached at multiple edge locations. When I update the site, this extension calls the CloudFront API to remove my URLs from their caches. The invalidation can take up to 10 minutes and there are also usage costs for invalidating URLs, so the filter restricts it to the HTML pages and XML feeds. To remove other assets such as JS or CSS I use the AWS web console manually.
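As a rough illustration of what the filter does, the sketch below applies a regular expression to a list of built files and keeps only the matching paths. The FILTER constant assumes a pattern anchored to the .html and .xml extensions, and the file list is just a sample taken from this site's build output. How the extension itself walks the build directory is not shown here.

```ruby
# Assumed filter: only paths ending in .html or .xml.
FILTER = /\.(html|xml)\z/

# Sample of files from the build directory.
built_files = %w[
  index.html
  feed.xml
  sitemap.xml
  robots.txt
  img/blog/cloudfront_config.png
  2013/05/08/serving-middleman-blog-via-s3-and-cloudfront.html
]

# Keep the matching files and turn them into URL paths.
to_invalidate = built_files.grep(FILTER).map { |path| "/#{path}" }
puts to_invalidate
```

The image and robots.txt are left out, which keeps the number of invalidated paths small.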

Here is the aws.yml config file.

access_key_id: 'ACCESS_KEY_ID'
secret_access_key: 'SECRET_ACCESS_KEY'
aws_region: 'us-east-1'
s3_bucket: ''
cloud_front_dist_id: 'E20I7TV6EHSZNF'
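Since a blank or missing value in this file only shows up later as a failed sync, a small check when the config is loaded can help. This is a sketch rather than part of my config.rb: the missing_aws_keys helper and the REQUIRED_KEYS list are hypothetical names, and the YAML is inlined here so the example is self-contained.

```ruby
require 'yaml'

# Settings that config.rb reads from aws.yml.
REQUIRED_KEYS = %w[
  access_key_id
  secret_access_key
  aws_region
  s3_bucket
  cloud_front_dist_id
].freeze

# Hypothetical helper: return the keys that are absent or blank.
def missing_aws_keys(config)
  REQUIRED_KEYS.select { |key| config[key].to_s.strip.empty? }
end

# Inline copy of the file above; normally this would be YAML.load_file('aws.yml').
sample = YAML.safe_load(<<~YML)
  access_key_id: 'ACCESS_KEY_ID'
  secret_access_key: 'SECRET_ACCESS_KEY'
  aws_region: 'us-east-1'
  s3_bucket: ''
  cloud_front_dist_id: 'E20I7TV6EHSZNF'
YML

missing = missing_aws_keys(sample)
puts "Missing settings: #{missing.join(', ')}" unless missing.empty?
```

With the bucket name left empty, as in the listing above, the check reports s3_bucket.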

Here are the commands for rebuilding the site, syncing with S3 and removing the URLs from the CloudFront cache.

ross@dev5:~/src/git/$ bundle exec middleman build
      create  build/img/blog/cloudfront_s3_origin.png
      create  build/img/blog/cloudfront_config.png
      update  build/feed.xml
      create  build/2013/05/08/serving-middleman-blog-via-s3-and-cloudfront.html
      update  build/sitemap.xml
   identical  build/about.html
   identical  build/2013/04/02/blog-development.html
   identical  build/robots.txt
      update  build/index.html
      update  build/archive.html
      update  build/tags/middleman.html
      update  build/tags/aws.html
ross@dev5:~/src/git/$ bundle exec middleman s3_sync
Gathering local files.
Gathering remote files from
== LiveReload is waiting for a browser to connect
Determine files to add to
Determine which files to delete from
Determine which local files are newer than their S3 counterparts

Determine which remaining files are actually different than their S3 counterpart.

Ready to apply updates to
Creating img/blog/cloudfront_s3_origin.png
Creating img/blog/cloudfront_config.png
Creating 2013/05/08/serving-middleman-blog-via-s3-and-cloudfront.html
Creating tags/aws.html
Updating archive.html
Updating sitemap.xml
Updating index.html
Updating feed.xml
Updating about.html
Updating tags/middleman.html

ross@dev5:~/src/git/$ bundle exec middleman invalidate
## Invalidating files on CloudFront
== LiveReload is waiting for a browser to connect
== LiveReload is waiting for a browser to connect
== LiveReload is waiting for a browser to connect
Please wait while Cloudfront is reloading 13 paths, it might take up to 10 minutes
== LiveReload is waiting for a browser to connect
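
The three commands always run in the same order, so they could be wrapped in a small script or Rake task. The sketch below is one way to do that, not something from my repository: the deploy method and DEPLOY_STEPS constant are hypothetical names, and the runner argument exists only so the steps can be tried without touching AWS.

```ruby
# The three steps from the session above, in the order they must run.
DEPLOY_STEPS = [
  'bundle exec middleman build',
  'bundle exec middleman s3_sync',
  'bundle exec middleman invalidate'
].freeze

# Runs each step through the runner and stops at the first failure.
# The default runner is Kernel#system, which returns true on success.
def deploy(runner = ->(cmd) { system(cmd) })
  DEPLOY_STEPS.each do |cmd|
    puts "==> #{cmd}"
    return false unless runner.call(cmd)
  end
  true
end

# Usage: call deploy from a script or Rake task, for example
#   exit(1) unless deploy
```

Stopping at the first failure means a broken build is never synced to S3 or invalidated.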

The final piece of config is the DNS entries. Like most CDNs, CloudFront works by using a CNAME record to point a custom domain at a CloudFront distribution. Unfortunately it isn’t possible to point an apex or naked domain at a distribution this way.

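This makes my blog look a bit retro with its www prefix ;) but for me the benefits of using a CDN outweigh this. It also means the final step is to use the free wwwizer service to redirect any requests for the naked domain on to the www host served by CloudFront. The zone therefore has two records, both with a TTL of 86400 seconds: a CNAME record for the www host pointing at the distribution, and an A record for the naked domain pointing at wwwizer.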
