This blog is a static website generated using the Middleman framework. It’s hosted in an S3 bucket and served via the CloudFront CDN, which are both from Amazon Web Services. The S3 integration uses the middleman-s3-sync extension. The middleman-cloudfront extension is used to invalidate URLs from the CDN cache.
The site is stored in an S3 bucket named rossfairbanks.com. It’s possible to serve websites directly from S3, and the end-point for my bucket would be http://rossfairbanks.com.s3.amazonaws.com/. However, serving directly from S3 means all users are served from a single AWS region. Error rates from S3 are also often higher than from CloudFront.
Serving users from CloudFront means the files are downloaded from the nearest edge location in its worldwide network. Since the site is static, all of it can be served by CloudFront, reducing page load times and increasing reliability.
A CloudFront end-point is called a distribution. Using the AWS web console it’s easy to make an S3 bucket the origin of a distribution.
Here you can see the rest of the configuration for the distribution.
Below is an extract from my Middleman config.rb file. All of the AWS configuration is loaded from a separate file called aws.yml. I can then exclude this file from Git, since it includes my AWS access keys.
require 'yaml'

# AWS settings live in aws.yml, which is kept out of Git.
aws_config = YAML.load_file('aws.yml')

# Sync the build directory with the S3 bucket.
activate :s3_sync do |s3_sync|
  s3_sync.bucket                = aws_config['s3_bucket']
  s3_sync.region                = aws_config['aws_region']
  s3_sync.aws_access_key_id     = aws_config['access_key_id']
  s3_sync.aws_secret_access_key = aws_config['secret_access_key']
  s3_sync.delete                = true
  s3_sync.after_build           = false
end

# Invalidate cached URLs on the CloudFront distribution.
activate :cloudfront do |cf|
  cf.access_key_id     = aws_config['access_key_id']
  cf.secret_access_key = aws_config['secret_access_key']
  cf.distribution_id   = aws_config['cloud_front_dist_id']
  cf.filter            = /\.(html|xml)$/ # dots escaped so only real extensions match
  cf.after_build       = false
end
The s3_sync block shows the configuration for middleman-s3-sync. I used middleman-s3-sync instead of middleman-sync because it only updates changed files rather than all files in the bucket.
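To illustrate the idea of only touching changed files, here is a minimal sketch that compares content digests for local and remote files and works out what needs creating, updating or deleting. It is not how middleman-s3-sync is actually implemented. The `sync_plan` helper and the sample file lists are hypothetical, and comparing MD5 digests is an assumption made for the example.

```ruby
require 'digest'

# Hypothetical helper, not part of middleman-s3-sync.
# Both arguments are hashes of path => MD5 hex digest.
def sync_plan(local, remote)
  {
    # Present locally but not in the bucket.
    create: (local.keys - remote.keys).sort,
    # Present in both places, but the content differs.
    update: local.keys.select { |path| remote.key?(path) && remote[path] != local[path] }.sort,
    # Present in the bucket but no longer built locally.
    delete: (remote.keys - local.keys).sort
  }
end

# Made-up data standing in for the build directory and the bucket.
local = {
  'index.html'    => Digest::MD5.hexdigest('new home page'),
  'about.html'    => Digest::MD5.hexdigest('about'),
  'tags/aws.html' => Digest::MD5.hexdigest('aws tag page')
}
remote = {
  'index.html'    => Digest::MD5.hexdigest('old home page'),
  'about.html'    => Digest::MD5.hexdigest('about'),
  'old-post.html' => Digest::MD5.hexdigest('removed post')
}

plan = sync_plan(local, remote)
puts plan
```

In this sketch about.html is left alone because its digest is unchanged, which is the saving compared with re-uploading every file.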
The next block is for middleman-cloudfront. With a CDN the site may be cached at multiple edge locations. When I update the site this extension calls the CloudFront API to remove my URLs from their caches. This API call can take up to 10 minutes and there are also usage costs for invalidating URLs. So the filter specifies that only the HTML pages and XML feeds are removed. To remove other assets such as JS or CSS I use the AWS web console manually.
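As a rough sketch of what a filter like this does, the snippet below applies a regular expression to a list of built files and keeps only the HTML pages and XML feeds. The `paths_to_invalidate` helper and the file list are hypothetical, and the pattern is an assumed form of the filter with the dot escaped and the match anchored to the end of the path. The extension's own matching logic may differ.

```ruby
# Assumed filter: paths ending in .html or .xml.
FILTER = /\.(html|xml)\z/

# Hypothetical helper: keep the files the filter matches and turn them
# into absolute URL paths, the form used when invalidating a CDN cache.
def paths_to_invalidate(files, filter)
  files.select { |file| file.match?(filter) }.map { |file| "/#{file}" }
end

# Made-up build output.
built = [
  'index.html',
  'feed.xml',
  'img/blog/cloudfront_config.png',
  'css/site.css',
  'js/xhtml-helper.js'
]

selected = paths_to_invalidate(built, FILTER)
puts selected
```

Images, CSS and JS are skipped here, which keeps the number of invalidated paths (and so the cost) down.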
Here is the aws.yml config file.
access_key_id: 'ACCESS_KEY_ID'
secret_access_key: 'SECRET_ACCESS_KEY'
aws_region: 'us-east-1'
s3_bucket: 'rossfairbanks.com'
cloud_front_dist_id: 'E20I7TV6EHSZNF'
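Because this file is not in Git, a fresh checkout can easily end up with a missing or incomplete copy. One possible safeguard, sketched below, is to check that every key read by config.rb is present before the extensions are activated. The `missing_keys` helper is hypothetical, and the YAML is inlined with placeholder values so the example runs on its own.

```ruby
require 'yaml'

# Keys that the config.rb extract reads from aws.yml.
REQUIRED_KEYS = %w[
  access_key_id secret_access_key aws_region s3_bucket cloud_front_dist_id
].freeze

# Hypothetical helper: return the required keys that are absent or blank.
def missing_keys(config)
  REQUIRED_KEYS.select { |key| config[key].to_s.strip.empty? }
end

# Inline YAML standing in for aws.yml; cloud_front_dist_id is left out
# on purpose to show the check catching it.
sample = YAML.safe_load(<<~YML)
  access_key_id: 'ACCESS_KEY_ID'
  secret_access_key: 'SECRET_ACCESS_KEY'
  aws_region: 'us-east-1'
  s3_bucket: 'example-bucket'
YML

missing = missing_keys(sample)
puts "aws.yml is missing: #{missing.join(', ')}" unless missing.empty?
```

In a real config.rb the hash would come from the file rather than an inline string, and raising an error would probably be more useful than printing a message.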
Here are the commands for rebuilding the site, syncing with S3 and removing the URLs from the CloudFront cache.
ross@dev5:~/src/git/rossfairbanks.com$ bundle exec middleman build
      create  build/img/blog/cloudfront_s3_origin.png
      create  build/img/blog/cloudfront_config.png
      update  build/feed.xml
      create  build/2013/05/08/serving-middleman-blog-via-s3-and-cloudfront.html
      update  build/sitemap.xml
   identical  build/about.html
   identical  build/2013/04/02/blog-development.html
   identical  build/robots.txt
      update  build/index.html
      update  build/archive.html
      update  build/tags/middleman.html
      update  build/tags/aws.html

ross@dev5:~/src/git/rossfairbanks.com$ bundle exec middleman s3_sync
Gathering local files.
Gathering remote files from rossfairbanks.com
== LiveReload is waiting for a browser to connect
Determine files to add to rossfairbanks.com.
Determine which files to delete from rossfairbanks.com
Determine which local files are newer than their S3 counterparts
..................................................................
Determine which remaining files are actually different than their S3 counterpart.
..................................................................
Ready to apply updates to rossfairbanks.com.
Creating img/blog/cloudfront_s3_origin.png
Creating img/blog/cloudfront_config.png
Creating 2013/05/08/serving-middleman-blog-via-s3-and-cloudfront.html
Creating tags/aws.html
Updating archive.html
Updating sitemap.xml
Updating index.html
Updating feed.xml
Updating about.html
Updating tags/middleman.html

ross@dev5:~/src/git/rossfairbanks.com$ bundle exec middleman invalidate
## Invalidating files on CloudFront
== LiveReload is waiting for a browser to connect
== LiveReload is waiting for a browser to connect
== LiveReload is waiting for a browser to connect
Please wait while Cloudfront is reloading 13 paths, it might take up to 10 minutes
== LiveReload is waiting for a browser to connect
The final piece of config is the DNS entries. Like most CDNs, CloudFront works by using a CNAME record to point a custom domain at a CloudFront distribution. Unfortunately, it isn’t possible to point an apex or naked domain (e.g. rossfairbanks.com) at a distribution, because a CNAME record can’t be placed on the apex of a domain.
This makes my blog look a bit retro with its www prefix ;) but for me the benefits of using a CDN outweigh this. It also means the final step is to use the free wwwizer service to redirect any requests for rossfairbanks.com to www.rossfairbanks.com, which is served by CloudFront.
www.rossfairbanks.com.  86400  IN  CNAME  dxs22zjkgjc47.cloudfront.net.
rossfairbanks.com.      86400  IN  A      22.214.171.124