Craft Conf and Force12

10 May 2015   docker, aws, commoncrawl, wikireverse, force12

Two weeks ago I was leaving Budapest after a fun week with two of my colleagues. We were attending Craft Conf, a two-day, multi-track conference on software craftsmanship.

In Budapest I was also working with Anne on Force12, which we’re launching this Thursday (14th May). Force12 is a container scheduler that provides auto scaling. On Thursday we’re launching a demo of Force12 running on EC2 Container Service.

Here are my five favourite talks, all of which are available to watch online. I’ve linked to the videos, which auto-start!

DevOps for everyone - Katherine Daniels

Katherine does Ops at Etsy and she gave a great talk about how to structure Dev and Ops teams in a way that builds empathy between them. One of her ideas was doing pair ops as well as pair programming.

We’re doing this for Force12. Anne is developing our scheduler in Go and I’m developing the demo in JavaScript and Ruby and deploying it to AWS. Anne has written a huge amount of C code over the years, whereas I have no C background, so she has learnt Go and written our scheduler. I’ve got a lot more experience using AWS, so by doing pair ops and taking advantage of the deployment benefits of containers, it’s been easy to deploy the scheduler to AWS.

I also liked some of Katherine’s other ideas, like assigning an Ops engineer to each Dev team. These ideas are more useful to larger teams than ours, but I liked that Craft had talks of interest to both small and very large teams.

Automating the Modern Datacenter – Mitchell Hashimoto

Mitchell’s talk was one of the ones I was most looking forward to, and it lived up to my expectations. It was a run-through of Terraform and Consul.

Terraform implements Infrastructure as Code and will build entire environments from scratch on multiple platforms. I can see us using Terraform for Force12. At the moment we’re building our demo on AWS because it’s the platform we’re most familiar with. However I can see us wanting to build demos on other public clouds.
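As a rough sketch of what infrastructure as code looks like in Terraform (the region, AMI ID and tag below are placeholders I’ve made up, not our real Force12 configuration), a minimal config describing a single EC2 instance could be:

```hcl
# Hypothetical minimal Terraform config: one EC2 instance for a demo.
# The region, AMI ID and Name tag are placeholders, not real values.
provider "aws" {
  region = "us-east-1"
}

resource "aws_instance" "demo" {
  ami           = "ami-00000000"  # placeholder AMI ID
  instance_type = "t2.micro"

  tags {
    Name = "force12-demo"
  }
}
```

Running `terraform apply` against a file like this creates the instance, and `terraform destroy` tears the whole environment down again, which is what makes rebuilding demos from scratch on different platforms so appealing.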

Consul is best known as a service discovery tool, and it works by using DNS. So you can have a DNS entry like staging.database.service.consul and Consul will resolve it to the currently running database. Using DNS is a genius idea, as it means that any application that can resolve DNS names can be integrated with Consul.

Another Consul feature I didn’t know about is its key/value store. For Force12 we currently use DynamoDB to store our metadata. As we support more platforms I can see us moving this data to Consul.
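To make that migration idea concrete, here’s a hedged sketch of reading a value back from Consul’s KV HTTP API. The key name and value are made up, and I’ve inlined a sample response rather than calling a live agent (a real call would be a GET to the local agent, e.g. http://127.0.0.1:8500/v1/kv/force12/demo/tasks). The detail worth knowing is that Consul base64-encodes the Value field in its JSON responses, so clients have to decode it:

```python
import base64
import json

# Sample of the JSON a Consul agent returns for GET /v1/kv/<key>:
# an array of entries, with each "Value" base64-encoded.
# The key "force12/demo/tasks" and value "5" are hypothetical.
sample_response = json.dumps([{
    "Key": "force12/demo/tasks",
    "Value": base64.b64encode(b"5").decode("ascii"),
}])

# Decode the response as a client would.
entries = json.loads(sample_response)
value = base64.b64decode(entries[0]["Value"]).decode("utf-8")
print(value)  # the stored task count, "5"
```

The same HTTP endpoint with a PUT writes a value, so swapping DynamoDB reads and writes for Consul KV calls would be a fairly mechanical change.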

Beyond Features - Dan North

Dan had a busy day as he and Jessica Kerr gave the keynote in the morning and he closed the first day with this talk. The part that resonated with me the most was that we should be doing features, discovery and kaizen and we should be doing some of each at all times. We’re trying to apply this on Force12 and it’s an unusual project in that at the moment we’re doing mostly discovery rather than features.

We’re launching our demo on Thursday but we won’t be open sourcing the Force12 scheduler yet. Doing auto scaling with containers is a new area so we’re starting with a simple demo. Over time we’ll make the demo more complex and blog about what we’re learning whilst we do this.

Kaizen is about change for the better. We’re working on how we communicate what Force12 is and why we think auto scaling with containers is a good idea. One of our goals with Force12 is to start a discussion about auto scaling with containers.

Multi Host Docker Orchestration – Lajos Papp

Lajos is from SequenceIQ, who were recently acquired by Hortonworks. He gave a talk on CloudBreak, which automates building Hadoop clusters using Docker containers.

Although Lajos had some problems with his demo, it was a great hands-on example of how to use Docker to implement a complex system like Hadoop. One of his biggest challenges was bootstrapping the cluster. CloudBreak uses Consul for service discovery, but it uses Swarm first to install the Consul agent on each host and get the agents talking to each other.

In a lot of ways this was an ideal talk for me. As well as being about Docker it was on Hadoop, which I’ve used a lot on another of my projects, WikiReverse. It’s a reverse link graph of links to Wikipedia articles, built using Common Crawl data. To generate the data I parsed 3.2 billion web pages for $64 USD.

This is very cheap, and because I used EC2 spot instances the server capacity only cost $18 of that; using Elastic MapReduce to manage my cluster cost $39. I’m going to try running the WikiReverse code on CloudBreak to see if using containers can reduce the server costs below $18.

The Perfect Storm Intensifies - Mac Devine

I didn’t see this last talk live, but I heard a lot of good things about it, so I watched it last week. The talk is on how the convergence of big data, the public cloud and the Internet of Things (IoT) is causing a major increase in potential, but also in complexity.

IoT, like containers and micro services, is in serious danger of getting over-hyped. However, Mac gives some excellent real-world examples, including why the Weather Channel is putting sensors on airliners.

A term I came across for the first time that I liked was Data Gravity. Public cloud and containers are providing a big increase in processing capacity but IoT is adding even more sensors and data. So a major design factor becomes how to move and store data between sensors and micro services.


For me the two biggest themes from the conference were micro services and “are containers ready for production?” For medium and large organisations I think micro services are definitely the right choice. They enable autonomous teams and loosely coupled systems.

However, it is very easy to implement a micro services architecture badly. Also, a lot of my clients are early-stage startups, so in some cases they may be better off building a monolith now and splitting it into micro services later as they scale up their business. For containers, a key benefit is that they enable a micro services architecture due to the higher server density they provide.

On whether containers are ready for production, I really liked a comment from Mitchell. He mentioned a recent study about the Fortune 500 migration from physical to virtual machines, which mainly started in 2005 and is due to finish in 2015. For me the current situation with containers feels a lot like it was with VMs in 2005.

Containers are definitely ready for dev and test environments, and it makes sense to migrate these environments first. Running containers in production is also possible, but there are still problems to be fixed before they can be used widely. For Force12, I think this means that once those issues are resolved, auto scaling will be a great use for containers, as containers can be scaled up and down far more rapidly than VMs.
