Developing on AWS, by Software Team Lead and AO’er Jon

An introduction to some small lessons learned developing on AWS

Throughout the last few months, my team has been developing a cloud-native data pipeline and aggregation solution. Here we look to document some of the smaller issues we encountered while learning what the cloud is and how we approached architecting the system we’re building.

Our solution comprises the following components:

– Apache Kafka cluster and Zookeeper ensemble
– Kafka Connect, Kafka Rest Proxy and Schema Registry run via ECS
– Internal software developed in dotnet core 2.0 run via ECS
– Lambda functions and S3 hosted static sites
– MongoDB as a materialised view of data captured via Kafka

Whilst there have inevitably been some large lessons learned using the components above, sometimes the small things make a difference. I’m going to try to capture some of these small things in a series of blog posts covering:

– Containers
– Serverless framework
– Terraform
– Logging
– Deployments

Each of these will be approached in the context of developing for AWS, where we host all of our services.

Background

To frame the lessons learned, here is some background on my team and the context that we’re operating in. This should give some idea of the challenges we faced and what we learned along the way.

We were put together as a new team within AO around 10 months ago, with me as Team Lead and three team members. Our remit was to put together what is essentially a data pipeline to aid a number of key initiatives within the business. Each of us came from a more traditional C# development background, very much focused on the development lifecycle only. Each member of the team was also new to AO, which meant we had a lot of discovery to do, both learning and defining our domain.

One of our immediate challenges was that we were responsible for the full lifecycle of our applications. This meant we needed to consider not only the application itself but also its infrastructure, deployment, testing and support throughout its life in production.

We also had to source data from a variety of systems, including Microsoft SQL Server, user click stream, SQS events and RabbitMQ, amongst others. This meant we had to look well outside our current thinking to find a suitable, scalable solution to the challenges presented.

Approach

Because we were starting with a blank canvas, we decided to start as we meant to go on, with as much of a DevOps approach as possible. This meant owning everything from start to finish, using approaches such as Infrastructure as Code, immutable servers, Serverless architecture where appropriate, codified pipelines, continuous deployment and testing in production, among others.

This also aligned with the team taking full ownership, creating an environment of trust and experimentation, accepting that failure happens and thinking of the system as a whole.

It’s probably worth noting some of the great resources we have used along the way. The DevOps Handbook by Gene Kim, Jez Humble, Patrick Debois and John Willis, along with The Phoenix Project by Gene Kim, Kevin Behr and George Spafford, a novel describing the journey Bill Palmer takes his IT organisation on towards a more DevOps approach, gave us a great framework on which to pin our principles and approach.

We track our progress using Kanban. Initially we based our process on Agile Project Management with Kanban by Eric Brechner. We then improved our board with something similar to the Arrow Kanban board, which helped us to prioritise our backlog. More recently, Making Work Visible by Dominica DeGrandis gave us some great insights into the importance of what we track, and some techniques to help deal with things that stop work being completed.

Finally, team health is really important to us. We have created a set of team values, in addition to the company values, which help us to shape how we want our team to behave. We also capture team feel daily on a scale of going well, could be better, not so good, which helps to show the general feeling of the team at a glance. A few helpful books for building a great team are Patrick Lencioni’s The Five Dysfunctions of a Team, Radical Candor by Kim Scott and Permission to Screw Up by Kristen Hadeed.

In Conclusion

Throughout the series, we will be documenting some of the things that tripped us up. Nothing is too small, and the purpose is to show that even small improvements can lead to huge gains.

As the series progresses, I’ll include links to each post here:

– Lessons learned with Serverless
– Lessons learned with Terraform
– Lessons learned with Containers
– Lessons learned with AWS and logging
– Lessons learned deploying to AWS