Two years ago I challenged our Engineering and TechOps teams to get to a place where our entire Pulse Platform would be “one-button” deployable. The platform consists of dozens of technologies and dependencies, can ingest 3.5B internet-based documents a month, and can conduct surveys and SMS engagements at massive scale. Pulse allows us to provide customers with a single platform for Social Listening, Content Discovery, and Population Engagement. Our customer base is wide, and we have seen the technology deployed to support dozens of different use cases, from countering human trafficking to conducting medical research in Africa. Because of the nature of our customers, many wanted their own dedicated infrastructure to support their operations. My goal for the “one-button” challenge was to make it possible for an order to come in and for the TechOps team to have the system up and running in a couple of hours. At a recent Sprint Demo / Retro session, my colleague David Cavitt demonstrated a 30-minute deployment and configuration of our entire platform. This two-orders-of-magnitude reduction in deployment time was a key piece of our overall goal of reducing the cost of the platform by 50%. I have included a couple of excerpts from David’s paper below, and as always, I am happy to forward the entire paper to you and/or connect you with David so he can walk you through his system.
“When we first knew we needed a faster method of deployment into the cloud, the first question we asked was, “What tools are out there that can do this?” and the next question was, “Can any of the tools we have in-house do this for us?” We were already using Ansible to deploy our Technology Stack to our cloud environments, so that seemed like a logical fit from a deployment methodology standpoint. However, understanding how the code would fit into the deployment framework became a much larger question.”
“The AWS Services we needed to automate, in terms of deployments and configurations, included VPC, EC2, Route53, S3, ElastiCache, RDS, IAM, and Certificate Manager. Even though many of those services were supported within the Ansible cloud modules, there were limitations: the versions of boto that the modules relied on lagged behind the latest version of the AWS API, and we didn’t want to have to rely on Ansible to update its modules’ codebase in order to interact with the AWS services every time AWS updated its API.”
“A second reason we went with CloudFormation as the platform for our rapid deployment was reduced code maintenance: the Ansible AWS modules would require far more coding, in terms of the number of Ansible plays required (as you’ll see), than having the CloudFormation template files do most of the legwork for us in manipulating the AWS services.”
“With that being said, we did end up using a hybrid solution that combines CloudFormation templates with the AWS CLI to accomplish certain tasks (e.g. automating the approval of SSL certificates through the Certificate Manager service). Using the AWS CLI also gave us the flexibility to do things like get the status of a CloudFormation deployment before moving on to the next Ansible play.”
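To make the “check the status before moving on” idea concrete, here is a minimal Python sketch of that kind of gate. It is an illustration only and is not taken from David’s paper: the function names, the polling interval, the timeout, and the choice of which statuses count as success or failure are all assumptions. The only real interface it leans on is the AWS CLI command `aws cloudformation describe-stacks` with its `--stack-name`, `--query`, and `--output` options.

```python
import subprocess
import time
from typing import Callable

# Assumption: these two statuses mean "safe to continue with the next play".
SUCCESS_STATES = {"CREATE_COMPLETE", "UPDATE_COMPLETE"}


def fetch_stack_status(stack_name: str) -> str:
    """Ask the AWS CLI for the current status of one CloudFormation stack."""
    result = subprocess.run(
        [
            "aws", "cloudformation", "describe-stacks",
            "--stack-name", stack_name,
            "--query", "Stacks[0].StackStatus",
            "--output", "text",
        ],
        check=True,          # raise if the CLI call itself fails
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def wait_for_stack(
    stack_name: str,
    fetch: Callable[[str], str] = fetch_stack_status,
    poll_seconds: int = 15,       # hypothetical default
    timeout_seconds: int = 1800,  # hypothetical default (30 minutes)
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the stack succeeds; raise on failure, rollback, or timeout."""
    waited = 0
    while True:
        status = fetch(stack_name)
        if status in SUCCESS_STATES:
            return status
        # Treat any failed or rolled-back state as fatal (an assumption).
        if status.endswith("_FAILED") or "ROLLBACK" in status:
            raise RuntimeError(f"Stack {stack_name} ended in {status}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Stack {stack_name} still in {status}")
        sleep(poll_seconds)
        waited += poll_seconds
```

A play could run a script built around `wait_for_stack` between deployment steps, so that an exception (and the resulting non-zero exit code) stops the run before later plays touch infrastructure that is not ready. The `fetch` and `sleep` parameters exist only so the logic can be exercised without an AWS account.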
“The process for figuring out a deployment solution centered around doing the least work possible (from a code standpoint) while utilizing the tools we already have in place. Utilizing CloudFormation hands the legwork of standing up the infrastructure over to AWS, and utilizing Ansible allows us to customize the deployments and keep variables encrypted with ansible-vault. This also means we don’t have to worry much about the CloudFormation templates needing to change (in terms of syntax) unless we want to scale the infrastructure up or down, so we can focus on the configuration management (aka Ansible) aspect of the solution when looking to optimize the code.”
“The last thing I want to point out about the way we’ve implemented this solution is that there’s probably a lot of room for improvement and optimization in task management and Jinja2 template functionality as well. As improvements get made to this solution, I’ll post a second in-depth blog post on what we change and why. Hopefully others can benefit from the hurdles and challenges I had to overcome working through this implementation. Cheers! David Cavitt”