Backend Development
Integrating into Qualtrics: Docker Deployment
About a year ago, Qualtrics made its first acquisition, a small company called Statwing. I was part of the Statwing team and have spent the last year integrating the Statwing product into the Qualtrics platform. This post covers how our application deployment changed when we joined Qualtrics.
As a smaller startup, we made trade-offs in our architecture and deployment to move quickly, which left us with some hard-to-maintain hacks that required deep tribal knowledge. When we joined Qualtrics, we expected to migrate to a more mature deployment pipeline and braced for friction with some of the processes. Thankfully, we got the mature pipeline without much pain, because Qualtrics uses Docker, a tool for building and managing containers.
Deployment Model Before Qualtrics
As a small startup we wanted to quickly set up deployment and not have to worry about it once it was “working”. With that in mind, we deployed our application to AWS using a set of custom scripts.
First, we had a bootstrap script that saved an Amazon Machine Image (AMI) from which we would start instances. Whenever we needed to change packages, we updated them manually and created a new AMI.
Second, for general code deployment, we used the Fabric library in Python to run remote commands on our servers from our continuous integration server. The basic deployment used a hardcoded list of servers, pulled the latest version of our code onto each one, and then restarted the web server and background worker processes.
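In spirit, the deploy task looked something like this rough shell equivalent, shown as a dry run (the server names and commands are hypothetical; our actual tasks were written with Fabric):

```shell
#!/bin/sh
# Dry-run sketch of our old deploy: loop over a hardcoded server list and
# print the command the CI server would run over SSH on each machine.
# All names here are hypothetical stand-ins.
SERVERS="web1.example.com web2.example.com worker1.example.com"
for host in $SERVERS; do
  echo "ssh $host 'git pull origin master && sudo service app restart'"
done
```

The hardcoded `SERVERS` list is exactly the part that becomes a liability as the fleet grows.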
For application configuration, like a database URL, we injected environment variables, similarly to how Heroku uses them. We saved these configurations in encrypted files and uploaded them to our machines when they changed.
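A minimal sketch of that injection, with hypothetical values (in production the file was decrypted first):

```shell
# Write a KEY=VALUE config file (hypothetical contents), then export every
# variable it defines into the environment before the app starts.
cat > app.env <<'EOF'
DATABASE_URL=postgres://db.internal:5432/app
EOF
set -a        # auto-export all variables defined while sourcing
. ./app.env
set +a
echo "$DATABASE_URL"   # prints postgres://db.internal:5432/app
```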
As a small startup, our deployment and configuration was quick to set up and easy to automate for most cases. We had some manual steps for creating AMIs, but we didn't update them often. Given the small number of servers we were working with, hard-coding references to them in code was manageable. While this worked at our size, updating packages more often or maintaining more machines in a more complex configuration would have exposed bottlenecks, and those are exactly the bottlenecks we would hit trying to scale to Qualtrics' size.
Inside Qualtrics
Thankfully, Qualtrics uses Docker to create containers that can be deployed on the Qualtrics infrastructure. Transitioning our custom bootstrap scripts was pretty easy: we turned them into Dockerfiles by taking the lines in each script and making them `RUN` commands in Docker.
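For example, a bootstrap script that installed system packages and Python dependencies becomes a Dockerfile along these lines (the package names and paths are hypothetical):

```dockerfile
# Hypothetical sketch: each install step from the old bootstrap script
# becomes a RUN command, and the image build replaces the hand-saved AMI.
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y python python-pip
COPY requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt
COPY . /app
CMD ["python", "/app/server.py"]
```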
Moving to Docker builds helped us keep a consistent environment for packages and replaced our hacked-together bootstrap scripts; we also no longer needed to manually save AMIs. Configuration and code deployment were simplified because we no longer needed a custom script to update code and restart processes. These steps are now handled by Docker with a `pull` and a `restart`. Furthermore, every other team can quickly understand our process because it is standardized on Docker, with no custom scripts to learn.
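As an illustration, a deploy now boils down to a couple of standard Docker commands (the image and container names are hypothetical):

```shell
# Pull the newly built image and restart the container from it.
docker pull registry.example.com/statwing/app:latest
docker stop app && docker rm app
docker run -d --name app registry.example.com/statwing/app:latest
```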
Another benefit came in our development environment setup. Using Docker allowed us to use Docker Compose, a tool for defining and running multiple Docker containers. Now our environment can be brought up in a way that better matches production. On-boarding onto our team went from installing multiple dependencies to get the code running locally to just installing Docker and running `docker-compose -f compose.yaml up`.
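A compose.yaml for a setup like ours might look roughly like this (the service names, ports, and images are hypothetical):

```yaml
# Hypothetical sketch: the app plus a database, so a new developer only
# needs Docker to bring up an environment that resembles production.
version: "2"
services:
  app:
    build: .
    environment:
      DATABASE_URL: postgres://db:5432/app
    ports:
      - "8000:8000"
    depends_on:
      - db
  db:
    image: postgres:9.6
```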
Lastly, our previous deployment required a hard-coded list of servers, which we gladly no longer have to keep track of. Qualtrics uses Consul, a service discovery and configuration service, and we were able to leverage it when we moved our service to Qualtrics.
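With Consul, each instance registers itself with a small service definition instead of appearing in a hardcoded list. A minimal definition looks like this (the name, port, and health-check URL are hypothetical):

```json
{
  "service": {
    "name": "statwing-app",
    "port": 8000,
    "check": {
      "http": "http://localhost:8000/health",
      "interval": "10s"
    }
  }
}
```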
Any pitfalls?
Our Docker containers were initially very large, about 1.2 GB. There are a lot of posts on the topic of large containers, for example on Stack Overflow, on Red Hat's blog, and elsewhere.
A couple of tips to keep your containers small:
- Build on a small base image, like the Alpine Linux container
- Avoid unnecessary image layers; i.e., commands that build and then remove artifacts should run in a single `RUN` command
- Build dependencies in a separate container and copy them to your final container
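All three tips can be combined in a multi-stage Dockerfile (available since Docker 17.05; the names and paths here are hypothetical): dependencies are built in a throwaway stage, and only the resulting artifacts are copied onto a small Alpine base.

```dockerfile
# Stage 1: build dependencies in a separate, discarded image.
FROM python:2.7-alpine AS build
COPY requirements.txt .
RUN pip install --prefix=/install -r requirements.txt

# Stage 2: small final image containing only the artifacts and app code.
FROM python:2.7-alpine
COPY --from=build /install /usr/local
COPY . /app
CMD ["python", "/app/server.py"]
```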
These tips decreased `docker pull` times, as the amount of data pulled in worst case went from more than 200MB to about 10MB. Less data means quicker Docker pulls means a quicker deploy.
Another lesson was to ensure that data needing to persist beyond the Docker container was mounted to an appropriate disk; otherwise, that data is lost when the container is restarted. We solved this by adding the appropriate mounts as part of our deploy configuration.
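In `docker run` terms, the fix is a volume mount (the paths and image name are hypothetical):

```shell
# Mount host storage into the container; without -v, anything written to
# /var/lib/app disappears when the container is recreated.
docker run -d --name app \
  -v /mnt/data/app:/var/lib/app \
  registry.example.com/statwing/app:latest
```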
We also learned that running multiple processes in a container must be managed carefully. If you want to run multiple processes, set up a process supervisor that passes kill signals on to its child processes so they can clean themselves up. Otherwise, it's possible to create many zombie processes and cause unexpected behavior. Here is a good post describing the zombie problem on Docker more generally.
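The heart of such a supervisor is a signal trap that forwards termination to the children and then waits for them. A minimal shell sketch, with `sleep` standing in for real worker processes:

```shell
#!/bin/sh
# Minimal supervisor sketch: start two children, forward SIGTERM to them,
# and wait so they can clean up instead of lingering as zombies.
sleep 30 & CHILD1=$!
sleep 30 & CHILD2=$!

forward() { kill -TERM "$CHILD1" "$CHILD2" 2>/dev/null; }
trap forward TERM INT

# Simulate `docker stop` sending TERM to this script (normally PID 1):
( sleep 1; kill -TERM $$ ) &

wait "$CHILD1" "$CHILD2"
STATUS=done
echo "children exited cleanly"
```

In practice, a purpose-built init such as tini (what `docker run --init` uses) handles signal forwarding and zombie reaping more robustly than a hand-rolled script.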
Conclusion
Docker helped us move to a more mature build and deployment process with fewer hacks, and the move was relatively painless. With it, we were able to move quickly on building the Statwing product within Qualtrics. The goal of Statwing is to make statistical analysis efficient and delightful, and at Qualtrics there is much to do to make that happen. If this is interesting to you, send me a note at johnl@qualtrics.com and come join us!