TL;DR — We reduced deploy times from ten minutes to less than five seconds by replacing the standard Capistrano deploy tasks with a simpler, Git-based workflow and avoiding slow, unnecessary work.
At Code Climate, we try to minimize the time between when code is written and when it is live in production. When deploys slowed until they left enough time to make a pot of coffee, we invested in speeding them up.
What’s in a deploy?
At its core, deploying a modern Rails application consists of a few simple steps:
- Update the application code
- Run bundle install (if the Gemfile was updated)
- Precompile assets (if assets were updated)
- Restart the application processes (e.g. Unicorn)
If the deploy fails, the developer needs to be alerted immediately. If application processes fail to roll over to the latest code, we need to detect that.
For kicks, I wrote a Bash script to perform those steps and determine our theoretical lowest deploy time (just the time for SSH and running the minimum required commands). It took about three seconds when there were no Gemfile or asset changes. So I set out to reduce our ten-minute deploys to as close to that number as possible.
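A minimal sketch of what such a script might look like follows. It is not the original script: the function name, the origin/master ref, and the Unicorn PID file path are assumptions made for illustration, and the restart is shown as a plain kill rather than whatever your process manager provides.

```shell
#!/usr/bin/env bash
# Sketch only: names, paths and the restart command are assumptions.

# Print the shell command a deploy would run on the app server.
#   $1: app directory on the server
#   $2: "yes" if the Gemfile changed
#   $3: "yes" if assets changed
build_deploy_command() {
  local app_dir="$1"
  local gems_changed="$2"
  local assets_changed="$3"

  # Step 1: update the application code.
  local cmd="cd $app_dir && git fetch origin && git reset --hard origin/master"

  # Step 2: bundle only when the Gemfile changed.
  if [ "$gems_changed" = "yes" ]; then
    cmd="$cmd && bundle install"
  fi

  # Step 3: precompile only when assets changed.
  if [ "$assets_changed" = "yes" ]; then
    cmd="$cmd && bundle exec rake assets:precompile"
  fi

  # Step 4 (hypothetical restart): signal the Unicorn master via its PID file.
  cmd="$cmd && kill -USR2 \$(cat tmp/pids/unicorn.pid)"

  printf '%s\n' "$cmd"
}
```

With a hypothetical host name, the fast path would then be a single SSH round trip: ssh deploy@app1.example.com "$(build_deploy_command /data/codeclimate/app no no)".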
If you take anything away from this article, make it this: Capistrano is really two tools in one. It provides both:
- A runtime allowing you to run arbitrary commands against sets of remote servers via SSH
- A set of default tasks for deploying Rails applications
The runtime is incredibly useful. The default tasks, which originated back in 2005, come from a pre-Git era and are unnecessarily slow and complex for most Rails applications today.
By default, Capistrano creates a releases directory to store each deployed version of the code, which implicitly serves as a deployment history for rollback. The current symlink points to the active version of the code. For files that need to be shared across deployments (e.g. logs and PID files), Capistrano creates symlinks into the shared directory.
Git for faster, simpler deploys
We avoid the complexity of the shared directories and the slowness of copying our application code on every deploy by using Git. To begin, we clone our Git repo into what will become our deploy_to directory (in Capistrano speak):
git clone ssh://github.com/codeclimate/codeclimate.git /data/codeclimate/app
To update the code, a simple git fetch followed by git reset --hard will suffice. Local Git tags (on the app servers) work beautifully for tracking the deployment history that the releases directory did. Because the same checkout is used across deployments, there’s no need for shared symlinks. As a bonus, we use Git history to detect whether post-update work like bundling gems needs to be done (more on that later).
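The update-and-tag cycle described above can be sketched in a few lines of shell. The function names and the timestamped deploy-* tag format are assumptions for the example, not necessarily what our tasks use.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the tag format are assumptions.

# Point the checkout in $1 at the commit named by $2 (e.g. origin/master).
update_code() {
  local app_dir="$1"
  local ref="$2"
  git -C "$app_dir" fetch --quiet origin &&
    git -C "$app_dir" reset --quiet --hard "$ref"
}

# Record the currently checked-out commit as a local, timestamped tag.
# The tag stays on the app server; nothing is pushed.
tag_release() {
  local app_dir="$1"
  git -C "$app_dir" tag "deploy-$(date -u +%Y%m%d%H%M%S)"
}

# Print the most recent deploy tag, if any.
last_release() {
  local app_dir="$1"
  git -C "$app_dir" tag --list 'deploy-*' --sort=-refname | head -n 1
}
```

Because each tag names a commit, a rollback under this scheme is just another git reset --hard, this time to an earlier deploy tag instead of the latest upstream commit.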
Our new deploy process is heavily inspired by (read: stolen from) Recap, a fantastic set of modern Capistrano tasks intended to replace the defaults. We would have used Recap directly, but it only works on Ubuntu right now.
In the end, we extracted a small set of Capistrano tasks that work together to give us simple, extremely fast deploys:
- deploy:update_code resets the Git working directory to the latest code we want to deploy.
- bundle:install:if_changed checks whether either the Gemfile or Gemfile.lock was changed and, if so, invokes the bundle:install task. Most deploys don’t include Gemfile changes, so this saves some time.
- assets:precompile:if_changed works similarly to the above: it invokes the assets:precompile task if and only if there were changes that may necessitate asset updates. We look for changes to three paths, one of which is config. Asset precompilation is notoriously slow, and this saves us a lot of time when pushing out changes that only touch Ruby code or configuration.
- deploy:tag creates a Git tag on the app server for the release. We never push these tags upstream to GitHub.
- deploy:restart varies depending on your application server of choice. We use God to send a USR2 signal to our Unicorn master process.
- deploy:verify is the most complex part. The simplest approach would have Capistrano wait (with a timeout) until the Unicorn processes reboot. However, since Unicorn reboots take 30 seconds, I didn’t want to wait all that extra time just to confirm something that works 99% of the time. Using every ounce of Unix-fu I could muster, I cobbled together a solution using the at utility:
echo 'curl -sS http://127.0.0.1:3000/system/revision | grep "c7fe01a813" > /dev/null || echo "Expected SHA: c7fe01a813" | mail -s "Unicorn restart failed" firstname.lastname@example.org' | at now + 2 minutes
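The if_changed tasks in the list above boil down to asking Git whether certain paths differ between the previously deployed commit and the new one. Here is a rough sketch of that check in plain shell rather than as a Capistrano task; the function name and argument order are assumptions for the example.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and arguments are assumptions.

# Succeed (exit 0) if any of the given paths differ between two commits.
#   $1: app directory
#   $2: previously deployed commit
#   $3: newly deployed commit
#   remaining arguments: paths to check
paths_changed() {
  local app_dir="$1"
  local old_rev="$2"
  local new_rev="$3"
  shift 3
  # git diff --quiet exits 0 when there are no differences and 1 when there
  # are some, so the status is inverted here. A Git error also counts as
  # "changed", which errs on the side of doing the slow work.
  ! git -C "$app_dir" diff --quiet "$old_rev" "$new_rev" -- "$@"
}
```

A bundling step would then be guarded with something like: if paths_changed /data/codeclimate/app "$old" "$new" Gemfile Gemfile.lock; then bundle install; fi, where $old and $new are the commits before and after the update.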
Here’s where we ended up: (Note: I edited the output a bit for clarity.)
If your deploys are not as zippy as you’d like, consider whether a similar approach would work for you. The entire project took me about a day of upfront work, but it pays dividends each and every time we deploy.
- Recap — Discussed above. Highly recommend taking a look at the source, even if you don’t use it.
- Deployment Script Spring Cleaning from the GitHub blog — The first time I encountered the idea of deploying directly from a single Git working copy. I thought it was crazy at the time but have come around.