This is the third and final post in our Drupal deployment series, looking at how we deploy updates to our Drupal sites nzpost.co.nz, stamps.nzpost.co.nz, and coins.nzpost.co.nz. In part one, senior Drupal developer Neil Bertram from Catalyst IT discussed the problem of Drupal deployments on large sites and outlined the foundations we depend on: source control and deployment with Debian packages. In part two, he talked about the modules we rely on, custom hooks, and how we deploy. This week we dream about seamless releases that can be run on demand, and look at the three biggest issues we face with deployment.
Dreaming of seamless releases
One of the major things we’d like to be able to do is release small and well-tested features to the site as they’re built, in line with our Agile methodology. At the moment, scheduling and releasing changes to the site in the early-morning release window can require weeks of preparation, tying up a lot of the team’s time. Putting out smaller, more frequent releases makes each release less risky, and means we can iterate faster to get things a little closer to ideal.
Unfortunately, this is an area where there are no solutions that we’re aware of in the Drupal universe. There is a longstanding assumption that a Drupal site will be placed into maintenance mode in order to perform any updates, and that the site will not be in use when update hooks are running. In some ways this makes sense, given the inability to roll back a database update, but there are still plenty of things that can in theory be released to production that don’t need to change the database in dangerous ways.
In the previous section I showed a list of Drush commands that we use to do a release. Almost every single one of them is something that can’t be safely run in a busy production environment, because they trigger general cache clears. We know from experience that a general cache clear during the day is likely to lead to a cache stampede (which Drupal core only just started defending itself against recently, and only then in certain cases). Fixing the blunt nature (and lack of locking) of cache clears in Drupal core is an ongoing effort that we hear may be better in Drupal 8, but for now they’re something we have to live with.
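To make the stampede problem concrete, here is a minimal sketch of the general idea of guarding a cache rebuild with a lock. It is written in Python purely for illustration and is not Drupal's code; the `GuardedCache` class and `slow_rebuild` function are hypothetical names invented for this example. After a clear, many requests arrive at once, but only the first one pays for the rebuild while the rest wait for its result.

```python
import threading
import time


class GuardedCache:
    """Hypothetical single-entry cache whose rebuild is protected by a lock."""

    def __init__(self, rebuild):
        self._rebuild = rebuild          # expensive function that produces the value
        self._value = None
        self._has_value = False
        self._lock = threading.Lock()
        self.rebuild_count = 0           # how many times the expensive rebuild ran

    def clear(self):
        with self._lock:
            self._has_value = False

    def get(self):
        # Fast path: the cache is warm, no locking needed.
        if self._has_value:
            return self._value
        # Slow path: only one caller at a time may rebuild.
        with self._lock:
            # Re-check, because another caller may have rebuilt while we waited.
            if not self._has_value:
                self._value = self._rebuild()
                self.rebuild_count += 1
                self._has_value = True
            return self._value


def slow_rebuild():
    time.sleep(0.05)                     # stand-in for an expensive rebuild
    return {"menu": "rebuilt"}


cache = GuardedCache(slow_rebuild)
cache.clear()                            # simulate a general cache clear under load

# Eight "requests" hit the cold cache at the same moment.
threads = [threading.Thread(target=cache.get) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("rebuilds performed:", cache.rebuild_count)
```

Without the lock and the second check inside it, each of the eight requests could run the expensive rebuild itself, which is the stampede described above.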
We have a couple of scenarios we can handle, however. If the changes to be released are only to straight code, without any update hooks or CSS/JS changes (which require Advagg to rebuild bundles, and hence a cache clear), then we can release the code without running any of the Drush commands, leaving the site online the whole time. We term this a “code-only” release, and we’ve been doing them when we can, especially to patch up small bugs out of cycle.
Another scenario we can now handle is a release that requires update hooks, but not any feature changes or anything else that requires a cache clear. This enables us to release hooks that add nodes, change roles, add menu links and some minor block manipulation while the site remains fully online and under load. Because such hooks don’t impact the site any more than the content editors would, they’re safe to run so long as they don’t clear caches. We have written a very short alternative to “drush updatedb” that does exactly the same thing, except it doesn’t do a general cache clear at the end. In Drupal 7 this wouldn’t be required, as updatedb does not do a general cache clear in that version anyway.
I believe we’re about as good as we can be at this stage in terms of releasing things predictably and whenever we want. Things like schema changes and module installation will probably always need to be done out of hours with the site offline, but eventually we hope to not need to do such operations often.
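The scenarios above amount to a small decision rule, which can be sketched as follows. This is an illustration in Python rather than anything from our actual tooling; the `Release` fields and the mode names returned by `release_mode` are hypothetical labels chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Release:
    """Hypothetical description of what a release contains."""
    has_update_hooks: bool = False
    has_css_js_changes: bool = False     # would need Advagg to rebuild bundles
    has_feature_changes: bool = False    # would need a feature revert
    has_schema_changes: bool = False
    installs_modules: bool = False


def release_mode(release: Release) -> str:
    """Pick the least disruptive way to ship a release."""
    # Schema changes and module installs need the site offline, out of hours.
    if release.has_schema_changes or release.installs_modules:
        return "offline"
    # Anything that needs a general cache clear waits for the release window.
    if release.has_css_js_changes or release.has_feature_changes:
        return "release-window"
    # Update hooks that behave like content edits can run under load,
    # provided the runner skips the general cache clear.
    if release.has_update_hooks:
        return "online-hooks"
    # Straight code with nothing else attached.
    return "code-only"


print(release_mode(Release()))
print(release_mode(Release(has_update_hooks=True)))
print(release_mode(Release(has_update_hooks=True, has_css_js_changes=True)))
print(release_mode(Release(installs_modules=True)))
```

The checks run from most to least disruptive, so a release that mixes several kinds of change is always treated as its riskiest part.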
Things to solve
As close as we are to where we want to be, there are some things that still irk us.
The cache clearing issue is a major one. We’d like to be able to do feature reverts in production without doing a blanket cache clear. Things like Views and Panels that are read directly from code shouldn’t require a cache clear to reload, but they seem to. The way Drush runs update hooks in an inner bootstrap, then performs a cache clear from an outer bootstrap that was loaded earlier, really messes things up when the update hook has installed a module. The cache clear can then often back out any changes related to that module (like block information), as the module still appears uninstalled to the session that’s rebuilding the cache.
It would also be nice if Advagg could more reliably rebuild its bundles on demand. We find that it’s quite lazy about discovering changes, and sometimes rebuilds bundles later, at an inconvenient time.
Some of these things are improved somewhat in Drupal 7, which has a lazy cache rebuild option that allows a process to rebuild a cache without first clearing it. This means that any ongoing request will continue to use the old cache until the new one is inserted. Definitely an improvement we could do with.
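The idea of rebuilding without clearing first can be sketched like this. It is a generic Python illustration of the rebuild-then-swap pattern, not Drupal 7's implementation; `SwapCache` and `builder` are hypothetical names.

```python
import threading


class SwapCache:
    """Hypothetical cache that keeps serving the old value during a rebuild."""

    def __init__(self, initial):
        self._value = initial
        self._rebuild_lock = threading.Lock()

    def get(self):
        # Readers never wait: they see whatever value is currently in place.
        return self._value

    def rebuild(self, builder):
        # Only one rebuild runs at a time.
        with self._rebuild_lock:
            new_value = builder()        # built off to the side; old value untouched
            self._value = new_value      # swapped in with a single assignment


cache = SwapCache("old")
seen_during_build = []


def builder():
    # Record what a request arriving mid-rebuild would be served.
    seen_during_build.append(cache.get())
    return "new"


cache.rebuild(builder)
print("served during rebuild:", seen_during_build)
print("served after rebuild:", cache.get())
```

Because the old value is never removed, there is no window in which requests find an empty cache and start rebuilding it themselves.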
Another problem we have is with menus. Drupal is fairly rudimentary in its ability to build up a menu programmatically. Since there is no support for UUID-based menu items, we can’t easily load in a whole new section of the site through an update script. We have ways of getting around this, but they rely on the script being written with a fairly good idea of what’s currently on the live site.
Any ideas for us?
Are you looking after a major Drupal site? We’re always interested in hearing how others solve the problems we’ve faced.
Thanks for reading!