The problem with “traditional” release management
I say “traditional” release management for lack of a better term. By that I mean having separate phases (and teams) for development, system test, user acceptance test (UAT) and finally production. The problem is that it takes too long from the moment a developer commits code until that code is eventually deployed to production. In my experience, this is mostly due to delays in the hand-offs from one phase to the next. And as we all know thanks to Lean Thinking, this is waste.
Why you (as a developer) should care
Deployments to these different environments are usually stressful, because they carry some risk of taking down the system you are deploying to. And the closer you get to production, the higher the penalty for doing so becomes. Because of this risk, deployments usually take place when potential problems would have the least impact on the users of these environments, which in general means outside business hours. And last but not least, deployments tend to be boring and tedious tasks. Or did you ever get excited about doing a release? Yeah, that’s what I thought.
Why you (as a manager) should care
Of course, you also care about the factors mentioned above, because you want your developers to be as happy as possible, right? But furthermore, you especially dislike the risks associated with deployments, because you have to factor them into all sorts of calculations. Additionally, all these delays between phases mean that your time to market for new features and bug fixes is much longer than it needs to be, which means you are losing money. Constantly.
The obvious solution
It’s easy, right? The answer is continuous deployment. You fully automate your testing and deployment and you deploy straight to production after every commit. Sounds great. The problem is: most organisations are not ready for full test automation or are just not willing to give up manual testing. This may be for security or legal reasons, potential loss of revenue and/or reputation, or something else. Bottom line: pure continuous deployment into production without any manual testing is not widely accepted. Period. I am not saying this goal is not worth aiming for. Trust me, I am all for it. I just don’t see it happening in the near future in most companies. In my opinion, the only way to practice continuous deployment into production is by having sophisticated real-time alerting combined with a system with built-in resilience. This would allow you to automatically detect and back out a failed deployment without an interruption of the system as a whole. Have a read of this blog post about how the guys at kaChing managed to achieve this.
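To make the detect-and-back-out idea a bit more concrete, here is a minimal Python sketch. It is not how kaChing or anyone else actually does it. `activate` and `is_healthy` are hypothetical callbacks standing in for your own deployment script and alerting system, and the number of health checks is an arbitrary assumption.

```python
from typing import Callable


def deploy_with_rollback(
    current: str,
    candidate: str,
    activate: Callable[[str], None],   # hypothetical: makes a version live
    is_healthy: Callable[[], bool],    # hypothetical: asks the alerting system
    checks: int = 3,                   # assumed number of health probes
) -> str:
    """Activate the candidate and back it out if any health check fails.

    Returns the version that is live afterwards.
    """
    activate(candidate)
    for _ in range(checks):
        if not is_healthy():
            # Back out to the last known-good version.
            activate(current)
            return current
    return candidate
```

A real setup would also need the resilience part: the old version has to keep serving traffic while the new one is being checked, which this sketch leaves out.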
A step in the right direction
Don’t worry, not all is lost. Even if continuous deployment is not achievable for you right now, that doesn’t mean you can’t improve your situation at all. You can take the first steps towards continuous deployment by decreasing the batch size of deployments and automating the deployment and hand-offs between the environments.
- Decreasing the batch size can be achieved easily by allowing automatic deployments into your first test environment, but only, of course, after a successful build that runs all your automated tests.
- Automating the deployment may require some tailored scripts to suit your system. However, if you are using Hudson and your deployment artifact is a WAR or EAR file, you can use the deploy plugin, which is based on Cargo.
- Automating the hand-offs could be achieved by using a central release management dashboard that is used by developers, testers and managers.
So far I haven’t found any Hudson plugin that would provide this. What I have in mind is something like the following:
- Deployments can be configured to happen automatically or to require manual approval.
- When a deployment to an environment finishes, a configured set of people gets notified.
- The different environments where the application is running can be accessed directly from the dashboard
- When a build is approved and all previous builds have been approved as well (there might be dependencies), it is deployed to the next environment.
- When rejecting a build:
  - A tester raises one or more issues for the build. The issue tracking system is tightly integrated into the dashboard and can be reached via a link that pre-fills the relevant fields about the current build and project. Raising an issue automatically rejects the build.
  - The developer who committed the change that triggered the rejected build gets notified.
  - The developer fixes the issue (after creating a failing test to expose it, of course).
  - The developer commits the fix with a reference to the issue in the commit comment. This triggers a new automatic build, plus an automatic update of the issue and a notification to the tester who raised it (like the Trac SVN Policies Plugin).
  - When the tester approves the new build, the old build gets automatically approved as well if this was the last outstanding issue for the old build.
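The approval and rejection rules above can be sketched as a small Python model. This is only an illustration of the bookkeeping, not an existing Hudson plugin or API. The `Dashboard` and `Build` classes, their method names and the issue ids are all hypothetical, and notifications and the actual deployment are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set


@dataclass
class Build:
    number: int
    fixes: Set[str] = field(default_factory=set)        # issue ids named in the commit comment
    open_issues: Set[str] = field(default_factory=set)  # issues raised against this build
    approved: bool = False


class Dashboard:
    """Hypothetical approval tracker for a single environment."""

    def __init__(self) -> None:
        self.builds: Dict[int, Build] = {}

    def add_build(self, number: int, fixes: Iterable[str] = ()) -> None:
        self.builds[number] = Build(number, set(fixes))

    def reject(self, number: int, issue: str) -> None:
        # Raising an issue against a build rejects it.
        build = self.builds[number]
        build.open_issues.add(issue)
        build.approved = False

    def approve(self, number: int) -> None:
        build = self.builds[number]
        build.approved = True
        # Approving a fix closes the issues it references on older builds;
        # an older build with no outstanding issues left is approved too.
        for older in self.builds.values():
            if older.number < number and older.open_issues & build.fixes:
                older.open_issues -= build.fixes
                if not older.open_issues:
                    older.approved = True

    def deployable(self) -> Optional[int]:
        """Newest build for which it and every earlier build are approved."""
        latest = None
        for number in sorted(self.builds):
            if not self.builds[number].approved:
                break
            latest = number
        return latest
```

For example, if build 1 is rejected with one issue and build 3 references that issue in its commit comment, approving builds 2 and 3 makes build 1 approved as well, and `deployable()` then reports build 3 as ready for the next environment.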
In my mind, the dashboard looks somewhat like this:
To do pure continuous deployment, you could then configure the dashboard with only one environment (production) with automatic deployment. You can, however, choose to set up as many intermediate environments as you like and also require manual approval of deployments to any of them.
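As a rough illustration of such a configuration, the following Python sketch models a pipeline as an ordered list of environments, each flagged as automatic or manual. The `Environment` type, the `deployed_to` function and the environment names are assumptions made for this example; no such dashboard configuration format exists that I know of.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Environment:
    name: str
    automatic: bool  # True: deploy without waiting for manual approval


def deployed_to(pipeline: List[Environment], approved: Set[str]) -> List[str]:
    """Return the environments a successful build reaches, in order.

    `approved` holds the names of manual environments for which
    somebody has approved the deployment of this build.
    """
    reached = []
    for env in pipeline:
        if not (env.automatic or env.name in approved):
            break  # wait here until somebody approves
        reached.append(env.name)
    return reached


# Pure continuous deployment: one environment, fully automatic.
PURE = [Environment("production", automatic=True)]

# A more cautious setup with manual gates before UAT and production.
STAGED = [
    Environment("system-test", automatic=True),
    Environment("uat", automatic=False),
    Environment("production", automatic=False),
]
```

With `PURE`, every successful build goes straight to production. With `STAGED`, a build stops in system test until the deployment to UAT is approved.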
Does anyone know if a Hudson plugin with this functionality exists or is in the making?
With this in place, I believe deployments can be non-events, in the sense of being effortless and stress-free. This does not mean you should stop celebrating your releases, though.