Continuous Integration (CI) – the missing pieces
Continuous Integration (CI) has been around for a while now, and is part of most decent development environments these days. It is certainly mature enough that there is no longer any excuse for not using it. Yet there are a few features that I haven’t found in any CI tool I have used so far, but that I would like to see implemented.
At the top of my most-wanted list is resource awareness. A resource-aware CI tool would run builds depending on which resources are available at the time. I have my builds set up to run different categories of tests, and I configured my CI tool so that:
- my unit test build runs on every commit to the version control system
- integration tests, which require external systems such as databases or the file system, run in a subsequent build that is triggered after the unit test build completes successfully
- functional tests, which require a deployed system, run on a set schedule, e.g. nightly (if there were any changes since the last build)
This setup does not follow any hard rules, and yours most likely differs from mine depending on your own needs. What I am fairly certain of, however, is that you had to tell your CI tool of choice when to run your tests, and that you probably have not adjusted these settings since you found a configuration that seems to work for you. This might be fine in your eyes, but I think we can do better. What if your CI tool decided when to run tests depending on which resources are available? Why not run your functional tests right now if some of your processors are idle? Why not postpone slower integration tests if more commit builds are waiting in the queue to have their unit tests run? Hudson took a step in this direction by automatically building multi-module Maven projects in parallel where possible, but there is certainly a lot of room for improvement.
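To make the idea a little more concrete, here is a minimal sketch of how such a scheduler might choose what to run. It is not taken from any existing CI tool: the `Build` class, the `PRIORITY` table and the `pick_builds` function are names I made up for illustration, and the greedy strategy is just one possible policy.

```python
from dataclasses import dataclass


@dataclass
class Build:
    name: str
    kind: str          # "unit", "integration" or "functional"
    cpus_needed: int   # hypothetical resource estimate for this build


# Assumed priorities: a lower number is scheduled earlier.
PRIORITY = {"unit": 0, "integration": 1, "functional": 2}


def pick_builds(queue, idle_cpus):
    """Greedily start builds in priority order while idle CPUs remain."""
    chosen = []
    # sorted() is stable, so builds of the same kind keep their queue order.
    for build in sorted(queue, key=lambda b: PRIORITY[b.kind]):
        if build.cpus_needed <= idle_cpus:
            chosen.append(build)
            idle_cpus -= build.cpus_needed
    return chosen


queue = [
    Build("nightly-functional", "functional", 2),
    Build("db-integration", "integration", 1),
    Build("commit-41", "unit", 1),
    Build("commit-42", "unit", 1),
]
busy_hour = [b.name for b in pick_builds(queue, idle_cpus=2)]
quiet_hour = [b.name for b in pick_builds(queue, idle_cpus=5)]
```

With only two idle processors the waiting commit builds win and everything slower is postponed. With five idle processors the functional tests run right away instead of waiting for the nightly schedule.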
Closely related to the above is my desire for automatic test categorization. In order to run tests in different builds with different frequencies, you need to split them up into separate categories, mainly so that the fastest tests run first and you receive feedback as quickly as possible. What if your CI tool did this for you? If a test runs in under 200 milliseconds, it’s probably a unit test. If a test accesses a database, it’s at least an integration test. If a test is a Selenium test, it is almost certainly a functional test. With a set of rules like these, the CI tool could split up your tests for you. This doesn’t mean you shouldn’t organize your tests in separate folders or packages, but it means that you don’t have to if you choose not to.
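The rules above could be sketched roughly as follows. This assumes the CI tool has already observed each test's duration and dependencies during an earlier run; the `ObservedTest` class and its fields are hypothetical, and treating a slow test with no known external dependency as an integration test is my own assumption.

```python
from dataclasses import dataclass


@dataclass
class ObservedTest:
    name: str
    duration_ms: float
    uses_database: bool = False   # hypothetical flag recorded by the CI tool
    uses_selenium: bool = False   # hypothetical flag recorded by the CI tool


def categorize(test, unit_threshold_ms=200):
    """Apply the rules from most to least expensive category."""
    if test.uses_selenium:
        return "functional"
    if test.uses_database:
        return "integration"
    if test.duration_ms < unit_threshold_ms:
        return "unit"
    # Assumption: slow tests without a known dependency are kept out of
    # the commit build so that they do not delay feedback.
    return "integration"


observed = [
    ObservedTest("parses_invoice", 12),
    ObservedTest("saves_customer", 150, uses_database=True),
    ObservedTest("checkout_in_browser", 8000, uses_selenium=True),
    ObservedTest("renders_large_report", 950),
]
categories = {t.name: categorize(t) for t in observed}
```

Note that the database rule is checked before the duration rule, so a fast test that touches a database still counts as at least an integration test.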
In addition to this categorization, I would like CI tools to be smarter about test execution order, similar to what JUnit Max or Infinitest do in your IDE. This means that previously failing tests run first, quicker tests run before slower ones, new tests run before older ones, and so forth. (Update 1 July 2011: Sometimes dreams do come true. Check out Test Load Balancer.)
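An ordering like this boils down to a sort key. The sketch below is not how JUnit Max, Infinitest or Test Load Balancer work internally; the `RunHistory` class is hypothetical, and the precedence I chose (failures first, then new tests, then by duration) is only one reasonable assumption.

```python
from dataclasses import dataclass


@dataclass
class RunHistory:
    name: str
    failed_last_run: bool = False
    is_new: bool = False            # no recorded run yet
    last_duration_ms: float = 0.0   # 0.0 for tests without history


def execution_order(history):
    """Order tests: previous failures, then new tests, then fastest first."""
    return sorted(
        history,
        # False sorts before True, so the flags are negated.
        key=lambda t: (not t.failed_last_run, not t.is_new, t.last_duration_ms),
    )


history = [
    RunHistory("a_stable_slow", last_duration_ms=50),
    RunHistory("b_broke_yesterday", failed_last_run=True, last_duration_ms=900),
    RunHistory("c_just_written", is_new=True),
    RunHistory("d_stable_fast", last_duration_ms=10),
]
order = [t.name for t in execution_order(history)]
```

The test that broke yesterday runs first even though it is the slowest one, because it is the one most likely to tell you something new.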
And last but not least, I would like a visual presentation of where my builds currently stand, which I have described in more detail in a previous post. Go’s deployment pipelines are the best solution I have seen so far in this respect.
Oh yeah, one more thing. If you have been practicing CI for a while, you know that broken builds have to be fixed right away. To achieve this, everyone gets notified as soon as the build breaks. Alarms go off, lights flash, and emails and SMSes are sent out. Everyone gets interrupted. To avoid this undesirable event, it has become common practice for developers to run a full build on their local machine before committing any code. This requires discipline from each and every developer, though, and I would prefer to see it automated. One way to achieve this could be a distributed CI environment, which could work as follows:
- On every commit/save to your local repository/workspace your local CI client performs an update from the central repository and runs all tests.
- If there is no newer version in the central repository, you may configure the client to skip the unit tests if these are run by your IDE anyway.
- On success, a push/commit to the central repository is performed, and the central CI server picks up the changes and runs the configured builds.
- Throughout, you receive unobtrusive feedback in your IDE. Only if your commit happens to break the build are more disruptive measures taken.
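The steps above can be sketched as a small function on the local CI client. No such client exists as far as I know, so everything here is hypothetical: the version control and test runner are passed in as plain callables (`update`, `run_tests`, `push`, `notify`) to keep the sketch independent of any particular tool.

```python
def on_local_commit(update, run_tests, push, notify,
                    skip_unit_if_unchanged=False):
    """Run the local part of the workflow and report whether we pushed.

    update()                -> True if newer central changes were pulled
    run_tests(include_unit) -> True if all selected tests passed
    push()                  -> hands the changes to the central CI server
    notify(message)         -> unobtrusive feedback, e.g. in the IDE
    """
    pulled_changes = update()
    # Unit tests may only be skipped when nothing new came in,
    # because then the IDE has already run them on the same code.
    include_unit = pulled_changes or not skip_unit_if_unchanged
    if run_tests(include_unit):
        push()
        notify("local build passed, changes pushed")
        return True
    notify("local build failed, nothing pushed")
    return False


events = []
pushed = on_local_commit(
    update=lambda: False,  # nothing new in the central repository
    run_tests=lambda include_unit: events.append(("tests", include_unit)) or True,
    push=lambda: events.append("push"),
    notify=events.append,
    skip_unit_if_unchanged=True,
)
```

The louder measures (alarms, lights, SMSes) would stay on the central server and only fire if a build breaks after the push.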
For this to work efficiently with TDD, where you create failing tests all the time, the smart test execution order mentioned above is essential. Otherwise you keep running your entire unit test suite over and over again, even though you already know that the test you just wrote is going to fail.
‘But why all this?’ you ask. There are two main objectives behind the features described above: minimizing manual work and speeding up feedback.