Why distributed CI is the logical next step

It has become a best practice to run a private build on the developer machine before committing a change to the central source repository, to minimize broken CI builds. Sensible as this practice is, it has always bugged me, because it is an extra manual step. In my opinion, automating this step is one of the missing pieces of a better CI process.

Most of us (should) have got used to the following five steps when changing/adding a piece of code:

  1. Save/commit your changes in your IDE
  2. Run the unit tests that cover the changed area
  3. Run a private local build
  4. Push your changes to the master repository
  5. Your CI tool automatically picks up the changes and executes your deployment pipeline

You probably also didn’t execute steps three and four every time you did step one. Not because it wouldn’t make sense to do so, but because these steps are time-consuming. I believe that only the first step should be manual; each of the others can be triggered by the one before it.

  1. Save/commit your changes in your IDE
  2. Your IDE automatically runs the necessary unit tests (e.g. with JUnit Max or Infinitest)
  3. Your distributed CI tool automatically pulls the latest changes from the master repository and runs a local build
  4. Your distributed CI tool automatically pushes the changes to the master repository
  5. Your central CI tool automatically picks up the changes and executes your deployment pipeline
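The hand-off from the commit in step 1 to the local CI build can be wired up with an ordinary Git post-commit hook. The following is a minimal, self-contained sketch; in a real setup the hook would notify the local CI tool (the Hudson trigger URL in the comment is illustrative), whereas here it just records that it fired:

```shell
#!/bin/sh
# Demonstrate kicking off a local CI step from a Git post-commit hook.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email dev@example.com
git config user.name Dev

# Install the hook: every local commit triggers the "commit build".
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
# Hypothetical trigger -- in a real setup this would be something like:
#   curl -s http://localhost:8080/job/commit-build/build
echo "commit build triggered for $(git rev-parse --short HEAD)" >> ci.log
EOF
chmod +x .git/hooks/post-commit

echo hello > file.txt
git add file.txt
git commit -qm "first change"   # the hook fires here
cat ci.log
```

The hook mechanism itself is standard Git; only the trigger command inside it is specific to your CI tool.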

When practicing rigorous TDD one could even argue that not even the first step needs to be manual. The first two steps could be switched around and your IDE could run the unit tests whenever the code compiles and then automatically save your changes when all unit tests are green.

  1. Your IDE automatically runs the necessary unit tests whenever the code compiles
  2. Your IDE automatically commits your changes when all unit tests are green
  3. Your distributed CI tool automatically pulls the latest changes from the master repository and runs a local build
  4. Your distributed CI tool automatically pushes the changes to the master repository
  5. Your CI tool automatically picks up the changes and executes your deployment pipeline
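The “commit when green” idea in step 2 boils down to gating the commit on the test run. A minimal sketch in plain shell, with a hypothetical `run_unit_tests` standing in for whatever your IDE or build tool actually runs:

```shell
#!/bin/sh
# Sketch of "commit whenever the unit tests are green".
# run_unit_tests is a stand-in for the real test runner
# (e.g. mvn test); here it simply reports green.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email dev@example.com
git config user.name Dev

run_unit_tests() {
  true   # pretend all tests passed
}

echo 'code' > Foo.java
git add -A
if run_unit_tests; then
  # Only reached when the tests are green.
  git commit -qm "auto-commit: all unit tests green"
fi
git log --oneline
```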

No more manual steps at all. Just pure coding. It will be a bright future.

I set up the distributed CI part by installing one Hudson instance on my developer machine and another on my CI server. When my local Hudson detects a change to my local Git repository, it runs jobs that pull the latest changes from the master repository and update my clone, execute a commit build, and finally push the changes to the master repository. There, the other Hudson instance runs the full deployment pipeline, including acceptance tests. This seems to work fine, but it is quite a bit to set up and even harder to keep in sync without native support in Hudson.
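At its core, the local commit-build job described above is three steps: pull, build, push, with the push happening only when the build succeeds. A sketch using two throw-away repositories in place of the real master repository and local clone (`run_commit_build` is a placeholder for the real build command, e.g. `mvn clean test`):

```shell
#!/bin/sh
set -e   # any failing step aborts the job, so a red build never pushes

# Stand-in for the central master repository.
master=$(mktemp -d)
git -c init.defaultBranch=master init -q --bare "$master"

# Seed the master repository with an initial commit.
seed=$(mktemp -d)
git clone -q "$master" "$seed" && cd "$seed"
git symbolic-ref HEAD refs/heads/master
git config user.email dev@example.com && git config user.name Dev
git commit -q --allow-empty -m initial
git push -q origin master

# The developer's clone, with a local change committed.
clone=$(mktemp -d)
git clone -q "$master" "$clone" && cd "$clone"
git config user.email dev@example.com && git config user.name Dev
echo change > file.txt
git add file.txt && git commit -qm "local change"

run_commit_build() { true; }   # placeholder for the real build

# The three essential steps of the local CI job:
git pull -q --rebase origin master   # 1. pull the latest changes
run_commit_build                     # 2. run the commit build
git push -q origin master            # 3. push only after a green build
```

The value Hudson adds on top of this is scheduling, reporting, and history; the job body itself stays this small.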

P.S. I made a visual representation of how I envisage the development process with distributed CI. You can find it here. I will probably explain the diagram in more detail in a follow-up post.


11 thoughts on “Why distributed CI is the logical next step”

  1. The only problem with your idea is that running a full build/test is a time-consuming step. For my project that’s easily a 10–15 minute build/test step that can keep my system under heavy load.

    • Fair point. This process assumes that you are working on a machine that can handle the load of running a build in the background. Unfortunately, we don’t always have that luxury. I do believe, though, that the time savings justify buying well-equipped developer machines.

      Assuming you have a deployment pipeline with several stages, you might decide to run only the first stage, with your (real) unit tests, on the developer machines. Since these tests don’t touch any external systems (databases, the file system, etc.), they should be reasonably fast, I would think well under 10 minutes. For a thorough discussion of deployment pipelines and CI in general I can recommend ‘Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation’ by Jez Humble and David Farley.

    • Running the local CI instances in the cloud might also be an option, although I am not convinced the performance would be good enough. That is just an assumption, though; I have never tried it myself.

  2. We’ve been using Hudson with Gerrit to do something similar (for the later stages at least).
    Gerrit is a Git code review system.

    When a developer is ready, they push their changes to Gerrit for review. Hudson sees that a new review request has been created, pulls the code, builds it, and updates the review to indicate whether the build is verified or has failed. Once this is done, a reviewer reviews the code (unit testing is only as good as the tests that have actually been written) and then tells Gerrit to push it onto the master branch.

    This ensures that the code builds, passes all tests, and has actually been reviewed before it is pushed into the master branch, where it could otherwise break the main CI build.
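    For reference, the push-for-review step in the Gerrit workflow described above is an ordinary push to Gerrit’s magic `refs/for/<branch>` ref. Against a plain Git remote, which this sketch uses as a stand-in, the push simply creates that ref; a real Gerrit server intercepts it and opens a review instead:

```shell
#!/bin/sh
# Demonstrate Gerrit's push-for-review ref on a plain Git remote.
set -e
remote=$(mktemp -d)
git -c init.defaultBranch=master init -q --bare "$remote"

work=$(mktemp -d)
git clone -q "$remote" "$work" && cd "$work"
git symbolic-ref HEAD refs/heads/master
git config user.email dev@example.com && git config user.name Dev
echo fix > fix.txt
git add fix.txt && git commit -qm "a change for review"

# The push-for-review step from the comment above. With Gerrit, this
# creates a review request rather than updating the branch directly.
git push -q origin HEAD:refs/for/master
git ls-remote "$remote" refs/for/master
```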

  3. Nice blog; the idea of a private CI system got me thinking.

    Also, with a well-integrated ALM system, only finished features would be built and pushed to the master branch, which means half-finished features would never end up on the main branch.
