Thoughts on Travis CI

This article details some of my experiences with Travis CI, aiming to make this service more accessible to those who, like me, do not consider themselves to be “proper” software developers.

Prelude and history

I am by no means a professional software developer, but my job offers me the possibility to churn out some code more often than not. Naturally, I want my code to be used by other people—especially when said code is part of a publication. There is just one catch: code does not exist in a vacuum but is part of some ecosystem. Even when you are developing everything in Python, a language that arguably tries to hide a lot of complexity so that one is able to focus on the task at hand, your code is not an island. Save for extremely trivial programs, you will probably be using libraries, such as the trinity of data science, i.e. import numpy as np, import scipy as sp, and last, but certainly not least, import matplotlib.pyplot as plt.

Your users, however, might have radically different configurations: it is quite possible that some have more recent versions of all packages, while others are still running the first release of Ubuntu. Some of them, and this is really hard for me to admit, may not even be running the best Linux distribution in the world. Some of them may not even be running a Linux operating system!

Given this untidy state of affairs, how can you make at least a credible attempt at pretending to support your code on more than just your own machine? That is where continuous integration comes in! Originally a technique in test-driven development, CI is supposed to make your life a tad easier by ensuring that the changes you make do not stop the project as a whole from working. This is achieved by integrating your changes often into the current master branch of your repository, and checking that none of them break the build.

Professional software development companies set up their own build servers, thereby ensuring that the product is still built as per spec. How exactly does this pertain to us academics?

Software development in academia

Most software developed in academia is held together by the same ingredients, viz. hope, the tears of Ph.D. students, and sheer faith. Code is supposed to work until the deadline and, preferably, until the paper has been accepted. For many students, git is something one has heard of in that pesky software engineering class. The full power of version control systems is often not appreciated, even by research group leaders. Let me describe a nice scenario to you: Suppose you are working towards an important deadline. Suddenly, one person (Alice) in your team discovers that the main algorithm has a bug. “Woe is me”, you say. But Alice, being very good at what she does, fixes the bug, and after a few tense minutes, in which the tests run once more, she announces that the results you report in the paper still hold. You submit the paper early and go home to sleep.

This is the kind of magic that continuous integration can bring to your projects if you care to use it. Read on if you are interested.

The magic

In a nutshell, Travis CI “merely” provides a set of virtual machines (featuring different operating systems even!) on which you can build and run your code. And the best thing is that this works automatically, whenever you update your repository via git push. No more worrying about whether that small change you made might have changed all the calculations; instead, you get to enjoy a rather blissful existence.

Of course, this only works if you invest some time in setting up your project. At its core, Travis looks for a file called .travis.yml in your repository. It is this file that allows you to configure the steps that Travis performs for each update. This is what a simple file may look like if your project uses CMake as its build system and has no additional dependencies:

language: cpp

os:
  - linux

compiler:
  - gcc

script:
  - mkdir build
  - cd build
  - cmake ../
  - make

After each git push, Travis will dutifully clone the repository and execute all steps that you provide in the script section. Here, this means creating a separate build directory and checking that the project can be built. This is not much, of course, and does not really help in catching problems with an algorithm, but it is a start. Suppose you want to ensure that your project is also compilable with clang. Just change the corresponding section:

compiler:
  - clang
  - gcc

Or suppose that you want to add Mac OS X support:

os:
  - linux
  - osx

It really is that simple. Travis will now create a build matrix, consisting of two operating systems (Linux and Mac OS X) and two compilers (clang and gcc), i.e. four build jobs in total.

But I want more magic!

Coming back to the scenario above, how can Travis help in this case? Well, of course you have to provide the tests that Travis needs to run. For example, I like to create a special test build target for CMake and let it execute my unit tests for me. Unit tests range from banal checks of classes to longer programs that compare expected results of algorithms to current ones. You will have to take my word for it, but tests like this have helped me multiple times in the past for various publications. If you are interested in how they may look, please refer to Aleph, my library for experimenting with topological data analysis. I do not claim that the code is perfect, but the tests subdirectory contains numerous unit tests that may drive home the point I am trying to make here. The best thing is that the configuration file is not getting needlessly complicated. Due to the nice features of CMake, it is again sufficient to extend the script section:

script:
  - mkdir build
  - cd build
  - cmake ..
  - make
  - CTEST_OUTPUT_ON_FAILURE=1 make test

The ugly last line is only necessary in order to force the testing harness of CMake to be a little more verbose. You can see the output of Travis for this project here, and you will see that the unit tests are always executed—giving me peace and tranquility to some extent.

So?

That is the basic gist of Travis. I think that using such a service can be beneficial for academic projects. Not only does it give you the confidence that your code is doing something right, it can also be used to promote your research: by providing a repository that is tested on different platforms, with different compilers, you make it easier for people to actually use your cool algorithm (and cite you, of course, lest we forget about that).

Dependencies

Now you are probably huffing at this article because your program is more complicated and has some dependencies. Well, Travis has you covered to some extent.

Travis permits you to adjust the build environment. For example, Aleph uses Boost and Eigen. Since those exist as packages under Ubuntu, the default Linux distribution used by Travis, I can easily install them in .travis.yml:

language: cpp

os: linux
sudo: false

addons:
  apt:
    packages:
      - libboost-dev
      - libeigen3-dev

For Mac OS X, support for Homebrew is available, but the use is slightly more complex:

before_install:
  - if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew install boost; fi
  - if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew install eigen; fi

Overall, though, this works just fine.

Troubles

I do not want to sound overly enthusiastic here. Adding support for multiple operating systems and configurations can be painful. Heck, I often have to commit multiple times because the build just breaks for some darn reason. It is a fine balance that one has to achieve here: code should be usable, but your time is not infinite.

Unfortunately, Travis sometimes makes my life harder than it has to be. While I am grateful for the free (free!) service they offer, some things just irk me:

  • The default Ubuntu image is old. Like really, really old, viz. the version of Ubuntu from frigging 2014. As an Arch Linux user, I am stunned. Some of my time is thus spent correcting for some arcane problems with old package versions.

  • The OS X images are strange: the update process of brew just stalls, and the same goes for the builds themselves sometimes. I get many e-mails that tell me that my build errored (which in Travis lingo does not mean that the build failed, but rather that some virtual machine could not be started).

  • The number of available images is very small. Ideally, I would like to be able to check my software on even more variants of Ubuntu but also on other flavours of Linux. And what about BSD? One of my users may want to install the software while running an old version of OpenBSD on a toaster, so where is the support for that?

  • Do not get me started on supporting older and newer package versions…

Nonetheless, I am grateful to the Travis CI engineers. They are doing a heck of a job just so that I can pretend that my GitHub projects are actually useful to someone out there. Thank you (I really mean it)!

Coda

All in all, Travis CI has been very beneficial for my projects. I sleep a little bit better knowing that I do not have to worry about breaking my algorithms just by changing the interface somewhere else. Even though it has its shortcomings (but nothing is perfect), I urge you to consider using it for your projects and your publications!

By the way: if you are a hardcore Bitbucket user, take a look at Pipelines. They provide similar functionality for those of us who are not living in the land of GitHub.

Happy coding/testing/integrating, until next time!