Software engineering for academia
Tags: research, programming
During the course of my research, I tend to write a lot of code. My forays into the world of software development and my usage of C++ taught me the value of writing unit tests, following the mystical best practices (whatever they may be), and caring about the quality of my code. It recently occurred to me that this perspective is not too prevalent in academia because while software engineering seems “nice” and useful, it also seems to detract from making progress, viz. from writing that code and pummelling it into shape until it works. As one of my colleagues so aptly put it:
The code does not have to be maintainable, it just has to survive until the conference deadline has passed.
While that seems about right, I argue that this perspective is short-sighted. Moreover, I claim that following some good practices is bound to pay off for the following reasons:
- You will be able to trust your own code better.
- You will be able to trust the results of your own code better.
- Modifications of said code (such as additional experiments that have been requested by “Reviewer 2”, that fearsome beast whose hooves have trampled many papers already) are easier.
- Your results will be stronger and less susceptible to “classical” mistakes, such as incorrect parameters, wrong cross-validation strategies, and so on.
So, without further ado, here are three strategies that will be helpful:
- Use a version control system such as
- Write tests.
- Refactor your code whenever it becomes too complicated.
I will not go into detail about the first one, since the good people at GitHub have a pretty nice tutorial that covers the basics (and more). Let us take a look at the other two instead.
As is so often the case in the software industry, testing (or more precisely, test-driven development) is hailed as the solution to all of your programming woes. I would not go so far. But there are different strategies in testing that will be very helpful in academia:
- Regression testing: this refers to writing a test that asserts the basic functionality of your code. The best way to approach this kind of test is to have manual examples of what your code is supposed to do and check them against what the code actually does. For example, suppose you are writing code that calculates the connected components of a graph. A simple functional test involves taking a simple graph, figuring out its connected components with pen and paper, and checking that the code arrives at the same answer. The trick is not to stop now: instead of concluding that the code is correct, you now write additional code that performs the same verification for you automatically (I will discuss later on how you do this in practice). The idea behind this extra effort is that you might change something in the connected component calculation code in a few weeks and forget to check your results again. However, if you have code available for checking the results automatically, you will catch these regressions easily (hence the name).
- Sanity checks: often, our code is too complicated to have examples that we can manually check. In these cases, it might make sense to check simple properties of the results of your code instead. For example, if you have code that should yield a positive definite matrix, a sanity check might be a function that tests whether the eigenvalues of said matrix are all positive. While less powerful than functional tests or regression tests, sanity checks can nonetheless be very helpful in restoring your sanity.
- Fuzz testing: this sort of testing refers to test cases in which your function (or model, etc.) is subjected to essentially random inputs. When you are writing a routine for calculating the eigensystem of a matrix, for example, your code should be capable of handling any matrix (while raising some errors for those that cannot be decomposed, of course). By providing random inputs, you make it less likely that you (inadvertently) cherry-picked some matrix with a special property in your tests. Fuzz testing may also be used to ensure robustness against malicious uses of your functions, but if you really need this, you should not read my blog but study how to write secure code.
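To make the three strategies a bit more concrete, here is a minimal sketch in Python that applies all of them to the connected components example from above. Everything in it is an assumption made for illustration: the function `connected_components`, the helper `check_sanity`, and the graph sizes used for the random inputs are hypothetical and not taken from any particular library. It only uses plain `assert` statements, so it runs without a testing framework.

```python
import random
from collections import defaultdict


def connected_components(vertices, edges):
    """Return the connected components of an undirected graph as a set of frozensets."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = set()
    for start in vertices:
        if start in seen:
            continue
        # Depth-first search collecting everything reachable from `start`.
        component = set()
        stack = [start]
        while stack:
            vertex = stack.pop()
            if vertex in component:
                continue
            component.add(vertex)
            stack.extend(adjacency[vertex] - component)
        seen |= component
        components.add(frozenset(component))
    return components


def test_regression():
    # Worked out with pen and paper: {0, 1, 2}, {3, 4}, and the isolated vertex 5.
    vertices = range(6)
    edges = [(0, 1), (1, 2), (3, 4)]
    expected = {frozenset({0, 1, 2}), frozenset({3, 4}), frozenset({5})}
    assert connected_components(vertices, edges) == expected


def check_sanity(vertices, edges, components):
    # Sanity check: the components cover every vertex exactly once ...
    assert sum(len(c) for c in components) == len(set(vertices))
    assert set().union(*components) == set(vertices)
    # ... and both endpoints of every edge lie in the same component.
    for u, v in edges:
        assert any(u in c and v in c for c in components)


def test_fuzz(n_trials=200, seed=42):
    rng = random.Random(seed)  # fixed seed so that failures are reproducible
    for _ in range(n_trials):
        n = rng.randint(1, 20)
        vertices = list(range(n))
        n_edges = rng.randint(0, 30)
        edges = [(rng.randrange(n), rng.randrange(n)) for _ in range(n_edges)]
        check_sanity(vertices, edges, connected_components(vertices, edges))


if __name__ == "__main__":
    test_regression()
    test_fuzz()
```

Note that the fuzz test cannot compare against a hand-computed answer, which is why it falls back to the sanity check on every random graph. Fixing the seed is a design choice: it keeps the inputs random with respect to your expectations, but a failing run can still be repeated.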
Of course, even a test that is well thought-out might not help in
detecting all errors down the line, but overall, writing some test cases
is worth the trouble. In most languages, testing frameworks are readily
available. Python has the great
whereas C++ users have the choice of Google Test,
Boost.Test, and many more. If, like me, you would like to roll your own testing framework, look at the test cases of Aleph, my library for topological data analysis.
Another blog post of mine
explains some design choices and shows you how to combine testing with
Last, I briefly want to discuss another strategy for combatting the complexities of your code: refactoring. This is merely a fancy word for rewriting your code while not changing its intended output (and at the risk of sounding like a broken record, this is exactly why you need unit tests). Of course, the general idea is to make the code simpler—so that you will understand it even a month or a year after you last touched it. Here are some general ideals you should strive for when refactoring:
- Make information local, not global: try to provide a single flow for all your information in your code. This means not using global variables, and relying on small state information whenever possible.
- Make your functions do one thing: try to keep your function simple or at least single-purpose. Think about splitting up large functions into smaller ones.
- Strive for precision: use small classes with descriptive attributes, in particular in languages like Python, to store information about a state in your program rather than using tuples (with hard-coded indices), for example.
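As a small illustration of these ideals, the following sketch stores the outcome of an experiment in a small class with descriptive attributes instead of a tuple with hard-coded indices. The class `ExperimentResult`, its fields, and the numbers are hypothetical and only serve as an example. The two functions each do one thing and receive all their information as arguments instead of reading global variables.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ExperimentResult:
    """Hypothetical record of one run; replaces a tuple such as (0.01, 5, 0.93)."""
    learning_rate: float
    n_folds: int
    accuracy: float


def best_result(results):
    """Return the run with the highest accuracy."""
    return max(results, key=lambda r: r.accuracy)


def mean_accuracy(results):
    """Return the average accuracy over all runs."""
    return mean(r.accuracy for r in results)


# Made-up numbers, purely for illustration.
results = [
    ExperimentResult(learning_rate=0.1, n_folds=5, accuracy=0.81),
    ExperimentResult(learning_rate=0.01, n_folds=5, accuracy=0.93),
]

# Reads much better than something like max(results, key=lambda r: r[2])[0].
print(best_result(results).learning_rate)
```

Because the class is declared with `frozen=True`, a result cannot be modified by accident after it has been created, which also helps with keeping state small and local.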
Of course, refactoring can also go too far, but if you are doing it to make your life easier and not just to waste some time procrastinating, it will probably be helpful. You should also take a look at Jeff Atwood's article on “code smells” because recognizing these problematic patterns makes it easier to find parts of your code that benefit from refactoring.
Good luck with your coding efforts, until next time!