Fun with unsequenced operations

Tags: programming


Time for another tale from the trenches, better known as the course on C++ programming taught by my colleague Filip Sadlo. This time, the problem is very subtle and involves unsequenced operations. If you are not familiar with this term (because you have been working with nicer programming languages than C++ so far), let me briefly explain: A sequence point is a point in a program at which all side effects of previous evaluations are complete and no side effects of subsequent evaluations have started yet. Since C++11, the standard phrases this in terms of "sequenced before" relations instead of sequence points, but the idea is the same. If two sub-expressions are not separated by a sequence point, the order in which they are evaluated is not necessarily specified.
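As a small sketch of where an order is guaranteed: the built-in operators &&, comma, and ?: evaluate their first operand, including its side effects, before anything else. The function names below are made up for illustration; each function is well-defined even though it modifies and reads i in the same expression.

```cpp
// Built-in && evaluates its left operand, including all side effects,
// before it starts on the right operand.
bool logical_and_is_sequenced()
{
  int i = 0;
  return ( ++i == 1 ) && ( i == 1 );
}

// The built-in comma operator gives the same guarantee, so i is
// already 1 when the right operand reads it.
int comma_is_sequenced()
{
  int i = 0;
  return ( ++i, i * 10 );
}

// The condition of ?: is evaluated completely before the selected
// branch, so the branch sees the incremented value.
int conditional_is_sequenced()
{
  int i = 0;
  return ( i++ == 0 ) ? i : -1;
}
```

Most other operators, such as ==, + or the assignment in C++11, give no such guarantee for their operands.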

At this point you may ask yourself: Why should I care? Well, the point is that in most cases, you do not need to care. However, there are a few cases where it makes sense to think about sequencing, especially when the same variable is modified and used more than once in a single expression.

A classical example, which I have to admit also used to be part of our codebase, is

i = i++

for which the final value of i is not well-defined because the side effect of the increment is not sequenced relative to the assignment. In essence, the problem is that both operations modify i as a side effect, but the assignment operator (at least up to and including C++14) does not impose a particular order between them.
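What the author of such a line usually wants is one of two things, and both can be written without any ambiguity. The following is only a sketch with hypothetical helper names, assuming the intent is either "increment i" or "remember the old value, then increment":

```cpp
#include <utility>

// Intent "increment i": a single modification in its own statement.
int increment( int i )
{
  ++i;
  return i;
}

// Intent "remember the old value, then increment": read and modify
// in two separate statements. Returns (old value, new value).
std::pair<int, int> remember_then_increment( int i )
{
  const int old = i; // read first
  ++i;               // then modify
  return { old, i };
}
```

Each statement ends a full expression, so every side effect is complete before the next statement starts.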

A similar example, which was part of an examination, is

--x == x--

which also combines multiple side-effects concerning the same variable.

So, what can a poor compiler do in these cases? Nothing much, it turns out. In fact, this is undefined behaviour territory. Just for the fun of it, let us take a look at how different compilers handle this. I will use the following code:

#include <iostream>

int main()
{
  int x = 0;

  std::cout << ( --x == x-- ) << std::endl;
  std::cout << ( x-- == --x ) << std::endl;
}

First up is gcc.

$ g++ --version
g++ (GCC) 5.3.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ g++ -std=c++11 unsequenced.cc; ./a.out
0
0

No warnings? Shame on you. You can surely do better!

$ g++ -std=c++11 -Wall unsequenced.cc; ./a.out
unsequenced.cc: In function ‘int main()’:
unsequenced.cc:7:43: warning: operation on ‘x’ may be undefined [-Wsequence-point]
   std::cout << ( --x == x-- ) << std::endl;
                                           ^
unsequenced.cc:8:43: warning: operation on ‘x’ may be undefined [-Wsequence-point]
   std::cout << ( x-- == --x ) << std::endl;

0
0

Now on to clang!

$ clang++ --version
clang version 3.7.1 (tags/RELEASE_371/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
$ clang++ -std=c++11 unsequenced.cc; ./a.out
unsequenced.cc:7:18: warning: multiple unsequenced modifications to 'x'
      [-Wunsequenced]
  std::cout << ( --x == x-- ) << std::endl;
                 ^       ~~
unsequenced.cc:8:19: warning: multiple unsequenced modifications to 'x'
      [-Wunsequenced]
  std::cout << ( x-- == --x ) << std::endl;
                  ^     ~~
2 warnings generated.
1
0

That’s more like it!

So, from the output we can see that anything may happen when we journey into undefined behaviour territory: the two compilers do not even agree on the result of the first comparison. It is important to be at least aware of these pitfalls. Whenever you use some variable multiple times in an expression, I would recommend briefly reflecting on whether some unsequenced modifications may have slipped in.
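If a comparison like --x == x-- is really what you want, the safe route is to spell out the order yourself. Here is a minimal sketch with made-up function names, covering two possible readings of the expression. I am not claiming that either one is what gcc or clang actually did above; with undefined behaviour, they are not obliged to do anything sensible.

```cpp
// Reading 1: evaluate the left operand first, then the right one.
bool compare_left_to_right( int& x )
{
  --x;                // left operand: pre-decrement...
  const int lhs = x;  // ...yields the new value
  const int rhs = x;  // right operand: post-decrement yields the current value...
  x--;                // ...and decrements afterwards
  return lhs == rhs;
}

// Reading 2: evaluate the right operand first, then the left one.
bool compare_right_to_left( int& x )
{
  const int rhs = x;  // right operand: post-decrement yields the current value...
  x--;                // ...and decrements afterwards
  --x;                // left operand: pre-decrement...
  const int lhs = x;  // ...yields the new value
  return lhs == rhs;
}
```

Starting from x = 0, the first reading compares -1 with -1 and the second compares -2 with 0, so the two explicit orders already give different answers. Both leave x at -2.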

By the way: As the output of gcc shows, it is almost always a very good idea to compile with a good set of warnings enabled. At the very least, you should use -Wall, which—contrary to its name—does not activate all warnings. I guess naming the switch -Wsome-but-not-all-warnings would have been too silly…