A visual introduction to Morse theory
Tags: howtos, research
In a previous article, I already introduced manifolds and their, ahem, manifold ways. Among the many interesting comments I received about this article, some were dealing with a basic question, namely how to measure the properties of a manifold. In this article, I want to present a few concepts from Morse theory, one of the ‘classical’ tools to analyse a manifold^{1}.
The premise
The underlying idea of Morse theory is to study a manifold $\mathbb{M}$ by analysing functions defined on that manifold. This is actually a common trick in mathematics: if something is too complex to be studied directly, just observe its behaviour when it interacts with something that you understand somewhat better. In our case, we study the behaviour of realvalued functions on a manifold, i.e. functions of the form $f\colon\mathbb{M}\to\mathbb{R}$ that assign each point $x \in \mathbb{M}$ a corresponding realvalued measurement $f(x)$.
In some sense, these functions are the ‘natural’ thing to study here because they match our intuition quite well. For example, $f$ could be a function that measures a temperature (because temperatures are typically realvalued), the air pressure, or the elevation above sea level. Of course, there are some technical requirements for $f$ to be an admissible function: it has to be differentiable. Intuitively^{2}, this means that our function does not have any breaks, bends, or other weird behaviour.
Thus, temperature, air pressure, or elevation above sea level are the perfect examples to keep in mind—we can evaluate them anywhere on, say, our beautiful planet (which is a manifold, as we learned in a previous article) and they will behave in a differentiable manner.
Level sets
To start our foray into Morse theory, let us assume that $f$ is a function that measures elevation above some plane of our manifold $\mathbb{M}$. Again, it helps to think about this in ‘earthly terms’ and consider $f$ to measure the elevation above sea level. We are now interested in the level sets of $f$^{3}. Given $c \in \mathbb{R}$, the level set of $f$ according to $c$ is defined as
$$ \mathcal{L}_{f}(c) := \{x \in \mathbb{M} \mid f(x) = c\} $$
In other words, the level set contains all those points of the manifold that get mapped to $c$ by $f$. This definition may seem somewhat artificial, but you already know level sets under a different name: open up any map and you will see contour lines, i.e. lines of equal elevation; you can think of the level set as nothing but a contour line for some parameter $c$. Of course, we have to draw the contour line on some weird manifold $\mathbb{M}$, but its essence remains the same.
Let’s consider a simple example before discussing more properties: we take a 1D differentiable function and select a single value for $c$. Hence, according to our notation, we have $\mathbb{M} = \mathbb{R}$, which is also a manifold^{4}. The value of $c$ is shown as a dashed line; it makes a neat horizontal cut (the cut is horizontal because by convention, we use the $y$ axis to denote function values). Every point—or rather, every segment— that is intersected by this line is a part of $\mathcal{L}_{f}(c)$.
Unless there is a horizontal segment in the function, every level set will only consist of a finite number of points here. Not very interesting in terms of their structure, but it serves to sharpen our thinking about level sets in higher dimensions. However, lest you consider these level sets to be useless, be forewarned—they will make a stellar comeback towards the end of the article.
Superlevel and sublevel sets
Level sets might be nice, but at least in our 1D example, they were lacking some structure, consisting merely of points. Let’s update our definition. If we go back to the original level set definition, we can see that we wanted $f(x) = c$ as a condition. This is somewhat restrictive, so why not use an inequality here? This results in two new types of sets:
We call $\mathcal{L}_{f}^{+}(c)$ the superlevel set, while we call the second set the sublevel set. The mnemonic is to take a look at the superscript of the set. A ‘$+$’ indicates that the condition is $f(x) \geq c$, whereas a ‘$$’ indicates that the condition is $f(x) \leq c$.
Superlevel and sublevel behave slightly better in terms of their structural properties. To see this, first take a look at a superlevel set of the same 1D function than before. Again, a horizontal cut indicates our $c$ parameter; everything that is part of the superlevel set is shown in black:
Behold, suddenly we obtain segments and not just mere points! Intuitively, this makes sense: the function cannot behave in an arbitrary manner—after all, we explicitly required it to be differentiable. Hence, unless we pick the global maximum of the function as our $c$, we will always have at least one small segment. Similarly, for the sublevel set:
If we eyeball these two sets for long enough, we can see that they are complementary to some extent. The previous figure contains segments below the line and on the line, while the superlevel set figure contains segments above and on the line. This observation leads to a nice insight: the level set can be expressed as the intersection between superlevel and sublevel set. This is because both of them also contain the level set. In other words, we have:
This is a neat characterisation, but we can do much more when varying $c$ instead of keeping it fixed.
Connectivity in superlevel and sublevel sets
What happens when we slightly perturb $c$? In the case of a level set, we cannot really say—we might have more intersections, or fewer. Thus, let’s discard this type of set for now and focus on the superlevel sets only^{5}. More precisely, we now take a look at what happens when we decrease the value of $c$^{6}. Here are the superlevel sets for some selected values:
We can see that the segments continue to grow as we decrease $c$. Finally, when we reach the lowest value of the function, the two segments will have become one segment. Let us introduce some more formal terminology here: instead of using the word segment, we will use the term connected component. This term is more fancy but also more precise; all points within a specific connected component are connected in the sense that there is path between them. We can see that this is true for the 1D segments here, and it is equally true for an arbitrary manifold.
The growth behaviour that we observed is one instance of a monotonicity property in mathematics. It means that as we change our threshold $c$ (here, we decrease it, but for sublevel sets, we have to increase it), some of the structure from the previous step is preserved. Going from $c$ to $c  \epsilon$, where $\epsilon$ is an incredibly small number, we know that every point that is part of the superlevel set for $c$ will also be part of the superlevel set of $c  \epsilon$. Some more points might occur, but we can keep (!) the ones we have already seen in the previous step. Discovering such properties is always exciting because it gives us some hints about preservation properties.
A critical analysis
So far, we know that superlevel sets are wellbehaved and display monotonicity upon decreasing the threshold. We can find out even more, though. If we briefly reconsider the previous figure that depicts different thresholds, we can make the following observations:
 Segments—connected components—can only grow when we decrease $c$. They never lose any points because of monotonicity.
 If $c$ contains a (local) minimum, connected components are joined.
 If $c$ does not contain a (local) minimum, connected components only grow.
This seems to suggest that not all values of $c$ are equally important. Minima seem to be more important than other points. What about maxima? They are also playing a certain role. To see this, let’s start the superlevel set calculation from the (literal) top (of the function):
We can see that for extremely large values of $c$, there is only one connected component, namely the one that appears as we pass the first maximum (the global maximum). As soon as $c$ passes the second maximum, we obtain a second connected component. We can thus conclude:
Connected components in superlevel sets appear at (local) maxima, and merge at (local) minima. All other points only extend a connected component, so they are not ‘interesting’.
Morse theory uses the term critical points, or critical values (since technically, $c$ determines the superlevel set but is not a point of the manifold) for the local extrema; all other points are thus considered to be regular points. A famous theorem in Morse theory states that the topology (or the intrinsic ‘shape’) of the manifold does not change in an interval $[a, b]$ if said interval does not contain a critical value.
A slightly more complex example
As always, everything we discovered so far also applies to higher dimensions. The following example^{7} shows a slightly more complex manifold and illustrates all the sets we encountered so far:
From left to right, we see the level set, the superlevel set, and the sublevel set for the same critical value $c$. The vertical axis is used to indicate how $f\colon\mathbb{M} \to \mathbb{R}$ works here; it measures the elevation above the ‘plane’ of the manifold. Since $\mathbb{M}$ has no interior here, superlevel and sublevel sets can also be seen as filling the manifold with water—Morse theory, in this setting, then describes how different contours merge and split, as the water level is varied^{8}.
A caveat. There is a slight technicality here, arising from the increase in dimensionality: now, our function $f$ does not only have minima and maxima, but also saddle points, i.e. points with a zero derivative that are not extrema themselves^{9}. The statements above about extrema thus have to be amended to include them. But other than that, nothing changes.
Towards a manifold descriptor
After this tour d’horizon, it is time to summarise what we discussed:
 We learned that, given a differentiable function $f\colon\mathbb{M}\to\mathbb{R}$, there are interesting sets on a manifold that we can study.
 These sets change their structure (or connectivity) only at certain critical values that are determined by the minima, maxima, and saddle points of the function. There is no need to study regular values because they are, mathematically speaking, boring.
 All of these sets can be calculated in arbitrary dimensions.
Let’s put this knowledge to good use and try to develop a descriptor of a manifold. Since we only need to consider critical points, our descriptor does not have to take regular points into account. As we saw earlier, superlevel sets, for example, contain different connected components that merge at a (local) minimum. Thus, we can represent the critical points as nodes in a graph and use the edges to indicate that a merge happens. This will naturally lead to a graph structure. Here is the graph constructed from the superlevel sets:
We can clearly see the monotonicity here—when something is merged into one connected component, it stays merged. Likewise, here is the graph of the sublevel sets:
Again, the monotonicity is obvious: something that has been merged will never be split.
In total, it seems that both graphs capture different aspects of $\mathbb{M}$ and its critical points, but none of the graphs is capable of detecting the ‘hole’ in its middle. For this, we need to combine both of them—this means that we have to go back to level sets, though, because we saw earlier that level sets are the intersection of superlevel and sublevel sets.
The Reeb graph of a manifold
When used correctly, the level sets enable us to combine information about both the superlevel and the sublevel sets. This permits us to finally obtain a compressed descriptor, the Reeb graph^{10}, of a manifold $\mathbb{M}$.
Calculation. To calculate a Reeb graph, we first represent $\mathbb{M}$ by its critical points, i.e. its minima, maxima, and saddle points. Each of these points now becomes a node in a graph. We now connect different critical points according to how the connectivity of the level set changes as we pass a critical value: for example, we connect a local maximum to the local minimum at which it is joining another connected component. More precisely—but also, with slightly more jargon—we consider two regular points $x, y \in \mathbb{M}$ with $f(x) = f(y)$ to be equivalent if they are in the same connected component of the level set for $c := f(x) = f(y)$. We express this equivalence relation by ‘compressing’ equivalent points to a single line that connects two critical points. This is justified according to the theorem mentioned above, which states that the underlying shape of the level set set does not change for regular points. There is a nice visual interpretation of this operation: we just pinch $\mathbb{M}$ so that equivalent points are compressed together. Here is how this looks for our 2D example:
We can follow the calculation along to see that it makes sense: going from top to bottom, we first have two connected components that are ‘created’ in the two ‘horns’ of $\mathbb{M}$. Their contours merge together at some point, and stay together until the ‘hole’ in $\mathbb{M}$ is reached. At this point, we have to create two arcs going out from the critical point. This is because at these values of $c$, as shown previously with the level sets, we have two distinct contours that we need to account for. As soon as the end of the ‘hole’ is reached, the contours merge again, but start splitting up for the ‘legs’ of $\mathbb{M}$.
Why Reeb graphs are great. The great thing about Reeb graphs is that they can always be represented by nodes and edges only. This means that we can draw them, or even calculate them automatically regardless of how complicated $\mathbb{M}$ is. Of course, a lot of shape information, such as the ‘volume’, gets lost here, but the Reeb graph offers a simple and efficient way to compare two highdimensional manifolds with each other—provided that there is a realvalued, differentiable function defined on them. Thus, the Reeb graph is an indispensable tool in Morse theory. It has found a lot of applications in the field of scientific visualisation. See Julien Tierny’s dissertation or Valerio Pascucci’s work for a lot of beautiful examples.
More? More!
There’s so much more we could talk about, but I will keep that for a followup article because this one is already getting quite long. If you feel adventurous, you can read John Milnor’s book on Morse theory, but it can be really tough to stomach if you are not familiar with topological concepts. Similarly, Robert Ghrist’s book on Elementary Applied Topology has a chapter on Morse theory with many beautiful illustrations.
I hope this inspires you to take a look at this beautiful subject. Until next time!

Despite the obvious similarity of names, it is absolutely unrelated to Morse code. More precisely, Morse theory was pioneered by Marston Morse, whereas we have Samuel F. B. Morse to thank for the Morse code. Both inventions and both people are interesting to look up, but please stay on this website a little bit longer before falling into a rabbit hole. ↩︎

If you are already familiar with the term differentiable, you probably do not need the intuition and are equally well at home with the standard definition based on limits. ↩︎

This name will make sense in a few seconds. Bear with me—mathematics does nothing without a good reason. ↩︎

Technically speaking, every high school student is thus equipped to know the basics on smooth manifolds. Who said modern mathematics has to be scary? ↩︎

We could equally well focus on sublevel sets. But, as mathematicians like to say, without loss of generality (w.l.o.g.), everything we discover for one kind of set will also apply to the other kind. ↩︎

Since we are thus decreasing the $c$level, one could see this as some kind of reverse climate change… ↩︎

Taken from the ‘related work’ chapter of my dissertation, in case you are interested. ↩︎

There is a mathematical legend that an old course on Morse theory had the lecturer, potentially Marston Morse, or, in other variations of the legend, John Milnor, come in with a model of a landscape that was subsequently filled with water to illustrate these concepts. I was unable to track this video down, but I want to believe that it exists. ↩︎

A good example is the function $f(x) = x^3$, which has a saddle point at $x = 0$. Saddle points are quite notorious in machine learning, for example, because they may impede the training of models. In Morse theory, they are merely a special type of critical point. ↩︎

See Julien Tierny’s Ph.D. thesis for more explanations and truly magnificent visualisations. ↩︎