# A visual introduction to Morse theory

## Tags: howtos, research

In a previous article, I already introduced
*manifolds* and their, ahem, manifold ways. Among the many interesting
comments I received about this article, some were dealing with a basic
question, namely how to measure the properties of a manifold. In this
article, I want to present a few concepts from Morse theory, one of the
‘classical’ tools to analyse a manifold^{1}.

# The premise

The underlying idea of Morse theory is to study a manifold $\mathbb{M}$
by analysing *functions* defined on that manifold. This is actually
a common trick in mathematics: if something is too complex to be studied
*directly*, just observe its behaviour when it interacts with something
that you understand somewhat better. In our case, we study the behaviour
of *real-valued functions* on a manifold, i.e. functions of the form
$f\colon\mathbb{M}\to\mathbb{R}$ that assign each point $x \in
\mathbb{M}$ a corresponding real-valued measurement $f(x)$.

In some sense, these functions are the ‘natural’ thing to study here
because they match our intuition quite well. For example, $f$ could be
a function that measures a temperature (because temperatures are
typically real-valued), the air pressure, or the elevation above sea
level. Of course, there are some technical requirements for $f$ to be an
admissible function: it has to be *differentiable*. Intuitively^{2},
this means that our function does not have any breaks, bends, or other
weird behaviour.

Thus, temperature, air pressure, or elevation above sea level are the perfect examples to keep in mind—we can evaluate them anywhere on, say, our beautiful planet (which is a manifold, as we learned in a previous article) and they will behave in a differentiable manner.

# Level sets

To start our foray into Morse theory, let us assume that $f$ is
a function that measures *elevation* above some plane of our manifold
$\mathbb{M}$. Again, it helps to think about this in ‘earthly terms’
and consider $f$ to measure the elevation above sea level. We are now
interested in the *level sets* of $f$^{3}. Given $c \in \mathbb{R}$,
the level set of $f$ according to $c$ is defined as

$$ \mathcal{L}_{f}(c) := \{x \in \mathbb{M} \mid f(x) = c\} $$

In other words, the level set contains all those points of the manifold
that get mapped to $c$ by $f$. This definition may seem somewhat
artificial, but you already know level sets under a different name: open
up any map and you will see *contour lines*, i.e. lines of equal
elevation; you can think of the level set as nothing but a contour line
for some parameter $c$. Of course, we have to draw the contour line
on some weird manifold $\mathbb{M}$, but its essence remains the same.

Let’s consider a simple example before discussing more properties: we
take a 1D differentiable function and select a single value for $c$.
Hence, according to our notation, we have $\mathbb{M} = \mathbb{R}$, which
is also a manifold^{4}.
The value of $c$ is shown as a dashed line; it makes a neat horizontal
cut (the cut is horizontal because by convention, we use the $y$
axis to denote function values). Every point—or rather, every
*segment*— that is intersected by this line is a part of
$\mathcal{L}_{f}(c)$.

Unless there is a horizontal segment in the function, every level set will only consist of a finite number of points here. Not very interesting in terms of their structure, but it serves to sharpen our thinking about level sets in higher dimensions. However, lest you consider these level sets to be useless, be forewarned—they will make a stellar comeback towards the end of the article.

# Superlevel and sublevel sets

Level sets might be nice, but at least in our 1D example, they were lacking
some structure, consisting merely of points. Let’s update our
definition. If we go back to the original level set definition, we can
see that we wanted $f(x) = c$ as a condition. This is somewhat
restrictive, so why not use an *inequality* here? This results in two
new types of sets:

We call $\mathcal{L}_{f}^{+}(c)$ the *superlevel set*, while we call
the second set the *sublevel set*. The mnemonic is to take a look at the
superscript of the set. A ‘$+$’ indicates that the condition is $f(x) \geq
c$, whereas a ‘$-$’ indicates that the condition is $f(x) \leq c$.

Superlevel and sublevel behave slightly better in terms of their structural properties. To see this, first take a look at a superlevel set of the same 1D function than before. Again, a horizontal cut indicates our $c$ parameter; everything that is part of the superlevel set is shown in black:

Behold, suddenly we obtain *segments* and not just mere points!
Intuitively, this makes sense: the function cannot behave in an
arbitrary manner—after all, we explicitly required it to be
differentiable. Hence, unless we pick the global maximum of the function
as our $c$, we will *always* have at least one small segment. Similarly,
for the sublevel set:

If we eyeball these two sets for long enough, we can see that they are
*complementary* to some extent. The previous figure contains segments
*below* the line and *on* the line, while the superlevel set figure
contains segments *above* and *on* the line. This observation leads to
a nice insight: the level set can be expressed as the *intersection*
between superlevel and sublevel set. This is because both of them
also contain the level set. In other words, we have:

This is a neat characterisation, but we can do much more when *varying*
$c$ instead of keeping it fixed.

# Connectivity in superlevel and sublevel sets

What happens when we *slightly* perturb $c$? In the case of a level set,
we cannot really say—we might have more intersections, or fewer.
Thus, let’s discard this type of set for now and focus on the superlevel sets
only^{5}. More precisely, we now take a look at what happens when we
*decrease* the value of $c$^{6}. Here are the superlevel sets for some
selected values:

We can see that the segments continue to grow as we decrease $c$.
Finally, when we reach the lowest value of the function, the two
segments will have become one segment. Let us introduce some more formal
terminology here: instead of using the word *segment*, we will use
the term *connected component*. This term is more fancy but also more
precise; all points within a specific connected component
are *connected* in the sense that there is path between them. We can see
that this is true for the 1D segments here, and it is equally true for
an arbitrary manifold.

The growth behaviour that we observed is one instance of a *monotonicity property* in
mathematics. It means that as we change our threshold $c$ (here, we
decrease it, but for sublevel sets, we have to *increase* it), *some*
of the structure from the previous step is preserved. Going from $c$ to
$c - \epsilon$, where $\epsilon$ is an incredibly small number, we know
that every point that is part of the superlevel set for $c$ will *also*
be part of the superlevel set of $c - \epsilon$. Some more points might
occur, but we can *keep* (!) the ones we have already seen in
the previous step. Discovering such properties is always exciting
because it gives us some hints about preservation properties.

# A critical analysis

So far, we know that superlevel sets are well-behaved and display monotonicity upon decreasing the threshold. We can find out even more, though. If we briefly reconsider the previous figure that depicts different thresholds, we can make the following observations:

- Segments—connected components—can only grow when we decrease $c$. They never lose any points because of monotonicity.
- If $c$ contains a (local) minimum, connected components are joined.
- If $c$ does not contain a (local) minimum, connected components only grow.

This seems to suggest that not all values of $c$ are equally important. Minima seem to be more important than other points. What about maxima? They are also playing a certain role. To see this, let’s start the superlevel set calculation from the (literal) top (of the function):

We can see that for extremely large values of $c$, there is only *one*
connected component, namely the one that *appears* as we pass the first
maximum (the *global* maximum). As soon as $c$ passes the second
maximum, we obtain a second connected component. We can thus conclude:

Connected components in superlevel sets

appearat (local) maxima, andmergeat (local) minima. All other points only extend a connected component, so they are not ‘interesting’.

Morse theory uses the term *critical points*, or *critical
values* (since technically, $c$ determines the superlevel set but
is *not* a point of the manifold) for the local extrema; all
other points are thus considered to be *regular points*. A famous
theorem in Morse theory states that the topology (or the intrinsic ‘shape’)
of the manifold does not change in an interval $[a, b]$ if said interval
does not contain a critical value.

# A slightly more complex example

As always, everything we discovered so far also applies to higher
dimensions. The following example^{7} shows a slightly more complex manifold
and illustrates all the sets we encountered so far:

From left to right, we see the *level set*, the *superlevel set*, and
the *sublevel set* for the same critical value $c$. The vertical axis
is used to indicate how $f\colon\mathbb{M} \to \mathbb{R}$ works here;
it measures the elevation above the ‘plane’ of the manifold. Since
$\mathbb{M}$ has no *interior* here, superlevel and sublevel sets can
also be seen as filling the manifold with water—Morse theory, in
this setting, then describes how different contours merge and split,
as the water level is varied^{8}.

**A caveat.** There is a slight technicality here, arising from the
increase in dimensionality: now, our function $f$ does not only have
minima and maxima, but also *saddle points*, i.e. points with
a zero derivative that are *not* extrema themselves^{9}. The statements
above about extrema thus have to be amended to include them. But other
than that, nothing changes.

# Towards a manifold descriptor

After this *tour d’horizon*, it is time to summarise what we discussed:

- We learned that, given a differentiable function $f\colon\mathbb{M}\to\mathbb{R}$, there are interesting sets on a manifold that we can study.
- These sets change their structure (or
*connectivity*) only at certain*critical values*that are determined by the minima, maxima, and saddle points of the function. There is no need to study*regular values*because they are, mathematically speaking, boring. - All of these sets can be calculated in arbitrary dimensions.

Let’s put this knowledge to good use and try to develop a descriptor of
a manifold. Since we only need to consider critical points, our
descriptor does not have to take regular points into account. As we saw
earlier, superlevel sets, for example, contain different connected
components that *merge* at a (local) minimum. Thus, we can
represent the critical points as *nodes* in a graph and use the *edges*
to indicate that a merge happens. This will naturally lead to a graph
structure. Here is the graph constructed from the superlevel sets:

We can clearly see the monotonicity here—when something is
*merged* into one connected component, it stays merged. Likewise,
here is the graph of the sublevel sets:

Again, the monotonicity is obvious: something that has been merged will
*never* be split.

In total, it seems that both graphs capture different aspects of
$\mathbb{M}$ and its critical points, but *none* of the graphs
is capable of detecting the ‘hole’ in its middle. For this,
we need to *combine* both of them—this means that we have to go
back to *level sets*, though, because we saw earlier that level sets are
the intersection of superlevel and sublevel sets.

# The Reeb graph of a manifold

When used correctly, the level sets enable us to combine information
about *both* the superlevel and the sublevel sets. This permits us to
finally obtain a compressed descriptor, the *Reeb graph*^{10}, of
a manifold $\mathbb{M}$.

**Calculation.** To calculate a Reeb graph, we first represent $\mathbb{M}$ by its critical points, i.e. its
minima, maxima, and saddle points. Each of these points now becomes
a node in a graph. We now *connect* different critical points according
to how the connectivity of the level set changes as we pass a critical
value: for example, we connect a local maximum to the local minimum at
which it is joining another connected component.
More precisely—but also, with slightly more jargon—we
consider two regular points $x, y \in \mathbb{M}$ with $f(x) = f(y)$ to
be *equivalent* if they are in the same connected component of the level
set for $c := f(x) = f(y)$. We express this *equivalence relation* by
‘compressing’ equivalent points to a single line that connects two
critical points. This is justified according to the theorem mentioned
above, which states that the underlying shape of the level set set does
not change for regular points.
There is a nice *visual interpretation* of this operation: we
just *pinch* $\mathbb{M}$ so that equivalent points are compressed
together. Here is how this looks for our 2D example:

We can follow the calculation along to see that it makes sense: going
from top to bottom, we first have two connected components that are
‘created’ in the two ‘horns’ of $\mathbb{M}$. Their contours merge together at
some point, and stay together until the ‘hole’ in $\mathbb{M}$ is
reached. At this point, we have to create *two* arcs going out from the
critical point. This is because at these values of $c$, as shown previously
with the level sets, we have *two* distinct contours that we need to
account for. As soon as the end of the ‘hole’ is reached, the contours
merge again, but start splitting up for the ‘legs’ of $\mathbb{M}$.

**Why Reeb graphs are great.** The great thing about Reeb graphs is that they
can *always* be represented by nodes and edges only. This means that we can draw them, or even
calculate them automatically regardless of how complicated $\mathbb{M}$
is. Of course, a lot of shape information, such as the ‘volume’, gets
lost here, but the Reeb graph offers a simple and efficient way to
compare two high-dimensional manifolds with each other—provided
that there is a real-valued, differentiable function defined on them.
Thus, the Reeb graph is an indispensable tool in Morse theory. It has
found a lot of applications in the field of scientific visualisation.
See Julien Tierny’s
dissertation
or Valerio Pascucci’s
work
for a lot of beautiful examples.

# More? More!

There’s so much more we could talk about, but I will keep that for a follow-up article because this one is already getting quite long. If you feel adventurous, you can read John Milnor’s book on Morse theory, but it can be really tough to stomach if you are not familiar with topological concepts. Similarly, Robert Ghrist’s book on Elementary Applied Topology has a chapter on Morse theory with many beautiful illustrations.

I hope this inspires you to take a look at this beautiful subject. Until next time!

- Despite the obvious similarity of names, it is absolutely unrelated to Morse code. More precisely, Morse theory was pioneered by Marston Morse, whereas we have Samuel F. B. Morse to thank for the Morse code. Both inventions and both people are interesting to look up, but please stay on this website a little bit longer before falling into a rabbit hole.
^{↩} - If you are already familiar with the term
*differentiable*, you probably do not need the intuition and are equally well at home with the standard definition based on limits.^{↩} - This name will make sense in a few seconds. Bear with me—mathematics does nothing without a good reason.
^{↩} - Technically speaking, every high school student is thus equipped to know the basics on smooth manifolds. Who said modern mathematics has to be scary?
^{↩} - We could equally well focus on sublevel sets. But, as mathematicians like to say,
*without loss of generality*(w.l.o.g.), everything we discover for one kind of set will also apply to the other kind.^{↩} - Since we are thus
*decreasing*the $c$-level, one could see this as some kind of reverse climate change…^{↩} - Taken from the ‘related work’ chapter of my dissertation, in case you are interested.
^{↩} - There is a mathematical legend that an old course on Morse theory had the lecturer, potentially Marston Morse, or, in other variations of the legend, John Milnor, come in with a model of a landscape that was subsequently filled with water to illustrate these concepts. I was unable to track this video down, but I want to believe that it exists.
^{↩} - A good example is the function $f(x) = x^3$, which has a
*saddle point*at $x = 0$. Saddle points are quite notorious in machine learning, for example, because they*may*impede the training of models. In Morse theory, they are merely a special type of critical point.^{↩} - See Julien Tierny’s Ph.D. thesis for more explanations and truly magnificent visualisations.
^{↩}