Minimum elucidating examples
I tend to learn new concepts best by seeing an example of them. This dates back to when I first started studying mathematics: I felt that I was able to actually work with a certain definition once I had at least an intuitive grasp of what is meant. I quickly discovered that there are at least three kinds of examples. The first kind is trivial, leading to even more confusion, the second kind is too complex, and the third kind is actually elucidating. It is the third kind of example that I am primarily interested in here, so let’s quickly discuss the other two and move on.
Examples of the first kind are trivial. Suppose you want to illustrate the concept of addition and state that $1 + 0 = 1$. This is obviously correct, but it teaches you very little, because it mixes addition itself with the special property that $a + 0 = a$, which acts as a confounder. Likewise, mentioning the empty set $\emptyset$ as an example of a certain kind of structure is always good for some laughs,^{1} but it does not really help you understand what is going on.
Examples of the second kind are no more helpful. They are characterised by a rapid increase in complexity that just bogs you down without really helping you gain intuition or understanding. This Abstruse Goose comic provides a marvellous instance.^{2} Other examples include, say, teaching you about determinants by letting you invert a $10 \times 10$ matrix, when a much smaller matrix would have been sufficient.
Examples of the third kind help you grow and understand. I call these examples minimum elucidating examples: they have just the right amount of complexity to not be wholly trivial, but they are simple enough to be remembered and understood correctly. For instance, when learning about group theory, some elucidating examples are the additive group of the integers $(\mathbb{Z}, +)$, the group $\mathbb{Z}/2\mathbb{Z}$ (which has only two elements), or, to be a little bit more complex, the general linear group $\mathrm{GL}_n(\mathbb{R})$ of all invertible real $n \times n$ matrices. The utility of such an elucidating example depends heavily on the context: as a student, I preferred learning about the ‘number groups’ first, but nowadays, my go-to example is the general linear group, because it is more abstract and provides a better glimpse into the power of group theory.
Closely related to such elucidating examples are elucidating nonexamples (or counterexamples). In group theory, the natural numbers $\mathbb{N}$ under addition come to mind: every element except $0$ lacks an additive inverse, so $\mathbb{N}$ is not a group. Every student can easily grasp from this that the additional requirements raise the bar for something to be called a group, and, at the same time, that demanding these properties exposes a much richer structure.
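For small finite sets, the group axioms can simply be checked by brute force. The following is a minimal sketch, not a definitive implementation: `is_group`, `add_mod_2`, and `capped_add` are names made up for this illustration. Since $\mathbb{N}$ is infinite and cannot be enumerated, the nonexample here is a hypothetical finite stand-in, namely $\{0, 1, 2, 3\}$ with addition capped at $3$, which mimics $\mathbb{N}$ in that only $0$ has an inverse.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force check of the group axioms on a finite set."""
    elements = list(elements)
    # Closure: the operation must never leave the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: (a * b) * c == a * (b * c) for all triples.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Identity: some e with e * a == a * e == a for every a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every a needs some b with a * b == b * a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)


def add_mod_2(a, b):
    """Addition in Z/2Z."""
    return (a + b) % 2


def capped_add(a, b):
    """Addition on {0, 1, 2, 3} capped at 3 (finite stand-in for N)."""
    return min(a + b, 3)


z2_is_group = is_group(range(2), add_mod_2)
capped_is_group = is_group(range(4), capped_add)
print("Z/2Z is a group:", z2_is_group)
print("capped addition is a group:", capped_is_group)
```

The capped addition passes the closure, associativity, and identity checks and fails only at the inverse check, which is the same requirement that $\mathbb{N}$ fails.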
Of particular pedagogical interest are examples that defy our presuppositions. These are counterexamples to our misplaced intuitions! For instance, the Cantor set is a marvellous counterexample in measure theory:^{3} typically, students only encounter sets with measure zero that are countable. This can lead to the wrong impression that ‘measure zero’ and ‘countable’ are somewhat equivalent. The Cantor set destroys this illusion, as it constitutes a set that is uncountable while having measure zero! My mind was shattered when I first learned about this, and since then, the Cantor set has been a cherished counterexample in my toolbox.
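Both claims can be checked with a short computation. This is only a sketch, and it assumes the standard middle-thirds construction, which is not spelled out above: start with $C_0 = [0, 1]$, obtain $C_n$ by removing the open middle third from each of the $2^{n-1}$ intervals of $C_{n-1}$, and set $C = \bigcap_{n} C_n$. Writing $\lambda$ for the Lebesgue measure, $C_n$ consists of $2^n$ intervals of length $3^{-n}$ each, so

$$
\lambda(C_n) = 2^n \cdot 3^{-n} = \left(\frac{2}{3}\right)^n,
\qquad
0 \le \lambda(C) \le \lim_{n \to \infty} \left(\frac{2}{3}\right)^n = 0.
$$

For uncountability, note that $C$ contains exactly those $x \in [0, 1]$ that have a ternary expansion using only the digits $0$ and $2$. Replacing every digit $2$ by $1$ and reading the result as a binary expansion yields a surjection from $C$ onto $[0, 1]$, so $C$ is uncountable.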
Having switched academic fields a while ago, I sometimes wonder what minimum elucidating examples we should develop for machine learning (ML). What is the smallest useful architecture and data set that demonstrates issues such as vanishing gradients? Is there a way to store and disseminate such counterexamples? Would the aspiring ML researcher find them useful and digest them, or are we still in the ‘pragmatic phase’ of our field, in which we use whatever works without caring too much about the theoretical underpinnings? Time will tell, and I for one hope to collect as many juicy minimum elucidating (counter)examples as I possibly can.
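As one possible candidate for the vanishing-gradient question, here is a minimal sketch. It does not claim to be the smallest useful example. The ‘architecture’ is a chain of scalar sigmoid units that share one weight, and the ‘data set’ is a single input. The function names, the weight of $1.0$, and the input of $0.5$ are all assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def input_gradient(depth, weight=1.0, x=0.5):
    """Derivative of the output w.r.t. the input for a chain of
    `depth` scalar sigmoid units that all share the same weight."""
    activation = x
    gradient = 1.0
    for _ in range(depth):
        activation = sigmoid(weight * activation)
        # Chain rule: sigmoid'(z) = s(z) * (1 - s(z)), which never exceeds 0.25.
        gradient *= weight * activation * (1.0 - activation)
    return gradient


gradients = {depth: input_gradient(depth) for depth in (1, 5, 10, 20)}
for depth, value in gradients.items():
    print(f"depth {depth:2d}: gradient {value:.3e}")
```

With a weight of $1$, every layer multiplies the gradient by a factor of at most $0.25$, so the gradient shrinks at least geometrically with depth. Nothing needs to be trained to see the effect.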
Until next time, have fun being exemplary!

^{1} Full disclosure: as a cheeky undergraduate, I did this a few times in oral exams. It is funny but probably less so for the professors who have to endure that sort of humour all day long.

^{2} I disagree with the comic on one thing, though: maths textbooks tend to follow this recipe more often than computer science textbooks, in my experience.

^{3} Measure theory deals with measuring subsets of Euclidean space, assigning them a number that describes their ‘size’; think of a generalisation of the concepts of area or volume. Measure theory is often invoked when dealing with problematic solutions or inputs to functions: essentially, one tries to show that while such degenerate inputs may exist, they are ‘too small’ (in the sense of measure theory) to worry about.