Doing ML Research: Gates Open, Come on In!

Tags: academia, research, musings

When colleagues learn about my background in mathematics and hear about my current research agenda, they often say things like ‘I wish I could also do this, but my topic is just too abstract,’ or ‘Man, ML is so unwelcoming.’

I strongly believe that both of these statements are wrong. This post is meant as an invitation to ML research and as a source of encouragement for those who are interested in ML but not yet ready to take the first step.

Caveat lector: It is hard to make blanket statements about any type of research community. ML researchers are certainly not a homogeneous bunch, so there will always be naysayers and toxic people. My intention is to build the community that I myself want to participate in, and change requires more people to be effective.

It Is Not Too Late (or: ML Needs You)

Current improvements and advances notwithstanding, it is not too late to join the field and contribute meaningfully. There is a world of difference between machine learning research as it is depicted in social media, news outlets, and the like, and machine learning research as it is done in practice. Of course, large language models have a lot of traction right now, but not everything revolves around them. There is still a need for different perspectives and strong theoretical contributions.

In fact, my impression is that contemporary machine learning research is a protoscience in the sense of Popper. There is no overarching or underlying theoretical framework yet that could be falsified.1

You can help us create such a framework! I strongly believe that, overall, machine learning will be served far better by an influx of people with different perspectives. This is similar to the birds and frogs metaphor in mathematics, but in the case of ML, the ‘divide’ is more along the lines of ‘engineers’ versus ‘theoreticians.’ Both parties like to build. The engineers like to build models, whereas the theoreticians like to build, well, theories. Both are required. Both are needed. Neither one of them talks to the other—and that is a problem.

Hence, new people can also serve as bridge builders, and thus enrich the field. It also seems that history is on my side here: ML used to be dominated by mathematicians and statisticians; it became more applied over time.2 As we are still facing fundamental questions that cannot be answered empirically, there is always space for new ideas and perspectives.

Talk ML to Me!

I hope that the preceding paragraphs have convinced you that joining ML is a good and timely idea. To support your transition, let me point out several tropes in ML papers that you might not be familiar with.

Ablation Studies

Good ML papers are chock-full of ablations. Essentially, the idea is that if you propose a model with several moving parts, you disable them one by one to see where the measured gains are coming from. For instance, is it your way of representing the data that makes your model perform better, is it the choice of aggregation mechanism, or is it something else entirely? A strong ML publication aims to answer these questions.
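To make this concrete, here is a minimal sketch of what an ablation loop could look like. The component names are made up, and `train_and_evaluate` is a stand-in for your actual training pipeline; it returns deterministic fake scores so the sketch runs end to end:

```python
import random

# Hypothetical components of a proposed model; the names are illustrative only.
COMPONENTS = ["positional_encoding", "attention_pooling", "data_augmentation"]

def train_and_evaluate(config):
    # Stand-in for the actual training and evaluation pipeline;
    # returns a deterministic fake score so the sketch runs as-is.
    return random.Random(repr(sorted(config.items()))).uniform(0.7, 0.9)

def ablation_study():
    # Train the full model first, then disable one component at a time.
    full = {component: True for component in COMPONENTS}
    results = {"full model": train_and_evaluate(full)}
    for component in COMPONENTS:
        variant = dict(full, **{component: False})
        results[f"without {component}"] = train_and_evaluate(variant)
    return results

for name, score in ablation_study().items():
    print(f"{name:30s} {score:.3f}")
```

Each row of the resulting table tells you how much a single component contributes, which is precisely the question an ablation study is meant to answer.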

This is very different from mathematics research, where you would not be expected to disentangle your calculations or proofs in such a manner; if you prove the Riemann hypothesis, you do not have to show that it is due to this ‘one weird trick’ in your proofs. ML, as a protoscience, is different here—we often do not directly understand what makes our models work. Hence the need for ablation studies.

Quantitative Evaluations

Another thing that might seem slightly weird is the propensity of ML reviewers to like quantitative evaluations. Everything should be a table, ideally with your proposed method having the ‘best’ numbers, whatever that means. At least that is the cliché! In practice, the situation can be more nuanced, but you should make sure to describe the merits of your method in detail. Merits could include…

  • the number of parameters (lower is often better, provided your performance does not suffer from it)

  • the predictive performance (which should ideally be high, or at least higher on some data sets)

  • the computational performance (which is admittedly not measured all that often any more, but that is all the more reason to show off a clever technique that is also faster than others)

…and many other aspects. To conclude this section: machine learning reviewers often care much more about breadth (your model versus other models on a large collection of data sets) than depth (your model versus some models on a specific data set). This is not to say that ML reviewers are incapable of nuance—on the contrary! It is just that, when confronted with novel concepts, ML reviewers tend to want to understand what a new method brings to the table.
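If you have never structured an evaluation this way, the following sketch shows the breadth-style loop in its simplest form: every method runs on every data set, and the results end up in a single table. All method and data set names, as well as the scores, are placeholders:

```python
import random

METHODS = ["ours", "baseline_a", "baseline_b"]       # placeholder names
DATASETS = ["dataset_1", "dataset_2", "dataset_3"]   # placeholder names

def evaluate(method, dataset):
    # Stand-in for training and evaluating `method` on `dataset`;
    # returns a deterministic fake accuracy so the sketch runs as-is.
    return random.Random(f"{method}/{dataset}").uniform(0.6, 0.95)

# Breadth-style evaluation: every method on every data set, one table.
print(f"{'method':12s}" + "".join(f"{d:>12s}" for d in DATASETS))
for method in METHODS:
    row = f"{method:12s}"
    for dataset in DATASETS:
        row += f"{evaluate(method, dataset):12.3f}"
    print(row)
```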

This desire comes from the bad habit of some researchers3 to add what I like to refer to as cosmetic mathematics to their papers. That is, instead of using mathematics in search of truth, better models, and general enlightenment, these bad apples add mathematics to make their method sound more awesome4 than it actually is—quite often banking on reviewers having an insufficient background to evaluate an idea on its mathematical merits alone. Hence, when confronted with papers that are more theory-laden, ML reviewers often retreat to safe ground and look for quantitative results.

That does not mean that case studies are not appreciated—you just need to make sure to hit the right tone in terms of impact or utility. A mistake my early drafts often made was to use a somewhat ‘self-serving’ example as a case study, such as data sets that contain specific structures that my method can leverage better than others. This is perfectly all right for motivating a method, but you should not stop there (a mistake I made quite often). So, go boldly through some existing papers and take a look at their evaluations.

And whatever you do: be sure to repeat your experiments multiple times and report standard deviations.5 Due to the stochastic nature of many machine learning models, it is vital to understand how stable your method is with respect to different initialisations and so on.
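A minimal sketch of this pattern, with `run_experiment` standing in for your actual training run and the scores being fake numbers for illustration:

```python
import random
import statistics

def run_experiment(seed):
    # Stand-in for training and evaluating your model with a given
    # random seed; returns a fake test accuracy for illustration.
    return random.Random(seed).gauss(0.85, 0.02)

# Repeat the experiment over several seeds and report mean ± std.
scores = [run_experiment(seed) for seed in range(5)]
print(f"accuracy: {statistics.mean(scores):.3f} "
      f"± {statistics.stdev(scores):.3f} over {len(scores)} runs")
```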

But Why?

I hope I have convinced you that ML research can be a worthwhile endeavour. In case you are still on the fence, let me be frank: it is absolutely crucial to ensure that ML research remains diverse. At present, there is a tendency toward quick engineering fixes that oversell their contribution. This trend is not good for the future of the field. I believe that more voices, in particular those used to more theoretical research, will make ML (more) honest.

Good luck on your journey!


  1. This is one of those phrases that will get me in trouble. ‘Protoscience’ does not mean that everyone is doing shoddy work. In fact, there are many subfields of machine learning that do have a strong underlying theoretical framework, such as anything related to statistical learning. I am talking about those aspects that are typically shown off outside the community; there, you will be very unlikely to find a theoretical explanation of the success of the attention mechanism, for instance. ↩︎

  2. Again, this is not a bad thing, but let me repeat that the image projected in social media is not necessarily representative of the field as a whole. ↩︎

  3. It really is a small set of people doing this, but their publications are often quite prominent. ↩︎

  4. Quidquid latine dictum sit, altum videtur. (‘Whatever is said in Latin sounds profound.’) ↩︎

  5. Some established researchers are also not doing this. Maybe because it makes their method look bad? ↩︎