A vademecum for those that are new to machine learning
I was recently asked to become a mentor for the New in ML 2019 Workshop, an event that is co-located with NeurIPS 2019 and specifically targets newcomers to machine learning. It is an honour for me to be mentioned on the same page as so many of my role models, and I look forward to inspiring discussions with all participants. I decided that I wanted to have something more tangible than conversations before the workshop starts: a resource to which I could point and that people might consider helpful. In the best scholarly tradition, I hope that this post will serve as a useful vademecum[1] for those who are new to this exciting field of research.
Caveat lector: I myself am rather new to machine learning research—I moved into this field after finishing my Ph.D. in multivariate data analysis and visualisation of complex data sets. My vantage point is thus by necessity not as high and panoramic as that of many of the other participants. Nevertheless, I sincerely hope that every reader will find at least a modicum of sage advice here. When reading this post, please think of the words of Francis Bacon:
Read not to contradict and confute; nor to believe and take for granted; nor to find talk and discourse; but to weigh and consider.
That being said, enjoy reading!
So you want to do research in machine learning? Regardless of your background—whether you are a seasoned veteran of many paper submissions or a graduate student about to write their first paper—I welcome you! You will find that machine learning (ML) is a vast field of research. The following sections will provide you with some hints and comments on certain aspects. Let us start by considering some common misconceptions.
Misconception 1: ‘Machine learning is just X in disguise’
Sometimes you will hear a sentence like the headline above, with values of X typically being ‘statistics’, ‘data science’, or even ‘sloppy mathematics’[2]. Do not let these things dismay you! You will find that machine learning is a vast research field with a lot of breadth and depth. Just take a look at the programme of any of the large conferences: the papers span many different topics, ranging from the highly theoretical to the very applied. There can be a place for everyone in this community!
You also may have encountered people who like to judge certain parts of this spectrum. I emphatically reject that notion—neither type of paper should be seen as ‘less valid’ or ‘less valuable’ than the other: empirical challenges encountered in a solid application paper can stimulate theoretical research, while an accessible theoretical paper may inspire novel applications. You will find that the best researchers often have papers on all sides of this spectrum.
Misconception 2: ‘Machine learning and deep learning are the same’
Again, a common misconception. While it is true that deep learning has had numerous spectacular successes, not everything in machine learning boils down to deep learning[3]. If you like it and feel comfortable in that area, good for you! If you do not feel that way, please know that there is room for lots of other methods. Whether you are interested in making algorithms fair, or making their output more interpretable, there is no need to only think of deep learning. Again, the best researchers have papers that are not restricted to a single framework.
In fact, one of the worst errors we scientists can make is seeing a dichotomy where there is none. You can contribute to deep learning and to another sub-field. Think of machine learning methods as a toolbox: sometimes you require a hammer, sometimes you require a drill, sometimes you may even need a screwdriver[4]. It is a bad idea to always confine oneself to a single tool, though.
Find out which topic suits you best, try to master that, and only then move on to other tools.
Misconception 3: ‘Machine learning requires complicated mathematics’
Before discussing this misconception, I have to point out its similarities to another misconception, namely the one that ‘one has to be a genius to do maths’. Terence Tao answered this misconception beautifully, eloquently, and with a lot of empathy in his blog, and I strongly encourage you to read this article first before returning to this post.
Back from reading the post? Superb! Let me now address this misconception. It is clear that you need to have some mathematical knowledge to understand concepts in machine learning. However, at the risk of aggravating people, I think that some knowledge of undergraduate mathematics[5] is sufficient to dive into machine learning. If you know a little bit about calculus and statistics, you should have no trouble understanding at the very least the intuition behind many papers. Even if you lack this knowledge (for now), do not be dismayed: you can pick up the basics quite rapidly by reading papers and blog posts, and by talking to others. To summarise: you do not have to be a master carpenter just to use a chair and a table, but you should know something about carpentry if you want to build your own furniture.
If you struggle with mathematics, do not give up! You have to be willing to learn and be somewhat tenacious to force yourself to understand difficult concepts. Use different sources to ‘stomach’ the material; everyone learns in a different manner. In any case, your mathematical knowledge should never become a barrier to your research. Ideas matter a lot in machine learning, so discuss them with your peers and try to learn the rigour along the way. Mathematics is but a language for formulating certain thoughts in a more precise manner. The underlying idea of your project is much more important than your ability to calculate gradients manually.
These are the three misconceptions I encountered quite often. In the following sections, I want to address a few other topics that I consider of importance for those new to the field and new to scientific publishing in general.
Your articles will be scrutinised by several experts. Sometimes, you will be happy with the results, while at other times, you may be frustrated. This is completely normal and to be expected—in mathematics, jokes about reviewers and referees have been quite common for more than a hundred years now.
My advice is to try to use the reviews to improve your paper. If a reviewer is not representing your paper correctly in their review, take this as an incentive to rewrite it and make it more accessible. If a reviewer is bemoaning the lack of a certain experiment, try to add this experiment, or make it very clear why you did not include it. Of course, you will not always get high-quality reviews. That is one of the downsides of the reviewing system. However, even though the temptation to vent publicly may be there, do not give one person so much power over your thoughts and your paper. Rather, try to find the good even in a bad review and use it to revise your paper. If you are really sure that something is amiss with your reviewers, the chain of command in most of the ML conferences involves writing to the ‘area chair’ (AC), who is usually a more senior researcher in charge of multiple papers. Your AC may have the power to overrule reviewer decisions, in particular when the reviews are biased. Moreover, an AC may also solicit more reviews for borderline papers, which can make the difference between an accept and a reject.
In any case, you should always consider that the reviewers are human beings just like the rest of us. Being a reviewer takes time, and it can be a very demanding job. The workload of some reviewers may be too high, potentially leading to a lack of diligence when reading papers. Restructuring and improving the current reviewing system is something that will keep our community busy for years to come.
(You may wonder about my thoughts on writing reviews, then. This is a topic best reserved for a future article, though: as someone new to ML, you should first try to understand the field and write your own research before having to review the work of others. In particular, if you have just started your Ph.D., I strongly recommend not taking on reviewing duties too early. Focus on your research first and try to get a feeling for what constitutes a ‘good’ paper.)
Closely tied to the reviews is the dreaded aspect of novelty. Just like a unicorn, novelty is a concept that is highly elusive and gets thrown around too much in many discussions. Already in approximately 300 BCE, the anonymous author of the book of Ecclesiastes wrote:
What has been will be again,
what has been done will be done again;
there is nothing new under the sun.
And yet, this did not stop us from developing and inventing quite a lot of ingenious things in the meantime. I see a lot of my friends and colleagues struggle with the idea of coming up with something ‘completely novel’, when in reality, there is no need for this hurdle. Good contributions can come in different packages. As a first project, for example, you may want to try to replicate something that has been done before and report your results back. This can be very surprising and instructive—both for you and the community.
If you feel overwhelmed and are convinced that ‘everything has been done already’, try to adopt another perspective. Read widely and look for the gaps in knowledge, or for the things that everyone takes for granted. Your first project does not have to change the world. It is common to consider your Ph.D. thesis to be a ‘masterpiece’ and the best work you will do in your career, but this is just incorrect. In particular when you want to stay in academia, your dissertation is only the beginning of your career. How sad it would be if that were the epitome of your research, then![6]
Furthermore, avoid the trap of having to beat the state of the art. Some cynical researchers believe that having a lot of bold cells in your results table is the sure ticket to get your paper accepted. I think the foremost focus should be on good science. Good science does not mean that your method outperforms everything in terms of accuracy. Every method has disadvantages and advantages, and good papers discuss both of them in an honest way. Do not force yourself to think in terms of ‘winning’ or ‘losing’ when working on a method; this is a very binary way of thinking that can make you lose sight of your goal, viz. the science behind your project. There are papers that do not outperform everything on every[7] data set, but rather give us new insights into known methods. The paper The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks by Jonathan Frankle and Michael Carbin embodies these properties perfectly. It is well worth a read and may provide hope to those that are misled by the idea that ML papers are only about ‘winning’ in the sense of ‘beating the state of the art’.
In short: winning means doing good science. End of story.
Another daunting aspect of ML involves writing papers. This subject has been debated at length by many people, including myself. In this post, I merely want to give a quick recipe and provide some relief for those who struggle. I do not recall the exact author of the quote, but I always liked the following ‘recipe’ for writing:
- Have something to say
- Say it
You can read these two ‘steps’ as a flippant way of avoiding the issue altogether. I see them more as a revelation of a few underlying truths, viz. that writing is hard and everyone needs to find their own voice. If this sounds strange to you, consider the differences between the narrative voices of our greatest authors. Some prefer the minimalism of an Ernest Hemingway, while others enjoy the maximalism of the late David Foster Wallace. Some use flowery prose, while others are very terse and omit almost all adjectives. Just like these authors, you need to find your voice when writing a paper. Doing so takes practice; you have to constantly rewrite and not put it off until the very last minute.
On a more practical note, I find it very helpful to think about papers that I enjoy reading and find out why I enjoy reading them. Most of the time, I can update my writing style to some extent. Of course, some papers are sloppily written—given the many publications in our field, this should come as no surprise. Thus, if you find a nugget of clear writing, preserve it and try to learn as much as you can from it.
On failure & success
I want to end this post with a brief discussion on failure and success[8]. While it may seem a trite observation, you should be aware that social media such as Twitter typically give you a view of the highlights of other people, while you yourself seemingly have to endure the out-takes of your own life. However, this is not the whole truth. Notice that everyone is happy to share their successes (and there is absolutely nothing wrong with that), but few people share their failures. And yet, failures are just a part of any normal career. You will fail at certain things: an experiment may not work, a paper might get rejected, your hypothesis might turn out to be incorrect, and so on. This is normal, and in fact, you should celebrate it to some extent. For some, the taste of success is so much sweeter when they had to overcome adversity. Our lives are marathons, not sprints!
So, please remember the asymmetry of how others might portray their lives. For example, when we tweeted about one of our ICLR papers, we did not disclose that it had been previously rejected at NeurIPS 2018. Likewise, when you read a CV, you will only see the publications that made it, not the ones that are still in resubmission, or in preparation, and so on. If you are not aware of this, it can create the wrong impression about the quality of your own research. If you want to peek behind the curtains, some people courageously display their ‘shadow CV’, a CV that contains a list of their failures. Use this to understand that failing is natural and do not be afraid to try again with the same enthusiasm. Shakespeare put this very succinctly:
Our doubts are traitors
and make us lose the good we oft might win by fearing to attempt.
I hope this post was able to provide you with some wisdom. There are so many more resources out there, though. As a parting gift, I encourage you to reach out to people with any questions. Ask them about your papers, ask them about their papers, ask them about strategies, discuss research, etc.—you will find that most are happy to share their experience. Plus, the more you talk about the things that trouble you, the more support you will get. And this is what makes our community stronger and why I am so happy to be a part of it.
Acknowledgements: I thank Christian Bock for discussing numerous aspects of this article.
[1] A contraction of the Latin words vade mecum, meaning ‘Go with me’, referring to a handbook or useful object that one should never be without.
[2] The last statement, and variations thereof, is typically found in the comments section of news articles on ML.
[3] Moreover, it would be a boring situation and essentially equivalent to stating that all of mathematics is calculus, merely because calculus is very prevalent.
[5] I am not emphasising this to denigrate the rigour of ML papers; I merely want to point out that you certainly do not have to have an advanced degree in mathematics to contribute to machine learning. Depending on your research, more skills are needed, of course.
[6] This does not mean that you should treat your thesis irreverently. Make it a documentation of your skills as a researcher, full of well-presented content, but do not fall into the trap that it has to be your best work. Strive to improve.
[7] Or rather, on every data set that may or may not have been carefully selected by the authors. Honi soit qui mal y pense.
[8] This is of course not specific to the ML community.