Supervised Machine Learning for Science

How to stop worrying and love your black box



Supervised Machine Learning for Science is a philosophical and pragmatic justification for applying machine learning models in research.



Machine learning has revolutionized science, from folding proteins and predicting tornadoes to studying human nature. While science has always had an intimate relationship with prediction, machine learning has amplified this focus. But can this hyper-focus on predictive models be justified? Can a machine learning model be part of a scientific model? Or are we on the wrong track?

In this book, we explore and justify supervised machine learning in science. However, a naive application of supervised learning won’t get you far, because machine learning in its raw form is unsuitable for science: it lacks interpretability, uncertainty quantification, causality, and many other desirable attributes. Yet we already have all the puzzle pieces needed to improve machine learning, from incorporating domain knowledge and ensuring the representativeness of the training data to creating robust, interpretable, and causal models. The problem is that the solutions are scattered everywhere.

This book brings together the philosophical justification and the solutions that make supervised machine learning a powerful tool for science.

The book consists of two parts:

  • Part 1 discusses the relationship between science and machine learning.
  • Part 2 addresses the shortcomings of supervised machine learning.

Who This Book Is For

This book is aimed at scientists who use, or want to use, machine learning. We also believe that anyone who uses machine learning beyond pure prediction will benefit from it.


  • You should know the basics of machine learning.
  • You should be interested in machine learning beyond prediction.

Table Of Contents (work in progress)

  1. Preface
  2. Introduction
  3. Bare-Bones Supervised Machine Learning
  4. The Role of Prediction in Science
  5. Machine Learning and Other Scientific Goals: A Clash
  6. Justification to Use Machine Learning
  7. Bare-Bones Machine Learning is Insufficient
  8. Generalization
  9. Domain Knowledge
  10. Interpretability
  11. Causality
  12. Robustness
  13. Uncertainty
  14. Reporting
  15. Reproducibility