The Prevalence of Errors in Machine Learning Experiments
Time: 17 May 2022, 13:00-15:00
Location: Campus Gräsvik
We are glad to welcome Martin Shepperd, Gothenburg University/Brunel University London, to BTH on 17 May for his presentation.
Venue: Room J1360. If you want to participate via Zoom, please contact email@example.com
Computational experiments are the dominant paradigm to understand and compare machine learning algorithms. Typically, multiple learning algorithms (the treatments) are compared over multiple datasets that provide training and validation subsets using various predictive performance metrics, i.e., the response variables.
Such experimental designs are referred to as repeated-measures designs. This way we build knowledge through sense-making of many results. But how sure can we be that our experimental results are reliable? I address this question by examining the domain of software defect prediction. A re-analysis of experiments found that ~40% contained inconsistent results and/or basic statistical errors. Elsewhere I show that inappropriate response metrics can change not only the magnitude of results but also the direction of effects in ~25% of cases.
We all make errors, and there can be considerable complexity in our computational experiments, so I recommend (i) use open science to expose studies to scrutiny, (ii) try to avoid dichotomous inferencing methods and (iii) use meta-analysis with caution!
About Martin Shepperd
Martin Shepperd is the 2022 Swedish Tage Erlander research professor funded by the Swedish Research Council – the first professor of Computer Science holding this professorship since its inception in 1982. This year the professorship is placed at Gothenburg University, and he also has the chair of Software Modelling & Technology at Brunel University London. He has a BSc in Economics, and an MSc and PhD in Computer Science. He worked as a software developer for HSBC before returning to academia. He has published 3 books and more than 180 refereed research articles in the areas of software engineering and machine learning. He is a fellow of the British Computer Society.
For questions, contact Anna Eriksson, firstname.lastname@example.org.