Reporting Standards for Machine Learning Based Science
By Sayash Kapoor et al.
Published on Sept. 19, 2023
Table of Contents
Abstract
Introduction
Box 1: What is ML-based science?
Methods
Section Item
Box 2: Goals of the REFORMS checklist
Summary
This document argues for clear reporting standards in machine learning (ML)-based science. It addresses common failures of validity, reproducibility, and generalizability that arise when ML methods are used in scientific research. The authors present the REFORMS checklist, which consists of 32 questions with accompanying guidelines, to help researchers, referees, and journals improve transparency and reproducibility. The checklist covers key areas such as study goals, computational reproducibility, data quality, modeling decisions, evaluation methods, data leakage prevention, metrics and uncertainty quantification, and generalizability. To prevent errors in ML-based science, the document emphasizes that researchers need to state their scientific claims clearly, execute the ML task correctly, and report performance accurately.