Performing reproductions to understand the state of reproducibility in current AI research
In recent years, the issue of reproducibility has gained increasing attention in many scientific fields, including Artificial Intelligence (AI). Reproducibility of published results is a cornerstone of the scientific method, yet recent studies in AI and other computational sciences have shown that many experiments cannot be reproduced and that current documentation practices are insufficient. In this project, reproductions are attempted of experiments from 30 highly cited AI papers from recent years. The goal is to provide a better understanding of the state of reproducibility in the field and to identify issues that limit reproductions. Three hypotheses are investigated. First, it is hypothesized that most studies are difficult to reproduce. Second, the issues that make reproductions difficult are hypothesized to be similar across different studies. Third, the level of documentation measured for an article is hypothesized to be related to how easily it can be reproduced. Of the 30 papers investigated, 22 reproduction attempts were performed, of which 10 were partially successful. The results corroborate the first and second hypotheses, while the third hypothesis can neither be rejected nor corroborated. Lastly, this project makes three contributions. The first is an overview of the current state of reproducibility in AI, provided by the results of the reproduction attempts. The second is a model for interpreting research articles in AI and estimating the level of documentation they provide. The third is a set of issue categories intended to cover most issues encountered during reproductions.