A theoretical and empirical assessment of probabilistic multiple choice tests
Abstract
In this thesis, the probabilistic multiple choice test is analysed empirically and theoretically, and it is suggested as an alternative to the traditional multiple choice test. The probabilistic multiple choice test has a long history. However, to our knowledge, no published research on the subject is based on test results from Norwegian students.
We will compare the theoretical performance of the traditional and probabilistic multiple choice tests. In addition, we will analyse their performance as estimators of level of knowledge. A reliable estimate of the level of knowledge requires that the students estimate their own abilities accurately. We will therefore analyse what may lead students to estimate their abilities inaccurately in a probabilistic multiple choice test. We call it overconfidence if the students overestimate their abilities, and conversely underconfidence if they underestimate them. Furthermore, we will take a closer look at score functions that could be suitable for the probabilistic multiple choice test.
This thesis is a quantitative research study of the probabilistic multiple choice test. The empirical research is based on a test administered to a group of students enrolled in the course TMA4240 Statistics at NTNU, who were chosen because of their knowledge of probability and statistics. Since the test was voluntary, an incentive to take it was given in the form of a chance to win one of two gift cards. The data provides a basis for analysis and inference on the probabilistic multiple choice test and the participants' overconfidence. The Dirichlet distribution is used to model the theoretical properties of the test. In addition, it is used to analyse the score functions with which we evaluate the students' performance.
The results show that the probabilistic multiple choice test with a logarithmic score function is an unbiased estimator of a participant's level of knowledge. A participant's ability to correctly estimate their own level of confidence is influenced by their sex, by whether a minimum score is required, by feedback, and by the score function used to calculate their score. We find that a good test for both female and male participants uses a logarithmic score function and gives feedback during the test.
In the field of education, the probabilistic multiple choice method has the potential to redefine the use of multiple choice tests: first, because it provides an accurate quantification of the student's level of confidence, and second, because it makes the student's knowledge transparent to the educator.