Feature Analysis of Supervised Machine Learning Models in IDE-Based Learning Analytics - Exploring the use of correlation coefficients and p-values as feature utility measures through estimating student performance in an introductory programming course
dc.contributor.advisor | Trætteberg, Hallvard | |
dc.contributor.author | Nygård, Boye Borg | |
dc.date.accessioned | 2018-11-13T15:00:35Z | |
dc.date.available | 2018-11-13T15:00:35Z | |
dc.date.created | 2018-06-14 | |
dc.date.issued | 2018 | |
dc.identifier | ntnudaim:15937 | |
dc.identifier.uri | http://hdl.handle.net/11250/2572390 | |
dc.description.abstract | Due to the recent proliferation of large datasets collected from human behavior in digital environments, IDE-based learning analytics using supervised learning has emerged as a scientific field. However, because the field is new, research methods tailored to the needs of IDE-based learning analytics have yet to be developed. In this paper, we present a research method for evaluating features used in supervised learning models in relation to their effect on the model's performance. We show that correlation coefficients in combination with p-values can be used as a measure of a feature's usefulness. The goal of the method is to enable researchers to understand and compare different features, allowing a higher degree of utilization of previous research and increasing the overall research value of supervised learning in IDE-based learning analytics. | |
dc.language | eng | |
dc.publisher | NTNU | |
dc.subject | Computer Science, Artificial Intelligence | |
dc.title | Feature Analysis of Supervised Machine Learning Models in IDE-Based Learning Analytics - Exploring the use of correlation coefficients and p-values as feature utility measures through estimating student performance in an introductory programming course | |
dc.type | Master thesis |