Evaluating Top-N Recommendations Using Ranked Error Approach: An Empirical Analysis
Peer-reviewed journal article (published version)
Date: 2022
Abstract
Evaluation metrics or measures are necessary tools for evaluating and choosing the most appropriate modeling approaches within recommender systems. However, existing measures can fall short when assessing recommendation lists that should match users' top preferences. A possible reason for this shortcoming is that most measures focus mainly on the list-wise performance of the recommendations and generally do not consider item-wise performance. As a result, a recommender system might adopt a weak or less accurate modeling approach instead of the best one. To address these challenges, we propose a new evaluation measure that combines the rank order of a prediction list with an error-based metric, making it more powerful and discriminative and thus better suited to top-N recommendations. The main goal of the proposed metric is to give recommender system developers and researchers a better tool for choosing the best possible modeling approach, and hence maximizing the quality of top-N recommendations. To evaluate the proposed metric and compare its general properties against existing metrics, we perform extensive experiments with detailed empirical analysis. The experiments and analysis demonstrate the usefulness, effectiveness, and feasibility of the new metric.
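The abstract does not give the proposed metric's formula, but the general idea it describes, weighting item-wise prediction errors by their position in the ranked list, can be sketched in Python. The rank_weighted_error function below is a hypothetical illustration, not the paper's actual metric: it applies a DCG-style logarithmic rank discount to item-wise absolute errors, so mistakes near the top of a top-N list are penalized more heavily.

import numpy as np

def rank_weighted_error(y_true, y_pred, n=10):
    """Hypothetical rank-weighted error for a top-N list.

    Not the paper's metric: a sketch of the abstract's general idea,
    discounting item-wise absolute errors by a DCG-style logarithmic
    weight so that errors near the top of the ranking count more.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Rank items by predicted score (descending) and keep the top-N.
    order = np.argsort(-y_pred)[:n]

    # Logarithmic rank discount: rank 1 weighs most, later ranks less.
    ranks = np.arange(1, len(order) + 1)
    weights = 1.0 / np.log2(ranks + 1)

    # Position-weighted mean of item-wise absolute errors.
    errors = np.abs(y_true[order] - y_pred[order])
    return float(np.sum(weights * errors) / np.sum(weights))

# Two prediction lists with identical MAE (0.2), errors at different ranks:
true_r  = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
preds_a = np.array([4.0, 4.0, 3.0, 2.0, 1.0])  # error on a top-ranked item
preds_b = np.array([5.0, 4.0, 3.0, 2.0, 2.0])  # same error, bottom of list
print(rank_weighted_error(true_r, preds_a, n=5))  # larger: ~0.34
print(rank_weighted_error(true_r, preds_b, n=5))  # smaller: ~0.13

A purely list-wise error measure such as MAE cannot distinguish these two lists, which is exactly the item-wise blindness the abstract points to; a rank-aware error measure separates them.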