A Data-Driven Approach for Determining Weights in Global Similarity Functions
Peer reviewed, Journal article
Accepted version
Date
2019
Original version
Lecture Notes in Computer Science (LNCS). 2019, 11680 125-139. 10.1007/978-3-030-29249-2_9
Abstract
This paper presents a method to discover initial global similarity weights while developing a case-based reasoning (CBR) system. The approach combines multiple feature relevance scoring methods and uses the relevance of each feature within each scoring method. The objective of this work is to utilize the characteristics of a dataset when creating similarity measures. The primary advantage of the method is that it is data-driven, which helps when domain knowledge is lacking in the early phase of CBR system development. Experiments on multiple public datasets show that the method improves the performance of similarity measures for a CBR system in discriminating relevant similar cases. The results are evaluated with a method suitable for unbalanced datasets.
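The abstract does not name the scoring methods or say how their outputs are combined, so the following Python sketch is only an illustration of the general idea, not the authors' procedure. It assumes two stand-in relevance scorers (absolute Pearson correlation and a Fisher-style score), normalizes each scorer's output so it sums to 1, averages the results into one weight per feature, and uses those weights in a weighted-sum global similarity. All function names, the toy data, and the range-based local similarity are hypothetical.

```python
from statistics import mean, pvariance


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between a feature column and numeric labels."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def fisher_score(xs, ys):
    """Spread of the class means divided by the pooled within-class variance."""
    overall = mean(xs)
    between = within = 0.0
    for label in set(ys):
        group = [x for x, y in zip(xs, ys) if y == label]
        between += len(group) * (mean(group) - overall) ** 2
        within += len(group) * pvariance(group)
    # Small constant avoids division by zero for perfectly separated features.
    return between / (within + 1e-12)


def normalize(scores):
    """Scale scores to sum to 1; fall back to equal weights if all are zero."""
    total = sum(scores)
    if not total:
        return [1 / len(scores)] * len(scores)
    return [s / total for s in scores]


def global_weights(rows, labels, scorers=(abs_correlation, fisher_score)):
    """Average the normalized relevance scores of several scorers per feature.

    Assumption: a plain mean across scorers; the paper may combine them differently.
    """
    columns = list(zip(*rows))
    per_scorer = [
        normalize([scorer(col, labels) for col in columns]) for scorer in scorers
    ]
    return [mean(ws) for ws in zip(*per_scorer)]


def global_similarity(a, b, weights, ranges):
    """Weighted sum of local similarities, each 1 - |difference| / feature range."""
    local = [1 - abs(x - y) / r if r else 1.0 for x, y, r in zip(a, b, ranges)]
    return sum(w * s for w, s in zip(weights, local))


# Toy case base (made up): feature 0 separates the classes, feature 1 barely does.
rows = [(1.0, 5.0), (1.2, 3.0), (3.0, 4.0), (3.3, 4.5)]
labels = [0, 0, 1, 1]

weights = global_weights(rows, labels)
ranges = [max(col) - min(col) for col in zip(*rows)]
same_class = global_similarity(rows[0], rows[1], weights, ranges)
other_class = global_similarity(rows[0], rows[2], weights, ranges)
```

On this toy data the discriminative feature receives most of the weight, so two cases from the same class score as more similar than two cases from different classes. In an actual CBR system the resulting weights would serve only as initial values, to be refined once domain knowledge becomes available.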