Clustering Uncertain Data Objects Using Jeffreys-Divergence and Maximum Bipartite Matching Based Similarity Measure
Peer reviewed, Journal article
Published version
Date
2021
Abstract
In recent years, uncertain data clustering has become a subject of active research in many fields, for example pattern recognition and machine learning. Much of this work replaces the traditional distance or similarity measures in existing centralized clustering algorithms with new metrics in order to handle uncertainty in the data. However, the way uncertain data are represented plays a crucial role in clustering them. In this paper, a Monte Carlo integration is adopted and modified to express uncertain data in a probabilistic form. Three similarity measures, one of them novel, are then used to determine the closeness between two probability distributions. These similarity measures are derived from the notions of Kullback-Leibler divergence and Jeffreys divergence. Finally, the density-based spatial clustering of applications with noise (DBSCAN) and k-medoids algorithms are modified and applied to one synthetic database and three real-world uncertain databases. The results indicate that the proposed clustering technique outperforms some of the existing algorithms.
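To make the two divergences named in the abstract concrete, the following is a minimal sketch of their textbook definitions for discrete distributions: KL(P || Q) = sum_i p_i log(p_i / q_i), and the Jeffreys divergence as the symmetrized sum KL(P || Q) + KL(Q || P). It is not the paper's similarity measures, whose exact form is not given in this record. The function names, the `eps` floor, and the example distributions are assumptions made purely for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Textbook KL(P || Q) for two discrete distributions of equal length.

    `eps` is an illustrative floor (an assumption, not from the paper) that
    avoids division by zero when q has a zero where p is positive.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


def jeffreys_divergence(p, q, eps=1e-12):
    """Jeffreys divergence: the symmetrized sum KL(P || Q) + KL(Q || P)."""
    return kl_divergence(p, q, eps) + kl_divergence(q, p, eps)


# Hypothetical example distributions over three outcomes.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
j_pq = jeffreys_divergence(p, q)
j_qp = jeffreys_divergence(q, p)

print("KL(P||Q) =", kl_pq)
print("KL(Q||P) =", kl_qp)
print("J(P,Q)   =", j_pq)
```

Unlike the Kullback-Leibler divergence, the Jeffreys divergence gives the same value in both directions, which makes it easier to use as a pairwise closeness measure inside algorithms such as DBSCAN or k-medoids.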