dc.contributor.author | Chitrakar, Ambika Shrestha | |
dc.contributor.author | Petrovic, Slobodan | |
dc.date.accessioned | 2019-06-12T09:29:17Z | |
dc.date.available | 2019-06-12T09:29:17Z | |
dc.date.created | 2019-02-28T13:51:47Z | |
dc.date.issued | 2018 | |
dc.identifier.citation | IEEE International Conference on Big Data (Big Data). 2018 | nb_NO |
dc.identifier.isbn | 978-1-5386-5035-6 | |
dc.identifier.uri | http://hdl.handle.net/11250/2600575 | |
dc.description.abstract | Analyzing digital evidence has become a big data problem, which requires faster methods that can handle it on a scalable framework. The standard k-means clustering algorithm is widely used in analyzing digital evidence. However, it is a hill-climbing method, and it becomes slower as the amount of data, its dimensionality, and the number of cluster centers increase. This paper presents a framework to implement the parallel k-means with triangle inequality (k-meansTI) algorithm on Spark, which is intended to improve the speed of the standard k-means algorithm by skipping many point-center distance computations while giving the same clustering results. Our experimental results show that the parallel implementation of k-meansTI on Spark can be faster than the Spark ML k-means when a data set is large, is not highly sparse, and is high dimensional. These results are based on experiments performed on six different data sets that vary in the number of features and the number of data instances. | nb_NO |
dc.language.iso | eng | nb_NO |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | nb_NO |
dc.relation.ispartof | 2018 IEEE International Conference on Big Data | |
dc.title | Analyzing Digital Evidence Using Parallel k-means with Triangle Inequality on Spark | nb_NO |
dc.type | Chapter | nb_NO |
dc.type | Peer reviewed | nb_NO |
dc.description.version | acceptedVersion | nb_NO |
dc.source.pagenumber | 3049-3058 | nb_NO |
dc.identifier.doi | 10.1109/BigData.2018.8622430 | |
dc.identifier.cristin | 1681413 | |
dc.description.localcode | © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | nb_NO |
cristin.unitcode | 194,63,30,0 | |
cristin.unitname | Institutt for informasjonssikkerhet og kommunikasjonsteknologi | |
cristin.ispublished | true | |
cristin.fulltext | postprint | |
cristin.qualitycode | 1 | |