Performance analysis of machine learning classifiers on improved concept vector space models

Kastrati, Zenun; Imran, Ali Shariq

dc.contributor.author	Kastrati, Zenun
dc.contributor.author	Imran, Ali Shariq
dc.date.accessioned	2020-01-30T08:26:57Z
dc.date.available	2020-01-30T08:26:57Z
dc.date.created	2019-02-19T09:36:58Z
dc.date.issued	2019
dc.identifier.citation	Future Generation Computer Systems. 2019, 96, 552-562.	nb_NO
dc.identifier.issn	0167-739X
dc.identifier.uri	http://hdl.handle.net/11250/2638748
dc.description.abstract	This paper provides a comprehensive performance analysis of parametric and non-parametric machine learning classifiers, including a deep feed-forward multi-layer perceptron (MLP) network, on two variants of the improved Concept Vector Space (iCVS) model. In the first variant, a weighting scheme enhanced with the notion of concept importance is used to assess the weight of ontology concepts. Concept importance indicates how important a concept is in an ontology, and it is computed automatically by converting the ontology into a graph and then applying one of the Markov-based algorithms. In the second variant of iCVS, concepts provided by the ontology and their semantically related terms are used to construct concept vectors that represent the document in a semantic vector space. We conducted various experiments using a variety of machine learning classifiers for three different models of document representation. The first model is a baseline concept vector space (CVS) model that relies on an exact/partial match technique to represent a document in a vector space. The second and third models are iCVS models that employ, respectively, an enhanced concept weighting scheme for assessing the weights of concepts (variant 1) and the acquisition of terms that are semantically related to the concepts of the ontology for semantic document representation (variant 2). Additionally, seven different classifiers are compared for all three models using precision, recall, and F1 score. Results for multiple configurations of the deep learning architecture are obtained by varying the number of hidden layers and the number of nodes in each layer, and are compared to those obtained with conventional classifiers. The results show that classification performance is highly dependent on the choice of classifier, and that Random Forest, Gradient Boosting, and Multilayer Perceptron are among the classifiers that performed rather well for all three models.	nb_NO
dc.language.iso	eng	nb_NO
dc.publisher	Elsevier	nb_NO
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/deed.no	*
dc.title	Performance analysis of machine learning classifiers on improved concept vector space models	nb_NO
dc.type	Journal article	nb_NO
dc.type	Peer reviewed	nb_NO
dc.description.version	publishedVersion	nb_NO
dc.source.pagenumber	552-562	nb_NO
dc.source.volume	96	nb_NO
dc.source.journal	Future Generation Computer Systems	nb_NO
dc.identifier.doi	10.1016/j.future.2019.02.006
dc.identifier.cristin	1678590
dc.description.localcode	©2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).	nb_NO
cristin.unitcode	194,63,35,0
cristin.unitname	Institutt for elektroniske systemer
cristin.ispublished	true
cristin.fulltext	postprint
cristin.qualitycode	1
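
The abstract describes concept importance as being computed by converting the ontology into a graph and applying "one of the Markov based algorithms", without naming the algorithm in this record. As a rough illustration only, the sketch below assumes a PageRank-style power iteration, which is one well-known Markov-chain ranking method; it is not claimed to be the authors' implementation. The function name `concept_importance`, the damping value, the edge direction (sub-concept to super-concept), and the toy concept names are all hypothetical.

```python
def concept_importance(edges, damping=0.85, iterations=100, tol=1e-10):
    """PageRank-style importance scores for a directed concept graph.

    edges: list of (source, target) pairs. The edge direction and the
    damping factor are assumptions made for this sketch.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}

        # Every other node splits its damped rank evenly among its targets.
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share

        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy ontology: edges point from sub-concept to super-concept.
toy_edges = [
    ("Grant", "Funding"),
    ("Loan", "Funding"),
    ("Funding", "Finance"),
    ("Budget", "Finance"),
]
importance = concept_importance(toy_edges)
for concept in sorted(importance, key=importance.get, reverse=True):
    print(f"{concept}: {importance[concept]:.3f}")
```

In a weighting scheme of the kind the abstract outlines, scores like these could scale the weight of each concept in a document vector; how the paper actually combines them is not stated in this record.
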

Associated file(s)

Filename: Kastrati.pdf
Size: 1.154Mb
Format: PDF

This item appears in the following collection(s)

Institutt for elektroniske systemer [2308]
Publikasjoner fra CRIStin - NTNU [37727]

Unless otherwise stated, this item is licensed under Attribution 4.0 International