dc.contributor.author: Kastrati, Zenun
dc.contributor.author: Imran, Ali Shariq
dc.contributor.author: Yildirim Yayilgan, Sule
dc.date.accessioned: 2020-01-15T13:48:02Z
dc.date.available: 2020-01-15T13:48:02Z
dc.date.created: 2019-05-07T12:47:50Z
dc.date.issued: 2019
dc.identifier.citation: Information Processing & Management. 2019, 56 (5), 1618-1632. [nb_NO]
dc.identifier.issn: 0306-4573
dc.identifier.uri: http://hdl.handle.net/11250/2636466
dc.description.abstract: This paper presents a semantically rich document representation model for automatically classifying financial documents into predefined categories using deep learning. The model architecture consists of two main modules: document representation and document classification. In the first module, a document is enriched with semantics using background knowledge provided by an ontology and through the acquisition of its relevant terminology. Integrating the acquired terminology into the ontology extends the capabilities of semantically rich document representations with in-depth coverage of concepts, thereby capturing the whole conceptualization involved in documents. The semantically rich representations obtained from the first module serve as input to the document classification module, which aims to find the most appropriate category for each document through deep learning. Three different deep learning networks, each belonging to a different category of machine learning techniques, are used for ontological document classification with a real-life ontology. Multiple simulations are carried out with various deep neural network configurations, and our findings reveal that a feedforward network with three hidden layers and 1024 neurons obtains the highest document classification performance on the INFUSE dataset. The performance in terms of F1 score increases further by almost five percentage points, to 78.10%, for the same network configuration when the relevant terminology integrated into the ontology is applied to enrich the document representation. Furthermore, we conducted a comparative performance evaluation using various state-of-the-art document representation approaches and classification techniques, including shallow and conventional machine learning classifiers. [nb_NO]
dc.description.abstract: The impact of deep learning on document classification using semantically rich representations [nb_NO]
dc.language.iso: eng [nb_NO]
dc.publisher: Elsevier [nb_NO]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/deed.no [*]
dc.title: The impact of deep learning on document classification using semantically rich representations [nb_NO]
dc.type: Journal article [nb_NO]
dc.type: Peer reviewed [nb_NO]
dc.description.version: acceptedVersion [nb_NO]
dc.source.pagenumber: 1618-1632 [nb_NO]
dc.source.volume: 56 [nb_NO]
dc.source.journal: Information Processing & Management [nb_NO]
dc.source.issue: 5 [nb_NO]
dc.identifier.doi: 10.1016/j.ipm.2019.05.003
dc.identifier.cristin: 1696042
dc.description.localcode: © 2019. This is the authors’ accepted and refereed manuscript to the article. Locked until 15.5.2021 due to copyright restrictions. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ [nb_NO]
cristin.unitcode: 194,0,0,0
cristin.unitcode: 194,63,10,0
cristin.unitcode: 194,63,30,0
cristin.unitname: Norges teknisk-naturvitenskapelige universitet
cristin.unitname: Institutt for datateknologi og informatikk
cristin.unitname: Institutt for informasjonssikkerhet og kommunikasjonsteknologi
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 2


Unless otherwise stated, this item is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International