Authorship Identification of Research Papers

Skoglund, Simen

dc.contributor.advisor	Nørvåg, Kjetil
dc.contributor.author	Skoglund, Simen
dc.date.created	2015-08-06
dc.date.issued	2015
dc.identifier	ntnudaim:12344
dc.identifier.uri	http://hdl.handle.net/11250/2353615
dc.description.abstract	Authorship identification is a technique for determining who wrote an anonymous document by identifying and extracting the author's stylometric features. This thesis applies one authorship identification technique, classification, to a set of research papers to determine their authorship. We review the theory and previous work on authorship identification before presenting the implemented system. We then perform two separate experiments and discuss their results. The experiments show good results in specific cases, with an accuracy of 100% in the best case. The algorithms used are support vector machines, artificial neural networks, decision trees, random forests, and k-nearest neighbors. In our experiments, support vector machines and artificial neural networks performed best, while decision trees performed worst. Based on our results, we advise caution when applying authorship identification before or after a double-blind review, or when an author uses authorship identification to acquire an unbiased review of a research paper. Even though authorship identification should be used with caution, it remains a useful tool that gives a general indication of the authorship of an anonymous document.
dc.language	eng
dc.publisher	NTNU
dc.subject	Informatics, Data and Information Management
dc.title	Authorship Identification of Research Papers
dc.type	Master thesis

Associated file(s)

Filename: 12344_FULLTEXT.pdf
Size: 1.233Mb
Format: PDF

Filename: 12344_ATTACHMENT.zip
Size: 2.807Gb
Format: application/zip

Filename: 12344_COVER.pdf
Size: 234.3Kb
Format: PDF

This item appears in the following collection(s)

Institutt for datateknologi og informatikk [6552]
