Show simple item record

dc.contributor.advisor: Gulla, Jon Atle [nb_NO]
dc.contributor.advisor: Tomassen, Stein L. [nb_NO]
dc.contributor.advisor: Øhrn, Aleksander [nb_NO]
dc.contributor.author: Lund, Kristian [nb_NO]
dc.date.accessioned: 2014-12-19T13:37:56Z
dc.date.available: 2014-12-19T13:37:56Z
dc.date.created: 2011-10-19 [nb_NO]
dc.date.issued: 2011 [nb_NO]
dc.identifier: 449016 [nb_NO]
dc.identifier: ntnudaim:6047 [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/252668
dc.description.abstract: Extraction of keyphrases from individual documents is a research area in which one tries to extract a small set of keyphrases that describe the content of a single document. The advantage of this form of extraction is that it retains most of the semantic context from the document. In this thesis we focus on the news article domain and use the structure of a news article to improve the quality of the extracted keyphrases. An existing individual document keyphrase extraction algorithm is used as the basis. This algorithm is enhanced by implementing a weighting system based upon the structure of news articles. In addition, some other common methods for keyword extraction are applied. The effects of these changes are tested extensively in the evaluation phase. In the evaluation of the implemented prototype we find that the introduction of a weight-based system yields results that are equal to those of the basic algorithm and that few improvements can be made. We do, however, find that an automatically generated stopword list based on the corpus improves the results by 1-2%. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.subject: ntnudaim:6047 [no_NO]
dc.subject: MTDT datateknikk [no_NO]
dc.subject: Program- og informasjonssystemer [no_NO]
dc.title: Extracting Keyphrases from Individual News Articles [nb_NO]
dc.type: Master thesis [nb_NO]
dc.source.pagenumber: 80 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]


