
dc.contributor.advisor: Öztürk, Pinar [nb_NO]
dc.contributor.author: Moen, Hans [nb_NO]
dc.date.accessioned: 2014-12-19T13:31:26Z
dc.date.available: 2014-12-19T13:31:26Z
dc.date.created: 2010-09-02 [nb_NO]
dc.date.issued: 2009 [nb_NO]
dc.identifier: 347127 [nb_NO]
dc.identifier: ntnudaim:4569 [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/250346
dc.description.abstract: Determining the similarity between two documents for information retrieval purposes requires more than knowing which words the documents use; how the words are used also matters. This thesis studies an algorithm called Holographic Reduced Representations (HRR) that takes into account both the co-occurrence of words and the way the words are used in sentences, that is, syntactic structure information. HRR is a relatively novel algorithm that performs text classification automatically from statistical information and represents concepts, i.e. text, as randomly initialized vectors. It captures term context information and term order information from sentences using vector addition and binding. Concepts related to HRR and text classification are introduced. The HRR algorithm's ability to capture term context information and term order information was tested, and ways to use this information at the document level are discussed. HRR's suitability for text classification compared with the traditional Vector Space Model (VSM) is tested and discussed. For a query term, the retrieved terms with the most similar order information tend to be terms with the same part of speech as the query. Using combined context and order information when retrieving terms or documents for a query gave results with increased depth, i.e. results that also reflect grammatical information, compared with context information alone. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.subject: ntnudaim [no_NO]
dc.subject: SIF2 datateknikk [no_NO]
dc.subject: Intelligente systemer [no_NO]
dc.title: The Use of Holographic Reduced Representations for Text Classification [nb_NO]
dc.type: Master thesis [nb_NO]
dc.source.pagenumber: 61 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]
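
The abstract mentions that HRR captures context information by vector addition and order information by binding. The record does not give the thesis's exact encoding, so the following is only a minimal sketch under stated assumptions: binding is taken to be circular convolution (the usual choice in the HRR literature), vectors are drawn from a normal distribution with variance 1/dim, and the example sentence, the `before` position marker, and all function names are hypothetical.

```python
import numpy as np


def random_vector(dim, rng):
    # Assumption: elements drawn i.i.d. from N(0, 1/dim).
    return rng.normal(0.0, 1.0 / np.sqrt(dim), dim)


def bind(a, b):
    # Circular convolution, computed in the frequency domain.
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))


def unbind(trace, cue):
    # Circular correlation: an approximate inverse of bind().
    return np.fft.irfft(np.fft.rfft(trace) * np.conj(np.fft.rfft(cue)), n=len(trace))


def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


rng = np.random.default_rng(0)
dim = 1024

# Hypothetical three-word sentence: "dog bites man".
vocab = {w: random_vector(dim, rng) for w in ["dog", "bites", "man"]}

# Hypothetical marker meaning "this word occurs before the target word".
before = random_vector(dim, rng)

# Context information for "bites": superposition (sum) of its neighbours.
context = vocab["dog"] + vocab["man"]

# Order information for "bites": the marker bound to the preceding word.
order = bind(before, vocab["dog"])

# Unbinding with the marker gives a noisy copy of the preceding word.
recovered = unbind(order, before)
sim_dog = cosine(recovered, vocab["dog"])
sim_man = cosine(recovered, vocab["man"])
sim_context_dog = cosine(context, vocab["dog"])

print(f"recovered vs dog: {sim_dog:.2f}")
print(f"recovered vs man: {sim_man:.2f}")
print(f"context vs dog:   {sim_context_dog:.2f}")
```

In this sketch the summed context vector stays similar to each neighbour, while the bound order vector only reveals the preceding word after unbinding with the matching marker. That difference is what lets the two kinds of information be stored and compared separately.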

