Show simple item record

dc.contributor.advisor: Öztürk, Pinar
dc.contributor.author: Valle, Kjetil
dc.date.accessioned: 2014-12-19T13:37:17Z
dc.date.available: 2014-12-19T13:37:17Z
dc.date.created: 2011-09-13
dc.date.issued: 2011
dc.identifier: 440510
dc.identifier: ntnudaim:5757
dc.identifier.uri: http://hdl.handle.net/11250/252468
dc.description.abstract: This thesis presents a graph-based approach to the problem of text representation. The work is motivated by the need for better representations for use in textual Case-Based Reasoning (CBR). In CBR, new problems are solved by reasoning from similar past problem cases. When the cases are represented as free text, measuring the similarity between a new problem and previously solved problems becomes a challenging task. The case documents need to be re-represented before they can be compared or matched. Textual CBR (TCBR) addresses this issue. We investigate automatic re-representation of textual cases, in particular measuring the salience of features (entities in the text) towards this end. We use the classical vector space model from Information Retrieval (IR) but investigate whether graph representations and salience inference using graphs can improve on the Term Frequency (TF) and Term Frequency-Inverse Document Frequency (TF-IDF) measures, the bag-of-words approaches predominant in IR. Our special focus is whether, and possibly how, the co-occurrence and syntactic dependency relations between terms have an impact on feature weighting. We measure salience through the notion of graph centrality. We experiment with two types of application tasks: classification and case retrieval. Although classification is not a typical TCBR task, datasets are easier to find for it, and the centrality measures we have studied are not specific to TCBR. The experiments on this task are therefore relevant to the second application task, which is our ultimate target. We test various centrality metrics described in the literature, distinguish between local and global weighting measures, and compare them on both application tasks. In general, our graph-based salience inference methods perform better than TF and TF-IDF.
dc.language: eng
dc.publisher: Institutt for datateknikk og informasjonsvitenskap
dc.subject: ntnudaim:5757
dc.subject: MTDT datateknikk
dc.subject: Intelligente systemer
dc.title: Graph-Based Representations for Textual Case-Based Reasoning
dc.type: Master thesis
dc.source.pagenumber: 143
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap

