dc.contributor.advisor: Amble, Tore [nb_NO]
dc.contributor.advisor: Nordgård, Torbjørn [nb_NO]
dc.contributor.advisor: Gambäck, Björn [nb_NO]
dc.contributor.author: Ranang, Martin Thorsen [nb_NO]
dc.date.accessioned: 2014-12-19T13:30:31Z
dc.date.available: 2014-12-19T13:30:31Z
dc.date.created: 2010-01-11 [nb_NO]
dc.date.issued: 2010 [nb_NO]
dc.identifier: 293836 [nb_NO]
dc.identifier.isbn: 978-82-471-1973-0 [nb_NO]
dc.identifier.isbn: 978-82-471-1974-7 [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/250007
dc.description.abstract: No large-scale, open-domain semantic resource for Norwegian with a rich number of semantic relations currently exists. The existing semantic resources for Norwegian are limited in size, incompatible with the de facto standard resources used for Natural Language Processing for English, or both. The current and future cultural, technological, economic, and educational consequences of the scarcity of advanced Norwegian language-technological solutions and resources have been widely acknowledged (Simonsen 2005; Norwegian Language Council 2005; Norwegian Ministry of Culture and Church Affairs 2008). This dissertation presents (1) a novel method, consisting of a model and several algorithms, for automatically mapping content words from a non-English source language to (a power set of) WordNet (Miller 1995; Fellbaum 1998) senses with average precision of up to 92.1 % and recall of up to 36.5 %. Because an important feature of the method is its ability to handle compounds correctly, this dissertation also presents (2) a practical implementation, including algorithms and a grammar, of a program for automatically analyzing Norwegian compounds. This work also shows (3) how Verto, an implementation of the model and algorithms, is used to create Ordnett, a large-scale, open-domain lexical-semantic resource for Norwegian with a rich number of semantic relations. Finally, this work argues that the new method and the automatically generated resource make it possible to build large-scale, open-domain Natural Language Understanding systems that offer both wide coverage and deep analyses for Norwegian texts. This is done by showing (4) how Ordnett can be used in an open-domain question answering system that automatically extracts and acquires knowledge from Norwegian encyclopedic articles and uses the acquired knowledge to answer questions formulated in natural language by its users.
The open-domain question answering system, named TUClopedia, is based on The Understanding Computer (Amble 2003), which has previously been applied successfully to narrow domains. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Norges teknisk-naturvitenskapelige universitet [nb_NO]
dc.relation.ispartofseries: Doctoral Theses at NTNU, 1503-8181; 2010:11 [nb_NO]
dc.subject: ontology [en_GB]
dc.subject: natural language understanding [en_GB]
dc.subject: mapping [en_GB]
dc.subject: knowledge extraction [en_GB]
dc.subject: question answering [en_GB]
dc.subject: Norwegian [en_GB]
dc.subject: WordNet [en_GB]
dc.title: Open-Domain Word-Level Interpretation of Norwegian: Towards a General Encyclopedic Question-Answering System for Norwegian [nb_NO]
dc.type: Doctoral thesis [nb_NO]
dc.source.pagenumber: 233 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.description.degree: PhD i informasjons- og kommunikasjonsteknologi [nb_NO]
dc.description.degree: PhD in Information and Communications Technology [en_GB]

