
dc.contributor.advisor: Amble, Tore [nb_NO]
dc.contributor.advisor: Sætre, Rune [nb_NO]
dc.contributor.author: Søvik, Harald [nb_NO]
dc.date.accessioned: 2014-12-19T13:34:09Z
dc.date.available: 2014-12-19T13:34:09Z
dc.date.created: 2010-09-05 [nb_NO]
dc.date.issued: 2006 [nb_NO]
dc.identifier: 348994 [nb_NO]
dc.identifier: ntnudaim:1337 [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/251430
dc.description.abstract: As Natural Language Processing systems converge on a high percentage of successfully deep-parsed text, parse success alone is an incomplete measure of the "intelligence" exhibited by a system. Because systems apply different grammars, dictionaries and programming languages, the internal representation of parsed text often differs from system to system, making it difficult to compare performance and to exchange useful data such as tagged corpora or semantic interpretations. This report describes how semantically annotated corpora can be used to measure the quality of Natural Language Processing systems. A selected corpus produced by the GENIA project (event-annotated abstracts from MEDLINE) was used as the "gold standard". This corpus was small (19 abstracts), so manual methods were employed to produce a mapping from the native GeneTUC knowledge format (TQL) to GENIA events. This mapping was used to evaluate events in GeneTUC. We attained a recall of 67% and an average precision of 33% on the training data. These results suggest that the mapping is inadequate. On test data, the recall was 28% and the average precision 21%. Because events are a new "feature" in NLP applications, there are no large corpora that can be used for automated rule learning. The conclusion is that at least a partial mapping from TQL to GENIA events exists, and that larger corpora and AI methods should be applied to refine the mapping rules. In addition, we discovered that this mapping can be of use for the extraction of protein-protein interactions. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.subject: ntnudaim [no_NO]
dc.subject: SIF2 datateknikk [no_NO]
dc.subject: Intelligente systemer [no_NO]
dc.title: GeneTUC: Event extraction from TQL logic [nb_NO]
dc.type: Master thesis [nb_NO]
dc.source.pagenumber: 144 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]

