• norsk
    • English
  • English 
    • norsk
    • English
  • Login
View Item 
  •   Home
  • Fakultet for informasjonsteknologi og elektroteknikk (IE)
  • Institutt for datateknologi og informatikk
  • View Item
  •   Home
  • Fakultet for informasjonsteknologi og elektroteknikk (IE)
  • Institutt for datateknologi og informatikk
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

GeneTUC - Event extraction from TQL logic

Søvik, Harald
Master thesis
Thumbnail
View/Open
1337_FULLTEXT.pdf (1.198Mb)
1337_ATTACHMENT.zip (1.956Mb)
1337_COVER.pdf (390.3Kb)
URI
http://hdl.handle.net/11250/2571047
Date
2006
Metadata
Show full item record
Collections
  • Institutt for datateknologi og informatikk [3771]
Abstract
As Natural Language Processing systems converge on a high percentage of successful deeply parsed text, parse success alone is an incomplete measure of the ``intelligence'' exhibited by the system. Because systems apply different grammars, dictionaries and programming languages, the internal representation of parsed text is often different from system to system, making it difficult to compare performance and exchange useful data such as tagged corpora or semantic interpretations.

This report describes how semantically annotated corpora can be used to measure quality of Natural Language Processing systems. A selected corpus produced by the GENIA project were used as ``golden standard'' (event-annotated abstracts from MEDLINE). This corpus were sparse (19 abstracts), thus manual methods were employed to produce a mapping from the native GeneTUC knowledge format (TQL). This mapping were used to produce an evaluation of events in GeneTUC. We were able to attain a recall of 67% and average precision of 33% on the training data. These results suggest that the mapping is inadequate. On test data, the recall were 28% and average precision 21%.

Because events is a new ``feature'' in NLP-applications, there are no large corpora that can be used for automated rule learning. The conclusion is that at least there exists a partial mapping from TQL to GENIA events, and that larger corpora and AI-methods should be applied to refine the mapping rules. In addition, we discovered that this mapping can be of use for extraction of protein-protein interactions.
Publisher
NTNU

Contact Us | Send Feedback

Privacy policy
DSpace software copyright © 2002-2019  DuraSpace

Service from  Unit
 

 

Browse

ArchiveCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsDocument TypesJournalsThis CollectionBy Issue DateAuthorsTitlesSubjectsDocument TypesJournals

My Account

Login

Statistics

View Usage Statistics

Contact Us | Send Feedback

Privacy policy
DSpace software copyright © 2002-2019  DuraSpace

Service from  Unit