dc.contributor.advisor: Ramampiaro, Herindrasana [nb_NO]
dc.contributor.author: Skuland, Magnus [nb_NO]
dc.date.accessioned: 2014-12-19T13:33:08Z
dc.date.available: 2014-12-19T13:33:08Z
dc.date.created: 2010-09-03 [nb_NO]
dc.date.issued: 2005 [nb_NO]
dc.identifier: 348109 [nb_NO]
dc.identifier: ntnudaim:1056 [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/250988
dc.description.abstract: The aim of this paper was to develop a system for identifying biomedical entities, such as protein and gene names, in a corpus of Medline abstracts. A second aim was to extract the most relevant terms from the set of identified biomedical terms and make them readily presentable to an end user. The developed prototype, named iMasterThesis, takes a dictionary-based approach to the problem. A dictionary consisting of 21K gene names and 425K protein names was constructed automatically. By realizing the protein name dictionary as a multi-level tree structure of hash tables, the approach tries to facilitate a more flexible and relaxed matching scheme than previous approaches. The system was evaluated against a gold standard consisting of 101 expert-annotated Medline abstracts. It identifies protein and gene names in these abstracts with 10% recall and 14% precision. It seems clear that, to improve these results further, the quality of the dictionary needs to be increased, possibly through manual inspection by domain experts. A graphical user interface that presents the end user with the most relevant identified terms has been developed as well. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.subject: ntnudaim [no_NO]
dc.subject: SIF2 datateknikk [no_NO]
dc.subject: Program- og informasjonssystemer [no_NO]
dc.title: Identification of biomedical entities from Medline abstracts using a dictionary-based approach [nb_NO]
dc.type: Master thesis [nb_NO]
dc.source.pagenumber: 154 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]

