dc.contributor.advisor | Ramampiaro, Herindrasana | nb_NO |
dc.contributor.author | Skuland, Magnus | nb_NO |
dc.date.accessioned | 2014-12-19T13:33:08Z | |
dc.date.available | 2014-12-19T13:33:08Z | |
dc.date.created | 2010-09-03 | nb_NO |
dc.date.issued | 2005 | nb_NO |
dc.identifier | 348109 | nb_NO |
dc.identifier | ntnudaim:1056 | nb_NO |
dc.identifier.uri | http://hdl.handle.net/11250/250988 | |
dc.description.abstract | The aim of this paper was to develop a system for identification of biomedical entities, such as protein and gene names, from a corpus of Medline abstracts. Another aim was to extract the most relevant terms from the set of identified biomedical terms and present them readily to an end user. The developed prototype, named iMasterThesis, uses a dictionary-based approach to the problem. A dictionary, consisting of 21K gene names and 425K protein names, was constructed in an automatic fashion. By realizing the protein name dictionary as a multi-level tree structure of hash tables, the approach tries to facilitate a more flexible and relaxed matching scheme than previous approaches. The system was evaluated against a gold standard consisting of 101 expert-annotated Medline abstracts. It is capable of identifying protein and gene names from these abstracts with 10% recall and 14% precision. It seems clear that improving these results further requires a higher-quality dictionary, possibly obtained through manual inspection by domain experts. A graphical user interface that presents the most relevant identified terms to an end user has been developed as well. | nb_NO |
dc.language | eng | nb_NO |
dc.publisher | Institutt for datateknikk og informasjonsvitenskap | nb_NO |
dc.subject | ntnudaim | no_NO |
dc.subject | SIF2 datateknikk | no_NO |
dc.subject | Program- og informasjonssystemer | no_NO |
dc.title | Identification of biomedical entities from Medline abstracts using a dictionary-based approach | nb_NO |
dc.type | Master thesis | nb_NO |
dc.source.pagenumber | 154 | nb_NO |
dc.contributor.department | Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap | nb_NO |
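
The abstract describes the protein name dictionary as a multi-level tree structure of hash tables. The following is a minimal sketch of that general idea, not the implementation used in the thesis: each level is a Python dict keyed by one token, and a sentinel key marks where a complete name ends. The names `tokenize`, `build_tree`, `find_names`, and `END`, the tokenization rule, and the longest-match policy are all assumptions made for illustration.

```python
import re

END = "$end"  # hypothetical sentinel key marking a complete dictionary name


def tokenize(text):
    # Assumed normalization: lowercase and split on non-alphanumerics,
    # so "Interleukin-2" and "interleukin 2" yield the same tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_tree(names):
    # Each tree level is a hash table (dict) mapping a token to the next level.
    root = {}
    for name in names:
        node = root
        for token in tokenize(name):
            node = node.setdefault(token, {})
        node[END] = name
    return root


def find_names(text, root):
    # Scan left to right, keeping the longest dictionary name at each position.
    tokens = tokenize(text)
    found, i = [], 0
    while i < len(tokens):
        node, j, best = root, i, None
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if END in node:
                best = (node[END], j)
        if best:
            found.append(best[0])
            i = best[1]
        else:
            i += 1
    return found
```

Under these assumptions, matching is relaxed only to the extent that case and punctuation differences are ignored; any further relaxation (for example, tolerating token reordering) would need additional logic.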