dc.contributor.advisor: Li, Jingyue
dc.contributor.advisor: Jiang, Shanshan
dc.contributor.author: Hagelien, Thomas Fjæstad
dc.date.accessioned: 2018-10-09T14:00:25Z
dc.date.available: 2018-10-09T14:00:25Z
dc.date.created: 2018-07-09
dc.date.issued: 2018
dc.identifier: ntnudaim:18057
dc.identifier.uri: http://hdl.handle.net/11250/2567220
dc.description.abstract: Publicly accessible open transport data is provided by the public sector in an effort to create new opportunities, stimulate innovation and enable new solutions that benefit society. The number of available datasets is, however, limited. This is partially due to the necessary, but labor-intensive, preparation process of each dataset. The datasets need to be annotated with descriptions that explain their purpose and content. The search and retrieval functionality of current publishing platforms is limited to classical keyword-based search, which is much more restricted than the search technology used for finding information on the World Wide Web. This is because information in most cases cannot be retrieved directly from the data itself, but depends on the dataset descriptions. Open datasets are encoded in a rich variety of formats, which makes it difficult to reuse them directly in software applications. This study investigates how a transport domain knowledge model, namely an ontology of the transport domain, can enable data to be identified in terms of its meaning in a given context, i.e. its semantics, and not by keywords and tags alone. The study further investigates how semantic technology can be applied to improve discoverability and reuse of datasets. This was done by initially developing a prototype framework for ontology-based semantic classification. The framework works as a test bed that allows different algorithms to be tested and compared against different ontologies. The framework also includes the development of an online search engine that is used to measure the efficiency of the data discovery method. This study further includes a conceptual design for a software system that allows transport-related software applications to utilize datasets from heterogeneous sources.
The study finds that automated classification based on natural language processing of dataset descriptions is possible and shows promising results. This approach appears to improve the search and retrieval functionality of limited datasets; however, it is currently sensitive to the quality of the description text and needs to be developed further.
dc.language: eng
dc.publisher: NTNU
dc.subject: Informatics, Software Systems
dc.title: A Framework for Ontology Based Semantic Search
dc.type: Master thesis