Ontology-based Data Extraction in the Scholarship-Related Content
Master thesis
Permanent link
http://hdl.handle.net/11250/144038
Publication date
2013
Abstract
The master thesis "Ontology-based Data Extraction in the Scholarship-Related Content"
focuses on ontologies and on methods by which ontological concepts can be recognized
in text, enhancing its semantic meaning.
The use of ontologies can significantly improve the semantic richness of texts presented on
the Web, but to exploit their full capabilities, specific XML-based notations must be
written to describe each and every resource. This usually requires a large amount of human work,
and the thesis seeks ways to reduce that effort by suggesting automatic or semi-automatic
approaches to ontology-based information retrieval.
In the experiments conducted in the domain of scholarships, an ontology for scholarships was
thoroughly evaluated, and the names of the disciplines were chosen as the target area for
further information retrieval research.
Ontological concepts were discovered in the text by first scraping the webpage for the
target section and then applying a Boolean search method, with and without prior
preprocessing. This approach demonstrated very good results: with preprocessing, roughly
70% of the disciplines were retrieved. Furthermore, an extension of the ontology has been
proposed as a way to increase the extraction rate by a further 10 percentage points. Overall,
80% of the disciplines can be retrieved by our method.
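
The Boolean search step described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the sample discipline list, and the sample section text are all hypothetical, and the preprocessing shown here (lowercasing and stripping punctuation) is only an assumption about what such a step might include. The sketch also assumes the target section has already been scraped into a plain string.

```python
import re


def preprocess(text):
    """Hypothetical preprocessing: lowercase and replace punctuation with spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()


def boolean_search(text, disciplines, use_preprocessing=True):
    """Return the disciplines whose every word occurs in the text (Boolean AND)."""
    if use_preprocessing:
        tokens = set(preprocess(text).split())
    else:
        tokens = set(text.split())

    found = []
    for name in disciplines:
        # Apply the same treatment to the ontology label as to the text.
        terms = preprocess(name).split() if use_preprocessing else name.split()
        if terms and all(term in tokens for term in terms):
            found.append(name)
    return found


# Illustrative data only; not taken from the scholarship ontology in the thesis.
DISCIPLINES = ["Computer Science", "Economics", "Civil Engineering"]
SECTION = "Eligible fields: computer science, ECONOMICS; and civil-engineering."

with_prep = boolean_search(SECTION, DISCIPLINES, use_preprocessing=True)
without_prep = boolean_search(SECTION, DISCIPLINES, use_preprocessing=False)
print(with_prep)
print(without_prep)
```

On this toy input, the search without preprocessing misses every discipline because of case and punctuation differences, while the preprocessed search matches all three, which illustrates why preprocessing could raise the retrieval rate.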