Entity Linking

Møkkelgjerd, Kjetil

Master thesis

Open

16655_FULLTEXT.pdf (1.588Mb)

16655_COVER.pdf (1.556Mb)

Permanent link

http://hdl.handle.net/11250/2457130

Publication date

2017

Collections

Institutt for datateknologi og informatikk (Department of Computer Science) [6772]

Abstract

Entity linking can quickly give a reader further information about the entities in a text by linking each one to a knowledge base entry that describes it, and may thus enhance the reading experience. In this thesis, we review existing solutions and implement our own deterministic entity linking system, based on our research and using our own approach to the problem.

Our system extracts all entities within the input text and then disambiguates each entity by considering its context. The extraction step is handled by an external framework, while the disambiguation step focuses on entity similarity, defined by how similar the entities' categories are, which we measure using data from DBpedia, a structured knowledge base. We base this approach on the assumption that similar entities usually occur close to each other in written text; thus, we select entities that appear to be similar to other nearby entities.
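
The disambiguation idea described above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not say how category similarity is computed, so Jaccard overlap is assumed here, and the function names (`jaccard`, `disambiguate`), the candidate lists, and the category labels are all hypothetical toy data rather than real DBpedia output.

```python
from typing import Dict, List, Set

# Hypothetical toy data standing in for categories fetched from a knowledge base.
CATEGORIES: Dict[str, Set[str]] = {
    "Paris": {"Capitals_in_Europe", "Cities_in_France"},
    "Paris_Hilton": {"American_socialites", "Hilton_family"},
    "Berlin": {"Capitals_in_Europe", "Cities_in_Germany"},
    "Irving_Berlin": {"American_songwriters", "Broadway_composers"},
}

# Hypothetical output of the extraction step: mention -> candidate entities.
CANDIDATES: Dict[str, List[str]] = {
    "Paris": ["Paris", "Paris_Hilton"],
    "Berlin": ["Berlin", "Irving_Berlin"],
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two category sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disambiguate(
    candidates: Dict[str, List[str]],
    categories: Dict[str, Set[str]],
) -> Dict[str, str]:
    """For each mention, pick the candidate whose categories are most
    similar to the candidates of the other mentions in the same context."""
    result: Dict[str, str] = {}
    for mention, options in candidates.items():
        if not options:
            continue  # nothing to choose from for this mention

        def score(entity: str) -> float:
            # Sum, over every other mention, the best similarity to any
            # of that mention's candidates.
            total = 0.0
            for other, other_options in candidates.items():
                if other == mention:
                    continue
                total += max(
                    (
                        jaccard(
                            categories.get(entity, set()),
                            categories.get(o, set()),
                        )
                        for o in other_options
                    ),
                    default=0.0,
                )
            return total

        # max() returns the first option on ties, so the result is deterministic.
        result[mention] = max(options, key=score)
    return result


print(disambiguate(CANDIDATES, CATEGORIES))
# {'Paris': 'Paris', 'Berlin': 'Berlin'}
```

In this toy context the two city readings share a category, so each reinforces the other, while the person readings share nothing with any neighbour and score zero. A real system would also need to limit "nearby" to a window of text, which this sketch leaves out.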

Experiments show that our implementation is not as effective as some of the existing systems we use for reference, and that our approach has some weaknesses that should be addressed. For example, DBpedia is not as consistent a knowledge base as we would like, and the entity extraction framework often fails to reproduce the same entity set as the dataset we use for evaluation. However, our solution shows promising results in many cases.

Publisher

NTNU