Document Dating Using Temporal Information Extracted From Wikipedia
Abstract
Time is an important dimension in many information retrieval systems, and one aspect of it is the creation time of the documents in the system. Many applications rely on a document's creation time, but this information is often missing, ambiguous, or simply wrong. To address this, several approaches have been proposed for estimating the creation time of documents whose creation date is unknown, a task known as document dating. In this thesis, we seek to improve the results of document dating by using temporal information from knowledge bases such as Wikipedia. To do this, we have implemented the state-of-the-art document dating algorithm and developed our own system that extracts temporal information from Wikipedia articles selected based on the content of the document we want to date.