dc.contributor.authorKanhabua, Nattiyanb_NO
dc.date.accessioned2014-12-19T13:38:20Z
dc.date.available2014-12-19T13:38:20Z
dc.date.created2012-05-07nb_NO
dc.date.issued2012nb_NO
dc.identifier528125nb_NO
dc.identifier.isbn978-82-471-3264-7 (printed ver.)nb_NO
dc.identifier.isbn978-82-471-3265-4 (electronic ver.)nb_NO
dc.identifier.urihttp://hdl.handle.net/11250/252812
dc.description.abstractIn this thesis, we address major challenges in searching temporal document collections. In such collections, documents are created and/or edited over time. Examples of temporal document collections are web archives, news archives, blogs, personal emails and enterprise documents. Unfortunately, traditional IR approaches based only on term matching can give unsatisfactory results when searching temporal document collections. The reason for this is twofold: the contents of documents are strongly time-dependent, i.e., documents are about events that happened in particular time periods, and a query representing an information need can be time-dependent as well, i.e., a temporal query. Our contributions in this thesis are different time-aware approaches within three topics in IR: content analysis, query analysis, and retrieval and ranking models. In particular, we aim at improving retrieval effectiveness by 1) analyzing the contents of temporal document collections, 2) performing an analysis of temporal queries, and 3) explicitly modeling the time dimension in retrieval and ranking. Leveraging the time dimension in ranking can improve retrieval effectiveness if information about the creation or publication time of documents is available. In this thesis, we analyze the contents of documents in order to determine the time of non-timestamped documents using temporal language models. We subsequently employ the temporal language models to determine the time of implicit temporal queries, and the determined time is used for re-ranking search results in order to improve retrieval effectiveness. We study the effect of terminology changes over time and propose an approach to handling terminology changes using time-based synonyms. In addition, we propose different methods for predicting the effectiveness of temporal queries, so that a particular query enhancement technique can be applied to improve the overall performance.
When the time dimension is incorporated into ranking, documents are ranked according to both textual and temporal similarity. In this case, time uncertainty should also be taken into account. Thus, we propose a ranking model that accounts for time uncertainty, and we improve ranking by combining multiple features using learning-to-rank techniques. Through extensive evaluation, we show that our proposed time-aware approaches outperform traditional retrieval methods and improve retrieval effectiveness in searching temporal document collections.nb_NO
dc.languageengnb_NO
dc.publisherNorges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskapnb_NO
dc.relation.ispartofseriesDoktoravhandlinger ved NTNU, 1503-8181; 2012:5nb_NO
dc.relation.haspartKanhabua, Nattiya; Nørvåg, Kjetil. Improving Temporal Language Models for Determining Time of Non-timestamped Documents. Research and Advanced Technology for Digital Libraries: 358-370, 2008. http://dx.doi.org/10.1007/978-3-540-87599-4_37.nb_NO
dc.relation.haspartKanhabua, Nattiya; Nørvåg, Kjetil. Using Temporal Language Models for Document Dating. Machine Learning and Knowledge Discovery in Databases, Pt II: 738-741, 2009. http://dx.doi.org/10.1007/978-3-642-04174-7_53.nb_NO
dc.relation.haspartKanhabua, Nattiya; Nørvåg, Kjetil. Exploiting time-based synonyms in searching document archives. Proceedings of the 2010 Joint International Conference on Digital Libraries: 79-88, 2010. http://dx.doi.org/10.1145/1816123.1816135.nb_NO
dc.relation.haspartKanhabua, Nattiya; Nørvåg, Kjetil. Determining Time of Queries for Re-ranking Search Results. Research and Advanced Technology for Digital Libraries: 261-272, 2010. http://dx.doi.org/10.1007/978-3-642-15464-5_27.nb_NO
dc.relation.haspartKanhabua, Nattiya; Nørvåg, Kjetil. QUEST: Query Expansion using Synonyms over Time. Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010. http://dx.doi.org/10.1007/978-3-642-15939-8_41.nb_NO
dc.relation.haspartKanhabua, Nattiya; Nørvåg, Kjetil. A comparison of time-aware ranking methods. Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval: 1257-1258, 2011. http://dx.doi.org/10.1145/2009916.2010147.nb_NO
dc.relation.haspartKanhabua, Nattiya; Nørvåg, Kjetil. Time-based query performance predictors. Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval: 1181-1182, 2011. http://dx.doi.org/10.1145/2009916.2010109.nb_NO
dc.relation.haspartKanhabua, Nattiya; Blanco, Roi; Matthews, Michael. Ranking related news predictions. Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval: 755-764, 2011. http://dx.doi.org/10.1145/2009916.2010018.nb_NO
dc.titleTime-aware Approaches to Information Retrievalnb_NO
dc.typeDoctoral thesisnb_NO
dc.contributor.departmentNorges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskapnb_NO
dc.description.degreePhD i informasjonsteknologinb_NO
dc.description.degreePhD in Information Technologyen_GB

