Semantic Relations in Yahoo! News Search
In this thesis we propose a novel approach in which three days of raw Yahoo! News search query logs are analyzed to find semantic relations among queries. The analysis is based on two independent contributions. The first uses session data extracted from the query logs. By finding the term that best describes each session, we obtain a vocabulary of queries related to that term. Sessions with similar terms are merged to create larger groups of queries that share one common term or phrase. The second contribution is the use of temporal correlation as a measure of how similarly query frequencies vary over time. Queries that show a similar variation over time have a high chance of either being semantically related or appearing in the same situations. The two contributions are then merged into related term groups, based on the session group label and the most prominent term or phrase of the correlation query. Using non-strict parameter settings in the contribution calculations yields a large number of candidate queries; intersecting the two result sets leaves high-accuracy groups of related queries, each labeled with a term or phrase. A prototype search application was developed to use the generated term groups in a search environment. The groups of queries were converted into a tree structure with the group label as the main node. This navigation tree lets the user move up and down in the tree or click directly on a tree node to view its results. When a user's search matches one of the generated groups, he or she is presented with the first search results of the tree's main node together with its children.
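The intersection of the two contributions can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not name the correlation measure, so Pearson correlation is assumed here, and the function names, the 0.8 threshold, and the sample counts are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length frequency series (assumed measure)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no variation to compare
    return cov / sqrt(var_x * var_y)


def correlated_queries(counts, seed, threshold):
    """Queries whose frequency over time varies like the seed query's."""
    return {
        query
        for query, series in counts.items()
        if query != seed and pearson(counts[seed], series) >= threshold
    }


def related_group(session_group, counts, seed, threshold=0.8):
    """Keep only queries found by both the session grouping and the correlation."""
    return session_group & correlated_queries(counts, seed, threshold)


# Hypothetical per-interval query counts, invented for illustration.
counts = {
    "obama": [1, 2, 3, 4],
    "obama speech": [2, 4, 6, 8],
    "obama tax": [1, 1, 1, 1],
    "weather": [4, 3, 2, 1],
}
session_group = {"obama speech", "obama tax"}  # queries sharing the label "obama"
print(related_group(session_group, counts, "obama"))
```

Each contribution on its own is permissive, so the set intersection is what filters the result down to queries supported by both sources of evidence.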