Semantic User Behaviour Prediction in Online News - Applying Topic Modeling, Community Detection, and User Modeling for News Recommendation
Abstract
In recent years, predicting user behaviour has become increasingly important within the news recommendation area. Knowledge of behavioural patterns offers valuable insight for developing efficient and user-friendly services, ensuring good experiences for users and increased revenues for companies. Though very useful, these recommendations can be difficult to make, and the news domain poses its own unique challenges for the task. Prominent challenges in news recommendation include the short life span of item recommendations, sparse connections to large item vocabularies, and volatile networks and user behaviours. These create performance difficulties for traditional recommendation methods, especially because of the need to recommend unseen items.
By employing topic modeling on the rich text objects provided by the news domain, new articles introduced into the system can be compared to old articles in a meaningful way. Organizing these articles into clusters based on their topic proportions makes it possible to elicit useful topic combinations. When these clusters are used to create user models based on composite interests, newly introduced articles can be recommended to users.
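The pipeline described above can be illustrated with a minimal sketch. It assumes that each article already has a vector of topic proportions from some topic model, that clusters are represented by centroids, and that a user's interest in a cluster is the share of their read articles falling in it. The function names, the use of cosine similarity, the nearest-centroid assignment, and the toy data are all assumptions made for illustration, not the method evaluated in this work.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two topic-proportion vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_cluster(topic_vec, centroids):
    """Index of the centroid most similar to the article (assumed assignment rule)."""
    return max(range(len(centroids)), key=lambda i: cosine(topic_vec, centroids[i]))


def build_user_model(read_vectors, centroids):
    """User model as the fraction of read articles per cluster (assumed form)."""
    counts = Counter(nearest_cluster(v, centroids) for v in read_vectors)
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


def rank_new_articles(user_model, new_articles, centroids):
    """Rank unseen articles by the user's weight for each article's cluster."""
    scored = [
        (user_model.get(nearest_cluster(vec, centroids), 0.0), article_id)
        for article_id, vec in new_articles.items()
    ]
    # sorted() is stable, so ties keep their input order
    return [article_id for _, article_id in sorted(scored, key=lambda s: -s[0])]


# Toy data (invented): three topics, three cluster centroids
centroids = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
history = [[0.7, 0.2, 0.1], [0.9, 0.05, 0.05], [0.1, 0.2, 0.7]]
new_articles = {"a": [0.1, 0.7, 0.2], "b": [0.6, 0.3, 0.1], "c": [0.2, 0.1, 0.7]}

user_model = build_user_model(history, centroids)
ranking = rank_new_articles(user_model, new_articles, centroids)
```

Because the new articles are scored only through their topic proportions, none of them needs any interaction history, which is the property that makes this kind of approach attractive for unseen items.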
While a key finding is the conflict between optimizing goals for subtasks and optimal performance in the prediction task, implementing these technologies does lead to improvements in user prediction results. Different combinations of parameters provide prediction results with very different qualitative characteristics, but compared with predictions using original labels and with random recommendations, the models obtain better MAP and RA than both. For models using both topic modeling and clustering, RA improves on random recommendation by 34%-64% for the models that performed well. Other models gain improvements ranging from 18%-278% on the top 3, 5, 10 and 30 positions for MAP, but decline in performance on RA by 18%-28%. Considering the characteristics of each prediction evaluation metric, RA appears to better capture the quality of recommendations.
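As a reminder of what MAP at the top k positions measures, the following is a minimal sketch, taking MAP to be mean average precision. The normalization by min(k, number of relevant items) is one common convention and is an assumption here; the function names and the toy rankings are invented for illustration.

```python
def average_precision_at_k(ranked, relevant, k):
    """Average precision over the top k positions of one ranked list.

    Assumes `ranked` contains no duplicate items.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position  # precision at this hit
    # Assumed convention: normalize by the best achievable number of hits
    return precision_sum / min(k, len(relevant))


def map_at_k(rankings, relevant_sets, k):
    """Mean of the per-user average precision values."""
    scores = [
        average_precision_at_k(ranked, relevant, k)
        for ranked, relevant in zip(rankings, relevant_sets)
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example (invented): two users, evaluated at the top 3 positions
rankings = [["a", "b", "c", "d"], ["x", "y", "z"]]
relevant_sets = [{"a", "c"}, {"y"}]
ap_first = average_precision_at_k(rankings[0], relevant_sets[0], 3)
map_3 = map_at_k(rankings, relevant_sets, 3)
```

Since each hit is weighted by its position, the metric rewards placing a few relevant articles near the top of the list, which is one reason results at the top 3, 5, 10 and 30 positions can differ considerably.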