Sentiment Analysis in Norwegian Political News - Employing Replicated Methods and Experimental Features to Understand a Complex Domain
This thesis employs machine learning in an effort to develop a sentiment analysis engine for the Norwegian political news domain. In combination with computational linguistics and statistics, we set out to gain knowledge and understanding of a less researched area of sentiment analysis, one that is more complex than other well-known domains. As the mass media sets the agenda for what the general public focuses on, the news world has significant influence on what is subsequently expressed on social media. The motivation for choosing this domain is the lack of research and the fact that if Twitter and Facebook are deemed important platforms for sentiment analysis, the news should be as well.

Replicating proven methods from other well-structured and well-understood domains, we try to achieve similar precision results in spite of the lack of resources available in the Norwegian language. Evaluating the results from this work led to the discovery of essential characteristics of the political news domain. These characteristics portray the challenges to overcome in order to achieve state-of-the-art classification results. We uncovered that the language in the domain in question is unstructured, that sentiment is conveyed in a subtle manner without the use of explicit sentiment-bearing words, and that interpreting it requires contextual knowledge.

Further, we experimented with a two-step binary classification method to pinpoint the areas of effect for each feature included in the sentiment engine. Observing the results of each classification step, we note that negation count does not in fact improve performance. However, the exclusion of neutral co-occurring terms in the polarity classification step achieved close to state-of-the-art precision scores. In addition, we find that the most important area of focus should be the subjectivity classification step, as improvements here will eventually yield a substantial increase in the overall precision of the sentiment engine.
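
The two-step binary classification described above can be illustrated with a minimal sketch: a subjectivity step first separates neutral text from subjective text, and a polarity step then labels only the subjective text as positive or negative. This is not the thesis's implementation. The function names, the tiny English word lists, and the lexicon-based stand-in classifiers are all hypothetical and chosen only to keep the example self-contained; in practice each step would be a trained binary classifier.

```python
from typing import Callable

# A binary classifier is modelled as any callable from text to bool.
Classifier = Callable[[str], bool]

# Hypothetical toy lexicons, used only to make the sketch runnable.
POSITIVE_WORDS = {"good", "success", "win"}
NEGATIVE_WORDS = {"bad", "failure", "scandal"}


def two_step_sentiment(text: str,
                       is_subjective: Classifier,
                       is_positive: Classifier) -> str:
    """Step 1: subjectivity (neutral vs. subjective).
    Step 2: polarity (positive vs. negative), applied only to subjective text."""
    if not is_subjective(text):
        return "neutral"
    return "positive" if is_positive(text) else "negative"


def toy_is_subjective(text: str) -> bool:
    # Stand-in for a trained subjectivity classifier (assumption).
    tokens = set(text.lower().split())
    return bool(tokens & (POSITIVE_WORDS | NEGATIVE_WORDS))


def toy_is_positive(text: str) -> bool:
    # Stand-in for a trained polarity classifier (assumption).
    tokens = text.lower().split()
    score = (sum(t in POSITIVE_WORDS for t in tokens)
             - sum(t in NEGATIVE_WORDS for t in tokens))
    return score >= 0


def classify(text: str) -> str:
    return two_step_sentiment(text, toy_is_subjective, toy_is_positive)
```

Because every document passes through the subjectivity step before it can reach the polarity step, an error in the first step cannot be corrected by the second. This structure is one way to see why improvements to subjectivity classification carry through to the overall result.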