dc.description.abstract | In the past decade, social media platforms such as Twitter have become popular and
a part of everyday life for many people. Mining the opinions
they share can be of interest to, e.g., companies and organizations.
The sentiment of a text can be drastically altered when figurative
language such as sarcasm is used. This thesis presents a system for automatic
sarcasm detection in Twitter messages.
To get a better understanding of the field, state-of-the-art systems for
detecting sarcasm in Twitter messages are explored. Many such systems
already exist, and a common theme among them is the use of automatically
annotated data for both training and testing. In addition to presenting a
system for detecting sarcasm, this thesis also looks into the use of manually
annotated data for testing. To this end, a dataset of tweets manually
annotated with respect to the presence of sarcasm was built. The resulting
annotations were very similar to those of a previously built dataset, and both
deviated considerably from automatic annotation. This implies that using
automatically annotated data for the task of sarcasm detection in tweets
is only a mediocre approximation of manual annotation.
Experiments on the two manually annotated datasets also gave
very similar results, indicating that both are well annotated and reasonably
representative for the task of sarcasm detection in tweets. | |