Automatic Sarcasm Detection in Twitter Messages

2016

Over the past decade, social media such as Twitter have become popular and a part of everyday life for many people. Opinion mining on the content they share can be of interest to, for example, companies and organizations. However, the sentiment of a text can be drastically altered when figurative language such as sarcasm is used. This thesis presents a system for automatic sarcasm detection in Twitter messages.

To get a better understanding of the field, state-of-the-art systems for detecting sarcasm in Twitter messages are explored. Many such systems already exist, and a common theme among them is the use of automatically annotated data for both training and testing. In addition to presenting a system for detecting sarcasm, this thesis looks into the use of manually annotated data for testing. To this end, a dataset of tweets manually annotated for the presence of sarcasm was built. The resulting annotations were very similar to those of a previously built manually annotated set, and both sets deviated considerably from automatic annotation. This implies that automatically annotated data is a mediocre approximation of manual annotation for the task of sarcasm detection in tweets.

Experiments on both manually annotated datasets also gave very similar results, showing that they are well annotated and reasonably representative for sarcasm detection in tweets.

NTNU