Abstractive microblogs summarization

2015

Microblogging is a new electronic communication medium based on short status updates
containing personal and instant information. Because microblogs are so popular, the
volume of information is enormous, and a large portion of it is duplicative or irrelevant.
An effective way to summarize this information can help scientists, journalists and
marketing analysts gain better insight into people’s reactions and opinions on different
topics: political debates, sports events or product presentations.

Existing summarization algorithms can be enhanced in several ways. The first is to add
sentiment analysis. Because information in microblogs is highly opinionated, analyzing
tweet polarity can improve machine summaries by selecting more sentiment-bearing tweets
than purely topical ones. Another enhancement is to use different summary lengths for
different topics. Previous studies often limit summaries to a particular length. Relaxing
this restriction can yield summaries that are better suited to a particular topic.

The goal of this research is to perform a qualitative study of these enhancements and
to provide insights and suggestions for conducting a larger qualitative study. In total,
ten topics are selected, for which human summaries are compared with those of
state-of-the-art non-sentiment and sentiment summarizers.

The resulting observations are the following: there is more topical than sentiment
content in summaries generated by humans, although individual biases can run against this
trend; the length of the summary is an important feature that influences both the
generation of human summaries and the interpretation of evaluation results, and different
topics require summaries of different lengths; sentiment summarization does not produce
better results for any evaluation metric used, but it may still be applicable in
appropriate settings with specific topics.