Automatic Detection of Fake News in Social Media using Contextual Information

Granskogen, Torstein

Master thesis

URI

http://hdl.handle.net/11250/2559124

Date

2018

Collections

Institutt for datateknologi og informatikk (Department of Computer Science)

Abstract

Misinformation has become a prominent issue in society, especially with the increase in fake news. This thesis investigates how contextual and network data can be used to detect fake news articles or other pieces of information, either as a standalone system or as part of a larger hybrid solution.

A series of experiments was conducted to explore the validity of contextual information in structured data from Facebook. Two algorithms were used, Logistic Regression and Harmonic Boolean Label Crowdsourcing, producing a diverse set of results that sheds light on strengths and weaknesses. On two datasets of scientific and fake news sources, ranging from 4,200 to 15,500 posts in size and covering up to 9.5 million users, the experiments achieve over 90% classification accuracy in supervised training scenarios, consolidating previous results on both old and new datasets.

This thesis therefore concludes that contextual data alone gives very promising results. The approach is still novel and needs more rigorous testing, but combining it with existing Natural Language Processing systems might yield better results than current state-of-the-art systems. Considerable work is still needed before the methods can be applied to less structured data.

Publisher

NTNU