Deep Detection of Hate Speech in Text Through a Two-Pronged Approach

Meyer, Johannes Skjeggestad

dc.contributor.advisor	Gambäck, Björn
dc.contributor.author	Meyer, Johannes Skjeggestad
dc.date.accessioned	2018-10-15T14:00:28Z
dc.date.available	2018-10-15T14:00:28Z
dc.date.created	2018-06-04
dc.date.issued	2018
dc.identifier	ntnudaim:18862
dc.identifier.uri	http://hdl.handle.net/11250/2568080
dc.description.abstract	With the widespread use of online services like Facebook and Twitter, disseminating hateful messages has become a simple matter. These messages not only spoil the experience for other users of a service; there is also increasing legal pressure on the services to prevent and remove such content. For this to be practically feasible, systems are needed that can automatically detect hate speech in text. Research on automatic detection of hateful and abusive language has been ongoing for the last 20 years. However, the state of the art is still not good enough to identify hate speech fully automatically in practice, and this thesis continues the efforts to reach that goal. Because the legal pressure applies to a multitude of services and platforms, detection approaches are needed that do not depend on platform-specific information, so that the same approach can be used unchanged across platforms. For instance, the information stored about a text's author may differ between services, so using such data would reduce the general applicability of the system. The research in this thesis therefore avoids any such information and uses exclusively text-based input for detection. This thesis proposes a novel deep learning-based approach to hate speech detection, using a two-pronged architecture that combines Convolutional Neural Networks and Long Short-Term Memory networks. The proposed architecture uses character n-grams and word embeddings as inputs to its two prongs, which then merge to produce a final classification. The experiments show that this architecture, in its optimal configurations, performs better than most state-of-the-art systems.
dc.language	eng
dc.publisher	NTNU
dc.subject	Computer Technology, Artificial Intelligence
dc.title	Deep Detection of Hate Speech in Text Through a Two-Pronged Approach
dc.type	Master thesis
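
The abstract names character n-grams as the input to one of the two prongs. As a rough illustration of that input representation only, the sketch below extracts overlapping character n-grams from a text. The function names, the default n = 3, and the lowercasing step are assumptions made for this example; the record does not state the configuration used in the thesis.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> list[str]:
    """Return all overlapping character n-grams of `text`, in order.

    Hypothetical helper for illustration; n=3 is an assumed default.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # A text shorter than n yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_counts(text: str, n: int = 3) -> Counter:
    """Count character n-grams after lowercasing (an assumed preprocessing step)."""
    return Counter(char_ngrams(text.lower(), n))


if __name__ == "__main__":
    print(char_ngrams("hate", 3))
    print(ngram_counts("Hate hate", 4).most_common(1))
```

Counts like these could be turned into a fixed-length feature vector for one prong, while word embeddings feed the other; how the thesis actually encodes and merges the two inputs is described in the full text, not in this record.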

Associated file(s)

Filename: 18862_FULLTEXT.pdf
Size: 1.105Mb
Format: PDF


Filename: 18862_ATTACHMENT.zip
Size: 4.110Mb
Format: application/zip


Filename: 18862_COVER.pdf
Size: 1.556Mb
Format: PDF


This item appears in the following collection(s)

Institutt for datateknologi og informatikk [6819]
