Gender prediction on Norwegian Twitter accounts

dc.contributor.advisor	Rue, Håvard
dc.contributor.advisor	Hesse, Dirk
dc.contributor.author	Kvamme, Håvard
dc.date.accessioned	2016-03-30T14:01:01Z
dc.date.available	2016-03-30T14:01:01Z
dc.date.created	2015-12-21
dc.date.issued	2015
dc.identifier	ntnudaim:14409
dc.identifier.uri	http://hdl.handle.net/11250/2383180
dc.description.abstract	In this thesis, methods for predicting the gender of Norwegian Twitter accounts were investigated. Through Twitter's public APIs, various account information is available. Tweets (text), personal descriptions, friends networks, and profile images were the main fields investigated. First, separate classifiers were fitted to features from the different fields, and later the individual classifiers' posterior probability estimates were combined to achieve increased accuracy. The datasets were labeled through comparison of the accounts' names and names in the Norwegian population. Subsets of accounts with very gender-specific names were used for training and testing. The highest balanced accuracy obtained was around 0.89. This, however, required access to the accounts' profile images (85% of the data). Without images, the accuracy dropped to around 0.85.
dc.language	eng
dc.publisher	NTNU
dc.subject	Physics and mathematics, Industrial mathematics
dc.title	Gender prediction on Norwegian Twitter accounts
dc.type	Master thesis
dc.source.pagenumber	119