Interactive Exploration of Consensus in Climate Science

Bøhler, Henrik; Asla, Petter Fagerlund

Master thesis

Permanent link

http://hdl.handle.net/11250/2406921

Publication date

2016

Collections

Institutt for datateknologi og informatikk [6779]

Abstract

Global warming has been a controversial topic of discussion in environmental studies in recent years. This thesis describes an approach to automatically classifying stance on anthropogenic (human-induced) global warming in climate science, with the aim of investigating the development of the consensus after 2011. In addition, we explore how stance is related to external factors.

The thesis is based on the work of Cook et al. (2013). They manually examined 11,944 abstracts related to climate change and labeled each by its stance on anthropogenic global warming (favor, against, none). Of those taking a position, Cook et al. reported an observed consensus of 97 % endorsing anthropogenic global warming. We have explored approaches to automating Cook et al.'s work by classifying stance using machine learning.
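The phrase "of those taking a position" means that abstracts labeled none are left out of the denominator. A minimal sketch of that arithmetic follows; the function name `consensus_share` and the toy label counts are illustrative assumptions, not Cook et al.'s data or code.

```python
from collections import Counter


def consensus_share(labels):
    """Share of 'favor' among abstracts that take a position.

    Abstracts labeled 'none' are excluded from the denominator.
    (Hypothetical helper, written only to illustrate the arithmetic.)
    """
    counts = Counter(labels)
    positioned = counts["favor"] + counts["against"]
    if positioned == 0:
        raise ValueError("no abstract takes a position")
    return counts["favor"] / positioned


# Toy labels chosen for illustration only, not taken from the study.
labels = ["favor"] * 97 + ["against"] * 3 + ["none"] * 200
print(f"{consensus_share(labels):.2%}")
```

With these toy counts, the 200 abstracts labeled none do not affect the result: the share is computed over the 100 abstracts that take a position.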

Our system is divided into three major components: Search & Information Retrieval, Stance Detection and Visualization. The Search & Information Retrieval component makes up the foundation of the system. It provides the classification component with raw data obtained from literature databases such as Web of Science. It shows solid results for the collection of meta-data, successfully retrieving data for 95.70 % of the publications. Evaluation of the strategy for obtaining recent climate literature indicates that approximately 18 % of records can be proved relevant. However, analysis of the data suggests that most of the collected literature is relevant.

Using the data collected by Cook et al. and The Consensus Project, we conducted a large number of experiments to determine the best stance detection model. The final system combines a Logistic Regression classifier trained on GloVe word embeddings and an optimized SVM in a voting scheme, representing the best of both classifiers. The approach achieves a substantial improvement over the baseline, with a macro F-score of 60.67 % on the test set. The predictions on new data, along with the collected meta-data, serve as input to the visualization component. The resulting visual representation suggests no major change in the consensus. The number of publications per country, inferred from meta-data on author affiliations and plotted on a geographical map, suggests a distinction between developed and developing countries.
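For readers unfamiliar with the metric, a macro F-score averages the per-class F1 scores with equal weight, so a rare class such as against counts as much as a frequent one. The sketch below is a plain-Python illustration under the assumption that all three stance classes are averaged; the abstract does not state which classes the thesis averages over, and the function name `macro_f1` and the toy labels are hypothetical.

```python
def macro_f1(y_true, y_pred, classes=("favor", "against", "none")):
    """Unweighted mean of per-class F1 scores.

    Assumption: every class in `classes` contributes equally. Pass a
    different tuple to average over a subset of the stance labels.
    """
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Toy gold labels and predictions, for illustration only.
y_true = ["favor", "favor", "against", "none", "none", "none"]
y_pred = ["favor", "none", "against", "none", "none", "favor"]
print(round(macro_f1(y_true, y_pred), 4))
```

In this toy case the per-class F1 scores are 0.5 (favor), 1.0 (against) and 2/3 (none), so the macro average is 13/18, or about 0.7222.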

Publisher

NTNU