Spatial Statistics of Term Co-occurrences for Location Prediction of Tweets
Journal article, Peer reviewed
Accepted version

Date
2018
Original version
Lecture Notes in Computer Science. 2018, 10772 494-506. 10.1007/978-3-319-76941-7_37
Abstract
Predicting the locations of non-geotagged tweets is an active research area in geographical information retrieval. In this work, we propose a method to detect term co-occurrences in tweets that exhibit a tendency toward spatial clustering or dispersion, deviating significantly from the underlying single-term patterns, and we use these co-occurrences to extend the feature space of probabilistic language models. We observe that using term pairs that spatially attract or repel each other yields a significant increase in the accuracy of predicted locations. The proposed method relies purely on statistical approaches and spatial point patterns, without using external data sources or gazetteers. Evaluations conducted on a large set of multilingual tweets indicate higher accuracy than existing state-of-the-art methods.
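
The idea of testing whether a term pair clusters or disperses relative to its single-term pattern can be illustrated with a small Monte Carlo sketch. This is not the statistic used in the paper, which the abstract does not specify. It assumes tweet locations are planar (x, y) points, uses mean pairwise distance as a stand-in summary statistic, and treats the locations of tweets containing one of the terms as the reference pattern. The function names, the toy grid, and the example point sets are all hypothetical.

```python
import math
import random


def mean_pairwise_distance(points):
    """Average Euclidean distance over all unordered pairs of points."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.dist(points[i], points[j])
    return total / (n * (n - 1) / 2)


def cooccurrence_deviation(pair_points, single_term_points, n_sim=200, seed=0):
    """Compare a co-occurrence point pattern with its single-term pattern.

    Assumption: random subsets of the single-term locations, of the same
    size as the co-occurrence set, act as the null model.

    Returns (observed, fraction), where fraction is the share of simulated
    subsets whose mean pairwise distance is <= the observed one. A fraction
    near 0 suggests clustering (attraction); near 1 suggests dispersion
    (repulsion).
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(pair_points)
    k = len(pair_points)
    at_or_below = 0
    for _ in range(n_sim):
        sample = rng.sample(single_term_points, k)
        if mean_pairwise_distance(sample) <= observed:
            at_or_below += 1
    return observed, at_or_below / n_sim


# Toy data: tweets containing a single term lie on a 10 x 10 grid.
grid = [(float(x), float(y)) for x in range(10) for y in range(10)]

# Hypothetical co-occurrence sets: one tightly clustered, one spread out.
clustered = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
dispersed = [(0.0, 0.0), (0.0, 9.0), (9.0, 0.0), (9.0, 9.0)]

_, frac_clustered = cooccurrence_deviation(clustered, grid)
_, frac_dispersed = cooccurrence_deviation(dispersed, grid)
print("clustered pair fraction:", frac_clustered)
print("dispersed pair fraction:", frac_dispersed)
```

Under these assumptions, a term pair whose fraction falls in either tail would be a candidate feature to add alongside single terms in a location prediction model. Real tweet coordinates are latitude and longitude, so a geodesic distance would be more appropriate than the planar distance used here.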