Predicting Bike-Sharing Traffic Flow using Machine Learning
Abstract
Bike-Sharing Systems (BSSs) have grown rapidly in popularity worldwide in recent years. This growth is attributed to ubiquitous modern technology, increased urbanization, a desire to decrease pollution, and the need for flexible and integrated mobility in city environments. However, to keep BSSs in a balanced state where bikes and stations are readily available for users, companies are seeking ways to accurately predict future demand.
This thesis aims to objectively compare and evaluate various machine learning algorithms for predicting cluster-level BSS traffic flow. All models are evaluated using two different station clustering techniques, zone-based and grid-based, which represent ways to group stations both with and without expert knowledge of the system.
Furthermore, this thesis presents a rigorous review of current state-of-the-art research on demand prediction in BSSs and other on-demand transport services. Random Forest (RF), Feed-Forward Neural Network (FFNN) and Deep Residual Network (ResNet) emerged as the most promising models from this review, and were therefore implemented. Additionally, this thesis presents the first evaluation of Recurrent Neural Networks within the context of BSS demand prediction. FFNN proved to be the most successful model, reaching 21.1% and 36.65% improvements over the best baseline using grid-based and zone-based clustering, respectively. The developed system is designed to run in a cloud production environment, and the thesis discusses how it can be extended to handle real-time streaming data from Oslo City Bike users.
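To make the notion of cluster-level traffic flow concrete, the following is a minimal Python sketch of grid-based clustering, assuming it amounts to assigning each station to a fixed-size latitude/longitude cell and counting departures per cell and hour. The function names, the cell size, and the station coordinates are hypothetical and chosen for illustration only; they are not taken from the thesis or from Oslo City Bike data.

```python
import math
from collections import Counter, defaultdict


def grid_cell(lat, lon, origin_lat, origin_lon, cell_deg):
    """Return the (row, col) index of the grid cell containing a point."""
    row = math.floor((lat - origin_lat) / cell_deg)
    col = math.floor((lon - origin_lon) / cell_deg)
    return row, col


def cluster_stations(stations, origin_lat, origin_lon, cell_deg):
    """Group station ids by the grid cell they fall into."""
    clusters = defaultdict(list)
    for station_id, (lat, lon) in stations.items():
        cell = grid_cell(lat, lon, origin_lat, origin_lon, cell_deg)
        clusters[cell].append(station_id)
    return dict(clusters)


def cluster_flow(trips, clusters):
    """Count departures per (cell, hour) from (station_id, hour) records."""
    station_to_cell = {
        station_id: cell
        for cell, members in clusters.items()
        for station_id in members
    }
    return Counter((station_to_cell[s], hour) for s, hour in trips)


# Made-up station coordinates (latitude, longitude), for illustration only.
stations = {
    "A": (59.913, 10.752),
    "B": (59.917, 10.758),
    "C": (59.928, 10.731),
}
clusters = cluster_stations(stations, origin_lat=59.90, origin_lon=10.70,
                            cell_deg=0.01)

# Made-up trip departures as (station_id, hour of day).
trips = [("A", 8), ("B", 8), ("C", 8), ("A", 17)]
flows = cluster_flow(trips, clusters)
```

Here stations A and B share a cell, so their departures are aggregated into a single flow value per hour. A zone-based variant would replace `grid_cell` with a lookup into zones defined by someone who knows the system.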