Learning event-driven time series with phased recurrent neural networks
We explore machine learning algorithms for time series data, particularly recurrent neural networks. We are most interested in methods for handling long-duration time series in which the data channels are sampled asynchronously with respect to each other and with no fixed period. This kind of data appears frequently in real-world contexts. It occurs in patient health records, in logs from transportation safety systems and IT intrusion detection systems, and in event-driven sensor signals from manufacturing infrastructure, an example of which we examine in this thesis. We are motivated by the fact that improved methods for learning from this kind of data can provide value to a wide variety of industries and sectors. We implement recurrent network models and test them on publicly available benchmark tasks, namely Sequential MNIST and aperiodic sine wave classification. We also explore applications of the models to datasets consisting of sensor data from manufacturing systems in a real-world food production facility. These datasets are not publicly available. The recurrent architectures we implement and test are the generic LSTM and GRU models, as well as a specialized model, the Phased LSTM, which is hypothesized to perform well on long, asynchronous, and aperiodic signals. We also conceive, implement, and test a natural variation of the Phased LSTM, the Phased GRU, and provide code for constructing multilayer phased models in TensorFlow. On the publicly available datasets considered, we find that the time gate concept from the Phased LSTM shows value and generalizes well to the GRU cell architecture. Both our Phased GRU and the Phased LSTM that inspired it outperform the baseline LSTM and GRU models in classification accuracy. The phased models also become more accurate when stacked in layers, as do the baseline models.
The Phased GRU is slightly faster and slightly more accurate than the Phased LSTM on the tested problems, mirroring the relationship between GRU and LSTM. This indicates that the traits that affect performance in the GRU and LSTM models remain relevant in the phased context. We have not investigated this further for cases where LSTM might outperform GRU in accuracy. On the real, event-driven factory datasets, we find some of the weaknesses of phased models: they are clearly less accurate than LSTM and GRU on tasks with short input sequences. The results are more nuanced for longer input sequences and higher sampling rates, where the phased models sometimes outperform LSTM and GRU.
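To make the time gate concept referred to above more concrete, the following is a minimal NumPy sketch of the piecewise-linear gate as described in the original Phased LSTM publication. The parameter names (tau for the oscillation period, s for the phase shift, r_on for the open ratio, alpha for the leak rate) follow that publication. The function gated_update is a hypothetical illustration of how such a gate could blend a candidate state into a recurrent state such as a GRU hidden state; it is an assumption made for illustration and not necessarily the formulation used in this thesis.

```python
import numpy as np

def time_gate(t, tau, s, r_on, alpha=0.001):
    """Openness k_t of a Phased-LSTM-style time gate at timestamps t."""
    t = np.asarray(t, dtype=float)
    phi = np.mod(t - s, tau) / tau        # phase within the current cycle, in [0, 1)
    rising = 2.0 * phi / r_on             # first half of the open phase: 0 -> 1
    falling = 2.0 - 2.0 * phi / r_on      # second half of the open phase: 1 -> 0
    leak = alpha * phi                    # closed phase: small leak only
    return np.where(phi < 0.5 * r_on, rising,
                    np.where(phi < r_on, falling, leak))

def gated_update(h_prev, h_candidate, k):
    """Hypothetical helper: blend a candidate state into the previous state.

    With k near 0 the previous state is kept almost unchanged; with k = 1
    the candidate replaces it. Assumed here for illustration only.
    """
    return k * h_candidate + (1.0 - k) * h_prev

# Irregular (aperiodic) timestamps, as in event-driven data.
timestamps = np.array([0.0, 0.5, 1.0, 1.7, 5.0, 10.5])
k = time_gate(timestamps, tau=10.0, s=0.0, r_on=0.2)

h = np.zeros(3)
for k_t in k:
    candidate = np.ones(3)                # stand-in for a cell's proposed state
    h = gated_update(h, candidate, k_t)
```

Because the gate depends only on the timestamp, each unit updates its state mainly during its own short open phase, which is what allows inputs arriving at irregular times to be handled without resampling to a fixed period.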