Impact of Domain Knowledge in Time Series Forecasting of Oil and Gas Sensor Data

Grøneng, Sigurd; Bratvedt, Øyvind

Master thesis

Open

18053_FULLTEXT.pdf (3.253Mb)

18053_COVER.pdf (1.556Mb)

Permanent link

http://hdl.handle.net/11250/2569997

Publication date

2018

Collections

Department of Computer Science [6788]

Abstract

Oil and gas production is a complex operation and is therefore closely monitored. Sensors mounted on platforms measure the current state of the system, and alarms indicate when a value rises above or falls below the range considered normal. Sensor values are stored as time series that document the historical operation of the oil and gas platform.

This thesis investigates how sensor time series can be used for time series forecasting and anomaly detection. Using long short-term memory (LSTM) neural networks, a deep learning technique, we examine how domain knowledge about related sensors affects the forecasts. The time series come from sensors on one of Aker BP's oil and gas platforms in the North Sea. To ensure that they are relevant to one another, they all originate from the same system on the platform. At one point this system had a major malfunction, which makes it possible to test whether the LSTM can predict the anomaly. Subsets of the time series were created based on sensor type and location, and these subsets were used as input to the LSTM to determine whether some sensors have a high impact on the predictions.
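The idea of feeding different sensor subsets to a sequence model can be illustrated with a small windowing helper. This is only a minimal sketch, not the thesis's actual pipeline: the function `make_windows`, the sensor names, the window length, and the sample values are all hypothetical. It shows how the choice of `input_sensors` (which may leave out the predicted sensor) determines what a forecaster such as an LSTM gets to see.

```python
from typing import Dict, List, Sequence, Tuple


def make_windows(
    series: Dict[str, Sequence[float]],
    input_sensors: List[str],
    target_sensor: str,
    window: int,
    horizon: int = 1,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Build (inputs, targets) pairs for a sequence model.

    Each input holds `window` consecutive time steps of the chosen
    input sensors; each target is the target sensor's value `horizon`
    steps after the end of that window.
    """
    n = len(series[target_sensor])
    inputs: List[List[List[float]]] = []
    targets: List[float] = []
    for start in range(n - window - horizon + 1):
        end = start + window
        # One row per time step, one column per selected sensor.
        inputs.append(
            [[series[name][t] for name in input_sensors] for t in range(start, end)]
        )
        targets.append(series[target_sensor][end + horizon - 1])
    return inputs, targets


# Hypothetical sensors and values, for illustration only.
data = {
    "pressure_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "temp_a": [10.0, 11.0, 12.0, 13.0, 14.0],
    "flow_b": [0.1, 0.2, 0.3, 0.4, 0.5],
}

# A subset that excludes the predicted sensor from the input.
X, y = make_windows(data, ["temp_a", "flow_b"], "pressure_a", window=3)
```

Each element of `X` has shape (time steps, sensors), which is the layout sequence models commonly expect per sample. Changing only the `input_sensors` list gives a different subset while the target stays fixed.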

Results show that, in terms of forecasting error, using a combination of related sensors as input can be better than both using all sensors and using only the predicted sensor itself. Some combinations can also greatly increase the forecasting error, which suggests that sensors should not be included uncritically. A large portion of the combinations ended up mimicking the predicted time series, leading to low forecasting and anomaly detection power. Two combinations did not include the predicted sensor in the input, which prevented the models from mimicking the predicted time series. These models had the highest forecasting error but showed the most promising results of all combinations in terms of anomaly detection and forecasting power. Ultimately, we suggest extensive data cleansing if deep learning is to be used in this domain.
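One common way to check whether a one-step forecaster is merely mimicking its target is to compare it with a naive persistence forecast, which repeats the previous observation. The sketch below is an assumption about how such a check could look, not the evaluation used in the thesis. The helper names and the sample values are hypothetical.

```python
from typing import Dict, List, Sequence


def mean_abs_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equally long sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def persistence_forecast(actual: Sequence[float]) -> List[float]:
    """Naive one-step forecast: each value is predicted as the previous one."""
    return list(actual[:-1])


def mimic_report(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compare a model with the persistence baseline.

    `predicted[i]` is assumed to be the model's forecast for `actual[i]`.
    The first step is skipped because it has no previous observation.
    """
    targets = list(actual[1:])
    preds = list(predicted[1:])
    naive = persistence_forecast(actual)
    return {
        "model_mae": mean_abs_error(preds, targets),
        "naive_mae": mean_abs_error(naive, targets),
        # A small value means the model stays close to "repeat the last value".
        "gap_to_naive": mean_abs_error(preds, naive),
    }


# Hypothetical values: the "model" here simply lags the series by one step.
actual = [1.0, 2.0, 4.0, 7.0]
predicted = [0.0, 1.0, 2.0, 4.0]
report = mimic_report(actual, predicted)
```

In this constructed case `gap_to_naive` is zero and the model's error equals the baseline's, which is the signature of pure mimicking. A model can have a low forecasting error and still carry little predictive information if it stays this close to the baseline.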

Publisher

NTNU