Data-driven fault detection for plunger pumps
- Institutt for marin teknikk 
In this thesis, a data-driven method for detecting faults in plunger pumps is developed. Large amounts of data have been accumulated in the process industry during recent decades, and a general digitization trend in the industry indicates that even more data will be generated and stored in the coming years. One key challenge for an industrial company is to benefit from the information contained in this vast amount of data. Data-driven approaches facilitate the task of revealing this knowledge, and discovering data patterns associated with equipment failure may enable more efficient maintenance. Equipment such as reciprocating pumps constitutes an essential component of the system in which it operates, and detecting incipient failures in advance reduces the overall risk level associated with pump operation. Combining pump operation data and failure logs in a data-driven model presents an opportunity to recognize faulty patterns in future operation. This thesis employs machine learning together with available data for the pumps at a gas processing plant to develop a method for data-driven fault detection. The method addresses the fault detection problem through a series of steps that produce a model capable of predicting whether the pump will fail. Relevant failure examples are extracted from the maintenance records and pump operation data. The available pump operation measurements are rotations per minute (RPM), pressure and flow. The dataset undergoes a data processing procedure that results in a filtered dataset with additional descriptive data features and a target vector of class labels indicating pump condition. The pump condition assigned to each data instance is determined according to a prediction window: all instances that fall within the window before a plunger failure are labeled as a critical pump condition, and all other instances are labeled as normal.
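The prediction-window labeling described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `label_instances`, the 12-second sampling interval, the timestamps and the failure time are all hypothetical, and the choice to treat the window as the half-open interval ending at the failure time is an assumption.

```python
from datetime import datetime, timedelta


def label_instances(timestamps, failure_times, window):
    """Label each sample 1 (critical) if it lies within `window` before a
    failure, otherwise 0 (normal).

    Assumption: the window is half-open, [failure - window, failure).
    """
    labels = []
    for t in timestamps:
        critical = any(f - window <= t < f for f in failure_times)
        labels.append(1 if critical else 0)
    return labels


# Hypothetical data: ten samples taken every 12 seconds, one logged failure.
start = datetime(2020, 1, 1, 12, 0, 0)
timestamps = [start + timedelta(seconds=12 * i) for i in range(10)]
failure_times = [start + timedelta(seconds=96)]

# A prediction window of 0.8 minutes, i.e. 48 seconds.
window = timedelta(seconds=48)

labels = label_instances(timestamps, failure_times, window)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
```

With these made-up values, the four samples in the 48 seconds leading up to the failure receive the critical label, and the sample recorded at the failure time itself is left as normal. A longer window would label more instances as critical, at the cost of including operation that may still be healthy.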
The data is explored using principal component analysis to discover possible patterns in the critical conditions. The labeled data is used together with a Bayesian hyperparameter tuning setup and a model validation technique to efficiently train, validate and produce a classifier model that predicts the pump condition from operation input data. A boosted ensemble of decision trees called RUSBoost is employed as the machine learning algorithm to produce this fault detection model. The unsupervised analysis of the data reveals no homogeneous and distinct clustering patterns separating normal and critical pump conditions. This lack of faulty patterns weakens the notion that the critical conditions occurring right before failure are detectable based on the information currently contained in the data. In the supervised approach, different prediction windows are tested to determine the most probable P-F interval. With a prediction window of 0.8 minutes, the fault detection model is able to detect 67% of the critical instances contained in the window. Assessment of the model validation results indicates that the model overfits the training data. When validated, the model generalizes sufficiently well to the given operation training data. However, if applied online to predict on unseen input, the model is anticipated to give unreliable detection results. It is apparent that additional measurements that capture the critical condition should be included, as the given RPM, pressure and flow data do not contain sufficient failure information to produce a reliable fault detection model. Measurements that are more closely related to the most probable failure causes, such as friction, are recommended. Since increased friction on the plungers increases the work required to maintain their constant motion, including measurements of the power consumption of the electric motors in the training data may make the critical conditions more visible in the data.
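The detection figure quoted above is the share of truly critical instances that the model flags as critical, which corresponds to recall on the critical class. The sketch below shows that calculation only; the function name `detection_rate` and the label vectors are hypothetical and do not come from the thesis data.

```python
def detection_rate(y_true, y_pred, positive=1):
    """Fraction of instances with true label `positive` that were also
    predicted as `positive` (recall on the critical class)."""
    true_positives = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p == positive
    )
    false_negatives = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p != positive
    )
    total_critical = true_positives + false_negatives
    if total_critical == 0:
        raise ValueError("no critical instances in y_true")
    return true_positives / total_critical


# Hypothetical labels: 1 = critical, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0, 0]

rate = detection_rate(y_true, y_pred)
print(f"{rate:.2f}")  # 0.67
```

In this made-up example two of the three critical instances are detected. The false alarm on a normal instance does not affect the detection rate, which is one reason the figure is best read alongside the validation results rather than on its own.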