Multi-Parameter Performance Modeling Based on Machine Learning with Basic Block Features
As HPC architectures and software grow in complexity and scale, performance modeling of parallel applications on large-scale HPC platforms has become increasingly important for tasks such as performance analysis, job management, and resource estimation. In this work, we propose MPerfPred, a multi-parameter performance modeling and prediction framework that uses basic block frequencies (BBFs) as features and applies machine learning algorithms to automatically construct multi-parameter performance models with high generalization ability. To reduce prediction overhead, we propose feature-filtering strategies that reduce the number of features in the training stage, and for each target application we build a serial program, called the BBF collector, that quickly collects feature values in the prediction stage. We demonstrate MPerfPred on the TianHe-2 supercomputer with six parallel applications. Results show that MPerfPred with support vector regression (SVR) predicts more accurately than other modeling methods based on input parameters. The average prediction error of MPerfPred is 8.42%, and the average standard deviation of its prediction errors is 6.09%. In the prediction stage, the average prediction overhead of MPerfPred is less than 0.13% of the total execution time.
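As a rough illustration of two ideas mentioned in the abstract, the sketch below filters a basic block frequency matrix and summarizes prediction errors. It is not the MPerfPred implementation: the abstract does not specify the filtering strategies or the error formula, so the two filters used here (dropping constant columns and exact duplicate columns), the relative-error definition, the function names `filter_features` and `relative_errors`, and all numbers are assumptions made purely for illustration.

```python
from statistics import mean, pstdev


def filter_features(samples):
    """Return indices of basic-block columns worth keeping.

    Assumed filters (not taken from the paper): drop columns that are
    constant across all runs and columns identical to one already kept.
    """
    kept = []
    seen = set()
    for j in range(len(samples[0])):
        column = tuple(row[j] for row in samples)
        if len(set(column)) == 1:
            continue  # constant across runs: no signal about input parameters
        if column in seen:
            continue  # exact duplicate of an already-kept column
        seen.add(column)
        kept.append(j)
    return kept


def relative_errors(predicted, measured):
    """Per-run relative prediction error in percent (assumed definition)."""
    return [abs(p - m) / m * 100.0 for p, m in zip(predicted, measured)]


# Hypothetical data: rows are runs with different input parameters,
# columns are execution counts of individual basic blocks.
bbf = [
    [10, 100, 100, 7],
    [10, 200, 200, 9],
    [10, 400, 400, 8],
]
kept = filter_features(bbf)  # columns 1 and 3 survive
reduced = [[row[j] for j in kept] for row in bbf]

# Hypothetical predicted vs. measured execution times in seconds.
predicted = [1.1, 1.8, 4.0]
measured = [1.0, 2.0, 4.0]
errors = relative_errors(predicted, measured)
print("kept columns:", kept)
print("mean error (%):", round(mean(errors), 2))
print("std dev of errors (%):", round(pstdev(errors), 2))
```

In a full pipeline, the reduced feature matrix would be passed to a regression model such as SVR. That step is omitted here to keep the sketch free of external dependencies.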