Training sets based on uncertainty estimates in the cluster-expansion method
Peer reviewed, Journal article
Published version
Permanent link
https://hdl.handle.net/11250/3053345
Publication date
2021
Collections
- Institutt for fysikk [2655]
- Publikasjoner fra CRIStin - NTNU [37304]
Abstract
Cluster expansion (CE) has become increasingly popular in recent years, and its applications now extend far beyond its origins in binary alloys to complex crystalline systems often used in energy materials research. As with other modern machine learning approaches in materials science, many strategies have been proposed for training and fitting CE models to first-principles calculation results. Here, we propose a new strategy for constructing a training set in which structures are chosen based on their relevance in Monte Carlo sampling for statistical analysis and on the reduction of the expected error. A CE model constructed with the proposed approach depends less on the specific details of the training set, which increases the reproducibility of the model. The same method can be applied to other machine learning approaches where it is desirable to sample the relevant configurational space with a small set of training data, which is often the case when the training data consist of first-principles calculations.
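
The abstract does not spell out the selection algorithm, so the following is only a minimal sketch of one generic uncertainty-driven choice, not the authors' method. It assumes a CE model that is linear in its cluster-correlation vectors and uses the ridge-regression quantity x^T (X^T X + alpha I)^-1 x as a stand-in for the expected error. The function names, the regulariser `alpha`, and the idea that the candidate rows come from configurations visited in a Monte Carlo run are all assumptions made for illustration.

```python
import numpy as np


def predictive_variance(X_train, X_cand, alpha=1e-3):
    """Relative predictive variance of a ridge-regularised linear model.

    Each row is a (hypothetical) cluster-correlation vector of one structure.
    alpha is an assumed regularisation strength, not a value from the paper.
    """
    n_features = X_train.shape[1]
    A = X_train.T @ X_train + alpha * np.eye(n_features)
    A_inv = np.linalg.inv(A)
    # Evaluate x^T A^{-1} x for every candidate row x.
    return np.einsum("ij,jk,ik->i", X_cand, A_inv, X_cand)


def select_next(X_train, X_cand, alpha=1e-3):
    """Index of the candidate whose prediction is currently most uncertain."""
    return int(np.argmax(predictive_variance(X_train, X_cand, alpha)))


# Toy data: two training structures that barely vary in the second feature.
X_train = np.array([[1.0, 0.0], [1.0, 0.1]])
# Candidates, assumed here to be configurations visited during an MC run.
X_cand = np.array([[1.0, 0.05], [0.0, 1.0]])

variances = predictive_variance(X_train, X_cand)
chosen = select_next(X_train, X_cand)
```

In this toy case the second candidate lies along a direction the training set hardly covers, so it receives the larger variance and is the one selected for the next first-principles calculation.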