The discovery of novel materials can be greatly accelerated by active learning, the iterative proposal of candidates informed by machine learning. However, standard global error metrics for model quality do not predict discovery performance and can be misleading. We introduce the notion of Pareto shell error to help judge the suitability of a model for proposing material candidates. Further, using synthetic cases and a thermoelectric dataset, we probe the relation between acquisition function fidelity and active learning performance. The results suggest novel diagnostic tools, as well as new insights for acquisition function design.
Zachary del Rosario (Stanford University), Matthias Rupp, Yoolhee Kim, Erin Antono, Julia Ling (Citrine Informatics). “Assessing the Frontier: Active Learning, Model Accuracy, and Multi-objective Materials Discovery and Optimization,” arXiv preprint (2020).