Key Contribution
This paper presents the Geometric Ensemble Of Molecules (GEOM) dataset, which addresses the lack of a large-scale dataset linking molecular conformer ensembles to experimental data. Such a dataset is needed to train more advanced and accurate machine learning models.