
GDB-13: Chemical Universe Database (970M Molecules)
A dataset card for the Generated Database 13 (GDB-13), a database of nearly 1 billion small organic molecules for …

Dataset card for GDB-17, containing 166 billion small organic molecules representing the largest enumerated chemical …

Learn how GEOM transforms 2D molecular graphs into dynamic 3D conformer ensembles for molecular machine learning …

Skinnider (2024) shows that generating invalid SMILES actually improves chemical language model performance through …

An end-to-end cheminformatics pipeline transforming 1D chemical formulas into 3D conformer datasets using graph …

Supervised learning reveals hidden eigenvalue patterns that clustering missed, testing k-NN and logistic regression on …

Clustering analysis reveals why Coulomb matrix eigenvalues struggle with larger alkanes, using Dunn Index and silhouette …

Explore molecular shape recognition using Coulomb matrix eigenvalues. Analysis of alkane isomers from data generation to …

Learn how Coulomb matrices encode 3D molecular structure for machine learning from basic theory to Python implementation …

Perspective on SELFIES as a 100% robust SMILES alternative, with 16 future research directions for molecular AI.