
GDB-13: Chemical Universe Database (970M Molecules)
A dataset card for the Generated Database 13 (GDB-13), a database of nearly 1 billion small organic molecules for …
Dataset card for GDB-17, containing 166 billion small organic molecules representing the largest enumerated chemical …
Skinnider's 2024 Nature Machine Intelligence paper demonstrates that the ability to generate invalid SMILES is actually …
Clustering analysis reveals why Coulomb matrix eigenvalues struggle with larger alkanes, using Dunn Index and silhouette …
Explore molecular shape recognition using Coulomb matrix eigenvalues. Analysis of alkane isomers from data generation to …
A comprehensive perspective on molecular string representations, focusing on SELFIES as a 100% robust alternative to …
Campos & Ji's method for converting 2D molecular images to SMILES strings using Transformers and SELFIES representation.