Computational Chemistry
SELFIES robustness demonstration

Invalid SMILES Benefit Chemical Language Models: A Study

A provocative 2024 Nature Machine Intelligence paper that challenges the assumption that invalid SMILES are failures. It shows empirically that the ability to generate invalid outputs improves chemical language model performance by enabling quality filtering and providing richer training signals.

3D ball-and-stick model of butane molecule representing the structural isomer generation process

Synthetic Isomer Data Generation Pipeline

An end-to-end data factory for molecular machine learning that transforms molecular formulas (e.g., C6H14) into labeled 3D conformer datasets. It uses MAYGEN for structural isomer enumeration, RDKit for 3D embedding, and physics-based featurization to address data scarcity in computational drug discovery.

3D ball-and-stick model of butane molecule showing linear carbon chain structure

Hearing Molecular Shape via Coulomb Matrix Eigenvalues

Can mathematical signatures capture molecular shape? We test whether Coulomb matrix eigenvalues can distinguish alkane constitutional isomers, covering both where unsupervised clustering fails and where supervised learning succeeds.

Coulomb matrix heatmap visualization showing molecular structure encoding on logarithmic scale

Coulomb Matrices for Molecular Machine Learning

A practical introduction to Coulomb matrices: how they transform molecular 3D structures into ML features, complete with Python examples and an honest assessment of their limitations.
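
As a taste of what that transformation looks like, here is a minimal pure-Python sketch of the standard Coulomb matrix definition: diagonal entries 0.5 * Z_i^2.4, off-diagonal entries Z_i * Z_j / |R_i - R_j|. The function name `coulomb_matrix` and the H2 toy input are illustrative assumptions, not code taken from the post, and coordinates are assumed to be in bohr (atomic units).

```python
import math


def coulomb_matrix(charges, coords):
    """Build a Coulomb matrix from nuclear charges and 3D coordinates.

    charges: sequence of atomic numbers Z_i
    coords:  sequence of (x, y, z) tuples, assumed to be in bohr
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: a fit to free-atom energies
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: pairwise nuclear Coulomb repulsion
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Hypothetical toy input: H2 with the two nuclei 1.4 bohr apart
h2 = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
```

The result is a symmetric N x N matrix whose entries depend on atom ordering, which is one of the limitations that motivates sorted or eigenvalue-based variants.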

Copper adatom trajectory on Cu(100) surface

Copper Adatom Diffusion on Cu(100): LAMMPS Simulation

Watch copper atoms move across a crystal surface in this molecular dynamics simulation. This video demonstrates surface diffusion mechanisms important for understanding catalysis and crystal growth processes.

Ball model representation of a crystal surface with steps, kinks, adatoms, and vacancies showing various surface features

LAMMPS Tutorial: Copper and Platinum Adatom Diffusion

Step-by-step LAMMPS tutorial for simulating copper and platinum adatom diffusion. Learn how to set up surface dynamics simulations, analyze trajectories, and see how atomic mass affects diffusion when generating machine learning datasets.

Schematic showing atom-surface interaction

Platinum Adatom Diffusion on Pt(100): LAMMPS Simulation

Visualize platinum atom diffusion on crystal surfaces in this LAMMPS molecular dynamics simulation. Understand surface mobility mechanisms crucial for catalysis and materials design.

SELFIES robustness demonstration

SELFIES and the Future of Molecular String Representations

A 2022 perspective paper by Krenn, Aspuru-Guzik, and colleagues that reviews 250 years of chemical notation and proposes 16 concrete research projects. The goal is to extend SELFIES beyond traditional organic chemistry into polymers, crystals, reactions, and other complex chemical systems where existing string representations break down.