Computational Chemistry
Embedding energy and effective charge functions for Ni and Pd from the original EAM paper

Embedded-Atom Method: Impurities and Defects in Metals

Daw and Baskes's foundational 1984 paper introducing the embedded-atom method (EAM), a semi-empirical many-body interatomic potential that draws on density functional theory concepts to model metallic systems at a computational cost comparable to pair potentials.
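
For orientation, the EAM energy expression can be sketched as follows. The symbols are assumptions chosen for this summary and may differ from the notation in the post: $F_i$ is the embedding energy, $\rho_{h,i}$ the host electron density at atom $i$, $\rho_j^{a}$ the atomic density contributed by atom $j$, and $Z_i(R)$ the effective charge that builds the short-range pair term.

```latex
E_{\mathrm{tot}} = \sum_i F_i\!\left(\rho_{h,i}\right)
                 + \frac{1}{2} \sum_{i \neq j} \phi_{ij}\!\left(R_{ij}\right),
\qquad
\rho_{h,i} = \sum_{j \neq i} \rho_j^{a}\!\left(R_{ij}\right),
\qquad
\phi_{ij}(R) = \frac{Z_i(R)\, Z_j(R)}{R}
```

The embedding term depends on the local density rather than on individual pairs, which is where the many-body character comes from; the pair sum keeps the cost close to that of a pair potential.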

Protein folding funnel diagram illustrating energy landscape

Umbrella Sampling: Monte Carlo Free-Energy Estimation

Torrie and Valleau’s 1977 paper introducing importance sampling with non-physical distributions to overcome the sampling gap problem in Monte Carlo free-energy calculations, particularly for phase transitions.
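
The reweighting identity behind umbrella sampling can be checked with a small sketch. This is not the paper's Monte Carlo procedure: it enumerates a toy one-dimensional grid exactly instead of sampling, and the double-well `potential`, the umbrella `weight`, and the constants are all hypothetical choices made for illustration.

```python
import math

BETA = 1.0                               # inverse temperature (toy units, assumed)
GRID = [i / 10 for i in range(-20, 21)]  # discrete states x in [-2, 2]


def potential(x):
    """Hypothetical double-well potential with a barrier at x = 0."""
    return 4.0 * (x * x - 1.0) ** 2


def weight(x):
    """Hypothetical umbrella weight w(x) that partly cancels the barrier."""
    return math.exp(0.8 * BETA * potential(x))


def normalized(values):
    """Scale a list of non-negative numbers so that it sums to one."""
    total = sum(values)
    return [v / total for v in values]


def direct_average(observable):
    """Boltzmann average computed directly from exp(-beta * U)."""
    p = normalized([math.exp(-BETA * potential(x)) for x in GRID])
    return sum(pi * observable(x) for pi, x in zip(p, GRID))


def reweighted_average(observable):
    """Same average recovered from the biased distribution w * exp(-beta * U)."""
    p_w = normalized([weight(x) * math.exp(-BETA * potential(x)) for x in GRID])
    numerator = sum(pi * observable(x) / weight(x) for pi, x in zip(p_w, GRID))
    denominator = sum(pi / weight(x) for pi, x in zip(p_w, GRID))
    return numerator / denominator


def square(x):
    return x * x


print(direct_average(square), reweighted_average(square))
```

Both numbers agree to rounding error, because dividing by `weight(x)` undoes the bias exactly. In a real simulation the biased distribution is sampled rather than enumerated, and the benefit is that the barrier region between the wells is visited far more often.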

Schematic showing atom-surface interaction using the method of images

Lennard-Jones on Adsorption and Diffusion on Surfaces

Lennard-Jones’s 1932 theoretical paper applying quantum mechanical potential energy surfaces to gas-solid interactions, providing the first unified framework explaining both physisorption and chemisorption as different regions of the same energy landscape.

GDB-13 molecule structure showing CCCC(O)(CO)CC1CC1CN

GDB-13: Chemical Universe Database (970M Molecules)

GDB-13 contains nearly 1 billion systematically enumerated small organic molecules with up to 13 heavy atoms (C, N, O, S, and Cl), covering drug-like chemical space at billion scale.

GDB-17 molecule structure showing complex polycyclic architecture

GDB-17: Chemical Universe Database (166.4B Molecules)

GDB-17 contains 166.4 billion systematically enumerated small organic molecules with up to 17 heavy atoms (C, N, O, S, and halogens). It represents the most comprehensive exploration of drug-relevant chemical space achieved through computational enumeration.

3D conformer ensemble of a drug-like molecule from the GEOM dataset

GEOM Dataset: 3D Molecular Conformer Generation

A practical overview of the GEOM dataset and how it advances 3D molecular machine learning by connecting static 2D molecular graphs to the ensembles of 3D conformers that molecules actually adopt.

Markush structure diagram

SubGrapher: Visual Fingerprinting of Chemical Structures

SubGrapher introduces a visual fingerprinting approach to Optical Chemical Structure Recognition that detects functional groups directly from images, enabling chemical database searches without full structure reconstruction and handling complex patent images including Markush structures.

Diagram showing how Ring-Free Language decouples a molecular graph into skeleton, ring structures, and branch information

RFL: Simplifying Chemical Structure Recognition (AAAI 2025)

Proposes Ring-Free Language (RFL) to hierarchically decouple molecular graphs into skeletons, rings, and branches, solving issues with 1D serialization of complex 2D structures. Introduces the Molecular Skeleton Decoder (MSD) to progressively predict these components, achieving strong results on handwritten and printed chemical structure recognition benchmarks.

3D ball-and-stick model of butane molecule representing the structural isomer generation process

Synthetic Isomer Data Generation Pipeline

An end-to-end data factory for molecular machine learning that transforms raw chemical formulas (e.g., C6H14) into labeled 3D conformer datasets, using MAYGEN for structural isomer enumeration, RDKit for 3D embedding, and physics-based featurization to address data scarcity in computational drug discovery.

3D ball-and-stick model of butane molecule showing linear carbon chain structure

Hearing Molecular Shape via Coulomb Matrix Eigenvalues

Can mathematical signatures capture molecular shape? We test whether Coulomb matrix eigenvalues can distinguish alkane constitutional isomers, from unsupervised clustering failures to supervised learning successes.

Coulomb matrix heatmap visualization showing molecular structure encoding on logarithmic scale

Coulomb Matrices for Molecular Machine Learning

A practical introduction to Coulomb matrices: how they transform molecular 3D structures into ML features, complete with Python examples and an honest assessment of their limitations.
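
As a minimal sketch of the idea, assuming the commonly used definition (diagonal entries 0.5 * Z^2.4, off-diagonal entries Z_i * Z_j / distance, with coordinates in bohr), a Coulomb matrix can be built in plain Python. The function name `coulomb_matrix` and the H2 geometry are illustrative choices, not code from the post.

```python
import math


def coulomb_matrix(charges, coords):
    """Build a Coulomb matrix from nuclear charges and 3D coordinates (bohr)."""
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal: fitted self-interaction term of each atom.
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal: Coulomb repulsion between nuclei i and j.
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Toy example: H2 with an assumed bond length of 1.4 bohr.
h2 = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
for row in h2:
    print(row)
```

The matrix is symmetric and unchanged by rotating or translating the molecule, but it changes when atoms are reordered, which is one reason sorted rows or eigenvalues are often used as the actual ML features.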

Copper adatom trajectory on Cu(100) surface

Copper Adatom Diffusion on Cu(100): LAMMPS Simulation

Watch copper atoms move across a crystal surface in this molecular dynamics simulation. This video demonstrates surface diffusion mechanisms important for understanding catalysis and crystal growth processes.