<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Molecular Simulation on Hunter Heidenreich | ML Research Scientist</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/</link><description>Recent content in Molecular Simulation on Hunter Heidenreich | ML Research Scientist</description><image><title>Hunter Heidenreich | ML Research Scientist</title><url>https://hunterheidenreich.com/img/avatar.webp</url><link>https://hunterheidenreich.com/img/avatar.webp</link></image><generator>Hugo -- 0.147.7</generator><language>en-US</language><copyright>2026 Hunter Heidenreich</copyright><lastBuildDate>Sat, 11 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://hunterheidenreich.com/notes/chemistry/molecular-simulation/index.xml" rel="self" type="application/rss+xml"/><item><title>Ewald Message Passing for Molecular Graphs</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ewald-message-passing-molecular-graphs/</link><pubDate>Tue, 07 Apr 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ewald-message-passing-molecular-graphs/</guid><description>Ewald message passing augments GNNs with Fourier-space long-range interactions, improving energy predictions by 10-16% on OC20 and OE62 benchmarks.</description><content:encoded><![CDATA[<h2 id="a-fourier-space-long-range-correction-for-molecular-gnns">A Fourier-Space Long-Range Correction for Molecular GNNs</h2>
<p>This is a <strong>Method</strong> paper that introduces Ewald message passing (Ewald MP), a general framework for incorporating long-range interactions into message passing neural networks (MPNNs) for molecular <a href="/notes/chemistry/molecular-simulation/learning-smooth-interatomic-potentials/">potential energy surface</a> prediction. The key contribution is a nonlocal Fourier-space message passing scheme, grounded in the classical <a href="https://en.wikipedia.org/wiki/Ewald_summation">Ewald summation</a> technique from computational physics, that complements the short-range message passing of existing GNN architectures.</p>
<h2 id="the-long-range-interaction-problem-in-molecular-gnns">The Long-Range Interaction Problem in Molecular GNNs</h2>
<p>Standard MPNNs for molecular property prediction rely on a spatial distance cutoff to define atomic neighborhoods. While this locality assumption enables favorable scaling with system size and provides a useful inductive bias, it fundamentally limits the model&rsquo;s ability to capture long-range interactions such as electrostatic forces and van der Waals (<a href="https://en.wikipedia.org/wiki/London_dispersion_force">London dispersion</a>) interactions. These interactions decay slowly with distance (e.g., electrostatic energy follows a $1/r$ power law), and truncating them with a distance cutoff can introduce severe artifacts in thermochemical predictions.</p>
<p>This problem is well-known in molecular dynamics, where empirical force fields explicitly separate bonded (short-range) and non-bonded (long-range) energy terms. The Ewald summation technique addresses this by decomposing interactions into a short-range part that converges quickly with a distance cutoff and a long-range part whose Fourier transform converges quickly with a frequency cutoff. The authors propose bringing this same strategy into the GNN paradigm.</p>
<h2 id="from-ewald-summation-to-learnable-fourier-space-messages">From Ewald Summation to Learnable Fourier-Space Messages</h2>
<p>The core insight is a formal analogy between the continuous-filter convolution used in MPNNs and the electrostatic potential computation in Ewald summation. In a standard continuous-filter convolution, the message sum for atom $i$ is:</p>
<p>$$
M_i^{(l+1)} = \sum_{j \in \mathcal{N}(i)} h_j^{(l)} \cdot \Phi^{(l)}(| \mathbf{x}_i - \mathbf{x}_j |)
$$</p>
<p>where $h_j^{(l)}$ are atom embeddings and $\Phi^{(l)}$ is a learned radial filter. Comparing this to the electrostatic potential $V_i^{\text{es}}(\mathbf{x}_i) = \sum_{j \neq i} q_j \cdot \Phi^{\text{es}}(| \mathbf{x}_i - \mathbf{x}_j |)$ reveals a direct correspondence: atom embeddings play the role of partial charges, and learned filters replace the $1/r$ kernel.</p>
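<p>As an illustration, the short-range message sum can be written as a naive double loop (a minimal numpy sketch, not the authors' implementation; <code>radial_filter</code> stands in for the learned filter $\Phi^{(l)}$, which in practice is a neural network):</p>

```python
import numpy as np

def message_sum(x, h, radial_filter, cutoff):
    """Continuous-filter convolution: M_i = sum_{j in N(i)} h_j * Phi(|x_i - x_j|).

    x: (N, 3) atom positions; h: (N, d) atom embeddings;
    radial_filter: maps a distance to a (d,) filter (elementwise product).
    """
    M = np.zeros_like(h)
    for i in range(len(x)):
        for j in range(len(x)):
            if i == j:
                continue
            r = np.linalg.norm(x[i] - x[j])
            if r < cutoff:  # locality: the distance cutoff defines the neighborhood
                M[i] += h[j] * radial_filter(r)
    return M
```

<p>Real implementations vectorize this over a precomputed neighbor list, but the distance cutoff is the essential feature: any atom beyond it contributes nothing.</p>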
<p>Ewald MP decomposes the learned filter into short-range and long-range components. The short-range part is handled by any existing GNN architecture with a distance cutoff. The long-range part is computed as a sum over Fourier frequencies:</p>
<p>$$
M^{\text{lr}}(\mathbf{x}_i) = \sum_{\mathbf{k}} \exp(i \mathbf{k}^T \mathbf{x}_i) \cdot s_{\mathbf{k}} \cdot \hat{\Phi}^{\text{lr}}(| \mathbf{k} |)
$$</p>
<p>where $s_{\mathbf{k}}$ are <strong><a href="https://en.wikipedia.org/wiki/Structure_factor">structure factor</a> embeddings</strong>, computed as:</p>
<p>$$
s_{\mathbf{k}} = \sum_{j \in \mathcal{S}} h_j \exp(-i \mathbf{k}^T \mathbf{x}_j)
$$</p>
<p>These structure factor embeddings are a Fourier-space representation of the atom embedding distribution, and truncating to low frequencies effectively coarse-grains the hidden model state while preserving long-range information. The frequency filters $\hat{\Phi}^{\text{lr}}$ are learned, making the entire scheme data-driven rather than tied to a fixed physical functional form.</p>
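<p>The two Fourier-space equations above translate directly into numpy (an illustrative toy, not the authors' implementation; <code>freq_filter</code> stands in for the learned frequency filter $\hat{\Phi}^{\text{lr}}$):</p>

```python
import numpy as np

def structure_factors(x, h, k_vectors):
    """s_k = sum_j h_j * exp(-i k^T x_j): Fourier-space view of the embeddings.

    x: (N, 3) positions; h: (N, d) embeddings; k_vectors: (K, 3) frequencies.
    Returns a complex (K, d) array of structure factor embeddings.
    """
    phase = np.exp(-1j * (k_vectors @ x.T))  # (K, N) complex phases
    return phase @ h                         # (K, d)

def long_range_messages(x, s, k_vectors, freq_filter):
    """M_lr(x_i) = Re sum_k exp(i k^T x_i) * s_k * filter(|k|)."""
    phase = np.exp(1j * (x @ k_vectors.T))              # (N, K)
    w = freq_filter(np.linalg.norm(k_vectors, axis=1))  # (K,) filter values
    return (phase @ (s * w[:, None])).real              # (N, d)
```

<p>Both matrix products are $(N, K)$-shaped, which is where the $\mathcal{O}(N_{\text{at}} N_{\text{k}})$ cost comes from: every atom interacts with every retained frequency, not with every other atom.</p>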
<p>The method handles both <strong>periodic</strong> systems (where the <a href="https://en.wikipedia.org/wiki/Reciprocal_lattice">reciprocal lattice</a> provides a natural frequency discretization) and <strong>aperiodic</strong> systems (where the Fourier domain is discretized using a cubic voxel grid with SVD-based rotation alignment to preserve rotation invariance). The combined embedding update becomes:</p>
<p>$$
h_i^{(l+1)} = \frac{1}{\sqrt{3}} \left[ h_i^{(l)} + f_{\text{upd}}^{\text{sr}}(M_i^{\text{sr}}) + f_{\text{upd}}^{\text{lr}}(M_i^{\text{lr}}) \right]
$$</p>
<p>The computational complexity is $\mathcal{O}(N_{\text{at}} N_{\text{k}})$, and by fixing the number of frequency vectors $N_{\text{k}}$, linear scaling $\mathcal{O}(N_{\text{at}})$ is achievable.</p>
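<p>The combined update above is a one-liner; the $1/\sqrt{3}$ factor keeps the variance of a sum of three roughly independent, unit-variance terms near one (a sketch; <code>f_sr</code> and <code>f_lr</code> stand in for the learned update functions):</p>

```python
import numpy as np

def combined_update(h, m_sr, m_lr, f_sr, f_lr):
    """h^{l+1} = (h^l + f_sr(M_sr) + f_lr(M_lr)) / sqrt(3).

    The 1/sqrt(3) normalization keeps activation scales stable when
    the three terms have comparable, roughly independent variance.
    """
    return (h + f_sr(m_sr) + f_lr(m_lr)) / np.sqrt(3.0)
```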
<h2 id="experiments-across-four-gnn-architectures-and-two-datasets">Experiments Across Four GNN Architectures and Two Datasets</h2>
<p>The authors test Ewald MP as an augmentation on four baseline architectures: <a href="/notes/chemistry/datasets/marcel/">SchNet, PaiNN, DimeNet++, and GemNet-T</a>. Two datasets are used:</p>
<ul>
<li><strong>OC20</strong> (Chanussot et al., 2021): ~265M periodic structures of adsorbate-catalyst systems with DFT-computed energies and forces. The OC20-2M subsplit is used for training.</li>
<li><strong>OE62</strong> (Stuke et al., 2020): ~62,000 large aperiodic organic molecules with DFT-computed energies that include a DFT-D3 dispersion correction for London dispersion interactions.</li>
</ul>
<p>All baselines use a 6 Å distance cutoff and 50 maximum neighbors. The Ewald modification is minimal: the long-range message sum is added as an additional skip connection term in each interaction block. Comparison studies include: (1) increasing the distance cutoff to match the computational cost of Ewald MP, (2) replacing the Ewald block with a SchNet interaction block at increased cutoff, and (3) increasing atom embedding dimensions to match Ewald MP&rsquo;s parameter count.</p>
<h3 id="key-energy-mae-results-on-oe62">Key Energy MAE Results on OE62</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Baseline (meV)</th>
          <th>Ewald MP (meV)</th>
          <th>Improvement</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>SchNet</td>
          <td>133.5</td>
          <td>79.2</td>
          <td>40.7%</td>
      </tr>
      <tr>
          <td>PaiNN</td>
          <td>61.4</td>
          <td>57.9</td>
          <td>5.7%</td>
      </tr>
      <tr>
          <td>DimeNet++</td>
          <td>51.2</td>
          <td>46.5</td>
          <td>9.2%</td>
      </tr>
      <tr>
          <td>GemNet-T</td>
          <td>51.5</td>
          <td>47.4</td>
          <td>8.0%</td>
      </tr>
  </tbody>
</table>
<h3 id="key-energy-mae-results-on-oc20-averaged-across-test-splits">Key Energy MAE Results on OC20 (Averaged Across Test Splits)</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Baseline (meV)</th>
          <th>Ewald MP (meV)</th>
          <th>Improvement</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>SchNet</td>
          <td>895</td>
          <td>830</td>
          <td>7.3%</td>
      </tr>
      <tr>
          <td>PaiNN</td>
          <td>448</td>
          <td>393</td>
          <td>12.3%</td>
      </tr>
      <tr>
          <td>DimeNet++</td>
          <td>496</td>
          <td>445</td>
          <td>10.4%</td>
      </tr>
      <tr>
          <td>GemNet-T</td>
          <td>346</td>
          <td>307</td>
          <td>11.3%</td>
      </tr>
  </tbody>
</table>
<h2 id="robust-long-range-improvements-and-dispersion-recovery">Robust Long-Range Improvements and Dispersion Recovery</h2>
<p>Ewald MP achieves consistent improvements across all models and both datasets, averaging 16.1% on OE62 and 10.3% on OC20. Several findings stand out:</p>
<ol>
<li>
<p><strong>Robustness</strong>: Unlike the increased-cutoff and SchNet-LR alternatives, Ewald MP never produces detrimental effects in any tested configuration. The increased cutoff setting hurts SchNet and PaiNN on OE62, and the SchNet-LR block fails to improve DimeNet++ and GemNet-T.</p>
</li>
<li>
<p><strong>Long-range specificity</strong>: A binning analysis on OE62 groups molecules by the magnitude of their DFT-D3 dispersion correction. Ewald MP shows an outsize improvement for structures with large long-range energy contributions. It recovers or surpasses a &ldquo;cheating&rdquo; baseline that receives the exact DFT-D3 ground truth as an additional input.</p>
</li>
<li>
<p><strong>Efficiency on periodic systems</strong>: Ewald MP achieves similar relative improvements on OC20 at roughly half the relative computational cost compared to OE62, suggesting periodic structures as a particularly attractive application domain.</p>
</li>
<li>
<p><strong>Force predictions</strong>: Improvements in <a href="/notes/chemistry/molecular-simulation/dark-side-of-forces/">force MAEs</a> are consistent but small, which is expected since the frequency truncation removes high-frequency contributions to the potential energy surface.</p>
</li>
<li>
<p><strong>Ablation studies</strong>: Results are robust across different frequency cutoffs, voxel resolutions, and filtering strategies, with the non-radial periodic filtering scheme outperforming radial alternatives on out-of-distribution generalization.</p>
</li>
</ol>
<p>Limitations include the current focus on scalar (invariant) embeddings only (PaiNN&rsquo;s equivariant vector embeddings are not augmented), and the potential for a &ldquo;gap&rdquo; of medium-range interactions when $N_{\text{k}}$ is fixed for linear scaling. The authors suggest adapting more efficient Ewald summation variants (e.g., particle mesh Ewald with $\mathcal{O}(N \log N)$ scaling) as future work.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Training (periodic)</td>
          <td>OC20-2M</td>
          <td>~2M structures</td>
          <td>Subsplit of OC20; PBC; DFT energies and forces</td>
      </tr>
      <tr>
          <td>Training (aperiodic)</td>
          <td>OE62</td>
          <td>~62,000 molecules</td>
          <td>Large organic molecules; DFT energies with D3 correction</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>OC20-test (4 splits: ID, OOD-ads, OOD-cat, OOD-both)</td>
          <td>Varies</td>
          <td>Evaluated via submission to OC20 evaluation server</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>OE62-val, OE62-test</td>
          <td>~6,000 each</td>
          <td>Direct evaluation</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li>Ewald message passing is integrated as an additional skip connection term in each interaction block</li>
<li>For periodic systems: non-radial filtering with fixed reciprocal lattice positions ($N_x, N_y, N_z$ hyperparameters)</li>
<li>For aperiodic systems: radial Gaussian basis function filtering with frequency cutoff $c_k$ and voxel resolution $\Delta = 0.2$ Å$^{-1}$</li>
<li>SVD-based coordinate alignment for rotation invariance in the aperiodic case</li>
<li>Bottleneck dimension $N_\downarrow = 16$ (GemNet-T) or $N_\downarrow = 8$ (others)</li>
<li>Update function: dense layer + $N_{\text{hidden}}$ residual layers ($N_{\text{hidden}} = 3$, except PaiNN with $N_{\text{hidden}} = 0$)</li>
</ul>
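<p>The SVD-based alignment for the aperiodic case can be sketched as follows (a simplified illustration: the paper's scheme additionally resolves the sign ambiguity of the singular vectors, which this sketch does not):</p>

```python
import numpy as np

def align_frame(x):
    """Rotate centered coordinates into a canonical frame via SVD.

    With x_c = U S V^T, the rotated coordinates x_c V = U S have a
    diagonal covariance, so the Fourier voxel grid sees any rotated
    copy of a structure in the same canonical orientation
    (up to per-axis sign conventions, not handled here).
    """
    xc = x - x.mean(axis=0)                         # remove translation
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return xc @ vt.T                                # project onto principal axes
```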
<h3 id="models">Models</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Embedding Size (OE62)</th>
          <th>Interaction Blocks</th>
          <th>Ewald Params (OE62)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>SchNet</td>
          <td>512</td>
          <td>4</td>
          <td>12.2M total</td>
      </tr>
      <tr>
          <td>PaiNN</td>
          <td>512</td>
          <td>4</td>
          <td>15.7M total</td>
      </tr>
      <tr>
          <td>DimeNet++</td>
          <td>256</td>
          <td>3</td>
          <td>4.8M total</td>
      </tr>
      <tr>
          <td>GemNet-T</td>
          <td>256</td>
          <td>3</td>
          <td>16.1M total</td>
      </tr>
  </tbody>
</table>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li>Primary metric: Energy mean absolute error (EMAE) in meV</li>
<li>Secondary metric: Force MAE in meV/Å (OC20 only)</li>
<li>Loss: Linear combination of energy and force MAEs (Eq. 15) with model-specific force multipliers</li>
<li>Optimizer: Adam with weight decay ($\lambda = 0.01$)</li>
</ul>
<h3 id="hardware">Hardware</h3>
<ul>
<li>All runtime measurements on NVIDIA A100 GPUs</li>
<li>Runtimes measured after 50 warmup batches, averaged over 500 batches, minimum of 3 repetitions</li>
<li>Code: <a href="https://github.com/arthurkosmala/EwaldMP">EwaldMP</a> (Hippocratic License 3.0)</li>
</ul>
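<p>The timing protocol amounts to the following (a CPU-only sketch of the described measurement procedure; on a GPU each timestamp would additionally require a device synchronization):</p>

```python
import time

def measure_runtime(fn, batches, n_warmup=50, n_measure=500, n_reps=3):
    """Warmup batches are executed but not timed, the timed batches are
    averaged, and the minimum over repetitions is reported.
    `fn` is any callable taking one batch; `batches` must be reusable.
    """
    per_batch = []
    for _ in range(n_reps):
        for b in batches[:n_warmup]:
            fn(b)                        # warmup: excluded from timing
        t0 = time.perf_counter()
        for b in batches[:n_measure]:
            fn(b)
        per_batch.append((time.perf_counter() - t0) / n_measure)
    return min(per_batch)
```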
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/arthurkosmala/EwaldMP">EwaldMP</a></td>
          <td>Code</td>
          <td>Hippocratic License 3.0 (new files) / MIT (OC20 base)</td>
          <td>Official implementation built on the Open Catalyst Project codebase</td>
      </tr>
      <tr>
          <td><a href="https://github.com/Open-Catalyst-Project/ocp/blob/main/DATASET.md">OC20</a></td>
          <td>Dataset</td>
          <td>CC-BY-4.0</td>
          <td>~265M periodic adsorbate-catalyst structures with DFT energies and forces</td>
      </tr>
      <tr>
          <td><a href="https://doi.org/10.1038/s41597-020-0385-y">OE62</a></td>
          <td>Dataset</td>
          <td>CC-BY-4.0</td>
          <td>~62,000 large organic molecules with DFT energies including D3 correction</td>
      </tr>
  </tbody>
</table>
<p><strong>Reproducibility status</strong>: Highly Reproducible. Source code, both datasets, and detailed hyperparameters (including per-model learning rates, batch sizes, and Ewald-specific settings) are all publicly available. Pre-trained model weights are not provided.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Kosmala, A., Gasteiger, J., Gao, N., &amp; Günnemann, S. (2023). Ewald-based Long-Range Message Passing for Molecular Graphs. In <em>Proceedings of the 40th International Conference on Machine Learning (ICML 2023)</em>.</p>
<p><strong>Publication</strong>: ICML 2023</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{kosmala2023ewald,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Ewald-based Long-Range Message Passing for Molecular Graphs}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Kosmala, Arthur and Gasteiger, Johannes and Gao, Nicholas and G{\&#34;u}nnemann, Stephan}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 40th International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2023}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{PMLR}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{202}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>PharMolixFM: Multi-Modal All-Atom Molecular Models</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/pharmolixfm-all-atom-foundation-models/</link><pubDate>Sat, 28 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/pharmolixfm-all-atom-foundation-models/</guid><description>PharMolixFM unifies diffusion, flow matching, and Bayesian flow networks for all-atom molecular modeling and generation with task-specific denoising priors.</description><content:encoded><![CDATA[<h2 id="a-unified-framework-for-all-atom-molecular-foundation-models">A Unified Framework for All-Atom Molecular Foundation Models</h2>
<p>PharMolixFM is a <strong>Method</strong> paper that introduces a unified framework for constructing all-atom foundation models for molecular modeling and generation. The primary contribution is the systematic implementation of three multi-modal generative model variants (diffusion, flow matching, and Bayesian flow networks) within a single architecture, along with a task-unifying denoising formulation that enables training on multiple structural biology tasks simultaneously. The framework achieves competitive performance on protein-small-molecule docking and structure-based drug design while providing the first empirical analysis of inference scaling laws for molecular generative models.</p>
<h2 id="challenges-in-multi-modal-atomic-modeling">Challenges in Multi-Modal Atomic Modeling</h2>
<p>Existing all-atom foundation models such as AlphaFold3, RoseTTAFold All-Atom, and ESM-AA face two core challenges that limit their generalization across molecular modeling and generation tasks.</p>
<p>First, atomic data is inherently multi-modal: each atom comprises both a discrete atom type and continuous 3D coordinates. This poses challenges for structure models that need to jointly capture and predict both modalities. Unlike text or image data that exhibit a single modality, molecular structures require generative models that can handle discrete categorical variables (atom types, bond types) and continuous variables (coordinates) simultaneously.</p>
<p>Second, there has been no comprehensive analysis of how different training objectives and sampling strategies impact the performance of all-atom foundation models. Prior work has focused on individual model architectures without systematically comparing generative frameworks or studying how inference-time compute scaling affects prediction quality.</p>
<p>PharMolixFM addresses both challenges by providing a unified framework that implements three state-of-the-art multi-modal generative models and formulates all downstream tasks as a generalized denoising process with task-specific priors.</p>
<h2 id="multi-modal-denoising-with-task-specific-priors">Multi-Modal Denoising with Task-Specific Priors</h2>
<p>The core innovation of PharMolixFM is the formulation of molecular tasks as a generalized denoising process where task-specific priors control which parts of the molecular system are noised during training. The framework decomposes a biomolecular system into $N$ atoms represented as a triplet $\bar{\mathbf{S}}_0 = \langle \mathbf{X}_0, \mathbf{A}_0, \mathbf{E}_0 \rangle$, where $\mathbf{X}_0 \in \mathbb{R}^{N \times 3}$ are atom coordinates, $\mathbf{A}_0 \in \mathbb{Z}^{N \times D_1}$ are one-hot atom types, and $\mathbf{E}_0 \in \mathbb{Z}^{N \times N \times D_2}$ are one-hot bond types.</p>
<p>The generative model estimates the density $p_\theta(\langle \mathbf{X}_0, \mathbf{A}_0, \mathbf{E}_0 \rangle)$ subject to SE(3) invariance:</p>
<p>$$
p_\theta(\langle \mathbf{R}\mathbf{X}_0 + \mathbf{t}, \mathbf{A}_0, \mathbf{E}_0 \rangle) = p_\theta(\langle \mathbf{X}_0, \mathbf{A}_0, \mathbf{E}_0 \rangle)
$$</p>
<p>The variational lower bound is optimized over latent variables $S_1, \ldots, S_T$ obtained by adding independent noise to different modalities and atoms:</p>
<p>$$
q(S_{1:T} \mid S_0) = \prod_{i=1}^{T} \prod_{j=1}^{N} q(\mathbf{X}_{i,j} \mid \mathbf{X}_{0,j}, \sigma_{i,j}^{(\mathbf{X})})\, q(\mathbf{A}_{i,j} \mid \mathbf{A}_{0,j}, \sigma_{i,j}^{(\mathbf{A})})\, q(\mathbf{E}_{i,j} \mid \mathbf{E}_{0,j}, \sigma_{i,j}^{(\mathbf{E})})
$$</p>
<p>A key design choice is the noise schedule $\sigma_{i,j}^{(\mathcal{M})} = \frac{i}{T} \cdot (1 - \text{fix}_j^{(\mathcal{M})})$, where $\text{fix}_j^{(\mathcal{M})}$ is a factor between 0 and 1 that controls which atoms and modalities are held fixed: quantities with $\text{fix} = 1$ receive no noise and serve as conditioning inputs, while quantities with $\text{fix} = 0$ are fully noised and become generation targets. This &ldquo;Fix&rdquo; mechanism enables multiple training tasks:</p>
<ul>
<li><strong>Docking</strong> ($\text{Fix} = 1$ for protein and molecular graph, $\text{Fix} = 0$ for molecule coordinates): predicts binding pose given known atom/bond types.</li>
<li><strong>Structure-based drug design</strong> ($\text{Fix} = 1$ for protein, $\text{Fix} = 0$ for all molecule properties): generates novel molecules for a given pocket.</li>
<li><strong>Robustness augmentation</strong> ($\text{Fix} = 0.7$ for 15% randomly selected atoms, $\text{Fix} = 0$ for rest): simulates partial structure determination.</li>
</ul>
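<p>The task settings above imply that $\text{Fix} = 1$ marks a quantity as held fixed (a clean conditioning input) and $\text{Fix} = 0$ as fully noised (a generation target). Under that convention, the per-step, per-atom noise scales can be sketched as:</p>

```python
import numpy as np

def noise_schedule(T, fix):
    """Per-step, per-atom noise scales from the Fix mechanism.

    Convention assumed here (matching the task settings above):
    fix[j] = 1 -> quantity j is held fixed (conditioning, never noised),
    fix[j] = 0 -> quantity j is fully noised by the final step.
    Returns a (T, N) array of sigma values.
    """
    fix = np.asarray(fix, dtype=float)
    steps = np.arange(1, T + 1, dtype=float)[:, None] / T  # (T, 1) ramp i/T
    return steps * (1.0 - fix)[None, :]                    # (T, N)
```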
<h3 id="three-generative-model-variants">Three Generative Model Variants</h3>
<p><strong>Multi-modal diffusion (PharMolixFM-Diff)</strong> uses a Markovian forward process. Continuous coordinates follow Gaussian diffusion while discrete variables use a D3PM categorical transition:</p>
<p>$$
q(\mathbf{X}_{i,j} \mid \mathbf{X}_{0,j}) = \mathcal{N}(\sqrt{\alpha_{i,j}}\,\mathbf{X}_{0,j}, (1 - \alpha_{i,j}) \mathbf{I}), \quad \alpha_{i,j} = \prod_{k=1}^{i}(1 - \sigma_{k,j}^{(\mathbf{X})})
$$</p>
<p>$$
q(\mathbf{A}_{i,j} \mid \mathbf{A}_{0,j}) = \text{Cat}(\mathbf{A}_{0,j} \bar{Q}_{i,j}^{(\mathbf{A})}), \quad \bar{Q}_{i,j}^{(\mathbf{A})} = \prod_{k=1}^{i} Q_{k,j}^{(\mathbf{A})}, \quad Q_{i,j}^{(\mathbf{A})} = (1 - \sigma_{i,j}^{(\mathbf{A})}) \mathbf{I} + \frac{\sigma_{i,j}^{(\mathbf{A})}}{D_1} \mathbb{1}\mathbb{1}^T
$$</p>
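<p>The single-step categorical transition is easy to write down explicitly (a small numpy sketch: with probability $1 - \sigma$ the class is kept, otherwise it is resampled uniformly over the $D$ classes):</p>

```python
import numpy as np

def d3pm_transition(sigma, D):
    """Single-step D3PM transition matrix: Q = (1 - sigma) I + (sigma / D) 11^T.

    Each row is a distribution over the D classes: keep the current class
    with probability (1 - sigma), otherwise resample uniformly.
    """
    return (1.0 - sigma) * np.eye(D) + (sigma / D) * np.ones((D, D))
```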
<p>The training loss combines coordinate MSE with cross-entropy for discrete variables:</p>
<p>$$
\mathcal{L} = \mathbb{E}_{S_0, i, S_i} \left[ \lambda_i^{(\mathbf{X})} | \tilde{\mathbf{X}}_0 - \mathbf{X}_0 |_2^2 + \lambda_i^{(\mathbf{A})} \mathcal{L}_{CE}(\tilde{\mathbf{A}}_0, \mathbf{A}_0) + \lambda_i^{(\mathbf{E})} \mathcal{L}_{CE}(\tilde{\mathbf{E}}_0, \mathbf{E}_0) \right]
$$</p>
<p><strong>Multi-modal flow matching (PharMolixFM-Flow)</strong> constructs a direct mapping between data and prior distributions using conditional vector fields. For coordinates, the conditional flow uses a Gaussian path $q(\mathbf{X}_{i,j} \mid \mathbf{X}_{0,j}) = \mathcal{N}((1 - \sigma_{i,j}^{(\mathbf{X})}) \mathbf{X}_{0,j}, (\sigma_{i,j}^{(\mathbf{X})})^2 \mathbf{I})$, while discrete variables use the same D3PM Markov chain. Sampling proceeds by solving an ODE via Euler integration.</p>
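<p>The Euler sampler for the Flow variant takes only a few lines (an illustrative sketch; time and sign conventions vary across flow-matching formulations, and <code>vector_field</code> stands in for the learned model):</p>

```python
import numpy as np

def euler_sample(x1, vector_field, n_steps):
    """Fixed-step Euler integration of the sampling ODE.

    Time runs here from t=1 (prior noise) to t=0 (data); each step moves
    against the vector field. `vector_field(x, t)` is any callable with
    the same signature as the learned model.
    """
    x = np.array(x1, dtype=float)  # start from the prior sample
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = 1.0 - k * dt
        x = x - dt * vector_field(x, t)  # one Euler step toward t=0
    return x
```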
<p><strong>Bayesian flow networks (PharMolixFM-BFN)</strong> perform generative modeling in the parameter space of the data distribution rather than the data space. The Bayesian flow distribution for coordinates is:</p>
<p>$$
p_F(\tilde{\mathbf{X}}_{i,j}^{(\theta)} \mid \mathbf{X}_{0,j}) = \mathcal{N}(\gamma_{i,j} \mathbf{X}_{0,j}, \gamma_{i,j}(1 - \gamma_{i,j}) \mathbf{I}), \quad \gamma_{i,j} = 1 - \alpha^{2(1 - \sigma_{i,j}^{(\mathbf{X})})}
$$</p>
<h3 id="network-architecture">Network Architecture</h3>
<p>The architecture follows PocketXMol with a dual-branch SE(3)-equivariant graph neural network. A protein branch (4-layer GNN with kNN graph) processes pocket atoms, then representations are passed to a molecule branch (6-layer GNN) that captures protein-molecule interactions. Independent prediction heads reconstruct atom coordinates, atom types, and bond types, with additional confidence heads for self-ranking during inference.</p>
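<p>The kNN graph construction underlying both branches can be sketched with a brute-force distance matrix (illustrative only; production code would use a spatial index for large systems):</p>

```python
import numpy as np

def knn_graph(x, k):
    """Edge index of a k-nearest-neighbor graph over atom positions.

    Returns a (2, N*k) array of (source, target) pairs, excluding
    self-loops; each atom points to its k nearest neighbors.
    """
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)  # (N, N) distances
    np.fill_diagonal(d, np.inf)                                 # forbid self-loops
    nbrs = np.argsort(d, axis=1)[:, :k]                         # (N, k) neighbor ids
    src = np.repeat(np.arange(len(x)), k)
    return np.stack([src, nbrs.ravel()])
```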
<h2 id="docking-and-drug-design-experiments">Docking and Drug Design Experiments</h2>
<h3 id="protein-small-molecule-docking">Protein-Small-Molecule Docking</h3>
<p>PharMolixFM is evaluated on the PoseBusters benchmark (428 protein-small-molecule complexes) in the holo docking setting, with a known protein structure and a 10 Å binding pocket. The metric is the fraction of predictions with RMSD &lt; 2 Å.</p>
<table>
  <thead>
      <tr>
          <th>Method</th>
          <th>Self-Ranking (%)</th>
          <th>Oracle-Ranking (%)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>DiffDock</td>
          <td>38.0</td>
          <td>-</td>
      </tr>
      <tr>
          <td>RFAA</td>
          <td>42.0</td>
          <td>-</td>
      </tr>
      <tr>
          <td>Vina</td>
          <td>52.3</td>
          <td>-</td>
      </tr>
      <tr>
          <td>UniMol-Docking V2</td>
          <td>77.6</td>
          <td>-</td>
      </tr>
      <tr>
          <td>SurfDock</td>
          <td>78.0</td>
          <td>-</td>
      </tr>
      <tr>
          <td>AlphaFold3</td>
          <td>90.4</td>
          <td>-</td>
      </tr>
      <tr>
          <td>PocketXMol (50 repeats)</td>
          <td>82.2</td>
          <td>95.3</td>
      </tr>
      <tr>
          <td>PharMolixFM-Diff (50 repeats)</td>
          <td>83.4</td>
          <td>96.0</td>
      </tr>
      <tr>
          <td>PharMolixFM-Flow (50 repeats)</td>
          <td>73.4</td>
          <td>93.7</td>
      </tr>
      <tr>
          <td>PharMolixFM-BFN (50 repeats)</td>
          <td>78.5</td>
          <td>93.5</td>
      </tr>
      <tr>
          <td>PharMolixFM-Diff (500 repeats)</td>
          <td>83.9</td>
          <td>98.1</td>
      </tr>
  </tbody>
</table>
<p>PharMolixFM-Diff achieves the second-best self-ranking result (83.4%), outperforming PocketXMol by 1.2 percentage points but trailing AlphaFold3 (90.4%). The key advantage is inference speed: approximately 4.6 seconds per complex on a single A800 GPU versus approximately 249.0 seconds for AlphaFold3, a 54x speedup. Under oracle-ranking with 500 repeats, PharMolixFM-Diff reaches 98.1%, suggesting that better ranking strategies could further improve practical performance.</p>
<h3 id="structure-based-drug-design">Structure-Based Drug Design</h3>
<p>Evaluation uses the CrossDocked test set (100 protein pockets, 100 molecules generated per pocket), measuring Vina binding affinity scores and drug-likeness properties (QED and SA).</p>
<table>
  <thead>
      <tr>
          <th>Method</th>
          <th>Vina Score (Avg/Med)</th>
          <th>QED</th>
          <th>SA</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Pocket2Mol</td>
          <td>-5.14 / -4.70</td>
          <td>0.57</td>
          <td>0.76</td>
      </tr>
      <tr>
          <td>TargetDiff</td>
          <td>-5.47 / -6.30</td>
          <td>0.48</td>
          <td>0.58</td>
      </tr>
      <tr>
          <td>DecompDiff</td>
          <td>-5.67 / -6.04</td>
          <td>0.45</td>
          <td>0.61</td>
      </tr>
      <tr>
          <td>MolCRAFT</td>
          <td>-6.61 / -8.14</td>
          <td>0.46</td>
          <td>0.62</td>
      </tr>
      <tr>
          <td>PharMolixFM-Diff</td>
          <td>-6.18 / -6.44</td>
          <td>0.50</td>
          <td>0.73</td>
      </tr>
      <tr>
          <td>PharMolixFM-Flow</td>
          <td>-6.34 / -6.47</td>
          <td>0.49</td>
          <td>0.74</td>
      </tr>
      <tr>
          <td>PharMolixFM-BFN</td>
          <td>-6.38 / -6.45</td>
          <td>0.48</td>
          <td>0.64</td>
      </tr>
  </tbody>
</table>
<p>PharMolixFM achieves a better balance between binding affinity and drug-like properties than the diffusion-based baselines. While MolCRAFT achieves the best Vina scores, the PharMolixFM-Diff and -Flow variants show notably higher QED (0.49-0.50 vs. 0.45-0.48 for TargetDiff, DecompDiff, and MolCRAFT) and SA (0.73-0.74 vs. 0.58-0.62); Pocket2Mol scores higher on QED (0.57) but substantially worse on binding affinity. These drug-likeness properties matter for downstream validation and in vivo application.</p>
<h3 id="inference-scaling-law">Inference Scaling Law</h3>
<p>The paper explores whether inference-time scaling holds for molecular generative models, fitting the relationship:</p>
<p>$$
\text{Acc} = a \log(bR + c) + d
$$</p>
<p>where $R$ is the number of sampling repeats. All three PharMolixFM variants exhibit logarithmic improvement in docking accuracy with increased sampling repeats, analogous to inference scaling laws observed in NLP. Performance plateaus eventually due to distributional differences between training and test sets.</p>
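<p>The paper does not specify its fitting procedure; one numpy-only way to fit the law is a coarse grid search over the nonlinear parameters combined with linear least squares for the rest (note that $b$ and $c$ are only jointly identifiable through $c/b$, since $\log(bR + c) = \log(R + c/b) + \log b$):</p>

```python
import numpy as np

def fit_scaling_law(R, acc, b_grid, c_grid):
    """Fit Acc = a * log(b * R + c) + d.

    For each (b, c) on the grid, the model is linear in (a, d), so those
    are solved exactly by least squares; the best grid point wins.
    Returns the fitted (a, b, c, d).
    """
    best = None
    R = np.asarray(R, dtype=float)
    for b in b_grid:
        for c in c_grid:
            z = np.log(b * R + c)
            A = np.stack([z, np.ones_like(z)], axis=1)
            coef, *_ = np.linalg.lstsq(A, acc, rcond=None)
            err = float(np.sum((A @ coef - acc) ** 2))
            if best is None or err < best[0]:
                best = (err, coef[0], b, c, coef[1])
    return best[1:]  # (a, b, c, d)
```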
<h2 id="competitive-docking-with-faster-inference-but-limited-task-scope">Competitive Docking with Faster Inference, but Limited Task Scope</h2>
<p>PharMolixFM demonstrates that multi-modal generative models can achieve competitive all-atom molecular modeling with substantial inference speed advantages over AlphaFold3. The key findings are:</p>
<ol>
<li><strong>Diffusion outperforms flow matching and BFN</strong> for docking under standard sampling budgets. The stochastic nature of diffusion sampling appears beneficial compared to the deterministic ODE integration of flow matching.</li>
<li><strong>Oracle-ranking reveals untapped potential</strong>: the gap between self-ranking (83.4%) and oracle-ranking (98.1%) at 500 repeats indicates that confidence-based ranking is a bottleneck. Better ranking methods could close the gap with AlphaFold3.</li>
<li><strong>The three variants show similar performance for drug design</strong>, suggesting that model architecture and training data may matter more than the generative framework for generation tasks.</li>
<li><strong>Inference scaling laws hold</strong> for molecular generative models, paralleling findings in NLP.</li>
</ol>
<p>Limitations include that the framework is only evaluated on two tasks (docking and SBDD), and the paper does not address protein structure prediction, protein-protein interactions, or nucleic acid modeling, which are part of AlphaFold3&rsquo;s scope. The BFN variant underperforms the diffusion model, which the authors attribute to smaller noise scales at early sampling steps making training less challenging. The paper also does not compare against concurrent work on inference-time scaling for molecular models.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Training</td>
          <td>PDBBind, Binding MOAD, CrossDocked2020, PepBDB</td>
          <td>Not specified</td>
          <td>Filtered by PocketXMol criteria</td>
      </tr>
      <tr>
          <td>Docking eval</td>
          <td>PoseBusters benchmark</td>
          <td>428 complexes</td>
          <td>Holo docking with known protein</td>
      </tr>
      <tr>
          <td>SBDD eval</td>
          <td>CrossDocked test set</td>
          <td>100 pockets</td>
          <td>100 molecules per pocket</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li>Three generative variants: multi-modal diffusion (D3PM), flow matching, Bayesian flow networks</li>
<li>Task-specific noise via Fix mechanism (0, 0.7, or 1.0)</li>
<li>Training tasks selected with equal probability per sample</li>
<li>AdamW optimizer: weight decay 0.001, $\beta_1 = 0.99$, $\beta_2 = 0.999$</li>
<li>Linear warmup to learning rate 0.001 over 1000 steps</li>
<li>180K training steps with batch size 40</li>
</ul>
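<p>The optimization recipe above is simple enough to sketch. The following illustrative function reproduces the linear warmup; the paper specifies the warmup but not a post-warmup decay, so holding the rate constant afterward is an assumption:</p>

```python
def warmup_lr(step, peak_lr=1e-3, warmup_steps=1000):
    """Learning-rate value for the schedule described above.

    Linear warmup to peak_lr over warmup_steps; the paper does not
    specify a post-warmup decay, so this sketch holds the rate
    constant afterward (an assumption).
    """
    return peak_lr * min(1.0, (step + 1) / warmup_steps)

# Remaining reported settings: AdamW with betas=(0.99, 0.999),
# weight_decay=0.001, batch size 40, 180K training steps.
```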
<h3 id="models">Models</h3>
<ul>
<li>Dual-branch SE(3)-equivariant GNN (protein: 4-layer, molecule: 6-layer)</li>
<li>kNN graph construction for protein and protein-molecule interactions</li>
<li>Independent prediction heads for coordinates, atom types, bond types</li>
<li>Confidence heads for self-ranking during inference</li>
</ul>
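<p>A rough illustration of the kNN graph construction step; the edge semantics here are a minimal stand-in, and k is a placeholder rather than a value from the paper:</p>

```python
import numpy as np

def knn_edges(coords, k):
    """Build directed kNN edges from an (N, 3) coordinate array.

    Returns an (E, 2) array of (source, target) index pairs, with each
    atom connected to its k nearest neighbors (excluding itself).
    A minimal stand-in for the graph construction described above.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # Pairwise squared distances.
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # exclude self-edges
    # Indices of the k nearest neighbors per atom.
    nbrs = np.argsort(d2, axis=1)[:, :k]
    src = np.repeat(np.arange(n), k)
    return np.stack([src, nbrs.ravel()], axis=1)
```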
<h3 id="evaluation">Evaluation</h3>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>PharMolixFM-Diff</th>
          <th>AlphaFold3</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>RMSD &lt; 2Å self-ranking</td>
          <td>83.4% (50 rep)</td>
          <td>90.4%</td>
          <td>PoseBusters docking</td>
      </tr>
      <tr>
          <td>RMSD &lt; 2Å oracle-ranking</td>
          <td>98.1% (500 rep)</td>
          <td>-</td>
          <td>PoseBusters docking</td>
      </tr>
      <tr>
          <td>Inference time (per complex)</td>
          <td>~4.6s</td>
          <td>~249.0s</td>
          <td>Single A800 GPU</td>
      </tr>
      <tr>
          <td>Vina score (avg)</td>
          <td>-6.18</td>
          <td>-</td>
          <td>CrossDocked SBDD</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<ul>
<li>Training: 4x 80GB A800 GPUs</li>
<li>Inference benchmarked on single A800 GPU</li>
</ul>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/PharMolix/OpenBioMed">OpenBioMed (GitHub)</a></td>
          <td>Code</td>
          <td>MIT</td>
          <td>Official implementation</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Luo, Y., Wang, J., Fan, S., &amp; Nie, Z. (2025). PharMolixFM: All-Atom Foundation Models for Molecular Modeling and Generation. <em>arXiv preprint arXiv:2503.21788</em>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{luo2025pharmolixfm,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{PharMolixFM: All-Atom Foundation Models for Molecular Modeling and Generation}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Luo, Yizhen and Wang, Jiashuo and Fan, Siqi and Nie, Zaiqing}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{arXiv preprint arXiv:2503.21788}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>MAT: Graph-Augmented Transformer for Molecules (2020)</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/molecule-attention-transformer/</link><pubDate>Fri, 27 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/molecule-attention-transformer/</guid><description>MAT augments the Transformer self-attention mechanism with inter-atomic distances and molecular graph adjacency for molecular property prediction.</description><content:encoded><![CDATA[<h2 id="a-graph-augmented-transformer-for-molecular-property-prediction">A Graph-Augmented Transformer for Molecular Property Prediction</h2>
<p>This is a <strong>Method</strong> paper that proposes the Molecule Attention Transformer (MAT), a Transformer-based architecture adapted for molecular property prediction. The primary contribution is a modified self-attention mechanism that incorporates inter-atomic distances and molecular graph structure alongside the standard query-key attention. Combined with self-supervised pretraining on 2 million molecules from ZINC15, MAT achieves competitive performance across seven diverse molecular property prediction tasks while requiring minimal hyperparameter tuning.</p>
<h2 id="challenges-in-deep-learning-for-molecular-properties">Challenges in Deep Learning for Molecular Properties</h2>
<p>Predicting molecular properties is central to drug discovery and materials design, yet deep neural networks have struggled to consistently outperform shallow methods like random forests and SVMs on these tasks. Wu et al. (2018) demonstrated through the MoleculeNet benchmark that graph neural networks do not reliably beat classical models. Two recurring problems compound this:</p>
<ol>
<li><strong>Underfitting</strong>: Graph neural networks tend to underfit training data, with performance failing to scale with model complexity (Ishiguro et al., 2019).</li>
<li><strong>Hyperparameter sensitivity</strong>: Deep models for molecule property prediction require extensive hyperparameter search (often 500+ configurations) to achieve competitive results, making them impractical for many practitioners.</li>
</ol>
<p>Concurrent work explored using vanilla Transformers on SMILES string representations of molecules (Honda et al., 2019; Wang et al., 2019), but these approaches discard the explicit structural information encoded in molecular graphs and 3D conformations. The motivation for MAT is to combine the flexibility of the Transformer architecture with domain-specific inductive biases from molecular structure.</p>
<h2 id="molecule-self-attention-combining-attention-distance-and-graph-structure">Molecule Self-Attention: Combining Attention, Distance, and Graph Structure</h2>
<p>The core innovation is the Molecule Self-Attention layer, which replaces standard Transformer self-attention. In a standard Transformer, head $i$ computes:</p>
<p>$$
\mathcal{A}^{(i)} = \rho\left(\frac{\mathbf{Q}_{i} \mathbf{K}_{i}^{T}}{\sqrt{d_{k}}}\right) \mathbf{V}_{i}
$$</p>
<p>MAT augments this with two additional information sources. Let $\mathbf{A} \in \{0, 1\}^{N_{\text{atoms}} \times N_{\text{atoms}}}$ denote the molecular graph adjacency matrix and $\mathbf{D} \in \mathbb{R}^{N_{\text{atoms}} \times N_{\text{atoms}}}$ denote the inter-atomic distance matrix. The modified attention becomes:</p>
<p>$$
\mathcal{A}^{(i)} = \left(\lambda_{a} \rho\left(\frac{\mathbf{Q}_{i} \mathbf{K}_{i}^{T}}{\sqrt{d_{k}}}\right) + \lambda_{d} \, g(\mathbf{D}) + \lambda_{g} \, \mathbf{A}\right) \mathbf{V}_{i}
$$</p>
<p>where $\lambda_{a}$, $\lambda_{d}$, and $\lambda_{g}$ are scalar hyperparameters weighting each component, and $g$ is either a row-wise softmax or an element-wise exponential decay $g(d) = \exp(-d)$.</p>
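<p>A minimal single-head sketch of this mixing, using the exponential-decay variant of $g$; the learned projections and multi-head plumbing of the full model are omitted:</p>

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def molecule_self_attention(Q, K, V, D, A, lam_a, lam_d, lam_g):
    """Single-head Molecule Self-Attention following the equation above.

    Q, K, V: (N, d_k) projected atom features; D: (N, N) inter-atomic
    distances; A: (N, N) adjacency. Here g is the element-wise
    exponential decay variant, g(d) = exp(-d).
    """
    d_k = Q.shape[-1]
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # standard term, rho = softmax
    g_D = np.exp(-D)                        # distance term
    mix = lam_a * attn + lam_d * g_D + lam_g * A
    return mix @ V
```

With $\lambda_a = \lambda_d = 0$ and $\mathbf{A} = \mathbf{I}$ this reduces to the identity mixing, which makes the role of each weighted term easy to inspect.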
<p>Key architectural details:</p>
<ul>
<li><strong>Atom embedding</strong>: Each atom is represented as a 26-dimensional vector encoding atomic identity (one-hot over B, N, C, O, F, P, S, Cl, Br, I, dummy, other), number of heavy neighbors, number of hydrogens, formal charge, ring membership, and aromaticity.</li>
<li><strong>Dummy node</strong>: An artificial disconnected node (distance $10^{6}$ from all atoms) is added to each molecule, allowing the model to &ldquo;skip&rdquo; attention heads when no relevant pattern exists, similar to how BERT uses the separation token.</li>
<li><strong>3D conformers</strong>: Distance matrices are computed from RDKit-generated 3D conformers using the Universal Force Field (UFF).</li>
<li><strong>Pretraining</strong>: Node-level masked atom prediction on 2 million ZINC15 molecules (following Hu et al., 2019), where 15% of atom features are masked and the model predicts them.</li>
</ul>
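<p>One plausible layout of the 26-dimensional atom vector, consistent with the feature list above; the exact ordering and the one-hot ranges for neighbor and hydrogen counts (12 + 6 + 5 + 3 remaining slots) are assumptions, not taken verbatim from the paper:</p>

```python
import numpy as np

ATOM_TYPES = ["B", "N", "C", "O", "F", "P", "S", "Cl", "Br", "I",
              "Dummy", "Other"]  # 12-way identity one-hot

def atom_features(symbol, n_heavy, n_h, charge, in_ring, aromatic):
    """Sketch of the 26-dim atom embedding described above.

    Assumed layout: 12 (identity) + 6 (heavy neighbors, 0-5 one-hot)
    + 5 (hydrogens, 0-4 one-hot) + formal charge + ring + aromaticity.
    """
    v = np.zeros(26)
    idx = ATOM_TYPES.index(symbol) if symbol in ATOM_TYPES else ATOM_TYPES.index("Other")
    v[idx] = 1.0
    v[12 + min(n_heavy, 5)] = 1.0
    v[18 + min(n_h, 4)] = 1.0
    v[23] = float(charge)
    v[24] = float(in_ring)
    v[25] = float(aromatic)
    return v
```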
<h2 id="benchmark-evaluation-and-ablation-studies">Benchmark Evaluation and Ablation Studies</h2>
<h3 id="experimental-setup">Experimental setup</h3>
<p>MAT is evaluated on seven molecular property prediction datasets spanning regression and classification:</p>
<table>
  <thead>
      <tr>
          <th>Dataset</th>
          <th>Task</th>
          <th>Size</th>
          <th>Metric</th>
          <th>Split</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>FreeSolv</td>
          <td>Regression (hydration free energy)</td>
          <td>642</td>
          <td>RMSE</td>
          <td>Random</td>
      </tr>
      <tr>
          <td>ESOL</td>
          <td>Regression (log solubility)</td>
          <td>1,128</td>
          <td>RMSE</td>
          <td>Random</td>
      </tr>
      <tr>
          <td>BBBP</td>
          <td>Classification (BBB permeability)</td>
          <td>2,039</td>
          <td>ROC AUC</td>
          <td>Scaffold</td>
      </tr>
      <tr>
          <td>Estrogen-alpha</td>
          <td>Classification (receptor activity)</td>
          <td>2,398</td>
          <td>ROC AUC</td>
          <td>Scaffold</td>
      </tr>
      <tr>
          <td>Estrogen-beta</td>
          <td>Classification (receptor activity)</td>
          <td>1,961</td>
          <td>ROC AUC</td>
          <td>Scaffold</td>
      </tr>
      <tr>
          <td>MetStab-high</td>
          <td>Classification (metabolic stability)</td>
          <td>2,127</td>
          <td>ROC AUC</td>
          <td>Random</td>
      </tr>
      <tr>
          <td>MetStab-low</td>
          <td>Classification (metabolic stability)</td>
          <td>2,127</td>
          <td>ROC AUC</td>
          <td>Random</td>
      </tr>
  </tbody>
</table>
<p>Baselines include GCN, Weave, EAGCN, Random Forest (RF), and SVM. Each model receives the same hyperparameter search budget (150 or 500 evaluations). Results are averaged over 6 random train/validation/test splits.</p>
<h3 id="main-results">Main results</h3>
<p>MAT achieves the best average rank across all seven tasks:</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Avg. Rank (500 budget)</th>
          <th>Avg. Rank (150 budget)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>MAT</td>
          <td>2.42</td>
          <td>2.71</td>
      </tr>
      <tr>
          <td>RF</td>
          <td>3.14</td>
          <td>3.14</td>
      </tr>
      <tr>
          <td>SVM</td>
          <td>3.57</td>
          <td>3.28</td>
      </tr>
      <tr>
          <td>GCN</td>
          <td>3.57</td>
          <td>3.71</td>
      </tr>
      <tr>
          <td>Weave</td>
          <td>3.71</td>
          <td>3.57</td>
      </tr>
      <tr>
          <td>EAGCN</td>
          <td>4.14</td>
          <td>4.14</td>
      </tr>
  </tbody>
</table>
<p>With self-supervised pretraining, Pretrained MAT achieves an average rank of 1.57, outperforming both Pretrained EAGCN (4.0) and SMILES Transformer (4.29). Pretrained MAT requires tuning only the learning rate (7 values tested), compared to 500 hyperparameter combinations for the non-pretrained models.</p>
<h3 id="ablation-results">Ablation results</h3>
<p>Ablation studies on BBBP, ESOL, and FreeSolv reveal:</p>
<table>
  <thead>
      <tr>
          <th>Variant</th>
          <th>BBBP (AUC)</th>
          <th>ESOL (RMSE)</th>
          <th>FreeSolv (RMSE)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>MAT (full)</td>
          <td>.723</td>
          <td>.286</td>
          <td>.250</td>
      </tr>
      <tr>
          <td>- Graph</td>
          <td>.716</td>
          <td>.316</td>
          <td>.276</td>
      </tr>
      <tr>
          <td>- Distance</td>
          <td>.729</td>
          <td>.281</td>
          <td>.281</td>
      </tr>
      <tr>
          <td>- Attention</td>
          <td>.692</td>
          <td>.306</td>
          <td>.329</td>
      </tr>
      <tr>
          <td>- Dummy node</td>
          <td>.714</td>
          <td>.317</td>
          <td>.249</td>
      </tr>
      <tr>
          <td>+ Edge features</td>
          <td>.683</td>
          <td>.314</td>
          <td>.358</td>
      </tr>
  </tbody>
</table>
<p>Removing any single component degrades performance on at least one task, supporting the value of combining all three information sources. Adding edge features does not help, suggesting the adjacency and distance matrices already capture sufficient bond-level information.</p>
<h3 id="interpretability-analysis">Interpretability analysis</h3>
<p>Individual attention heads in the first layer learn chemically meaningful functions. Six heads were identified that focus on specific chemical patterns: 2-neighbored aromatic carbons, sulfur atoms, non-ring nitrogens, carbonyl oxygens, 3-neighbored aromatic atoms (substitution positions), and aromatic ring nitrogens. Statistical validation using Kruskal-Wallis tests confirmed that atoms matching these SMARTS patterns receive significantly higher attention weights ($p &lt; 0.001$ for all patterns).</p>
<h2 id="findings-limitations-and-future-directions">Findings, Limitations, and Future Directions</h2>
<p>MAT demonstrates that augmenting Transformer self-attention with molecular graph structure and 3D distance information produces a model that performs consistently well across diverse property prediction tasks. The key practical finding is that self-supervised pretraining dramatically reduces the hyperparameter tuning burden: Pretrained MAT matches or exceeds the performance of extensively tuned models while requiring only learning rate selection.</p>
<p>Several limitations are acknowledged:</p>
<ul>
<li><strong>Fingerprint-based models still win on some tasks</strong>: RF and SVM with extended-connectivity fingerprints outperform MAT on metabolic stability and Estrogen-beta tasks, suggesting that incorporating fingerprint representations could improve MAT further.</li>
<li><strong>Single conformer</strong>: Only one pre-computed 3D conformer is used per molecule. More sophisticated conformer sampling or ensemble strategies were not explored.</li>
<li><strong>Limited pretraining exploration</strong>: Only the masked atom prediction task from Hu et al. (2019) was used. The authors note that exploring additional pretraining objectives is a promising direction.</li>
<li><strong>Scalability</strong>: The pretrained model uses 1024-dimensional embeddings with 8 layers and 16 attention heads, chosen as the largest configuration that fits in GPU memory.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Pretraining</td>
          <td>ZINC15</td>
          <td>2M molecules</td>
          <td>Sampled from ZINC database</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>FreeSolv</td>
          <td>642</td>
          <td>Hydration free energy regression</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>ESOL</td>
          <td>1,128</td>
          <td>Log solubility regression</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>BBBP</td>
          <td>2,039</td>
          <td>Blood-brain barrier classification</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>Estrogen-alpha/beta</td>
          <td>2,398 / 1,961</td>
          <td>Receptor activity classification</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>MetStab-high/low</td>
          <td>2,127 each</td>
          <td>Metabolic stability classification</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li>Optimizer: Adam with Noam learning rate scheduler (warmup then inverse square root decay)</li>
<li>Pretraining: 8 epochs, learning rate 0.001, batch size 256, binary cross-entropy loss</li>
<li>Fine-tuning: 100 epochs, batch size 32, learning rate selected from {1e-3, 5e-4, 1e-4, 5e-5, 1e-5, 5e-6, 1e-6}</li>
<li>Distance kernel: exponential decay $g(d) = \exp(-d)$ for pretrained model</li>
<li>Lambda weights: $\lambda_{a} = \lambda_{d} = 0.33$ for pretrained model</li>
</ul>
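<p>The Noam scheduler named above follows the standard warmup-then-inverse-square-root form. The d_model, warmup, and factor values in this sketch are illustrative, not taken from the paper:</p>

```python
def noam_lr(step, d_model=1024, warmup=4000, factor=1.0):
    """Noam schedule: linear warmup, then inverse-square-root decay.

    Sketch of the scheduler named above; d_model, warmup, and factor
    are illustrative defaults, not values reported by the authors.
    """
    step = max(step, 1)
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)
```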
<h3 id="models">Models</h3>
<ul>
<li>Pretrained MAT: 1024-dim embeddings, 8 layers, 16 attention heads, 1 feed-forward layer per block</li>
<li>Dropout: 0.0, weight decay: 0.0 for pretrained model</li>
<li>Atom featurization: 26-dimensional feature vector (Table 1 in paper)</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li>Regression: RMSE (FreeSolv, ESOL)</li>
<li>Classification: ROC AUC (BBBP, Estrogen-alpha/beta, MetStab-high/low)</li>
<li>All experiments repeated 6 times with different train/validation/test splits</li>
<li>Scaffold split for BBBP and the Estrogen tasks; random split for the others</li>
</ul>
<h3 id="hardware">Hardware</h3>
<p>The paper does not specify exact hardware details. The pretrained model is described as &ldquo;the largest model that still fits the GPU memory.&rdquo;</p>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/gmum/MAT">gmum/MAT</a></td>
          <td>Code</td>
          <td>MIT</td>
          <td>Official implementation with pretrained weights</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Maziarka, Ł., Danel, T., Mucha, S., Rataj, K., Tabor, J., &amp; Jastrzębski, S. (2020). Molecule Attention Transformer. <em>arXiv preprint arXiv:2002.08264</em>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{maziarka2020molecule,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Molecule Attention Transformer}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Maziarka, {\L}ukasz and Danel, Tomasz and Mucha, S{\l}awomir and Rataj, Krzysztof and Tabor, Jacek and Jastrz{\k{e}}bski, Stanis{\l}aw}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{arXiv preprint arXiv:2002.08264}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2020}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>MOFFlow: Flow Matching for MOF Structure Prediction</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/mofflow/</link><pubDate>Sat, 20 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/mofflow/</guid><description>A Riemannian flow matching framework for generating Metal-Organic Framework structures by treating building blocks as rigid bodies.</description><content:encoded><![CDATA[<h2 id="methodological-contribution-mofflow-architecture">Methodological Contribution: MOFFlow Architecture</h2>
<p>This is a <strong>Methodological Paper</strong> ($\Psi_{\text{Method}}$).</p>
<p>It introduces <strong>MOFFlow</strong>, a generative architecture and training framework designed specifically for the structure prediction of Metal-Organic Frameworks (MOFs). The paper focuses on the algorithmic innovation of decomposing the problem into rigid-body assembly on a Riemannian manifold, validates this through comparison against existing baselines, and performs ablation studies to justify architectural choices. While it leverages the theory of flow matching, its primary contribution is the application-specific architecture and the handling of modular constraints.</p>
<h2 id="motivation-scaling-limits-of-atom-level-generation">Motivation: Scaling Limits of Atom-Level Generation</h2>
<p>The primary motivation is to overcome the scalability and accuracy limitations of existing methods for MOF structure prediction.</p>
<ul>
<li><strong>Computational Cost of DFT:</strong> Conventional approaches rely on <em>ab initio</em> calculations (DFT) combined with random search, which are computationally prohibitive for large, complex systems like MOFs.</li>
<li><strong>Failure of General CSP:</strong> Existing deep generative models for general Crystal Structure Prediction (CSP) operate on an atom-by-atom basis. They fail to scale to MOFs, which often contain hundreds or thousands of atoms per unit cell, and do not exploit the inherent modular nature (building blocks) of MOFs.</li>
<li><strong>Tunability:</strong> MOFs have applications in carbon capture and drug delivery due to their tunable porosity, making automated design tools valuable.</li>
</ul>
<h2 id="core-innovation-rigid-body-flow-matching-on-se3">Core Innovation: Rigid-Body Flow Matching on SE(3)</h2>
<p>MOFFlow introduces a <strong>hierarchical, rigid-body flow matching framework</strong> tailored for MOFs.</p>
<ul>
<li><strong>Rigid Body Decomposition:</strong> MOFFlow treats metal nodes and organic linkers as rigid bodies, reducing the search space from $3N$ (atoms) to $6M$ (roto-translation of $M$ blocks) compared to atom-based methods.</li>
<li><strong>Riemannian Flow Matching on $SE(3)$:</strong> It is the first end-to-end model to jointly generate block-level rotations ($SO(3)$), translations ($\mathbb{R}^3$), and lattice parameters using <a href="/notes/machine-learning/generative-models/flow-matching-for-generative-modeling/">Riemannian flow matching</a>.</li>
<li><strong>MOFAttention:</strong> A custom attention module designed to encode the geometric relationships between building blocks, lattice parameters, and rotational constraints.</li>
<li><strong>Constraint Handling:</strong> It incorporates domain knowledge by operating on a mean-free system for translation invariance and using canonicalized coordinates for rotation invariance.</li>
</ul>
<h2 id="experimental-setup-and-baselines">Experimental Setup and Baselines</h2>
<p>The authors evaluated MOFFlow on structure prediction accuracy, physical property preservation, and scalability.</p>
<ul>
<li><strong>Dataset:</strong> The <strong>Boyd et al. (2019)</strong> dataset consisting of 324,426 hypothetical MOF structures, decomposed into building blocks using the <strong>MOFid</strong> algorithm. Filtered to structures with &lt;200 blocks, yielding 308,829 structures (247,066 train / 30,883 val / 30,880 test). Structures contain up to approximately 2,400 atoms per unit cell.</li>
<li><strong>Baselines:</strong>
<ul>
<li><em>Optimization-based:</em> Random Search (RS) and Evolutionary Algorithm (EA) using CrySPY and CHGNet.</li>
<li><em>Deep Learning:</em> DiffCSP (deep generative model for general crystals).</li>
<li><em>Self-Assembly:</em> A heuristic algorithm used in MOFDiff (adapted for comparison).</li>
</ul>
</li>
<li><strong>Metrics:</strong>
<ul>
<li><strong>Match Rate (MR):</strong> Percentage of generated structures matching ground truth within tolerance.</li>
<li><strong>RMSE:</strong> Root mean squared displacement normalized by average free length per atom.</li>
<li><strong>Structural Properties:</strong> Volumetric/Gravimetric Surface Area (VSA/GSA), Pore Limiting Diameter (PLD), Void Fraction, etc., calculated via Zeo++.</li>
<li><strong>Scalability:</strong> Performance vs. number of atoms and building blocks.</li>
</ul>
</li>
</ul>
<h2 id="results-and-generative-performance">Results and Generative Performance</h2>
<p>MOFFlow outperformed all baselines in accuracy and efficiency, particularly for large structures.</p>
<ul>
<li><strong>Accuracy:</strong> With a single sample, MOFFlow achieved a <strong>31.69% match rate</strong> (stol=0.5) and <strong>87.46%</strong> (stol=1.0) on the full test set (30,880 structures). With 5 samples, these rose to <strong>44.75%</strong> (stol=0.5) and <strong>100.0%</strong> (stol=1.0). RS and EA (tested on 100 and 15 samples respectively due to computational cost, generating 20 candidates each) achieved 0.00% MR at both tolerance levels. DiffCSP reached 0.09% (stol=0.5) and 23.12% (stol=1.0) with 1 sample.</li>
<li><strong>Speed:</strong> Inference took <strong>1.94 seconds</strong> per structure, compared to 5.37s for DiffCSP, 332s for RS, and 1,959s for EA.</li>
<li><strong>Scalability:</strong> MOFFlow preserved high match rates across all system sizes, while DiffCSP&rsquo;s match rate dropped sharply beyond 200 atoms.</li>
<li><strong>Property Preservation:</strong> The distributions of physical properties (e.g., surface area, void fraction) for MOFFlow-generated structures closely matched the ground truth. DiffCSP frequently reduced volumetric surface area and void fraction to zero.</li>
<li><strong>Self-Assembly Comparison:</strong> In a controlled comparison where the self-assembly (SA) algorithm received MOFFlow&rsquo;s predicted translations and lattice, MOFFlow (MR=31.69%, RMSE=0.2820) outperformed SA (MR=30.04%, RMSE=0.3084), confirming the value of the learned rotational vector fields. In an extended scalability comparison, SA scaled better for structures with many building blocks, but MOFFlow achieved higher overall match rate (31.69% vs. 27.14%).</li>
<li><strong>Batch Implementation:</strong> A refactored Batch version achieves improved results: <strong>32.73% MR</strong> (stol=0.5), RMSE of 0.2743, inference in <strong>0.19s</strong> per structure (10x faster), and training in roughly 1/3 the GPU hours.</li>
</ul>
<h3 id="limitations">Limitations</h3>
<p>The paper identifies three key limitations:</p>
<ol>
<li><strong>Hypothetical-only evaluation:</strong> All experiments use the Boyd et al. hypothetical database. Evaluation on more challenging real-world datasets remains needed.</li>
<li><strong>Rigid-body assumption:</strong> The model assumes that local building block structures are known, which may be impractical for rare building blocks whose structural information is missing from existing libraries or is inaccurate.</li>
<li><strong>Periodic invariance:</strong> The model is not invariant to periodic transformations of the input. Explicitly modeling periodic invariance could further improve performance.</li>
</ol>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<ul>
<li><strong>Source:</strong> MOF dataset by Boyd et al. (2019).</li>
<li><strong>Preprocessing:</strong> Structures were decomposed using the metal-oxo decomposition algorithm from <strong>MOFid</strong>.</li>
<li><strong>Filtering:</strong> Structures with fewer than 200 building blocks were used, yielding 308,829 structures.</li>
<li><strong>Splits:</strong> Train/Validation/Test ratio of 8:1:1 (247,066 / 30,883 / 30,880).</li>
<li><strong>Availability:</strong> Pre-processed dataset is available on <a href="https://zenodo.org/records/15187230">Zenodo</a>.</li>
<li><strong>Representations:</strong>
<ul>
<li><em>Atom-level:</em> Tuple $(X, a, l)$ (coordinates, types, lattice).</li>
<li><em>Block-level:</em> Tuple $(\mathcal{B}, q, \tau, l)$ (blocks, rotations, translations, lattice).</li>
</ul>
</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Framework:</strong> Riemannian Flow Matching.</li>
<li><strong>Objective:</strong> Conditional Flow Matching (CFM) loss regressing to clean data $q_1, \tau_1, l_1$.
$$
\begin{aligned}
\mathcal{L}(\theta) = \mathbb{E}_{t, \mathcal{S}^{(1)}} \left[ \frac{1}{(1-t)^2} \left( \lambda_1 \left\| \log_{q_t}(\hat{q}_1) - \log_{q_t}(q_1) \right\|^2 + \dots \right) \right]
\end{aligned}
$$</li>
<li><strong>Priors:</strong>
<ul>
<li>Rotations ($q$): Uniform on $SO(3)$.</li>
<li>Translations ($\tau$): Standard normal on $\mathbb{R}^3$.</li>
<li>Lattice ($l$): Log-normal for lengths, Uniform(60, 120) for angles (Niggli reduced).</li>
</ul>
</li>
<li><strong>Inference:</strong> ODE solver with <strong>50 integration steps</strong>.</li>
<li><strong>Local Coordinates:</strong> Defined using PCA axes, corrected for symmetry to ensure consistency.</li>
</ul>
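<p>The priors listed above are straightforward to sample. A sketch follows; the log-normal parameters for the lattice lengths are placeholders, since the fitted values are not given here:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_priors(n_blocks):
    """Draw one prior sample per building block, per the list above."""
    # Uniform rotations on SO(3): normalized 4D Gaussians give
    # uniformly distributed unit quaternions.
    q = rng.normal(size=(n_blocks, 4))
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    # Translations: standard normal on R^3 (made mean-free downstream).
    tau = rng.normal(size=(n_blocks, 3))
    # Lattice: log-normal lengths (illustrative parameters) and
    # uniform angles on [60, 120] degrees (Niggli reduced).
    lengths = rng.lognormal(mean=0.0, sigma=1.0, size=3)
    angles = rng.uniform(60.0, 120.0, size=3)
    return q, tau, np.concatenate([lengths, angles])
```

At inference, samples like these are transported to data by integrating the learned vector field with an ODE solver over the 50 steps noted above.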
<h3 id="models">Models</h3>
<ul>
<li><strong>Architecture:</strong> Hierarchical structure with two key modules.
<ul>
<li><strong>Atom-level Update Layers:</strong> 4-layer EGNN-like structure to encode building block features $h_m$ from atomic graphs (cutoff 5Å).</li>
<li><strong>Block-level Update Layers:</strong> 6 layers that iteratively update $q, \tau, l$ using the <strong>MOFAttention</strong> module.</li>
</ul>
</li>
<li><strong>MOFAttention:</strong> Modified Invariant Point Attention (IPA) that incorporates lattice parameters as offsets to the attention matrix.</li>
<li><strong>Hyperparameters:</strong>
<ul>
<li>Node dimension: 256 (block-level), 64 (atom-level).</li>
<li>Attention heads: 24.</li>
<li>Loss coefficients: $\lambda_1=1.0$ (rot), $\lambda_2=2.0$ (trans), $\lambda_3=0.1$ (lattice).</li>
</ul>
</li>
<li><strong>Checkpoints:</strong> Pre-trained weights and models are openly provided on <a href="https://zenodo.org/records/15187230">Zenodo</a>.</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Metrics:</strong>
<ul>
<li><strong>Match Rate:</strong> Using <code>StructureMatcher</code> from <code>pymatgen</code>. Tolerances: <code>stol=0.5/1.0</code>, <code>ltol=0.3</code>, <code>angle_tol=10.0</code>.</li>
<li><strong>RMSE:</strong> Normalized by average free length per atom.</li>
</ul>
</li>
<li><strong>Tools:</strong> <strong>Zeo++</strong> for structural property calculations (Surface Area, Pore Diameter, etc.).</li>
</ul>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Metric</th>
          <th style="text-align: left">MOFFlow</th>
          <th style="text-align: left">DiffCSP</th>
          <th style="text-align: left">RS (20 cands)</th>
          <th style="text-align: left">EA (20 cands)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">MR (stol=0.5, k=1)</td>
          <td style="text-align: left"><strong>31.69%</strong></td>
          <td style="text-align: left">0.09%</td>
          <td style="text-align: left">0.00%</td>
          <td style="text-align: left">0.00%</td>
      </tr>
      <tr>
          <td style="text-align: left">MR (stol=1.0, k=1)</td>
          <td style="text-align: left"><strong>87.46%</strong></td>
          <td style="text-align: left">23.12%</td>
          <td style="text-align: left">0.00%</td>
          <td style="text-align: left">0.00%</td>
      </tr>
      <tr>
          <td style="text-align: left">MR (stol=0.5, k=5)</td>
          <td style="text-align: left"><strong>44.75%</strong></td>
          <td style="text-align: left">0.34%</td>
          <td style="text-align: left">-</td>
          <td style="text-align: left">-</td>
      </tr>
      <tr>
          <td style="text-align: left">MR (stol=1.0, k=5)</td>
          <td style="text-align: left"><strong>100.0%</strong></td>
          <td style="text-align: left">38.94%</td>
          <td style="text-align: left">-</td>
          <td style="text-align: left">-</td>
      </tr>
      <tr>
          <td style="text-align: left">RMSE (stol=0.5, k=1)</td>
          <td style="text-align: left"><strong>0.2820</strong></td>
          <td style="text-align: left">0.3961</td>
          <td style="text-align: left">-</td>
          <td style="text-align: left">-</td>
      </tr>
      <tr>
          <td style="text-align: left">Avg. time per structure</td>
          <td style="text-align: left"><strong>1.94s</strong></td>
          <td style="text-align: left">5.37s</td>
          <td style="text-align: left">332s</td>
          <td style="text-align: left">1,959s</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Training Hardware:</strong> 8 $\times$ NVIDIA RTX 3090 (24GB VRAM).</li>
<li><strong>Training Time:</strong>
<ul>
<li><em>TimestepBatch version (main paper):</em> ~5 days 15 hours.</li>
<li><em>Batch version:</em> ~1 day 17 hours (332.74 GPU hours). The authors also release this refactored implementation, which achieves comparable performance with faster convergence.</li>
</ul>
</li>
<li><strong>Batch Size:</strong> 160 (capped by $N^2$ where $N$ is the number of atoms, for memory management).</li>
</ul>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Artifact</th>
          <th style="text-align: left">Type</th>
          <th style="text-align: left">License</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><a href="https://github.com/nayoung10/MOFFlow">MOFFlow (GitHub)</a></td>
          <td style="text-align: left">Code</td>
          <td style="text-align: left">MIT</td>
          <td style="text-align: left">Official implementation built on DiffDock, EGNN, MOFDiff, and protein-frame-flow</td>
      </tr>
      <tr>
          <td style="text-align: left"><a href="https://zenodo.org/records/15187230">Pre-processed dataset and checkpoints (Zenodo)</a></td>
          <td style="text-align: left">Dataset / Model</td>
          <td style="text-align: left">Unknown</td>
          <td style="text-align: left">Includes pre-processed MOF structures and trained model weights</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Kim, N., Kim, S., Kim, M., Park, J., &amp; Ahn, S. (2025). MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks. <em>International Conference on Learning Representations (ICLR)</em>.</p>
<p><strong>Publication</strong>: ICLR 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{kimMOFFlowFlowMatching2025,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Kim, Nayoung and Kim, Seongsu and Kim, Minsu and Park, Jinkyoo and Ahn, Sungsoo}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{The Thirteenth International Conference on Learning Representations}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span>=<span style="color:#e6db74">{https://openreview.net/forum?id=dNT3abOsLo}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://openreview.net/forum?id=dNT3abOsLo">OpenReview Discussion</a></li>
<li><a href="https://github.com/nayoung10/MOFFlow">Official Code Repository</a></li>
</ul>
]]></content:encoded></item><item><title>Stillinger-Weber Potential for Silicon Simulation</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/stillinger-weber-1985/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/stillinger-weber-1985/</guid><description>The 1985 paper introducing the Stillinger-Weber potential, a 3-body interaction model for molecular dynamics of tetrahedral semiconductors.</description><content:encoded><![CDATA[<h2 id="core-methodological-contribution">Core Methodological Contribution</h2>
<p>This is a <strong>Method</strong> paper.</p>
<p>Its primary contribution is the formulation of the <strong>Stillinger-Weber potential</strong>, a non-additive potential energy function designed to model tetrahedral semiconductors. The paper also uses molecular dynamics simulation to explore physical properties of silicon in both crystalline and liquid phases, but the methodological contribution (the potential architecture) is what enabled subsequent research on covalent materials.</p>
<h2 id="the-failure-of-pair-potentials-in-silicon">The Failure of Pair Potentials in Silicon</h2>
<p>The authors aimed to simulate the melting and liquid properties of tetrahedral semiconductors (Silicon and Germanium).</p>
<ul>
<li><strong>The Problem:</strong> Standard pair potentials (like Lennard-Jones) favor close-packed structures (12 nearest neighbors) and cannot stabilize the open diamond structure (4 nearest neighbors) of Silicon.</li>
<li><strong>The Gap:</strong> Earlier classical potentials lacked the flexibility to describe the profound structural change where Silicon contracts upon melting (coordination number increases from 4 to &gt;6) while becoming metallic.</li>
<li><strong>The Goal:</strong> To construct a potential that spans the entire configuration space, describing both the rigid crystal and the diffusive liquid, without requiring quantum mechanical calculations.</li>
</ul>
<h2 id="the-three-body-interaction-novelty">The Three-Body Interaction Novelty</h2>
<p>The core novelty is the introduction of a stabilizing <strong>three-body interaction term</strong> ($v_3$) to the potential energy function.</p>
<ul>
<li><strong>3-Body Term:</strong> Explicitly penalizes deviations from the ideal tetrahedral angle ($\cos \theta_t = -1/3$).</li>
<li><strong>Unified Model:</strong> This potential handles bond breaking and reforming, allowing for the simulation of melting and liquid diffusion. Previous &ldquo;Keating&rdquo;-type potentials modeled only small elastic deformations.</li>
<li><strong>Mapping Technique:</strong> The application of &ldquo;steepest-descent mapping&rdquo; to quench dynamical configurations into their underlying &ldquo;inherent structures&rdquo; (local minima), revealing the fundamental topology of the liquid energy landscape.</li>
</ul>
<h2 id="molecular-dynamics-validation">Molecular Dynamics Validation</h2>
<p>The authors performed Molecular Dynamics (MD) simulations using the proposed potential.</p>
<ul>
<li><strong>System:</strong> 216 Silicon atoms in a cubic cell with periodic boundary conditions.</li>
<li><strong>State Points:</strong> Fixed density $\rho = 2.53 \text{ g/cm}^3$ (matching experimental liquid density at melting).</li>
<li><strong>Process:</strong>
<ol>
<li>Start with diamond crystal at low temperature.</li>
<li>Systematically heat to induce spontaneous nucleation and melting.</li>
<li>Equilibrate the liquid.</li>
<li>Periodically map configurations to potential minima (inherent structures) using steepest descent.</li>
</ol>
</li>
</ul>
<h2 id="phase-topology-and-inverse-lindemann-criterion">Phase Topology and Inverse Lindemann Criterion</h2>
<ul>
<li><strong>Validation:</strong> The potential successfully stabilizes the diamond structure as the global minimum at zero pressure.</li>
<li><strong>Liquid Structure:</strong> The simulated liquid pair-correlation function $g(r)$ and structure factor $S(k)$ qualitatively match experimental diffraction data, including the characteristic shoulder on the structure factor peak.</li>
<li><strong>Inherent Structure:</strong> The liquid possesses a temperature-independent inherent structure (amorphous network) hidden beneath thermal vibrations.</li>
<li><strong>Melting/Freezing Criteria:</strong> The study proposes an &ldquo;Inverse Lindemann Criterion&rdquo;: while crystals melt when vibration amplitude exceeds ~0.19 lattice spacings, liquids freeze when atom displacements from their inherent minima drop below ~0.30 neighbor spacings.</li>
</ul>
<h2 id="limitations-and-energy-scale-problem">Limitations and Energy Scale Problem</h2>
<p>The authors acknowledge a quantitative energy-scale discrepancy. To match the observed melting temperature of Si (1410 °C), $\epsilon$ would need to be approximately 42 kcal/mol, considerably less than the 50 kcal/mol required to reproduce the correct cohesive energy of the crystal. The authors suggest this could be resolved either by further optimization of $v_2$ and $v_3$, or by adding position-independent single-particle terms $v_1 \approx -16$ kcal/mol arising from the electronic structure. Adding $v_1$ terms only affects the temperature scale and has no influence on local structure at a given reduced temperature.</p>
<p>The simulated liquid coordination number (8.07) is also higher than the experimentally reported value of approximately 6.4, though the authors note that the experimental definition of &ldquo;nearest neighbors&rdquo; was not precisely stated.</p>
<h2 id="bonding-statistics-in-inherent-structures">Bonding Statistics in Inherent Structures</h2>
<p>Analysis of potential-energy minima (inherent structures) using a bond cutoff of $r/\sigma = 1.40$ reveals the coordination distribution in the liquid:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Coordination Number</th>
          <th style="text-align: left">Fraction of Atoms</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">4</td>
          <td style="text-align: left">0.201</td>
      </tr>
      <tr>
          <td style="text-align: left">5</td>
          <td style="text-align: left">0.568</td>
      </tr>
      <tr>
          <td style="text-align: left">6</td>
          <td style="text-align: left">0.205</td>
      </tr>
      <tr>
          <td style="text-align: left">7</td>
          <td style="text-align: left">0.024</td>
      </tr>
  </tbody>
</table>
<p>Five-coordinate atoms dominate the liquid&rsquo;s inherent structure, with four- and six-coordinate atoms each accounting for about 20% of the population. The three-body interactions prevent any occurrence of coordination numbers near 12 that would indicate local close packing.</p>
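<p>The counting procedure behind such a distribution is straightforward. The sketch below is our own illustrative helper (not the authors&rsquo; code): it histograms coordination numbers from a configuration using the distance cutoff $r/\sigma = 1.40$ under minimum-image periodic boundaries in a cubic box.</p>

```python
import math
from collections import Counter

def coordination_counts(positions, box, cutoff=1.40):
    """Histogram of coordination numbers using a distance cutoff
    (r/sigma = 1.40 in the paper) in a cubic periodic box."""
    n = len(positions)
    coord = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            d2 = 0.0
            for k in range(3):
                d = positions[i][k] - positions[j][k]
                d -= box * round(d / box)  # minimum-image convention
                d2 += d * d
            if d2 < cutoff * cutoff:
                coord[i] += 1
                coord[j] += 1
    return Counter(coord)

# Two atoms 0.3*sigma apart across the periodic boundary: one bond each.
print(coordination_counts([(0.0, 0.0, 0.0), (9.7, 0.0, 0.0)], box=10.0))  # Counter({1: 2})
```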
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Integration:</strong> Equations of motion integrated using a <strong>fifth-order Gear algorithm</strong>.</li>
<li><strong>Time Step:</strong> $\Delta t = 5 \times 10^{-3} \tau$ (approx $3.83 \times 10^{-16}$ s), where $\tau = \sigma(m/\epsilon)^{1/2} = 7.6634 \times 10^{-14}$ s.</li>
<li><strong>Minimization:</strong> Steepest-descent mapping utilized <strong>Newton&rsquo;s method</strong> to find limiting solutions ($\nabla \Phi = 0$).</li>
</ul>
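<p>As a consistency check, the quoted time unit follows directly from $\sigma$, $\epsilon$, and the silicon mass. The sketch below is our own; assuming the $^{28}$Si isotope mass (27.9769 u, an assumption on our part since the paper leaves the mass choice implicit) reproduces the quoted $\tau$ to four digits.</p>

```python
import math

sigma = 0.20951e-9         # m   (0.20951 nm)
epsilon = 3.4723e-19       # J   (3.4723e-12 erg, i.e. 50 kcal/mol per atom)
m = 27.9769 * 1.66054e-27  # kg  (28-Si isotope mass -- our assumption)

tau = sigma * math.sqrt(m / epsilon)  # reduced time unit
dt = 5e-3 * tau                       # MD time step
print(tau, dt)  # ~7.6634e-14 s, ~3.83e-16 s
```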
<h3 id="models">Models</h3>
<p>To reproduce this work, one must implement the potential $\Phi = \sum v_2 + \sum v_3$ with the exact functional forms and parameters provided.</p>















<figure class="post-figure center ">
    <img src="/img/notes/chemistry/stillinger-weber-potential.webp"
         alt="Stillinger-Weber potential visualization"
         title="Stillinger-Weber potential visualization"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Left: Two-body radial potential $v_2(r)$ showing the characteristic well at $r_{min} \approx 1.12\sigma$. Right: Three-body angular penalty $h(r_{min}, r_{min}, \theta)$ demonstrating the minimum at the tetrahedral angle (109.5°), which enforces the diamond crystal structure.</figcaption>
    
</figure>

<h4 id="reduced-units">Reduced Units</h4>
<ul>
<li>$\sigma = 0.20951 \text{ nm}$</li>
<li>$\epsilon = 50 \text{ kcal/mol} = 3.4723 \times 10^{-12} \text{ erg}$</li>
</ul>
<h4 id="two-body-term-v_2">Two-Body Term ($v_2$)</h4>
<p>$$
v_2(r_{ij}) = \epsilon A (B r_{ij}^{-p} - r_{ij}^{-q}) \exp[(r_{ij} - a)^{-1}] \quad \text{for } r_{ij} &lt; a
$$</p>
<p><em>(Vanishes for $r \geq a$)</em></p>
<h4 id="three-body-term-v_3">Three-Body Term ($v_3$)</h4>
<p>$$
v_3(r_i, r_j, r_k) = \epsilon [h(r_{ij}, r_{ik}, \theta_{jik}) + h(r_{ji}, r_{jk}, \theta_{ijk}) + h(r_{ki}, r_{kj}, \theta_{ikj})]
$$</p>
<p>where:</p>
<p>$$
h(r_{ij}, r_{ik}, \theta_{jik}) = \lambda \exp[\gamma(r_{ij}-a)^{-1} + \gamma(r_{ik}-a)^{-1}] (\cos\theta_{jik} + \frac{1}{3})^2
$$</p>
<p><em>(Each $h$ term vanishes if either of its distances is $\geq a$)</em></p>
<h4 id="parameters">Parameters</h4>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Parameter</th>
          <th style="text-align: left">Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">$A$</td>
          <td style="text-align: left">$7.049556277$</td>
      </tr>
      <tr>
          <td style="text-align: left">$B$</td>
          <td style="text-align: left">$0.6022245584$</td>
      </tr>
      <tr>
          <td style="text-align: left">$p$</td>
          <td style="text-align: left">$4$</td>
      </tr>
      <tr>
          <td style="text-align: left">$q$</td>
          <td style="text-align: left">$0$</td>
      </tr>
      <tr>
          <td style="text-align: left">$a$</td>
          <td style="text-align: left">$1.80$</td>
      </tr>
      <tr>
          <td style="text-align: left">$\lambda$</td>
          <td style="text-align: left">$21.0$</td>
      </tr>
      <tr>
          <td style="text-align: left">$\gamma$</td>
          <td style="text-align: left">$1.20$</td>
      </tr>
  </tbody>
</table>
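<p>A direct transcription of $v_2$ and $h$ with these parameters (reduced units, $\epsilon = \sigma = 1$) serves as a sanity check: the two-body well bottoms out at $-\epsilon$ near $r \approx 1.12\sigma$, and the angular penalty vanishes exactly at the tetrahedral angle ($\cos\theta = -1/3$).</p>

```python
import math

# Stillinger-Weber potential in reduced units (epsilon = sigma = 1),
# transcribed from the functional forms and parameter table above.
A, B, p, q = 7.049556277, 0.6022245584, 4, 0
a, lam, gamma = 1.80, 21.0, 1.20

def v2(r):
    """Two-body term; cut off smoothly (all derivatives vanish) at r = a."""
    if r >= a:
        return 0.0
    return A * (B * r**-p - r**-q) * math.exp(1.0 / (r - a))

def h(r_ij, r_ik, cos_theta):
    """Angular penalty entering v3; zero at the tetrahedral angle."""
    if r_ij >= a or r_ik >= a:
        return 0.0
    radial = math.exp(gamma / (r_ij - a) + gamma / (r_ik - a))
    return lam * radial * (cos_theta + 1.0 / 3.0) ** 2

print(v2(1.12))                 # well depth: ~ -1.0 (i.e. -epsilon)
print(h(1.0, 1.0, -1.0 / 3.0))  # 0.0 at cos(theta) = -1/3
```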
<h3 id="evaluation">Evaluation</h3>
<p>The paper evaluates the model against experimental diffraction data.</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Metric</th>
          <th style="text-align: left">Simulated Value</th>
          <th style="text-align: left">Experimental Value</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Melting Point ($T_m^*$)</strong></td>
          <td style="text-align: left">$\approx 0.080$</td>
          <td style="text-align: left">N/A</td>
          <td style="text-align: left">Reduced units. Requires $\epsilon \approx 42$ kcal/mol to match the real $T_m = 1410$ °C, vs 50 kcal/mol for the correct cohesive energy.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Coordination (Liquid)</strong></td>
          <td style="text-align: left">$8.07$</td>
          <td style="text-align: left">$\approx 6.4$</td>
          <td style="text-align: left">Evaluated at first $g(r)$ minimum ($r/\sigma = 1.625$). Simulated value is higher than experiment.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ First Peak</strong></td>
          <td style="text-align: left">$2.53$ $\AA^{-1}$</td>
          <td style="text-align: left">$2.80$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Shoulder</strong></td>
          <td style="text-align: left">$3.25$ $\AA^{-1}$</td>
          <td style="text-align: left">$3.25$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I. Exact match with experiment.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Second Peak</strong></td>
          <td style="text-align: left">$5.35$ $\AA^{-1}$</td>
          <td style="text-align: left">$5.75$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Third Peak</strong></td>
          <td style="text-align: left">$8.16$ $\AA^{-1}$</td>
          <td style="text-align: left">$8.50$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Fourth Peak</strong></td>
          <td style="text-align: left">$10.60$ $\AA^{-1}$</td>
          <td style="text-align: left">$11.20$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Entropy of Melting ($\Delta S / N k_B$)</strong></td>
          <td style="text-align: left">$\approx 3.7$</td>
          <td style="text-align: left">$3.25$</td>
          <td style="text-align: left">Simulated at constant volume; experimental at constant pressure (1 atm).</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Stillinger, F. H., &amp; Weber, T. A. (1985). Computer simulation of local order in condensed phases of silicon. <em>Physical Review B</em>, 31(8), 5262-5271. <a href="https://doi.org/10.1103/PhysRevB.31.5262">https://doi.org/10.1103/PhysRevB.31.5262</a></p>
<p><strong>Publication</strong>: Physical Review B, 1985</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{stillingerComputerSimulationLocal1985,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Computer Simulation of Local Order in Condensed Phases of Silicon}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Stillinger, Frank H. and Weber, Thomas A.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1985</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = apr,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Physical Review B}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{31}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{8}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{5262--5271}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{American Physical Society}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1103/PhysRevB.31.5262}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Second-Order Langevin Equation for Field Simulations</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/second-order-langevin-1987/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/second-order-langevin-1987/</guid><description>Hyperbolic Algorithm adds second-order derivatives to Langevin dynamics, reducing systematic errors to O(ε²) for lattice field simulations.</description><content:encoded><![CDATA[<h2 id="contribution-and-paper-type">Contribution and Paper Type</h2>
<p>This is a <strong>Methodological Paper</strong> ($\Psi_{\text{Method}}$). It proposes a novel stochastic algorithm, the Hyperbolic Algorithm (HA), and validates its superior efficiency against the existing Langevin Algorithm (LA) through formal error analysis and numerical simulation. It contains a significant theoretical derivation (Liouville dynamics) that serves primarily to justify the algorithmic performance claims.</p>
<h2 id="motivation-and-gaps-in-prior-work">Motivation and Gaps in Prior Work</h2>
<p>The standard Langevin Algorithm (LA) for numerical simulation of Euclidean field theories suffers from efficiency bottlenecks. The simplest Euler-discretization of the LA introduces systematic errors of $O(\epsilon)$ (where $\epsilon$ is the step size). To maintain accuracy, $\epsilon$ must be kept small, which increases the sweep-sweep correlation time (autocorrelation time), making simulations computationally expensive.</p>
<h2 id="core-novelty-second-order-dynamics">Core Novelty: Second-Order Dynamics</h2>
<p>The core contribution is the introduction of a <strong>second-order derivative in fictitious time</strong> to the stochastic equation. This converts the parabolic Langevin equation into a hyperbolic equation:</p>
<p>$$
\begin{aligned}
\frac{\partial^{2}\phi}{\partial t^{2}}+\gamma\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta
\end{aligned}
$$</p>
<h3 id="equation-comparison">Equation Comparison</h3>
<p>The key difference from the standard (first-order) Langevin equation:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Equation Type</th>
          <th style="text-align: left">Formula</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Hyperbolic (Second Order)</strong></td>
          <td style="text-align: left">$$\frac{\partial^{2}\phi}{\partial t^{2}}+\gamma\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta$$</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Langevin (First Order)</strong></td>
          <td style="text-align: left">$$\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta$$</td>
      </tr>
  </tbody>
</table>
<p>The standard Langevin equation corresponds to the overdamped limit, in which the acceleration term is absent. Physically, the Hyperbolic equation can be viewed as microcanonical equations of motion augmented with friction and noise terms.</p>
<h3 id="key-innovations">Key Innovations</h3>
<ul>
<li><strong>Higher Order Accuracy</strong>: The simplest discretization of this equation leads to systematic errors of only $O(\epsilon^2)$ compared to $O(\epsilon)$ for LA.</li>
<li><strong>Tunable Damping</strong>: The addition of the damping parameter $\gamma$ allows tuning to minimize autocorrelation tails.</li>
<li><strong>Uniform Evolution</strong>: The method evolves structures of different wavelengths more uniformly than LA due to the specific dissipation structure.</li>
</ul>
<h2 id="methodology-and-experiments">Methodology and Experiments</h2>
<p>The author validated the method using the <strong>XY Model</strong> on 2D lattices.</p>
<ul>
<li><strong>System</strong>: Euclidean action $S = -\sum_{x,\mu} \cos(\theta_{x+\mu} - \theta_x)$.</li>
<li><strong>Setup</strong>:
<ul>
<li>Lattice sizes: $15^2$ (helical boundary conditions) and $30^2$.</li>
<li>$\beta$ range: 0.9 to 1.2 (crossing the critical point $\approx 1.0$).</li>
<li>Run length: &gt;100,000 updates in equilibrium.</li>
</ul>
</li>
<li><strong>Metrics</strong>:
<ul>
<li><strong>Autocorrelation time ($\tau$)</strong>: Defined as the number of updates for the time-correlation function to drop to 10% of its initial value.</li>
<li><strong>Systematic Error</strong>: Measured via deviation of average action from Monte Carlo values.</li>
</ul>
</li>
</ul>
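<p>The paper&rsquo;s 10%-threshold definition of the autocorrelation time is simple to state in code. The helper below is our own sketch, not code from the paper:</p>

```python
def autocorr_time(series, threshold=0.1):
    """Sweep-sweep autocorrelation time as defined in the paper: the number
    of updates for the time-correlation function to drop to 10% of its
    initial (lag-0) value."""
    n = len(series)
    mean = sum(series) / n
    dev = [s - mean for s in series]
    c0 = sum(d * d for d in dev) / n
    for lag in range(1, n):
        c = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n - lag)
        if c < threshold * c0:
            return lag
    return n

print(autocorr_time([1.0, -1.0] * 50))  # 1: an alternating series decorrelates immediately
```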
<h2 id="results-and-conclusions">Results and Conclusions</h2>
<ul>
<li><strong>Efficiency</strong>: The Hyperbolic Algorithm (HA) is far more efficient. For equal systematic errors, sweep-sweep correlation times are significantly lower than LA.</li>
<li><strong>Error Scaling</strong>: Numerical results confirmed that HA step size $\epsilon_H = 0.1$ yields systematic errors comparable to LA step size $\epsilon_L \approx 0.008$ ($O(\epsilon^2)$ vs $O(\epsilon)$ scaling).</li>
<li><strong>Speedup</strong>: In the disordered phase, HA is roughly $\epsilon_H / \epsilon_L$ times faster (approximately a factor of 12.5 for $\epsilon_H = 0.1$, $\epsilon_L = 0.008$). In the ordered phase, efficiency gains increase with distance scale, reaching factors of 20 or more for long-range correlations.</li>
<li><strong>Optimal Damping</strong>: For the XY model, the optimal damping parameter was found to be $\gamma \approx 0.4$.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<p><strong>1. The Hyperbolic Algorithm (HA)</strong></p>
<p>The discretized update equations for scalar fields are:</p>
<p>$$
\begin{aligned}
\pi_{t+\epsilon} - \pi_{t} &amp;= -\epsilon\gamma\pi_{t} - \epsilon\frac{\partial S}{\partial\phi_{t}} + \sqrt{2\epsilon\gamma/\beta}\xi_{t} \\
\phi_{t+\epsilon} - \phi_{t} &amp;= \epsilon\pi_{t+\epsilon}
\end{aligned}
$$</p>
<ul>
<li><strong>Variables</strong>: $\phi$ is the field, $\pi$ is the conjugate momentum ($\dot{\phi}$).</li>
<li><strong>Parameters</strong>: $\epsilon$ (step size), $\gamma$ (damping constant).</li>
<li><strong>Noise</strong>: $\xi$ is Gaussian noise with $\langle\xi_x \xi_y\rangle = \delta_{x,y}$.</li>
<li><strong>Storage</strong>: Requires storing both $\phi$ and $\pi$ vectors.</li>
</ul>
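<p>The update equations above translate almost line-for-line into code. The sketch below is our own illustrative implementation (with the paper&rsquo;s $\epsilon_H = 0.1$ and $\gamma = 0.4$ as defaults), applied here to a 1D XY chain rather than the paper&rsquo;s 2D lattice:</p>

```python
import math
import random

def ha_step(theta, pi, grad_S, eps=0.1, gamma=0.4, beta=1.0):
    """One Hyperbolic Algorithm sweep: kick the momenta with friction, force,
    and Gaussian noise, then move the fields using the NEW momenta."""
    g = grad_S(theta)
    amp = math.sqrt(2.0 * eps * gamma / beta)
    pi = [p - eps * gamma * p - eps * gi + amp * random.gauss(0.0, 1.0)
          for p, gi in zip(pi, g)]
    theta = [t + eps * p for t, p in zip(theta, pi)]
    return theta, pi

# 1D XY chain with periodic boundaries: S = -sum_i cos(theta[i+1] - theta[i]).
def grad_S(theta):
    n = len(theta)
    return [math.sin(theta[i] - theta[(i + 1) % n])
            + math.sin(theta[i] - theta[(i - 1) % n]) for i in range(n)]

theta, pi = [0.0] * 16, [0.0] * 16
for _ in range(100):
    theta, pi = ha_step(theta, pi, grad_S)
```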
<p><strong>2. Non-Abelian Generalization</strong></p>
<p>For Lie group elements $U$ with generators $T^a$:</p>
<p>$$
\begin{aligned}
\pi_{t+\epsilon}^a - \pi_{t}^a &amp;= -\epsilon\gamma\pi_{t}^a - \epsilon\delta^a S[U_t] + \sqrt{2\epsilon\gamma/\beta}\xi_{t}^a \\
U_{t+\epsilon} &amp;= e^{i\epsilon\pi_{t+\epsilon}^a T^a} U_t
\end{aligned}
$$</p>
<h3 id="theoretical-proof-of-oepsilon2-accuracy">Theoretical Proof of $O(\epsilon^2)$ Accuracy</h3>
<p>The derivation relies on the generalized Liouville equation for the probability distribution $P[\phi, \pi; t]$.</p>
<ol>
<li><strong>Transition Probability</strong>: The transition $W$ for one iteration is defined.</li>
<li><strong>Effective Liouville Operator</strong>: The evolution is written as $P(t+\epsilon) = \exp(\epsilon L_{\text{eff}}) P(t)$.</li>
<li><strong>Baker-Hausdorff Expansion</strong>: Using normal ordering of operators, the equilibrium distribution $P_{\text{eq}}$ is derived through $O(\epsilon^2)$:</li>
</ol>
<p>$$
\begin{aligned}
P_{\text{eq}} &amp;= \exp\left\lbrace-\frac{1}{2}\beta_{1}\sum_{x}\pi_{x}^{2} - \beta S[\phi] + \frac{1}{2}\epsilon\beta\sum_{x}\pi_{x}S_{x} + \epsilon^{2}G + O(\epsilon^3)\right\rbrace
\end{aligned}
$$</p>
<p>where $\beta_1 = \beta\left(1 - \frac{1}{2}\epsilon\gamma\right)$.</p>
<ol start="4">
<li><strong>Effective Action</strong>: Integrating out $\pi$ yields the effective action for $\phi$:</li>
</ol>
<p>$$
\begin{aligned}
S_{\text{eff}}[\phi] &amp;= S[\phi] - \frac{1}{8}\epsilon^2 \sum_x S_x^2 + \dots
\end{aligned}
$$</p>
<p>The absence of $O(\epsilon)$ terms proves the higher-order accuracy.</p>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Model</strong>: XY Model (2D)</li>
<li><strong>Hamiltonian</strong>: $H = \frac{1}{2}\sum \pi^2 + S[\phi]$ where $S = -\sum \cos(\Delta \theta)$.</li>
<li><strong>Observables</strong>:
<ul>
<li>$\Gamma_n = \cos(\theta_{m+n} - \theta_m)$ (averaged over lattice $m$).</li>
</ul>
</li>
<li><strong>Comparisons</strong>:
<ul>
<li><strong>LA Step</strong>: $\epsilon_L \approx 0.005 - 0.02$.</li>
<li><strong>HA Step</strong>: $\epsilon_H \approx 0.1 - 0.2$.</li>
<li><strong>Equivalence</strong>: $\epsilon_H = 0.1$ matches error of $\epsilon_L \approx 0.008$.</li>
</ul>
</li>
</ul>
<hr>
<h2 id="terminology-note">Terminology Note</h2>
<p>The naming conventions in this paper differ from those commonly used in molecular dynamics (MD). The following table provides a cross-field mapping:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Concept</th>
          <th style="text-align: left"><strong>Field Theory (This Paper)</strong></th>
          <th style="text-align: left"><strong>Molecular Dynamics</strong></th>
          <th style="text-align: left"><strong>Mathematics</strong></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Equation 1</strong></td>
          <td style="text-align: left">&ldquo;Langevin Equation&rdquo;</td>
          <td style="text-align: left">Brownian Dynamics (BD)</td>
          <td style="text-align: left">Overdamped Langevin</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Equation 2</strong></td>
          <td style="text-align: left">&ldquo;Hyperbolic Equation&rdquo;</td>
          <td style="text-align: left">Langevin Dynamics (LD)</td>
          <td style="text-align: left">Underdamped Langevin</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Integrator 1</strong></td>
          <td style="text-align: left">Euler Discretization</td>
          <td style="text-align: left">Euler Integrator</td>
          <td style="text-align: left">Euler-Maruyama</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Integrator 2</strong></td>
          <td style="text-align: left">Hyperbolic Algorithm (HA)</td>
          <td style="text-align: left">Velocity Verlet / Leapfrog</td>
          <td style="text-align: left">Quasi-Symplectic Splitting</td>
      </tr>
  </tbody>
</table>
<p><strong>Key insight</strong>: The paper&rsquo;s &ldquo;Hyperbolic Algorithm&rdquo; is mathematically equivalent to Langevin Dynamics with a Leapfrog/Verlet integrator, commonly used in MD. The baseline &ldquo;Langevin Algorithm&rdquo; corresponds to Brownian Dynamics. The term &ldquo;Langevin equation&rdquo; is overloaded: field theorists often use it for overdamped dynamics (no inertia), while chemists assume it includes momentum ($F=ma$).</p>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Horowitz, A. M. (1987). The Second Order Langevin Equation and Numerical Simulations. <em>Nuclear Physics B</em>, 280, 510-522. <a href="https://doi.org/10.1016/0550-3213(87)90159-3">https://doi.org/10.1016/0550-3213(87)90159-3</a></p>
<p><strong>Publication</strong>: Nuclear Physics B 1987</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{horowitzSecondOrderLangevin1987,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The Second Order {{Langevin}} Equation and Numerical Simulations}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Horowitz, Alan M.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1987</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = jan,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Nuclear Physics B}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{280}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{510--522}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{05503213}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1016/0550-3213(87)90159-3}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Oscillatory CO Oxidation on Pt(110): Temporal Modeling</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/oscillatory-co-oxidation-pt110-1992/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/oscillatory-co-oxidation-pt110-1992/</guid><description>A kinetic model using coupled ODEs to explain temporal self-organization and mixed-mode oscillations in catalytic CO oxidation on Pt(110).</description><content:encoded><![CDATA[<p><strong>Related Work</strong>: This builds on <a href="/notes/chemistry/molecular-simulation/kinetic-oscillations-pt100-1985/">Kinetic Oscillations on Pt(100)</a>, which established that surface phase transitions drive oscillatory catalysis. The Pt(110) system exhibits richer dynamics including mixed-mode oscillations and chaos.</p>
<h2 id="method-presentation-modeling-temporal-self-organization">Method Presentation: Modeling Temporal Self-Organization</h2>
<p>This is primarily a <strong>Method</strong> paper, supported by <strong>Theory</strong>.</p>
<ul>
<li><strong>Method</strong>: The authors construct a specific computational architecture, a set of coupled Ordinary Differential Equations (ODEs), to simulate the catalytic oxidation of CO. They systematically &ldquo;ablate&rdquo; the model, starting with 2 variables (bistability only), adding a 3rd (simple oscillations), and finally a 4th (mixed-mode oscillations) to demonstrate the necessity of each physical component.</li>
<li><strong>Theory</strong>: The model is analyzed using formal bifurcation theory (continuation methods) to map the topology of the phase space (Hopf bifurcations, saddle-node loops, etc.).</li>
</ul>
<h2 id="motivation-bridging-microscopic-structure-and-macroscopic-dynamics">Motivation: Bridging Microscopic Structure and Macroscopic Dynamics</h2>
<p>The Pt(110) surface exhibits complex temporal behavior during CO oxidation, including bistability, sustained oscillations, mixed-mode oscillations (MMOs), and chaos. Previous simple models could explain bistability but failed to capture the oscillatory dynamics observed experimentally. There was a need for a &ldquo;realistic&rdquo; model that used physically derived parameters to quantitatively link microscopic surface changes (structural phase transitions) to macroscopic reaction rates.</p>
<h2 id="novelty-coupling-reaction-kinetics-and-surface-phase-transitions">Novelty: Coupling Reaction Kinetics and Surface Phase Transitions</h2>
<p>The core novelty is the <strong>&ldquo;Reconstruction Model&rdquo;</strong>, which couples the chemical kinetics (Langmuir-Hinshelwood mechanism) with the physical structural phase transition of the platinum surface ($1\times1 \leftrightarrow 1\times2$).</p>
<ul>
<li>They treat the surface structure as a dynamic variable ($w$).</li>
<li>They introduce a fourth variable ($z$) representing &ldquo;faceting&rdquo; to explain complex mixed-mode oscillations, identifying the interplay between two negative feedback loops on different time scales as the driver for this behavior.</li>
</ul>
<h2 id="methodology-experimental-parameters-and-bifurcation-topology">Methodology: Experimental Parameters and Bifurcation Topology</h2>
<p>The validation approach involved a tight loop between numerical simulation and physical experiment:</p>
<ol>
<li><strong>Parameter Determination</strong>: They experimentally measured individual rate constants (sticking coefficients, desorption energies) using Surface Science techniques (LEED, TDS) to ground the model in reality.</li>
<li><strong>Bifurcation Analysis</strong>: They used numerical continuation methods (AUTO package) to compute &ldquo;skeleton bifurcation diagrams,&rdquo; mapping the boundaries between stable states, simple oscillations, and chaos in parameter space ($p_{CO}$ vs $p_{O_2}$).</li>
<li><strong>Physical Validation</strong>: These diagrams were compared directly against experimental work function ($\Delta \phi$) measurements and LEED intensity profiles to verify the existence regions of different dynamic regimes.</li>
</ol>
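<p>The detection step behind such continuation runs can be illustrated in miniature (a generic sketch, not the AUTO computation): at a steady state, a Hopf bifurcation is flagged when a complex-conjugate eigenvalue pair of the Jacobian crosses the imaginary axis as a control parameter varies. The demo below uses the Hopf normal form, where the crossing is known to occur at $\mu = 0$:</p>

```python
import numpy as np

def jacobian(f, y, eps=1e-7):
    # Forward-difference Jacobian of f at y
    y = np.asarray(y, dtype=float)
    n = y.size
    J = np.empty((n, n))
    f0 = np.asarray(f(y))
    for j in range(n):
        yp = y.copy()
        yp[j] += eps
        J[:, j] = (np.asarray(f(yp)) - f0) / eps
    return J

def hopf_normal_form(mu):
    # Steady state at the origin; exact eigenvalues there are mu +/- i
    return lambda y: [mu * y[0] - y[1] - y[0] * (y[0]**2 + y[1]**2),
                      y[0] + mu * y[1] - y[1] * (y[0]**2 + y[1]**2)]

# Scan the control parameter and test stability of the steady state
results = []
for mu in (-0.5, 0.5):
    eigs = np.linalg.eigvals(jacobian(hopf_normal_form(mu), [0.0, 0.0]))
    results.append(bool(eigs.real.max() > 0))
print(results)  # → [False, True]: stable before the Hopf point, unstable after
```

<p>In the paper&rsquo;s setting the same stability test is applied to steady states of the $(u, v, w)$ system while $p_{CO}$ or $p_{O_2}$ is varied.</p>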
<h2 id="results-and-limitations-mixed-mode-oscillations-vs-spatiotemporal-chaos">Results and Limitations: Mixed-Mode Oscillations vs. Spatiotemporal Chaos</h2>
<ul>
<li><strong>Successes</strong>: The 3-variable model successfully reproduces bistability and simple oscillations (limit cycles). The extended 4-variable model qualitatively captures mixed-mode oscillations (MMOs).</li>
<li><strong>Mechanism</strong>: Oscillations arise from the delay between CO adsorption and the resulting surface phase transition (which changes oxygen sticking probabilities).</li>
<li><strong>Limitations</strong>: The 4-variable model only reproduces one type of MMO; certain experimental patterns (e.g., square-wave forms with small oscillations on both high and low work-function levels) were not obtained. The oscillatory region also does not extend to low temperatures as observed experimentally. More fundamentally, the ODE model fails to predict the period-doubling cascade to chaos or hyperchaos observed in experiments. The authors conclude these are likely spatiotemporal phenomena (involving wave propagation and pattern formation) that require Partial Differential Equations (PDEs).</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<p>The paper provides a complete set of equations and parameters required to reproduce the dynamics.</p>
<h3 id="data-parameters">Data (Parameters)</h3>
<p>The model uses kinetic parameters derived from Pt(110) experiments. Key constants for reproduction:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Parameter</th>
          <th style="text-align: left">Value</th>
          <th style="text-align: left">Description</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">$\kappa_c$</td>
          <td style="text-align: left">$3.135 \times 10^5 \, s^{-1} \, \text{mbar}^{-1}$</td>
          <td style="text-align: left">Rate of CO hitting surface</td>
      </tr>
      <tr>
          <td style="text-align: left">$s_c$</td>
          <td style="text-align: left">$1.0$</td>
          <td style="text-align: left">CO sticking coefficient</td>
      </tr>
      <tr>
          <td style="text-align: left">$q$</td>
          <td style="text-align: left">$3$</td>
          <td style="text-align: left">Mobility parameter of precursor adsorption</td>
      </tr>
      <tr>
          <td style="text-align: left">$u_s$</td>
          <td style="text-align: left">$1.0$</td>
          <td style="text-align: left">Saturation coverage ($CO$)</td>
      </tr>
      <tr>
          <td style="text-align: left">$\kappa_o$</td>
          <td style="text-align: left">$5.858 \times 10^5 \, s^{-1} \, \text{mbar}^{-1}$</td>
          <td style="text-align: left">Rate of $O_2$ hitting surface</td>
      </tr>
      <tr>
          <td style="text-align: left">$s_{o,1\times2}$</td>
          <td style="text-align: left">$0.4$</td>
          <td style="text-align: left">$O_2$ sticking coeff ($1\times2$ phase)</td>
      </tr>
      <tr>
          <td style="text-align: left">$s_{o,1\times1}$</td>
          <td style="text-align: left">$0.6$</td>
          <td style="text-align: left">$O_2$ sticking coeff ($1\times1$ phase)</td>
      </tr>
      <tr>
          <td style="text-align: left">$v_s$</td>
          <td style="text-align: left">$0.8$</td>
          <td style="text-align: left">Saturation coverage ($O$)</td>
      </tr>
      <tr>
          <td style="text-align: left">$k_{r}^{0}$</td>
          <td style="text-align: left">$3 \times 10^6 \, s^{-1}$</td>
          <td style="text-align: left">Reaction pre-exponential</td>
      </tr>
      <tr>
          <td style="text-align: left">$E_r$</td>
          <td style="text-align: left">$10 \, \text{kcal/mol}$</td>
          <td style="text-align: left">Reaction activation energy</td>
      </tr>
      <tr>
          <td style="text-align: left">$k_{d}^{0}$</td>
          <td style="text-align: left">$2 \times 10^{16} \, s^{-1}$</td>
          <td style="text-align: left">Desorption pre-exponential</td>
      </tr>
      <tr>
          <td style="text-align: left">$E_d$</td>
          <td style="text-align: left">$38 \, \text{kcal/mol}$</td>
          <td style="text-align: left">Desorption activation energy</td>
      </tr>
      <tr>
          <td style="text-align: left">$k_{p}^{0}$</td>
          <td style="text-align: left">$10^2 \, s^{-1}$</td>
          <td style="text-align: left">Phase transition pre-exponential</td>
      </tr>
      <tr>
          <td style="text-align: left">$E_p$</td>
          <td style="text-align: left">$7 \, \text{kcal/mol}$</td>
          <td style="text-align: left">Phase transition activation energy</td>
      </tr>
      <tr>
          <td style="text-align: left">$k_f$</td>
          <td style="text-align: left">$0.03 \, s^{-1}$</td>
          <td style="text-align: left">Rate of facet formation</td>
      </tr>
      <tr>
          <td style="text-align: left">$k_{t}^{0}$</td>
          <td style="text-align: left">$2.65 \times 10^5 \, s^{-1}$</td>
          <td style="text-align: left">Thermal annealing pre-exponential</td>
      </tr>
      <tr>
          <td style="text-align: left">$E_t$</td>
          <td style="text-align: left">$20 \, \text{kcal/mol}$</td>
          <td style="text-align: left">Thermal annealing activation energy</td>
      </tr>
      <tr>
          <td style="text-align: left">$s_{o,3}$</td>
          <td style="text-align: left">$0.2$</td>
          <td style="text-align: left">Increase of $s_o$ for max faceting ($z=1$)</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms-the-equations">Algorithms (The Equations)</h3>
<p>The system is defined by a set of coupled Ordinary Differential Equations (ODEs).</p>
<p><strong>1. Basic 3-Variable Model (Reconstruction Model)</strong></p>
<p>The core system couples three variables: CO coverage ($u$), oxygen coverage ($v$), and the fraction of the surface in the $1\times1$ phase ($w$):</p>
<p>$$
\begin{aligned}
\dot{u} &amp;= p_{CO} \kappa_c s_c \left(1 - \left(\frac{u}{u_s}\right)^q \right) - k_d u - k_r u v \\
\dot{v} &amp;= p_{O_2} \kappa_o s_o \left(1 - \frac{u}{u_s} - \frac{v}{v_s}\right)^2 - k_r u v \\
\dot{w} &amp;= k_p (w_{eq}(u) - w)
\end{aligned}
$$</p>
<p><em>Note:</em> The oxygen sticking coefficient $s_o$ dynamically depends on the structure $w$, calculated as $s_o = w \cdot s_{o,1\times1} + (1-w) \cdot s_{o,1\times2}$. The equilibrium function $w_{eq}(u)$ is a polynomial step function that activates the phase transition:</p>
<p>$$
w_{eq}(u) =
\begin{cases}
0 &amp; u \le 0.2 \\
\sum_{i=0}^3 r_i u^i &amp; 0.2 &lt; u &lt; 0.5 \\
1 &amp; u \ge 0.5
\end{cases}
$$</p>
<p>The polynomial coefficients from Table II are: $r_3 = -1/0.0135$, $r_2 = -1.05 r_3$, $r_1 = 0.3 r_3$, $r_0 = -0.026 r_3$.</p>
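<p>As a quick sanity check (ours, not the paper&rsquo;s), these coefficients make the cubic join the constant branches continuously, reaching $0$ at $u = 0.2$ and $1$ at $u = 0.5$:</p>

```python
# Table II coefficients for the middle branch of w_eq(u)
r3 = -1.0 / 0.0135
r2 = -1.05 * r3
r1 = 0.3 * r3
r0 = -0.026 * r3

def w_eq_cubic(u):
    return r0 + r1 * u + r2 * u**2 + r3 * u**3

# Continuity with the piecewise-constant branches
assert abs(w_eq_cubic(0.2)) < 1e-9        # joins the w_eq = 0 branch
assert abs(w_eq_cubic(0.5) - 1.0) < 1e-9  # joins the w_eq = 1 branch
```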
<p><strong>2. Extended 4-Variable Model (Faceting)</strong></p>
<p>To reproduce Mixed-Mode Oscillations, the model adds a faceting variable $z$:</p>
<p>$$
\begin{aligned}
s_o &amp;= w \cdot s_{o,1\times1} + (1-w) \cdot s_{o,1\times2} + s_{o,3} z \\
\dot{z} &amp;= k_f \cdot u \cdot v \cdot w \cdot (1-z) - k_t z (1-u)
\end{aligned}
$$</p>
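<p>Putting both pieces together, the 4-variable right-hand side can be sketched as follows. This is a minimal sketch: the smoothstep stands in for the Table II cubic, and the Arrhenius rates are evaluated at a fixed $T = 540$ K (a choice made here, not a prescription of the paper):</p>

```python
import numpy as np

# Parameter-table constants, with Arrhenius rates at T = 540 K
R, T = 0.001987, 540.0                    # kcal/(mol K), K
k_c, s_c, q = 3.135e5, 1.0, 3.0
k_o, s_o1, s_o2, s_o3 = 5.858e5, 0.6, 0.4, 0.2
u_s, v_s = 1.0, 0.8
k_d = 2.0e16 * np.exp(-38.0 / (R * T))    # CO desorption
k_r = 3.0e6 * np.exp(-10.0 / (R * T))     # surface reaction
k_p = 1.0e2 * np.exp(-7.0 / (R * T))      # 1x1/1x2 phase transition
k_t = 2.65e5 * np.exp(-20.0 / (R * T))    # thermal annealing of facets
k_f = 0.03                                # facet formation rate

def w_eq(u):
    # Smoothstep stand-in for the Table II cubic on 0.2..0.5
    if u <= 0.2:
        return 0.0
    if u >= 0.5:
        return 1.0
    s = (u - 0.2) / 0.3
    return 3 * s**2 - 2 * s**3

def rhs(t, y, p_CO=3.0e-5, p_O2=6.67e-5):
    u, v, w, z = y
    s_o = w * s_o1 + (1 - w) * s_o2 + s_o3 * z   # faceting raises O2 sticking
    r = k_r * u * v
    du = p_CO * k_c * s_c * (1 - (u / u_s)**q) - k_d * u - r
    dv = p_O2 * k_o * s_o * (1 - u / u_s - v / v_s)**2 - r
    dw = k_p * (w_eq(u) - w)
    dz = k_f * u * v * w * (1 - z) - k_t * z * (1 - u)
    return [du, dv, dw, dz]
```

<p>Note that the $z$ equation only grows facets while the surface is reactive ($u$, $v$, $w$ all nonzero) and anneals them once CO coverage drops, supplying the slow second feedback loop behind the mixed-mode oscillations.</p>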
<h3 id="models">Models</h3>
<p>The authors define two distinct configurations:</p>
<ol>
<li><strong>3-Variable (u, v, w)</strong>: Sufficient for bistability and simple oscillations (limit cycles).</li>
<li><strong>4-Variable (u, v, w, z)</strong>: Required for mixed-mode oscillations (small oscillations superimposed on large relaxation spikes).</li>
</ol>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Bifurcation Analysis</strong>: The system should be evaluated by computing steady states and detecting Hopf bifurcations as a function of $p_{CO}$ and $p_{O_2}$.</li>
<li><strong>Time Integration</strong>: Stiff ODE solvers (e.g., <code>scipy.integrate.odeint</code> or <code>solve_ivp</code> with &lsquo;Radau&rsquo; or &lsquo;BDF&rsquo; method) are recommended due to the differing time scales of reaction ($u,v$) and reconstruction ($w,z$).</li>
</ul>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Original</strong>: VAX 6800 and VAXstation 3100.</li>
<li><strong>Modern requirements</strong>: Minimal; the system integrates in milliseconds on any modern CPU using standard scientific libraries (Python/MATLAB).</li>
</ul>
<h3 id="reference-implementation">Reference Implementation</h3>
<p>The following Python script implements the 3-variable Reconstruction Model described in the paper, replicating the stable oscillations shown in Figure 7 (T=540K):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> numpy <span style="color:#66d9ef">as</span> np
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> scipy.integrate <span style="color:#f92672">import</span> odeint
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> matplotlib.pyplot <span style="color:#66d9ef">as</span> plt
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># --- 1. CONSTANTS &amp; PARAMETERS ---</span>
</span></span><span style="display:flex;"><span>R <span style="color:#f92672">=</span> <span style="color:#ae81ff">0.001987</span>
</span></span><span style="display:flex;"><span>k_c, s_c, q <span style="color:#f92672">=</span> <span style="color:#ae81ff">3.135e5</span>, <span style="color:#ae81ff">1.0</span>, <span style="color:#ae81ff">3.0</span>
</span></span><span style="display:flex;"><span>k_o, s_o1, s_o2 <span style="color:#f92672">=</span> <span style="color:#ae81ff">5.858e5</span>, <span style="color:#ae81ff">0.6</span>, <span style="color:#ae81ff">0.4</span>
</span></span><span style="display:flex;"><span>k_d0, E_d <span style="color:#f92672">=</span> <span style="color:#ae81ff">2.0e16</span>, <span style="color:#ae81ff">38.0</span>
</span></span><span style="display:flex;"><span>k_r0, E_r <span style="color:#f92672">=</span> <span style="color:#ae81ff">3.0e6</span>, <span style="color:#ae81ff">10.0</span>
</span></span><span style="display:flex;"><span>k_p0, E_p <span style="color:#f92672">=</span> <span style="color:#ae81ff">100.0</span>, <span style="color:#ae81ff">7.0</span>
</span></span><span style="display:flex;"><span>u_s, v_s <span style="color:#f92672">=</span> <span style="color:#ae81ff">1.0</span>, <span style="color:#ae81ff">0.8</span>
</span></span><span style="display:flex;"><span>T, p_CO, p_O2 <span style="color:#f92672">=</span> <span style="color:#ae81ff">540.0</span>, <span style="color:#ae81ff">3.0e-5</span>, <span style="color:#ae81ff">6.67e-5</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Calculate Arrhenius rates</span>
</span></span><span style="display:flex;"><span>k_d <span style="color:#f92672">=</span> k_d0 <span style="color:#f92672">*</span> np<span style="color:#f92672">.</span>exp(<span style="color:#f92672">-</span>E_d <span style="color:#f92672">/</span> (R <span style="color:#f92672">*</span> T))
</span></span><span style="display:flex;"><span>k_r <span style="color:#f92672">=</span> k_r0 <span style="color:#f92672">*</span> np<span style="color:#f92672">.</span>exp(<span style="color:#f92672">-</span>E_r <span style="color:#f92672">/</span> (R <span style="color:#f92672">*</span> T))
</span></span><span style="display:flex;"><span>k_p <span style="color:#f92672">=</span> k_p0 <span style="color:#f92672">*</span> np<span style="color:#f92672">.</span>exp(<span style="color:#f92672">-</span>E_p <span style="color:#f92672">/</span> (R <span style="color:#f92672">*</span> T))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">model</span>(y, t):
</span></span><span style="display:flex;"><span>    u, v, w <span style="color:#f92672">=</span> y
</span></span><span style="display:flex;"><span>    s_o <span style="color:#f92672">=</span> w <span style="color:#f92672">*</span> s_o1 <span style="color:#f92672">+</span> (<span style="color:#ae81ff">1</span> <span style="color:#f92672">-</span> w) <span style="color:#f92672">*</span> s_o2
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Smooth step function for Equilibrium w</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> u <span style="color:#f92672">&lt;=</span> <span style="color:#ae81ff">0.2</span>: weq <span style="color:#f92672">=</span> <span style="color:#ae81ff">0.0</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">elif</span> u <span style="color:#f92672">&gt;=</span> <span style="color:#ae81ff">0.5</span>: weq <span style="color:#f92672">=</span> <span style="color:#ae81ff">1.0</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">else</span>:
</span></span><span style="display:flex;"><span>        x <span style="color:#f92672">=</span> (u <span style="color:#f92672">-</span> <span style="color:#ae81ff">0.2</span>) <span style="color:#f92672">/</span> <span style="color:#ae81ff">0.3</span>
</span></span><span style="display:flex;"><span>        weq <span style="color:#f92672">=</span> <span style="color:#ae81ff">3</span><span style="color:#f92672">*</span>x<span style="color:#f92672">**</span><span style="color:#ae81ff">2</span> <span style="color:#f92672">-</span> <span style="color:#ae81ff">2</span><span style="color:#f92672">*</span>x<span style="color:#f92672">**</span><span style="color:#ae81ff">3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    r_reac <span style="color:#f92672">=</span> k_r <span style="color:#f92672">*</span> u <span style="color:#f92672">*</span> v
</span></span><span style="display:flex;"><span>    du <span style="color:#f92672">=</span> p_CO <span style="color:#f92672">*</span> k_c <span style="color:#f92672">*</span> s_c <span style="color:#f92672">*</span> (<span style="color:#ae81ff">1</span> <span style="color:#f92672">-</span> (u<span style="color:#f92672">/</span>u_s)<span style="color:#f92672">**</span>q) <span style="color:#f92672">-</span> k_d <span style="color:#f92672">*</span> u <span style="color:#f92672">-</span> r_reac
</span></span><span style="display:flex;"><span>    dv <span style="color:#f92672">=</span> p_O2 <span style="color:#f92672">*</span> k_o <span style="color:#f92672">*</span> s_o <span style="color:#f92672">*</span> (<span style="color:#ae81ff">1</span> <span style="color:#f92672">-</span> u<span style="color:#f92672">/</span>u_s <span style="color:#f92672">-</span> v<span style="color:#f92672">/</span>v_s)<span style="color:#f92672">**</span><span style="color:#ae81ff">2</span> <span style="color:#f92672">-</span> r_reac
</span></span><span style="display:flex;"><span>    dw <span style="color:#f92672">=</span> k_p <span style="color:#f92672">*</span> (weq <span style="color:#f92672">-</span> w)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> [du, dv, dw]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># --- 2. SIMULATION STRATEGY ---</span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Simulate for 300 seconds to kill transients</span>
</span></span><span style="display:flex;"><span>t_full <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linspace(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">300</span>, <span style="color:#ae81ff">3000</span>)
</span></span><span style="display:flex;"><span>y0 <span style="color:#f92672">=</span> [<span style="color:#ae81ff">0.1</span>, <span style="color:#ae81ff">0.1</span>, <span style="color:#ae81ff">0.0</span>]
</span></span><span style="display:flex;"><span>solution <span style="color:#f92672">=</span> odeint(model, y0, t_full)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># --- 3. SLICING FOR FIGURE 7 ---</span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Only take the last 60 seconds (stable limit cycle)</span>
</span></span><span style="display:flex;"><span>mask <span style="color:#f92672">=</span> (t_full <span style="color:#f92672">&gt;</span> <span style="color:#ae81ff">240</span>) <span style="color:#f92672">&amp;</span> (t_full <span style="color:#f92672">&lt;</span> <span style="color:#ae81ff">300</span>)
</span></span><span style="display:flex;"><span>t_plot <span style="color:#f92672">=</span> t_full[mask]
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Shift time axis to start at 10s (matching Fig 7 style)</span>
</span></span><span style="display:flex;"><span>t_display <span style="color:#f92672">=</span> t_plot <span style="color:#f92672">-</span> t_plot[<span style="color:#ae81ff">0</span>] <span style="color:#f92672">+</span> <span style="color:#ae81ff">10</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>u_plot <span style="color:#f92672">=</span> solution[mask, <span style="color:#ae81ff">0</span>]
</span></span><span style="display:flex;"><span>v_plot <span style="color:#f92672">=</span> solution[mask, <span style="color:#ae81ff">1</span>]
</span></span><span style="display:flex;"><span>w_plot <span style="color:#f92672">=</span> solution[mask, <span style="color:#ae81ff">2</span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># --- 4. VISUALIZATION ---</span>
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>figure(figsize<span style="color:#f92672">=</span>(<span style="color:#ae81ff">8</span>, <span style="color:#ae81ff">5</span>))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Plot CO (u) and Structure (w) on top (Primary Axis)</span>
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>plot(t_display, w_plot, <span style="color:#e6db74">&#39;g--&#39;</span>, label<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;1x1 Fraction (w)&#39;</span>, linewidth<span style="color:#f92672">=</span><span style="color:#ae81ff">1.5</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>plot(t_display, u_plot, <span style="color:#e6db74">&#39;k-&#39;</span>, label<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;CO Coverage (u)&#39;</span>, linewidth<span style="color:#f92672">=</span><span style="color:#ae81ff">2</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Plot Oxygen (v) on bottom</span>
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>plot(t_display, v_plot, <span style="color:#e6db74">&#39;r-.&#39;</span>, label<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;Oxygen (v)&#39;</span>, linewidth<span style="color:#f92672">=</span><span style="color:#ae81ff">1.5</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>title(<span style="color:#e6db74">&#39;Replication of Figure 7: Stable Oscillations&#39;</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>xlabel(<span style="color:#e6db74">&#39;Time (s)&#39;</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>ylabel(<span style="color:#e6db74">&#39;Coverage [ML]&#39;</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>legend(loc<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;upper center&#39;</span>, ncol<span style="color:#f92672">=</span><span style="color:#ae81ff">3</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>xlim(<span style="color:#ae81ff">10</span>, <span style="color:#ae81ff">60</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>ylim(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1.0</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>grid(<span style="color:#66d9ef">True</span>, alpha<span style="color:#f92672">=</span><span style="color:#ae81ff">0.3</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>show()
</span></span></code></pre></div>
<figure class="post-figure center ">
    <img src="/img/notes/oscillatory-co-pt110-replication.webp"
         alt="Replication of Figure 7 showing stable oscillations in CO oxidation on Pt(110)"
         title="Replication of Figure 7 showing stable oscillations in CO oxidation on Pt(110)"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Output of the reference implementation showing stable oscillations on Pt(110)</figcaption>
    
</figure>

<p>This plot faithfully replicates the stable limit cycle shown in <strong>Figure 7</strong> of the paper:</p>
<ul>
<li><strong>Timeframe</strong>: Shows a 50-second window (labeled 10-60s) after initial transients have died out.</li>
<li><strong>Period</strong>: Regular oscillations with a period of roughly 7-8 seconds.</li>
<li><strong>Phase Relationship</strong>: The surface phase reconstruction ($w$, green dashed) lags slightly behind the CO coverage ($u$, black solid). This delay is the crucial &ldquo;memory&rdquo; effect that enables the oscillation.</li>
<li><strong>Anticorrelation</strong>: The oxygen coverage ($v$, red dash-dot) spikes exactly when the surface is in the active $1\times1$ phase (high $w$) and CO is low, confirming the &ldquo;Langmuir-Hinshelwood&rdquo; reaction mechanism.</li>
</ul>
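<p>The quoted period can also be read off programmatically; below is a generic peak-spacing helper (illustrative, not part of the paper), demonstrated on a synthetic signal of known period:</p>

```python
import numpy as np

def estimate_period(t, x):
    # Mean spacing between successive strict local maxima of a 1-D signal
    interior = (x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])
    peaks = t[1:-1][interior]
    return np.diff(peaks).mean()

t = np.linspace(0.0, 60.0, 6001)   # 0.01 s sampling
x = np.sin(2 * np.pi * t / 7.37)   # synthetic oscillation with a 7.37 s period
print(round(estimate_period(t, x), 2))  # → 7.37
```

<p>Applied to <code>u_plot</code> from the reference implementation, the same helper gives the oscillation period directly.</p>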
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Krischer, K., Eiswirth, M., &amp; Ertl, G. (1992). Oscillatory CO oxidation on Pt(110): Modeling of temporal self-organization. <em>The Journal of Chemical Physics</em>, 96(12), 9161-9172. <a href="https://doi.org/10.1063/1.462226">https://doi.org/10.1063/1.462226</a></p>
<p><strong>Publication</strong>: Journal of Chemical Physics 1992</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{krischerOscillatoryCOOxidation1992,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Oscillatory {{CO}} Oxidation on {{Pt}}(110): {{Modeling}} of Temporal Self-organization}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">shorttitle</span> = <span style="color:#e6db74">{Oscillatory {{CO}} Oxidation on {{Pt}}(110)}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Krischer, K. and Eiswirth, M. and Ertl, G.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1992</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = jun,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{The Journal of Chemical Physics}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{96}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{12}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{9161--9172}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{0021-9606, 1089-7690}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1063/1.462226}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>MD Simulation of Self-Diffusion on Metal Surfaces (1994)</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/self-diffusion-metal-surfaces-1994/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/self-diffusion-metal-surfaces-1994/</guid><description>Molecular dynamics simulation of Iridium surface diffusion confirming atomic exchange mechanisms using EAM and many-body potentials.</description><content:encoded><![CDATA[<h2 id="scientific-typology-computational-discovery">Scientific Typology: Computational Discovery</h2>
<p>This is primarily a <strong>Discovery</strong> ($\Psi_{\text{Discovery}}$) paper, with strong supporting contributions as a <strong>Method</strong> ($\Psi_{\text{Method}}$) evaluation. The primary contribution is the validation and mechanistic visualization of the &ldquo;exchange mechanism&rdquo; for surface diffusion using computational methods (Molecular Dynamics with many-body potentials). This physical phenomenon was previously observed in Field Ion Microscope (FIM) experiments but difficult to characterize dynamically. The paper focuses on determining <em>how</em> atoms move, specifically distinguishing between hopping and exchange mechanisms.</p>
<h2 id="the-field-ion-microscope-fim-observation-gap">The Field Ion Microscope (FIM) Observation Gap</h2>
<p>Surface diffusion is critical for understanding phenomena like crystal growth, epitaxy, and catalysis. Experimental evidence from FIM on fcc(001) surfaces (specifically Pt and Ir) suggested an &ldquo;exchange mechanism&rdquo; where an adatom replaces a substrate atom, challenging the conventional wisdom that adatoms migrate by hopping over potential barriers (bridge sites) between binding sites. The authors sought to:</p>
<ol>
<li>Investigate whether this exchange mechanism could be reproduced dynamically in simulation.</li>
<li>Determine which interatomic potentials (EAM, Sutton-Chen, R-G-L) accurately describe these surface behaviors compared to bulk properties.</li>
</ol>
<h2 id="dynamic-visualization-of-atomic-exchange">Dynamic Visualization of Atomic Exchange</h2>
<p>The study provides a direct dynamic visualization of the &ldquo;concerted motion&rdquo; involved in exchange diffusion events, which happens on timescales too fast for experimental imaging. By comparing three different many-body potentials, the authors demonstrate that the choice of potential is critical for capturing surface phenomena; specifically, identifying that &ldquo;bulk&rdquo; derived potentials (like Sutton-Chen) may fail to capture specific surface exchange events that EAM and R-G-L potentials successfully model.</p>
<h2 id="simulation-protocol--evaluated-potentials">Simulation Protocol &amp; Evaluated Potentials</h2>
<p>The authors performed Molecular Dynamics (MD) simulations on Iridium (Ir) surfaces:</p>
<ul>
<li><strong>Surfaces</strong>: Channeled (110), densely packed (111), and loosely packed (001).</li>
<li><strong>Potentials</strong>: Three many-body models were tested: Embedded Atom Method (EAM), Sutton-Chen (S-C), and Rosato-Guillope-Legrand (R-G-L).</li>
<li><strong>Conditions</strong>: Simulations were primarily run at $T=800$ K to ensure sufficient sampling of diffusion events.</li>
<li><strong>Cross-Validation</strong>: The study extended the analysis to Cu, Rh, and Pt systems to verify the universality of the exchange mechanism against experimental data.</li>
</ul>
<h2 id="confirmation-of-concerted-motion-mechanisms">Confirmation of Concerted Motion Mechanisms</h2>
<ul>
<li><strong>Mechanism Confirmation</strong>: The study confirmed that diffusion on Ir(001) proceeds via an atomic exchange mechanism (concerted motion). The activation energy for exchange ($0.77$ eV) was found to be significantly lower than for hopping over bridge sites ($1.57$ eV).</li>
<li><strong>Surface Structure Dependence</strong>:
<ul>
<li><strong>Ir(111)</strong>: Diffusion is rapid (activation energy $V_a = 0.17$ eV from R-G-L Arrhenius plot) and occurs exclusively via hopping; no exchange events were observed due to the close-packed nature of the surface.</li>
<li><strong>Ir(110)</strong>: Diffusion is anisotropic; atoms hop <em>along</em> channels but use the exchange mechanism to move <em>across</em> channels.</li>
</ul>
</li>
<li><strong>Potential Validity</strong>: The R-G-L and EAM potentials successfully reproduced experimental exchange behaviors, whereas the Sutton-Chen potential failed to predict exchange on Ir(001). The authors attribute the S-C failure primarily to the use of &ldquo;bulk&rdquo; potential parameters to describe interactions at the surface.</li>
<li><strong>Cross-System Comparison</strong>: Across the Cu, Rh, and Pt systems, both S-C and R-G-L potentials correctly predicted the absence of exchange on all three Rh surfaces and on the (111) surfaces of Cu and Pt. Exchange events were correctly predicted on Cu(001), Cu(110), Pt(001), and Pt(110) by both potentials. The sole discrepancy was S-C failing to predict exchange on Ir(001), where R-G-L and EAM succeeded in agreement with experiment.</li>
</ul>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Integration</strong>: &ldquo;Velocity&rdquo; form of the Verlet algorithm.</li>
<li><strong>Time Step</strong>: $\Delta t = 0.01$ ps ($10^{-14}$ s).</li>
<li><strong>Simulation Protocol</strong>:
<ol>
<li><strong>Quenching</strong>: System relaxed to 0 K by zeroing velocities when $v \cdot F &lt; 0$.</li>
<li><strong>Equilibration</strong>: 5 ps constant-temperature run (renormalizing velocities every step).</li>
<li><strong>Production</strong>: 15 ps constant-energy (microcanonical) run where trajectories are collected.</li>
</ol>
</li>
</ul>
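<p>The quench step of the protocol above can be sketched with the velocity form of the Verlet algorithm. This is a minimal illustration using a 1D harmonic force as a stand-in for the many-body potentials; the spring constant, mass, and initial condition are illustrative, not the paper's Ir system:</p>

```python
import numpy as np

def velocity_verlet_step(x, v, f, mass, dt, force_fn):
    """One step of the 'velocity' form of the Verlet algorithm."""
    v_half = v + 0.5 * dt * f / mass            # first half-kick
    x_new = x + dt * v_half                     # drift
    f_new = force_fn(x_new)
    v_new = v_half + 0.5 * dt * f_new / mass    # second half-kick
    return x_new, v_new, f_new

def quench(x, v, f, mass, dt, force_fn, steps):
    """Relax toward 0 K by zeroing velocities whenever v . F < 0."""
    for _ in range(steps):
        x, v, f = velocity_verlet_step(x, v, f, mass, dt, force_fn)
        if np.dot(v.ravel(), f.ravel()) < 0.0:
            v = np.zeros_like(v)
    return x, v, f

# Illustrative 1D harmonic well (not the Ir potentials of the paper)
force_fn = lambda x: -1.0 * x
x, v = np.array([1.0]), np.zeros(1)
x, v, f = quench(x, v, force_fn(x), mass=1.0, dt=0.01,
                 force_fn=force_fn, steps=5000)
```

<p>In the paper's protocol this quench precedes the 5 ps thermostatted equilibration and the 15 ps microcanonical production run.</p>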
<h3 id="models">Models</h3>
<p>The study relies on three specific many-body potential formulations:</p>
<ol>
<li><strong>Embedded Atom Method (EAM)</strong>:
<ul>
<li>Total energy:
$$U_{tot} = \sum_i F_i(\rho_i) + \frac{1}{2} \sum_i \sum_{j \neq i} \phi_{ij}(r_{ij})$$</li>
</ul>
</li>
<li><strong>Sutton-Chen (S-C)</strong>:
<ul>
<li>Uses a square root density dependence and power-law pair repulsion $(a/r)^{n}$:
$$F(\rho) \propto \rho^{1/2}$$</li>
</ul>
</li>
<li><strong>Rosato-Guillope-Legrand (R-G-L)</strong>:
<ul>
<li>Born-Mayer type repulsion:
$$\phi_{ij}(r) = A \exp[-p(r/r_0 - 1)]$$</li>
<li>Attractive band energy:
$$F_i(\rho_i) = -\left(\sum_{j \neq i} \xi^2 \exp[-2q(r_{ij}/r_0 - 1)]\right)^{1/2}$$</li>
</ul>
</li>
</ol>
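<p>The R-G-L form above can be evaluated directly. The sketch below uses illustrative parameter values ($A$, $\xi$, $p$, $q$, $r_0$ are placeholders of plausible magnitude for an fcc transition metal, not the Ir parameters used in the paper) and applies no cutoff:</p>

```python
import numpy as np

def rgl_energy(positions, A=0.1156, xi=2.289, p=16.98, q=2.691, r0=2.715):
    """Total R-G-L energy: pairwise Born-Mayer repulsion plus a
    square-root many-body band term per atom (no cutoff applied)."""
    diff = positions[:, None, :] - positions[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(r, np.inf)                      # drop self terms
    repulsion = (A * np.exp(-p * (r / r0 - 1.0))).sum()
    band = xi**2 * np.exp(-2.0 * q * (r / r0 - 1.0))
    return repulsion - np.sqrt(band.sum(axis=1)).sum()

# Dimer at the reference distance r0: energy reduces to 2A - 2*xi
dimer = np.array([[0.0, 0.0, 0.0], [2.715, 0.0, 0.0]])
E = rgl_energy(dimer)
```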
<h3 id="data">Data</h3>
<ul>
<li><strong>System Size</strong>: 648 classical atoms.</li>
<li><strong>Geometry</strong>:
<ul>
<li>Cubic box with fixed volume.</li>
<li>Periodic boundary conditions in $x$ and $y$ (parallel to surface), free motion in $z$.</li>
<li>Substrate depth: 8, 12, or 9 atomic layers depending on orientation [(001), (110), (111)].</li>
</ul>
</li>
<li><strong>Cutoff Radius</strong>: 14 bohr ($\sim 7.4$ Å).</li>
<li><strong>Initial Conditions</strong>: Velocities initialized from a Maxwellian distribution.</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Diffusion Constant ($D$)</strong>: Calculated using the Einstein relation via Mean Square Displacement (MSD):
$$D = \lim_{t \to \infty} \frac{\langle \Delta r^2(t) \rangle}{2td}$$
where $d=2$ for surface diffusion.</li>
<li><strong>Activation Energy ($V_a$)</strong>: Extracted from the slope of Arrhenius plots ($\ln D$ vs $1/T$).</li>
<li><strong>Attempt Frequency ($\nu$)</strong>: Estimated via harmonic approximation: $\nu = \frac{1}{2\pi}\sqrt{c/M}$.</li>
</ul>
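<p>The evaluation pipeline (MSD slope $\to D$, Arrhenius slope $\to V_a$) can be sketched as follows; the synthetic data at the bottom is purely illustrative and constructed to be exactly linear/Arrhenius:</p>

```python
import numpy as np

def diffusion_constant(times, msd, d=2):
    """Einstein relation: <dr^2>(t) ~ 2*d*D*t, so D = slope / (2d)."""
    slope = np.polyfit(times, msd, 1)[0]
    return slope / (2.0 * d)

def arrhenius_fit(T, D, kB=8.617e-5):
    """Fit ln D vs 1/T; returns (V_a in eV, prefactor D0). kB in eV/K."""
    slope, intercept = np.polyfit(1.0 / np.asarray(T), np.log(D), 1)
    return -slope * kB, np.exp(intercept)

# Synthetic check: recover D and V_a from exact data
t = np.linspace(0.0, 15.0, 100)              # ps, like the production run
msd = 2 * 2 * 0.25 * t                       # d = 2 surface diffusion, D = 0.25
T = np.array([600.0, 700.0, 800.0, 900.0])
D_T = 1e-3 * np.exp(-0.17 / (8.617e-5 * T))  # V_a = 0.17 eV, as for Ir(111)
V_a, D0 = arrhenius_fit(T, D_T)
```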
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Shiang, K.-D., Wei, C. M., &amp; Tsong, T. T. (1994). A molecular dynamics study of self-diffusion on metal surfaces. <em>Surface Science</em>, 301(1-3), 136-150. <a href="https://doi.org/10.1016/0039-6028(94)91295-5">https://doi.org/10.1016/0039-6028(94)91295-5</a></p>
<p><strong>Publication</strong>: Surface Science 1994</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{shiang1994molecular,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{A molecular dynamics study of self-diffusion on metal surfaces}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Shiang, Keh-Dong and Wei, C.M. and Tsong, Tien T.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Surface Science}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{301}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{1-3}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{136--150}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1994}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Elsevier}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1016/0039-6028(94)91295-5}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Kinetic Oscillations in CO Oxidation on Pt(100): Theory</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/kinetic-oscillations-pt100-1985/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/kinetic-oscillations-pt100-1985/</guid><description>Theoretical model using coupled differential equations to explain CO oxidation oscillations via surface phase transitions on platinum.</description><content:encoded><![CDATA[
<figure class="post-figure center ">
    <img src="/img/notes/co-pt100-hollow.webp"
         alt="Carbon monoxide molecule adsorbed on Pt(100) FCC surface in hollow site configuration"
         title="Carbon monoxide molecule adsorbed on Pt(100) FCC surface in hollow site configuration"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">CO molecule adsorbed in hollow site on Pt(100) surface. The surface structure and CO binding configurations are central to understanding the oscillatory behavior.</figcaption>
    
</figure>

<h2 id="contribution-theoretical-modeling-of-kinetic-oscillations">Contribution: Theoretical Modeling of Kinetic Oscillations</h2>
<p>This is a <strong>Theory</strong> ($\Psi_{\text{Theory}}$) paper.</p>
<p>This paper derives a microscopic mechanism based on experimental kinetic data to explain observed kinetic oscillations. It relies heavily on <strong>formal analysis</strong>, including a <strong>Linear Stability Analysis</strong> of a simplified model to derive eigenvalues and characterize stationary points (stable nodes, saddle points, and foci) whose appearance and disappearance drive relaxation oscillations. The primary contribution is the mathematical formulation of the surface phase transition.</p>
<h2 id="motivation-explaining-periodicity-in-surface-reactions">Motivation: Explaining Periodicity in Surface Reactions</h2>
<p>Experimental studies had shown that the catalytic oxidation of Carbon Monoxide (CO) on Platinum (100) surfaces exhibits temporal oscillations and spatial wave patterns at low pressures ($10^{-4}$ Torr). While the individual elementary steps (adsorption, desorption, reaction) were known, the mechanism driving the periodicity was not understood. Prior models relied on indirect evidence; this work aimed to ground the theory in new LEED (Low-Energy Electron Diffraction) observations showing that the surface structure itself transforms periodically between a reconstructed <code>hex</code> phase and a bulk-like <code>1x1</code> phase.</p>
<h2 id="novelty-the-surface-phase-transition-model">Novelty: The Surface Phase Transition Model</h2>
<p>The core novelty is the <strong>Surface Phase Transition Model</strong>. The authors propose that the oscillations are driven by the reversible phase transition of the Pt surface atoms, which is triggered by critical adsorbate coverages:</p>
<ol>
<li><strong>State Dependent Kinetics</strong>: The <code>hex</code> and <code>1x1</code> phases have vastly different sticking coefficients for Oxygen (negligible on <code>hex</code>, high on <code>1x1</code>).</li>
<li><strong>Critical Coverage Triggers</strong>: The transition depends on whether local CO coverage exceeds a critical threshold ($U_{a,grow}$) or falls below another ($U_{a,crit}$).</li>
<li><strong>Trapping-Desorption</strong>: The model introduces a &ldquo;trapping&rdquo; term where CO diffuses from the weakly-binding <code>hex</code> phase to the strongly-binding <code>1x1</code> patches, creating a feedback loop.</li>
</ol>
<h2 id="methodology-reaction-diffusion-simulations">Methodology: Reaction-Diffusion Simulations</h2>
<p>As a theoretical paper, the &ldquo;experiments&rdquo; were computational simulations and mathematical derivations:</p>
<ul>
<li><strong>Linear Stability Analysis</strong>: They simplified the 4-variable model to a 3-variable system ($u$, $v$, $a$), then treated the phase fraction $a$ as a slowly varying parameter. This allowed them to perform a 2-variable stability analysis on the $u$-$v$ subsystem, identifying the conditions for oscillations through the appearance and disappearance of stationary points as $a$ varies.</li>
<li><strong>Hysteresis Simulation</strong>: They simulated temperature-programmed variations to match experimental CO adsorption hysteresis loops, fitting the critical coverage parameters ($U_{a,grow} \approx 0.5$).</li>
<li><strong>Reaction-Diffusion Simulation</strong>: They numerically integrated the full set of 4 coupled differential equations over a 1D spatial grid (40 compartments) to reproduce temporal oscillations and propagating wave fronts.</li>
</ul>
<h2 id="results-mechanisms-of-spatiotemporal-self-organization">Results: Mechanisms of Spatiotemporal Self-Organization</h2>
<ul>
<li><strong>Mechanism Validation</strong>: The model successfully reproduced the asymmetric oscillation waveform (a slow plateau followed by a steep breakdown) observed in work function and LEED measurements.</li>
<li><strong>Phase Transition Role</strong>: Confirmed that the &ldquo;slow&rdquo; step driving the oscillation period is the phase transformation, specifically the requirement for CO to build up to a critical level to nucleate the reactive <code>1x1</code> phase.</li>
<li><strong>Spatial Self-Organization</strong>: The addition of diffusion terms allowed the model to reproduce wave propagation, showing that defects at crystal edges can act as &ldquo;pacemakers&rdquo; or triggers for the rest of the surface.</li>
<li><strong>Chaotic Behavior</strong>: Under slightly different conditions (e.g., $T = 470$ K instead of 480 K), the coupled system produces irregular, chaotic work function oscillations. This arises when not every trigger compartment oscillation drives a wave into the bulk because the bulk has not yet recovered from the previous wave front. The authors note that such irregular behavior is the rule rather than the exception in experimental observations.</li>
<li><strong>Quantitative Limitations</strong>: The calculated oscillation periods are at least one order of magnitude shorter than experimental values (1 to 4 min). This discrepancy arises mainly from unrealistically high values of $k_5$ and $k_8$ used to reduce computational time. The model also restricts spatial analysis to a 1D grid, which oversimplifies the true 2D wave patterns seen in experiments. The authors note that microscopic adsorbate-adsorbate interactions and island formation are not included, which would require multi-scale modeling.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<p>To faithfully replicate this study, one must implement the system of four coupled differential equations. The hardware requirements are negligible by modern standards.</p>
<h3 id="models">Models</h3>
<p>The system tracks four state variables:</p>
<ol>
<li>$u_a$: CO coverage on the <code>1x1</code> phase (normalized to local area $a$)</li>
<li>$u_b$: CO coverage on the <code>hex</code> phase (normalized to local area $b$)</li>
<li>$v_a$: Oxygen coverage on the <code>1x1</code> phase (normalized to local area $a$)</li>
<li>$a$: Fraction of surface in <code>1x1</code> phase ($b = 1 - a$)</li>
</ol>
<p><strong>The Governing Equations:</strong></p>
<p><strong>CO coverage on 1x1 phase:</strong>
$$
\begin{aligned}
\frac{\partial u_a}{\partial t} = k_1 a p_{CO} - k_2 u_a + k_3 a u_b - k_4 u_a v_a / a + k_5 \nabla^2(u_a/a)
\end{aligned}
$$</p>
<p><strong>CO coverage on hex phase:</strong>
$$
\begin{aligned}
\frac{\partial u_b}{\partial t} = k_1 b p_{CO} - k_6 u_b - k_3 a u_b
\end{aligned}
$$</p>
<p><strong>Oxygen coverage on 1x1 phase:</strong>
$$
\begin{aligned}
\frac{\partial v_a}{\partial t} = k_7 a p_{O_2} \left[ \left(1 - 2 \frac{u_a}{a} - \frac{5}{3} \frac{v_a}{a}\right)^2 + \alpha \left(1 - \frac{5}{3}\frac{v_a}{a}\right)^2 \right] - k_4 u_a v_a / a
\end{aligned}
$$</p>
<p><strong>The Phase Transition Logic ($da/dt$):</strong></p>
<p>The growth of the <code>1x1</code> phase ($a$) is piecewise, defined by critical coverages:</p>
<ul>
<li>If $U_a &gt; U_{a,grow}$ and $\partial u_a/\partial t &gt; 0$: island growth with $\partial a/\partial t = (1/U_{a,grow}) \cdot \partial u_a/\partial t$</li>
<li>If $c = U_a/U_{a,crit} + V_a/V_{a,crit} &lt; 1$: decay to hex with $\partial a/\partial t = -k_8 a c$</li>
<li>Otherwise: $\partial a/\partial t = 0$</li>
</ul>
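<p>The single-compartment right-hand side (diffusion term omitted) together with the switching logic above can be sketched as follows. This is a reading of the equations as written, not the authors' code: the capitalized $U_a$, $V_a$ are interpreted as the local coverages $u_a/a$, $v_a/a$, and the guard against $a \to 0$ is an implementation choice of this sketch:</p>

```python
def rhs(state, k, p_co, p_o2, alpha,
        U_grow=0.5, U_crit=0.32, V_crit=0.4):
    """RHS of the single-compartment (no-diffusion) model.
    state = (u_a, u_b, v_a, a); k maps 'k1'..'k8' to rate constants
    (values as in the parameter table). The a -> 0 guard is a sketch
    choice, not part of the original formulation."""
    u_a, u_b, v_a, a = state
    a = max(a, 1e-9)                        # avoid division by zero
    b = 1.0 - a
    # CO on the 1x1 phase (the k5 diffusion term is omitted here)
    du_a = (k['k1'] * a * p_co - k['k2'] * u_a
            + k['k3'] * a * u_b - k['k4'] * u_a * v_a / a)
    # CO on the hex phase
    du_b = k['k1'] * b * p_co - k['k6'] * u_b - k['k3'] * a * u_b
    # Oxygen on the 1x1 phase (site blocking plus defect term alpha)
    site = 1.0 - 2.0 * u_a / a - (5.0 / 3.0) * v_a / a
    defect = 1.0 - (5.0 / 3.0) * v_a / a
    dv_a = (k['k7'] * a * p_o2 * (site**2 + alpha * defect**2)
            - k['k4'] * u_a * v_a / a)
    # Piecewise phase-transition logic for a
    U, V = u_a / a, v_a / a                 # local (per-area) coverages
    if U > U_grow and du_a > 0:
        da = du_a / U_grow                  # island growth tracks CO buildup
    else:
        c = U / U_crit + V / V_crit
        da = -k['k8'] * a * c if c < 1.0 else 0.0
    return du_a, du_b, dv_a, da

# Illustrative call with unit rate constants (not the paper's values)
k = {name: 1.0 for name in ('k1', 'k2', 'k3', 'k4', 'k6', 'k7', 'k8')}
du_a, du_b, dv_a, da = rhs((0.1, 0.1, 0.1, 0.5), k, 1.0, 1.0, 0.1)
```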
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Time Integration</strong>: Runge-Kutta-Merson routine.</li>
<li><strong>Spatial Integration</strong>: Crank-Nicolson algorithm for the diffusion term.</li>
<li><strong>Time Step</strong>: $\Delta t = 10^{-4}$ s.</li>
<li><strong>Spatial Grid</strong>: 1D array of 40 compartments, total length 0.4 cm (each compartment 0.01 cm).</li>
<li><strong>Boundary Conditions</strong>: Closed ends (no flux). Defects simulated by setting $\alpha$ higher in the first 3 &ldquo;edge&rdquo; compartments.</li>
</ul>
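<p>A sketch of the Crank-Nicolson update for the diffusion term alone on the 40-compartment grid with closed (no-flux) ends. The reaction terms would be added on top (e.g. by operator splitting), and the conservative boundary discretization here is one standard choice, not necessarily the authors':</p>

```python
import numpy as np

def crank_nicolson_matrices(n, D, dx, dt):
    """Matrices A, B with A u_new = B u_old for u_t = D u_xx,
    no-flux (closed) boundaries via a conservative 1D Laplacian."""
    r = D * dt / (2.0 * dx**2)
    L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1))
    L[0, 0] = L[-1, -1] = -1.0            # reflecting ends conserve mass
    I = np.eye(n)
    return I - r * L, I + r * L

# Grid and constants quoted in the text: 40 compartments of 0.01 cm,
# dt = 1e-4 s, k5 = 4e-4 cm^2/s
A, B = crank_nicolson_matrices(40, 4e-4, 0.01, 1e-4)
u = np.zeros(40)
u[:3] = 1.0                               # perturbation in the "defect" edge cells
for _ in range(1000):
    u = np.linalg.solve(A, B @ u)
```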
<h3 id="data">Data</h3>
<p>Replication requires the specific rate constants. Note: $k_3$ and $\alpha$ are fitting parameters.</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Symbol</th>
          <th>Value (at 480 K)</th>
          <th>Description</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>CO Stick</td>
          <td>$k_1$</td>
          <td>$2.94 \times 10^5$ ML/s/Torr</td>
          <td>Pre-exponential factor</td>
      </tr>
      <tr>
          <td>CO Desorp (1x1)</td>
          <td>$k_2$</td>
          <td>$1.5$ s$^{-1}$ ($U_a = 0.5$)</td>
          <td>$E_a = 37.3$ (low cov), $33.5$ kcal/mol (high cov)</td>
      </tr>
      <tr>
          <td>Trapping</td>
          <td>$k_3$</td>
          <td>$50 \pm 30$ s$^{-1}$</td>
          <td>Hex to 1x1 diffusion</td>
      </tr>
      <tr>
          <td>Reaction</td>
          <td>$k_4$</td>
          <td>$10^3 - 10^5$ ML$^{-1}$s$^{-1}$</td>
          <td>Langmuir-Hinshelwood</td>
      </tr>
      <tr>
          <td>Diffusion</td>
          <td>$k_5$</td>
          <td>$4 \times 10^{-4}$ cm$^2$/s</td>
<td>CO surface diffusion (elevated for computational speed; realistic: $10^{-7}$ to $10^{-5}$ cm$^2$/s)</td>
      </tr>
      <tr>
          <td>CO Desorp (hex)</td>
          <td>$k_6$</td>
          <td>$11$ s$^{-1}$</td>
          <td>$E_a = 27.5$ kcal/mol</td>
      </tr>
      <tr>
          <td>O2 Adsorption</td>
          <td>$k_7$</td>
          <td>$5.6 \times 10^5$ ML/s/Torr</td>
          <td>Only on 1x1 phase</td>
      </tr>
      <tr>
          <td>Phase Trans</td>
          <td>$k_8$</td>
          <td>$0.4 - 2.0$ s$^{-1}$</td>
          <td>Relaxation constant</td>
      </tr>
      <tr>
          <td>Defect Coeff</td>
          <td>$\alpha$</td>
          <td>$0.1 - 0.5$</td>
          <td>Fitting param for defects</td>
      </tr>
      <tr>
          <td>Crit Cov (Grow)</td>
          <td>$U_{a,grow}$</td>
          <td>$0.5 \pm 0.1$</td>
          <td>Trigger for hex to 1x1</td>
      </tr>
      <tr>
          <td>Crit Cov (Decay)</td>
          <td>$U_{a,crit}$</td>
          <td>$0.32$</td>
          <td>Trigger for 1x1 to hex (CO)</td>
      </tr>
      <tr>
          <td>Crit O Cov</td>
          <td>$V_{a,crit}$</td>
          <td>$0.4$</td>
          <td>Trigger for 1x1 to hex (O)</td>
      </tr>
  </tbody>
</table>
<h3 id="evaluation">Evaluation</h3>
<p>The model was evaluated by comparing the simulated temporal oscillations and spatial wave patterns against experimental work function measurements and LEED observations.</p>
<h3 id="hardware">Hardware</h3>
<p>The original simulations were likely performed on a mainframe or minicomputer of the era. Today, they can be run on any standard personal computer.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Imbihl, R., Cox, M. P., Ertl, G., Müller, H., &amp; Brenig, W. (1985). Kinetic oscillations in the catalytic CO oxidation on Pt(100): Theory. <em>The Journal of Chemical Physics</em>, 83(4), 1578-1587. <a href="https://doi.org/10.1063/1.449834">https://doi.org/10.1063/1.449834</a></p>
<p><strong>Publication</strong>: The Journal of Chemical Physics 1985</p>
<p><strong>Related Work</strong>: See also <a href="/notes/chemistry/molecular-simulation/oscillatory-co-oxidation-pt110-1992/">Oscillatory CO Oxidation on Pt(110)</a> for the same catalytic system on a different crystal face, demonstrating that surface phase transitions drive oscillatory behavior across multiple platinum surfaces.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{imbihl1985kinetic,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Kinetic oscillations in the catalytic CO oxidation on Pt(100): Theory}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Imbihl, R and Cox, MP and Ertl, G and M{\&#34;u}ller, H and Brenig, W}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{The Journal of Chemical Physics}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{83}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{4}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{1578--1587}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1985}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{American Institute of Physics}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>In Situ XRD of Oxidation-Reduction Oscillations on Pt/SiO2</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/oxidation-reduction-oscillations-pt-sio2-1994/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/oxidation-reduction-oscillations-pt-sio2-1994/</guid><description>In situ XRD validation of the oxide model driving kinetic rate oscillations in high-pressure CO oxidation on supported platinum.</description><content:encoded><![CDATA[<h2 id="experimental-validation-of-the-oxide-model">Experimental Validation of the Oxide Model</h2>
<p>This is a <strong>Discovery (Translational/Application)</strong> paper.</p>
<p>It is classified as such because the primary contribution is the experimental resolution of a long-standing scientific debate regarding the physical driving force of kinetic oscillations. The authors use established techniques (in situ X-ray diffraction and Debye Function Analysis) to falsify existing hypotheses (reconstruction model, carbon model) and validate a specific physical mechanism (the oxide model).</p>
<h2 id="the-missing-driving-force-in-high-pressure-co-oxidation">The Missing Driving Force in High-Pressure CO Oxidation</h2>
<p>The study addresses the debate surrounding the driving force of kinetic oscillations in CO oxidation on platinum catalysts at high pressures ($p &gt; 10^{-3}$ mbar). While low-pressure oscillations on single crystals were known to be caused by surface reconstruction, the mechanism for high-pressure oscillations on supported catalysts was unresolved. Three main models existed:</p>
<ul>
<li><strong>Reconstruction model</strong>: Structural changes of the substrate</li>
<li><strong>Carbon model</strong>: Periodic deactivation by carbon</li>
<li><strong>Oxide model</strong>: Periodic formation and reduction of surface oxides</li>
</ul>
<p>Prior to this work, there was no conclusive experimental proof demonstrating the periodic oxidation and reduction required by the oxide model.</p>
<h2 id="direct-in-situ-xrd-proof">Direct In Situ XRD Proof</h2>
<p>The core novelty is the <strong>first direct experimental evidence</strong> connecting periodic structural changes in the catalyst to rate oscillations. Using in situ X-ray diffraction (XRD), the authors demonstrated that the intensity of the Pt(111) Bragg peak oscillates in sync with the reaction rate.</p>
<p>By applying Debye Function Analysis (DFA) to the diffraction profiles, they quantitatively showed that the catalyst transitions between a metallic Pt state and a partially oxidized state (containing $\text{PtO}$ and $\text{Pt}_3\text{O}_4$). This definitively ruled out the reconstruction model (which would produce much smaller intensity variations) and confirmed the oxide model.</p>
<h2 id="in-situ-x-ray-diffraction-and-activity-monitoring">In Situ X-ray Diffraction and Activity Monitoring</h2>
<p>The authors performed <strong>in situ X-ray diffraction</strong> experiments on a supported Pt catalyst (EuroPt-1) during the CO oxidation reaction.</p>
<ul>
<li><strong>Reaction Monitoring</strong>: They cycled the temperature and gas flow rates (CO, $\text{O}_2$, He) to induce ignition, extinction, and oscillations.</li>
<li><strong>Activity Metrics</strong>: Catalytic activity was tracked via sample temperature (using thermocouples) and $\text{CO}_2$ production (using a quadrupole mass spectrometer).</li>
<li><strong>Structural Monitoring</strong>: They recorded the intensity of the Pt(111) Bragg peak continuously.</li>
<li><strong>Cluster Analysis</strong>: Detailed angular scans of diffracted intensity were taken at stationary points (active vs. inactive states) and analyzed using Debye functions to determine cluster size and composition.</li>
</ul>
<h2 id="periodic-oxidation-mechanism-and-reversibility">Periodic Oxidation Mechanism and Reversibility</h2>
<p><strong>Key Findings</strong>:</p>
<ul>
<li><strong>Oscillation Mechanism</strong>: Rate oscillations are accompanied by the periodic oxidation and reduction of the Pt catalyst.</li>
<li><strong>Phase Relationship</strong>: The X-ray intensity (oxide amount) oscillates approximately 120° ahead of the temperature (reaction rate), consistent with the oxide model: oxidation deactivates the surface → rate drops → CO reduces the surface → rate rises.</li>
<li><strong>Oxide Composition</strong>: The oxidized state consists of a mixture of metallic clusters, $\text{PtO}$, and $\text{Pt}_3\text{O}_4$. $\text{PtO}_2$ was not found.</li>
<li><strong>Extent of Oxidation</strong>: Approximately 20-30% of the metal atoms are oxidized, corresponding effectively to a shell of oxide on the surface of the nanoclusters.</li>
<li><strong>Reversibility</strong>: The transition between metallic and oxidized states is fully reversible with no sintering observed under the experimental conditions.</li>
<li><strong>Scope Limitation</strong>: The authors note that whether the oxide model also applies to kinetic oscillations on Pt foils or Pt wires remains to be verified, since small Pt clusters likely have a much higher tendency to form oxides than massive Pt metal.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The study used the <strong>EuroPt-1</strong> standard catalyst.</p>
<table>
  <thead>
      <tr>
          <th>Type</th>
          <th>Material</th>
          <th>Details</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Catalyst</strong></td>
          <td>EuroPt-1 ($\text{Pt/SiO}_2$)</td>
          <td>6.3% Pt loading on silica support</td>
      </tr>
      <tr>
          <td><strong>Particle Size</strong></td>
          <td>Pt Clusters</td>
          <td>Mean diameter ~15.5 Å; dispersion $65 \pm 5\%$</td>
      </tr>
      <tr>
          <td><strong>Sample Prep</strong></td>
          <td>Pellets</td>
          <td>40 mg of catalyst pressed into $15 \times 12 \times 0.3 \text{ mm}^3$ self-supporting pellets</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p><strong>Debye Function Analysis (DFA)</strong></p>
<p>The study used DFA to fit theoretical scattering curves to experimental intensity profiles. This method is suitable for randomly oriented clusters where standard crystallographic methods might fail due to finite size effects.</p>
<p>$$I_{N}(b)=\sum_{m,n=1}^{N}f_{m}f_{n}\frac{\sin(2\pi br_{mn})}{2\pi br_{mn}}$$</p>
<p>Where:</p>
<ul>
<li><strong>$b$</strong>: Scattering vector magnitude, $b=2 \sin \vartheta/\lambda$</li>
<li><strong>$f_m, f_n$</strong>: Atomic scattering amplitudes</li>
<li><strong>$r_{mn}$</strong>: Distance between atom pairs</li>
<li><strong>Shape Assumption</strong>: Cuboctahedral clusters (nearly spherical)</li>
</ul>
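<p>The Debye sum above is direct to evaluate. Below is a minimal implementation assuming a single element (scalar $f$), using NumPy's normalized sinc, where <code>np.sinc(y)</code> $= \sin(\pi y)/(\pi y)$, so <code>np.sinc(2*b*r)</code> equals the kernel $\sin(2\pi b r)/(2\pi b r)$ and handles the $r_{mn}=0$ self terms:</p>

```python
import numpy as np

def debye_intensity(positions, b, f=1.0):
    """I_N(b) = sum_{m,n} f_m f_n sin(2*pi*b*r_mn) / (2*pi*b*r_mn),
    with equal scattering amplitudes f (single-element cluster)."""
    diff = positions[:, None, :] - positions[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    kernel = np.sinc(2.0 * np.asarray(b)[:, None, None] * r)
    return (f * f * kernel).sum(axis=(1, 2))

# Two-atom check: at b -> 0 the sum gives N^2 f^2; at b = 1/(2*r_12)
# the cross terms vanish, leaving N f^2
pair = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
I = debye_intensity(pair, [1e-9, 0.5])
```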
<h3 id="models">Models</h3>
<p><strong>1. The Oxide Model (Physical Mechanism)</strong></p>
<p>Proposed by Sales, Turner, and Maple, validated here:</p>
<ol>
<li><strong>Oxidation</strong>: As oxygen coverage increases, the surface forms a catalytically inactive oxide layer ($\text{PtO}_x$).</li>
<li><strong>Deactivation</strong>: The reaction rate drops as the surface deactivates.</li>
<li><strong>Reduction</strong>: CO adsorption leads to the reduction of the oxide layer, restoring the metallic surface.</li>
<li><strong>Reactivation</strong>: The metallic surface is active for CO oxidation, increasing the rate until oxygen coverage builds up again.</li>
</ol>
<p><strong>2. Shell Model (Structural)</strong></p>
<p>The diffraction data was fit using a &ldquo;Shell Model&rdquo; where a metallic Pt core is surrounded by an oxide shell.</p>
<h3 id="evaluation">Evaluation</h3>
<p><strong>Key Experimental Signatures for Replication</strong>:</p>
<ul>
<li><strong>Ignition Point</strong>: A sharp increase in sample temperature accompanied by a steep 18% decrease in Bragg intensity. After the He flow was switched off, the intensity dropped further to a total decrease of 31.5%.</li>
<li><strong>Oscillation Regime</strong>: Observed at flow rates $\sim 100 \text{ ml/min}$ after cooling the sample to $\sim 375 \text{ K}$. Below $50 \text{ ml/min}$, only bistability is observed. Temperature oscillations had $\sim 50 \text{ K}$ peak-to-peak amplitude.</li>
<li><strong>Magnitude</strong>: Bragg intensity oscillations of ~11% amplitude.</li>
</ul>
<h3 id="hardware">Hardware</h3>
<p><strong>Experimental Setup</strong>:</p>
<ul>
<li><strong>Diffractometer</strong>: Commercial Guinier diffractometer (HUBER) with monochromatized Cu $K_{\alpha1}$ radiation (45° transmission geometry).</li>
<li><strong>Reactor Cell</strong>: Custom 115 $\text{cm}^3$ cell, evacuatable to $10^{-7}$ mbar, equipped with Kapton windows and a Be-cover.</li>
<li><strong>Gases</strong>: CO (4.7 purity), $\text{O}_2$ (4.5 purity), He (4.6 purity) regulated by flow controllers.</li>
<li><strong>Sensors</strong>: Two K-type thermocouples (surface and gas phase) and a differentially pumped Quadrupole Mass Spectrometer (QMS).</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Hartmann, N., Imbihl, R., &amp; Vogel, W. (1994). Experimental evidence for an oxidation/reduction mechanism in rate oscillations of catalytic CO oxidation on Pt/SiO2. <em>Catalysis Letters</em>, 28(2-4), 373-381. <a href="https://doi.org/10.1007/BF00806068">https://doi.org/10.1007/BF00806068</a></p>
<p><strong>Publication</strong>: Catalysis Letters 1994</p>
<p><strong>Related Work</strong>: This work complements <a href="/notes/chemistry/molecular-simulation/oscillatory-co-oxidation-pt110-1992/">Oscillatory CO Oxidation on Pt(110)</a>, which modeled oscillations via surface reconstruction. Here, the driving force is oxidation/reduction.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{hartmannExperimentalEvidenceOxidation1994,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Experimental Evidence for an Oxidation/Reduction Mechanism in Rate Oscillations of Catalytic {{CO}} Oxidation on {{Pt}}/{{SiO2}}}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Hartmann, N. and Imbihl, R. and Vogel, W.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1994</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Catalysis Letters}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{28}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{2-4}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{373--381}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{1011-372X, 1572-879X}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1007/BF00806068}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Evans 1986: Thermal Conductivity of Lennard-Jones Fluid</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/evans-thermal-conductivity-1986/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/evans-thermal-conductivity-1986/</guid><description>A 1986 validation of the Evans NEMD method for simulating heat flow, identifying long-time tail anomalies near the critical point.</description><content:encoded><![CDATA[<h2 id="methodological-validation-and-physical-discovery">Methodological Validation and Physical Discovery</h2>
<p>This is primarily a <strong>Methodological Paper ($\Psi_{\text{Method}}$)</strong>, with a significant secondary component of <strong>Discovery ($\Psi_{\text{Discovery}}$)</strong>.</p>
<p>It focuses on validating a specific algorithm (the &ldquo;Evans method&rdquo;) for Non-Equilibrium Molecular Dynamics (NEMD) by comparing its results against experimental benchmarks. However, it also uncovers physical anomalies, specifically &ldquo;long-time tails&rdquo; in the heat flux autocorrelation function that deviate significantly from theoretical predictions, marking a discovery about the physics of the Lennard-Jones fluid itself.</p>
<h2 id="flow-gradients-and-boundary-limitations">Flow Gradients and Boundary Limitations</h2>
<p>The primary motivation is to overcome the limitations of simulating heat flow using physical boundaries (e.g., walls at different temperatures), which causes severe interpretive difficulties due to density and temperature gradients.</p>
<p>The &ldquo;Evans method&rdquo; uses a fictitious external field to induce heat flow in a periodic, homogeneous system. This paper serves to:</p>
<ol>
<li>Validate this method across a wide range of state points (temperatures and densities) beyond the triple point.</li>
<li>Investigate the system&rsquo;s behavior near the critical point, where transport properties are known to be anomalous.</li>
</ol>
<h2 id="core-innovations-of-the-evans-algorithm">Core Innovations of the Evans Algorithm</h2>
<p>The core contribution is the rigorous stress-testing of the <strong>homogeneous heat flow algorithm</strong> (Evans method) combined with a <strong>Gaussian thermostat</strong>.</p>
<p>Specific novel insights include:</p>
<ul>
<li><strong>Linearity Validation</strong>: Establishing that, away from phase boundaries, the effective thermal conductivity is a monotonic, virtually linear function of the external field, justifying the extrapolation to zero field.</li>
<li><strong>Critical Anomaly Detection</strong>: Finding that near the critical point, conductivity becomes a non-monotonic function of the field, challenging standard simulation approaches in this regime.</li>
<li><strong>Tail Amplitude Discovery</strong>: Demonstrating that the &ldquo;long-time tails&rdquo; of the heat flux autocorrelation function have amplitudes roughly 6 times larger than those predicted by mode-coupling theory.</li>
</ul>
<h2 id="nemd-simulation-setup">NEMD Simulation Setup</h2>
<p>The author performed <strong>Non-Equilibrium Molecular Dynamics (NEMD)</strong> simulations using the Lennard-Jones potential.</p>
<ul>
<li><strong>System</strong>: Mostly $N=108$ particles, with some checks using $N=256$ to test size dependence.</li>
<li><strong>Thermostat</strong>: A Gaussian thermostat was used to keep the kinetic energy (temperature) constant.</li>
<li><strong>State Points</strong>:
<ul>
<li><strong>Critical Isotherm</strong>: $T=1.35$, varying density.</li>
<li><strong>Supercritical Isotherm</strong>: $T=2.0$.</li>
<li><strong>Freezing Line</strong>: Two points ($T=2.74, \rho=1.113$ and $T=2.0, \rho=1.04$).</li>
</ul>
</li>
<li><strong>Validation</strong>: Results were compared against <strong>experimental data for Argon</strong> (using standard LJ parameters).</li>
<li><strong>Ablation</strong>:
<ul>
<li><strong>Field Strength ($F$)</strong>: Varied to check for linearity/non-linearity.</li>
<li><strong>System Size ($N$)</strong>: Comparison between 108 and 256 particles to rule out finite-size artifacts.</li>
</ul>
</li>
</ul>
<h2 id="linearity-regimes-and-long-time-tail-anomalies">Linearity Regimes and Long-Time Tail Anomalies</h2>
<ul>
<li><strong>Agreement with Experiment</strong>: The Evans method yields thermal conductivities in broad agreement with experimental Argon data for most state points.</li>
<li><strong>Linearity</strong>: Away from the critical point, conductivity is a virtually linear function of the field strength $F$, allowing for accurate zero-field extrapolation.</li>
<li><strong>Critical Region Failure</strong>: Near the critical point ($T=1.35, \rho=0.4$), the method struggles; the conductivity is non-monotonic with respect to $F$, and the zero-field extrapolation underestimates the experimental value by ~11%.</li>
<li><strong>Long-Time Tails</strong>: The decay of the heat flux autocorrelation function follows a $t^{-3/2}$ tail (consistent with mode-coupling theory), but the <strong>amplitude is ~6x larger</strong> than predicted.</li>
<li><strong>Phase Hysteresis</strong>: In high-density regions near the freezing line, the system exhibits hysteresis and bi-stability between solid and liquid phases depending on the field strength.</li>
</ul>
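<p>The zero-field extrapolation described above can be sketched numerically. The field strengths and effective conductivities below are illustrative placeholders, not values from the paper; the point is only the linear fit and its intercept:</p>

```python
import numpy as np

# Hypothetical effective conductivities lambda(F) at several field strengths
# (reduced LJ units).  Away from the critical point the dependence is
# roughly linear, so a first-order fit gives the F -> 0 limit.
fields = np.array([0.05, 0.10, 0.15, 0.20])
lam_eff = np.array([6.95, 7.10, 7.26, 7.39])  # illustrative values only

slope, intercept = np.polyfit(fields, lam_eff, 1)
print(f"extrapolated zero-field conductivity: {intercept:.2f}")
```

<p>Near the critical point, where the paper reports non-monotonic behavior, this linear extrapolation is exactly what breaks down.</p>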
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The simulation relies on the Lennard-Jones (LJ) potential to model Argon. No external training data is used; the &ldquo;data&rdquo; consists of the physical constants defining the system.</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value/Description</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Potential</strong></td>
          <td>$\Phi(q)=4(q^{-12}-q^{-6})$</td>
          <td>Standard LJ 12-6 potential</td>
      </tr>
      <tr>
          <td><strong>Cutoff</strong></td>
          <td>$r_c = 2.5$</td>
          <td>Truncated at 2.5 distance units</td>
      </tr>
      <tr>
          <td><strong>Comparison</strong></td>
          <td>Argon Experimental Data</td>
          <td>Sourced from NBS recommended values</td>
      </tr>
  </tbody>
</table>
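<p>A minimal sketch of the potential in reduced units, with the paper&rsquo;s simple truncation at $r_c = 2.5$ (energy shifting and tail corrections are omitted here):</p>

```python
import numpy as np

def lj_truncated(q, rc=2.5):
    """Reduced-units LJ 12-6 potential Phi(q) = 4(q^-12 - q^-6),
    set to zero beyond the cutoff rc, as in the paper."""
    q = np.asarray(q, dtype=float)
    phi = 4.0 * (q**-12 - q**-6)
    return np.where(q < rc, phi, 0.0)

# Minimum at q = 2**(1/6) with depth -1 in reduced units.
print(lj_truncated(2**(1/6)))
```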
<h3 id="algorithms">Algorithms</h3>
<p>The core algorithm is the <strong>Evans Homogeneous Heat Flow</strong> method. To reproduce this, one must implement the specific Equations of Motion (EOM) derived from linear response theory.</p>
<p><strong>Equations of Motion:</strong></p>
<p>The trajectories are generated by:
$$
\begin{aligned}
\dot{q}_i &amp;= \frac{p_i}{m} \\
\dot{p}_i &amp;= F_i^{\text{inter}} + (E_i - \bar{E})F(t) - \sum_{j} F_{ij} q_{ij} \cdot F(t) + \frac{1}{2N} \sum_{j,k} F_{jk} q_{jk} \cdot F(t) - \alpha p_i
\end{aligned}
$$</p>
<p>Where:</p>
<ul>
<li>$F(t)$ is the fictitious external field driving heat flow.</li>
<li>$E_i$ is the instantaneous energy of particle $i$.</li>
<li>$\alpha$ is the <strong>Gaussian thermostat multiplier</strong>, recomputed at every step so that the kinetic energy (and hence the temperature) is strictly conserved. It projects the total non-thermostat force (interatomic plus field terms) onto the momenta:
$$\alpha = \frac{\sum_i \left(F_i^{\text{inter}} + \text{field terms}\right) \cdot p_i}{\sum_i p_i \cdot p_i}$$</li>
</ul>
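<p>The thermostat multiplier can be sketched as follows; <code>forces</code> stands for whatever total non-thermostat force the equations of motion produce at a given step, and the vectors in the check are made up:</p>

```python
import numpy as np

def gaussian_thermostat_alpha(forces, momenta):
    """Isokinetic multiplier alpha = (sum_i F_i . p_i) / (sum_i p_i . p_i).

    `forces` holds the total non-thermostat force on each particle;
    subtracting alpha * p_i makes d/dt sum(p_i^2) vanish, so kinetic
    energy is a constant of motion.
    """
    return np.sum(forces * momenta) / np.sum(momenta * momenta)

# Tiny illustrative check with made-up 2-particle vectors:
F = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
p = np.array([[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
alpha = gaussian_thermostat_alpha(F, p)
# With alpha subtracted, sum_i (F_i - alpha p_i) . p_i vanishes:
print(np.sum((F - alpha * p) * p))
```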
<p><strong>Conductivity Calculation:</strong></p>
<p>The zero-frequency limit is extrapolated as:
$$ \lambda = \lim_{F \to 0} \frac{J_Q}{FT} $$</p>
<p>The frequency-dependent conductivity relies on the heat-flux autocorrelation:
$$ \lambda(\omega) = \frac{V}{3k_B T^2} \int_0^\infty dt \, e^{i\omega t} \langle J_Q(t) \cdot J_Q(0) \rangle $$</p>
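<p>A sketch of the corresponding zero-frequency Green-Kubo estimate from a sampled heat-flux time series; the function name, lag range, and rectangular-rule discretization are illustrative choices, not details from the paper:</p>

```python
import numpy as np

def green_kubo_lambda(JQ, dt, V, T, kB=1.0):
    """Zero-frequency Green-Kubo conductivity
        lambda = V / (3 kB T^2) * integral_0^inf <J_Q(t) . J_Q(0)> dt
    estimated from a heat-flux time series JQ of shape (nsteps, 3).
    """
    nsteps = JQ.shape[0]
    nlags = nsteps // 2  # only resolve lags with decent origin averaging
    # Autocorrelation C(l*dt) = <J_Q(t) . J_Q(0)>, averaged over time origins.
    C = np.array([
        np.mean(np.sum(JQ[l:] * JQ[:nsteps - l], axis=1))
        for l in range(nlags)
    ])
    # Rectangular-rule time integral over the resolved lags.
    return V / (3.0 * kB * T**2) * C.sum() * dt
```

<p>In practice the slow $t^{-3/2}$ tail reported in the paper means this truncated integral converges poorly, which is precisely why the anomalous tail amplitude matters.</p>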
<h3 id="models">Models</h3>
<p>The &ldquo;model&rdquo; here is the physical simulation setup.</p>
<ul>
<li><strong>Particle Count</strong>: $N = 108$ (primary), $N = 256$ (validation).</li>
<li><strong>Boundary Conditions</strong>: Periodic Boundary Conditions (PBC).</li>
<li><strong>Thermostat</strong>: Gaussian Isokinetic (Temperature is a constant of motion).</li>
</ul>
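<p>The periodic setup relies on the minimum-image convention for computing interparticle separations; a minimal sketch for a cubic box (the box length <code>L</code> here is a free parameter, not a value from the paper):</p>

```python
import numpy as np

def minimum_image(dr, L):
    """Map a displacement vector into the minimum-image convention
    for a cubic periodic box of side L."""
    return dr - L * np.round(dr / L)

# A separation of 0.9 in a unit box is really -0.1 through the boundary:
print(minimum_image(np.array([0.9, 0.4, 0.0]), 1.0))
```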
<h3 id="evaluation">Evaluation</h3>
<p>The primary metric is the <strong>Thermal Conductivity</strong> ($\lambda$).</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Definition</th>
          <th>Baseline</th>
          <th>Result</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Thermal Conductivity</strong></td>
          <td>Ratio of heat flux $J_Q$ to field $F$ (extrapolated to $F=0$)</td>
          <td>Experimental Argon (NBS Data)</td>
          <td>Good agreement away from critical point</td>
      </tr>
      <tr>
          <td><strong>Tail Amplitude</strong></td>
          <td>Coefficient of the $\omega^{1/2}$ term in frequency-dependent conductivity</td>
          <td>Mode-Coupling Theory ($\approx 0.05$)</td>
          <td>Simulation value $\approx 0.3$ (6x larger)</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Requirements</strong>: While 1986 hardware is obsolete, reproducing this requires a standard MD code capable of non-conservative forces (NEMD).</li>
<li><strong>Compute Cost</strong>: Low by modern standards. 108 particles for $\sim 10^5$ to $10^6$ steps is trivial on modern CPUs.</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Evans, D. J. (1986). Thermal conductivity of the Lennard-Jones fluid. <em>Physical Review A</em>, 34(2), 1449-1453. <a href="https://doi.org/10.1103/PhysRevA.34.1449">https://doi.org/10.1103/PhysRevA.34.1449</a></p>
<p><strong>Publication</strong>: Physical Review A, 1986</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{PhysRevA.34.1449,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Thermal conductivity of the Lennard-Jones fluid}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Evans, Denis J.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Phys. Rev. A}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{34}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{2}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{1449--1453}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">numpages</span> = <span style="color:#e6db74">{0}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1986}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = <span style="color:#e6db74">{Aug}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{American Physical Society}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1103/PhysRevA.34.1449}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span> = <span style="color:#e6db74">{https://link.aps.org/doi/10.1103/PhysRevA.34.1449}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Embedded-Atom Method: Theory and Applications Review</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method-review-1993/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method-review-1993/</guid><description>Comprehensive 1993 review of the Embedded-Atom Method (EAM), covering theory, parameterization, and applications to metallic systems.</description><content:encoded><![CDATA[<h2 id="systematizing-the-embedded-atom-method">Systematizing the Embedded-Atom Method</h2>
<p>This is a <strong>Systematization (Review)</strong> paper. It consolidates the theoretical development, semi-empirical parameterization, and broad applications of the Embedded-Atom Method (EAM) into a unified framework. The paper systematizes the field by connecting the EAM to related theories (Effective Medium Theory, Finnis-Sinclair, &ldquo;glue&rdquo; models) and organizing phenomenological results across diverse physical regimes (bulk, surfaces, interfaces).</p>
<p>The authors explicitly frame the work as a survey, stating &ldquo;We review here the history, development, and application of the EAM&rdquo; and &ldquo;This review emphasizes the physical insight that motivated the EAM.&rdquo; The paper follows a classic survey structure, organizing the literature by application domains.</p>
<h2 id="the-failure-of-pair-potentials-in-metallic-systems">The Failure of Pair Potentials in Metallic Systems</h2>
<p>The primary motivation is the failure of pair-potential models to accurately describe metallic bonding, particularly at defects and interfaces.</p>
<p><strong>Physics Gap</strong>: Pair potentials assume bond strength is independent of environment, implying cohesive energy scales linearly with coordination ($Z$), whereas in reality it scales roughly as $\sqrt{Z}$.</p>
<p><strong>Empirical Failures</strong>: Pair potentials incorrectly predict the &ldquo;Cauchy relation&rdquo; ($C_{12} = C_{44}$) and predict a vacancy formation energy equal to the cohesive energy, contradicting experimental data for fcc metals.</p>
<p><strong>Practical Need</strong>: First-principles calculations (like DFT) were computationally too expensive for low-symmetry systems like grain boundaries and fracture tips, creating a need for an efficient, semi-empirical many-body potential.</p>
<h2 id="theoretical-unification--core-innovations">Theoretical Unification &amp; Core Innovations</h2>
<p>The paper&rsquo;s core contribution is the synthesis of the EAM as a practical computational tool that captures &ldquo;coordination-dependent bond strength&rdquo; without the cost of ab initio methods.</p>
<p><strong>Theoretical Unification</strong>: It demonstrates that the EAM ansatz can be derived from Density Functional Theory (DFT) by assuming the total electron density is a superposition of atomic densities.</p>
<p><strong>Environmental Dependence</strong>: It explicitly formulates how the &ldquo;effective&rdquo; pair interaction stiffens and shortens as coordination decreases (e.g., at surfaces), a feature naturally arising from the non-linearity of the embedding function.</p>
<p><strong>Broad Validation</strong>: It provides a centralized evaluation of the method across a vast array of metallic properties, establishing it as the standard for atomistic simulations of face-centered cubic (fcc) metals.</p>
<h2 id="validating-eam-across-application-domains">Validating EAM Across Application Domains</h2>
<p>The authors review computational experiments using Energy Minimization, Molecular Dynamics (MD), and Monte Carlo (MC) simulations across several domains:</p>
<p><strong>Bulk Properties</strong>: Calculation of phonon spectra, liquid structure factors, thermal expansion coefficients, and melting points for fcc metals (Ni, Pd, Pt, Cu, Ag, Au).</p>
<p><strong>Defects</strong>: Computation of vacancy formation/migration energies and self-interstitial geometries.</p>
<p><strong>Grain Boundaries</strong>: Calculation of grain boundary structures, energies, and elastic properties for twist and tilt boundaries in Au and Al. Computed structures show good agreement with X-ray diffraction and HRTEM experiments. The many-body interactions in the EAM produce somewhat better agreement than pair potentials, which tend to overestimate boundary expansion.</p>
<p><strong>Surfaces</strong>: Analysis of surface energies, relaxations, reconstructions (e.g., Au(110) missing row), and surface phonons.</p>
<p><strong>Alloys</strong>: Investigation of heat of solution, surface segregation profiles (e.g., Ni-Cu), and order-disorder transitions.</p>
<p><strong>Mechanical Properties</strong>: Simulation of dislocation mobility, pinning by defects (He bubbles), and crack tip plasticity (ductile vs. brittle fracture modes).</p>
<h2 id="key-outcomes-and-the-limits-of-eam">Key Outcomes and the Limits of EAM</h2>
<p><strong>Many-Body Success</strong>: The EAM successfully reproduces the breakdown of the Cauchy relation and the correct ratio of vacancy formation energy to cohesive energy (~0.35) for fcc metals.</p>
<p><strong>Surface Accuracy</strong>: It correctly predicts that surface bonds are shorter and stiffer than bulk bonds due to lower coordination. It accurately predicts surface reconstructions (e.g., Au(110) $(1 \times 2)$).</p>
<p><strong>Alloy Behavior</strong>: The method naturally captures segregation phenomena, including oscillating concentration profiles in Ni-Cu, driven by the embedding energy.</p>
<p><strong>Limitations</strong>: The method is less accurate for systems with strong directional bonding (covalent materials) or significant Fermi-surface effects, as it assumes spherically averaged electron densities.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p><strong>Fitting Data</strong>: The semi-empirical functions are fitted to basic bulk properties: lattice constants, cohesive energy, elastic constants ($C_{11}$, $C_{12}$, $C_{44}$), and vacancy formation energy.</p>
<p><strong>Universal Binding Curve</strong>: The cohesive energy as a function of lattice constant is constrained to follow the &ldquo;universal binding curve&rdquo; of Rose et al. to ensure accurate anharmonic behavior.</p>
<p><strong>Alloy Data</strong>: For binary alloys, dilute heats of alloying are used for fitting cross-interactions.</p>
<h3 id="algorithms">Algorithms</h3>
<p><strong>Core Ansatz</strong>: The total energy is defined as:</p>
<p>$$E_{coh} = \sum_{i} G_i\left( \sum_{j \neq i} \rho_j^a(R_{ij}) \right) + \frac{1}{2} \sum_{i, j (j \neq i)} U_{ij}(R_{ij})$$</p>
<p>where $G$ is the embedding energy (function of local electron density $\rho$), and $U$ is a pair interaction.</p>
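<p>The ansatz transcribes directly to code. The functions <code>G</code>, <code>rho_a</code>, and <code>U</code> below are toy placeholders (a square-root embedding in the Finnis-Sinclair spirit mentioned in the text, with exponential density and pair terms), not any of the published parameterizations:</p>

```python
import numpy as np

def eam_energy(positions, G, rho_a, U):
    """Total energy of the EAM ansatz for a single-species cluster
    (open boundaries).  G, rho_a, U stand in for the embedding function,
    atomic density, and pair interaction; a real calculation would plug
    in one of the published parameter sets.
    """
    n = len(positions)
    rho_bar = np.zeros(n)  # host density seen by each atom
    pair = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = np.linalg.norm(positions[i] - positions[j])
            rho_bar[i] += rho_a(r)  # superposition of atomic densities
            pair += 0.5 * U(r)      # each pair visited twice, halved
    return np.sum(G(rho_bar)) + pair

# Toy dimer at separation 1 with placeholder functional forms:
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
E = eam_energy(pos,
               G=lambda rho: -np.sqrt(rho),
               rho_a=lambda r: np.exp(-r),
               U=lambda r: np.exp(-2.0 * r))
```

<p>The non-linearity of <code>G</code> is what makes the effective bond strength depend on coordination, the key advantage over pair potentials discussed above.</p>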
<p><strong>Simulation Techniques</strong>:</p>
<ul>
<li><strong>Molecular Dynamics (MD)</strong>: Used for liquids, phonons, and fracture simulations.</li>
<li><strong>Monte Carlo (MC)</strong>: Used for phase diagrams and segregation profiles (e.g., approximately $10^5$ iterations per atom).</li>
<li><strong>Phonons</strong>: Calculated via the dynamical matrix derived from the force-constant tensor $K_{ij}$.</li>
<li><strong>Normal-Mode Analysis</strong>: Vibrational normal modes obtained by diagonalizing the dynamical matrix, feasible for unit cells of up to about 260 atoms.</li>
</ul>
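<p>The normal-mode step can be illustrated on a 1D harmonic chain, a minimal stand-in for diagonalizing the 3D EAM dynamical matrix (chain length, spring constant, and mass are arbitrary):</p>

```python
import numpy as np

# Dynamical matrix of a periodic 1D chain: D_ii = 2k/m, D_i,i+-1 = -k/m.
n, k, m = 8, 1.0, 1.0
D = np.zeros((n, n))
for i in range(n):
    D[i, i] = 2.0 * k / m
    D[i, (i + 1) % n] -= k / m
    D[i, (i - 1) % n] -= k / m

# Diagonalization gives squared phonon frequencies omega^2(q).
omega = np.sqrt(np.clip(np.linalg.eigvalsh(D), 0.0, None))
# Analytic dispersion for comparison: omega(q) = 2*sqrt(k/m)*|sin(q/2)|,
# with allowed wavevectors q = 2*pi*j/n.
```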
<h3 id="models">Models</h3>
<p><strong>Parameterizations</strong>: The review lists several specific function sets developed by the authors (Table 2), including:</p>
<ul>
<li><strong>Daw and Baskes</strong>: For Ni, Pd, H (elemental metals and H in solution/on surfaces)</li>
<li><strong>Foiles</strong>: For Cu, Ag, Au, Ni, Pd, Pt (elemental metals)</li>
<li><strong>Foiles</strong>: For Cu, Ni (tailored for the Ni-Cu alloy system)</li>
<li><strong>Foiles, Baskes and Daw</strong>: For Cu, Ag, Au, Ni, Pd, Pt (dilute alloys)</li>
<li><strong>Daw, Baskes, Bisson and Wolfer</strong>: For Ni, H (fracture, dislocations, H embrittlement)</li>
<li><strong>Foiles and Daw</strong>: For Ni, Al (Ni-rich end of the Ni-Al alloy system)</li>
<li><strong>Daw</strong>: For Ni (calculated from first principles, not semi-empirical)</li>
<li><strong>Hoagland, Daw, Foiles and Baskes</strong>: For Al (elemental Al)</li>
</ul>
<p>Many of these historical parameterizations are directly downloadable in machine-readable formats from the NIST Interatomic Potentials Repository (linked in the resources below).</p>
<p><strong>Transferability</strong>: EAM functions are generally <em>not</em> transferable between different parameterization sets; mixing functions from different sets (e.g., Daw-Baskes Ni with Foiles Pd) is invalid.</p>
<h3 id="evaluation">Evaluation</h3>
<p><strong>Bulk Validation</strong>: Phonon dispersion curves for Cu show excellent agreement with experiment across the full Brillouin zone.</p>
<p><strong>Thermal Properties</strong>: Linear thermal expansion coefficients match experiment well (e.g., Cu calculated: $16.4 \times 10^{-6}/K$ vs experimental: $16.7 \times 10^{-6}/K$).</p>
<p><strong>Defect Energetics</strong>: Vacancy migration energies and divacancy binding energies (~0.1-0.2 eV) align with experimental data.</p>
<p><strong>Surface Segregation</strong>: Correctly predicts segregation species for 18 distinct dilute alloy cases (e.g., Cu segregating in Ni).</p>
<h3 id="hardware">Hardware</h3>
<p><strong>Compute Scale</strong>: At the time of publication (1993), Molecular Dynamics simulations of up to 35,000 atoms were possible.</p>
<p><strong>Platforms</strong>: Calculations were performed on supercomputers like the <strong>CRAY-XMP</strong>, though smaller calculations were noted as feasible on high-performance workstations.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Daw, M. S., Foiles, S. M., &amp; Baskes, M. I. (1993). The embedded-atom method: a review of theory and applications. <em>Materials Science Reports</em>, 9(7-8), 251-310. <a href="https://doi.org/10.1016/0920-2307(93)90001-U">https://doi.org/10.1016/0920-2307(93)90001-U</a></p>
<p><strong>Publication</strong>: Materials Science Reports 1993</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{dawEmbeddedatomMethodReview1993,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The embedded-atom method: a review of theory and applications}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">shorttitle</span> = <span style="color:#e6db74">{The Embedded-Atom Method}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Daw, Murray S. and Foiles, Stephen M. and Baskes, Michael I.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1993</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = mar,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Materials Science Reports}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{9}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{7-8}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{251--310}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{0920-2307}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1016/0920-2307(93)90001-U}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method/">Original EAM Paper (1984)</a></li>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method-voter-1994/">EAM User Guide (1994)</a></li>
<li><a href="https://www.ctcms.nist.gov/potentials/">NIST Interatomic Potentials Repository</a></li>
</ul>
]]></content:encoded></item><item><title>Embedded-Atom Method User Guide: Voter's 1994 Chapter</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method-voter-1994/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method-voter-1994/</guid><description>Comprehensive user guide for the Embedded-Atom Method (EAM), covering theory, potential fitting, and applications to intermetallics.</description><content:encoded><![CDATA[<h2 id="contribution-systematizing-the-embedded-atom-method">Contribution: Systematizing the Embedded-Atom Method</h2>
<p>This is a <strong>Systematization</strong> paper (specifically a handbook chapter) with a strong secondary <strong>Method</strong> projection.</p>
<p>Its primary goal is to serve as a &ldquo;users&rsquo; guide&rdquo; to the Embedded-Atom Method (EAM). The text organizes existing knowledge:</p>
<ul>
<li>It traces the physical origins of EAM from Density Functional Theory (DFT) and Effective Medium Theory.</li>
<li>It synthesizes &ldquo;closely related methods&rdquo; (Second Moment Approximation, Glue Model), showing they are mathematically equivalent or very similar to EAM.</li>
<li>It provides a pedagogical, step-by-step methodology for fitting potentials to experimental data.</li>
</ul>
<h2 id="motivation-bridging-the-gap-between-dft-and-pair-potentials">Motivation: Bridging the Gap Between DFT and Pair Potentials</h2>
<p>The primary motivation is to bridge the gap between accurate, expensive electronic structure calculations and fast, inaccurate pair potentials.</p>
<ul>
<li><strong>Computational Efficiency</strong>: First-principles methods scale as $O(N^3)$ or worse, limiting simulations to $&lt;100$ atoms (in 1994). Pair potentials scale as $O(N)$ and fail to capture essential many-body physics of metals.</li>
<li><strong>Physical Accuracy</strong>: Simple pair potentials cannot accurately model metallic defects; they predict zero Cauchy pressure ($C_{12} - C_{44} = 0$) and equate vacancy formation energy to cohesive energy, both of which are incorrect for transition metals.</li>
<li><strong>Practical Utility</strong>: There was a need for a clear guide on how to construct and apply these potentials for large-scale simulations ($10^6+$ atoms) of fracture and defects.</li>
</ul>
<h2 id="novelty-a-unified-framework-and-robust-fitting-recipe">Novelty: A Unified Framework and Robust Fitting Recipe</h2>
<p>As a review chapter, the novelty lies in the synthesis and the specific, reproducible recipe for potential construction. Central to this synthesis is the core EAM energy functional:</p>
<p>$$E_{\text{tot}} = \sum_i \left( F(\bar{\rho}_i) + \frac{1}{2} \sum_{j \neq i} \phi(r_{ij}) \right)$$</p>
<p>where the total energy $E_{\text{tot}}$ depends on embedding an atom $i$ into a local background electron density $\bar{\rho}_i = \sum_{j \neq i} \rho(r_{ij})$, plus a repulsive pair interaction $\phi(r_{ij})$.</p>
<ul>
<li><strong>Unified Framework</strong>: It explicitly maps the &ldquo;Second Moment Approximation&rdquo; (Tight Binding) and the &ldquo;Glue Model&rdquo; onto the fundamental EAM framework above, clarifying that they differ primarily in terminology or specific functional choices (e.g., square root embedding functions).</li>
<li><strong>Cross-Potential Fitting Recipe</strong>: It details a robust method for fitting alloy potentials (specifically Ni-Al-B) by using &ldquo;transformation invariance&rdquo;, scaling the density and shifting the embedding function to fit alloy properties without disturbing pure element fits.</li>
<li><strong>Specific Parameters</strong>: It publishes optimized potential parameters for Ni, Al, and B that accurately reproduce properties like the Boron interstitial preference in $\text{Ni}_3\text{Al}$.</li>
</ul>
<h2 id="validation-computational-benchmarks-and-simulations">Validation: Computational Benchmarks and Simulations</h2>
<p>The &ldquo;experiments&rdquo; described are computational validations and simulations using the fitted Ni-Al-B potential:</p>
<ol>
<li>
<p><strong>Potential Fitting</strong>:</p>
<ul>
<li>Pure elements (Ni, Al) were fitted to elastic constants, vacancy formation energies, and diatomic data. The Ni fit achieved $\chi_{\text{rms}} = 0.75\%$ and the Al fit $\chi_{\text{rms}} = 3.85\%$.</li>
<li>Boron was fitted using hypothetical crystal structures (fcc, bcc) calculated via LMTO (Linear Muffin-Tin Orbital) since experimental data for fcc B does not exist.</li>
</ul>
</li>
<li>
<p><strong>Molecular Statics (Validation)</strong>:</p>
<ul>
<li><strong>Surface Relaxation</strong>: Demonstrated that EAM captures the oscillatory relaxation of atomic layers near a free surface, a many-body effect that pair potentials fail to capture.</li>
<li><strong>Defect Energetics</strong>: Calculated formation energies for Boron interstitials in $\text{Ni}_3\text{Al}$. Found the 6Ni-octahedral site is most stable ($-4.59$ eV relative to an isolated B atom and unperturbed crystal), followed by the 4Ni-2Al octahedral site ($-3.65$ eV) and the 3Ni-1Al tetrahedral site ($-2.99$ eV), consistent with channeling experiments.</li>
</ul>
</li>
<li>
<p><strong>Molecular Dynamics (Application)</strong>:</p>
<ul>
<li><strong>Grain Boundary (GB) Cleavage</strong>: Simulated the fracture of a (210) tilt grain boundary in $\text{Ni}_3\text{Al}$ at a strain rate of $5 \times 10^{10}$ s$^{-1}$.</li>
<li><strong>Comparison</strong>: Compared pure $\text{Ni}_3\text{Al}$ boundaries vs. those doped with Boron and substitutional Nickel.</li>
</ul>
</li>
</ol>
<h2 id="key-outcomes-eam-efficiency-and-boron-strengthening">Key Outcomes: EAM Efficiency and Boron Strengthening</h2>
<ul>
<li><strong>EAM Efficiency</strong>: Confirmed that EAM scales linearly with atom count ($N$), requiring only 2-5 times the computational work of pair potentials.</li>
<li><strong>Boron Strengthening Mechanism</strong>: The simulations suggested that Boron segregates to grain boundaries and, specifically when co-segregated with Ni, significantly increases cohesion.
<ul>
<li>The maximum stress for the enriched boundary was approximately 22 GPa, compared to approximately 19 GPa for the clean boundary.</li>
<li>The B-doped boundary required approximately 44% more work to cleave than the undoped boundary.</li>
<li>The fracture mode shifted from cleaving along the GB to failure in the bulk.</li>
</ul>
</li>
<li><strong>Grain Boundary Segregation</strong>: Molecular statics calculations found B interstitial energies at the GB as low as $-6.9$ eV, compared to $-4.59$ eV in the bulk, consistent with experimental observations of boron segregation to grain boundaries.</li>
<li><strong>Limitations</strong>: The author concludes that while EAM is excellent for metals, it lacks the angular dependence required for strongly covalent materials (like $\text{MoSi}_2$) or directional bonding.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<p>The chapter provides nearly all details required to implement the described potential from scratch.</p>
<h3 id="data">Data</h3>
<ul>
<li><strong>Experimental/Reference Data</strong>: Used for fitting the cost function $\chi_{\text{rms}}$.
<ul>
<li><strong>Pure Elements</strong>: Lattice constants ($a_0$), cohesive energy ($E_{\text{coh}}$), bulk modulus ($B$), elastic constants ($C_{11}, C_{12}, C_{44}$), vacancy formation energy ($E_{\text{vac}}^f$), and diatomic bond length/strength ($R_e, D_e$).</li>
<li><strong>Alloys</strong>: Heat of solution and defect energies (APB, SISF) for $\text{Ni}_3\text{Al}$.</li>
<li><strong>Hypothetical Data</strong>: LMTO first-principles data used for unobserved phases (e.g., fcc Boron, B2 NiB) to constrain the fit.</li>
</ul>
</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Component Functions</strong>:
<ul>
<li><strong>Pair Potential $\phi(r)$</strong>: Morse potential form:
$$\phi(r) = D_M \left\{1 - \exp[-\alpha_M(r - R_M)]\right\}^2 - D_M$$</li>
<li><strong>Density Function $\rho(r)$</strong>: Modified hydrogenic 4s orbital:
$$\rho(r) = r^6(e^{-\beta r} + 2^9 e^{-2\beta r})$$</li>
<li><strong>Embedding Function $F(\bar{\rho})$</strong>: Derived numerically to force the crystal energy to match the &ldquo;Universal Energy Relation&rdquo; (Rose et al.) as a function of lattice constant.</li>
</ul>
</li>
<li><strong>Fitting Strategy</strong>:
<ul>
<li><strong>Smooth Cutoff</strong>: A polynomial smoothing function ($h_{\text{smooth}}$) applied at $r_{\text{cut}}$ to ensure continuous derivatives.</li>
<li><strong>Simplex Algorithm</strong>: Used to optimize parameters ($D_M, R_M, \alpha_M, \beta, r_{\text{cut}}$).</li>
<li><strong>Alloy Invariance</strong>: Used transformations $F'(\rho) = F(\rho) + g\rho$ and $\rho'(r) = s\rho(r)$ to fit cross-potentials without altering pure-element properties.</li>
</ul>
</li>
</ul>
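<p>Taken together, the pieces above assemble into a complete energy model. The sketch below is my own illustration, not the chapter&rsquo;s code: it implements the Morse pair term and the hydrogenic density, and takes the embedding function $F$ as a callable, since the chapter derives $F$ numerically from the Universal Energy Relation. The $R_M$ and $\beta$ values one would pass in are placeholders here, not the fitted parameters of Tables 2 and 5.</p>

```python
import numpy as np

def morse_pair(r, D_M, alpha_M, R_M):
    """Morse pair potential: phi(r) = D_M {1 - exp[-alpha_M (r - R_M)]}^2 - D_M."""
    return D_M * (1.0 - np.exp(-alpha_M * (r - R_M))) ** 2 - D_M

def density(r, beta):
    """Modified hydrogenic 4s density: rho(r) = r^6 (e^{-beta r} + 2^9 e^{-2 beta r})."""
    return r ** 6 * (np.exp(-beta * r) + 2 ** 9 * np.exp(-2.0 * beta * r))

def eam_energy(positions, D_M, alpha_M, R_M, beta, F):
    """Total EAM energy: pairwise Morse terms plus embedding of summed densities."""
    n = len(positions)
    e_pair, rho_bar = 0.0, np.zeros(n)
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(positions[i] - positions[j])
            e_pair += morse_pair(r, D_M, alpha_M, R_M)
            rho_bar[i] += density(r, beta)
            rho_bar[j] += density(r, beta)
    return e_pair + sum(F(rb) for rb in rho_bar)
```

<p>With the embedding term switched off, a dimer at the Morse minimum separation recovers an energy of $-D_M$, a quick sanity check on the pair term. A production implementation would also apply the polynomial smoothing $h_{\text{smooth}}$ at $r_{\text{cut}}$.</p>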
<h3 id="models">Models</h3>
<ul>
<li><strong>Parameters</strong>: The text provides the exact optimized parameters for the Ni-Al-B potential in <strong>Table 2</strong> (Pure elements) and <strong>Table 5</strong> (Cross-potentials).
<ul>
<li>Example Ni parameters: $D_M=1.5335$ eV, $\alpha_M=1.7728$ Å$^{-1}$, $r_{\text{cut}}=4.7895$ Å.</li>
</ul>
</li>
</ul>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>1994 Context</strong>: Mentions that simulations of $10^6$ atoms were possible on the &ldquo;fastest computers available&rdquo;.</li>
<li><strong>Scaling</strong>: Explicitly notes computational work scales as $O(N)$, roughly 2-5x the cost of pair potentials.</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Voter, A. F. (1994). Chapter 4: The Embedded-Atom Method. In <em>Intermetallic Compounds: Vol. 1, Principles</em>, edited by J. H. Westbrook and R. L. Fleischer. John Wiley &amp; Sons Ltd.</p>
<p><strong>Publication</strong>: Intermetallic Compounds: Vol. 1, Principles (1994)</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@incollection</span>{voterEmbeddedAtomMethod1994,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The Embedded-Atom Method}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Voter, Arthur F.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{Intermetallic Compounds: Vol. 1, Principles}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">editor</span> = <span style="color:#e6db74">{Westbrook, J. H. and Fleischer, R. L.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1994}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{John Wiley &amp; Sons Ltd}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{77--90}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">chapter</span> = <span style="color:#e6db74">{4}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://www.ctcms.nist.gov/potentials/">NIST Interatomic Potentials Repository</a> (Modern repository often hosting EAM files)</li>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method/">Original EAM Paper (1984)</a></li>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method-review-1993/">EAM Review (1993)</a></li>
</ul>
]]></content:encoded></item><item><title>Dynamical Corrections to TST for Surface Diffusion</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/self-diffusion-lj-fcc111-1989/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/self-diffusion-lj-fcc111-1989/</guid><description>Application of dynamical corrections formalism to TST for LJ surface diffusion, revealing bounce-back recrossings at low T.</description><content:encoded><![CDATA[<h2 id="bridging-md-and-tst-for-surface-diffusion">Bridging MD and TST for Surface Diffusion</h2>
<p>This is primarily a <strong>Methodological Paper</strong> with a secondary contribution in <strong>Discovery</strong>.</p>
<p>The authors&rsquo; primary goal is to demonstrate the validity of the &ldquo;dynamical corrections formalism&rdquo; for calculating diffusion constants. They validate this by reproducing Molecular Dynamics (MD) results at high temperatures and then extending the method into low-temperature regimes where MD is infeasible.</p>
<p>By applying this method, they uncover a specific physical phenomenon, &ldquo;bounce-back recrossings&rdquo;, that causes a dip in the diffusion coefficient at low temperatures, a detail previously unobserved.</p>
<h2 id="timescale-limits-in-molecular-dynamics">Timescale Limits in Molecular Dynamics</h2>
<p>The authors aim to solve the timescale problem in simulating surface diffusion.</p>
<p><strong>Limit of MD</strong>: Molecular Dynamics (MD) is effective at high temperatures but becomes computationally infeasible at low temperatures because the time between diffusive hops increases drastically.</p>
<p><strong>Limit of TST</strong>: Standard Transition State Theory (TST) can handle long timescales but assumes all barrier crossings are successful, ignoring correlated dynamical events like immediate recrossings or multiple jumps.</p>
<p><strong>Goal</strong>: They seek to apply a formalism that corrects TST using short-time trajectory data, allowing for accurate calculation of diffusion constants across the entire temperature range.</p>
<h2 id="the-bounce-back-mechanism">The Bounce-Back Mechanism</h2>
<p>The core novelty is the rigorous application of the dynamical corrections formalism to a multi-site system (fcc/hcp sites) to characterize non-Arrhenius behavior at low temperatures.</p>
<p><strong>Unified Approach</strong>: They demonstrate that this method works for all temperatures, bridging the gap between the &ldquo;rare-event regime&rdquo; and the high-temperature regime dominated by fluid-like motion.</p>
<p><strong>Bounce-back Mechanism</strong>: They identify a specific &ldquo;dip&rdquo; in the dynamical correction factor ($f_d &lt; 1$) at low temperatures ($T \approx 0.038$), attributed to trajectories where the adatom collides with a substrate atom on the far side of the binding site and immediately recrosses the dividing surface.</p>
<h2 id="simulating-the-lennard-jones-fcc111-surface">Simulating the Lennard-Jones fcc(111) Surface</h2>
<p>The authors performed computational experiments on a Lennard-Jones fcc(111) surface cluster.</p>
<p><strong>System Setup</strong>: A single adatom on a 3-layer substrate (30 atoms/layer) with periodic boundary conditions.</p>
<p><strong>Baselines</strong>: They compared their high-temperature results against standard Molecular Dynamics simulations to validate the method.</p>
<p><strong>Ablation of Substrate Freedom</strong>: They ran a control experiment with a 6-layer substrate (top 3 free, 800 trajectories) to confirm the bounce-back effect persisted independently of the fixed deep layers, obtaining $D/D^{TST} = 0.75 \pm 0.06$, consistent with the original result.</p>
<p><strong>Trajectory Analysis</strong>: They analyzed the angular distribution of initial momenta to characterize the specific geometry of the bounce-back trajectories. Bounce-back trajectories were more strongly peaked at $\phi = 90°$ (perpendicular to the TST gate), confirming the effect arises from interaction with the substrate atom directly across the binding site.</p>
<p><strong>Temperature Range</strong>: The full calculation spanned $0.013 \leq T \leq 0.383$ in reduced units, bridging the rare-event regime and the high-temperature fluid-like regime.</p>
<h2 id="resolving-non-arrhenius-behavior">Resolving Non-Arrhenius Behavior</h2>
<p><strong>Arrhenius Behavior of TST</strong>: The uncorrected TST diffusion constant ($D^{TST}$) followed a near-perfect Arrhenius law, with a linear least-squares fit of $\ln(D^{TST}) = -1.8 - 0.30/T$.</p>
<p><strong>High-Temperature Correction</strong>: At high T, the dynamical correction factor $D/D^{TST} &gt; 1$, indicating correlated multiple forward jumps (long flights).</p>
<p><strong>Low-Temperature Dip</strong>: At low T, $D/D^{TST} &lt; 1$ for $T = 0.013, 0.026, 0.038, 0.051$ (minimum at $T = 0.038$), caused by the bounce-back mechanism.</p>
<p><strong>Validation</strong>: The method successfully reproduced high-T literature values while providing access to low-T dynamics inaccessible to direct MD.</p>
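<p>The reported fit and correction factor combine into a one-line recipe for $D$ at any temperature; a sketch in reduced units (the prefactor and slope are the paper&rsquo;s fitted values, the function names are mine):</p>

```python
import math

def d_tst(T, ln_prefactor=-1.8, activation=0.30):
    """Arrhenius fit to the TST diffusion constant (reduced units):
    ln(D^TST) = -1.8 - 0.30 / T, the linear fit reported in the paper."""
    return math.exp(ln_prefactor - activation / T)

def corrected_d(T, f_d):
    """Full diffusion constant: D = D^TST * (D / D^TST)."""
    return d_tst(T) * f_d

# Example: at T = 0.038, the correction factor is 0.82 +/- 0.04, an 18%
# reduction of the pure TST estimate due to bounce-back recrossings.
D_low = corrected_d(0.038, 0.82)
```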
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The paper does not use external datasets but generates simulation data based on the Lennard-Jones potential.</p>
<table>
  <thead>
      <tr>
          <th>Type</th>
          <th>Parameter</th>
          <th>Value</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Potential</strong></td>
          <td>$\epsilon, \sigma$</td>
          <td>1.0 (Reduced units)</td>
          <td>Standard Lennard-Jones 6-12</td>
      </tr>
      <tr>
          <td><strong>Cutoff</strong></td>
          <td>Spline</td>
          <td>$r_1=1.5\sigma, r_2=2.5\sigma$</td>
          <td>5th-order spline smooths potential to 0 at $r_2$</td>
      </tr>
      <tr>
          <td><strong>Geometry</strong></td>
          <td>Lattice Constant</td>
          <td>$a_0 = 1.549$</td>
          <td>Minimum energy for this potential</td>
      </tr>
      <tr>
          <td><strong>Cluster</strong></td>
          <td>Size</td>
          <td>3 layers, 30 atoms/layer</td>
          <td>Periodic boundary conditions parallel to surface</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>The diffusion constant $D$ is calculated as $D = D^{TST} \times (D/D^{TST})$.</p>
<p><strong>1. TST Rate Calculation ($D^{TST}$)</strong></p>
<ul>
<li><strong>Method</strong>: Monte Carlo integration of the flux through the dividing surface.</li>
<li><strong>Technique</strong>: Calculate free energy difference between the entire binding site and the TST dividing region.</li>
<li><strong>Dividing Surface</strong>: Defined geometrically with respect to equilibrium substrate positions (honeycomb boundaries around fcc/hcp sites).</li>
</ul>
<p><strong>2. Dynamical Correction Factor ($D/D^{TST}$)</strong></p>
<p>The method relies on evaluating the dynamical correction factor $f_d$, averaged over an ensemble of $N$ short trajectories launched from the dividing surface:</p>
<p>$$
\begin{aligned}
f_d(i\rightarrow j) = \frac{2}{N}\sum_{I=1}^{N}\eta_{ij}(I)
\end{aligned}
$$</p>
<ul>
<li><strong>Initialization</strong>:
<ul>
<li><strong>Position</strong>: Sampled via Metropolis walk restricted to the TST boundary region.</li>
<li><strong>Momentum</strong>: Maxwellian distribution for parallel components; Maxwellian-flux distribution for normal component.</li>
<li><strong>Symmetry</strong>: Trajectories entering hcp sites are generated by reversing momenta of those entering fcc sites.</li>
</ul>
</li>
<li><strong>Integration</strong>:
<ul>
<li><strong>Integrator</strong>: Adams-Bashforth-Moulton predictor-corrector formulas of orders 1 through 12.</li>
<li><strong>Duration</strong>: Integrated until time $t &gt; \tau_{corr}$ (approximately $\tau_{corr} \approx 13$ reduced time units).</li>
<li><strong>Sample Size</strong>: 1400 trajectories per temperature point (700 initially entering each type of site).</li>
</ul>
</li>
</ul>
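<p>The estimator itself is just a trajectory average; a schematic version of the expression above (names are mine; the $\eta_{ij}(I)$ values would come from the trajectory bookkeeping described in the Integration step):</p>

```python
import numpy as np

def dynamical_correction(eta):
    """f_d(i -> j) = (2 / N) * sum_I eta_ij(I), following the expression above."""
    eta = np.asarray(eta, dtype=float)
    return 2.0 * eta.sum() / eta.size

def with_error_bar(eta):
    """f_d together with a one-sigma statistical error over the N samples."""
    eta = np.asarray(eta, dtype=float)
    return 2.0 * eta.mean(), 2.0 * eta.std(ddof=1) / np.sqrt(eta.size)
```

<p>For example, if exactly half of the 1400 trajectories contribute $\eta_{ij} = 1$ and the rest 0, the estimator returns $f_d = 1$, i.e. no correction to TST.</p>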
<h3 id="models">Models</h3>
<ul>
<li><strong>System</strong>: Single component Lennard-Jones solid (Argon-like).</li>
<li><strong>Adsorbate</strong>: Single adatom on fcc(111) surface.</li>
<li><strong>Substrate Flexibility</strong>: Adatom plus top layer atoms are free to move. Layers 2 and 3 are fixed. (Validation run used 6 layers with top 3 free).</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<p>The primary metric is the Diffusion Constant $D$, analyzed via the Dynamical Correction Factor.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Value</th>
          <th>Baseline</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Slope ($E_a$)</strong></td>
          <td>0.30</td>
          <td>0.303 fcc / 0.316 hcp (Newton-Raphson)</td>
          <td>TST slope in good agreement with static barrier height.</td>
      </tr>
      <tr>
          <td><strong>$D/D^{TST}$ (Low T)</strong></td>
          <td>$0.82 \pm 0.04$</td>
          <td>1.0 (TST)</td>
          <td>At $T=0.038$. Indicates 18% reduction due to recrossing.</td>
      </tr>
      <tr>
          <td><strong>$D/D^{TST}$ (High T)</strong></td>
          <td>$&gt; 1.0$</td>
          <td>MD Literature</td>
          <td>Increases with T due to multiple jumps.</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<p>Specific hardware configurations (e.g., machine architectures, supercomputers) or run times were not specified in the original publication, which is typical for 1989 literature. Modern open-source MD engines (e.g., LAMMPS, ASE) could perform equivalent Lennard-Jones molecular dynamics integrations in negligible time on any consumer workstation.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Cohen, J. M., &amp; Voter, A. F. (1989). Self-diffusion on the Lennard-Jones fcc(111) surface: Effects of temperature on dynamical corrections. <em>The Journal of Chemical Physics</em>, 91(8), 5082-5086. <a href="https://doi.org/10.1063/1.457599">https://doi.org/10.1063/1.457599</a></p>
<p><strong>Publication</strong>: The Journal of Chemical Physics 1989</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{cohenSelfDiffusionLennard1989,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Self-diffusion on the {{Lennard}}-{{Jones}} Fcc(111) Surface: {{Effects}} of Temperature on Dynamical Corrections}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">shorttitle</span> = <span style="color:#e6db74">{Self-diffusion on the {{Lennard}}-{{Jones}} Fcc(111) Surface}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Cohen, J. M. and Voter, A. F.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1989}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = oct,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{The Journal of Chemical Physics}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{91}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{8}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{5082--5086}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{0021-9606, 1089-7690}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1063/1.457599}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">langid</span> = <span style="color:#e6db74">{english}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Correlations in the Motion of Atoms in Liquid Argon</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/correlations-motion-atoms-liquid-argon/</link><pubDate>Sat, 13 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/correlations-motion-atoms-liquid-argon/</guid><description>Rahman's 1964 MD simulation of 864 argon atoms with Lennard-Jones potential revealed the cage effect and validated classical molecular dynamics for liquids.</description><content:encoded><![CDATA[<h2 id="contribution-methodological-validation-of-md">Contribution: Methodological Validation of MD</h2>
<p>This is the archetypal <strong>Method</strong> paper (dominant classification with secondary <strong>Theory</strong> contribution). It establishes the architectural validity of Molecular Dynamics (MD) as a scientific tool. Rahman answers the question: &ldquo;Can a digital computer solving classical difference equations faithfully represent a physical liquid?&rdquo;</p>
<p>The paper utilizes specific rhetorical indicators of a methodological contribution:</p>
<ul>
<li><strong>Algorithmic Explication</strong>: A dedicated Appendix details the predictor-corrector difference equations.</li>
<li><strong>Validation against Ground Truth</strong>: Extensive comparison of calculated diffusion constants and pair-correlation functions against experimental neutron and X-ray scattering data.</li>
<li><strong>Robustness Checks</strong>: Ablation studies on the numerical integration stability (one vs. two corrector cycles).</li>
</ul>
<h2 id="motivation-bridging-neutron-scattering-and-many-body-theory">Motivation: Bridging Neutron Scattering and Many-Body Theory</h2>
<p>In the early 1960s, neutron scattering data provided insights into the dynamic structure of liquids, but theorists lacked concrete models to explain the observed two-body dynamical correlations. Analytic theories were limited by the difficulty of the many-body problem.</p>
<p>Rahman sought to bypass these analytical bottlenecks by assuming that <strong>classical dynamics</strong> with a simple 2-body potential (Lennard-Jones) could sufficiently describe the motion of atoms in liquid argon. The goal was to generate &ldquo;experimental&rdquo; data via simulation to test theoretical models (like the Vineyard convolution approximation) and provide a microscopic understanding of diffusion.</p>
<h2 id="core-innovation-system-stability-and-the-cage-effect">Core Innovation: System Stability and the Cage Effect</h2>
<p>This paper is widely considered the birth of modern molecular dynamics for continuous potentials. Its key novelties include:</p>
<ol>
<li><strong>System Size &amp; Stability</strong>: Successfully simulating 864 particles interacting via a continuous Lennard-Jones potential with stable temperature over the full simulation duration (approximately $10^{-11}$ sec, as confirmed by Table I in the paper).</li>
<li><strong>The &ldquo;Cage Effect&rdquo;</strong>: The discovery that the velocity autocorrelation function becomes negative after a short time:
$$ \langle \mathbf{v}(0) \cdot \mathbf{v}(t) \rangle &lt; 0 \quad \text{for } t &gt; 0.33 \times 10^{-12} \text{ s} $$
This proved that atoms in a liquid &ldquo;rattle&rdquo; against the cage of their nearest neighbors.</li>
<li><strong>Delayed Convolution</strong>: Proposing an improvement to the Vineyard approximation for the distinct Van Hove function $G_d(r,t)$ by introducing a time-delayed convolution to account for the persistence of local structure. Instead of convolving $g(r)$ with $G_s(r,t)$ at the same time $t$, Rahman convolves at a delayed time $t' &lt; t$, using a one-parameter function with $\tau = 1.0 \times 10^{-12}$ sec. This makes $G_d(r,t)$ decay as $t^4$ at short times (instead of $t^2$ in the Vineyard approximation) and as $t$ at long times.</li>
</ol>
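<p>The cage effect is read directly off the velocity autocorrelation function. A minimal modern estimator over a stored trajectory (a convenience sketch, not Rahman&rsquo;s code):</p>

```python
import numpy as np

def velocity_autocorrelation(v, max_lag):
    """Normalized VACF C(t) = <v(0) . v(t)> / <v(0) . v(0)> from a
    velocity trajectory v of shape (steps, atoms, 3), averaged over
    atoms and time origins."""
    steps = v.shape[0]
    c = np.empty(max_lag)
    for lag in range(max_lag):
        dots = np.sum(v[: steps - lag] * v[lag:], axis=-1)  # per-atom dot products
        c[lag] = dots.mean()
    return c / c[0]
```

<p>A negative $C(t)$, as Rahman found beyond $0.33 \times 10^{-12}$ s, means the average atom has reversed direction: the signature of rattling inside the neighbor cage.</p>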
<h2 id="methodology-simulating-864-argon-atoms">Methodology: Simulating 864 Argon Atoms</h2>
<p>Rahman performed a &ldquo;computer experiment&rdquo; (simulation) of <strong>Liquid Argon</strong>:</p>
<ul>
<li><strong>System</strong>: 864 particles in a cubic box of side $L=10.229\sigma$.</li>
<li><strong>Conditions</strong>: Temperature $94.4^\circ$K, Density $1.374 \text{ g cm}^{-3}$.</li>
<li><strong>Interaction</strong>: Lennard-Jones potential, truncated at $R=2.25\sigma$.</li>
<li><strong>Time Step</strong>: $\Delta t = 10^{-14}$ s (780 steps total, covering approximately $7.8 \times 10^{-12}$ s).</li>
<li><strong>Output Analysis</strong>:
<ul>
<li>Radial distribution function $g(r)$.</li>
<li>Mean square displacement $\langle r^2 \rangle$.</li>
<li>Velocity autocorrelation function $\langle v(0)\cdot v(t) \rangle$.</li>
<li>Van Hove space-time correlation functions $G_s(r,t)$ and $G_d(r,t)$.</li>
</ul>
</li>
</ul>
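<p>Of these observables, the pair-distribution function is the most direct structural check against scattering data. A compact histogram estimator for a periodic cubic box (an illustrative reimplementation; function and argument names are mine):</p>

```python
import numpy as np

def pair_distribution(positions, L, nbins=50, r_max=None):
    """g(r) from minimum-image pair distances in a cubic box of side L;
    positions is an (N, 3) array. Normalized against the ideal-gas count."""
    n = len(positions)
    if r_max is None:
        r_max = L / 2.0
    d = positions[:, None, :] - positions[None, :, :]
    d -= L * np.round(d / L)                       # minimum-image convention
    dist = np.sqrt((d ** 2).sum(-1))[np.triu_indices(n, k=1)]
    hist, edges = np.histogram(dist, bins=nbins, range=(0.0, r_max))
    shells = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    rho = n / L ** 3
    g = hist / (shells * rho * n / 2.0)            # ideal-gas normalization
    return 0.5 * (edges[1:] + edges[:-1]), g
```

<p>For an ideal gas this returns $g(r) \approx 1$ everywhere; for Rahman&rsquo;s liquid it would resolve the first-neighbor peak near $3.7$ Å.</p>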
<h2 id="results-validation-and-non-gaussian-diffusion-analysis">Results: Validation and Non-Gaussian Diffusion Analysis</h2>
<ul>
<li><strong>Validation</strong>: The calculated pair-distribution function $g(r)$ agreed well with X-ray scattering data from Eisenstein and Gingrich (at $91.8^\circ$K). The self-diffusion constant $D = 2.43 \times 10^{-5} \text{ cm}^2 \text{ sec}^{-1}$ at $94.4^\circ$K matched the experimental value from Naghizadeh and Rice at $90^\circ$K and the same density ($1.374 \text{ g cm}^{-3}$).</li>
<li><strong>Dynamics</strong>: The velocity autocorrelation has a negative region, contradicting simple exponential decay models (Langevin). Its frequency spectrum $f(\omega)$ shows a broad maximum at $\omega \approx 0.25 (k_BT/\hbar)$, reminiscent of solid-like behavior.</li>
<li><strong>Non-Gaussian Behavior</strong>: The self-diffusion function $G_s(r,t)$ attains its maximum departure from a Gaussian shape at about $t \approx 3.0 \times 10^{-12}$ s (with $\langle r^4 \rangle$ departing from its Gaussian value by about 13%), returning to Gaussian form by $\sim 10^{-11}$ s. At that time, the rms displacement ($3.8$ Angstrom) is close to the first-neighbor distance ($3.7$ Angstrom). This indicates that Fickian diffusion is an asymptotic limit and does not apply at short times.</li>
<li><strong>Fourier Transform Validation</strong>: The Fourier transform of $g(r)$ has peaks at $\kappa\sigma = 6.8$, 12.5, 18.5, 24.8, closely matching the X-ray scattering peaks at $\kappa\sigma = 6.8$, 12.3, 18.4, 24.4.</li>
<li><strong>Temperature Dependence</strong>: A second simulation at $130^\circ$K and $1.16 \text{ g cm}^{-3}$ yielded $D = 5.67 \times 10^{-5} \text{ cm}^2 \text{ sec}^{-1}$, compared to the experimental value of $6.06 \times 10^{-5} \text{ cm}^2 \text{ sec}^{-1}$ from Naghizadeh and Rice at $120^\circ$K and $1.16 \text{ g cm}^{-3}$. The paper notes that both calculated values are lower than experiment by about 20%, and suggests that allowing for a softer repulsive part in the interaction potential might reduce this discrepancy.</li>
<li><strong>Vineyard Approximation</strong>: The standard Vineyard convolution approximation ($G_d \approx g * G_s$) produces a too-rapid decay of $G_d(r,t)$ with time. The delayed convolution, matching pairs of $(t', t)$ in units of $10^{-12}$ sec as (0.2, 0.4), (0.5, 0.8), (1.0, 1.6), (1.5, 2.3), (2.0, 2.9), (2.5, 3.5), provides a substantially better fit.</li>
<li><strong>Conclusion</strong>: Classical N-body dynamics with a truncated pair potential is a sufficient model to reproduce both the structural and dynamical properties of simple liquids.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The simulation uses physical constants for Argon:</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Particle Mass ($M$)</td>
          <td>$39.95 \times 1.6747 \times 10^{-24}$ g</td>
          <td>Mass of Argon atom</td>
      </tr>
      <tr>
          <td>Potential Depth ($\epsilon/k_B$)</td>
          <td>$120^\circ$K</td>
          <td>Lennard-Jones parameter</td>
      </tr>
      <tr>
          <td>Potential Size ($\sigma$)</td>
          <td>$3.4$ Å</td>
          <td>Lennard-Jones parameter</td>
      </tr>
      <tr>
          <td>Cutoff Radius ($R$)</td>
          <td>$2.25\sigma$</td>
          <td>Potential truncated beyond this</td>
      </tr>
      <tr>
          <td>Density ($\rho$)</td>
          <td>$1.374$ g cm$^{-3}$</td>
          <td></td>
      </tr>
      <tr>
          <td>Particle Count ($N$)</td>
          <td>864</td>
          <td></td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>Rahman utilized a <strong>Predictor-Corrector</strong> scheme for solving the second-order differential equations of motion.</p>
<p><strong>Step Size</strong>: $\Delta t = 10^{-14}$ sec.</p>
<p><strong>The Algorithm:</strong></p>
<ol>
<li><strong>Predict</strong> positions $\bar{\xi}$ at $t + \Delta t$ based on previous steps:
$$\bar{\xi}_i^{(n+1)} = \xi_i^{(n-1)} + 2\Delta u \eta_i^{(n)}$$</li>
<li><strong>Calculate Forces</strong> (Accelerations $\alpha$) using predicted positions.</li>
<li><strong>Correct</strong> positions and velocities using the trapezoidal rule:
$$
\begin{aligned}
\eta_i^{(n+1)} &amp;= \eta_i^{(n)} + \frac{1}{2}\Delta u (\alpha_i^{(n+1)} + \alpha_i^{(n)}) \\
\xi_i^{(n+1)} &amp;= \xi_i^{(n)} + \frac{1}{2}\Delta u (\eta_i^{(n+1)} + \eta_i^{(n)})
\end{aligned}
$$</li>
</ol>
<p><em>Note: The paper compared one vs. two repetitions of the corrector step, finding that two passes improved precision slightly. The results presented in the paper were obtained using two passes.</em></p>
<h3 id="models">Models</h3>
<p><strong>Interaction Potential</strong>: Lennard-Jones 12-6
$$V(r_{ij}) = 4\epsilon \left[ \left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^6 \right]$$</p>
<p><strong>Boundary Conditions</strong>: Periodic Boundary Conditions (PBC) in 3 dimensions. When a particle moves out of the box ($x &gt; L$), it re-enters at $x - L$.</p>
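<p>A minimal rendering of the interaction model; the minimum-image wrap is the standard way to realize the stated periodic re-entry when computing pair distances (an assumption of this sketch, since the paper states only the re-entry rule):</p>

```python
import numpy as np

def lj(r, eps=1.0, sigma=1.0, r_cut=2.25):
    """Lennard-Jones 12-6 in reduced units, truncated at R = 2.25 sigma."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma / r) ** 6
    return np.where(r < r_cut * sigma, 4.0 * eps * (sr6 ** 2 - sr6), 0.0)

def minimum_image(dx, L):
    """Wrap a coordinate difference to the nearest periodic image, so a
    particle leaving at x > L interacts as if re-entered at x - L."""
    return dx - L * np.round(dx / L)
```

<p>At the potential minimum $r = 2^{1/6}\sigma$ the energy is $-\epsilon$, and the abrupt cutoff at $2.25\sigma$ matches the truncation Rahman used (later practice often shifts or splines the cutoff instead).</p>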
<h3 id="hardware">Hardware</h3>
<p>This is a historical benchmark for computational capability in 1964:</p>
<table>
  <thead>
      <tr>
          <th>Resource</th>
          <th>Specification</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Computer</strong></td>
          <td>CDC 3600</td>
          <td>Control Data Corporation mainframe</td>
      </tr>
      <tr>
          <td><strong>Compute Time</strong></td>
          <td>45 seconds / cycle</td>
          <td>Per predictor-corrector cycle for 864 particles (floating point)</td>
      </tr>
      <tr>
          <td><strong>Language</strong></td>
          <td>FORTRAN + Machine Language</td>
          <td>Machine language used for the most time-consuming parts</td>
      </tr>
  </tbody>
</table>
<p><em>Modern Context: Rahman&rsquo;s system (864 Argon atoms, LJ-potential) is highly reproducible today and serves as a classic pedagogical exercise. It can be simulated in standard MD frameworks (LAMMPS, OpenMM) in fractions of a second on consumer hardware.</em></p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Rahman, A. (1964). Correlations in the Motion of Atoms in Liquid Argon. <em>Physical Review</em>, 136(2A), A405-A411. <a href="https://doi.org/10.1103/PhysRev.136.A405">https://doi.org/10.1103/PhysRev.136.A405</a></p>
<p><strong>Publication</strong>: Physical Review 1964</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{rahman1964correlations,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Correlations in the motion of atoms in liquid argon}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Rahman, A.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Physical Review}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{136}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{2A}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{A405--A411}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1964}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{APS}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1103/PhysRev.136.A405}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Aneesur_Rahman">Aneesur Rahman - Wikipedia</a></li>
</ul>
]]></content:encoded></item><item><title>Adatom Dimer Diffusion on fcc(111) Crystal Surfaces</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/diffusion-adatom-dimers-1984/</link><pubDate>Sat, 13 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/diffusion-adatom-dimers-1984/</guid><description>A 1984 molecular dynamics study identifying simultaneous multiple jumps in adatom dimer diffusion on fcc(111) surfaces.</description><content:encoded><![CDATA[<h2 id="classification-discovery-of-diffusion-mechanisms">Classification: Discovery of Diffusion Mechanisms</h2>
<p><strong>Discovery (Translational Basis)</strong></p>
<p>This paper applies a computational method (Molecular Dynamics) to observe and characterize a physical phenomenon: the specific diffusion mechanisms of adatom dimers on a crystal surface. It focuses on the &ldquo;what was found&rdquo; (simultaneous multiple jumps).</p>
<p>Based on the <a href="/notes/interdisciplinary/research-methods/ai-physical-sciences-paper-taxonomy/">AI for Physical Sciences Paper Taxonomy</a>, this is best classified as $\Psi_{\text{Discovery}}$ with a minor superposition of $\Psi_{\text{Method}}$ (approximately 80% Discovery, 20% Method). The dominant contribution is the application of computational tools to observe physical phenomena, while secondarily demonstrating MD&rsquo;s capability for surface diffusion problems in an era when the technique was still developing.</p>
<h2 id="bridging-the-intermediate-temperature-data-gap">Bridging the Intermediate Temperature Data Gap</h2>
<p>The study aims to investigate the behavior of adatom dimers in an <strong>intermediate temperature range</strong> ($0.3T_m$ to $0.6T_m$). At the time, Field Ion Microscopy (FIM) provided data at low temperatures ($T \le 0.2T_m$), and previous simulations had studied single adatoms on various surfaces including (111), (110), and (100), but not dimers on (111). The authors sought to compare dimer mobility with single adatom mobility on the (111) surface, where single adatoms move almost like free particles.</p>
<h2 id="observation-of-simultaneous-multiple-jumps">Observation of Simultaneous Multiple Jumps</h2>
<p>The core contribution is the observation of <strong>simultaneous multiple jumps</strong> for dimers on the (111) surface at intermediate temperatures. The study reveals that:</p>
<ol>
<li>Dimers migrate as a single entity, with both atoms jumping simultaneously.</li>
<li>The mobility of the dimer center of mass is very close to that of single adatoms in this regime.</li>
</ol>
<h2 id="molecular-dynamics-simulation-design">Molecular Dynamics Simulation Design</h2>
<p>The authors performed <strong>Molecular Dynamics (MD) simulations</strong> of a face-centred cubic (fcc) crystallite:</p>
<ul>
<li><strong>System</strong>: A single crystallite of 192 atoms bounded by two free (111) surfaces</li>
<li><strong>Temperature Range</strong>: $0.22 \epsilon/k$ to $0.40 \epsilon/k$ (approximately $0.3T_m$ to $0.6T_m$)</li>
<li><strong>Duration</strong>: Integration over 50,000 time steps</li>
<li><strong>Comparison</strong>: Results were compared against single adatom diffusion data and Einstein&rsquo;s diffusion relation</li>
</ul>
<h2 id="outcomes-on-mobility-and-migration-dynamics">Outcomes on Mobility and Migration Dynamics</h2>
<ul>
<li><strong>Mechanism Transition</strong>: At low temperatures ($T^\ast=0.22$), diffusion occurs via discrete single jumps where adatoms rotate or extend bonds. At higher temperatures, the &ldquo;multiple jump&rdquo; mechanism becomes preponderant.</li>
<li><strong>Migration Style</strong>: The dimer migrates essentially by extending its bond along the $\langle 110 \rangle$ direction.</li>
<li><strong>Mobility</strong>: The diffusion coefficient of dimers is quantitatively similar to single adatoms.</li>
<li><strong>Qualitative Support</strong>: The results support Bonzel&rsquo;s hypothesis of delocalized diffusion involving energy transfer between translation and rotation. The authors attempted to quantify the coupling using the cross-correlation function:</li>
</ul>
<p>$$g(t') = C \langle E_T(t) \, E_R(t + t') \rangle$$</p>
<p>where $C$ is a normalization constant, $E_T$ is the translational energy of the center of mass, and $E_R$ is the rotational energy of the dimer. However, the average lifetime of a dimer (2% to 15% of the total calculation time in the studied temperature range) was too short to allow a statistically significant study of this coupling.</p>
<ul>
<li><strong>Dimer Concentration</strong>: The contribution of dimers to mass transport depends on their concentration. As a first approximation, the dimer concentration is expressed as:</li>
</ul>
<p>$$C = C_0 \exp\left[-\frac{2E_f - E_d}{k_B T}\right]$$</p>
<p>where $E_f$ is the formation energy of adatoms and $E_d$ is the binding energy of a dimer. If the binding energy is sufficiently strong, dimer contributions should be accounted for even in the intermediate temperature range ($0.3T_m$ to $0.6T_m$).</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data-simulation-setup">Data (Simulation Setup)</h3>
<p>Because this is an early computational study, &ldquo;data&rdquo; refers to the initial structural configuration. The simulation begins with an algorithmically generated generic fcc(111) lattice containing two adatoms as the initial state.</p>















<figure class="post-figure center ">
    <img src="/img/notes/chemistry/argon-dimer-diffusion.webp"
         alt="Visualization of argon dimer on fcc(111) surface"
         title="Visualization of argon dimer on fcc(111) surface"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Initial configuration showing an adatom dimer (two adatoms on neighboring sites) on an fcc(111) surface. The crystallite consists of 192 atoms with periodic boundary conditions in the x and y directions.</figcaption>
    
</figure>

<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Particles</strong></td>
          <td>192 atoms</td>
          <td>Single fcc crystallite</td>
      </tr>
      <tr>
          <td><strong>Dimensions</strong></td>
          <td>$4[110] \times 4[112]$</td>
          <td>Thickness of 6 planes</td>
      </tr>
      <tr>
          <td><strong>Boundary</strong></td>
          <td>Periodic (x, y)</td>
          <td>Free surface in z-direction</td>
      </tr>
      <tr>
          <td><strong>Initial State</strong></td>
          <td>Dimer on neighbor sites</td>
          <td>Starts with 2 adatoms</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>The simulation relies on standard Molecular Dynamics integration techniques. The original source code is not available, but the setup is fully reproducible today with modern open-source tools such as LAMMPS, using the standard <code>lj/cut</code> pair style and an NVE/NVT ensemble.</p>
<ul>
<li><strong>Integration Scheme</strong>: Central difference algorithm (Verlet algorithm)</li>
<li><strong>Time Step</strong>: $\Delta t^\ast = 0.01$ (reduced units)</li>
<li><strong>Total Steps</strong>: 50,000 integration steps</li>
<li><strong>Dimer Definition</strong>: Two adatoms are considered a dimer if their distance $r \le r_c = 2\sigma$</li>
</ul>
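<p>The central-difference (Verlet) scheme listed above advances positions as $x_{n+1} = 2x_n - x_{n-1} + a_n \Delta t^2$. A minimal one-dimensional sketch in reduced units, exercised here on a harmonic oscillator rather than the paper's Lennard-Jones crystallite:</p>

```python
import numpy as np

def verlet_trajectory(force, x0, v0, dt, n_steps, mass=1.0):
    """Central-difference (Verlet) integration: x_{n+1} = 2 x_n - x_{n-1} + a_n dt^2."""
    xs = np.empty(n_steps + 1)
    xs[0] = x0
    # Bootstrap the first step with a second-order Taylor expansion.
    xs[1] = x0 + v0 * dt + 0.5 * force(x0) / mass * dt**2
    for n in range(1, n_steps):
        xs[n + 1] = 2 * xs[n] - xs[n - 1] + force(xs[n]) / mass * dt**2
    return xs
```

<p>With $\Delta t^\ast = 0.01$, the paper's 50,000-step run corresponds to 500 reduced time units.</p>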
<h3 id="models-analytic-potential">Models (Analytic Potential)</h3>
<p>The physics is modeled with a classic Lennard-Jones potential.</p>
<p><strong>Potential Form</strong>: (12, 6) Lennard-Jones
$$ V(r) = 4\epsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^6 \right] $$</p>
<p><strong>Parameters (Argon-like)</strong>:</p>
<ul>
<li>$\epsilon/k = 119.5$ K</li>
<li>$\sigma = 3.4478$ Å</li>
<li>$m = 39.948$ amu</li>
<li>Cut-off radius: $2\sigma$</li>
</ul>
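<p>A minimal sketch of this truncated (12,6) potential in reduced units (plain numpy; not the original 1984 code):</p>

```python
import numpy as np

# Paper's argon parameters: epsilon/k_B = 119.5 K, sigma = 3.4478 Angstrom,
# cut-off at r_cut = 2 sigma. In reduced units, eps = sigma = 1 and r_cut = 2.

def lj_energy(r, eps=1.0, sigma=1.0, r_cut=2.0):
    """Truncated (12,6) Lennard-Jones energy; zero beyond the cut-off."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma / r) ** 6
    v = 4.0 * eps * (sr6**2 - sr6)
    return np.where(r < r_cut, v, 0.0)

def lj_force(r, eps=1.0, sigma=1.0, r_cut=2.0):
    """Radial force -dV/dr (positive values are repulsive)."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma / r) ** 6
    f = 24.0 * eps * (2.0 * sr6**2 - sr6) / r
    return np.where(r < r_cut, f, 0.0)
```

<p>In reduced units the minimum sits at $r = 2^{1/6}\sigma$ with depth $-\epsilon$, and both energy and force are set to zero beyond the $2\sigma$ cut-off.</p>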
<h3 id="evaluation">Evaluation</h3>
<p>Metrics used to quantify the diffusion behavior:</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Formula</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Diffusion Coefficient</strong></td>
          <td>$D = \frac{\langle R^2 \rangle}{4t}$</td>
          <td>Calculated from Mean Square Displacement of center of mass</td>
      </tr>
      <tr>
          <td><strong>Trajectory Analysis</strong></td>
          <td>Visual inspection</td>
          <td>Categorized into &ldquo;fast migration&rdquo; (multiple jumps) or &ldquo;discrete jumps&rdquo;</td>
      </tr>
  </tbody>
</table>
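<p>The diffusion coefficient in the table follows from the mean-square displacement of the center of mass; a sketch of the estimator (a simple single-origin version, not the windowed-origin averaging used in production MD analysis):</p>

```python
import numpy as np

def diffusion_coefficient_2d(positions, dt):
    """Estimate D = <R^2> / (4 t) from a 2D center-of-mass trajectory.

    positions: array of shape (n_steps, 2); displacements are measured
    from the initial position only (a deliberately simple estimator).
    """
    disp = positions - positions[0]
    msd = (disp**2).sum(axis=1)
    t = np.arange(len(positions)) * dt
    # Average of the per-sample <R^2>/t values, then divide by 4 (2D).
    return (msd[1:] / t[1:]).mean() / 4.0
```
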
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Specifics</strong>: Unspecified in the original text.</li>
<li><strong>Scale</strong>: 192 particles simulated for 50,000 steps is extremely lightweight by modern standards; a typical laptop CPU completes the run in under a second, in stark contrast to the mainframe resources required in 1984.</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Ghaleb, D. (1984). Diffusion of adatom dimers on (111) surface of face centred crystals: A molecular dynamics study. <em>Surface Science</em>, 137(2-3), L103-L108. <a href="https://doi.org/10.1016/0039-6028(84)90515-6">https://doi.org/10.1016/0039-6028(84)90515-6</a></p>
<p><strong>Publication</strong>: Surface Science 1984</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{ghalebDiffusionAdatomDimers1984,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Diffusion of Adatom Dimers on (111) Surface of Face Centred Crystals: A Molecular Dynamics Study}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Ghaleb, Dominique}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1984}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Surface Science}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{137}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{2-3}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{L103-L108}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1016/0039-6028(84)90515-6}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>The Müller-Brown Potential: A 2D Benchmark Surface</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/muller-brown-1979/</link><pubDate>Mon, 08 Sep 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/muller-brown-1979/</guid><description>The Müller-Brown potential is a classic 2D benchmark for testing optimization algorithms and molecular dynamics methods.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>The Müller-Brown potential is a primary benchmark system in computational chemistry: a two-dimensional analytical surface used to evaluate optimization algorithms. Introduced by Klaus Müller and Leo D. Brown in 1979 as a test system for their constrained simplex optimization algorithm, this potential energy function captures the essential topology of chemical reaction landscapes while preserving computational efficiency.</p>
<p><strong>Origin</strong>: Müller, K., &amp; Brown, L. D. (1979). Location of saddle points and minimum energy paths by a constrained simplex optimization procedure. <em>Theoretica Chimica Acta</em>, 53, 75-93. The potential is introduced in footnote 7 (p. 79) as a two-parametric model surface for testing the constrained simplex procedures.</p>
<h2 id="mathematical-definition">Mathematical Definition</h2>
<p>The Müller-Brown potential combines four two-dimensional Gaussian functions:</p>
<p>$$V(x,y) = \sum_{k=1}^{4} A_k \exp\left[a_k(x-x_k^0)^2 + b_k(x-x_k^0)(y-y_k^0) + c_k(y-y_k^0)^2\right]$$</p>
<p>Each Gaussian contributes a different &ldquo;bump&rdquo; or &ldquo;well&rdquo; to the landscape. The parameters control amplitude ($A_k$), width, orientation, and center position.</p>
<h3 id="standard-parameters">Standard Parameters</h3>
<p>The canonical parameter values that define the Müller-Brown surface are:</p>
<table>
  <thead>
      <tr>
          <th>k</th>
          <th>$A_k$</th>
          <th>$a_k$</th>
          <th>$b_k$</th>
          <th>$c_k$</th>
          <th>$x_k^0$</th>
          <th>$y_k^0$</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>1</td>
          <td>-200</td>
          <td>-1</td>
          <td>0</td>
          <td>-10</td>
          <td>1</td>
          <td>0</td>
      </tr>
      <tr>
          <td>2</td>
          <td>-100</td>
          <td>-1</td>
          <td>0</td>
          <td>-10</td>
          <td>0</td>
          <td>0.5</td>
      </tr>
      <tr>
          <td>3</td>
          <td>-170</td>
          <td>-6.5</td>
          <td>11</td>
          <td>-6.5</td>
          <td>-0.5</td>
          <td>1.5</td>
      </tr>
      <tr>
          <td>4</td>
          <td>15</td>
          <td>0.7</td>
          <td>0.6</td>
          <td>0.7</td>
          <td>-1</td>
          <td>1</td>
      </tr>
  </tbody>
</table>
<p>The first three terms have negative amplitudes (creating energy wells), while the fourth has a positive amplitude (creating a barrier). The cross-term $b_k$ in the third Gaussian creates the tilted orientation that gives the surface its characteristic curved pathways.</p>
<h3 id="analytical-gradients-forces">Analytical Gradients (Forces)</h3>
<p>To optimize paths or run molecular dynamics on this surface, the spatial derivatives (whose negatives are the forces) are needed, and they are straightforward to compute. Defining $G_k(x,y)$ as the argument of the $k$-th exponential, the partial derivatives with respect to $x$ and $y$ are:</p>
<p>$$ \frac{\partial V}{\partial x} = \sum_{k=1}^4 A_k \exp[G_k(x,y)] \cdot \left[ 2a_k(x-x_k^0) + b_k(y-y_k^0) \right] $$</p>
<p>$$ \frac{\partial V}{\partial y} = \sum_{k=1}^4 A_k \exp[G_k(x,y)] \cdot \left[ b_k(x-x_k^0) + 2c_k(y-y_k^0) \right] $$</p>
<h2 id="energy-landscape">Energy Landscape</h2>
<p>This simple formula creates a surprisingly rich topography with exactly the features needed to challenge optimization algorithms:</p>
<table>
  <thead>
      <tr>
          <th><strong>Stationary Point</strong></th>
          <th><strong>Coordinates</strong></th>
          <th><strong>Energy</strong></th>
          <th><strong>Type</strong></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>MA (Reactant)</td>
          <td>(-0.558, 1.442)</td>
          <td>-146.70</td>
          <td>Deep minimum</td>
      </tr>
      <tr>
          <td>MC (Intermediate)</td>
          <td>(-0.050, 0.467)</td>
          <td>-80.77</td>
          <td>Shallow minimum</td>
      </tr>
      <tr>
          <td>MB (Product)</td>
          <td>(0.623, 0.028)</td>
          <td>-108.17</td>
          <td>Medium minimum</td>
      </tr>
      <tr>
          <td>S1</td>
          <td>(-0.822, 0.624)</td>
          <td>-40.67</td>
          <td>First saddle point</td>
      </tr>
      <tr>
          <td>S2</td>
          <td>(0.212, 0.293)</td>
          <td>-72.25</td>
          <td>Second saddle point</td>
      </tr>
  </tbody>
</table>
<p>All values from Table 1 of Müller &amp; Brown (1979).</p>















<figure class="post-figure center ">
    <img src="/img/muller-brown/muller-brown-potential-surface.webp"
         alt="Müller-Brown Potential Energy Surface showing the three minima (dark blue regions) and two saddle points"
         title="Müller-Brown Potential Energy Surface showing the three minima (dark blue regions) and two saddle points"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The Müller-Brown potential energy surface showing the three minima (dark blue regions) and two saddle points.</figcaption>
    
</figure>

<h3 id="key-challenge-curved-reaction-pathways">Key Challenge: Curved Reaction Pathways</h3>
<p>The path from the deep reactant minimum (MA) to the product minimum (MB) follows a curved two-step pathway:</p>
<ol>
<li><strong>MA → S1 → MC</strong>: First transition over a lower barrier into an intermediate basin</li>
<li><strong>MC → S2 → MB</strong>: Second transition over a slightly higher barrier to the product</li>
</ol>
<p>This curved pathway breaks linear interpolation methods. Algorithms that draw a straight line from reactant to product miss both the intermediate minimum and the correct transition states, climbing over much higher energy regions instead.</p>
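<p>This failure mode is easy to demonstrate numerically: sampling the potential along the straight segment from MA to MB reaches energies far above either true saddle point ($-40.67$ and $-72.25$). A sketch, with the potential re-defined inline so the snippet is self-contained:</p>

```python
import numpy as np

PARAMS = [(-200.0, -1.0, 0.0, -10.0, 1.0, 0.0),
          (-100.0, -1.0, 0.0, -10.0, 0.0, 0.5),
          (-170.0, -6.5, 11.0, -6.5, -0.5, 1.5),
          (15.0, 0.7, 0.6, 0.7, -1.0, 1.0)]

def mb(x, y):
    return sum(A * np.exp(a * (x - x0)**2 + b * (x - x0) * (y - y0) + c * (y - y0)**2)
               for A, a, b, c, x0, y0 in PARAMS)

# Straight-line "path" from the reactant minimum MA to the product minimum MB.
ma, mb_min = np.array([-0.558, 1.442]), np.array([0.623, 0.028])
s = np.linspace(0.0, 1.0, 201)[:, None]
line = ma + s * (mb_min - ma)
energies = mb(line[:, 0], line[:, 1])

barrier = energies.max() - energies[0]  # apparent barrier from MA along the line
```

<p>The maximum along this line is positive, well above both saddle energies, so the apparent barrier exceeds the true first barrier (S1 minus MA, roughly 106 energy units) by a wide margin.</p>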
<h2 id="why-it-works-as-a-benchmark">Why It Works as a Benchmark</h2>
<p>The Müller-Brown potential has served as a computational chemistry benchmark for over four decades because of four key characteristics:</p>
<p><strong>Low dimensionality</strong>: As a 2D surface, it permits complete visualization of the landscape, clearly revealing why specific algorithms succeed or fail.</p>
<p><strong>Analytical form</strong>: Energy and gradient calculations cost virtually nothing, enabling exhaustive testing impossible with quantum mechanical surfaces.</p>
<p><strong>Non-trivial topology</strong>: The curved minimum energy path and shallow intermediate minimum challenge sophisticated methods while remaining manageable.</p>
<p><strong>Known ground truth</strong>: All minima and saddle points are precisely known, providing unambiguous success metrics.</p>
<h3 id="contrast-with-other-benchmarks">Contrast with Other Benchmarks</h3>
<p>The Müller-Brown potential probes different capabilities than other classic benchmarks. The Lennard-Jones potential, with its single energy minimum, is the standard for equilibrium properties; the Müller-Brown surface, by contrast, explicitly models reactive landscapes. Its multiple minima and connecting barriers form an evaluation environment for algorithms designed to discover transition states and reaction paths.</p>
<h2 id="historical-applications">Historical Applications</h2>
<p>The potential has evolved with the field&rsquo;s changing focus:</p>
<p><strong>1980s-1990s</strong>: Testing path-finding methods like Nudged Elastic Band (NEB), which creates discrete representations of reaction pathways and optimizes them to find minimum energy paths.</p>
<p><strong>2000s-2010s</strong>: Validating Transition Path Sampling (TPS) methods that harvest statistical ensembles of reactive trajectories.</p>
<p><strong>2020s</strong>: Benchmarking machine learning models and generative approaches that learn to sample transition paths or approximate potential energy surfaces.</p>
<h2 id="modern-applications-in-machine-learning">Modern Applications in Machine Learning</h2>
<p>The rise of machine learning has given the Müller-Brown potential renewed purpose. Modern <strong>Machine Learning Interatomic Potentials (MLIPs)</strong> aim to bridge the gap between quantum mechanical accuracy and classical force field efficiency by training flexible models on expensive quantum chemistry data.</p>
<p>The Müller-Brown potential provides an ideal benchmarking solution: an exactly known potential energy surface that can generate unlimited, noise-free training data. This enables researchers to ask fundamental questions:</p>
<ul>
<li>How well does a given architecture learn complex, curved surfaces?</li>
<li>How many training points are needed for acceptable accuracy?</li>
<li>How does the model behave when extrapolating beyond training data?</li>
<li>Can it correctly identify minima and saddle points?</li>
</ul>
<p>The potential serves as a consistent benchmark for measuring the learning capacity of AI models.</p>
<h2 id="extensions-and-variants">Extensions and Variants</h2>
<h3 id="higher-dimensional-extensions">Higher-Dimensional Extensions</h3>
<p>The canonical Müller-Brown potential can be extended beyond two dimensions to create more challenging test cases:</p>
<p><strong>Harmonic constraints</strong>: Add quadratic wells in orthogonal dimensions while preserving the complex 2D landscape:</p>
<p>$$V_{5D}(x_1, x_2, x_3, x_4, x_5) = V(x_1, x_3) + \kappa(x_2^2 + x_4^2 + x_5^2)$$</p>
<p><strong>Collective variables (CVs)</strong>: Collective variables are low-dimensional coordinates that capture the most important degrees of freedom in a high-dimensional system. By defining CVs that mix multiple dimensions, the original surface can be embedded in higher-dimensional spaces. For instance, the active 2D coordinates $x$ and $y$ can be projected as linear combinations of $N$ arbitrary degrees of freedom ($q_i$):</p>
<p>$$ x = \sum_{i=1}^N w_{x,i} q_i \quad \text{and} \quad y = \sum_{i=1}^N w_{y,i} q_i $$</p>
<p>This constructs a complex, high-dimensional problem where an algorithm must learn to isolate the relevant active subspace (the CVs) before it can effectively optimize the topology.</p>
<p>These extensions enable systematic testing of algorithm scaling with dimensionality while maintaining known ground truth in the active subspace.</p>
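<p>Both extensions can be sketched generically for any 2D base surface; the quadratic used in the check below is a stand-in placeholder, not the Müller-Brown function:</p>

```python
import numpy as np

def make_5d(v2d, kappa=1.0):
    """Harmonic extension: V5D(x1..x5) = v2d(x1, x3) + kappa*(x2^2 + x4^2 + x5^2)."""
    def v5d(x1, x2, x3, x4, x5):
        return v2d(x1, x3) + kappa * (x2**2 + x4**2 + x5**2)
    return v5d

def make_cv_embedded(v2d, w_x, w_y):
    """CV embedding: evaluate v2d at x = w_x . q and y = w_y . q for q in R^N."""
    def v_hd(q):
        q = np.asarray(q, dtype=float)
        return v2d(np.dot(w_x, q), np.dot(w_y, q))
    return v_hd
```

<p>An algorithm pointed at <code>v_hd</code> sees an $N$-dimensional problem, yet the ground-truth topology still lives in the known 2D active subspace.</p>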
<h2 id="limitations">Limitations</h2>
<p>Despite its utility, the Müller-Brown potential has fundamental limitations as a proxy for physical systems:</p>
<ul>
<li><strong>Lack of Realistic Scaling</strong>: As a purely mathematical 2D/analytical model, it cannot directly simulate the complexities of high-dimensional scaling found in many-body atomic systems.</li>
<li><strong>No Entropic Effects</strong>: In real chemical systems, entropic contributions heavily influence the free-energy landscape. The Müller-Brown potential maps energy precisely but lacks the thermal/entropic complexity of solvent or macromolecular environments.</li>
<li><strong>Trivial Topology Contrasts</strong>: While non-trivial compared to single wells, its global topology remains simpler than proper ab initio potential energy surfaces, missing features like complex bifurcations, multi-state crossings, or non-adiabatic couplings.</li>
</ul>
<h2 id="implementation-considerations">Implementation Considerations</h2>
<p>Modern implementations typically focus on:</p>
<ul>
<li><strong>Vectorized calculations</strong> for batch processing</li>
<li><strong>Analytical derivatives</strong> for gradient-based methods</li>
<li><strong>JIT compilation</strong> for performance optimization</li>
<li><strong>Automatic differentiation</strong> compatibility for machine learning frameworks</li>
</ul>
<p>The analytical nature of the potential makes it ideal for testing both classical optimization methods and modern machine learning approaches.</p>
<h2 id="resources-and-visualizations">Resources and Visualizations</h2>
<ul>
<li><a href="/muller-brown-optimized">Interactive Müller-Brown Potential Energy Surface</a> - Local visualization tool</li>
<li><a href="https://www.wolframcloud.com/objects/demonstrations/TrajectoriesOnTheMullerBrownPotentialEnergySurface-source.nb">Müller-Brown Potential Visualization (Wolfram)</a> - External Wolfram demonstration</li>
<li><a href="/posts/muller-brown-in-pytorch/">Implementing the Müller-Brown Potential in PyTorch</a> - Detailed implementation guide with performance analysis</li>
</ul>
<h2 id="related-systems">Related Systems</h2>
<p>The Müller-Brown potential belongs to a family of analytical benchmark systems used in computational chemistry. Other notable examples include:</p>
<ul>
<li><strong>Lennard-Jones potential</strong>: Single-minimum benchmark for equilibrium properties</li>
<li><strong>Double-well potentials</strong>: Simple models for bistable systems</li>
<li><strong>Eckart barrier</strong>: One-dimensional tunneling benchmark</li>
<li><strong>Wolfe-Quapp potential</strong>: Higher-dimensional extension with valley-ridge inflection points</li>
</ul>
<h2 id="conclusion">Conclusion</h2>
<p>The Müller-Brown potential demonstrates how a well-designed benchmark can evolve with a field. Born of 1970s computational constraints, when quantum chemistry calculations were prohibitively expensive, it has a topology that defeats naive linear-interpolation approaches while remaining essentially free to evaluate. For this reason, it remains a heavily used benchmark system today.</p>
<p>It serves specific purposes in the machine learning era by providing a controlled environment for developing methods targeted at complex realistic molecular systems. Its evolution from a practical surrogate model to a machine learning benchmark demonstrates the continued relevance of foundational analytical test cases in computational science.</p>
]]></content:encoded></item><item><title>DenoiseVAE: Adaptive Noise for Molecular Pre-training</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/denoise-vae/</link><pubDate>Sun, 24 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/denoise-vae/</guid><description>Liu et al.'s ICLR 2025 paper introducing DenoiseVAE, which learns adaptive, atom-specific noise distributions for better molecular force fields.</description><content:encoded><![CDATA[<h2 id="paper-contribution-type">Paper Contribution Type</h2>
<p>This is a <strong>method paper</strong> with a supporting theoretical component. It introduces a new pre-training framework, DenoiseVAE, that challenges the standard practice of using fixed, hand-crafted noise distributions in denoising-based molecular representation learning.</p>
<h2 id="motivation-the-inter--and-intra-molecular-variations-problem">Motivation: The Inter- and Intra-molecular Variations Problem</h2>
<p>The motivation is to create a more physically principled denoising pre-training task for 3D molecules. The core idea of denoising is to learn molecular force fields by corrupting an equilibrium conformation with noise and then learning to recover it. However, existing methods use a single, hand-crafted noise strategy (e.g., Gaussian noise of a fixed scale) for all atoms across all molecules. This is physically unrealistic for two main reasons:</p>
<ol>
<li><strong>Inter-molecular differences</strong>: Different molecules have unique Potential Energy Surfaces (PES), meaning the space of low-energy (i.e., physically plausible) conformations is highly molecule-specific.</li>
<li><strong>Intra-molecular differences (Anisotropy)</strong>: Within a single molecule, different atoms have different degrees of freedom. For instance, an atom in a rigid functional group can move much less than one connected by a single, rotatable bond.</li>
</ol>
<p>The authors argue that this &ldquo;one-size-fits-all&rdquo; noise approach leads to inaccurate force field learning because it samples many physically improbable conformations.</p>
<h2 id="novelty-a-learnable-atom-specific-noise-generator">Novelty: A Learnable, Atom-Specific Noise Generator</h2>
<p>The core novelty is a framework that learns to generate noise tailored to each specific molecule and atom. This is achieved through three key innovations:</p>
<ol>
<li><strong>Learnable Noise Generator</strong>: The authors introduce a Noise Generator module (a 4-layer Equivariant Graph Neural Network) that takes a molecule&rsquo;s equilibrium conformation $X$ as input and outputs a unique, atom-specific Gaussian noise distribution (i.e., a different variance $\sigma_i^2$ for each atom $i$). This directly addresses the issues of PES specificity and force field anisotropy.</li>
<li><strong>Variational Autoencoder (VAE) Framework</strong>: The Noise Generator (encoder) and a Denoising Module (a 7-layer EGNN decoder) are trained jointly within a VAE paradigm. The noisy conformation is sampled using the reparameterization trick:
$$
\begin{aligned}
\tilde{x}_i &amp;= x_i + \epsilon \sigma_i, \qquad \epsilon \sim \mathcal{N}(0, I)
\end{aligned}
$$</li>
<li><strong>Principled Optimization Objective</strong>: The training loss balances two competing goals:
$$
\begin{aligned}
\mathcal{L}_{DenoiseVAE} &amp;= \mathcal{L}_{Denoise} + \lambda \mathcal{L}_{KL}
\end{aligned}
$$
<ul>
<li>A denoising reconstruction loss ($\mathcal{L}_{Denoise}$) encourages the Noise Generator to produce physically plausible perturbations from which the original conformation can be recovered. This implicitly constrains the noise to respect the molecule&rsquo;s underlying force fields.</li>
<li>A KL divergence regularization term ($\mathcal{L}_{KL}$) pushes the generated noise distributions towards a predefined prior. This prevents the trivial solution of generating zero noise and encourages the model to explore a diverse set of low-energy conformations.</li>
</ul>
</li>
</ol>
<p>The authors also provide a theoretical analysis showing that optimizing their objective is equivalent to maximizing the Evidence Lower Bound (ELBO) on the log-likelihood of observing physically realistic conformations.</p>
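<p>The sampling step and the two-term loss can be sketched in a few lines (a numpy stand-in; the actual model uses EGNN encoder/decoder networks, and the closed-form KL below assumes zero-mean Gaussians for both the generated noise and the prior, which is an assumption of this sketch):</p>

```python
import numpy as np

SIGMA_PRIOR = 0.1  # prior std; the paper's ablations favor 0.1
LAMBDA_KL = 1.0    # KL weight; the paper's ablations favor 1.0

def kl_to_prior(sigma, sigma_prior=SIGMA_PRIOR):
    """Closed-form KL( N(0, sigma^2) || N(0, sigma_prior^2) ) per coordinate."""
    return np.log(sigma_prior / sigma) + sigma**2 / (2 * sigma_prior**2) - 0.5

def denoisevae_loss(x_eq, sigma, predict_noise, rng):
    """Loss on a single molecule.

    x_eq:  (n_atoms, 3) equilibrium coordinates
    sigma: (n_atoms,) atom-specific noise scales from the generator
    predict_noise: stand-in for the EGNN denoiser, mapping noisy
                   coordinates to a per-atom estimate of the added noise
    """
    eps = rng.standard_normal(x_eq.shape)
    x_noisy = x_eq + eps * sigma[:, None]  # reparameterization trick
    l_denoise = np.mean((predict_noise(x_noisy) - eps * sigma[:, None]) ** 2)
    l_kl = np.mean(kl_to_prior(sigma))
    return l_denoise + LAMBDA_KL * l_kl
```

<p>With $\sigma_i = \sigma_{\text{prior}}$ the KL term vanishes, and shrinking all $\sigma_i$ toward zero inflates it, which is the mechanism that blocks the trivial zero-noise solution.</p>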
<h2 id="methodology--experimental-baselines">Methodology &amp; Experimental Baselines</h2>
<p>The model was pretrained on the PCQM4Mv2 dataset (approximately 3.4 million organic molecules) and then evaluated on a comprehensive suite of downstream tasks to test the quality of the learned representations:</p>
<ol>
<li><strong>Molecular Property Prediction (QM9)</strong>: The model was evaluated on 12 quantum chemical property prediction tasks for small molecules (134k molecules; 100k train, 18k val, 13k test split). DenoiseVAE achieved state-of-the-art or second-best performance on 11 of the 12 tasks, with particularly significant gains on $C_v$ (heat capacity), indicating better capture of vibrational modes.</li>
<li><strong>Force Prediction (MD17)</strong>: The task was to predict atomic forces from molecular dynamics trajectories for 8 different small molecules (9,500 train, 500 val split). DenoiseVAE was the top performer on 5 of the 8 molecules (Aspirin, Benzene, Ethanol, Naphthalene, Toluene), though it underperformed Frad on Malonaldehyde, Salicylic Acid, and Uracil by significant margins.</li>
<li><strong>Ligand Binding Affinity (PDBBind v2019)</strong>: On the PDBBind dataset with 30% and 60% protein sequence identity splits, the model showed strong generalization, outperforming baselines like Uni-Mol particularly on the more stringent 30% split across RMSE, Pearson correlation, and Spearman correlation.</li>
<li><strong>PCQM4Mv2 Validation</strong>: DenoiseVAE achieved a validation MAE of 0.0777 on the PCQM4Mv2 HOMO-LUMO gap prediction task with only 1.44M parameters, competitive with models 10-40x larger (e.g., GPS++ at 44.3M params achieves 0.0778).</li>
<li><strong>Ablation Studies</strong>: The authors analyzed the sensitivity to key hyperparameters, namely the prior&rsquo;s standard deviation ($\sigma$) and the KL-divergence weight ($\lambda$), confirming that $\lambda=1$ and $\sigma=0.1$ are optimal. Removing the KL term leads to trivial solutions (near-zero noise). An additional ablation on the Noise Generator depth found 4 EGNN layers optimal over 2 layers. A comparison of independent (diagonal) versus non-independent (full covariance) noise sampling showed comparable results, suggesting the EGNN already captures inter-atomic dependencies implicitly.</li>
<li><strong>Case Studies</strong>: Visualizations of the learned noise variances for different molecules confirmed that the model learns chemically intuitive noise patterns. For example, it applies smaller perturbations to atoms in a rigid bicyclic norcamphor derivative and larger ones to atoms in flexible functional groups of a cyclopropane derivative. Even identical functional groups (e.g., hydroxyl) receive different noise scales in different molecular contexts.</li>
</ol>
<h2 id="key-findings-on-force-field-learning">Key Findings on Force Field Learning</h2>
<ul>
<li><strong>Primary Conclusion</strong>: Learning a <strong>molecule-adaptive and atom-specific</strong> noise distribution is a superior strategy for denoising-based pre-training compared to using fixed, hand-crafted heuristics. This more physically-grounded approach leads to representations that better capture molecular force fields.</li>
<li><strong>Strong Benchmark Performance</strong>: DenoiseVAE achieves best or second-best results on 11 of 12 QM9 tasks, 5 of 8 MD17 molecules, and leads on the stringent 30% LBA split. Performance is mixed on some MD17 molecules (Malonaldehyde, Salicylic Acid, Uracil), where it trails Frad.</li>
<li><strong>Effective Framework</strong>: The proposed VAE-based framework, which jointly trains a Noise Generator and a Denoising Module, is an effective and theoretically sound method for implementing this adaptive noise strategy. The interplay between the reconstruction loss and the KL-divergence regularization is key to its success.</li>
<li><strong>Limitation and Future Direction</strong>: The method is based on classical force field assumptions. The authors note that integrating more accurate force fields represents a promising direction for future work.</li>
</ul>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/Serendipity-r/DenoiseVAE">Serendipity-r/DenoiseVAE</a></td>
          <td>Code</td>
          <td>Unknown</td>
          <td>Official implementation</td>
      </tr>
  </tbody>
</table>
<h3 id="reproducibility-status">Reproducibility Status</h3>
<ul>
<li><strong>Source Code</strong>: The authors have released their code at <a href="https://github.com/Serendipity-r/DenoiseVAE">Serendipity-r/DenoiseVAE</a> on GitHub. No license is specified in the repository.</li>
<li><strong>Implementation</strong>: Hyperparameters and architectures are detailed in the paper&rsquo;s appendix (A.14), and the repository provides reference implementations.</li>
</ul>
<h3 id="data">Data</h3>
<ul>
<li><strong>Pre-training Dataset</strong>: <a href="https://ogb.stanford.edu/docs/lsc/pcqm4mv2/">PCQM4Mv2</a> (approximately 3.4 million organic molecules)</li>
<li><strong>Property Prediction</strong>: <a href="https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.datasets.QM9.html">QM9 dataset</a> (134k molecules; 100k train, 18k val, 13k test split) for 12 quantum chemical properties</li>
<li><strong>Force Prediction</strong>: <a href="http://www.sgdml.org/#datasets">MD17 dataset</a> (9,500 train, 500 val split) for 8 different small molecules</li>
<li><strong>Ligand Binding Affinity</strong>: PDBBind v2019 (4,463 protein-ligand complexes) with 30% and 60% sequence identity splits</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Noise Generator</strong>: 4-layer Equivariant Graph Neural Network (EGNN) that outputs atom-specific Gaussian noise distributions</li>
<li><strong>Denoising Module</strong>: 7-layer EGNN decoder</li>
<li><strong>Training Objective</strong>: $\mathcal{L}_{DenoiseVAE} = \mathcal{L}_{Denoise} + \lambda \mathcal{L}_{KL}$ with $\lambda=1$</li>
<li><strong>Noise Sampling</strong>: Reparameterization trick with $\tilde{x}_i = x_i + \sigma_i \odot \epsilon$, where $\epsilon \sim \mathcal{N}(0, I)$</li>
<li><strong>Prior Distribution</strong>: Zero-mean Gaussian with standard deviation $\sigma=0.1$</li>
</ul>
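<p>As a concrete illustration, here is a minimal, dependency-free Python sketch of the two pieces above: reparameterized noise sampling and the KL penalty toward the isotropic $\sigma=0.1$ prior. The function names are invented for illustration; this is not the authors' implementation, which predicts the per-atom scales with an EGNN.</p>

```python
import math
import random

PRIOR_SIGMA = 0.1   # prior standard deviation from the paper
LAMBDA_KL = 1.0     # KL weight from the paper

def sample_noisy_position(x, sigma):
    """Reparameterization trick: x_tilde = x + sigma * eps, eps ~ N(0, 1).
    `x` and `sigma` are per-atom coordinate lists; in training, gradients
    flow through sigma, not through eps."""
    return [xi + si * random.gauss(0.0, 1.0) for xi, si in zip(x, sigma)]

def kl_to_isotropic_prior(sigma, prior_sigma=PRIOR_SIGMA):
    """KL( N(0, diag(sigma^2)) || N(0, prior_sigma^2 I) ) for a zero-mean
    diagonal Gaussian: regularizes predicted noise scales toward the prior
    and penalizes the trivial near-zero-noise solution."""
    kl = 0.0
    for s in sigma:
        ratio = (s / prior_sigma) ** 2
        kl += 0.5 * (ratio - 1.0 - math.log(ratio))
    return kl
```

<p>A noise scale that exactly matches the prior incurs zero KL, while shrinking all scales toward zero (the trivial solution the ablation warns about) is penalized by the logarithmic term.</p>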
<h3 id="models">Models</h3>
<ul>
<li><strong>Model Size</strong>: 1.44M parameters total</li>
<li><strong>Fine-tuning Protocol</strong>: Noise Generator discarded after pre-training; only the pre-trained Denoising Module (7-layer EGNN) is retained for downstream fine-tuning</li>
<li><strong>Optimizer</strong>: AdamW with cosine learning rate decay (max LR of 0.0005)</li>
<li><strong>Batch Size</strong>: 128</li>
<li><strong>System Training</strong>: Fine-tuned end-to-end for specific tasks; force prediction involves computing the gradient of the predicted energy</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Ablation Studies</strong>: Sensitivity analysis confirmed $\lambda=1$ and $\sigma=0.1$ as optimal hyperparameters; removing the KL term leads to trivial solutions (near-zero noise)</li>
<li><strong>Noise Generator Depth</strong>: 4 EGNN layers outperformed 2 layers across both QM9 and MD17 benchmarks</li>
<li><strong>Covariance Structure</strong>: Full covariance matrix (non-independent noise sampling) yielded comparable results to diagonal variance (independent sampling), likely because the EGNN already integrates neighboring atom information</li>
<li><strong>O(3) Invariance</strong>: The method satisfies O(3) probabilistic invariance, meaning the noise distribution is unchanged under rotations and reflections</li>
</ul>
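<p>The O(3) probabilistic invariance property can be illustrated with a toy stand-in: any noise generator whose per-atom scales depend only on interatomic distances is automatically unchanged under rotations and reflections. The sketch below is hypothetical (the paper's generator is a learned EGNN) but demonstrates the invariance numerically.</p>

```python
import math

def pairwise_dists(coords):
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def toy_sigma(coords):
    """Hypothetical stand-in for the Noise Generator: per-atom sigma built
    from inverse distances to the other atoms. Any function of distances
    alone is invariant under rotations and reflections (O(3))."""
    d = pairwise_dists(coords)
    return [
        0.1 / (1.0 + sum(1.0 / dij for j, dij in enumerate(row) if j != i))
        for i, row in enumerate(d)
    ]

def rotate_z(coords, theta):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

mol = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.5, 0.3)]
before = toy_sigma(mol)
after = toy_sigma(rotate_z(mol, 0.7))  # rotations leave the scales unchanged
assert all(abs(a - b) < 1e-12 for a, b in zip(before, after))
```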
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>GPU Configuration</strong>: Experiments conducted on single NVIDIA RTX 3090 GPUs; six GPUs (144GB total memory) are sufficient for full reproduction</li>
<li><strong>CPU</strong>: Intel Xeon Gold 5318Y @ 2.10GHz</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Liu, Y., Chen, J., Jiao, R., Li, J., Huang, W., &amp; Su, B. (2025). DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training. <em>The Thirteenth International Conference on Learning Representations (ICLR)</em>.</p>
<p><strong>Publication</strong>: ICLR 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{liu2025denoisevae,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Yurou Liu and Jiahao Chen and Rui Jiao and Jiangmeng Li and Wenbing Huang and Bing Su}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{The Thirteenth International Conference on Learning Representations}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span>=<span style="color:#e6db74">{https://openreview.net/forum?id=ym7pr83XQr}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://iclr.cc/virtual/2025/poster/27701">ICLR 2025 poster page</a></li>
<li><a href="https://openreview.net/forum?id=ym7pr83XQr">OpenReview forum</a></li>
<li><a href="https://openreview.net/pdf?id=ym7pr83XQr">PDF on OpenReview</a></li>
</ul>
]]></content:encoded></item><item><title>eSEN: Smooth Interatomic Potentials (ICML Spotlight)</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/learning-smooth-interatomic-potentials/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/learning-smooth-interatomic-potentials/</guid><description>Fu et al. propose energy conservation as a key MLIP diagnostic and introduce eSEN, bridging test accuracy and real performance.</description><content:encoded><![CDATA[<h2 id="paper-overview">Paper Overview</h2>
<p>This is a <strong>method paper</strong>. It addresses a critical disconnect in the evaluation of Machine Learning Interatomic Potentials (MLIPs) and introduces a novel architecture, <strong>eSEN</strong>, designed based on insights from this analysis. The paper proposes a new standard for evaluating MLIPs beyond simple test-set errors.</p>
<h2 id="the-energy-conservation-gap-in-mlip-evaluation">The Energy Conservation Gap in MLIP Evaluation</h2>
<p>The paper is motivated by a well-known but under-addressed problem in the field: improvements in standard MLIP metrics (lower energy/force MAE on static test sets) do not reliably translate to better performance on complex downstream tasks like molecular dynamics (MD) simulations, materials stability prediction, or phonon calculations. The authors seek to understand why this gap exists and how to design models that are both accurate on test sets and physically reliable in practical scientific workflows.</p>
<h2 id="the-esen-architecture-and-continuous-representation">The eSEN Architecture and Continuous Representation</h2>
<p>The novelty is twofold, spanning both a conceptual framework for evaluation and a new model architecture:</p>
<ol>
<li>
<p><strong>Energy Conservation as a Diagnostic Test</strong>: The core conceptual contribution is using an MLIP&rsquo;s ability to conserve energy in out-of-distribution MD simulations as a crucial diagnostic test. The authors demonstrate that for models passing this test, a strong correlation between test-set error and downstream task performance is restored.</p>
</li>
<li>
<p><strong>The eSEN Architecture</strong>: The paper introduces the <strong>equivariant Smooth Energy Network (eSEN)</strong>, designed with specific choices to ensure a smooth and well-behaved Potential Energy Surface (PES):</p>
<ul>
<li><strong>Strictly Conservative Forces</strong>: Forces are computed exclusively as the negative gradient of energy ($F = -\nabla E$), using conservative force prediction instead of faster direct-force prediction heads.</li>
<li><strong>Continuous Representations</strong>: Maintains strict equivariance and smoothness by using equivariant gated non-linearities instead of discretizing spherical harmonic representations during nodewise processing.</li>
<li><strong>Smooth PES Construction</strong>: Critical design choices include using distance cutoffs, polynomial envelope functions ensuring derivatives go to zero at cutoffs, and limited radial basis functions to avoid overly sensitive PES.</li>
</ul>
</li>
<li>
<p><strong>Efficient Training Strategy</strong>: A two-stage training regimen with fast pre-training using a non-conservative direct-force model, followed by fine-tuning to enforce energy conservation. This captures the efficiency of direct-force training while ensuring physical robustness.</p>
</li>
</ol>
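<p>The "strictly conservative forces" choice means every force component is defined as the negative derivative of a single scalar energy. A minimal sketch with a toy one-dimensional pair potential makes the consequence concrete (an MLIP uses automatic differentiation rather than the finite differences used here for illustration):</p>

```python
def pair_energy(r):
    """Toy Lennard-Jones pair energy; a stand-in for a learned E(r)."""
    return 4.0 * (r ** -12 - r ** -6)

def energy(x):
    """Total energy of atoms on a line at positions x[i]."""
    return sum(
        pair_energy(abs(x[i] - x[j]))
        for i in range(len(x))
        for j in range(i + 1, len(x))
    )

def forces(x, h=1e-6):
    """Strictly conservative forces F_i = -dE/dx_i, here via central
    differences on the scalar energy."""
    f = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        f.append(-(energy(xp) - energy(xm)) / (2 * h))
    return f

# Because all forces derive from one scalar field, they sum to zero
# (translation invariance) -- a property direct-force heads do not guarantee.
assert abs(sum(forces([0.0, 1.1, 2.5]))) < 1e-5
```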
<h2 id="evaluating-ood-energy-conservation-and-physical-properties">Evaluating OOD Energy Conservation and Physical Properties</h2>
<p>The paper presents a comprehensive experimental validation:</p>
<ol>
<li>
<p><strong>Ablation Studies on Energy Conservation</strong>: MD simulations on out-of-distribution systems (TM23 and MD22 datasets) systematically tested key design choices (direct-force vs. conservative, representation discretization, neighbor limits, envelope functions). This empirically demonstrated which choices lead to energy drift despite negligible impact on test-set MAE.</p>
</li>
<li>
<p><strong>Physical Property Prediction Benchmarks</strong>: The eSEN model was evaluated on challenging downstream tasks:</p>
<ul>
<li><strong>Matbench-Discovery</strong>: Materials stability and thermal conductivity prediction, where eSEN achieved the highest F1 score among compliant models and excelled at both metrics simultaneously.</li>
<li><strong>MDR Phonon Benchmark</strong>: Predicting phonon properties that test accurate second and third-order derivatives of the PES. eSEN achieved state-of-the-art results, particularly outperforming direct-force models.</li>
<li><strong>SPICE-MACE-OFF</strong>: Standard energy and force prediction on organic molecules, demonstrating that the design choices made for physical plausibility also enhanced raw accuracy.</li>
</ul>
</li>
<li>
<p><strong>Correlation Analysis</strong>: Explicit plots of test-set energy MAE versus performance on downstream benchmarks showed weak overall correlation that becomes strong and predictive when restricted to models passing the energy conservation test.</p>
</li>
</ol>
<h2 id="outcomes-and-conclusions">Outcomes and Conclusions</h2>
<ul>
<li>
<p><strong>Primary Conclusion</strong>: Energy conservation is a critical, practical property for MLIPs. Using it as a filter re-establishes test-set error as a reliable proxy for model development, dramatically accelerating the innovation cycle. Models that are not conservative, even with low test error, are unreliable for many critical scientific applications.</p>
</li>
<li>
<p><strong>Model Performance</strong>: The eSEN architecture outperforms base models across diverse tasks, from energy/force prediction to geometry optimization, phonon calculations, and thermal conductivity prediction.</p>
</li>
<li>
<p><strong>Actionable Design Principles</strong>: The paper provides experimentally-validated architectural choices that promote physical plausibility. Seemingly minor details, like how atomic neighbors are selected, can have profound impacts on a model&rsquo;s utility in simulations.</p>
</li>
<li>
<p><strong>Efficient Path to Robust Models</strong>: The direct-force pre-training plus conservative fine-tuning strategy offers a practical method for developing physically robust models without incurring the full computational cost of conservative training from scratch.</p>
</li>
</ul>
<hr>
<h2 id="reproducibility">Reproducibility</h2>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/facebookresearch/fairchem">fairchem (GitHub)</a></td>
          <td>Code</td>
          <td>MIT</td>
          <td>Official implementation within FAIR Chemistry framework</td>
      </tr>
      <tr>
          <td><a href="https://huggingface.co/facebook/OMAT24">OMAT24 (Hugging Face)</a></td>
          <td>Model</td>
          <td>FAIR Acceptable Use Policy</td>
          <td>Pre-trained eSEN-30M-MP and eSEN-30M-OAM checkpoints</td>
      </tr>
      <tr>
          <td><a href="https://openreview.net/forum?id=R0PBjxIbgm">OpenReview</a></td>
          <td>Paper</td>
          <td>CC BY 4.0</td>
          <td>ICML 2025 camera-ready paper</td>
      </tr>
  </tbody>
</table>
<h3 id="models">Models</h3>
<p>The eSEN architecture builds on components from <strong>eSCN</strong> (Equivariant Spherical Channel Network) and <strong>Equiformer</strong>, combining them with design choices that prioritize smoothness and energy conservation. The implementation integrates into the standard <code>fairchem</code> Open Catalyst experimental framework.</p>
<h4 id="layer-structure">Layer Structure</h4>
<ul>
<li><strong>Edgewise Convolution</strong>: Uses <code>SO2</code> convolution layers (from eSCN) with an envelope function applied. Source and target embeddings are concatenated before convolution.</li>
<li><strong>Nodewise Feed-Forward</strong>: Two equivariant linear layers with an intermediate <strong>SiLU-based gated non-linearity</strong> (from Equiformer).</li>
<li><strong>Normalization</strong>: Equivariant Layer Normalization (from Equiformer).</li>
</ul>
<h4 id="smoothness-design-choices">Smoothness Design Choices</h4>
<p>Several architectural decisions distinguish eSEN from prior work:</p>
<ul>
<li><strong>No Grid Projection</strong>: eSEN performs operations directly in the spherical harmonic space to maintain equivariance and energy conservation, bypassing the projection of spherical harmonics to spatial grids for non-linearity.</li>
<li><strong>Distance Cutoff for Graph Construction</strong>: Uses a strict distance cutoff (6 Å for MPTrj models, 5 Å for SPICE models) rather than a maximum-neighbor limit, since neighbor limits introduce discontinuities that break energy conservation.</li>
<li><strong>Polynomial Envelope Functions</strong>: Ensures derivatives go to zero smoothly at the cutoff radius.</li>
</ul>
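<p>The envelope bullet can be made concrete with the DimeNet-style polynomial envelope, a common choice shown here as a stand-in (the paper's exact polynomial may differ): it equals 1 at zero separation and vanishes at the cutoff together with its first and second derivatives, so an atom crossing the cutoff contributes smoothly to the energy.</p>

```python
def poly_envelope(d, cutoff, p=5):
    """DimeNet-style polynomial envelope: u(0) = 1, and u, u', u'' all
    vanish at the cutoff, keeping the PES smooth as neighbors enter or
    leave the graph."""
    if d >= cutoff:
        return 0.0
    x = d / cutoff
    a = (p + 1) * (p + 2) / 2
    return 1.0 - a * x**p + p * (p + 2) * x**(p + 1) - p * (p + 1) / 2 * x**(p + 2)

def numerical_derivative(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

cutoff = 6.0  # Å, matching the MPTrj models
assert abs(poly_envelope(0.0, cutoff) - 1.0) < 1e-12
assert abs(poly_envelope(cutoff - 1e-9, cutoff)) < 1e-6
# The slope also vanishes at the cutoff: no discontinuous force contribution.
assert abs(numerical_derivative(lambda d: poly_envelope(d, cutoff), cutoff - 1e-3)) < 1e-3
```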
<h3 id="algorithms">Algorithms</h3>
<h4 id="two-stage-training-esen-30m-mp">Two-Stage Training (eSEN-30M-MP)</h4>
<ol>
<li><strong>Direct-Force Pre-training</strong> (60 epochs): Uses <strong>DeNS</strong> (Denoising Non-equilibrium Structures) to reduce overfitting. This stage is fast because it does not require backpropagation through energy gradients.</li>
<li><strong>Conservative Fine-tuning</strong> (40 epochs): The direct-force head is removed, and forces are calculated via gradients ($F = -\nabla E$). This enforces energy conservation.</li>
</ol>
<p><strong>Important</strong>: DeNS is used exclusively during the direct-force pre-training stage, with a noising probability of 0.5, a standard deviation of 0.1 Å for the added Gaussian noise, and a DeNS loss coefficient of 10. The fine-tuning strategy reduces the wall-clock time for model training by 40%.</p>
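<p>The pre-training-stage perturbation can be sketched as follows. The function names and data layout are invented for illustration, with the paper's hyperparameters (noising probability 0.5, noise std 0.1 Å, loss coefficient 10) plugged in; this is not the fairchem implementation.</p>

```python
import random

NOISE_PROB = 0.5   # probability a structure is perturbed (paper value)
NOISE_STD = 0.1    # Å, std of the added Gaussian noise (paper value)
DENS_COEF = 10.0   # weight of the denoising loss term (paper value)

def maybe_noise(positions, rng):
    """DeNS-style perturbation used only during direct-force pre-training:
    returns (possibly noised positions, per-coordinate noise or None)."""
    if rng.random() >= NOISE_PROB:
        return positions, None
    noise = [[rng.gauss(0.0, NOISE_STD) for _ in atom] for atom in positions]
    noised = [[x + dx for x, dx in zip(atom, d)] for atom, d in zip(positions, noise)]
    return noised, noise

def total_loss(base_loss, dens_loss, noise):
    """Add the denoising term only when the structure was actually noised."""
    return base_loss if noise is None else base_loss + DENS_COEF * dens_loss
```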
<h4 id="optimization">Optimization</h4>
<ul>
<li><strong>Optimizer</strong>: AdamW with cosine learning rate scheduler</li>
<li><strong>Max Learning Rate</strong>: $4 \times 10^{-4}$</li>
<li><strong>Batch Size</strong>: 512 (for MPTrj models)</li>
<li><strong>Weight Decay</strong>: $1 \times 10^{-3}$</li>
<li><strong>Gradient Clipping</strong>: Norm of 100</li>
<li><strong>Warmup</strong>: 0.1 epochs with a factor of 0.2</li>
</ul>
<h4 id="loss-function">Loss Function</h4>
<p>A composite loss combining per-atom energy MAE, force $L_2$ loss, and stress MAE:</p>
<p>$$
\begin{aligned}
\mathcal{L} = \lambda_{\text{e}} \frac{1}{N} \sum_{i=1}^N \lvert E_{i} - \hat{E}_{i} \rvert + \lambda_{\text{f}} \frac{1}{3N} \sum_{i=1}^N \lVert \mathbf{F}_{i} - \hat{\mathbf{F}}_{i} \rVert_2^2 + \lambda_{\text{s}} \lVert \mathbf{S} - \hat{\mathbf{S}} \rVert_1
\end{aligned}
$$</p>
<p>For MPTrj-30M, the weighting coefficients are set to $\lambda_{\text{e}} = 20$, $\lambda_{\text{f}} = 20$, and $\lambda_{\text{s}} = 5$.</p>
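<p>A minimal sketch of the composite objective for one batch, using the MPTrj-30M coefficients. The batching and reduction details here are simplified assumptions (pairs of true/predicted scalars rather than tensors), not the fairchem implementation.</p>

```python
LAMBDA_E, LAMBDA_F, LAMBDA_S = 20.0, 20.0, 5.0  # MPTrj-30M coefficients

def composite_loss(energies, forces, stress):
    """Weighted sum of per-atom energy MAE, mean squared force-component
    error (equivalent to the 1/(3N) sum of squared L2 norms), and stress L1.
    `energies`, `forces`, `stress` are lists of (true, pred) pairs:
    per-atom energies, flattened force components, and stress components."""
    e_mae = sum(abs(t - p) for t, p in energies) / len(energies)
    f_l2 = sum((t - p) ** 2 for t, p in forces) / len(forces)
    s_l1 = sum(abs(t - p) for t, p in stress)
    return LAMBDA_E * e_mae + LAMBDA_F * f_l2 + LAMBDA_S * s_l1
```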
<h3 id="data">Data</h3>
<h4 id="training-data">Training Data</h4>
<ul>
<li><strong>Inorganic</strong>: MPTrj (Materials Project Trajectory) dataset</li>
<li><strong>Organic</strong>: SPICE-MACE-OFF dataset</li>
</ul>
<h4 id="test-data-construction">Test Data Construction</h4>
<ul>
<li><strong>MPTrj Testing</strong>: Since MPTrj lacks an official test split, the authors created a test set using 5,000 random samples from the <strong>subsampled Alexandria (sAlex)</strong> dataset to ensure fair comparison.</li>
<li><strong>Out-of-Distribution Conservation Testing</strong>:
<ul>
<li><em>Inorganic</em>: <strong>TM23</strong> dataset (transition metal defects). Simulation: 100 ps, 5 fs timestep.</li>
<li><em>Organic</em>: <strong>MD22</strong> dataset (large molecules). Simulation: 100 ps, 1 fs timestep.</li>
</ul>
</li>
</ul>
<h3 id="hardware">Hardware</h3>
<p>Compute for training operations predominantly utilizes <strong>80GB NVIDIA A100 GPUs</strong>.</p>
<h4 id="inference-efficiency">Inference Efficiency</h4>
<p>For a periodic system of <strong>216 atoms</strong> on a single A100 (PyTorch 2.4.0, CUDA 12.1, no compile/torchscript), the 2-layer eSEN models achieve approximately <strong>0.8 million steps per day</strong> (3.2M parameters) and <strong>0.4 million steps per day</strong> (6.5M parameters); the smaller model is comparable to MACE-OFF-L at 0.7 million steps per day.</p>

<h3 id="evaluation">Evaluation</h3>
<p>The paper evaluated eSEN across three major benchmark tasks. Key evaluation metrics included energy MAE (meV/atom), force MAE (meV/Å), stress MAE (meV/Å/atom), F1 score for stability prediction, $\kappa_{\text{SRME}}$ for thermal conductivity, and phonon frequency accuracy.</p>
<h4 id="ablation-test-set-mae-table-1">Ablation Test-Set MAE (Table 1)</h4>
<p>Design choices that dramatically affect energy conservation have negligible impact on static test-set MAE, which is precisely why test-set error alone is misleading. All models are 2-layer with 3.2M parameters, $L_{\text{max}} = 2$, $M_{\text{max}} = 2$:</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Energy MAE</th>
          <th>Force MAE</th>
          <th>Stress MAE</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>eSEN (default)</td>
          <td>17.02</td>
          <td>43.96</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, direct-force</td>
          <td>18.66</td>
          <td>43.62</td>
          <td>0.16</td>
      </tr>
      <tr>
          <td>eSEN, neighbor limit</td>
          <td>17.30</td>
          <td>44.11</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, no envelope</td>
          <td>17.60</td>
          <td>44.69</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, $N_{\text{basis}} = 512$</td>
          <td>19.87</td>
          <td>48.29</td>
          <td>0.15</td>
      </tr>
      <tr>
          <td>eSEN, Bessel</td>
          <td>17.65</td>
          <td>44.83</td>
          <td>0.15</td>
      </tr>
      <tr>
          <td>eSEN, discrete, res=6</td>
          <td>17.05</td>
          <td>43.10</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, discrete, res=10</td>
          <td>17.11</td>
          <td>43.13</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, discrete, res=14</td>
          <td>17.12</td>
          <td>43.09</td>
          <td>0.14</td>
      </tr>
  </tbody>
</table>
<p>Energy MAE in meV/atom. Force MAE in meV/Å. Stress MAE in meV/Å/atom.</p>
<h4 id="matbench-discovery-tables-2-and-3">Matbench-Discovery (Tables 2 and 3)</h4>
<p><strong>Compliant models</strong> (trained only on MPTrj or its subset), unique prototype split:</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>F1</th>
          <th>DAF</th>
          <th>$\kappa_{\text{SRME}}$</th>
          <th>RMSD</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>eSEN-30M-MP</strong></td>
          <td><strong>0.831</strong></td>
          <td><strong>5.260</strong></td>
          <td><strong>0.340</strong></td>
          <td><strong>0.0752</strong></td>
      </tr>
      <tr>
          <td>eqV2-S-DeNS</td>
          <td>0.815</td>
          <td>5.042</td>
          <td>1.676</td>
          <td>0.0757</td>
      </tr>
      <tr>
          <td>MatRIS-MP</td>
          <td>0.809</td>
          <td>5.049</td>
          <td>0.861</td>
          <td>0.0773</td>
      </tr>
      <tr>
          <td>AlphaNet-MP</td>
          <td>0.799</td>
          <td>4.863</td>
          <td>1.31</td>
          <td>0.1067</td>
      </tr>
      <tr>
          <td>DPA3-v2-MP</td>
          <td>0.786</td>
          <td>4.822</td>
          <td>0.959</td>
          <td>0.0823</td>
      </tr>
      <tr>
          <td>ORB v2 MPtrj</td>
          <td>0.765</td>
          <td>4.702</td>
          <td>1.725</td>
          <td>0.1007</td>
      </tr>
      <tr>
          <td>SevenNet-13i5</td>
          <td>0.760</td>
          <td>4.629</td>
          <td>0.550</td>
          <td>0.0847</td>
      </tr>
      <tr>
          <td>GRACE-2L-MPtrj</td>
          <td>0.691</td>
          <td>4.163</td>
          <td>0.525</td>
          <td>0.0897</td>
      </tr>
      <tr>
          <td>MACE-MP-0</td>
          <td>0.669</td>
          <td>3.777</td>
          <td>0.647</td>
          <td>0.0915</td>
      </tr>
      <tr>
          <td>CHGNet</td>
          <td>0.613</td>
          <td>3.361</td>
          <td>1.717</td>
          <td>0.0949</td>
      </tr>
      <tr>
          <td>M3GNet</td>
          <td>0.569</td>
          <td>2.882</td>
          <td>1.412</td>
          <td>0.1117</td>
      </tr>
  </tbody>
</table>
<p>eSEN-30M-MP excels at both F1 and $\kappa_{\text{SRME}}$ simultaneously, while all previous models only achieve SOTA on one or the other.</p>
<p><strong>Non-compliant models</strong> (trained on additional datasets):</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>F1</th>
          <th>$\kappa_{\text{SRME}}$</th>
          <th>RMSD</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>eSEN-30M-OAM</strong></td>
          <td><strong>0.925</strong></td>
          <td><strong>0.170</strong></td>
          <td><strong>0.0608</strong></td>
      </tr>
      <tr>
          <td>eqV2-M-OAM</td>
          <td>0.917</td>
          <td>1.771</td>
          <td>0.0691</td>
      </tr>
      <tr>
          <td>ORB v3</td>
          <td>0.905</td>
          <td>0.210</td>
          <td>0.0750</td>
      </tr>
      <tr>
          <td>SevenNet-MF-ompa</td>
          <td>0.901</td>
          <td>0.317</td>
          <td>0.0639</td>
      </tr>
      <tr>
          <td>DPA3-v2-OpenLAM</td>
          <td>0.890</td>
          <td>0.687</td>
          <td>0.0679</td>
      </tr>
      <tr>
          <td>GRACE-2L-OAM</td>
          <td>0.880</td>
          <td>0.294</td>
          <td>0.0666</td>
      </tr>
      <tr>
          <td>MatterSim-v1-5M</td>
          <td>0.862</td>
          <td>0.574</td>
          <td>0.0733</td>
      </tr>
      <tr>
          <td>MACE-MPA-0</td>
          <td>0.852</td>
          <td>0.412</td>
          <td>0.0731</td>
      </tr>
  </tbody>
</table>
<p>The eSEN-30M-OAM model is pre-trained on the OMat24 dataset, then fine-tuned on the subsampled Alexandria (sAlex) dataset and MPTrj dataset.</p>
<h4 id="mdr-phonon-benchmark-table-4">MDR Phonon Benchmark (Table 4)</h4>
<p>Metrics: maximum phonon frequency MAE($\omega_{\text{max}}$) in K, vibrational entropy MAE($S$) in J/K/mol, Helmholtz free energy MAE($F$) in kJ/mol, heat capacity MAE($C_V$) in J/K/mol.</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>MAE($\omega_{\text{max}}$)</th>
          <th>MAE($S$)</th>
          <th>MAE($F$)</th>
          <th>MAE($C_V$)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>eSEN-30M-MP</strong></td>
          <td><strong>21</strong></td>
          <td><strong>13</strong></td>
          <td><strong>5</strong></td>
          <td><strong>4</strong></td>
      </tr>
      <tr>
          <td>SevenNet-13i5</td>
          <td>26</td>
          <td>28</td>
          <td>10</td>
          <td>5</td>
      </tr>
      <tr>
          <td>GRACE-2L (r6)</td>
          <td>40</td>
          <td>25</td>
          <td>9</td>
          <td>5</td>
      </tr>
      <tr>
          <td>SevenNet-0</td>
          <td>40</td>
          <td>48</td>
          <td>19</td>
          <td>9</td>
      </tr>
      <tr>
          <td>MACE</td>
          <td>61</td>
          <td>60</td>
          <td>24</td>
          <td>13</td>
      </tr>
      <tr>
          <td>CHGNet</td>
          <td>89</td>
          <td>114</td>
          <td>45</td>
          <td>21</td>
      </tr>
      <tr>
          <td>M3GNet</td>
          <td>98</td>
          <td>150</td>
          <td>56</td>
          <td>22</td>
      </tr>
  </tbody>
</table>
<p>Direct-force models show dramatically worse performance at the standard 0.01 Å displacement (e.g., eqV2-S-DeNS: 280/224/54/94) but improve at larger displacements (0.2 Å: 58/26/8/8), revealing that their PES is rough near energy minima.</p>
<h4 id="spice-mace-off-table-5">SPICE-MACE-OFF (Table 5)</h4>
<p>Test set MAE for organic molecule energy/force prediction. Energy MAE in meV/atom, force MAE in meV/Å:</p>
<table>
  <thead>
      <tr>
          <th>Dataset</th>
          <th>MACE-4.7M (E/F)</th>
          <th>EscAIP-45M* (E/F)</th>
          <th>eSEN-3.2M (E/F)</th>
          <th>eSEN-6.5M (E/F)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>PubChem</td>
          <td>0.88 / 14.75</td>
          <td>0.53 / 5.86</td>
          <td>0.22 / 6.10</td>
          <td><strong>0.15</strong> / <strong>4.21</strong></td>
      </tr>
      <tr>
          <td>DES370K M.</td>
          <td>0.59 / 6.58</td>
          <td>0.41 / 3.48</td>
          <td>0.17 / 1.85</td>
          <td><strong>0.13</strong> / <strong>1.24</strong></td>
      </tr>
      <tr>
          <td>DES370K D.</td>
          <td>0.54 / 6.62</td>
          <td>0.38 / 2.18</td>
          <td>0.20 / 2.77</td>
          <td><strong>0.15</strong> / <strong>2.12</strong></td>
      </tr>
      <tr>
          <td>Dipeptides</td>
          <td>0.42 / 10.19</td>
          <td>0.31 / 5.21</td>
          <td>0.10 / 3.04</td>
          <td><strong>0.07</strong> / <strong>2.00</strong></td>
      </tr>
      <tr>
          <td>Sol. AA</td>
          <td>0.98 / 19.43</td>
          <td>0.61 / 11.52</td>
          <td>0.30 / 5.76</td>
          <td><strong>0.25</strong> / <strong>3.68</strong></td>
      </tr>
      <tr>
          <td>Water</td>
          <td>0.83 / 13.57</td>
          <td>0.72 / 10.31</td>
          <td>0.24 / 3.88</td>
          <td><strong>0.15</strong> / <strong>2.50</strong></td>
      </tr>
      <tr>
          <td>QMugs</td>
          <td>0.45 / 16.93</td>
          <td>0.41 / 8.74</td>
          <td>0.16 / 5.70</td>
          <td><strong>0.12</strong> / <strong>3.78</strong></td>
      </tr>
  </tbody>
</table>
<p>*EscAIP-45M is a direct-force model. eSEN-6.5M outperforms MACE-OFF-L and EscAIP on all test splits. The smaller eSEN-3.2M has inference efficiency comparable to MACE-4.7M while achieving lower MAE.</p>
<hr>
<h2 id="why-these-design-choices-matter">Why These Design Choices Matter</h2>
<h3 id="bounded-energy-derivatives-and-the-verlet-integrator">Bounded Energy Derivatives and the Verlet Integrator</h3>
<p>The theoretical foundation for why smoothness matters comes from Theorem 5.1 of Hairer et al. (2003). For the Verlet integrator (the standard NVE integrator), the total energy drift satisfies:</p>
<p>$$
|E(\mathbf{r}_T, \mathbf{a}) - E(\mathbf{r}_0, \mathbf{a})| \leq C \Delta t^2 + C_N \Delta t^N T
$$</p>
<p>where $T$ is the total simulation time ($T \leq \Delta t^{-N}$), $N$ is the highest order for which the $N$th derivative of $E$ is continuously differentiable with bounded derivative, and $C$, $C_N$ are constants independent of $T$ and $\Delta t$. The first term is a time-independent fluctuation of $O(\Delta t^2)$; the second term governs long-term conservation. This means the PES must be continuously differentiable to high order, with bounded derivatives, for energy conservation in long-time simulations.</p>
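<p>The practical content of the bound is easy to see with the velocity Verlet integrator on a perfectly smooth PES (a harmonic oscillator): the total energy only fluctuates at $O(\Delta t^2)$ and shows no secular drift, even over very long runs. A self-contained sketch:</p>

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Standard velocity Verlet, the NVE integrator the bound refers to."""
    a = force(x) / mass
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt
        a_new = force(x) / mass
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return x, v

# Harmonic oscillator: E is smooth to all orders, so only the O(dt^2)
# fluctuation term survives and there is no long-term drift.
k, m, dt = 1.0, 1.0, 0.01
total_energy = lambda x, v: 0.5 * k * x * x + 0.5 * m * v * v
x0, v0 = 1.0, 0.0
x, v = velocity_verlet(x0, v0, lambda q: -k * q, m, dt, steps=100_000)
drift = abs(total_energy(x, v) - total_energy(x0, v0))
assert drift < 1e-3  # bounded fluctuation even after 100k steps
```

<p>Replacing the smooth force with anything discontinuous (e.g. a hard neighbor-list truncation) breaks the premise of the theorem, and the second term then grows with $T$.</p>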
<h3 id="architectural-choices-that-break-conservation">Architectural Choices That Break Conservation</h3>
<p>The authors provide theoretical justification for why specific architectural choices break energy conservation:</p>
<ul>
<li><strong>Max Neighbor Limit (KNN)</strong>: Introduces discontinuity in the PES. If a neighbor at distance $r$ moves to $r + \epsilon$ and drops out of the top-$K$, the energy changes discontinuously.</li>
<li><strong>Grid Discretization</strong>: Projecting spherical harmonics to a spatial grid introduces discretization errors in energy gradients that break conservation. This can be mitigated with higher-resolution grids but not eliminated.</li>
<li><strong>Direct-Force Prediction</strong>: Imposes no mathematical constraint that forces must be the gradient of an energy scalar field. In other words, $\nabla \times \mathbf{F} \neq 0$ is permitted, violating the requirement for a conservative force field.</li>
</ul>
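<p>The max-neighbor discontinuity is easy to demonstrate with a toy per-atom energy: when two neighbors carrying different (species-dependent) feature weights swap ranks at the $K$-th position, an infinitesimal displacement produces an $O(1)$ energy jump. The example below is hypothetical, not drawn from the paper:</p>

```python
def knn_energy(neighbors, k=2):
    """Toy per-atom energy from the K nearest neighbors only. Each neighbor
    is (distance, weight), where the weight stands in for learned,
    species-dependent features."""
    nearest = sorted(neighbors)[:k]
    return sum(w / d for d, w in nearest)

# Neighbor B sits just inside neighbor C; nudging B past C by 2e-9 swaps
# which one falls inside the top-K, and their different "features" make
# the energy jump discontinuously.
eps = 1e-9
before = knn_energy([(1.0, 1.0), (2.0, 1.0), (2.0 + eps, 5.0)], k=2)
after = knn_energy([(1.0, 1.0), (2.0 + 2 * eps, 1.0), (2.0 + eps, 5.0)], k=2)
jump = abs(after - before)
assert jump > 1.0  # O(1) energy change from an O(1e-9) displacement
```

<p>A distance cutoff with a smooth envelope avoids this entirely: a neighbor's contribution is already zero (with zero derivatives) by the time it leaves the graph.</p>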
<h3 id="displacement-sensitivity-in-phonon-calculations">Displacement Sensitivity in Phonon Calculations</h3>
<p>An important empirical finding concerns how displacement values affect phonon predictions. Conservative models (eSEN, MACE) show convergent phonon band structures as displacement decreases toward zero. In contrast, direct-force models (eqV2-S-DeNS) fail to converge, exhibiting missing acoustic branches and spurious imaginary frequencies at small displacements. While direct-force models achieve competitive thermodynamic property accuracy at large displacements (0.2 Å), this is deceptive: the underlying phonon band structures remain inaccurate, and the apparent accuracy comes from Boltzmann-weighted integrals smoothing over errors.</p>
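<p>The displacement sensitivity has a simple numerical analogue. Estimating a force constant by finite differences on a PES carrying a tiny high-frequency ripple (a stand-in for the roughness of a direct-force PES near minima) degrades as the displacement shrinks, while a large displacement averages the ripple away; on a smooth PES the estimate converges. This toy model is illustrative, not taken from the paper:</p>

```python
import math

def force_constant(pes, x0, h):
    """Finite-difference estimate of the second derivative (force constant)
    at displacement h, as in frozen-phonon calculations."""
    return (pes(x0 + h) - 2 * pes(x0) + pes(x0 - h)) / (h * h)

smooth = lambda x: 0.5 * x * x                           # ideal harmonic PES, k = 1
ripple = lambda x: smooth(x) + 1e-6 * math.sin(1e3 * x)  # tiny high-freq roughness

# Smooth PES: the estimate is accurate even at a small displacement.
assert abs(force_constant(smooth, 0.3, 0.01) - 1.0) < 1e-4
# Rough PES: the ripple's curvature (amplitude * freq^2 = 1.0) contaminates
# small-h estimates, while a large displacement averages it away.
err_small = abs(force_constant(ripple, 0.3, 0.001) - 1.0)
err_large = abs(force_constant(ripple, 0.3, 0.2) - 1.0)
assert err_large < err_small
```

<p>This mirrors the benchmark behavior: direct-force models look accurate at 0.2 Å displacements yet fail as the displacement approaches the physically meaningful small-displacement limit.</p>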
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Fu, X., Wood, B. M., Barroso-Luque, L., Levine, D. S., Gao, M., Dzamba, M., &amp; Zitnick, C. L. (2025). Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction. <em>Proceedings of the 42nd International Conference on Machine Learning (ICML)</em>, PMLR 267:17875–17893.</p>
<p><strong>Publication</strong>: ICML 2025 (Spotlight)</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{fu2025learning,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Fu, Xiang and Wood, Brandon M. and Barroso-Luque, Luis and Levine, Daniel S. and Gao, Meng and Dzamba, Misko and Zitnick, C. Lawrence}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 42nd International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{Proceedings of Machine Learning Research}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{267}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{17875--17893}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{PMLR}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://icml.cc/virtual/2025/poster/45302">ICML 2025 poster page</a></li>
<li><a href="https://openreview.net/forum?id=R0PBjxIbgm">OpenReview forum</a></li>
<li><a href="https://openreview.net/pdf?id=R0PBjxIbgm">PDF on OpenReview</a></li>
<li><a href="https://huggingface.co/facebook/OMAT24">OMAT24 model on Hugging Face</a></li>
<li><a href="https://github.com/facebookresearch/fairchem">Code on GitHub (fairchem)</a></li>
</ul>
]]></content:encoded></item><item><title>Efficient DFT Hamiltonian Prediction via Adaptive Sparsity</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/efficient-dft-hamiltonian-predicton-sphnet/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/efficient-dft-hamiltonian-predicton-sphnet/</guid><description>Luo et al. introduce SPHNet, using adaptive sparsity to achieve up to 7x speedup in SE(3)-equivariant Hamiltonian prediction.</description><content:encoded><![CDATA[<h2 id="core-innovation-adaptive-sparsity-in-se3-networks">Core Innovation: Adaptive Sparsity in SE(3) Networks</h2>
<p>This is a <strong>methodological paper</strong> introducing a novel architecture and training curriculum to solve efficiency bottlenecks in Geometric Deep Learning. It directly tackles the primary computational bottleneck in modern SE(3)-equivariant graph neural networks (the tensor product operation) and proposes a generalizable solution through adaptive network sparsification.</p>
<h2 id="the-computational-bottleneck-in-dft-hamiltonian-prediction">The Computational Bottleneck in DFT Hamiltonian Prediction</h2>
<p>SE(3)-equivariant networks are accurate but unscalable for DFT Hamiltonian prediction due to two key bottlenecks:</p>
<ul>
<li><strong>Atom Scaling</strong>: Tensor Product (TP) operations grow quadratically with atoms ($N^2$).</li>
<li><strong>Basis Set Scaling</strong>: Computational complexity grows with the sixth power of the angular momentum order ($L^6$). Larger basis sets (e.g., def2-TZVP) require higher orders ($L=6$), making them prohibitively slow.</li>
</ul>
<p>Existing SE(3)-equivariant models cannot handle large molecules (40-100 atoms) with high-quality basis sets, limiting their practical applicability in computational chemistry.</p>
<h2 id="sphnet-architecture-and-the-three-phase-sparsity-scheduler">SPHNet Architecture and the Three-Phase Sparsity Scheduler</h2>
<p><strong>SPHNet</strong> introduces <strong>Adaptive Sparsity</strong> to prune redundant computations at two levels:</p>
<ol>
<li><strong>Sparse Pair Gate</strong>: Learns which atom pairs to include in message passing, adapting the interaction graph based on importance.</li>
<li><strong>Sparse TP Gate</strong>: Filters which spherical harmonic triplets $(l_1, l_2, l_3)$ are computed in tensor product operations, pruning higher-order combinations that contribute less to accuracy.</li>
<li><strong>Three-Phase Sparsity Scheduler</strong>: A training curriculum (Random → Adaptive → Fixed) that enables stable convergence to high-performing sparse subnetworks.</li>
</ol>
<p>Key insight: The Sparse Pair Gate learns to preserve long-range interactions (16-25 Å) at higher rates than short-range ones. Short-range pairs are abundant and easier to learn, while rare long-range interactions require more samples for accurate representation, making them more critical to retain.</p>
<h2 id="benchmarks-and-ablation-studies">Benchmarks and Ablation Studies</h2>
<p>The authors evaluated SPHNet on three datasets (MD17, QH9, and PubChemQH) with varying molecule sizes and basis set complexities. Baselines include SchNOrb, PhiSNet, QHNet, and WANet. SchNOrb and PhiSNet results are limited to MD17, as those models are designed for trajectory datasets. WANet was not open-sourced, so only partial metrics from its paper are reported.</p>
<h3 id="evaluation-metrics">Evaluation Metrics</h3>
<ul>
<li><strong>Hamiltonian MAE ($H$)</strong>: Mean absolute error between predicted and DFT-computed Hamiltonian matrices, in Hartrees ($E_h$)</li>
<li><strong>Occupied Orbital Energy MAE ($\epsilon$)</strong>: Mean absolute error of all occupied molecular orbital energies derived from the predicted Hamiltonian</li>
<li><strong>Orbital Coefficient Similarity ($\psi$)</strong>: Cosine similarity of occupied molecular orbital coefficients between predicted and reference wavefunctions</li>
</ul>
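<p>For intuition, all three metrics can be computed from an eigendecomposition. A simplified numpy sketch (toy real symmetric matrices; this ignores the overlap matrix and generalized eigenproblem of actual DFT):</p>

```python
import numpy as np

rng = np.random.default_rng(42)
n, n_occ = 6, 3  # toy basis size and number of "occupied" orbitals

A = rng.standard_normal((n, n))
H_ref = (A + A.T) / 2                      # symmetric toy "Hamiltonian"
B = rng.standard_normal((n, n))
H_pred = H_ref + 1e-3 * (B + B.T) / 2      # a slightly perturbed "prediction"

# Hamiltonian MAE
mae_H = np.abs(H_pred - H_ref).mean()

# Orbital energies and coefficients (eigh returns eigenvalues in ascending order)
eps_ref, C_ref = np.linalg.eigh(H_ref)
eps_pred, C_pred = np.linalg.eigh(H_pred)
mae_eps = np.abs(eps_pred[:n_occ] - eps_ref[:n_occ]).mean()

# Coefficient similarity: |cosine| per occupied orbital (eigenvector signs are arbitrary)
psi = np.abs(np.sum(C_ref[:, :n_occ] * C_pred[:, :n_occ], axis=0)).mean()
print(mae_H, mae_eps, psi)
```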
<h3 id="ablation-studies">Ablation Studies</h3>
<p><strong>Sparse Gates</strong> (on PubChemQH):</p>
<table>
  <thead>
      <tr>
          <th>Configuration</th>
          <th>$H$ [$10^{-6} E_h$] $\downarrow$</th>
          <th>Memory [GB] $\downarrow$</th>
          <th>Speedup $\uparrow$</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Both gates</td>
          <td>97.31</td>
          <td>5.62</td>
          <td>7.09x</td>
      </tr>
      <tr>
          <td>Pair Gate only</td>
          <td>87.70</td>
          <td>6.98</td>
          <td>2.73x</td>
      </tr>
      <tr>
          <td>TP Gate only</td>
          <td>94.31</td>
          <td>8.04</td>
          <td>3.98x</td>
      </tr>
      <tr>
          <td>Neither gate</td>
          <td>86.35</td>
          <td>10.91</td>
          <td>1.73x</td>
      </tr>
  </tbody>
</table>
<p>The Sparse Pair Gate contributes a 78% speedup with 30% memory reduction. The Sparse TP Gate (pruning 70% of combinations) yields a 160% speedup. Both gates together achieve the highest speedup, though accuracy slightly decreases compared to no gating.</p>
<p><strong>Three-Phase Scheduler</strong>: Removing the random phase causes convergence to local optima ($112.68 \pm 10.75$ vs $97.31 \pm 0.52$). Removing the adaptive phase increases variance and lowers accuracy ($122.79 \pm 19.02$). Removing the fixed phase has minimal accuracy impact but reduces speedup from 7.09x to 5.45x due to dynamic graph overhead.</p>
<p><strong>Sparsity Rate</strong>: The critical sparsity threshold scales with system complexity: 30% for MD17 (small molecules), 40% for QH9 (medium), and 70% for PubChemQH (large). Beyond the threshold, MAE increases sharply. Computational cost decreases approximately linearly with sparsity rate.</p>
<h3 id="transferability-to-other-models">Transferability to Other Models</h3>
<p>To demonstrate the speedup is architecture-agnostic, the authors applied the Sparse Pair Gate and Sparse TP Gate to the QHNet baseline on PubChemQH:</p>
<table>
  <thead>
      <tr>
          <th>Configuration</th>
          <th>$H$ [$10^{-6} E_h$] $\downarrow$</th>
          <th>Memory [GB] $\downarrow$</th>
          <th>Speedup $\uparrow$</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>QHNet baseline</td>
          <td>123.74</td>
          <td>22.50</td>
          <td>1.00x</td>
      </tr>
      <tr>
          <td>+ TP Gate</td>
          <td>128.16</td>
          <td>12.68</td>
          <td>2.04x</td>
      </tr>
      <tr>
          <td>+ Pair Gate</td>
          <td>126.27</td>
          <td>10.07</td>
          <td>1.66x</td>
      </tr>
      <tr>
          <td>+ Both gates</td>
          <td>128.89</td>
          <td>8.46</td>
          <td>3.30x</td>
      </tr>
  </tbody>
</table>
<p>The gates reduced QHNet&rsquo;s memory by 62% and improved speed by 3.3x with modest accuracy trade-off, confirming the gates are portable modules applicable to other SE(3)-equivariant architectures.</p>
<h2 id="performance-results">Performance Results</h2>
<h3 id="qh9-134k-molecules-leq-20-atoms">QH9 (134k molecules, $\leq$ 20 atoms)</h3>
<p>SPHNet achieves 3.3x to 4.0x speedup over QHNet across all four QH9 splits, with improved Hamiltonian MAE and orbital energy MAE. Memory drops to 0.23 GB/sample (33% of QHNet&rsquo;s 0.70 GB). On the stable-iid split, Hamiltonian MAE improves from 76.31 to 45.48 ($10^{-6} E_h$).</p>
<h3 id="pubchemqh-50k-molecules-40-100-atoms">PubChemQH (50k molecules, 40-100 atoms)</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>$H$ [$10^{-6} E_h$] $\downarrow$</th>
          <th>$\epsilon$ [$E_h$] $\downarrow$</th>
          <th>$\psi$ [$10^{-2}$] $\uparrow$</th>
          <th>Memory [GB] $\downarrow$</th>
          <th>Speedup $\uparrow$</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>QHNet</td>
          <td>123.74</td>
          <td>3.33</td>
          <td>2.32</td>
          <td>22.5</td>
          <td>1.0x</td>
      </tr>
      <tr>
          <td>WANet</td>
          <td>99.98</td>
          <td><strong>1.17</strong></td>
          <td><strong>3.13</strong></td>
          <td>15.0</td>
          <td>2.4x</td>
      </tr>
      <tr>
          <td>SPHNet</td>
          <td><strong>97.31</strong></td>
          <td>2.16</td>
          <td>2.97</td>
          <td><strong>5.62</strong></td>
          <td><strong>7.1x</strong></td>
      </tr>
  </tbody>
</table>
<p>SPHNet achieves the best Hamiltonian MAE and efficiency, though WANet outperforms on orbital energy MAE and coefficient similarity. The higher speedup on PubChemQH (vs QH9) reflects greater computational redundancy in larger systems with higher-order basis sets ($L_{max} = 6$ for def2-TZVP vs $L_{max} = 4$ for def2-SVP).</p>
<h3 id="md17-small-molecule-trajectories">MD17 (Small Molecule Trajectories)</h3>
<p>SPHNet achieves accuracy comparable to QHNet and PhiSNet on four MD17 molecules (water, ethanol, malondialdehyde, uracil; 3-12 atoms). MD17 represents a simpler task where baseline models already perform well, leaving limited room for improvement. For water (3 atoms), the number of interaction combinations is inherently small, limiting the benefit of adaptive sparsification.</p>
<h3 id="scaling-limit">Scaling Limit</h3>
<p>SPHNet can train on systems with approximately 3000 atomic orbitals on a single A6000 GPU; the QHNet baseline runs out of memory at approximately 1800 orbitals. Memory consumption scales more favorably as molecule size increases.</p>
<h3 id="key-findings">Key Findings</h3>
<ul>
<li><strong>Adaptive sparsity scales with system complexity</strong>: The method is most effective for large systems where redundancy is high. For small molecules (e.g., water with only 3 atoms), every interaction is critical, so pruning hurts accuracy and yields negligible speedup.</li>
<li><strong>Long-range pair preservation</strong>: The Sparse Pair Gate selects long-range pairs (16-25 Å) at higher rates than short-range ones. Short-range pairs are numerous and easier to learn, while rare long-range interactions are harder to represent and thus more critical to retain.</li>
<li><strong>Generalizable components</strong>: The sparsification techniques are portable modules, demonstrated by successful integration into QHNet with 3.3x speedup.</li>
<li><strong>Architecture ablation</strong>: Removing one Vectorial Node Interaction block or Spherical Node Interaction block significantly hurts accuracy, confirming the importance of the progressive order-increase design. Removing one Pair Construction block has less impact, suggesting room for further speedup.</li>
</ul>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/microsoft/SPHNet">SPHNet (GitHub)</a></td>
          <td>Code</td>
          <td>MIT</td>
          <td>Official implementation; archived by Microsoft (Dec 2025), read-only</td>
      </tr>
      <tr>
          <td><a href="https://huggingface.co/datasets/EperLuo/PubChemQH">PubChemQH (Hugging Face)</a></td>
          <td>Dataset</td>
          <td>MIT</td>
          <td>50k molecules, 40-100 atoms, def2-TZVP basis</td>
      </tr>
  </tbody>
</table>
<p>No pre-trained model weights are provided. MD17 and QH9 are publicly available community datasets. Training requires 4x NVIDIA A100 (80GB) GPUs; benchmarking uses a single NVIDIA RTX A6000 (46GB).</p>
<h3 id="data">Data</h3>
<p>The experiments evaluated SPHNet on three datasets with different molecular sizes and basis set complexities. All datasets use DFT calculations as ground truth, with MD17 using the PBE exchange-correlation functional and QH9/PubChemQH using B3LYP.</p>
<table>
  <thead>
      <tr>
          <th>Dataset</th>
          <th>Molecules</th>
          <th>Molecule Size</th>
          <th>Basis Set</th>
          <th>$L_{max}$</th>
          <th>Functional</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>MD17</td>
          <td>4 systems</td>
          <td>3-12 atoms (water, ethanol, malondialdehyde, uracil)</td>
          <td>def2-SVP</td>
          <td>4</td>
          <td>PBE</td>
      </tr>
      <tr>
          <td>QH9</td>
          <td>134k</td>
          <td>$\leq$ 20 atoms (Stable/Dynamic splits)</td>
          <td>def2-SVP</td>
          <td>4</td>
          <td>B3LYP</td>
      </tr>
      <tr>
          <td>PubChemQH</td>
          <td>50k</td>
          <td>40-100 atoms</td>
          <td>def2-TZVP</td>
          <td>6</td>
          <td>B3LYP</td>
      </tr>
  </tbody>
</table>
<p><strong>Data Availability</strong>:</p>
<ul>
<li><strong>MD17 &amp; QH9</strong>: Publicly available</li>
<li><strong>PubChemQH</strong>: Publicly available on Hugging Face (<a href="https://huggingface.co/datasets/EperLuo/PubChemQH">EperLuo/PubChemQH</a>)</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<p><strong>Loss Function</strong>:</p>
<p>The model learns the <strong>residual</strong> $\Delta H$:</p>
<p>$$
\begin{aligned}
\Delta H &amp;= H_{\text{ref}} - H_{\text{init}} \\
\mathcal{L} &amp;= \text{MAE}(H_{\text{ref}}, H_{\text{pred}}) + \text{MSE}(H_{\text{ref}}, H_{\text{pred}})
\end{aligned}
$$</p>
<p>where $H_{\text{init}}$ is a computationally inexpensive initial guess computed via PySCF; adding the predicted residual back gives the final prediction, $H_{\text{pred}} = H_{\text{init}} + \Delta H$.</p>
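<p>A minimal numpy sketch of this objective (my reading of the setup; the implementation&rsquo;s exact reductions and weightings may differ):</p>

```python
import numpy as np

def sphnet_style_loss(H_ref, H_pred):
    """Combined MAE + MSE on the predicted Hamiltonian, as written above."""
    diff = H_pred - H_ref
    return np.abs(diff).mean() + (diff ** 2).mean()

# Residual learning: the network predicts Delta H on top of a cheap initial guess.
H_init = np.eye(3)                     # stand-in for the PySCF initial guess
delta_H = 0.1 * np.ones((3, 3))        # stand-in for the network output
H_pred = H_init + delta_H
H_ref = np.eye(3) + 0.1 * np.ones((3, 3))
print(sphnet_style_loss(H_ref, H_pred))  # 0.0 for a perfect residual
```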
<p><strong>Hyperparameters</strong>:</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>PubChemQH</th>
          <th>QH9</th>
          <th>MD17</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Batch Size</td>
          <td>8</td>
          <td>32</td>
          <td>10 (uracil: 5)</td>
      </tr>
      <tr>
          <td>Training Steps</td>
          <td>300k</td>
          <td>260k</td>
          <td>200k</td>
      </tr>
      <tr>
          <td>Warmup Steps</td>
          <td>1k</td>
          <td>1k</td>
          <td>1k</td>
      </tr>
      <tr>
          <td>Learning Rate</td>
          <td>1e-3</td>
          <td>1e-3</td>
          <td>5e-4</td>
      </tr>
      <tr>
          <td>Sparsity Rate</td>
          <td>0.7</td>
          <td>0.4</td>
          <td>0.1-0.3</td>
      </tr>
      <tr>
          <td>TSS Epoch $t$</td>
          <td>3</td>
          <td>3</td>
          <td>3</td>
      </tr>
  </tbody>
</table>
<p><strong>Sparse Pair Gate</strong>: Adapts the interaction graph. It concatenates zero-order features and inner products of atom pairs, then passes them through a linear layer $F_p$ with sigmoid activation to learn a weight $W_p^{ij}$ for every pair. Pairs are kept only if selected by the scheduler ($U_p^{TSS}$). The overhead comes primarily from the linear layer $F_p$.</p>
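<p>The gating logic can be sketched in a few lines of numpy. Feature shapes and scoring here are hypothetical; the real gate operates on equivariant features inside the network, and selection is controlled by the scheduler $U_p^{TSS}$:</p>

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sparse_pair_gate(pair_feats, W, b, sparsity):
    """Score each atom pair with a linear layer + sigmoid (the F_p of the text),
    then keep only the top (1 - sparsity) fraction of pairs."""
    weights = sigmoid(pair_feats @ W + b)          # learned pair weight W_p^{ij}
    n_keep = max(1, int(round((1 - sparsity) * len(weights))))
    keep = np.argsort(weights)[-n_keep:]           # indices the scheduler would select
    return np.sort(keep), weights

rng = np.random.default_rng(0)
P, d = 100, 8                                      # hypothetical pair count / feature dim
feats = rng.standard_normal((P, d))
W, b = rng.standard_normal(d), 0.0
kept, w = sparse_pair_gate(feats, W, b, sparsity=0.7)
print(len(kept))  # 30 of 100 pairs survive at 70% sparsity
```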
<p><strong>Sparse TP Gate</strong>: Filters triplets $(l_1, l_2, l_3)$ inside the TP operation. Higher-order combinations are more likely to be pruned. Complexity: $\mathcal{O}(L^3)$.</p>
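<p>To see how large the pruning space is, a short sketch enumerating the symmetry-allowed triplets (those satisfying the triangle inequality $|l_1 - l_2| \leq l_3 \leq l_1 + l_2$) up to a maximum order, i.e. the candidate set the Sparse TP Gate filters (ignoring channel multiplicities):</p>

```python
def tp_triplets(L_max):
    """Enumerate valid tensor-product paths (l1, l2, l3) with the triangle
    inequality |l1 - l2| <= l3 <= l1 + l2 and all orders <= L_max."""
    return [(l1, l2, l3)
            for l1 in range(L_max + 1)
            for l2 in range(L_max + 1)
            for l3 in range(abs(l1 - l2), min(l1 + l2, L_max) + 1)]

for L in (4, 6):
    print(L, len(tp_triplets(L)))
```

<p>Moving from $L_{max} = 4$ (def2-SVP) to $L_{max} = 6$ (def2-TZVP) grows the path count from 65 to 175, and each individual path also becomes more expensive, which is why pruning 70% of combinations pays off on PubChemQH.</p>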
<p><strong>Three-Phase Sparsity Scheduler</strong>: Training curriculum designed to optimize the sparse gates effectively:</p>
<ul>
<li><strong>Phase 1 (Random)</strong>: Each candidate is kept with probability $1-k$, where $k$ is the sparsity rate, ensuring unbiased weight updates. Complexity: $\mathcal{O}(|U|)$.</li>
<li><strong>Phase 2 (Adaptive)</strong>: Selects the top $(1-k)$ fraction of candidates by learned weight magnitude. Complexity: $\mathcal{O}(|U|\log|U|)$.</li>
<li><strong>Phase 3 (Fixed)</strong>: Freezes the connectivity mask for maximum inference speed. No overhead.</li>
</ul>
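<p>The three phases amount to three different mask-selection rules over the candidate set. A hypothetical sketch (names and shapes are mine):</p>

```python
import numpy as np

def scheduler_mask(weights, sparsity, phase, rng, frozen_mask=None):
    """Select a boolean keep-mask over |U| candidates, per scheduler phase."""
    n = len(weights)
    if phase == "random":                     # Phase 1: unbiased exploration
        return rng.random(n) < (1 - sparsity)
    if phase == "adaptive":                   # Phase 2: top (1 - sparsity) by magnitude
        n_keep = int(round((1 - sparsity) * n))
        mask = np.zeros(n, dtype=bool)
        mask[np.argsort(np.abs(weights))[-n_keep:]] = True
        return mask
    if phase == "fixed":                      # Phase 3: reuse the frozen mask, no overhead
        return frozen_mask
    raise ValueError(phase)

rng = np.random.default_rng(1)
w = rng.standard_normal(1000)
m_rand = scheduler_mask(w, 0.7, "random", rng)
m_adapt = scheduler_mask(w, 0.7, "adaptive", rng)
m_fixed = scheduler_mask(w, 0.7, "fixed", rng, frozen_mask=m_adapt)
print(int(m_adapt.sum()), int(m_fixed.sum()))
```

<p>Phase 1 keeps gradient flow to all candidates, Phase 2 concentrates training on the strongest ones, and Phase 3 removes the selection overhead entirely, consistent with the reported speedup gap (7.09x vs 5.45x without the fixed phase).</p>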
<p><strong>Weight Initialization</strong>: Learnable sparsity weights ($W$) initialized as all-ones vector.</p>
<h3 id="models">Models</h3>
<p>The model predicts the Hamiltonian matrix $H$ from atomic numbers $Z$ and coordinates $r$.</p>
<p><strong>Inputs</strong>: Atomic numbers ($Z$) and 3D coordinates.</p>
<p><strong>Backbone Structure</strong>:</p>
<ol>
<li><strong>Vectorial Node Interaction (x4)</strong>: Uses long-short range message passing. Extracts vectorial representations ($l=1$) without high-order TPs to save cost.</li>
<li><strong>Spherical Node Interaction (x2)</strong>: Projects features to high-order spherical harmonics (up to $L_{max}$). The first block increases the maximum order from 0 to $L_{max}$ without the Sparse Pair Gate; the second block applies the <strong>Sparse Pair Gate</strong> to filter node pairs.</li>
<li><strong>Pair Construction Block (x2)</strong>: Splits into <strong>Diagonal</strong> (self-interaction) and <strong>Non-Diagonal</strong> (cross-interaction) blocks. Both use the <strong>Sparse TP Gate</strong> to prune cross-order combinations $(l_1, l_2, l_3)$. The Non-Diagonal blocks also use the <strong>Sparse Pair Gate</strong> to filter atom pairs. The two Pair Construction blocks receive representations from the two Spherical Node Interaction blocks respectively, and their outputs are summed.</li>
<li><strong>Expansion Block</strong>: Reconstructs the full Hamiltonian matrix from the sparse irreducible representations, exploiting symmetry ($H_{ji} = H_{ij}^T$) to halve computations.</li>
</ol>
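<p>The symmetry trick in the Expansion Block is simple to sketch: only blocks with $i \leq j$ are computed, and the lower triangle is filled by transposition (the block layout here is hypothetical):</p>

```python
import numpy as np

def assemble_hamiltonian(blocks, sizes):
    """Assemble a full symmetric H from upper-triangular atom-pair blocks only,
    exploiting H_ji = H_ij^T to skip computing the lower triangle."""
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    n = offsets[-1]
    H = np.zeros((n, n))
    for (i, j), B in blocks.items():          # only blocks with i <= j are provided
        H[offsets[i]:offsets[i+1], offsets[j]:offsets[j+1]] = B
        if i != j:
            H[offsets[j]:offsets[j+1], offsets[i]:offsets[i+1]] = B.T
    return H

sizes = [2, 3]                                # hypothetical per-atom orbital counts
rng = np.random.default_rng(0)
diag0 = rng.standard_normal((2, 2)); diag0 = (diag0 + diag0.T) / 2
diag1 = rng.standard_normal((3, 3)); diag1 = (diag1 + diag1.T) / 2
off = rng.standard_normal((2, 3))
H = assemble_hamiltonian({(0, 0): diag0, (1, 1): diag1, (0, 1): off}, sizes)
print(np.allclose(H, H.T))  # True
```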
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Training</strong>: 4x NVIDIA A100 (80GB)</li>
<li><strong>Benchmarking</strong>: Single NVIDIA RTX A6000 (46GB)</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Luo, E., Wei, X., Huang, L., Li, Y., Yang, H., Xia, Z., Wang, Z., Liu, C., Shao, B., &amp; Zhang, J. (2025). Efficient and Scalable Density Functional Theory Hamiltonian Prediction through Adaptive Sparsity. <em>Proceedings of the 42nd International Conference on Machine Learning</em>, PMLR 267:41368&ndash;41390.</p>
<p><strong>Publication</strong>: ICML 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{luo2025efficient,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Efficient and Scalable Density Functional Theory Hamiltonian Prediction through Adaptive Sparsity}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Luo, Erpai and Wei, Xinran and Huang, Lin and Li, Yunyang and Yang, Han and Xia, Zaishuo and Wang, Zun and Liu, Chang and Shao, Bin and Zhang, Jia}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 42nd International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{41368--41390}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{267}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{Proceedings of Machine Learning Research}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{PMLR}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://icml.cc/virtual/2025/poster/45656">ICML 2025 poster page</a></li>
<li><a href="https://openreview.net/forum?id=K3lykWhXON">OpenReview forum</a></li>
<li><a href="https://openreview.net/pdf?id=K3lykWhXON">PDF on OpenReview</a></li>
<li><a href="https://github.com/microsoft/SPHNet">GitHub Repository</a> <em>(Note: The official repository was archived by Microsoft in December 2025. It is available for reference but no longer actively maintained.)</em></li>
</ul>
]]></content:encoded></item><item><title>Dark Side of Forces: Non-Conservative ML Force Models</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/dark-side-of-forces/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/dark-side-of-forces/</guid><description>Bigi et al. critique non-conservative force models in ML potentials, showing their simulation failures and proposing hybrid solutions.</description><content:encoded><![CDATA[<h2 id="contribution-systematic-assessment-of-non-conservative-ml-force-models">Contribution: Systematic Assessment of Non-Conservative ML Force Models</h2>
<p>This is a <strong>Systematization</strong> paper. It systematically catalogs the exact failure modes of existing non-conservative force approaches, quantifies them with a new diagnostic metric, and proposes a hybrid Multiple Time-Stepping solution combining the speed benefits of direct force prediction with the physical correctness of conservative models.</p>
<h2 id="motivation-the-speed-accuracy-trade-off-in-ml-force-fields">Motivation: The Speed-Accuracy Trade-off in ML Force Fields</h2>
<p>Many recent machine learning interatomic potential (MLIP) architectures predict forces directly ($F_\theta(r)$). This &ldquo;non-conservative&rdquo; approach avoids the computational overhead of automatic differentiation, yielding faster inference (typically 2-3x speedup) and faster training (up to 3x). However, it sacrifices energy conservation and rotational constraints, potentially destabilizing molecular dynamics simulations. The field lacks rigorous quantification of when this trade-off breaks down and how to mitigate the failures.</p>
<h2 id="novelty-jacobian-asymmetry-and-hybrid-architectures">Novelty: Jacobian Asymmetry and Hybrid Architectures</h2>
<p>Four key contributions:</p>
<ol>
<li>
<p><strong>Jacobian Asymmetry Metric ($\lambda$):</strong> A quantitative diagnostic for non-conservation. Since conservative forces derive from a scalar field, their Jacobian (the Hessian of energy) must be symmetric. The normalized norm of the antisymmetric part quantifies the degree of violation:
$$ \lambda = \frac{|| \mathbf{J}_{\text{anti}} ||_F}{|| \mathbf{J} ||_F} $$
where $\mathbf{J}_{\text{anti}} = (\mathbf{J} - \mathbf{J}^\top)/2$. Measured values range from $\lambda \approx 0.004$ (PET-NC) to $\lambda \approx 0.032$ (SOAP-BPNN-NC), with ORB at 0.015 and EquiformerV2 at 0.017. Notably, the pairwise $\lambda_{ij}$ approaches 1 at large interatomic distances, meaning non-conservative artifacts disproportionately affect long-range and collective interactions.</p>
</li>
<li>
<p><strong>Systematic Failure Mode Catalog:</strong> First comprehensive demonstration that non-conservative models cause runaway heating in NVE ensembles (temperature drifts of ~7,000 billion K/s for PET-NC and ~10x larger for ORB) and equipartition violations in NVT ensembles where different atom types equilibrate to different temperatures, a physical impossibility.</p>
</li>
<li>
<p><strong>Theoretical Analysis of Force vs. Energy Training:</strong> Force-only training overemphasizes high-frequency vibrational modes because force labels carry per-atom gradients that are dominated by stiff, short-range interactions. Energy labels provide a more balanced representation across the frequency spectrum. Additionally, conservative models benefit from backpropagation extending the effective receptive field to approximately 2x the interaction cutoff, while direct-force models are limited to the nominal cutoff radius.</p>
</li>
<li>
<p><strong>Hybrid Training and Inference Protocol:</strong> A practical workflow that combines fast direct-force prediction with conservative corrections:</p>
<ul>
<li><strong>Training:</strong> Pre-train on direct forces, then fine-tune on energy gradients (2-4x faster than training conservative models from scratch)</li>
<li><strong>Inference:</strong> Multiple Time-Stepping (MTS) where fast non-conservative forces are periodically corrected by slower conservative forces</li>
</ul>
</li>
</ol>
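<p>The asymmetry metric is easy to reproduce on toy force fields. A numpy sketch using a finite-difference Jacobian (the quadratic-energy and rotational fields are my own choices, not the paper&rsquo;s models):</p>

```python
import numpy as np

def jacobian(force, x, h=1e-5):
    """Finite-difference Jacobian J_ab = dF_a/dx_b of a force field."""
    n = len(x)
    J = np.zeros((n, n))
    for b in range(n):
        dx = np.zeros(n); dx[b] = h
        J[:, b] = (force(x + dx) - force(x - dx)) / (2 * h)
    return J

def asymmetry(J):
    """lambda = ||(J - J^T)/2||_F / ||J||_F, as defined above."""
    return np.linalg.norm((J - J.T) / 2) / np.linalg.norm(J)

# Conservative toy force: F = -grad E for E = 0.5 * x^T A x (A symmetric).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2
f_cons = lambda x: -A @ x

# Non-conservative toy force: add a rotational (curl-carrying) component.
R = np.array([[0.0, -1.0], [1.0, 0.0]])
S = np.kron(np.eye(2), R)                  # antisymmetric 4x4
f_nc = lambda x: -A @ x + 0.3 * S @ x

x0 = rng.standard_normal(4)
print(asymmetry(jacobian(f_cons, x0)))     # ~0: symmetric Jacobian
print(asymmetry(jacobian(f_nc, x0)))       # clearly nonzero
```

<p>The conservative field yields $\lambda \approx 0$ up to finite-difference error, while the rotational component immediately produces a nonzero $\lambda$, the same signature the paper measures in trained direct-force models.</p>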
<h2 id="methodology-systematic-failure-mode-analysis">Methodology: Systematic Failure Mode Analysis</h2>
<p>The evaluation systematically tests multiple state-of-the-art models across diverse simulation scenarios:</p>
<p><strong>Models tested:</strong></p>
<ul>
<li><strong>PET-C/PET-NC</strong> (Point Edge Transformer, conservative and non-conservative variants)</li>
<li><strong>PET-M</strong> (hybrid variant jointly predicting both conservative and non-conservative forces)</li>
<li><strong>ORB-v2</strong> (non-conservative, trained on Alexandria/MPtrj)</li>
<li><strong>EquiformerV2</strong> (non-conservative equivariant Transformer)</li>
<li><strong>MACE-MP-0</strong> (conservative message-passing)</li>
<li><strong>SevenNet</strong> (conservative message-passing)</li>
<li><strong>SOAP-BPNN-C/SOAP-BPNN-NC</strong> (descriptor-based baseline, both conservative and non-conservative variants)</li>
</ul>
<p><strong>Test scenarios:</strong></p>
<ol>
<li><strong>NVE stability tests</strong> on bulk liquid water, graphene, amorphous carbon, and FCC aluminum</li>
<li><strong>Thermostat artifact analysis</strong> with Langevin and GLE thermostats</li>
<li><strong>Geometry optimization</strong> on water snapshots and QM9 molecules using FIRE and L-BFGS</li>
<li><strong>MTS validation</strong> on OC20 catalysis dataset</li>
<li><strong>Species-resolved temperature measurements</strong> for equipartition testing</li>
</ol>
<p><strong>Key metrics:</strong></p>
<ul>
<li>Jacobian asymmetry ($\lambda$)</li>
<li>Kinetic temperature drift in NVE</li>
<li>Velocity-velocity correlations</li>
<li>Radial distribution functions</li>
<li>Species-resolved temperatures</li>
<li>Inference speed benchmarks</li>
</ul>
<h2 id="results-simulation-instability-and-hybrid-solutions">Results: Simulation Instability and Hybrid Solutions</h2>
<p>Purely non-conservative models are <strong>unsuitable for production simulations</strong> due to uncontrollable unphysical artifacts that no thermostat can correct. Key findings:</p>
<p><strong>Performance failures:</strong></p>
<ul>
<li>Non-conservative models exhibited catastrophic temperature drift in NVE simulations: ~7,000 billion K/s for PET-NC and ~70,000 billion K/s for ORB, with EquiformerV2 comparable to PET-NC</li>
<li>Strong Langevin thermostats ($\tau=10$ fs) damped diffusion by ~5x, negating the speed benefits of non-conservative models</li>
<li>Advanced GLE thermostats also failed to control non-conservative drift (ORB reached 1181 K vs. 300 K target)</li>
<li>Equipartition violations: under stochastic velocity rescaling, O and H atoms equilibrated at different temperatures. For ORB, H atoms reached 336 K and O atoms 230 K against a 300 K target. For PET-NC, deviations were smaller but still significant (H at 296 K, O at 310 K).</li>
<li>Geometry optimization was more fragile with non-conservative forces: inaccurate NC models (SOAP-BPNN-NC) failed catastrophically, while more accurate ones (PET-NC) could converge with FIRE but showed large force fluctuations with L-BFGS. Non-conservative models consistently had lower success rates across water and QM9 benchmarks.</li>
</ul>
<p><strong>Hybrid solution success:</strong></p>
<ul>
<li>MTS with non-conservative forces corrected every 8 steps ($M=8$) achieved conservative stability with only ~20% overhead compared to a purely non-conservative trajectory. Results were essentially indistinguishable from fully conservative simulations. Higher stride values ($M=16$) became unstable due to resonances between fast degrees of freedom and integration errors.</li>
<li>Conservative fine-tuning achieved the accuracy of from-scratch training in about 1/3 the total training time (2-4x resource reduction)</li>
<li>Validated on OC20 catalysis benchmark</li>
</ul>
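<p>The MTS idea can be sketched as an inner/outer integrator loop (a hypothetical RESPA-style splitting for illustration; the paper&rsquo;s actual protocol is implemented in i-PI):</p>

```python
import numpy as np

def mts_step(x, v, f_fast, f_cons, dt, M, mass=1.0):
    """One outer MTS step: M inner velocity-Verlet steps with the cheap (fast)
    force, bracketed by impulse kicks from the correction force f_cons - f_fast."""
    v = v + 0.5 * (M * dt) * (f_cons(x) - f_fast(x)) / mass
    for _ in range(M):                       # inner loop: fast force only
        v = v + 0.5 * dt * f_fast(x) / mass
        x = x + dt * v
        v = v + 0.5 * dt * f_fast(x) / mass
    v = v + 0.5 * (M * dt) * (f_cons(x) - f_fast(x)) / mass
    return x, v

# Sanity check: if the fast force equals the conservative force, the correction
# vanishes and energy stays near-constant for a harmonic oscillator.
f = lambda x: -x
x, v = np.array([1.0]), np.array([0.0])
for _ in range(1000):
    x, v = mts_step(x, v, f, f, dt=0.01, M=8)
energy = 0.5 * v[0]**2 + 0.5 * x[0]**2
print(energy)  # stays close to the initial 0.5
```

<p>In the paper&rsquo;s setting the inner force is the fast non-conservative model and the correction $F_{\text{cons}} - F_{\text{fast}}$ is evaluated only every $M$ steps, so the expensive conservative model contributes only $\sim 1/M$ of the force calls.</p>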
<p><strong>Scaling caveat:</strong> The authors note that as training datasets grow and models become more expressive, non-conservative artifacts should diminish because accurate models naturally exhibit less non-conservative behavior. However, they argue the best path forward is hybrid approaches rather than waiting for scale to solve the problem.</p>
<p><strong>Recommendation:</strong> The optimal production path is hybrid architectures using direct forces for acceleration (via MTS and pre-training) while anchoring models in conservative energy surfaces. This captures computational benefits without sacrificing physical reliability.</p>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p><strong>Primary training/evaluation:</strong></p>
<ul>
<li><strong>Bulk Liquid Water</strong> (Cheng et al., 2019): revPBE0-D3 calculations with over 250,000 force/energy targets, chosen for rigorous thermodynamic testing</li>
</ul>
<p><strong>Generalization tests:</strong></p>
<ul>
<li>Graphene, amorphous carbon, FCC aluminum (tested with general-purpose foundation models)</li>
</ul>
<p><strong>Benchmarks:</strong></p>
<ul>
<li><strong>QM9</strong>: Geometry optimization tests</li>
<li><strong>OC20</strong> (Open Catalyst): Oxygen on alloy surfaces for MTS validation</li>
</ul>
<p>All datasets publicly available through cited sources.</p>
<h3 id="models">Models</h3>
<p><strong>Point Edge Transformer (PET)</strong> variants:</p>
<ul>
<li><strong>PET-C (Conservative)</strong>: Forces via energy backpropagation</li>
<li><strong>PET-NC (Non-Conservative)</strong>: Direct force prediction head, slightly higher parameter count</li>
<li><strong>PET-M (Hybrid)</strong>: Jointly predicts both conservative and non-conservative forces, accuracy within ~10% of the best single-task models</li>
</ul>
<p><strong>Baseline comparisons:</strong></p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Type</th>
          <th>Training Data</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>ORB-v2</td>
          <td>Non-conservative</td>
          <td>Alexandria/MPtrj</td>
          <td>Rotationally unconstrained</td>
      </tr>
      <tr>
          <td>EquiformerV2</td>
          <td>Non-conservative</td>
          <td>Alexandria/MPtrj</td>
          <td>Equivariant Transformer</td>
      </tr>
      <tr>
          <td>MACE-MP-0</td>
          <td>Conservative</td>
          <td>MPtrj</td>
          <td>Equivariant message-passing</td>
      </tr>
      <tr>
          <td>SevenNet</td>
          <td>Conservative</td>
          <td>MPtrj</td>
          <td>Equivariant message-passing</td>
      </tr>
      <tr>
          <td>SOAP-BPNN-C</td>
          <td>Conservative</td>
          <td>Bulk water</td>
          <td>Descriptor-based baseline</td>
      </tr>
      <tr>
          <td>SOAP-BPNN-NC</td>
          <td>Non-conservative</td>
          <td>Bulk water</td>
          <td>Descriptor-based baseline</td>
      </tr>
  </tbody>
</table>
<p><strong>Training details:</strong></p>
<ul>
<li><strong>Loss functions</strong>: PET-C uses joint Energy + Force $L^2$ loss; PET-NC uses Force-only $L^2$ loss</li>
<li><strong>Fine-tuning protocol</strong>: PET-NC converted to conservative via energy head fine-tuning</li>
<li><strong>MTS configuration</strong>: Non-conservative forces with conservative corrections every 8 steps ($M=8$)</li>
</ul>
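<p>The MTS configuration above (cheap non-conservative forces at every inner step, a conservative correction applied every $M=8$ steps) can be sketched as a reversible r-RESPA-style integrator. This is an illustrative sketch, not the authors' i-PI implementation; <code>fast_force</code> and <code>slow_correction</code> are hypothetical callables, where the latter would return $F_\text{conservative} - F_\text{non-conservative}$.</p>

```python
import numpy as np

def mts_step(pos, vel, masses, fast_force, slow_correction, dt, M=8):
    """One outer step of multiple time-stepping (r-RESPA-style sketch).

    fast_force: cheap non-conservative force model, called every inner step.
    slow_correction: expensive conservative correction, called once per
    outer step of length M * dt.
    """
    # Outer half-kick with the slow correction, scaled to the outer step M*dt
    vel = vel + 0.5 * (M * dt) * slow_correction(pos) / masses[:, None]
    # M inner velocity-Verlet steps driven by the fast forces
    f = fast_force(pos)
    for _ in range(M):
        vel = vel + 0.5 * dt * f / masses[:, None]
        pos = pos + dt * vel
        f = fast_force(pos)
        vel = vel + 0.5 * dt * f / masses[:, None]
    # Closing outer half-kick
    vel = vel + 0.5 * (M * dt) * slow_correction(pos) / masses[:, None]
    return pos, vel
```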
<h3 id="evaluation">Evaluation</h3>
<p><strong>Metrics &amp; Software:</strong>
Molecular dynamics evaluations were performed using <strong>i-PI</strong>, while geometry optimizations used <strong>ASE (Atomic Simulation Environment)</strong>. Note that primary code reproducibility is provided via an archived Zenodo snapshot; the authors did not link a live, public GitHub repository.</p>
<ol>
<li><strong>Jacobian asymmetry</strong> ($\lambda$): Quantifies non-conservation via antisymmetric component</li>
<li><strong>Temperature drift</strong>: NVE ensemble stability</li>
<li><strong>Velocity-velocity correlation</strong> ($\hat{c}_{vv}(\omega)$): Thermostat artifact detection</li>
<li><strong>Radial distribution functions</strong> ($g(r)$): Structural accuracy</li>
<li><strong>Species-resolved temperature</strong>: Equipartition testing</li>
<li><strong>Inference speed</strong>: Wall-clock time per MD step</li>
</ol>
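<p>The first metric is easy to probe numerically: a conservative force field is a gradient, so its Jacobian is symmetric, and any antisymmetric component signals non-conservation. A finite-difference sketch of this idea (the paper's exact $\lambda$ normalization may differ):</p>

```python
import numpy as np

def jacobian_asymmetry(force_fn, pos, eps=1e-5):
    """Antisymmetric fraction of the finite-difference force Jacobian.

    For a conservative force F = -grad E the Jacobian is symmetric, so the
    returned ratio is ~0; a purely rotational force field gives ~1.
    """
    x = pos.ravel()
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        f_plus = force_fn((x + dx).reshape(pos.shape)).ravel()
        f_minus = force_fn((x - dx).reshape(pos.shape)).ravel()
        J[:, j] = (f_plus - f_minus) / (2 * eps)
    S = 0.5 * (J + J.T)  # symmetric (conservative) part
    A = 0.5 * (J - J.T)  # antisymmetric (non-conservative) part
    return np.linalg.norm(A) / (np.linalg.norm(S) + np.linalg.norm(A))
```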
<p><strong>Key results:</strong></p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Speed (ms/step)</th>
          <th>NVE Stability</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>PET-NC</td>
          <td>8.58</td>
          <td>Failed</td>
          <td>~7,000 billion K/s drift</td>
      </tr>
      <tr>
          <td>PET-C</td>
          <td>19.4</td>
          <td>Stable</td>
          <td>2.3x slower than PET-NC</td>
      </tr>
      <tr>
          <td>SevenNet</td>
          <td>52.8</td>
          <td>Stable</td>
          <td>Conservative baseline</td>
      </tr>
      <tr>
          <td><strong>PET Hybrid (MTS)</strong></td>
          <td><strong>~10.3</strong></td>
          <td><strong>Stable</strong></td>
          <td><strong>~20% overhead vs. pure NC</strong></td>
      </tr>
  </tbody>
</table>
<p><strong>Thermostat artifacts:</strong></p>
<ul>
<li>Langevin thermostatting ($\tau=10$ fs) damped diffusion by ~5x; weaker coupling at $\tau=100$ fs still reduced it by ~1.5x</li>
<li>GLE thermostats also failed to control non-conservative drift</li>
<li>Equipartition violations under SVR: ORB showed H at 336 K and O at 230 K (target 300 K); PET-NC showed smaller but significant species-resolved deviations</li>
</ul>
<p><strong>Optimization failures:</strong></p>
<ul>
<li>Non-conservative models showed lower geometry optimization success rates across water and QM9 benchmarks, with inaccurate NC models failing catastrophically</li>
</ul>
<h3 id="hardware">Hardware</h3>
<p><strong>Compute resources:</strong></p>
<ul>
<li><strong>Training</strong>: From-scratch baseline models were trained on 4x NVIDIA H100 GPUs for roughly two days.</li>
<li><strong>Fine-tuning</strong>: Conservative fine-tuning used a single NVIDIA H100 GPU for one day.</li>
<li>This hybrid fine-tuning approach achieved a 2-4x reduction in computational resources compared to training conservative models from scratch.</li>
</ul>
<p><strong>Reproduction resources:</strong></p>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://zenodo.org/records/14778891">Zenodo repository</a></td>
          <td>Code/Data</td>
          <td>Unknown</td>
          <td>Code and data to reproduce all results</td>
      </tr>
      <tr>
          <td><a href="https://atomistic-cookbook.org/examples/pet-mad-nc/pet-mad-nc.html">MTS inference tutorial</a></td>
          <td>Other</td>
          <td>Unknown</td>
          <td>Multiple time-stepping dynamics tutorial</td>
      </tr>
      <tr>
          <td><a href="https://atomistic-cookbook.org/examples/pet-finetuning/pet-ft-nc.html">Conservative fine-tuning tutorial</a></td>
          <td>Other</td>
          <td>Unknown</td>
          <td>Fine-tuning workflow tutorial</td>
      </tr>
  </tbody>
</table>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Bigi, F., Langer, M. F., &amp; Ceriotti, M. (2025). The dark side of the forces: assessing non-conservative force models for atomistic machine learning. <em>Proceedings of the 42nd International Conference on Machine Learning</em>, PMLR 267.</p>
<p><strong>Publication</strong>: ICML 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{bigi2025dark,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{The dark side of the forces: assessing non-conservative force models for atomistic machine learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Bigi, Filippo and Langer, Marcel F and Ceriotti, Michele}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 42nd International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{Proceedings of Machine Learning Research}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{267}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">address</span>=<span style="color:#e6db74">{Vancouver, Canada}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://icml.cc/virtual/2025/poster/45458">ICML 2025 poster page</a></li>
<li><a href="https://openreview.net/pdf?id=OEl3L8osas">PDF on OpenReview</a></li>
<li><a href="https://zenodo.org/records/14778891">Zenodo repository</a></li>
<li><a href="https://atomistic-cookbook.org/examples/pet-mad-nc/pet-mad-nc.html">MTS Inference Tutorial</a></li>
<li><a href="https://atomistic-cookbook.org/examples/pet-finetuning/pet-ft-nc.html">Conservative Fine-Tuning Tutorial</a></li>
</ul>
]]></content:encoded></item><item><title>Beyond Atoms: 3D Space Modeling for Molecular Pretraining</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/beyond-atoms/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/beyond-atoms/</guid><description>Lu et al. introduce SpaceFormer, a Transformer that models entire 3D molecular space including atoms for superior representations.</description><content:encoded><![CDATA[<h2 id="paper-typology-and-contribution">Paper Typology and Contribution</h2>
<p>This is a <strong>Method</strong> paper. It challenges the atom-centric paradigm of molecular representation learning by proposing a novel framework that models the continuous 3D space surrounding atoms. The core contribution is <strong>SpaceFormer</strong>, a Transformer-based architecture that discretizes molecular space into grids to capture physical phenomena (electron density, electromagnetic fields) often missed by traditional point-cloud models.</p>
<h2 id="the-physical-intuition-modeling-empty-space">The Physical Intuition: Modeling &ldquo;Empty&rdquo; Space</h2>
<p><strong>The Gap</strong>: Prior 3D molecular representation models, such as Uni-Mol, treat molecules as discrete sets of atoms, essentially point clouds in 3D space. However, from a quantum physics perspective, the &ldquo;empty&rdquo; space between atoms is far from empty. It is permeated by electron density distributions and electromagnetic fields that determine molecular properties.</p>
<p><strong>The Hypothesis</strong>: Explicitly modeling this continuous 3D space alongside discrete atom positions yields superior representations for downstream tasks, particularly for computational properties that depend on electronic structure, such as HOMO/LUMO energies and energy gaps.</p>
<h2 id="a-surprising-observation-virtual-points-improve-representations">A Surprising Observation: Virtual Points Improve Representations</h2>
<p>Before proposing SpaceFormer, the authors present a simple yet revealing experiment. They augment Uni-Mol by adding randomly sampled virtual points (VPs) from the 3D space within the circumscribed cuboid of each molecule. These VPs carry no chemical information whatsoever: they are purely random noise points.</p>
<p>The result is surprising: adding just 10 random VPs already yields a noticeable improvement in validation loss. The improvement remains consistent and gradually increases as the number of VPs grows, eventually reaching a plateau. This observation holds across downstream tasks as well, with Uni-Mol + VPs improving on several quantum property predictions (LUMO, E1-CC2, E2-CC2) compared to vanilla Uni-Mol.</p>
<p>The implication is that even uninformative spatial context helps the model learn better representations, motivating a principled framework for modeling the full 3D molecular space.</p>
<h2 id="spaceformer-voxelization-and-3d-positional-encodings">SpaceFormer: Voxelization and 3D Positional Encodings</h2>
<p>The key innovation is treating the molecular representation problem as <strong>3D space modeling</strong>. SpaceFormer follows these core steps:</p>
<ol>
<li><strong>Voxelizes the entire 3D space</strong> into a grid with cells of $0.49\text{\AA}$ (based on O-H bond length to ensure at most one atom per cell).</li>
<li><strong>Uses adaptive multi-resolution grids</strong> to efficiently handle empty space, keeping it fine-grained near atoms and coarse-grained far away.</li>
<li><strong>Applies Transformers to 3D spatial tokens</strong> with custom positional encodings that achieve linear complexity.</li>
</ol>
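<p>Step 1 (voxelization) amounts to integer division of shifted coordinates. A minimal sketch, assuming the grid origin sits at the corner of the molecule's bounding box (the authors' exact tokenization code is not released):</p>

```python
import numpy as np

CELL = 0.49  # cell side length in Angstrom, so each cell holds at most one atom

def voxelize(coords):
    """Map atom coordinates to integer cell indices plus inner-cell offsets."""
    origin = coords.min(axis=0)
    idx = np.floor((coords - origin) / CELL).astype(int)  # cell index per atom
    offsets = (coords - origin) - idx * CELL              # position within cell
    return idx, offsets
```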
<p>Specifically, the model utilizes two forms of 3D Positional Encoding:</p>
<p><strong>3D Directional PE (RoPE Extension)</strong>
They extend Rotary Positional Encoding (RoPE) to 3D continuous space by splitting the Query and Key vectors into three blocks (one for each spatial axis). The directional attention mechanism takes the form:</p>
<p>$$
\begin{aligned}
\mathbf{q}_{i}^{\top} \mathbf{k}_{j} = \sum_{s=1}^{3} \mathbf{q}_{i,s}^{\top} \mathbf{R}(c_{j,s} - c_{i,s}) \mathbf{k}_{j,s}
\end{aligned}
$$</p>
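<p>A minimal numeric sketch of this block-wise rotation (dimensions are hypothetical; the key property, inherited from 1D RoPE, is that the attention score depends only on coordinate differences and is therefore invariant to rigid translations):</p>

```python
import numpy as np

def rope_1d(vec, coord, freqs):
    """Rotate consecutive pairs of `vec` by angles coord * freqs
    (standard RoPE applied to a continuous scalar coordinate)."""
    v = vec.reshape(-1, 2)
    ang = coord * freqs
    cos, sin = np.cos(ang), np.sin(ang)
    return np.stack([v[:, 0] * cos - v[:, 1] * sin,
                     v[:, 0] * sin + v[:, 1] * cos], axis=1).ravel()

def rope_3d(vec, xyz, freqs):
    """Split the vector into three axis blocks and rotate each block by its
    coordinate; q.k then depends only on the coordinate difference."""
    blocks = np.split(vec, 3)
    return np.concatenate([rope_1d(b, c, freqs) for b, c in zip(blocks, xyz)])
```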
<p><strong>3D Distance PE (RFF Approximation)</strong>
To compute invariant geometric distance without incurring quadratic memory overhead, they use Random Fourier Features (RFF) to approximate a Gaussian kernel of pairwise distances:</p>
<p>$$
\begin{aligned}
\exp \left( - \frac{\lVert \mathbf{c}_i - \mathbf{c}_j \rVert_2^2}{2\sigma^2} \right) &amp;\approx z(\mathbf{c}_i)^\top z(\mathbf{c}_j) \\
z(\mathbf{c}_i) &amp;= \sqrt{\frac{2}{d}} \cos(\sigma^{-1} \mathbf{c}_i^\top \boldsymbol{\omega} + \mathbf{b})
\end{aligned}
$$</p>
<p>This approach enables the model to natively encode complex field-like phenomena without computing exhaustive $O(N^2)$ distance matrices.</p>
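<p>The quality of this approximation is easy to verify numerically. A sketch following the standard Random Fourier Features construction ($\boldsymbol{\omega}$ drawn i.i.d. standard normal, $\mathbf{b}$ uniform on $[0, 2\pi)$; the paper's feature dimension and $\sigma$ are not assumed here):</p>

```python
import numpy as np

def rff_features(coords, omega, b, sigma=1.0):
    """Random Fourier Features z(c) such that z(c_i) . z(c_j) approximates
    the Gaussian kernel exp(-||c_i - c_j||^2 / (2 sigma^2))."""
    d = omega.shape[1]  # number of random features
    return np.sqrt(2.0 / d) * np.cos(coords @ omega / sigma + b)
```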
<h2 id="experimental-setup-and-downstream-tasks">Experimental Setup and Downstream Tasks</h2>
<p><strong>Pretraining Data</strong>: 19 million unlabeled molecules from the same dataset used by Uni-Mol.</p>
<p><strong>Downstream Benchmarks</strong>: The authors propose a new benchmark of 15 tasks, motivated by known limitations of MoleculeNet: invalid structures, inconsistent chemical representations, data curation errors, and an inability to adequately distinguish model performance. The tasks split into two categories:</p>
<ol>
<li>
<p><strong>Computational Properties (Quantum Mechanics)</strong></p>
<ul>
<li>Subsets of <a href="/notes/chemistry/datasets/gdb-17/">GDB-17</a> (HOMO, LUMO, GAP energy prediction, 20K samples; E1-CC2, E2-CC2, f1-CC2, f2-CC2, 21.7K samples)</li>
<li>Cata-condensed polybenzenoid hydrocarbons (Dipole moment, adiabatic ionization potential, D3 dispersion correction, 8,678 samples)</li>
<li>Metric: Mean Absolute Error (MAE)</li>
</ul>
</li>
<li>
<p><strong>Experimental Properties (Pharma/Bio)</strong></p>
<ul>
<li>MoleculeNet tasks (BBBP, BACE for drug discovery)</li>
<li>Biogen ADME tasks (HLM, MME, Solubility)</li>
<li>Metrics: AUC for classification, MAE for regression</li>
</ul>
</li>
</ol>
<p><strong>Splitting Strategy</strong>: All datasets use 8:1:1 train/validation/test ratio with <strong>scaffold splitting</strong> to test out-of-distribution generalization.</p>
<p><strong>Training Setup</strong>:</p>
<ul>
<li><strong>Objective</strong>: Masked Auto-Encoder (MAE) with 30% random masking. Model predicts whether a cell contains an atom, and if so, regresses both atom type and precise offset position.</li>
<li><strong>Hardware</strong>: ~50 hours on 8 NVIDIA A100 GPUs</li>
<li><strong>Optimizer</strong>: Adam ($\beta_1=0.9, \beta_2=0.99$)</li>
<li><strong>Learning Rate</strong>: Peak 1e-4 with linear decay and 0.01 warmup ratio</li>
<li><strong>Batch Size</strong>: 128</li>
<li><strong>Total Updates</strong>: 1 million</li>
</ul>
<p><strong>Baseline Comparisons</strong>: GROVER (2D graph-based MPR), GEM (2D graph enhanced with 3D information), 3D Infomax (GNN with 3D information), Uni-Mol (3D MPR, primary baseline using the same pretraining dataset), and Mol-AE (extends Uni-Mol with atom-based MAE pretraining).</p>
<h2 id="results-and-analysis">Results and Analysis</h2>
<p><strong>Strong Overall Performance</strong>: SpaceFormer ranked 1st in 10 of 15 tasks and in the top 2 for 14 of 15 tasks. It surpassed the runner-up models by approximately 20% on quantum property tasks (HOMO, LUMO, GAP, E1-CC2, Dipmom), validating that modeling non-atom space captures electronic structure better than atom-only representations.</p>
<h3 id="key-results-on-quantum-properties">Key Results on Quantum Properties</h3>
<table>
  <thead>
      <tr>
          <th>Task</th>
          <th>GROVER</th>
          <th>GEM</th>
          <th>3D Infomax</th>
          <th>Uni-Mol</th>
          <th>Mol-AE</th>
          <th><strong>SpaceFormer</strong></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>HOMO (Ha)</td>
          <td>0.0075</td>
          <td>0.0068</td>
          <td>0.0065</td>
          <td>0.0052</td>
          <td>0.0050</td>
          <td><strong>0.0042</strong></td>
      </tr>
      <tr>
          <td>LUMO (Ha)</td>
          <td>0.0086</td>
          <td>0.0080</td>
          <td>0.0070</td>
          <td>0.0060</td>
          <td>0.0057</td>
          <td><strong>0.0040</strong></td>
      </tr>
      <tr>
          <td>GAP (Ha)</td>
          <td>0.0109</td>
          <td>0.0107</td>
          <td>0.0095</td>
          <td>0.0081</td>
          <td>0.0080</td>
          <td><strong>0.0064</strong></td>
      </tr>
      <tr>
          <td>E1-CC2 (eV)</td>
          <td>0.0101</td>
          <td>0.0090</td>
          <td>0.0089</td>
          <td>0.0067</td>
          <td>0.0070</td>
          <td><strong>0.0058</strong></td>
      </tr>
      <tr>
          <td>Dipmom (Debye)</td>
          <td>0.0752</td>
          <td>0.0289</td>
          <td>0.0291</td>
          <td>0.0106</td>
          <td>0.0113</td>
          <td><strong>0.0083</strong></td>
      </tr>
  </tbody>
</table>
<p>SpaceFormer&rsquo;s advantage is most pronounced on computational properties that depend on electronic structure. On experimental biological tasks (e.g., BBBP), where measurements are noisy, the advantage narrows or reverses: Uni-Mol achieves 0.9066 AUC on BBBP compared to SpaceFormer&rsquo;s 0.8605.</p>
<h3 id="ablation-studies">Ablation Studies</h3>
<p>The authors present several ablations that isolate the source of SpaceFormer&rsquo;s improvements:</p>
<p><strong>MAE vs. Denoising</strong>: SpaceFormer with MAE pretraining outperforms SpaceFormer with denoising on all four ablation tasks. The MAE objective requires predicting <em>whether</em> an atom exists in a masked voxel, which forces the model to learn global structural dependencies. In the denoising variant, only atom cells are masked so the model never needs to predict atom existence, reducing the task to coordinate regression.</p>
<p><strong>FLOPs Control</strong>: A SpaceFormer-Large model (4x width, atom-only) trained with comparable FLOPs still falls short of SpaceFormer with 1000 non-atom cells on most downstream tasks. This confirms the improvement comes from modeling 3D space, not from additional compute.</p>
<p><strong>Virtual Points vs. SpaceFormer</strong>: Adding up to 200 random virtual points to Uni-Mol improves some tasks but leaves a significant gap compared to SpaceFormer, demonstrating that principled space discretization outperforms naive point augmentation.</p>
<p><strong>Efficiency Validation</strong>: The Adaptive Grid Merging method reduces the number of cells by roughly 10x with virtually no performance degradation. The 3D positional encodings scale linearly with the number of cells, while Uni-Mol&rsquo;s pretraining cost scales quadratically.</p>
<h3 id="scope-and-future-directions">Scope and Future Directions</h3>
<p>SpaceFormer does not incorporate built-in SE(3) equivariance, relying instead on data augmentation (random rotations and random boundary padding) during training. The authors identify extending SpaceFormer to force field tasks and larger systems such as proteins and complexes as promising future directions.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="code-and-data-availability">Code and Data Availability</h3>
<ul>
<li><strong>Source Code</strong>: As of March 2026, the authors have not released the official source code or pre-trained weights.</li>
<li><strong>Datasets</strong>: Pretraining utilized the same 19M unlabeled molecule dataset as Uni-Mol. Downstream tasks use a newly curated internal benchmark built from subsets of GDB-17, MoleculeNet, and Biogen ADME. The exact customized scaffold splits for these evaluations are pending the official code release.</li>
<li><strong>Compute</strong>: Pretraining the base SpaceFormer encoder (~67.8M parameters, configured to merge level 3) required approximately 50 hours on 8 NVIDIA A100 GPUs.</li>
</ul>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Source code</td>
          <td>Code</td>
          <td>N/A</td>
          <td>Not publicly released as of March 2026</td>
      </tr>
      <tr>
          <td>Pre-trained weights</td>
          <td>Model</td>
          <td>N/A</td>
          <td>Not publicly released</td>
      </tr>
      <tr>
          <td>Pretraining data (19M molecules)</td>
          <td>Dataset</td>
          <td>Unknown</td>
          <td>Same dataset as Uni-Mol; not independently released</td>
      </tr>
      <tr>
          <td>Downstream benchmark splits</td>
          <td>Dataset</td>
          <td>N/A</td>
          <td>Custom scaffold splits pending code release</td>
      </tr>
  </tbody>
</table>
<h3 id="models">Models</h3>
<p>The model treats a molecule as a 3D &ldquo;image&rdquo; via voxelization, processed by a Transformer.</p>
<p><strong>Input Representation</strong>:</p>
<ul>
<li><strong>Discretization</strong>: 3D space divided into grid cells with length <strong>$0.49\text{\AA}$</strong> (based on O-H bond length to ensure at most one atom per cell)</li>
<li><strong>Tokenization</strong>: Tokens are pairs $(t_i, c_i)$ where $t_i$ is atom type (or NULL) and $c_i$ is the coordinate</li>
<li><strong>Embeddings</strong>: Continuous embeddings with dimension 512. Inner-cell positions discretized with $0.01\text{\AA}$ precision</li>
</ul>
<p><strong>Transformer Specifications</strong>:</p>
<table>
  <thead>
      <tr>
          <th>Component</th>
          <th>Layers</th>
          <th>Attention Heads</th>
          <th>Embedding Dim</th>
          <th>FFN Dim</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Encoder</strong></td>
          <td>16</td>
          <td>8</td>
          <td>512</td>
          <td>2048</td>
      </tr>
      <tr>
          <td><strong>Decoder</strong> (MAE)</td>
          <td>4</td>
          <td>4</td>
          <td>256</td>
          <td>1024</td>
      </tr>
  </tbody>
</table>
<p><strong>Attention Mechanism</strong>: FlashAttention for efficient handling of large sequence lengths.</p>
<p><strong>Positional Encodings</strong>:</p>
<ol>
<li><strong>3D Directional PE</strong>: Extension of Rotary Positional Embedding (RoPE) to 3D continuous space, capturing relative directionality</li>
<li><strong>3D Distance PE</strong>: Random Fourier Features (RFF) to approximate Gaussian kernel of pairwise distances with linear complexity</li>
</ol>
<h4 id="visualizing-rff-and-rope">Visualizing RFF and RoPE</h4>















<figure class="post-figure center ">
    <img src="/img/notes/spaceformer-rff-rope-visualization.webp"
         alt="Four-panel visualization showing RFF distance encoding and RoPE directional encoding mechanisms"
         title="Four-panel visualization showing RFF distance encoding and RoPE directional encoding mechanisms"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Visual intuition for SpaceFormer&rsquo;s positional encodings: Top row shows RFF distance encoding (Gaussian-like attention decay and high-frequency feature fingerprints). Bottom row shows RoPE directional encoding (vector rotation fields and resulting attention patterns).</figcaption>
    
</figure>

<p><strong>Top Row (Distance / RFF):</strong> Shows how the model learns &ldquo;closeness.&rdquo; Distance is represented by a complex &ldquo;fingerprint&rdquo; of waves that creates a Gaussian-like force field.</p>
<ul>
<li><strong>Top Left (The Force Field):</strong> The attention score (dot product) naturally forms a Gaussian curve. It is high when atoms are close and decays to zero as they move apart. This mimics physical forces without the model needing to learn that math from scratch.</li>
<li><strong>Top Right (The Fingerprint):</strong> Each dimension oscillates at a different frequency. A specific distance (e.g., $d=2$) has a unique combination of high and low values across these dimensions, creating a unique &ldquo;fingerprint&rdquo; for that exact distance.</li>
</ul>
<p><strong>Bottom Row (Direction / RoPE):</strong> Shows how the model learns &ldquo;relative position.&rdquo; It visualizes the vector rotation and how that creates a grid-like attention pattern.</p>
<ul>
<li><strong>Bottom Left (The Rotation):</strong> This visualizes the &ldquo;X-axis chunk&rdquo; of the vector. As you move from left ($x=-3$) to right ($x=3$), the arrows rotate. The model compares angles between atoms to determine relative positions.</li>
<li><strong>Bottom Right (The Grid):</strong> The resulting attention pattern when combining X-rotations and Y-rotations. The red/blue regions show where the model pays attention relative to the center, forming a grid-like interference pattern that distinguishes relative positions (e.g., &ldquo;top-right&rdquo; vs &ldquo;bottom-left&rdquo;).</li>
</ul>
<h4 id="adaptive-grid-merging">Adaptive Grid Merging</h4>
<p>To make the 3D grid approach computationally tractable, two key strategies are employed:</p>
<ol>
<li><strong>Grid Sampling</strong>: Randomly selecting 10-20% of empty cells during training</li>
<li><strong>Adaptive Grid Merging</strong>: Recursively merging $2 \times 2 \times 2$ blocks of empty cells into larger &ldquo;coarse&rdquo; cells, creating a multi-resolution view that is fine-grained near atoms and coarse-grained in empty space (merging set to Level 3)</li>
</ol>
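<p>The merging rule can be sketched on a boolean occupancy grid. This toy version assumes a power-of-two cube side and only counts the resulting tokens rather than building the actual token list; it is not the authors' implementation.</p>

```python
import numpy as np

def merge_empty_cells(occupied, max_level=3):
    """Count tokens after recursively merging 2x2x2 blocks of empty cells.

    occupied: boolean (n, n, n) grid with n a power of two.
    Atom cells always remain level-0 tokens; an empty cell becomes a token
    at the level where its 2x2x2 block can no longer be merged.
    """
    tokens = int(occupied.sum())  # atom cells stay at full resolution
    empty = ~occupied
    for _ in range(max_level):
        n = empty.shape[0]
        blocks = empty.reshape(n // 2, 2, n // 2, 2, n // 2, 2).all(axis=(1, 3, 5))
        # Empty cells whose 2x2x2 block is not fully empty become tokens here
        tokens += int(empty.sum() - 8 * blocks.sum())
        empty = blocks  # fully-empty blocks move up one level
    tokens += int(empty.sum())  # remaining empty blocks at the top level
    return tokens
```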
<p><strong>Visualizing Adaptive Grid Merging</strong>:</p>















<figure class="post-figure center ">
    <img src="/img/notes/spaceformer-adaptive-grid-merging.webp"
         alt="2D simulation of adaptive grid merging for an H2O molecule showing multi-resolution cells"
         title="2D simulation of adaptive grid merging for an H2O molecule showing multi-resolution cells"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Adaptive grid merging demonstrated on H₂O. Red cells (Level 0) contain atoms and remain at full resolution. Progressively darker blue cells represent merged empty regions at higher levels, covering the same volume with fewer tokens.</figcaption>
    
</figure>

<p>The adaptive grid process compresses empty space around molecules while maintaining high resolution near atoms:</p>
<ul>
<li><strong>Red Cells (Level 0):</strong> The smallest squares ($0.49$Å) containing atoms. These are kept at highest resolution because electron density changes rapidly here.</li>
<li><strong>Light Blue Cells (Level 0/1):</strong> Small empty regions close to atoms.</li>
<li><strong>Darker Blue Cells (Level 2/3):</strong> Large blocks of empty space further away.</li>
</ul>
<p>If we used a naive uniform grid, we would have to process thousands of empty &ldquo;Level 0&rdquo; cells containing almost zero information. By merging them into larger blocks (the dark blue squares), the model covers the same volume with significantly fewer input tokens, reducing the number of tokens by roughly <strong>10x</strong> compared to a dense grid.</p>















<figure class="post-figure center ">
    <img src="/img/notes/spaceformer-adaptive-grid-benzene.webp"
         alt="Adaptive grid merging visualization for benzene molecule showing hexagonal ring with multi-resolution grid cells"
         title="Adaptive grid merging visualization for benzene molecule showing hexagonal ring with multi-resolution grid cells"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Adaptive grid merging for benzene (C₆H₆). The model maintains maximum resolution (red Level 0 cells) only where atoms exist, while merging vast empty regions into large blocks (dark blue L3/L4 cells). This allows the model to focus computational power on chemically active zones.</figcaption>
    
</figure>

<p>The benzene example above demonstrates how this scales to larger molecules. The characteristic hexagonal ring of 6 carbon atoms (black) and 6 hydrogen atoms (white) occupies a small fraction of the total grid. The dark blue corners (L3, L4) represent massive merged blocks of empty space, allowing the model to focus 90% of its computational power on the red &ldquo;active&rdquo; zones where chemistry actually happens.</p>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Lu, S., Ji, X., Zhang, B., Yao, L., Liu, S., Gao, Z., Zhang, L., &amp; Ke, G. (2025). Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling. <em>Proceedings of the 42nd International Conference on Machine Learning (ICML)</em>, 267, 40491-40504. <a href="https://proceedings.mlr.press/v267/lu25e.html">https://proceedings.mlr.press/v267/lu25e.html</a></p>
<p><strong>Publication</strong>: ICML 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{lu2025beyond,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Lu, Shuqi and Ji, Xiaohong and Zhang, Bohang and Yao, Lin and Liu, Siyuan and Gao, Zhifeng and Zhang, Linfeng and Ke, Guolin}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 42nd International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{40491--40504}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{267}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{Proceedings of Machine Learning Research}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{PMLR}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://openreview.net/forum?id=Wd9KPQCKwq">OpenReview forum</a></li>
<li><a href="https://openreview.net/pdf?id=Wd9KPQCKwq">PDF on OpenReview</a></li>
<li><a href="https://icml.cc/virtual/2025/poster/45004">ICML 2025 poster page</a></li>
</ul>
]]></content:encoded></item><item><title>Embedded-Atom Method: Impurities and Defects in Metals</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method/</link><pubDate>Fri, 22 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method/</guid><description>Daw and Baskes's foundational 1984 paper introducing the Embedded-Atom Method (EAM), a many-body potential for metal simulations.</description><content:encoded><![CDATA[<h2 id="contribution-adaptive-many-body-potentials">Contribution: Adaptive Many-Body Potentials</h2>
<p>This is a foundational <strong>method paper</strong> that introduces a new class of semi-empirical, many-body interatomic potential: the <strong>Embedded-Atom Method (EAM)</strong>. It is designed for large-scale atomistic simulations of metallic systems, bridging the gap between computationally cheap (but physically limited) pair potentials and accurate (but expensive) quantum mechanical methods. The EAM achieves pair-potential speed while incorporating many-body physics inspired by density functional theory.</p>
<h2 id="motivation-the-geometric-limits-of-pair-potentials">Motivation: The Geometric Limits of Pair Potentials</h2>
<p>The authors sought to overcome the limitations of <strong>pair potentials</strong> (the dominant method of the time), which failed in three key areas:</p>
<ul>
<li><strong>Elastic Anisotropy:</strong> Pair potentials enforce the Cauchy relation ($C_{12} = C_{44}$), which is violated by most transition metals.</li>
<li><strong>Volume Ambiguity:</strong> Pair potentials require a volume-dependent energy term, which makes them unreliable at surfaces or cracks, where local volume is undefined.</li>
<li><strong>Chemical Incompatibility:</strong> Pair potentials cannot model chemically active impurities such as hydrogen.</li>
</ul>
<p>First-principles quantum mechanical methods (e.g., band theory) are limited by basis-set size and periodicity requirements, making them impractical for the large systems (thousands of atoms) needed to study defects, surfaces, and mechanical properties.</p>
<p>The goal was to create a new model that bridges this gap in accuracy and computational cost.</p>
<h2 id="core-innovation-the-embedding-energy-function">Core Innovation: The Embedding Energy Function</h2>
<p>The EAM postulates that the energy of an atom is determined by the local electron density of its neighbors. The total energy is:</p>
<p>$$E_{tot} = \sum_{i} F_i(\rho_{h,i}) + \frac{1}{2}\sum_{i \neq j} \phi_{ij}(R_{ij})$$</p>
<ul>
<li><strong>$F_i(\rho_{h,i})$ (Embedding Energy):</strong> The energy required to embed atom $i$ into the background electron density $\rho$ provided by its neighbors. This term is non-linear and captures many-body effects.</li>
<li><strong>$\phi_{ij}$ (Pair Potential):</strong> A short-range electrostatic repulsion between cores.</li>
<li><strong>$\rho_{h,i}$ (Host Density):</strong> Approximated as a linear superposition of atomic densities: $\rho_{h,i} = \sum_{j \neq i} \rho^a_j(R_{ij})$.</li>
</ul>
<p>The key innovations are:</p>
<ol>
<li><strong>The Embedding Energy</strong>: Each atom $i$ contributes an energy $F_i$ which is a non-linear function of the local electron density $\rho_{h,i}$ it is embedded in. This density is approximated as a simple linear superposition of the atomic electron densities of all its neighbors. This term captures the crucial many-body effects of metallic bonding.</li>
<li><strong>A Redefined Pair Potential</strong>: A short-range, two-body potential $\phi_{ij}$ is retained, but it primarily models the electrostatic core-core repulsion.</li>
<li><strong>Elimination of the &ldquo;Volume&rdquo; Problem</strong>: Because the embedding energy depends on the local electron density (a quantity that is always well-defined, even at a surface or a crack tip), the method circumvents the ambiguities of volume-dependent pair potentials.</li>
<li><strong>Intrinsic Many-Body Nature</strong>: The non-linearity of the embedding function $F(\rho)$ naturally accounts for why chemically active impurities (like hydrogen) cannot be described by pair potentials and correctly breaks the Cauchy relation for elastic constants.</li>
</ol>
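<p>As a direct translation of the energy expression above, here is a minimal NumPy sketch. The exponential forms for $F$, $\phi$, and $\rho^a$ are placeholders invented for illustration; the paper&rsquo;s actual functions are the cubic splines of Tables II and IV.</p>

```python
import numpy as np

# Illustrative stand-ins for the paper's spline-fit functions; the real
# F(rho), phi(r), and rho_a(r) are the cubic splines of Tables II and IV.
def rho_a(r):
    """Atomic electron density contributed by a neighbor at distance r."""
    return np.exp(-r)

def embed(rho):
    """Embedding energy F(rho); toy form with a single minimum (at rho = 1/e)."""
    return rho * np.log(rho)

def phi(r):
    """Short-range core-core repulsion."""
    return np.exp(-2.0 * r) / r

def eam_energy(positions):
    """E_tot = sum_i F(rho_h,i) + (1/2) sum_{i != j} phi(R_ij)."""
    n = len(positions)
    total = 0.0
    for i in range(n):
        host = 0.0                      # rho_h,i: linear superposition of neighbor densities
        for j in range(n):
            if i == j:
                continue
            r = np.linalg.norm(positions[i] - positions[j])
            host += rho_a(r)
            total += 0.5 * phi(r)       # each pair visited twice, hence the 1/2
        total += embed(host)
    return total

dimer = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
print(eam_energy(dimer))
```

<p>The embedding term is what makes the model many-body: doubling the neighbor density does not double $F(\rho)$, and that non-linearity is exactly what breaks the Cauchy relation.</p>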
<h2 id="experimental-design-robust-parameter-validation">Experimental Design: Robust Parameter Validation</h2>
<p>The authors validated EAM through a rigorous split between parameterization data and prediction tasks:</p>
<p><strong>Fitting Data (Bulk Properties Only):</strong></p>
<p>The model parameters were fitted exclusively to these experimental values for Ni and Pd:</p>
<ul>
<li>Lattice constant ($a_0$)</li>
<li>Elastic constants ($C_{11}, C_{12}, C_{44}$)</li>
<li>Sublimation energy ($E_s$)</li>
<li>Vacancy-formation energy ($E^F_{1V}$)</li>
<li>Hydrogen heat of solution (for fitting H parameters)</li>
</ul>
<p><strong>Validation Tests (No Further Fitting):</strong></p>
<p>The model was then evaluated on its ability to predict these properties without any additional parameter adjustments:</p>
<ul>
<li><strong>Surface Relaxations:</strong> Ni(110) surface contraction</li>
<li><strong>Surface Energy:</strong> Ni(100) surface energy</li>
<li><strong>Hydrogen Migration:</strong> H migration energy in Pd</li>
<li><strong>Fracture Mechanics:</strong> Hydrogen embrittlement in Ni slabs</li>
</ul>
<h2 id="results-extending-predictive-power-to-surfaces-and-defects">Results: Extending Predictive Power to Surfaces and Defects</h2>
<ol>
<li><strong>Many-Body Physics:</strong> The embedding function $F(\rho)$ successfully captures the volume-dependence of metallic cohesion, fixing the &ldquo;Cauchy discrepancy&rdquo; inherent in pair potentials.</li>
<li><strong>Surface Properties:</strong> A single set of functions, fitted only to bulk data, correctly reproduces surface relaxations within 0.1 Å of experiment across three faces (100), (110), and (111) for Ni. The Ni(100) surface energy (1550 erg/cm²) compares well with the measured crystal-vapor average (1725 erg/cm²).</li>
<li><strong>Hydrogen in Bulk:</strong> The method predicts H migration energy in Pd as 0.26 eV, matching experiment exactly. Hydride lattice expansions are also well reproduced: 4.5% for NiH (experiment: 5%) and 4% for PdH (experiment: 3.5% for PdH$_{0.6}$).</li>
<li><strong>Hydrogen on Surfaces:</strong> Calculated adsorption sites on all three Ni and Pd faces agree with experimentally determined sites. Adsorption energies on Ni surfaces are systematically about 0.25 eV too low, while on Pd surfaces the error is much smaller (about 0.05 eV too high on average).</li>
<li><strong>Fracture Mechanics:</strong> Static fracture calculations on Ni slabs demonstrate brittle fracture behavior and show that hydrogen lowers the fracture stress, providing a qualitative model of hydrogen embrittlement.</li>
</ol>
<h2 id="limitations">Limitations</h2>
<p>The authors acknowledge several limitations:</p>
<ul>
<li>The functions $F$ and $\phi$ are not uniquely determined by the empirical fitting procedure. The short-range pair potential (restricted to first neighbors in fcc metals) may not be the best choice for all crystal structures.</li>
<li>The choice of hydrogen embedding function (Puska et al. vs. Norskov&rsquo;s corrected function) remains undecided and may affect hydrogen binding energies.</li>
<li>The fracture calculations are static, and dynamical effects and plasticity play important roles in real fracture that are not captured.</li>
<li>The method has only been demonstrated for fcc metals (Ni and Pd). Extension to bcc metals and other crystal structures requires further investigation.</li>
</ul>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<p>To replicate the method, three specific algorithmic definitions are needed:</p>
<ol>
<li>
<p><strong>Atomic Density Construction</strong>: The electron density $\rho^a(r)$ is a weighted sum of Hartree-Fock $s$ and $d$ orbital densities (from Clementi &amp; Roetti tables), controlled by a parameter $N_s$ (the number of s-like electrons):
$$\rho^a(r) = N_s\rho_s^a(r) + (N-N_s)\rho_d^a(r)$$
For Ni, $N_s = 0.85$; for Pd, $N_s = 0.65$ (fitted to H solution heat).</p>
</li>
<li>
<p><strong>Pair Potential Form</strong>: The short-range pair interaction derives from an effective charge function $Z(r)$ to handle core repulsion:
$$\phi_{ij}(r) = \frac{Z_i(r)Z_j(r)}{r}$$
Splines for $Z(r)$ are provided in Table II.</p>
</li>
<li>
<p><strong>Analytic Forces</strong>: Because the embedding energy depends on the neighbor density, the force on atom $k$ is inherently many-body:
$$\vec{f}_{k} = -\sum_{j(\neq k)} \left(F'_{k}\,\rho'_{j} + F'_{j}\,\rho'_{k} + \phi'_{jk}\right) \hat{r}_{jk}$$</p>
</li>
</ol>
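<p>A standard sanity check for this expression is to compare the analytic force against a central finite difference of $E_{tot}$. The sketch below does that with toy stand-in functions (assumed exponential forms, not the paper&rsquo;s splines):</p>

```python
import numpy as np

# Toy stand-ins for the spline fits (illustrative only; see Tables II and IV).
rho_a  = lambda r: np.exp(-r)
drho_a = lambda r: -np.exp(-r)
phi    = lambda r: np.exp(-2.0 * r) / r
dphi   = lambda r: -np.exp(-2.0 * r) * (2.0 / r + 1.0 / r**2)
F      = lambda rho: rho * np.log(rho)
dF     = lambda rho: np.log(rho) + 1.0

def host_density(pos, i):
    return sum(rho_a(np.linalg.norm(pos[i] - pos[j]))
               for j in range(len(pos)) if j != i)

def energy(pos):
    n = len(pos)
    pair = sum(phi(np.linalg.norm(pos[i] - pos[j]))
               for i in range(n) for j in range(i + 1, n))
    return sum(F(host_density(pos, i)) for i in range(n)) + pair

def force(pos, k):
    """f_k = -sum_j [F'_k rho'_j + F'_j rho'_k + phi'_jk] r_hat_jk."""
    f = np.zeros(3)
    for j in range(len(pos)):
        if j == k:
            continue
        rvec = pos[k] - pos[j]
        r = np.linalg.norm(rvec)
        dEdr = (dF(host_density(pos, k)) * drho_a(r)
                + dF(host_density(pos, j)) * drho_a(r)
                + dphi(r))
        f -= dEdr * rvec / r            # r_hat points from j toward k
    return f

pos = np.array([[0.0, 0.0, 0.0], [1.4, 0.0, 0.0], [0.7, 1.1, 0.0]])
# Compare against a central finite difference of the total energy.
h, k = 1e-6, 0
num = np.zeros(3)
for d in range(3):
    p1, p2 = pos.copy(), pos.copy()
    p1[k, d] += h
    p2[k, d] -= h
    num[d] = -(energy(p1) - energy(p2)) / (2 * h)
print(np.allclose(force(pos, k), num, atol=1e-5))  # prints True
```

<p>Note the cross terms: moving atom $k$ changes not only its own host density but also the host densities of its neighbors, which is why two $F'\rho'$ terms appear per pair.</p>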
<h3 id="models">Models</h3>
<p>The functions $F(\rho)$ and $\phi(r)$ are modeled using <strong>cubic splines</strong>, with parameters fitted to reproduce bulk experimental constants. The embedding function $F(\rho)$ is constrained to have a single minimum and to be linear at high densities, matching the qualitative form of the first-principles calculations by Puska et al. Energy minimization uses the <strong>conjugate gradients</strong> technique. The paper explicitly lists spline knots, coefficients, and cutoffs in Tables II and IV, making the method fully reproducible.</p>















<figure class="post-figure center ">
    <img src="/img/notes/chemistry/eam-embedding-effective-charge.webp"
         alt="Reproduction of Figures 1 and 2 from Daw &amp; Baskes (1984) showing the embedding energy and effective charge functions for Ni and Pd"
         title="Reproduction of Figures 1 and 2 from Daw &amp; Baskes (1984) showing the embedding energy and effective charge functions for Ni and Pd"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption"><strong>Left:</strong> Dimensionless embedding energy ($E/E_s$) vs. normalized electron density ($\rho/\bar{\rho}$). The minimum near $\rho/\bar{\rho} \approx 1.0$ drives metallic cohesion. <strong>Right:</strong> Normalized effective charge ($Z/Z_0$) vs. normalized distance ($R/a_0$). The charge drops to zero near $R/a_0 = 0.85$, ensuring short-range interactions. Reproduced from Table II spline knots.</figcaption>
    
</figure>

<h3 id="evaluation">Evaluation</h3>
<p><strong>Fitting Data (Used for Parameterization):</strong> the bulk Ni and Pd experimental properties listed above under Experimental Design (lattice constant, elastic constants, sublimation energy, vacancy-formation energy, and the hydrogen heat of solution).</p>
<p><strong>Validation Results (Predictions Without Further Fitting):</strong></p>
<table>
  <thead>
      <tr>
          <th>Property</th>
          <th>Predicted</th>
          <th>Experimental</th>
          <th>Agreement</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Ni(110) surface contraction</td>
          <td>-0.11 Å</td>
          <td>-0.06 to -0.10 Å</td>
          <td>Within 0.1 Å</td>
      </tr>
      <tr>
          <td>Ni(100) surface energy</td>
          <td>1550 erg/cm²</td>
          <td>1725 erg/cm² (avg.)</td>
          <td>Close</td>
      </tr>
      <tr>
          <td>H migration in Pd</td>
          <td>0.26 eV</td>
          <td>0.26 eV</td>
          <td>Exact</td>
      </tr>
      <tr>
          <td>NiH lattice expansion</td>
          <td>4.5%</td>
          <td>5%</td>
          <td>Close</td>
      </tr>
      <tr>
          <td>PdH lattice expansion</td>
          <td>4%</td>
          <td>3.5% (PdH$_{0.6}$)</td>
          <td>Close</td>
      </tr>
      <tr>
          <td>H adsorption sites (Ni, Pd)</td>
          <td>Correct on all faces</td>
          <td>Matches experiment</td>
          <td>Exact</td>
      </tr>
      <tr>
          <td>H embrittlement in Ni</td>
          <td>Qualitative model</td>
          <td>-</td>
          <td>Qualitative</td>
      </tr>
  </tbody>
</table>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Daw, M. S., &amp; Baskes, M. I. (1984). Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals. <em>Physical Review B</em>, 29(12), 6443-6453. <a href="https://doi.org/10.1103/PhysRevB.29.6443">https://doi.org/10.1103/PhysRevB.29.6443</a></p>
<p><strong>Publication</strong>: Physical Review B, 1984</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{daw1984embedded,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Daw, Murray S and Baskes, Mike I}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Physical Review B}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{29}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{12}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{6443--6453}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1984}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{APS}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1103/PhysRevB.29.6443}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method-review-1993/">EAM Review (1993)</a></li>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method-voter-1994/">EAM User Guide (1994)</a></li>
<li><a href="https://www.ctcms.nist.gov/potentials/">NIST Interatomic Potentials Repository</a></li>
</ul>
]]></content:encoded></item><item><title>Umbrella Sampling: Monte Carlo Free-Energy Estimation</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/umbrella-sampling/</link><pubDate>Thu, 21 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/umbrella-sampling/</guid><description>Torrie and Valleau's 1977 paper introducing Umbrella Sampling, an importance sampling technique for Monte Carlo free-energy calculations.</description><content:encoded><![CDATA[<h2 id="a-methodological-shift-in-monte-carlo-simulations">A Methodological Shift in Monte Carlo Simulations</h2>
<p>This is a <strong>Method</strong> paper that introduces a novel computational technique for Monte Carlo simulations. It presents Umbrella Sampling, an importance sampling approach that uses non-physical distributions to calculate free energy differences in molecular systems.</p>
<h2 id="the-sampling-gap-in-phase-transitions">The Sampling Gap in Phase Transitions</h2>
<p>The paper addresses the failure of conventional Boltzmann-weighted Monte Carlo to estimate free energy differences.</p>
<ul>
<li><strong>The Problem</strong>: Free energy depends on the integral of configurations that are rare in the reference system. In a standard simulation, the relevant probability density $f_0(\Delta U^*)$ is too small to be sampled accurately by conventional Boltzmann-weighted Monte Carlo.</li>
<li><strong>Phase Transitions</strong>: Conventional &ldquo;thermodynamic integration&rdquo; fails near phase transitions because it requires a path of integration where ensemble averages can be reliably measured, which is difficult in unstable regions.</li>
</ul>
<h2 id="bridging-states-with-non-physical-distributions">Bridging States with Non-Physical Distributions</h2>
<p>The authors introduce a non-physical distribution $\pi(q^N)$ to bridge the gap between a reference system (0) and a system of interest (1).</p>
<ul>
<li><strong>Arbitrary Weights</strong>: They generate a Markov chain with a limiting distribution $\pi(q^N)$ that differs from the Boltzmann distribution of either system. This distribution is written as $\pi(q'^N) = w(q'^N) \exp(-U_0(q'^N)/kT_0) / Z$, where $w(q^N) = W(\Delta U^*)$ is a weighting function chosen to favor configurations with values of $\Delta U^*$ important to the free-energy integral.</li>
<li><strong>Reweighting Formula</strong>: The unbiased average of any property $\theta$ is recovered via the ratio of biased averages:</li>
</ul>
<p>$$\langle\theta\rangle_{0}=\frac{\langle\theta/w\rangle_{w}}{\langle1/w\rangle_{w}}$$</p>
<ul>
<li><strong>Overlap</strong>: The method allows sampling a range of $\Delta U^*$ up to <strong>three times</strong> that of a conventional Monte Carlo experiment, enabling accurate determination of values of $f_0(\Delta U^*)$ as small as $10^{-8}$. If a single weight function cannot span the entire gap, additional overlapping umbrella-sampling experiments are carried out with different weighting functions exploring successively overlapping ranges of $\Delta U^*$.</li>
</ul>
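<p>The reweighting identity is easy to demonstrate on a one-dimensional toy problem: sample the biased distribution $\pi(x) \propto w(x)e^{-U(x)}$ with a Metropolis chain, then recover an unbiased average from the ratio above. The potential and weight below are invented for illustration, not taken from the paper:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
U = lambda x: x**2                       # toy potential, kT = 1
w = lambda x: np.exp(0.5 * x**2)         # umbrella weight: flattens the well

# Metropolis chain with limiting distribution pi(x) proportional to w(x) exp(-U(x))
x, samples = 0.0, []
for _ in range(200_000):
    y = x + rng.normal(0.0, 0.8)
    if rng.random() < min(1.0, w(y) * np.exp(-U(y)) / (w(x) * np.exp(-U(x)))):
        x = y
    samples.append(x)
xs = np.array(samples[2000:])            # discard equilibration steps

# Unbiased average of theta = x^2 via the ratio of biased averages:
# <theta>_0 = <theta/w>_w / <1/w>_w
theta = xs**2
est = np.mean(theta / w(xs)) / np.mean(1.0 / w(xs))
print(est)   # exact <x^2> under exp(-x^2) is 0.5
```

<p>Here the bias deliberately broadens sampling beyond what the Boltzmann distribution would visit, mirroring how the paper&rsquo;s weights reach values of $f_0(\Delta U^*)$ as small as $10^{-8}$.</p>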
<h2 id="validation-on-lennard-jones-fluids">Validation on Lennard-Jones Fluids</h2>
<p>The authors validated Umbrella Sampling using Monte Carlo simulations of model fluids.</p>
<h3 id="experimental-setup">Experimental Setup</h3>
<ul>
<li><strong>System Specifications</strong>: The study used a <strong>Lennard-Jones (LJ)</strong> fluid and an <strong>inverse-12 &ldquo;soft-sphere&rdquo;</strong> fluid.</li>
<li><strong>System Size</strong>: Simulations were primarily performed with <strong>$N=32$ particles</strong>, with some validation runs at <strong>$N=108$ particles</strong> to check for size dependence.</li>
<li><strong>State Points</strong>: Calculations covered a wide range of densities ($N\sigma^3/V = 0.50$ to $0.85$) and temperatures ($kT/\epsilon = 0.7$ to $2.8$), including the gas-liquid coexistence region.</li>
</ul>
<h3 id="baselines">Baselines</h3>
<ul>
<li><strong>Baselines</strong>: Results were compared to thermodynamic integration data from <strong>Hansen</strong>, <strong>Levesque</strong>, and <strong>Verlet</strong>.</li>
<li><strong>Quantitative Success</strong>:
<ul>
<li><strong>Agreement</strong>: The free energy estimates agreed with pressure integration results to within statistical uncertainties (e.g., at $kT/\epsilon=1.35$, Umbrella Sampling gave -3.236 vs. Conventional -3.25).</li>
<li><strong>Precision</strong>: Free energy differences were obtained with high precision ($\pm 0.005 NkT$ for $N=108$).</li>
<li><strong>Efficiency</strong>: A single umbrella run could replace the &ldquo;numerous runs&rdquo; required for conventional $1/T$ integrations.</li>
</ul>
</li>
</ul>
<h2 id="temperature-scaling-via-reweighting">Temperature Scaling via Reweighting</h2>
<p>When the reference system has the same internal energy function as the system of interest (i.e., the same fluid at a different temperature), the free-energy expression simplifies to:</p>
<p>$$\frac{A(T)}{kT} = \frac{A(T_0)}{kT_0} - \ln \int f_0(U) \exp\left[-U\left(\frac{1}{kT} - \frac{1}{kT_0}\right)\right] dU$$</p>
<p>This is especially useful because a single determination of $f_0(U)$ over a wide energy range gives the free energy over a whole range of temperatures simultaneously. For 32 Lennard-Jones particles, only two umbrella-sampling experiments are needed to span the temperature range from the triple point ($kT/\epsilon = 0.7$) to twice the critical temperature ($kT/\epsilon = 2.8$). For 108 particles, four experiments suffice.</p>
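<p>A minimal sketch of this temperature reweighting, using a 1D harmonic oscillator (where $Z \propto \sqrt{kT}$, so the exact answer is known) in place of the paper&rsquo;s Lennard-Jones fluid:</p>

```python
import numpy as np

rng = np.random.default_rng(1)
kT0 = 1.0
# Boltzmann sample of a 1D harmonic oscillator, U = x^2 / 2, at kT0:
x = rng.normal(0.0, np.sqrt(kT0), size=500_000)
U = 0.5 * x**2

# A(T)/kT - A(T0)/kT0 = -ln < exp[-U (1/kT - 1/kT0)] >_0 :
# one sampled energy distribution f_0(U) serves a whole range of temperatures.
for kT in (0.8, 1.0, 1.25):
    dA = -np.log(np.mean(np.exp(-U * (1.0 / kT - 1.0 / kT0))))
    exact = -0.5 * np.log(kT / kT0)      # harmonic oscillator: Z proportional to sqrt(kT)
    print(kT, round(dA, 3), round(exact, 3))
```

<p>The estimate degrades once $kT$ drifts so far from $kT_0$ that the required energies are rarely sampled, which is precisely why the paper chains a few overlapping umbrella experiments to span the full temperature range.</p>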
<h2 id="mapping-the-liquid-gas-free-energy-surface">Mapping the Liquid-Gas Free Energy Surface</h2>
<ul>
<li><strong>Methodological Utility</strong>: The method successfully mapped the free energy of the LJ fluid across the liquid-gas transition, a region where conventional methods face convergence problems.</li>
<li><strong>N-Dependence</strong>: Comparison between $N=32$ and $N=108$ showed no statistically significant size dependence for free energy differences, suggesting small systems are sufficient for these estimates.</li>
<li><strong>Comparison with Gosling-Singer Method</strong>: The paper contrasts its results with free energies derived from Gosling and Singer&rsquo;s entropy estimation technique, finding discrepancies as large as $0.4N\epsilon$ (a 20% error in the nonideal entropy), equivalent to overestimating the configurational integral of a 108-particle system by a factor of $10^{16}$.</li>
<li><strong>Generality</strong>: While demonstrated on energy ($U$), the authors note the weighting function $w$ can be any function of the coordinates, generalizing the technique beyond simple free energy differences.</li>
</ul>
<h2 id="reproducibility">Reproducibility</h2>
<p>This 1977 paper predates modern code-sharing practices, and no source code or data files are publicly available. However, the paper provides sufficient algorithmic detail for reimplementation:</p>
<ul>
<li><strong>Constructing $W$</strong>: The paper does not derive $W$ analytically. It uses a <strong>trial-and-error procedure</strong>: start with a short Boltzmann-weighted experiment, then broaden the distribution in stages through short test runs, adjusting weights to flatten the probability density $f_w(\Delta U^*)$. The paper acknowledges this requires &ldquo;interaction between the trial computer results and human judgment.&rdquo;</li>
<li><strong>Specific Weights</strong>: Table I provides the exact numerical weights used for the 32-particle soft-sphere experiment at $N\sigma^3/V = 0.85$, $kT/\epsilon = 2.74$, with values spanning from $W=1{,}500{,}000$ at the lowest energies down to $W=1.0$ at the center and back up to $W=16.0$ at the highest energies.</li>
<li><strong>Potentials</strong>: The Lennard-Jones and inverse-twelve potentials are fully specified (Eqs. 8 and 9).</li>
<li><strong>State Points</strong>: Densities and temperatures are enumerated in Tables II and III.</li>
<li><strong>Block Averaging</strong>: Errors were estimated by treating sequences of $m$ steps as independent samples, where $m$ is determined by increasing block size until no systematic trends can be detected in either the average or the standard deviation of the mean.</li>
</ul>
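<p>The block-averaging step can be sketched on synthetic correlated data; the AR(1) series below stands in for a Monte Carlo time series and is not from the paper:</p>

```python
import numpy as np

rng = np.random.default_rng(2)
# Correlated series standing in for a Monte Carlo observable (AR(1) toy data).
n, a = 100_000, 0.95
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):
    x[t] = a * x[t - 1] + rng.normal()

def blocked_sem(x, m):
    """Standard error of the mean, treating blocks of m steps as independent."""
    nb = len(x) // m
    means = x[: nb * m].reshape(nb, m).mean(axis=1)
    return means.std(ddof=1) / np.sqrt(nb)

# Grow the block size until the estimate stops trending upward: the naive
# m = 1 value badly underestimates the error of a correlated chain.
for m in (1, 10, 100, 1000):
    print(m, round(blocked_sem(x, m), 4))
```

<p>The plateau criterion here is the same one the paper describes: increase $m$ until no systematic trend remains in the estimated standard deviation of the mean.</p>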
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Torrie, G. M., &amp; Valleau, J. P. (1977). Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. <em>Journal of Computational Physics</em>, 23(2), 187-199. <a href="https://doi.org/10.1016/0021-9991(77)90121-8">https://doi.org/10.1016/0021-9991(77)90121-8</a></p>
<p><strong>Publication</strong>: Journal of Computational Physics, 1977</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{torrie1977nonphysical,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Torrie, Glenn M and Valleau, John P}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Journal of Computational Physics}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{23}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{2}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{187--199}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1977}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Elsevier}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1016/0021-9991(77)90121-8}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Lennard-Jones on Adsorption and Diffusion on Surfaces</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/processes-of-adsorption/</link><pubDate>Sun, 17 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/processes-of-adsorption/</guid><description>Lennard-Jones's 1932 foundational paper introducing potential energy surface models to unify physical and chemical adsorption.</description><content:encoded><![CDATA[<h2 id="the-theoretical-foundation-of-adsorption-and-diffusion">The Theoretical Foundation of Adsorption and Diffusion</h2>
<p>This paper represents a foundational <strong>Theory</strong> contribution with dual elements of <strong>Systematization</strong>. It derives physical laws for adsorption potentials (Section 2) and diffusion kinetics (Section 4) from first principles, validating them against external experimental data (Ward, Benton). It bridges <strong>electronic structure theory</strong> (potential curves) and <strong>statistical mechanics</strong> (diffusion rates). It provides a unifying theoretical framework to explain a range of experimental observations.</p>
<h2 id="reconciling-physisorption-and-chemisorption">Reconciling Physisorption and Chemisorption</h2>
<p>The primary motivation was to reconcile conflicting experimental evidence regarding the nature of gas-solid interactions. At the time, it was observed that the same gas and solid could interact weakly at low temperatures (consistent with van der Waals forces) but exhibit strong, chemical-like bonding at higher temperatures, a process requiring significant activation energy. The paper seeks to provide a single, coherent model that can explain both &ldquo;physical adsorption&rdquo; (physisorption) and &ldquo;activated&rdquo; or &ldquo;chemical adsorption&rdquo; (chemisorption) and the transition between them.</p>
<h2 id="quantum-mechanical-potential-energy-surfaces-for-adsorption">Quantum Mechanical Potential Energy Surfaces for Adsorption</h2>
<p>The core novelty is the application of quantum mechanical potential energy surfaces to the problem of surface adsorption. The key conceptual breakthroughs are:</p>
<ol>
<li>
<p><strong>Dual Potential Energy Curves</strong>: The paper proposes that the state of the system must be described by at least two distinct potential energy curves as a function of the distance from the surface:</p>
<ul>
<li>One curve represents the interaction of the intact molecule with the surface (e.g., H₂ with a metal). This corresponds to weak, long-range van der Waals forces.</li>
<li>A second curve represents the interaction of the dissociated constituent atoms with the surface (e.g., 2H atoms with the metal). This corresponds to strong, short-range chemical bonds.</li>
</ul>
</li>
<li>
<p><strong>Activated Adsorption via Curve Crossing</strong>: The transition from the molecular (physisorbed) state to the atomic (chemisorbed) state occurs at the intersection of these two potential energy curves. For a molecule to dissociate and chemisorb, it must possess sufficient energy to reach this crossing point. This energy is identified as the <strong>energy of activation</strong>, which had been observed experimentally.</p>
</li>
<li>
<p><strong>Unified Model</strong>: This model unifies physisorption and chemisorption into a single continuous process. A molecule approaching the surface is first trapped in the shallow potential well of the physisorption curve. If it acquires enough thermal energy to overcome the activation barrier, it can transition to the much deeper potential well of the chemisorption state. This provides a clear physical picture for temperature-dependent adsorption phenomena.</p>
</li>
<li>
<p><strong>Quantum Mechanical Basis for Cohesion</strong>: To explain the nature of the chemisorption bond itself, Lennard-Jones draws on the then-recent quantum theory of metals (Sommerfeld, Bloch). In a metal, electrons are not bound to individual atoms but instead occupy shared energy states (bands) spread across the crystal. When an atom approaches the surface, local energy levels form in the gap between the bulk bands, creating sites where bonding can occur. The adsorption bond arises from the interaction between the valency electron of the approaching atom and conduction electrons of the metal, forming a closed shell analogous to a homopolar bond.</p>
</li>
</ol>
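<p>The two-curve picture is straightforward to reproduce numerically. The Morse-like forms and all parameters below are toy values chosen only for the qualitative shape of the Lennard-Jones diagram, not fitted to any real system:</p>

```python
import numpy as np

# Schematic 1D version of the Lennard-Jones diagram: a shallow molecular
# (van der Waals) curve and a deep atomic (chemical) curve offset far from
# the surface by the molecule's dissociation energy. Toy parameters only.
z = np.linspace(0.8, 6.0, 5001)                  # distance from the surface

def morse(z, depth, a, z0):
    return depth * (1.0 - np.exp(-a * (z - z0)))**2 - depth

V_mol = morse(z, 0.05, 1.0, 3.0)                 # physisorption: shallow, far out
V_atom = morse(z, 1.50, 1.5, 1.5) + 0.60         # chemisorption well + dissociation cost

# The activation energy is the height of the outermost curve crossing
# above the molecular (physisorbed) minimum.
crossings = np.nonzero(np.diff(np.sign(V_atom - V_mol)))[0]
i = crossings[-1]
E_act = V_mol[i] - V_mol.min()
print(round(z[i], 2), round(E_act, 3))
```

<p>Shifting either curve (e.g., deepening the atomic well to mimic a more reactive surface) moves the crossing and lowers $E_{act}$, which is the geometric content of the catalysis argument in the paper.</p>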
<h2 id="validating-theory-against-experimental-gas-solid-interactions">Validating Theory Against Experimental Gas-Solid Interactions</h2>
<p>This is a theoretical paper with no original experiments performed by the author. However, Lennard-Jones validates his theoretical framework against existing experimental data from other researchers:</p>
<ul>
<li><strong>Ward&rsquo;s data</strong>: Hydrogen absorption on copper, used to validate the square root time law for slow sorption kinetics (§4)</li>
<li><strong>Activated adsorption experiments</strong>: Benton and White (hydrogen on nickel), Taylor and Williamson, and Taylor and McKinney all provided isobar data showing temperature-dependent transitions between adsorption types (§3). Garner and Kingman documented three distinct adsorption regimes at different temperatures.</li>
<li><strong>van der Waals constant data</strong>: Used existing measurements of diamagnetic susceptibility to calculate predicted heats of adsorption (e.g., argon on copper yielding approximately 6000 cal/gram atom, nitrogen roughly 2500 cal/gram mol, hydrogen roughly 1300 cal/gram mol)</li>
<li><strong>KCl crystal calculations</strong>: Computed the full attractive potential field of argon above a KCl crystal lattice, accounting for the discrete ionic structure to produce detailed potential energy curves at different surface positions (§2)</li>
</ul>
<p>The validation approach involves deriving theoretical predictions from first principles and showing they match the functional form and magnitude of independently measured experimental results.</p>
<h2 id="the-lennard-jones-diagram-and-activated-adsorption">The Lennard-Jones Diagram and Activated Adsorption</h2>
<p><strong>Key Outcomes</strong>:</p>
<ul>
<li>The paper introduced the now-famous Lennard-Jones diagram for surface interactions, plotting potential energy versus distance from the surface for both molecular and dissociated atomic species. This graphical model became a cornerstone of surface science.</li>
<li>Derived the square root time law ($S \propto \sqrt{t}$) for slow sorption kinetics, validated against Ward&rsquo;s experimental data.</li>
<li>Established quantitative connection between adsorption potentials and measurable atomic properties (diamagnetic susceptibility).</li>
</ul>
<p><strong>Conclusions</strong>:</p>
<ul>
<li>The nature of adsorption is determined by the interplay between two distinct potential states (molecular and atomic).</li>
<li>&ldquo;Activated adsorption&rdquo; is the process of overcoming an energy barrier to transition from a physically adsorbed molecular state to a chemically adsorbed atomic state.</li>
<li>The model predicts that the specific geometry of the surface (i.e., the lattice spacing) and the orientation of the approaching molecule are critical, as they influence the shape of the potential energy surfaces and thus the magnitude of the activation energy.</li>
<li>The reverse process (recombination of atoms and desorption of a molecule) also requires activation energy to move from the chemisorbed state back to the molecular state.</li>
<li>This entire mechanism is proposed as a fundamental factor in heterogeneous <strong>catalysis</strong>, where the surface acts to lower the activation energy for molecular dissociation, facilitating chemical reactions.</li>
</ul>
<p><strong>Limitations</strong>:</p>
<ul>
<li>The initial &ldquo;method of images&rdquo; derivation assumes a perfectly continuous conducting surface, an approximation that breaks down at the atomic orbital level close to the surface.</li>
<li>While Lennard-Jones uses one-dimensional calculations to estimate initial potential well depths, he later qualitatively extends this to 3D &ldquo;contour tunnels&rdquo; to explain surface migration. However, these early geometric approximations lack the many-body, multi-dimensional complexity natively handled by modern Density Functional Theory (DFT) simulations.</li>
</ul>
<hr>
<h2 id="mathematical-derivations">Mathematical Derivations</h2>
<h3 id="van-der-waals-calculation-section-2">Van der Waals Calculation (Section 2)</h3>
<p>The paper derives the attractive force between a neutral atom and a metal surface using the <strong>classical method of electrical images</strong>. The key steps are:</p>
<ol>
<li><strong>Method of Images</strong>: Lennard-Jones models the metal as a continuum of perfectly mobile electric fluid (a perfectly polarisable system). When a neutral atom approaches, its instantaneous dipole moment induces image charges in the metal surface.</li>
</ol>
<figure class="post-figure center ">
    <img src="/img/notes/method-of-images-atom-surface.webp"
         alt="Diagram showing an atom with nucleus (&#43;Ne) and electrons (-e) at distance R from a conducting surface, with its electrical image reflected on the opposite side"
         title="Diagram showing an atom with nucleus (&#43;Ne) and electrons (-e) at distance R from a conducting surface, with its electrical image reflected on the opposite side"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">An atom and its electrical image in a conducting surface. The nucleus (+Ne) and electrons create mirror charges across the metal plane.</figcaption>
    
</figure>

<ol start="2">
<li><strong>The Interaction Potential</strong>: The resulting potential energy $W$ of an atom at distance $R$ from the metal surface is:</li>
</ol>
<p>$$W = -\frac{e^2 \overline{r^2}}{6R^3}$$</p>
<p>where $\overline{r^2}$ is the mean square distance of electrons from the nucleus.</p>
<ol start="3">
<li><strong>Connection to Measurable Properties</strong>: This theoretical potential can be calculated using <strong>diamagnetic susceptibility</strong> ($\chi$). The interaction simplifies to:</li>
</ol>
<p>$$W = \mu R^{-3}$$</p>
<p>where $\mu = mc^2\chi/L$, with $m$ the electron mass, $c$ the speed of light, $\chi$ the diamagnetic susceptibility, and $L$ Loschmidt&rsquo;s number ($6.06 \times 10^{23}$, the constant now called Avogadro&rsquo;s number). This connects the adsorption potential to measurable magnetic properties of the atom.</p>
<ol start="4">
<li><strong>Repulsive Forces and Equilibrium</strong>: By assuming repulsive forces account for approximately 40% of the potential at equilibrium, Lennard-Jones estimates heats of adsorption. For argon on copper, this yields approximately 6000 cal per gram atom. Similar calculations give roughly 2500 cal/gram mol for nitrogen on copper and 1300 cal/gram mol for hydrogen.</li>
</ol>
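<p>As a concrete illustration, the chain from susceptibility to heat of adsorption ($\chi \to \mu \to W \to Q$) can be traced numerically. The sketch below works in CGS units; the argon susceptibility and the equilibrium distance $R$ are illustrative modern values chosen by us, not figures quoted from the paper.</p>

```python
# Sketch of the susceptibility route to the adsorption potential (CGS units).
# The susceptibility and equilibrium distance are illustrative values,
# not numbers taken from the 1932 paper.

m = 9.109e-28    # electron mass (g)
c = 2.998e10     # speed of light (cm/s)
L = 6.06e23      # Loschmidt's (Avogadro's) number, as used in the paper
chi = 19.3e-6    # magnitude of argon's molar diamagnetic susceptibility (cm^3/mol)
R = 3.0e-8       # assumed equilibrium distance from the surface (cm)

mu = m * c**2 * chi / L    # mu = m c^2 chi / L  (erg cm^3 per atom)
W = mu / R**3              # attractive well depth, W = mu R^-3 (erg per atom)

# Convert to calories per gram-atom, then apply the paper's assumption that
# repulsion cancels roughly 40% of the attraction at equilibrium.
W_cal = W * L / 4.184e7    # erg per atom -> cal per gram-atom
Q = 0.6 * W_cal            # estimated heat of adsorption

print(f"W = {W_cal:.0f} cal per gram-atom")
print(f"Q = {Q:.0f} cal per gram-atom (after 40% repulsion)")
```

<p>With these placeholder inputs the estimate lands within an order of magnitude of the paper&rsquo;s argon-on-copper figure, which is about as much as a one-dimensional image-charge model can promise.</p>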
<hr>
<h2 id="kinetic-theory-of-slow-sorption-section-4">Kinetic Theory of Slow Sorption (Section 4)</h2>
<p>The paper extends beyond surface phenomena to model how gas <em>enters</em> the bulk solid (absorption). This section is critical for understanding time-dependent adsorption kinetics.</p>
<h3 id="the-cracks-hypothesis">The &ldquo;Cracks&rdquo; Hypothesis</h3>
<p>Lennard-Jones proposes that &ldquo;slow sorption&rdquo; is <strong>lateral diffusion along surface cracks</strong> (fissures between microcrystal boundaries) in the solid. The outer surface presents not a uniform plane but a network of narrow, deep crevasses where gas can penetrate. This reframes the problem: the rate-limiting step is diffusion along these crack walls, explaining why sorption rates differ from predictions based on bulk diffusion coefficients.</p>
<h3 id="the-diffusion-equation">The Diffusion Equation</h3>
<p>The problem is formulated using Fick&rsquo;s second law:</p>
<p>$$\frac{\partial n}{\partial t} = D \frac{\partial^{2}n}{\partial x^{2}}$$</p>
<p>where $n$ is the concentration of adsorbed atoms, $t$ is time, $D$ is the diffusion coefficient, and $x$ is the position along the crack.</p>
<h3 id="derivation-of-the-diffusion-coefficient">Derivation of the Diffusion Coefficient</h3>
<p>The diffusion coefficient is derived from kinetic theory:</p>
<p>$$D = \frac{\bar{c}^2 \tau^2}{2\tau^*}$$</p>
<p>where:</p>
<ul>
<li>$\bar{c}$ is the mean speed of mobile atoms moving parallel to the surface</li>
<li>$\tau$ is the time an atom spends in the mobile (activated) state</li>
<li>$\tau^*$ is the interval between activation events</li>
</ul>
<p>Atoms are &ldquo;activated&rdquo; to a mobile state with energy $E_0$, after which they can migrate along the surface.</p>
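<p>The formula can be evaluated directly. All parameter values below are illustrative placeholders for a surface-hopping process, not numbers drawn from the paper.</p>

```python
# Minimal sketch of the kinetic-theory diffusion coefficient
# D = c_bar^2 tau^2 / (2 tau_star). Parameter values are illustrative.

c_bar = 1.0e4      # mean speed of a mobile atom along the surface (cm/s)
tau = 1.0e-11      # time spent in the mobile (activated) state (s)
tau_star = 1.0e-6  # mean interval between activation events (s)

D = c_bar**2 * tau**2 / (2 * tau_star)   # cm^2/s
print(f"D = {D:.3e} cm^2/s")
```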
<h3 id="the-square-root-law">The Square Root Law</h3>
<p>Solving the diffusion equation for a semi-infinite crack yields the total amount of gas absorbed $S$ as a function of time:</p>
<p>$$S = 2n_0 \sqrt{\frac{Dt}{\pi}}$$</p>
<p>This predicts that <strong>absorption scales with the square root of time</strong>:</p>
<p>$$S \propto \sqrt{t}$$</p>
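<p>The square-root law can be checked numerically: integrate Fick&rsquo;s second law on a semi-infinite crack whose mouth is held at concentration $n_0$, then compare the absorbed amount against $2n_0\sqrt{Dt/\pi}$. The grid and diffusion coefficient below are arbitrary illustrative choices.</p>

```python
import math

# Numerical check of the square-root law: solve dn/dt = D d^2n/dx^2 on a
# semi-infinite crack with fixed mouth concentration n0, then compare the
# absorbed amount S(t) with the closed form S = 2 n0 sqrt(D t / pi).
# D, n0, and the grid are illustrative, not taken from the paper.

D = 1.0                        # diffusion coefficient (arbitrary units)
n0 = 1.0                       # concentration held at the crack mouth
dx = 0.05
dt = 0.4 * dx**2 / D           # explicit scheme is stable for dt <= dx^2/(2D)
nx = 2000                      # deep enough to look semi-infinite at t_end
n = [0.0] * nx
n[0] = n0                      # boundary condition at the mouth

t, t_end = 0.0, 1.0
while t < t_end:
    new = n[:]
    for i in range(1, nx - 1):
        new[i] = n[i] + D * dt / dx**2 * (n[i + 1] - 2 * n[i] + n[i - 1])
    n = new
    t += dt

# Trapezoid-rule integral of the concentration profile = total gas absorbed.
S_numeric = (0.5 * n[0] + sum(n[1:])) * dx
S_exact = 2 * n0 * math.sqrt(D * t / math.pi)
print(S_numeric, S_exact)      # the two should agree to within a few percent
```

<p>Because the exact solution is the complementary error function profile, the agreement improves as the grid is refined, while the $\sqrt{t}$ scaling holds at any resolution.</p>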
<h3 id="experimental-validation">Experimental Validation</h3>
<p>Lennard-Jones validates this derivation by re-analyzing Ward&rsquo;s experimental data on the Copper/Hydrogen system. Plotting the absorbed quantity against $\sqrt{t}$ produces linear curves, confirming the theoretical prediction. From the slope of the $\log_{10}(S^2/q^2t)$ vs. $1/T$ plot, Ward determined an activation energy of 14,100 cal per gram-molecule for the surface diffusion process.</p>
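<p>The Arrhenius analysis behind Ward&rsquo;s figure can be sketched as follows: if the sorption rate constant obeys $k(T) = A\,e^{-E/RT}$, then $\log_{10} k$ versus $1/T$ is a straight line whose slope yields $E$. The data below are synthetic, generated from an assumed activation energy purely to illustrate the fit.</p>

```python
import math

# Sketch of the Arrhenius slope analysis. The data are synthetic, generated
# from an assumed activation energy E_true; the fit should recover it.

R_gas = 1.987            # gas constant (cal / mol K)
E_true = 14100.0         # assumed activation energy used to generate the data
A = 1.0e6                # illustrative pre-exponential factor

temps = [500.0, 550.0, 600.0, 650.0, 700.0]          # K
xs = [1.0 / T for T in temps]                        # 1/T
ys = [math.log10(A * math.exp(-E_true / (R_gas * T))) for T in temps]

# Least-squares slope of log10(k) vs 1/T.
m = len(xs)
xbar, ybar = sum(xs) / m, sum(ys) / m
slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar) ** 2 for x in xs)

# slope = -E / (R ln 10), so:
E_fit = -slope * math.log(10) * R_gas
print(f"recovered E = {E_fit:.0f} cal/mol")
```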
<hr>
<h2 id="surface-topography-and-3d-contours">Surface Topography and 3D Contours</h2>
<p>The derivations above treat adsorption as a one-dimensional problem (distance from the surface). The paper explicitly expands this to three dimensions to explain surface migration.</p>
<h3 id="potential-tunnels">Potential &ldquo;Tunnels&rdquo;</h3>
<p>Lennard-Jones models the surface potential as <strong>3D contour surfaces</strong> resembling &ldquo;underground caverns&rdquo; or tunnels. The potential energy landscape above a crystalline surface has periodic minima and saddle points.</p>
<h3 id="surface-migration">Surface Migration</h3>
<p>Atoms migrate along &ldquo;tunnels&rdquo; of low potential energy between surface atoms. The activation energy for surface diffusion corresponds to the barrier height between adjacent potential wells on the surface. This geometric picture explains:</p>
<ul>
<li>Why certain crystallographic orientations are more reactive</li>
<li>The temperature dependence of surface diffusion rates</li>
<li>The role of surface defects in catalysis</li>
</ul>
<h2 id="reproducibility">Reproducibility</h2>
<p>This is a 1932 theoretical paper with no associated code, datasets, or models. The mathematical derivations are fully presented in the text and can be followed from first principles. The experimental data referenced (Ward&rsquo;s copper/hydrogen measurements, Benton and White&rsquo;s nickel/hydrogen isobars) are cited from independently published sources. No computational artifacts exist.</p>
<ul>
<li><strong>Status</strong>: Closed (theoretical paper, no reproducibility artifacts)</li>
<li><strong>Hardware</strong>: N/A (analytical derivations only)</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Lennard-Jones, J. E. (1932). Processes of Adsorption and Diffusion on Solid Surfaces. <em>Transactions of the Faraday Society</em>, 28, 333-359. <a href="https://doi.org/10.1039/tf9322800333">https://doi.org/10.1039/tf9322800333</a></p>
<p><strong>Publication</strong>: Transactions of the Faraday Society, 1932</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{lennardjones1932processes,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Processes of adsorption and diffusion on solid surfaces}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Lennard-Jones, John Edward}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Transactions of the Faraday Society}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{28}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{333--359}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1932}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Royal Society of Chemistry}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item></channel></rss>