<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Molecular Simulation on Hunter Heidenreich | ML Research Scientist</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/</link><description>Recent content in Molecular Simulation on Hunter Heidenreich | ML Research Scientist</description><image><title>Hunter Heidenreich | ML Research Scientist</title><url>https://hunterheidenreich.com/img/avatar.webp</url><link>https://hunterheidenreich.com/img/avatar.webp</link></image><generator>Hugo -- 0.147.7</generator><language>en-US</language><copyright>2026 Hunter Heidenreich</copyright><lastBuildDate>Sat, 11 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://hunterheidenreich.com/notes/chemistry/molecular-simulation/index.xml" rel="self" type="application/rss+xml"/><item><title>Ewald Message Passing for Molecular Graphs</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ewald-message-passing-molecular-graphs/</link><pubDate>Tue, 07 Apr 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ewald-message-passing-molecular-graphs/</guid><description>Ewald message passing augments GNNs with Fourier-space long-range interactions, improving energy predictions by 10-16% on OC20 and OE62 benchmarks.</description><content:encoded><![CDATA[<h2 id="a-fourier-space-long-range-correction-for-molecular-gnns">A Fourier-Space Long-Range Correction for Molecular GNNs</h2>
<p>This is a <strong>Method</strong> paper that introduces Ewald message passing (Ewald MP), a general framework for incorporating long-range interactions into message passing neural networks (MPNNs) for molecular <a href="/notes/chemistry/molecular-simulation/learning-smooth-interatomic-potentials/">potential energy surface</a> prediction. The key contribution is a nonlocal Fourier-space message passing scheme, grounded in the classical <a href="https://en.wikipedia.org/wiki/Ewald_summation">Ewald summation</a> technique from computational physics, that complements the short-range message passing of existing GNN architectures.</p>
<h2 id="the-long-range-interaction-problem-in-molecular-gnns">The Long-Range Interaction Problem in Molecular GNNs</h2>
<p>Standard MPNNs for molecular property prediction rely on a spatial distance cutoff to define atomic neighborhoods. While this locality assumption enables favorable scaling with system size and provides a useful inductive bias, it fundamentally limits the model&rsquo;s ability to capture long-range interactions such as electrostatic forces and van der Waals (<a href="https://en.wikipedia.org/wiki/London_dispersion_force">London dispersion</a>) interactions. These interactions decay slowly with distance (e.g., electrostatic energy follows a $1/r$ power law), and truncating them with a distance cutoff can introduce severe artifacts in thermochemical predictions.</p>
<p>This problem is well-known in molecular dynamics, where empirical force fields explicitly separate bonded (short-range) and non-bonded (long-range) energy terms. The Ewald summation technique addresses this by decomposing interactions into a short-range part that converges quickly with a distance cutoff and a long-range part whose Fourier transform converges quickly with a frequency cutoff. The authors propose bringing this same strategy into the GNN paradigm.</p>
<h2 id="from-ewald-summation-to-learnable-fourier-space-messages">From Ewald Summation to Learnable Fourier-Space Messages</h2>
<p>The core insight is a formal analogy between the continuous-filter convolution used in MPNNs and the electrostatic potential computation in Ewald summation. In a standard continuous-filter convolution, the message sum for atom $i$ is:</p>
<p>$$
M_i^{(l+1)} = \sum_{j \in \mathcal{N}(i)} h_j^{(l)} \cdot \Phi^{(l)}(| \mathbf{x}_i - \mathbf{x}_j |)
$$</p>
<p>where $h_j^{(l)}$ are atom embeddings and $\Phi^{(l)}$ is a learned radial filter. Comparing this to the electrostatic potential $V_i^{\text{es}}(\mathbf{x}_i) = \sum_{j \neq i} q_j \cdot \Phi^{\text{es}}(| \mathbf{x}_i - \mathbf{x}_j |)$ reveals a direct correspondence: atom embeddings play the role of partial charges, and learned filters replace the $1/r$ kernel.</p>
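<p>As an illustration, the short-range message sum can be written as a naive double loop (a minimal numpy sketch, not the authors' implementation; <code>radial_filter</code> stands in for the learned filter $\Phi^{(l)}$, which in practice is a neural network):</p>

```python
import numpy as np

def message_sum(x, h, radial_filter, cutoff):
    """Continuous-filter convolution: M_i = sum_{j in N(i)} h_j * Phi(|x_i - x_j|).

    x: (N, 3) atom positions; h: (N, d) atom embeddings;
    radial_filter: maps a distance to a (d,) filter (elementwise product).
    """
    M = np.zeros_like(h)
    for i in range(len(x)):
        for j in range(len(x)):
            if i == j:
                continue
            r = np.linalg.norm(x[i] - x[j])
            if r < cutoff:  # locality: the distance cutoff defines the neighborhood
                M[i] += h[j] * radial_filter(r)
    return M
```

<p>Real implementations vectorize this over a precomputed neighbor list, but the distance cutoff is the essential feature: any atom beyond it contributes nothing.</p>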
<p>Ewald MP decomposes the learned filter into short-range and long-range components. The short-range part is handled by any existing GNN architecture with a distance cutoff. The long-range part is computed as a sum over Fourier frequencies:</p>
<p>$$
M^{\text{lr}}(\mathbf{x}_i) = \sum_{\mathbf{k}} \exp(i \mathbf{k}^T \mathbf{x}_i) \cdot s_{\mathbf{k}} \cdot \hat{\Phi}^{\text{lr}}(| \mathbf{k} |)
$$</p>
<p>where $s_{\mathbf{k}}$ are <strong><a href="https://en.wikipedia.org/wiki/Structure_factor">structure factor</a> embeddings</strong>, computed as:</p>
<p>$$
s_{\mathbf{k}} = \sum_{j \in \mathcal{S}} h_j \exp(-i \mathbf{k}^T \mathbf{x}_j)
$$</p>
<p>These structure factor embeddings are a Fourier-space representation of the atom embedding distribution, and truncating to low frequencies effectively coarse-grains the hidden model state while preserving long-range information. The frequency filters $\hat{\Phi}^{\text{lr}}$ are learned, making the entire scheme data-driven rather than tied to a fixed physical functional form.</p>
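<p>The two Fourier-space equations above translate directly into numpy (an illustrative toy, not the authors' implementation; <code>freq_filter</code> stands in for the learned frequency filter $\hat{\Phi}^{\text{lr}}$):</p>

```python
import numpy as np

def structure_factors(x, h, k_vectors):
    """s_k = sum_j h_j * exp(-i k^T x_j): Fourier-space view of the embeddings.

    x: (N, 3) positions; h: (N, d) embeddings; k_vectors: (K, 3) frequencies.
    Returns a complex (K, d) array of structure factor embeddings.
    """
    phase = np.exp(-1j * (k_vectors @ x.T))  # (K, N) complex phases
    return phase @ h                         # (K, d)

def long_range_messages(x, s, k_vectors, freq_filter):
    """M_lr(x_i) = Re sum_k exp(i k^T x_i) * s_k * filter(|k|)."""
    phase = np.exp(1j * (x @ k_vectors.T))              # (N, K)
    w = freq_filter(np.linalg.norm(k_vectors, axis=1))  # (K,) filter values
    return (phase @ (s * w[:, None])).real              # (N, d)
```

<p>Both matrix products are $(N, K)$-shaped, which is where the $\mathcal{O}(N_{\text{at}} N_{\text{k}})$ cost comes from: every atom interacts with every retained frequency, not with every other atom.</p>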
<p>The method handles both <strong>periodic</strong> systems (where the <a href="https://en.wikipedia.org/wiki/Reciprocal_lattice">reciprocal lattice</a> provides a natural frequency discretization) and <strong>aperiodic</strong> systems (where the Fourier domain is discretized using a cubic voxel grid with SVD-based rotation alignment to preserve rotation invariance). The combined embedding update becomes:</p>
<p>$$
h_i^{(l+1)} = \frac{1}{\sqrt{3}} \left[ h_i^{(l)} + f_{\text{upd}}^{\text{sr}}(M_i^{\text{sr}}) + f_{\text{upd}}^{\text{lr}}(M_i^{\text{lr}}) \right]
$$</p>
<p>The computational complexity is $\mathcal{O}(N_{\text{at}} N_{\text{k}})$, and by fixing the number of frequency vectors $N_{\text{k}}$, linear scaling $\mathcal{O}(N_{\text{at}})$ is achievable.</p>
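<p>The combined update above is a one-liner; the $1/\sqrt{3}$ factor keeps the variance of a sum of three roughly independent, unit-variance terms near one (a sketch; <code>f_sr</code> and <code>f_lr</code> stand in for the learned update functions):</p>

```python
import numpy as np

def combined_update(h, m_sr, m_lr, f_sr, f_lr):
    """h^{l+1} = (h^l + f_sr(M_sr) + f_lr(M_lr)) / sqrt(3).

    The 1/sqrt(3) normalization keeps activation scales stable when
    the three terms have comparable, roughly independent variance.
    """
    return (h + f_sr(m_sr) + f_lr(m_lr)) / np.sqrt(3.0)
```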
<h2 id="experiments-across-four-gnn-architectures-and-two-datasets">Experiments Across Four GNN Architectures and Two Datasets</h2>
<p>The authors test Ewald MP as an augmentation on four baseline architectures: <a href="/notes/chemistry/datasets/marcel/">SchNet, PaiNN, DimeNet++, and GemNet-T</a>. Two datasets are used:</p>
<ul>
<li><strong>OC20</strong> (Chanussot et al., 2021): ~265M periodic structures of adsorbate-catalyst systems with DFT-computed energies and forces. The OC20-2M subsplit is used for training.</li>
<li><strong>OE62</strong> (Stuke et al., 2020): ~62,000 large aperiodic organic molecules with DFT-computed energies that include a DFT-D3 dispersion correction for London dispersion interactions.</li>
</ul>
<p>All baselines use a 6 Å distance cutoff and 50 maximum neighbors. The Ewald modification is minimal: the long-range message sum is added as an additional skip connection term in each interaction block. Comparison studies include: (1) increasing the distance cutoff to match the computational cost of Ewald MP, (2) replacing the Ewald block with a SchNet interaction block at increased cutoff, and (3) increasing atom embedding dimensions to match Ewald MP&rsquo;s parameter count.</p>
<h3 id="key-energy-mae-results-on-oe62">Key Energy MAE Results on OE62</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Baseline (meV)</th>
          <th>Ewald MP (meV)</th>
          <th>Improvement</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>SchNet</td>
          <td>133.5</td>
          <td>79.2</td>
          <td>40.7%</td>
      </tr>
      <tr>
          <td>PaiNN</td>
          <td>61.4</td>
          <td>57.9</td>
          <td>5.7%</td>
      </tr>
      <tr>
          <td>DimeNet++</td>
          <td>51.2</td>
          <td>46.5</td>
          <td>9.2%</td>
      </tr>
      <tr>
          <td>GemNet-T</td>
          <td>51.5</td>
          <td>47.4</td>
          <td>8.0%</td>
      </tr>
  </tbody>
</table>
<h3 id="key-energy-mae-results-on-oc20-averaged-across-test-splits">Key Energy MAE Results on OC20 (Averaged Across Test Splits)</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Baseline (meV)</th>
          <th>Ewald MP (meV)</th>
          <th>Improvement</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>SchNet</td>
          <td>895</td>
          <td>830</td>
          <td>7.3%</td>
      </tr>
      <tr>
          <td>PaiNN</td>
          <td>448</td>
          <td>393</td>
          <td>12.3%</td>
      </tr>
      <tr>
          <td>DimeNet++</td>
          <td>496</td>
          <td>445</td>
          <td>10.4%</td>
      </tr>
      <tr>
          <td>GemNet-T</td>
          <td>346</td>
          <td>307</td>
          <td>11.3%</td>
      </tr>
  </tbody>
</table>
<h2 id="robust-long-range-improvements-and-dispersion-recovery">Robust Long-Range Improvements and Dispersion Recovery</h2>
<p>Ewald MP achieves consistent improvements across all models and both datasets, averaging 16.1% on OE62 and 10.3% on OC20. Several findings stand out:</p>
<ol>
<li>
<p><strong>Robustness</strong>: Unlike the increased-cutoff and SchNet-LR alternatives, Ewald MP never produces detrimental effects in any tested configuration. The increased cutoff setting hurts SchNet and PaiNN on OE62, and the SchNet-LR block fails to improve DimeNet++ and GemNet-T.</p>
</li>
<li>
<p><strong>Long-range specificity</strong>: A binning analysis on OE62 groups molecules by the magnitude of their DFT-D3 dispersion correction. Ewald MP shows an outsize improvement for structures with large long-range energy contributions. It recovers or surpasses a &ldquo;cheating&rdquo; baseline that receives the exact DFT-D3 ground truth as an additional input.</p>
</li>
<li>
<p><strong>Efficiency on periodic systems</strong>: Ewald MP achieves similar relative improvements on OC20 at roughly half the relative computational cost compared to OE62, suggesting periodic structures as a particularly attractive application domain.</p>
</li>
<li>
<p><strong>Force predictions</strong>: Improvements in <a href="/notes/chemistry/molecular-simulation/dark-side-of-forces/">force MAEs</a> are consistent but small, which is expected since the frequency truncation removes high-frequency contributions to the potential energy surface.</p>
</li>
<li>
<p><strong>Ablation studies</strong>: Results are robust across different frequency cutoffs, voxel resolutions, and filtering strategies, with the non-radial periodic filtering scheme outperforming radial alternatives on out-of-distribution generalization.</p>
</li>
</ol>
<p>Limitations include the current focus on scalar (invariant) embeddings only (PaiNN&rsquo;s equivariant vector embeddings are not augmented), and the potential for a &ldquo;gap&rdquo; of medium-range interactions when $N_{\text{k}}$ is fixed for linear scaling. The authors suggest adapting more efficient Ewald summation variants (e.g., particle mesh Ewald with $\mathcal{O}(N \log N)$ scaling) as future work.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Training (periodic)</td>
          <td>OC20-2M</td>
          <td>~2M structures</td>
          <td>Subsplit of OC20; PBC; DFT energies and forces</td>
      </tr>
      <tr>
          <td>Training (aperiodic)</td>
          <td>OE62</td>
          <td>~62,000 molecules</td>
          <td>Large organic molecules; DFT energies with D3 correction</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>OC20-test (4 splits: ID, OOD-ads, OOD-cat, OOD-both)</td>
          <td>Varies</td>
          <td>Evaluated via submission to OC20 evaluation server</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>OE62-val, OE62-test</td>
          <td>~6,000 each</td>
          <td>Direct evaluation</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li>Ewald message passing is integrated as an additional skip connection term in each interaction block</li>
<li>For periodic systems: non-radial filtering with fixed reciprocal lattice positions ($N_x, N_y, N_z$ hyperparameters)</li>
<li>For aperiodic systems: radial Gaussian basis function filtering with frequency cutoff $c_k$ and voxel resolution $\Delta = 0.2$ Å$^{-1}$</li>
<li>SVD-based coordinate alignment for rotation invariance in the aperiodic case</li>
<li>Bottleneck dimension $N_\downarrow = 16$ (GemNet-T) or $N_\downarrow = 8$ (others)</li>
<li>Update function: dense layer + $N_{\text{hidden}}$ residual layers ($N_{\text{hidden}} = 3$, except PaiNN with $N_{\text{hidden}} = 0$)</li>
</ul>
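<p>The SVD-based alignment for the aperiodic case can be sketched as follows (a simplified illustration: the paper's scheme additionally resolves the sign ambiguity of the singular vectors, which this sketch does not):</p>

```python
import numpy as np

def align_frame(x):
    """Rotate centered coordinates into a canonical frame via SVD.

    With x_c = U S V^T, the rotated coordinates x_c V = U S have a
    diagonal covariance, so the Fourier voxel grid sees any rotated
    copy of a structure in the same canonical orientation
    (up to per-axis sign conventions, not handled here).
    """
    xc = x - x.mean(axis=0)                         # remove translation
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return xc @ vt.T                                # project onto principal axes
```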
<h3 id="models">Models</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Embedding Size (OE62)</th>
          <th>Interaction Blocks</th>
          <th>Ewald Params (OE62)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>SchNet</td>
          <td>512</td>
          <td>4</td>
          <td>12.2M total</td>
      </tr>
      <tr>
          <td>PaiNN</td>
          <td>512</td>
          <td>4</td>
          <td>15.7M total</td>
      </tr>
      <tr>
          <td>DimeNet++</td>
          <td>256</td>
          <td>3</td>
          <td>4.8M total</td>
      </tr>
      <tr>
          <td>GemNet-T</td>
          <td>256</td>
          <td>3</td>
          <td>16.1M total</td>
      </tr>
  </tbody>
</table>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li>Primary metric: Energy mean absolute error (EMAE) in meV</li>
<li>Secondary metric: Force MAE in meV/Å (OC20 only)</li>
<li>Loss: Linear combination of energy and force MAEs (Eq. 15) with model-specific force multipliers</li>
<li>Optimizer: Adam with weight decay ($\lambda = 0.01$)</li>
</ul>
<h3 id="hardware">Hardware</h3>
<ul>
<li>All runtime measurements on NVIDIA A100 GPUs</li>
<li>Runtimes measured after 50 warmup batches, averaged over 500 batches, minimum of 3 repetitions</li>
<li>Code: <a href="https://github.com/arthurkosmala/EwaldMP">EwaldMP</a> (Hippocratic License 3.0)</li>
</ul>
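<p>The timing protocol amounts to the following (a CPU-only sketch of the described measurement procedure; on a GPU each timestamp would additionally require a device synchronization):</p>

```python
import time

def measure_runtime(fn, batches, n_warmup=50, n_measure=500, n_reps=3):
    """Warmup batches are executed but not timed, the timed batches are
    averaged, and the minimum over repetitions is reported.
    `fn` is any callable taking one batch; `batches` must be reusable.
    """
    per_batch = []
    for _ in range(n_reps):
        for b in batches[:n_warmup]:
            fn(b)                        # warmup: excluded from timing
        t0 = time.perf_counter()
        for b in batches[:n_measure]:
            fn(b)
        per_batch.append((time.perf_counter() - t0) / n_measure)
    return min(per_batch)
```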
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/arthurkosmala/EwaldMP">EwaldMP</a></td>
          <td>Code</td>
          <td>Hippocratic License 3.0 (new files) / MIT (OC20 base)</td>
          <td>Official implementation built on the Open Catalyst Project codebase</td>
      </tr>
      <tr>
          <td><a href="https://github.com/Open-Catalyst-Project/ocp/blob/main/DATASET.md">OC20</a></td>
          <td>Dataset</td>
          <td>CC-BY-4.0</td>
          <td>~265M periodic adsorbate-catalyst structures with DFT energies and forces</td>
      </tr>
      <tr>
          <td><a href="https://doi.org/10.1038/s41597-020-0385-y">OE62</a></td>
          <td>Dataset</td>
          <td>CC-BY-4.0</td>
          <td>~62,000 large organic molecules with DFT energies including D3 correction</td>
      </tr>
  </tbody>
</table>
<p><strong>Reproducibility status</strong>: Highly Reproducible. Source code, both datasets, and detailed hyperparameters (including per-model learning rates, batch sizes, and Ewald-specific settings) are all publicly available. Pre-trained model weights are not provided.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Kosmala, A., Gasteiger, J., Gao, N., &amp; Günnemann, S. (2023). Ewald-based Long-Range Message Passing for Molecular Graphs. In <em>Proceedings of the 40th International Conference on Machine Learning (ICML 2023)</em>.</p>
<p><strong>Publication</strong>: ICML 2023</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{kosmala2023ewald,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Ewald-based Long-Range Message Passing for Molecular Graphs}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Kosmala, Arthur and Gasteiger, Johannes and Gao, Nicholas and G{\&#34;u}nnemann, Stephan}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 40th International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2023}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{PMLR}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{202}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>PharMolixFM: Multi-Modal All-Atom Molecular Models</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/pharmolixfm-all-atom-foundation-models/</link><pubDate>Sat, 28 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/pharmolixfm-all-atom-foundation-models/</guid><description>PharMolixFM unifies diffusion, flow matching, and Bayesian flow networks for all-atom molecular modeling and generation with task-specific denoising priors.</description><content:encoded><![CDATA[<h2 id="a-unified-framework-for-all-atom-molecular-foundation-models">A Unified Framework for All-Atom Molecular Foundation Models</h2>
<p>PharMolixFM is a <strong>Method</strong> paper that introduces a unified framework for constructing all-atom foundation models for molecular modeling and generation. The primary contribution is the systematic implementation of three multi-modal generative model variants (diffusion, flow matching, and Bayesian flow networks) within a single architecture, along with a task-unifying denoising formulation that enables training on multiple structural biology tasks simultaneously. The framework achieves competitive performance on protein-small-molecule docking and structure-based drug design while providing the first empirical analysis of inference scaling laws for molecular generative models.</p>
<h2 id="challenges-in-multi-modal-atomic-modeling">Challenges in Multi-Modal Atomic Modeling</h2>
<p>Existing all-atom foundation models such as AlphaFold3, RoseTTAFold All-Atom, and ESM-AA face two core challenges that limit their generalization across molecular modeling and generation tasks.</p>
<p>First, atomic data is inherently multi-modal: each atom comprises both a discrete atom type and continuous 3D coordinates. This poses challenges for structure models that need to jointly capture and predict both modalities. Unlike text or image data that exhibit a single modality, molecular structures require generative models that can handle discrete categorical variables (atom types, bond types) and continuous variables (coordinates) simultaneously.</p>
<p>Second, there has been no comprehensive analysis of how different training objectives and sampling strategies impact the performance of all-atom foundation models. Prior work has focused on individual model architectures without systematically comparing generative frameworks or studying how inference-time compute scaling affects prediction quality.</p>
<p>PharMolixFM addresses both challenges by providing a unified framework that implements three state-of-the-art multi-modal generative models and formulates all downstream tasks as a generalized denoising process with task-specific priors.</p>
<h2 id="multi-modal-denoising-with-task-specific-priors">Multi-Modal Denoising with Task-Specific Priors</h2>
<p>The core innovation of PharMolixFM is the formulation of molecular tasks as a generalized denoising process where task-specific priors control which parts of the molecular system are noised during training. The framework decomposes a biomolecular system into $N$ atoms represented as a triplet $\bar{\mathbf{S}}_0 = \langle \mathbf{X}_0, \mathbf{A}_0, \mathbf{E}_0 \rangle$, where $\mathbf{X}_0 \in \mathbb{R}^{N \times 3}$ are atom coordinates, $\mathbf{A}_0 \in \mathbb{Z}^{N \times D_1}$ are one-hot atom types, and $\mathbf{E}_0 \in \mathbb{Z}^{N \times N \times D_2}$ are one-hot bond types.</p>
<p>The generative model estimates the density $p_\theta(\langle \mathbf{X}_0, \mathbf{A}_0, \mathbf{E}_0 \rangle)$ subject to SE(3) invariance:</p>
<p>$$
p_\theta(\langle \mathbf{R}\mathbf{X}_0 + \mathbf{t}, \mathbf{A}_0, \mathbf{E}_0 \rangle) = p_\theta(\langle \mathbf{X}_0, \mathbf{A}_0, \mathbf{E}_0 \rangle)
$$</p>
<p>The variational lower bound is optimized over latent variables $S_1, \ldots, S_T$ obtained by adding independent noise to different modalities and atoms:</p>
<p>$$
q(S_{1:T} \mid S_0) = \prod_{i=1}^{T} \prod_{j=1}^{N} q(\mathbf{X}_{i,j} \mid \mathbf{X}_{0,j}, \sigma_{i,j}^{(\mathbf{X})})\, q(\mathbf{A}_{i,j} \mid \mathbf{A}_{0,j}, \sigma_{i,j}^{(\mathbf{A})})\, q(\mathbf{E}_{i,j} \mid \mathbf{E}_{0,j}, \sigma_{i,j}^{(\mathbf{E})})
$$</p>
<p>A key design choice is the noise schedule $\sigma_{i,j}^{(\mathcal{M})} = \frac{i}{T} \cdot (1 - \text{fix}_j^{(\mathcal{M})})$, where $\text{fix}_j^{(\mathcal{M})}$ is a factor between 0 and 1 that controls which atoms and modalities are held fixed: quantities with $\text{fix} = 1$ receive no noise and serve as conditioning inputs, while quantities with $\text{fix} = 0$ are fully noised and become generation targets. This &ldquo;Fix&rdquo; mechanism enables multiple training tasks:</p>
<ul>
<li><strong>Docking</strong> ($\text{Fix} = 1$ for protein and molecular graph, $\text{Fix} = 0$ for molecule coordinates): predicts binding pose given known atom/bond types.</li>
<li><strong>Structure-based drug design</strong> ($\text{Fix} = 1$ for protein, $\text{Fix} = 0$ for all molecule properties): generates novel molecules for a given pocket.</li>
<li><strong>Robustness augmentation</strong> ($\text{Fix} = 0.7$ for 15% randomly selected atoms, $\text{Fix} = 0$ for rest): simulates partial structure determination.</li>
</ul>
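<p>The task settings above imply that $\text{Fix} = 1$ marks a quantity as held fixed (a clean conditioning input) and $\text{Fix} = 0$ as fully noised (a generation target). Under that convention, the per-step, per-atom noise scales can be sketched as:</p>

```python
import numpy as np

def noise_schedule(T, fix):
    """Per-step, per-atom noise scales from the Fix mechanism.

    Convention assumed here (matching the task settings above):
    fix[j] = 1 -> quantity j is held fixed (conditioning, never noised),
    fix[j] = 0 -> quantity j is fully noised by the final step.
    Returns a (T, N) array of sigma values.
    """
    fix = np.asarray(fix, dtype=float)
    steps = np.arange(1, T + 1, dtype=float)[:, None] / T  # (T, 1) ramp i/T
    return steps * (1.0 - fix)[None, :]                    # (T, N)
```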
<h3 id="three-generative-model-variants">Three Generative Model Variants</h3>
<p><strong>Multi-modal diffusion (PharMolixFM-Diff)</strong> uses a Markovian forward process. Continuous coordinates follow Gaussian diffusion while discrete variables use a D3PM categorical transition:</p>
<p>$$
q(\mathbf{X}_{i,j} \mid \mathbf{X}_{0,j}) = \mathcal{N}(\sqrt{\alpha_{i,j}}\,\mathbf{X}_{0,j}, (1 - \alpha_{i,j}) \mathbf{I}), \quad \alpha_{i,j} = \prod_{k=1}^{i}(1 - \sigma_{k,j}^{(\mathbf{X})})
$$</p>
<p>$$
q(\mathbf{A}_{i,j} \mid \mathbf{A}_{0,j}) = \text{Cat}(\mathbf{A}_{0,j} \bar{Q}_{i,j}^{(\mathbf{A})}), \quad \bar{Q}_{i,j}^{(\mathbf{A})} = \prod_{k=1}^{i} Q_{k,j}^{(\mathbf{A})}, \quad Q_{i,j}^{(\mathbf{A})} = (1 - \sigma_{i,j}^{(\mathbf{A})}) \mathbf{I} + \frac{\sigma_{i,j}^{(\mathbf{A})}}{D_1} \mathbb{1}\mathbb{1}^T
$$</p>
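<p>The single-step categorical transition is easy to write down explicitly (a small numpy sketch: with probability $1 - \sigma$ the class is kept, otherwise it is resampled uniformly over the $D$ classes):</p>

```python
import numpy as np

def d3pm_transition(sigma, D):
    """Single-step D3PM transition matrix: Q = (1 - sigma) I + (sigma / D) 11^T.

    Each row is a distribution over the D classes: keep the current class
    with probability (1 - sigma), otherwise resample uniformly.
    """
    return (1.0 - sigma) * np.eye(D) + (sigma / D) * np.ones((D, D))
```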
<p>The training loss combines coordinate MSE with cross-entropy for discrete variables:</p>
<p>$$
\mathcal{L} = \mathbb{E}_{S_0, i, S_i} \left[ \lambda_i^{(\mathbf{X})} | \tilde{\mathbf{X}}_0 - \mathbf{X}_0 |_2^2 + \lambda_i^{(\mathbf{A})} \mathcal{L}_{CE}(\tilde{\mathbf{A}}_0, \mathbf{A}_0) + \lambda_i^{(\mathbf{E})} \mathcal{L}_{CE}(\tilde{\mathbf{E}}_0, \mathbf{E}_0) \right]
$$</p>
<p><strong>Multi-modal flow matching (PharMolixFM-Flow)</strong> constructs a direct mapping between data and prior distributions using conditional vector fields. For coordinates, the conditional flow uses a Gaussian path $q(\mathbf{X}_{i,j} \mid \mathbf{X}_{0,j}) = \mathcal{N}((1 - \sigma_{i,j}^{(\mathbf{X})}) \mathbf{X}_{0,j}, (\sigma_{i,j}^{(\mathbf{X})})^2 \mathbf{I})$, while discrete variables use the same D3PM Markov chain. Sampling proceeds by solving an ODE via Euler integration.</p>
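<p>The Euler sampler for the Flow variant takes only a few lines (an illustrative sketch; time and sign conventions vary across flow-matching formulations, and <code>vector_field</code> stands in for the learned model):</p>

```python
import numpy as np

def euler_sample(x1, vector_field, n_steps):
    """Fixed-step Euler integration of the sampling ODE.

    Time runs here from t=1 (prior noise) to t=0 (data); each step moves
    against the vector field. `vector_field(x, t)` is any callable with
    the same signature as the learned model.
    """
    x = np.array(x1, dtype=float)  # start from the prior sample
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = 1.0 - k * dt
        x = x - dt * vector_field(x, t)  # one Euler step toward t=0
    return x
```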
<p><strong>Bayesian flow networks (PharMolixFM-BFN)</strong> perform generative modeling in the parameter space of the data distribution rather than the data space. The Bayesian flow distribution for coordinates is:</p>
<p>$$
p_F(\tilde{\mathbf{X}}_{i,j}^{(\theta)} \mid \mathbf{X}_{0,j}) = \mathcal{N}(\gamma_{i,j} \mathbf{X}_{0,j}, \gamma_{i,j}(1 - \gamma_{i,j}) \mathbf{I}), \quad \gamma_{i,j} = 1 - \alpha^{2(1 - \sigma_{i,j}^{(\mathbf{X})})}
$$</p>
<h3 id="network-architecture">Network Architecture</h3>
<p>The architecture follows PocketXMol with a dual-branch SE(3)-equivariant graph neural network. A protein branch (4-layer GNN with kNN graph) processes pocket atoms, then representations are passed to a molecule branch (6-layer GNN) that captures protein-molecule interactions. Independent prediction heads reconstruct atom coordinates, atom types, and bond types, with additional confidence heads for self-ranking during inference.</p>
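<p>The kNN graph construction underlying both branches can be sketched with a brute-force distance matrix (illustrative only; production code would use a spatial index for large systems):</p>

```python
import numpy as np

def knn_graph(x, k):
    """Edge index of a k-nearest-neighbor graph over atom positions.

    Returns a (2, N*k) array of (source, target) pairs, excluding
    self-loops; each atom points to its k nearest neighbors.
    """
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)  # (N, N) distances
    np.fill_diagonal(d, np.inf)                                 # forbid self-loops
    nbrs = np.argsort(d, axis=1)[:, :k]                         # (N, k) neighbor ids
    src = np.repeat(np.arange(len(x)), k)
    return np.stack([src, nbrs.ravel()])
```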
<h2 id="docking-and-drug-design-experiments">Docking and Drug Design Experiments</h2>
<h3 id="protein-small-molecule-docking">Protein-Small-Molecule Docking</h3>
<p>PharMolixFM is evaluated on the PoseBusters benchmark (428 protein-small-molecule complexes) in the holo docking setting, with a known protein structure and a 10 Å binding pocket. The metric is the fraction of predictions with RMSD &lt; 2 Å.</p>
<table>
  <thead>
      <tr>
          <th>Method</th>
          <th>Self-Ranking (%)</th>
          <th>Oracle-Ranking (%)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>DiffDock</td>
          <td>38.0</td>
          <td>-</td>
      </tr>
      <tr>
          <td>RFAA</td>
          <td>42.0</td>
          <td>-</td>
      </tr>
      <tr>
          <td>Vina</td>
          <td>52.3</td>
          <td>-</td>
      </tr>
      <tr>
          <td>UniMol-Docking V2</td>
          <td>77.6</td>
          <td>-</td>
      </tr>
      <tr>
          <td>SurfDock</td>
          <td>78.0</td>
          <td>-</td>
      </tr>
      <tr>
          <td>AlphaFold3</td>
          <td>90.4</td>
          <td>-</td>
      </tr>
      <tr>
          <td>PocketXMol (50 repeats)</td>
          <td>82.2</td>
          <td>95.3</td>
      </tr>
      <tr>
          <td>PharMolixFM-Diff (50 repeats)</td>
          <td>83.4</td>
          <td>96.0</td>
      </tr>
      <tr>
          <td>PharMolixFM-Flow (50 repeats)</td>
          <td>73.4</td>
          <td>93.7</td>
      </tr>
      <tr>
          <td>PharMolixFM-BFN (50 repeats)</td>
          <td>78.5</td>
          <td>93.5</td>
      </tr>
      <tr>
          <td>PharMolixFM-Diff (500 repeats)</td>
          <td>83.9</td>
          <td>98.1</td>
      </tr>
  </tbody>
</table>
<p>PharMolixFM-Diff achieves the second-best self-ranking result (83.4%), outperforming PocketXMol by 1.2 percentage points but trailing AlphaFold3 (90.4%). The key advantage is inference speed: approximately 4.6 seconds per complex on a single A800 GPU versus approximately 249.0 seconds for AlphaFold3, a 54x speedup. Under oracle-ranking with 500 repeats, PharMolixFM-Diff reaches 98.1%, suggesting that better ranking strategies could further improve practical performance.</p>
<h3 id="structure-based-drug-design">Structure-Based Drug Design</h3>
<p>Evaluation uses the CrossDocked test set (100 protein pockets, 100 molecules generated per pocket), measuring Vina binding affinity scores and drug-likeness properties (QED and SA).</p>
<table>
  <thead>
      <tr>
          <th>Method</th>
          <th>Vina Score (Avg/Med)</th>
          <th>QED</th>
          <th>SA</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Pocket2Mol</td>
          <td>-5.14 / -4.70</td>
          <td>0.57</td>
          <td>0.76</td>
      </tr>
      <tr>
          <td>TargetDiff</td>
          <td>-5.47 / -6.30</td>
          <td>0.48</td>
          <td>0.58</td>
      </tr>
      <tr>
          <td>DecompDiff</td>
          <td>-5.67 / -6.04</td>
          <td>0.45</td>
          <td>0.61</td>
      </tr>
      <tr>
          <td>MolCRAFT</td>
          <td>-6.61 / -8.14</td>
          <td>0.46</td>
          <td>0.62</td>
      </tr>
      <tr>
          <td>PharMolixFM-Diff</td>
          <td>-6.18 / -6.44</td>
          <td>0.50</td>
          <td>0.73</td>
      </tr>
      <tr>
          <td>PharMolixFM-Flow</td>
          <td>-6.34 / -6.47</td>
          <td>0.49</td>
          <td>0.74</td>
      </tr>
      <tr>
          <td>PharMolixFM-BFN</td>
          <td>-6.38 / -6.45</td>
          <td>0.48</td>
          <td>0.64</td>
      </tr>
  </tbody>
</table>
<p>PharMolixFM achieves a better balance between binding affinity and drug-like properties than the diffusion-based baselines. While MolCRAFT achieves the best Vina scores, the PharMolixFM-Diff and -Flow variants show notably higher QED (0.49-0.50 vs. 0.45-0.48 for TargetDiff, DecompDiff, and MolCRAFT) and SA (0.73-0.74 vs. 0.58-0.62); Pocket2Mol scores higher on QED (0.57) but substantially worse on binding affinity. These drug-likeness properties matter for downstream validation and in vivo application.</p>
<h3 id="inference-scaling-law">Inference Scaling Law</h3>
<p>The paper explores whether inference-time scaling holds for molecular generative models, fitting the relationship:</p>
<p>$$
\text{Acc} = a \log(bR + c) + d
$$</p>
<p>where $R$ is the number of sampling repeats. All three PharMolixFM variants exhibit logarithmic improvement in docking accuracy with increased sampling repeats, analogous to inference scaling laws observed in NLP. Performance plateaus eventually due to distributional differences between training and test sets.</p>
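<p>The paper does not specify its fitting procedure; one numpy-only way to fit the law is a coarse grid search over the nonlinear parameters combined with linear least squares for the rest (note that $b$ and $c$ are only jointly identifiable through $c/b$, since $\log(bR + c) = \log(R + c/b) + \log b$):</p>

```python
import numpy as np

def fit_scaling_law(R, acc, b_grid, c_grid):
    """Fit Acc = a * log(b * R + c) + d.

    For each (b, c) on the grid, the model is linear in (a, d), so those
    are solved exactly by least squares; the best grid point wins.
    Returns the fitted (a, b, c, d).
    """
    best = None
    R = np.asarray(R, dtype=float)
    for b in b_grid:
        for c in c_grid:
            z = np.log(b * R + c)
            A = np.stack([z, np.ones_like(z)], axis=1)
            coef, *_ = np.linalg.lstsq(A, acc, rcond=None)
            err = float(np.sum((A @ coef - acc) ** 2))
            if best is None or err < best[0]:
                best = (err, coef[0], b, c, coef[1])
    return best[1:]  # (a, b, c, d)
```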
<h2 id="competitive-docking-with-faster-inference-but-limited-task-scope">Competitive Docking with Faster Inference, but Limited Task Scope</h2>
<p>PharMolixFM demonstrates that multi-modal generative models can achieve competitive all-atom molecular modeling with substantial inference speed advantages over AlphaFold3. The key findings are:</p>
<ol>
<li><strong>Diffusion outperforms flow matching and BFN</strong> for docking under standard sampling budgets. The stochastic nature of diffusion sampling appears beneficial compared to the deterministic ODE integration of flow matching.</li>
<li><strong>Oracle-ranking reveals untapped potential</strong>: the gap between self-ranking (83.4%) and oracle-ranking (98.1%) at 500 repeats indicates that confidence-based ranking is a bottleneck. Better ranking methods could close the gap with AlphaFold3.</li>
<li><strong>The three variants show similar performance for drug design</strong>, suggesting that model architecture and training data may matter more than the generative framework for generation tasks.</li>
<li><strong>Inference scaling laws hold</strong> for molecular generative models, paralleling findings in NLP.</li>
</ol>
<p>Limitations include that the framework is only evaluated on two tasks (docking and SBDD), and the paper does not address protein structure prediction, protein-protein interactions, or nucleic acid modeling, which are part of AlphaFold3&rsquo;s scope. The BFN variant underperforms the diffusion model, which the authors attribute to smaller noise scales at early sampling steps making training less challenging. The paper also does not compare against concurrent work on inference-time scaling for molecular models.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Training</td>
          <td>PDBBind, Binding MOAD, CrossDocked2020, PepBDB</td>
          <td>Not specified</td>
          <td>Filtered by PocketXMol criteria</td>
      </tr>
      <tr>
          <td>Docking eval</td>
          <td>PoseBusters benchmark</td>
          <td>428 complexes</td>
          <td>Holo docking with known protein</td>
      </tr>
      <tr>
          <td>SBDD eval</td>
          <td>CrossDocked test set</td>
          <td>100 pockets</td>
          <td>100 molecules per pocket</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li>Three generative variants: multi-modal diffusion (D3PM), flow matching, Bayesian flow networks</li>
<li>Task-specific noise via Fix mechanism (0, 0.7, or 1.0)</li>
<li>Training tasks selected with equal probability per sample</li>
<li>AdamW optimizer: weight decay 0.001, $\beta_1 = 0.99$, $\beta_2 = 0.999$</li>
<li>Linear warmup to learning rate 0.001 over 1000 steps</li>
<li>180K training steps with batch size 40</li>
</ul>
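<p>The optimization recipe above is simple enough to sketch. The following illustrative function reproduces the linear warmup; the paper specifies the warmup but not a post-warmup decay, so holding the rate constant afterward is an assumption:</p>

```python
def warmup_lr(step, peak_lr=1e-3, warmup_steps=1000):
    """Learning-rate value for the schedule described above.

    Linear warmup to peak_lr over warmup_steps; the paper does not
    specify a post-warmup decay, so this sketch holds the rate
    constant afterward (an assumption).
    """
    return peak_lr * min(1.0, (step + 1) / warmup_steps)

# Remaining reported settings: AdamW with betas=(0.99, 0.999),
# weight_decay=0.001, batch size 40, 180K training steps.
```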
<h3 id="models">Models</h3>
<ul>
<li>Dual-branch SE(3)-equivariant GNN (protein: 4-layer, molecule: 6-layer)</li>
<li>kNN graph construction for protein and protein-molecule interactions</li>
<li>Independent prediction heads for coordinates, atom types, bond types</li>
<li>Confidence heads for self-ranking during inference</li>
</ul>
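<p>A rough illustration of the kNN graph construction step; the edge semantics here are a minimal stand-in, and k is a placeholder rather than a value from the paper:</p>

```python
import numpy as np

def knn_edges(coords, k):
    """Build directed kNN edges from an (N, 3) coordinate array.

    Returns an (E, 2) array of (source, target) index pairs, with each
    atom connected to its k nearest neighbors (excluding itself).
    A minimal stand-in for the graph construction described above.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # Pairwise squared distances.
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # exclude self-edges
    # Indices of the k nearest neighbors per atom.
    nbrs = np.argsort(d2, axis=1)[:, :k]
    src = np.repeat(np.arange(n), k)
    return np.stack([src, nbrs.ravel()], axis=1)
```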
<h3 id="evaluation">Evaluation</h3>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>PharMolixFM-Diff</th>
          <th>AlphaFold3</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>RMSD &lt; 2Å self-ranking</td>
          <td>83.4% (50 rep)</td>
          <td>90.4%</td>
          <td>PoseBusters docking</td>
      </tr>
      <tr>
          <td>RMSD &lt; 2Å oracle-ranking</td>
          <td>98.1% (500 rep)</td>
          <td>-</td>
          <td>PoseBusters docking</td>
      </tr>
      <tr>
          <td>Inference time (per complex)</td>
          <td>~4.6s</td>
          <td>~249.0s</td>
          <td>Single A800 GPU</td>
      </tr>
      <tr>
          <td>Vina score (avg)</td>
          <td>-6.18</td>
          <td>-</td>
          <td>CrossDocked SBDD</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<ul>
<li>Training: 4x 80GB A800 GPUs</li>
<li>Inference benchmarked on single A800 GPU</li>
</ul>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/PharMolix/OpenBioMed">OpenBioMed (GitHub)</a></td>
          <td>Code</td>
          <td>MIT</td>
          <td>Official implementation</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Luo, Y., Wang, J., Fan, S., &amp; Nie, Z. (2025). PharMolixFM: All-Atom Foundation Models for Molecular Modeling and Generation. <em>arXiv preprint arXiv:2503.21788</em>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{luo2025pharmolixfm,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{PharMolixFM: All-Atom Foundation Models for Molecular Modeling and Generation}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Luo, Yizhen and Wang, Jiashuo and Fan, Siqi and Nie, Zaiqing}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{arXiv preprint arXiv:2503.21788}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>MAT: Graph-Augmented Transformer for Molecules (2020)</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/molecule-attention-transformer/</link><pubDate>Fri, 27 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/molecule-attention-transformer/</guid><description>MAT augments the Transformer self-attention mechanism with inter-atomic distances and molecular graph adjacency for molecular property prediction.</description><content:encoded><![CDATA[<h2 id="a-graph-augmented-transformer-for-molecular-property-prediction">A Graph-Augmented Transformer for Molecular Property Prediction</h2>
<p>This is a <strong>Method</strong> paper that proposes the Molecule Attention Transformer (MAT), a Transformer-based architecture adapted for molecular property prediction. The primary contribution is a modified self-attention mechanism that incorporates inter-atomic distances and molecular graph structure alongside the standard query-key attention. Combined with self-supervised pretraining on 2 million molecules from ZINC15, MAT achieves competitive performance across seven diverse molecular property prediction tasks while requiring minimal hyperparameter tuning.</p>
<h2 id="challenges-in-deep-learning-for-molecular-properties">Challenges in Deep Learning for Molecular Properties</h2>
<p>Predicting molecular properties is central to drug discovery and materials design, yet deep neural networks have struggled to consistently outperform shallow methods like random forests and SVMs on these tasks. Wu et al. (2018) demonstrated through the MoleculeNet benchmark that graph neural networks do not reliably beat classical models. Two recurring problems compound this:</p>
<ol>
<li><strong>Underfitting</strong>: Graph neural networks tend to underfit training data, with performance failing to scale with model complexity (Ishiguro et al., 2019).</li>
<li><strong>Hyperparameter sensitivity</strong>: Deep models for molecule property prediction require extensive hyperparameter search (often 500+ configurations) to achieve competitive results, making them impractical for many practitioners.</li>
</ol>
<p>Concurrent work explored using vanilla Transformers on SMILES string representations of molecules (Honda et al., 2019; Wang et al., 2019), but these approaches discard the explicit structural information encoded in molecular graphs and 3D conformations. The motivation for MAT is to combine the flexibility of the Transformer architecture with domain-specific inductive biases from molecular structure.</p>
<h2 id="molecule-self-attention-combining-attention-distance-and-graph-structure">Molecule Self-Attention: Combining Attention, Distance, and Graph Structure</h2>
<p>The core innovation is the Molecule Self-Attention layer, which replaces standard Transformer self-attention. In a standard Transformer, head $i$ computes:</p>
<p>$$
\mathcal{A}^{(i)} = \rho\left(\frac{\mathbf{Q}_{i} \mathbf{K}_{i}^{T}}{\sqrt{d_{k}}}\right) \mathbf{V}_{i}
$$</p>
<p>MAT augments this with two additional information sources. Let $\mathbf{A} \in \{0, 1\}^{N_{\text{atoms}} \times N_{\text{atoms}}}$ denote the molecular graph adjacency matrix and $\mathbf{D} \in \mathbb{R}^{N_{\text{atoms}} \times N_{\text{atoms}}}$ denote the inter-atomic distance matrix. The modified attention becomes:</p>
<p>$$
\mathcal{A}^{(i)} = \left(\lambda_{a} \rho\left(\frac{\mathbf{Q}_{i} \mathbf{K}_{i}^{T}}{\sqrt{d_{k}}}\right) + \lambda_{d} \, g(\mathbf{D}) + \lambda_{g} \, \mathbf{A}\right) \mathbf{V}_{i}
$$</p>
<p>where $\lambda_{a}$, $\lambda_{d}$, and $\lambda_{g}$ are scalar hyperparameters weighting each component, and $g$ is either a row-wise softmax or an element-wise exponential decay $g(d) = \exp(-d)$.</p>
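<p>A minimal single-head sketch of this mixing, using the exponential-decay variant of $g$; the learned projections and multi-head plumbing of the full model are omitted:</p>

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def molecule_self_attention(Q, K, V, D, A, lam_a, lam_d, lam_g):
    """Single-head Molecule Self-Attention following the equation above.

    Q, K, V: (N, d_k) projected atom features; D: (N, N) inter-atomic
    distances; A: (N, N) adjacency. Here g is the element-wise
    exponential decay variant, g(d) = exp(-d).
    """
    d_k = Q.shape[-1]
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # standard term, rho = softmax
    g_D = np.exp(-D)                        # distance term
    mix = lam_a * attn + lam_d * g_D + lam_g * A
    return mix @ V
```

With $\lambda_a = \lambda_d = 0$ and $\mathbf{A} = \mathbf{I}$ this reduces to the identity mixing, which makes the role of each weighted term easy to inspect.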
<p>Key architectural details:</p>
<ul>
<li><strong>Atom embedding</strong>: Each atom is represented as a 26-dimensional vector encoding atomic identity (one-hot over B, N, C, O, F, P, S, Cl, Br, I, dummy, other), number of heavy neighbors, number of hydrogens, formal charge, ring membership, and aromaticity.</li>
<li><strong>Dummy node</strong>: An artificial disconnected node (distance $10^{6}$ from all atoms) is added to each molecule, allowing the model to &ldquo;skip&rdquo; attention heads when no relevant pattern exists, similar to how BERT uses the separation token.</li>
<li><strong>3D conformers</strong>: Distance matrices are computed from RDKit-generated 3D conformers using the Universal Force Field (UFF).</li>
<li><strong>Pretraining</strong>: Node-level masked atom prediction on 2 million ZINC15 molecules (following Hu et al., 2019), where 15% of atom features are masked and the model predicts them.</li>
</ul>
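<p>One plausible layout of the 26-dimensional atom vector, consistent with the feature list above; the exact ordering and the one-hot ranges for neighbor and hydrogen counts (12 + 6 + 5 + 3 remaining slots) are assumptions, not taken verbatim from the paper:</p>

```python
import numpy as np

ATOM_TYPES = ["B", "N", "C", "O", "F", "P", "S", "Cl", "Br", "I",
              "Dummy", "Other"]  # 12-way identity one-hot

def atom_features(symbol, n_heavy, n_h, charge, in_ring, aromatic):
    """Sketch of the 26-dim atom embedding described above.

    Assumed layout: 12 (identity) + 6 (heavy neighbors, 0-5 one-hot)
    + 5 (hydrogens, 0-4 one-hot) + formal charge + ring + aromaticity.
    """
    v = np.zeros(26)
    idx = ATOM_TYPES.index(symbol) if symbol in ATOM_TYPES else ATOM_TYPES.index("Other")
    v[idx] = 1.0
    v[12 + min(n_heavy, 5)] = 1.0
    v[18 + min(n_h, 4)] = 1.0
    v[23] = float(charge)
    v[24] = float(in_ring)
    v[25] = float(aromatic)
    return v
```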
<h2 id="benchmark-evaluation-and-ablation-studies">Benchmark Evaluation and Ablation Studies</h2>
<h3 id="experimental-setup">Experimental setup</h3>
<p>MAT is evaluated on seven molecular property prediction datasets spanning regression and classification:</p>
<table>
  <thead>
      <tr>
          <th>Dataset</th>
          <th>Task</th>
          <th>Size</th>
          <th>Metric</th>
          <th>Split</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>FreeSolv</td>
          <td>Regression (hydration free energy)</td>
          <td>642</td>
          <td>RMSE</td>
          <td>Random</td>
      </tr>
      <tr>
          <td>ESOL</td>
          <td>Regression (log solubility)</td>
          <td>1,128</td>
          <td>RMSE</td>
          <td>Random</td>
      </tr>
      <tr>
          <td>BBBP</td>
          <td>Classification (BBB permeability)</td>
          <td>2,039</td>
          <td>ROC AUC</td>
          <td>Scaffold</td>
      </tr>
      <tr>
          <td>Estrogen-alpha</td>
          <td>Classification (receptor activity)</td>
          <td>2,398</td>
          <td>ROC AUC</td>
          <td>Scaffold</td>
      </tr>
      <tr>
          <td>Estrogen-beta</td>
          <td>Classification (receptor activity)</td>
          <td>1,961</td>
          <td>ROC AUC</td>
          <td>Scaffold</td>
      </tr>
      <tr>
          <td>MetStab-high</td>
          <td>Classification (metabolic stability)</td>
          <td>2,127</td>
          <td>ROC AUC</td>
          <td>Random</td>
      </tr>
      <tr>
          <td>MetStab-low</td>
          <td>Classification (metabolic stability)</td>
          <td>2,127</td>
          <td>ROC AUC</td>
          <td>Random</td>
      </tr>
  </tbody>
</table>
<p>Baselines include GCN, Weave, EAGCN, Random Forest (RF), and SVM. Each model receives the same hyperparameter search budget (150 or 500 evaluations). Results are averaged over 6 random train/validation/test splits.</p>
<h3 id="main-results">Main results</h3>
<p>MAT achieves the best average rank across all seven tasks:</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Avg. Rank (500 budget)</th>
          <th>Avg. Rank (150 budget)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>MAT</td>
          <td>2.42</td>
          <td>2.71</td>
      </tr>
      <tr>
          <td>RF</td>
          <td>3.14</td>
          <td>3.14</td>
      </tr>
      <tr>
          <td>SVM</td>
          <td>3.57</td>
          <td>3.28</td>
      </tr>
      <tr>
          <td>GCN</td>
          <td>3.57</td>
          <td>3.71</td>
      </tr>
      <tr>
          <td>Weave</td>
          <td>3.71</td>
          <td>3.57</td>
      </tr>
      <tr>
          <td>EAGCN</td>
          <td>4.14</td>
          <td>4.14</td>
      </tr>
  </tbody>
</table>
<p>With self-supervised pretraining, Pretrained MAT achieves an average rank of 1.57, outperforming both Pretrained EAGCN (4.0) and SMILES Transformer (4.29). Pretrained MAT requires tuning only the learning rate (7 values tested), compared to 500 hyperparameter combinations for the non-pretrained models.</p>
<h3 id="ablation-results">Ablation results</h3>
<p>Ablation studies on BBBP, ESOL, and FreeSolv reveal:</p>
<table>
  <thead>
      <tr>
          <th>Variant</th>
          <th>BBBP (AUC)</th>
          <th>ESOL (RMSE)</th>
          <th>FreeSolv (RMSE)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>MAT (full)</td>
          <td>.723</td>
          <td>.286</td>
          <td>.250</td>
      </tr>
      <tr>
          <td>- Graph</td>
          <td>.716</td>
          <td>.316</td>
          <td>.276</td>
      </tr>
      <tr>
          <td>- Distance</td>
          <td>.729</td>
          <td>.281</td>
          <td>.281</td>
      </tr>
      <tr>
          <td>- Attention</td>
          <td>.692</td>
          <td>.306</td>
          <td>.329</td>
      </tr>
      <tr>
          <td>- Dummy node</td>
          <td>.714</td>
          <td>.317</td>
          <td>.249</td>
      </tr>
      <tr>
          <td>+ Edge features</td>
          <td>.683</td>
          <td>.314</td>
          <td>.358</td>
      </tr>
  </tbody>
</table>
<p>Removing any single component degrades performance on at least one task, supporting the value of combining all three information sources. Adding edge features does not help, suggesting the adjacency and distance matrices already capture sufficient bond-level information.</p>
<h3 id="interpretability-analysis">Interpretability analysis</h3>
<p>Individual attention heads in the first layer learn chemically meaningful functions. Six heads were identified that focus on specific chemical patterns: 2-neighbored aromatic carbons, sulfur atoms, non-ring nitrogens, carbonyl oxygens, 3-neighbored aromatic atoms (substitution positions), and aromatic ring nitrogens. Statistical validation using Kruskal-Wallis tests confirmed that atoms matching these SMARTS patterns receive significantly higher attention weights ($p &lt; 0.001$ for all patterns).</p>
<h2 id="findings-limitations-and-future-directions">Findings, Limitations, and Future Directions</h2>
<p>MAT demonstrates that augmenting Transformer self-attention with molecular graph structure and 3D distance information produces a model that performs consistently well across diverse property prediction tasks. The key practical finding is that self-supervised pretraining dramatically reduces the hyperparameter tuning burden: Pretrained MAT matches or exceeds the performance of extensively tuned models while requiring only learning rate selection.</p>
<p>Several limitations are acknowledged:</p>
<ul>
<li><strong>Fingerprint-based models still win on some tasks</strong>: RF and SVM with extended-connectivity fingerprints outperform MAT on metabolic stability and Estrogen-beta tasks, suggesting that incorporating fingerprint representations could improve MAT further.</li>
<li><strong>Single conformer</strong>: Only one pre-computed 3D conformer is used per molecule. More sophisticated conformer sampling or ensemble strategies were not explored.</li>
<li><strong>Limited pretraining exploration</strong>: Only the masked atom prediction task from Hu et al. (2019) was used. The authors note that exploring additional pretraining objectives is a promising direction.</li>
<li><strong>Scalability</strong>: The pretrained model uses 1024-dimensional embeddings with 8 layers and 16 attention heads, chosen as the largest configuration that fits in GPU memory.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Pretraining</td>
          <td>ZINC15</td>
          <td>2M molecules</td>
          <td>Sampled from ZINC database</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>FreeSolv</td>
          <td>642</td>
          <td>Hydration free energy regression</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>ESOL</td>
          <td>1,128</td>
          <td>Log solubility regression</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>BBBP</td>
          <td>2,039</td>
          <td>Blood-brain barrier classification</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>Estrogen-alpha/beta</td>
          <td>2,398 / 1,961</td>
          <td>Receptor activity classification</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>MetStab-high/low</td>
          <td>2,127 each</td>
          <td>Metabolic stability classification</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li>Optimizer: Adam with Noam learning rate scheduler (warmup then inverse square root decay)</li>
<li>Pretraining: 8 epochs, learning rate 0.001, batch size 256, binary cross-entropy loss</li>
<li>Fine-tuning: 100 epochs, batch size 32, learning rate selected from {1e-3, 5e-4, 1e-4, 5e-5, 1e-5, 5e-6, 1e-6}</li>
<li>Distance kernel: exponential decay $g(d) = \exp(-d)$ for pretrained model</li>
<li>Lambda weights: $\lambda_{a} = \lambda_{d} = 0.33$ for pretrained model</li>
</ul>
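<p>The Noam scheduler named above follows the standard warmup-then-inverse-square-root form. The d_model, warmup, and factor values in this sketch are illustrative, not taken from the paper:</p>

```python
def noam_lr(step, d_model=1024, warmup=4000, factor=1.0):
    """Noam schedule: linear warmup, then inverse-square-root decay.

    Sketch of the scheduler named above; d_model, warmup, and factor
    are illustrative defaults, not values reported by the authors.
    """
    step = max(step, 1)
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)
```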
<h3 id="models">Models</h3>
<ul>
<li>Pretrained MAT: 1024-dim embeddings, 8 layers, 16 attention heads, 1 feed-forward layer per block</li>
<li>Dropout: 0.0, weight decay: 0.0 for pretrained model</li>
<li>Atom featurization: 26-dimensional feature vector (Table 1 in paper)</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li>Regression: RMSE (FreeSolv, ESOL)</li>
<li>Classification: ROC AUC (BBBP, Estrogen-alpha/beta, MetStab-high/low)</li>
<li>All experiments repeated 6 times with different train/validation/test splits</li>
<li>Scaffold split for BBBP and the Estrogen tasks; random split for the others</li>
</ul>
<h3 id="hardware">Hardware</h3>
<p>The paper does not specify exact hardware details. The pretrained model is described as &ldquo;the largest model that still fits the GPU memory.&rdquo;</p>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/gmum/MAT">gmum/MAT</a></td>
          <td>Code</td>
          <td>MIT</td>
          <td>Official implementation with pretrained weights</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Maziarka, Ł., Danel, T., Mucha, S., Rataj, K., Tabor, J., &amp; Jastrzębski, S. (2020). Molecule Attention Transformer. <em>arXiv preprint arXiv:2002.08264</em>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{maziarka2020molecule,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Molecule Attention Transformer}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Maziarka, {\L}ukasz and Danel, Tomasz and Mucha, S{\l}awomir and Rataj, Krzysztof and Tabor, Jacek and Jastrz{\k{e}}bski, Stanis{\l}aw}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{arXiv preprint arXiv:2002.08264}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2020}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>MOFFlow: Flow Matching for MOF Structure Prediction</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/mofflow/</link><pubDate>Sat, 20 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/mofflow/</guid><description>A Riemannian flow matching framework for generating Metal-Organic Framework structures by treating building blocks as rigid bodies.</description><content:encoded><![CDATA[<h2 id="methodological-contribution-mofflow-architecture">Methodological Contribution: MOFFlow Architecture</h2>
<p>This is a <strong>Methodological Paper</strong> ($\Psi_{\text{Method}}$).</p>
<p>It introduces <strong>MOFFlow</strong>, a generative architecture and training framework designed specifically for the structure prediction of Metal-Organic Frameworks (MOFs). The paper focuses on the algorithmic innovation of decomposing the problem into rigid-body assembly on a Riemannian manifold, validates this through comparison against existing baselines, and performs ablation studies to justify architectural choices. While it leverages the theory of flow matching, its primary contribution is the application-specific architecture and the handling of modular constraints.</p>
<h2 id="motivation-scaling-limits-of-atom-level-generation">Motivation: Scaling Limits of Atom-Level Generation</h2>
<p>The primary motivation is to overcome the scalability and accuracy limitations of existing methods for MOF structure prediction.</p>
<ul>
<li><strong>Computational Cost of DFT:</strong> Conventional approaches rely on <em>ab initio</em> calculations (DFT) combined with random search, which are computationally prohibitive for large, complex systems like MOFs.</li>
<li><strong>Failure of General CSP:</strong> Existing deep generative models for general Crystal Structure Prediction (CSP) operate on an atom-by-atom basis. They fail to scale to MOFs, which often contain hundreds or thousands of atoms per unit cell, and do not exploit the inherent modular nature (building blocks) of MOFs.</li>
<li><strong>Tunability:</strong> MOFs have applications in carbon capture and drug delivery due to their tunable porosity, making automated design tools valuable.</li>
</ul>
<h2 id="core-innovation-rigid-body-flow-matching-on-se3">Core Innovation: Rigid-Body Flow Matching on SE(3)</h2>
<p>MOFFlow introduces a <strong>hierarchical, rigid-body flow matching framework</strong> tailored for MOFs.</p>
<ul>
<li><strong>Rigid Body Decomposition:</strong> MOFFlow treats metal nodes and organic linkers as rigid bodies, reducing the search space from $3N$ (atoms) to $6M$ (roto-translation of $M$ blocks) compared to atom-based methods.</li>
<li><strong>Riemannian Flow Matching on $SE(3)$:</strong> It is the first end-to-end model to jointly generate block-level rotations ($SO(3)$), translations ($\mathbb{R}^3$), and lattice parameters using <a href="/notes/machine-learning/generative-models/flow-matching-for-generative-modeling/">Riemannian flow matching</a>.</li>
<li><strong>MOFAttention:</strong> A custom attention module designed to encode the geometric relationships between building blocks, lattice parameters, and rotational constraints.</li>
<li><strong>Constraint Handling:</strong> It incorporates domain knowledge by operating on a mean-free system for translation invariance and using canonicalized coordinates for rotation invariance.</li>
</ul>
<h2 id="experimental-setup-and-baselines">Experimental Setup and Baselines</h2>
<p>The authors evaluated MOFFlow on structure prediction accuracy, physical property preservation, and scalability.</p>
<ul>
<li><strong>Dataset:</strong> The <strong>Boyd et al. (2019)</strong> dataset consisting of 324,426 hypothetical MOF structures, decomposed into building blocks using the <strong>MOFid</strong> algorithm. Filtered to structures with &lt;200 blocks, yielding 308,829 structures (247,066 train / 30,883 val / 30,880 test). Structures contain up to approximately 2,400 atoms per unit cell.</li>
<li><strong>Baselines:</strong>
<ul>
<li><em>Optimization-based:</em> Random Search (RS) and Evolutionary Algorithm (EA) using CrySPY and CHGNet.</li>
<li><em>Deep Learning:</em> DiffCSP (deep generative model for general crystals).</li>
<li><em>Self-Assembly:</em> A heuristic algorithm used in MOFDiff (adapted for comparison).</li>
</ul>
</li>
<li><strong>Metrics:</strong>
<ul>
<li><strong>Match Rate (MR):</strong> Percentage of generated structures matching ground truth within tolerance.</li>
<li><strong>RMSE:</strong> Root mean squared displacement normalized by average free length per atom.</li>
<li><strong>Structural Properties:</strong> Volumetric/Gravimetric Surface Area (VSA/GSA), Pore Limiting Diameter (PLD), Void Fraction, etc., calculated via Zeo++.</li>
<li><strong>Scalability:</strong> Performance vs. number of atoms and building blocks.</li>
</ul>
</li>
</ul>
<h2 id="results-and-generative-performance">Results and Generative Performance</h2>
<p>MOFFlow outperformed all baselines in accuracy and efficiency, particularly for large structures.</p>
<ul>
<li><strong>Accuracy:</strong> With a single sample, MOFFlow achieved a <strong>31.69% match rate</strong> (stol=0.5) and <strong>87.46%</strong> (stol=1.0) on the full test set (30,880 structures). With 5 samples, these rose to <strong>44.75%</strong> (stol=0.5) and <strong>100.0%</strong> (stol=1.0). RS and EA (tested on 100 and 15 samples respectively due to computational cost, generating 20 candidates each) achieved 0.00% MR at both tolerance levels. DiffCSP reached 0.09% (stol=0.5) and 23.12% (stol=1.0) with 1 sample.</li>
<li><strong>Speed:</strong> Inference took <strong>1.94 seconds</strong> per structure, compared to 5.37s for DiffCSP, 332s for RS, and 1,959s for EA.</li>
<li><strong>Scalability:</strong> MOFFlow preserved high match rates across all system sizes, while DiffCSP&rsquo;s match rate dropped sharply beyond 200 atoms.</li>
<li><strong>Property Preservation:</strong> The distributions of physical properties (e.g., surface area, void fraction) for MOFFlow-generated structures closely matched the ground truth. DiffCSP frequently reduced volumetric surface area and void fraction to zero.</li>
<li><strong>Self-Assembly Comparison:</strong> In a controlled comparison where the self-assembly (SA) algorithm received MOFFlow&rsquo;s predicted translations and lattice, MOFFlow (MR=31.69%, RMSE=0.2820) outperformed SA (MR=30.04%, RMSE=0.3084), confirming the value of the learned rotational vector fields. In an extended scalability comparison, SA scaled better for structures with many building blocks, but MOFFlow achieved higher overall match rate (31.69% vs. 27.14%).</li>
<li><strong>Batch Implementation:</strong> A refactored Batch version achieves improved results: <strong>32.73% MR</strong> (stol=0.5), RMSE of 0.2743, inference in <strong>0.19s</strong> per structure (10x faster), and training in roughly 1/3 the GPU hours.</li>
</ul>
<h3 id="limitations">Limitations</h3>
<p>The paper identifies three key limitations:</p>
<ol>
<li><strong>Hypothetical-only evaluation:</strong> All experiments use the Boyd et al. hypothetical database. Evaluation on more challenging real-world datasets remains needed.</li>
<li><strong>Rigid-body assumption:</strong> The model assumes that local building block structures are known, which may be impractical for rare building blocks whose structural information is missing from existing libraries or is inaccurate.</li>
<li><strong>Periodic invariance:</strong> The model is not invariant to periodic transformations of the input. Explicitly modeling periodic invariance could further improve performance.</li>
</ol>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<ul>
<li><strong>Source:</strong> MOF dataset by Boyd et al. (2019).</li>
<li><strong>Preprocessing:</strong> Structures were decomposed using the metal-oxo decomposition algorithm from <strong>MOFid</strong>.</li>
<li><strong>Filtering:</strong> Structures with fewer than 200 building blocks were used, yielding 308,829 structures.</li>
<li><strong>Splits:</strong> Train/Validation/Test ratio of 8:1:1 (247,066 / 30,883 / 30,880).</li>
<li><strong>Availability:</strong> Pre-processed dataset is available on <a href="https://zenodo.org/records/15187230">Zenodo</a>.</li>
<li><strong>Representations:</strong>
<ul>
<li><em>Atom-level:</em> Tuple $(X, a, l)$ (coordinates, types, lattice).</li>
<li><em>Block-level:</em> Tuple $(\mathcal{B}, q, \tau, l)$ (blocks, rotations, translations, lattice).</li>
</ul>
</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Framework:</strong> Riemannian Flow Matching.</li>
<li><strong>Objective:</strong> Conditional Flow Matching (CFM) loss regressing to clean data $q_1, \tau_1, l_1$.
$$
\begin{aligned}
\mathcal{L}(\theta) = \mathbb{E}_{t, \mathcal{S}^{(1)}} \left[ \frac{1}{(1-t)^2} \left( \lambda_1 \left\| \log_{q_t}(\hat{q}_1) - \log_{q_t}(q_1) \right\|^2 + \dots \right) \right]
\end{aligned}
$$</li>
<li><strong>Priors:</strong>
<ul>
<li>Rotations ($q$): Uniform on $SO(3)$.</li>
<li>Translations ($\tau$): Standard normal on $\mathbb{R}^3$.</li>
<li>Lattice ($l$): Log-normal for lengths, Uniform(60, 120) for angles (Niggli reduced).</li>
</ul>
</li>
<li><strong>Inference:</strong> ODE solver with <strong>50 integration steps</strong>.</li>
<li><strong>Local Coordinates:</strong> Defined using PCA axes, corrected for symmetry to ensure consistency.</li>
</ul>
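<p>The priors listed above are straightforward to sample. A sketch follows; the log-normal parameters for the lattice lengths are placeholders, since the fitted values are not given here:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_priors(n_blocks):
    """Draw one prior sample per building block, per the list above."""
    # Uniform rotations on SO(3): normalized 4D Gaussians give
    # uniformly distributed unit quaternions.
    q = rng.normal(size=(n_blocks, 4))
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    # Translations: standard normal on R^3 (made mean-free downstream).
    tau = rng.normal(size=(n_blocks, 3))
    # Lattice: log-normal lengths (illustrative parameters) and
    # uniform angles on [60, 120] degrees (Niggli reduced).
    lengths = rng.lognormal(mean=0.0, sigma=1.0, size=3)
    angles = rng.uniform(60.0, 120.0, size=3)
    return q, tau, np.concatenate([lengths, angles])
```

At inference, samples like these are transported to data by integrating the learned vector field with an ODE solver over the 50 steps noted above.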
<h3 id="models">Models</h3>
<ul>
<li><strong>Architecture:</strong> Hierarchical structure with two key modules.
<ul>
<li><strong>Atom-level Update Layers:</strong> 4-layer EGNN-like structure to encode building block features $h_m$ from atomic graphs (cutoff 5Å).</li>
<li><strong>Block-level Update Layers:</strong> 6 layers that iteratively update $q, \tau, l$ using the <strong>MOFAttention</strong> module.</li>
</ul>
</li>
<li><strong>MOFAttention:</strong> Modified Invariant Point Attention (IPA) that incorporates lattice parameters as offsets to the attention matrix.</li>
<li><strong>Hyperparameters:</strong>
<ul>
<li>Node dimension: 256 (block-level), 64 (atom-level).</li>
<li>Attention heads: 24.</li>
<li>Loss coefficients: $\lambda_1=1.0$ (rot), $\lambda_2=2.0$ (trans), $\lambda_3=0.1$ (lattice).</li>
</ul>
</li>
<li><strong>Checkpoints:</strong> Pre-trained weights and models are openly provided on <a href="https://zenodo.org/records/15187230">Zenodo</a>.</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Metrics:</strong>
<ul>
<li><strong>Match Rate:</strong> Using <code>StructureMatcher</code> from <code>pymatgen</code>. Tolerances: <code>stol=0.5/1.0</code>, <code>ltol=0.3</code>, <code>angle_tol=10.0</code>.</li>
<li><strong>RMSE:</strong> Normalized by average free length per atom.</li>
</ul>
</li>
<li><strong>Tools:</strong> <strong>Zeo++</strong> for structural property calculations (Surface Area, Pore Diameter, etc.).</li>
</ul>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Metric</th>
          <th style="text-align: left">MOFFlow</th>
          <th style="text-align: left">DiffCSP</th>
          <th style="text-align: left">RS (20 cands)</th>
          <th style="text-align: left">EA (20 cands)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">MR (stol=0.5, k=1)</td>
          <td style="text-align: left"><strong>31.69%</strong></td>
          <td style="text-align: left">0.09%</td>
          <td style="text-align: left">0.00%</td>
          <td style="text-align: left">0.00%</td>
      </tr>
      <tr>
          <td style="text-align: left">MR (stol=1.0, k=1)</td>
          <td style="text-align: left"><strong>87.46%</strong></td>
          <td style="text-align: left">23.12%</td>
          <td style="text-align: left">0.00%</td>
          <td style="text-align: left">0.00%</td>
      </tr>
      <tr>
          <td style="text-align: left">MR (stol=0.5, k=5)</td>
          <td style="text-align: left"><strong>44.75%</strong></td>
          <td style="text-align: left">0.34%</td>
          <td style="text-align: left">-</td>
          <td style="text-align: left">-</td>
      </tr>
      <tr>
          <td style="text-align: left">MR (stol=1.0, k=5)</td>
          <td style="text-align: left"><strong>100.0%</strong></td>
          <td style="text-align: left">38.94%</td>
          <td style="text-align: left">-</td>
          <td style="text-align: left">-</td>
      </tr>
      <tr>
          <td style="text-align: left">RMSE (stol=0.5, k=1)</td>
          <td style="text-align: left"><strong>0.2820</strong></td>
          <td style="text-align: left">0.3961</td>
          <td style="text-align: left">-</td>
          <td style="text-align: left">-</td>
      </tr>
      <tr>
          <td style="text-align: left">Avg. time per structure</td>
          <td style="text-align: left"><strong>1.94s</strong></td>
          <td style="text-align: left">5.37s</td>
          <td style="text-align: left">332s</td>
          <td style="text-align: left">1,959s</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Training Hardware:</strong> 8 $\times$ NVIDIA RTX 3090 (24GB VRAM).</li>
<li><strong>Training Time:</strong>
<ul>
<li><em>TimestepBatch version (main paper):</em> ~5 days 15 hours.</li>
<li><em>Batch version:</em> ~1 day 17 hours (332.74 GPU hours). The authors also release this refactored implementation, which achieves comparable performance with faster convergence.</li>
</ul>
</li>
<li><strong>Batch Size:</strong> 160 (capped by $N^2$ where $N$ is the number of atoms, for memory management).</li>
</ul>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Artifact</th>
          <th style="text-align: left">Type</th>
          <th style="text-align: left">License</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><a href="https://github.com/nayoung10/MOFFlow">MOFFlow (GitHub)</a></td>
          <td style="text-align: left">Code</td>
          <td style="text-align: left">MIT</td>
          <td style="text-align: left">Official implementation built on DiffDock, EGNN, MOFDiff, and protein-frame-flow</td>
      </tr>
      <tr>
          <td style="text-align: left"><a href="https://zenodo.org/records/15187230">Pre-processed dataset and checkpoints (Zenodo)</a></td>
          <td style="text-align: left">Dataset / Model</td>
          <td style="text-align: left">Unknown</td>
          <td style="text-align: left">Includes pre-processed MOF structures and trained model weights</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Kim, N., Kim, S., Kim, M., Park, J., &amp; Ahn, S. (2025). MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks. <em>International Conference on Learning Representations (ICLR)</em>.</p>
<p><strong>Publication</strong>: ICLR 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{kimMOFFlowFlowMatching2025,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Kim, Nayoung and Kim, Seongsu and Kim, Minsu and Park, Jinkyoo and Ahn, Sungsoo}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{The Thirteenth International Conference on Learning Representations}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span>=<span style="color:#e6db74">{https://openreview.net/forum?id=dNT3abOsLo}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://openreview.net/forum?id=dNT3abOsLo">OpenReview Discussion</a></li>
<li><a href="https://github.com/nayoung10/MOFFlow">Official Code Repository</a></li>
</ul>
]]></content:encoded></item><item><title>Stillinger-Weber Potential for Silicon Simulation</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/stillinger-weber-1985/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/stillinger-weber-1985/</guid><description>The 1985 paper introducing the Stillinger-Weber potential, a 3-body interaction model for molecular dynamics of tetrahedral semiconductors.</description><content:encoded><![CDATA[<h2 id="core-methodological-contribution">Core Methodological Contribution</h2>
<p>This is a <strong>Method</strong> paper.</p>
<p>Its primary contribution is the formulation of the <strong>Stillinger-Weber potential</strong>, a non-additive potential energy function designed to model tetrahedral semiconductors. The paper also uses molecular dynamics simulation to explore physical properties of silicon in both crystalline and liquid phases, but the methodological contribution (the potential architecture) is what enabled subsequent research on covalent materials.</p>
<h2 id="the-failure-of-pair-potentials-in-silicon">The Failure of Pair Potentials in Silicon</h2>
<p>The authors aimed to simulate the melting and liquid properties of tetrahedral semiconductors (Silicon and Germanium).</p>
<ul>
<li><strong>The Problem:</strong> Standard pair potentials (like Lennard-Jones) favor close-packed structures (12 nearest neighbors) and cannot stabilize the open diamond structure (4 nearest neighbors) of Silicon.</li>
<li><strong>The Gap:</strong> Earlier classical potentials lacked the flexibility to describe the profound structural change where Silicon contracts upon melting (coordination number increases from 4 to &gt;6) while becoming metallic.</li>
<li><strong>The Goal:</strong> To construct a potential that spans the entire configuration space, describing both the rigid crystal and the diffusive liquid, without requiring quantum mechanical calculations.</li>
</ul>
<h2 id="the-three-body-interaction-novelty">The Three-Body Interaction Novelty</h2>
<p>The core novelty is the introduction of a stabilizing <strong>three-body interaction term</strong> ($v_3$) to the potential energy function.</p>
<ul>
<li><strong>3-Body Term:</strong> Explicitly penalizes deviations from the ideal tetrahedral angle ($\cos \theta_t = -1/3$).</li>
<li><strong>Unified Model:</strong> This potential handles bond breaking and reforming, allowing for the simulation of melting and liquid diffusion. Previous &ldquo;Keating&rdquo;-type potentials modeled only small elastic deformations.</li>
<li><strong>Mapping Technique:</strong> The application of &ldquo;steepest-descent mapping&rdquo; to quench dynamical configurations into their underlying &ldquo;inherent structures&rdquo; (local minima), revealing the fundamental topology of the liquid energy landscape.</li>
</ul>
<h2 id="molecular-dynamics-validation">Molecular Dynamics Validation</h2>
<p>The authors performed Molecular Dynamics (MD) simulations using the proposed potential.</p>
<ul>
<li><strong>System:</strong> 216 Silicon atoms in a cubic cell with periodic boundary conditions.</li>
<li><strong>State Points:</strong> Fixed density $\rho = 2.53 \text{ g/cm}^3$ (matching experimental liquid density at melting).</li>
<li><strong>Process:</strong>
<ol>
<li>Start with diamond crystal at low temperature.</li>
<li>Systematically heat to induce spontaneous nucleation and melting.</li>
<li>Equilibrate the liquid.</li>
<li>Periodically map configurations to potential minima (inherent structures) using steepest descent.</li>
</ol>
</li>
</ul>
<h2 id="phase-topology-and-inverse-lindemann-criterion">Phase Topology and Inverse Lindemann Criterion</h2>
<ul>
<li><strong>Validation:</strong> The potential successfully stabilizes the diamond structure as the global minimum at zero pressure.</li>
<li><strong>Liquid Structure:</strong> The simulated liquid pair-correlation function $g(r)$ and structure factor $S(k)$ qualitatively match experimental diffraction data, including the characteristic shoulder on the structure factor peak.</li>
<li><strong>Inherent Structure:</strong> The liquid possesses a temperature-independent inherent structure (amorphous network) hidden beneath thermal vibrations.</li>
<li><strong>Melting/Freezing Criteria:</strong> The study proposes an &ldquo;Inverse Lindemann Criterion&rdquo;: while crystals melt when vibration amplitude exceeds ~0.19 lattice spacings, liquids freeze when atom displacements from their inherent minima drop below ~0.30 neighbor spacings.</li>
</ul>
<h2 id="limitations-and-energy-scale-problem">Limitations and Energy Scale Problem</h2>
<p>The authors acknowledge a quantitative energy-scale discrepancy. To match the observed melting temperature of Si (1410 °C), $\epsilon$ would need to be approximately 42 kcal/mol, considerably less than the 50 kcal/mol required to reproduce the correct cohesive energy of the crystal. The authors suggest this could be resolved either by further optimization of $v_2$ and $v_3$, or by adding position-independent single-particle terms $v_1 \approx -16$ kcal/mol arising from the electronic structure. Adding $v_1$ terms only affects the temperature scale and has no influence on local structure at a given reduced temperature.</p>
<p>The simulated liquid coordination number (8.07) is also higher than the experimentally reported value of approximately 6.4, though the authors note that the experimental definition of &ldquo;nearest neighbors&rdquo; was not precisely stated.</p>
<h2 id="bonding-statistics-in-inherent-structures">Bonding Statistics in Inherent Structures</h2>
<p>Analysis of potential-energy minima (inherent structures) using a bond cutoff of $r/\sigma = 1.40$ reveals the coordination distribution in the liquid:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Coordination Number</th>
          <th style="text-align: left">Fraction of Atoms</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">4</td>
          <td style="text-align: left">0.201</td>
      </tr>
      <tr>
          <td style="text-align: left">5</td>
          <td style="text-align: left">0.568</td>
      </tr>
      <tr>
          <td style="text-align: left">6</td>
          <td style="text-align: left">0.205</td>
      </tr>
      <tr>
          <td style="text-align: left">7</td>
          <td style="text-align: left">0.024</td>
      </tr>
  </tbody>
</table>
<p>Five-coordinate atoms dominate the liquid&rsquo;s inherent structure, with four- and six-coordinate atoms each accounting for about 20% of the population. The three-body interactions prevent any occurrence of coordination numbers near 12 that would indicate local close packing.</p>
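<p>The counting procedure behind such a distribution is straightforward. The sketch below is our own illustrative helper (not the authors&rsquo; code): it histograms coordination numbers from a configuration using the distance cutoff $r/\sigma = 1.40$ under minimum-image periodic boundaries in a cubic box.</p>

```python
import math
from collections import Counter

def coordination_counts(positions, box, cutoff=1.40):
    """Histogram of coordination numbers using a distance cutoff
    (r/sigma = 1.40 in the paper) in a cubic periodic box."""
    n = len(positions)
    coord = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            d2 = 0.0
            for k in range(3):
                d = positions[i][k] - positions[j][k]
                d -= box * round(d / box)  # minimum-image convention
                d2 += d * d
            if d2 < cutoff * cutoff:
                coord[i] += 1
                coord[j] += 1
    return Counter(coord)

# Two atoms 0.3*sigma apart across the periodic boundary: one bond each.
print(coordination_counts([(0.0, 0.0, 0.0), (9.7, 0.0, 0.0)], box=10.0))  # Counter({1: 2})
```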
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Integration:</strong> Equations of motion integrated using a <strong>fifth-order Gear algorithm</strong>.</li>
<li><strong>Time Step:</strong> $\Delta t = 5 \times 10^{-3} \tau$ (approx $3.83 \times 10^{-16}$ s), where $\tau = \sigma(m/\epsilon)^{1/2} = 7.6634 \times 10^{-14}$ s.</li>
<li><strong>Minimization:</strong> Steepest-descent mapping utilized <strong>Newton&rsquo;s method</strong> to find limiting solutions ($\nabla \Phi = 0$).</li>
</ul>
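<p>As a consistency check, the quoted time unit follows directly from $\sigma$, $\epsilon$, and the silicon mass. The sketch below is our own; assuming the $^{28}$Si isotope mass (27.9769 u, an assumption on our part since the paper leaves the mass choice implicit) reproduces the quoted $\tau$ to four digits.</p>

```python
import math

sigma = 0.20951e-9         # m   (0.20951 nm)
epsilon = 3.4723e-19       # J   (3.4723e-12 erg, i.e. 50 kcal/mol per atom)
m = 27.9769 * 1.66054e-27  # kg  (28-Si isotope mass -- our assumption)

tau = sigma * math.sqrt(m / epsilon)  # reduced time unit
dt = 5e-3 * tau                       # MD time step
print(tau, dt)  # ~7.6634e-14 s, ~3.83e-16 s
```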
<h3 id="models">Models</h3>
<p>To reproduce this work, one must implement the potential $\Phi = \sum v_2 + \sum v_3$ with the exact functional forms and parameters provided.</p>















<figure class="post-figure center ">
    <img src="/img/notes/chemistry/stillinger-weber-potential.webp"
         alt="Stillinger-Weber potential visualization"
         title="Stillinger-Weber potential visualization"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Left: Two-body radial potential $v_2(r)$ showing the characteristic well at $r_{min} \approx 1.12\sigma$. Right: Three-body angular penalty $h(r_{min}, r_{min}, \theta)$ demonstrating the minimum at the tetrahedral angle (109.5°), which enforces the diamond crystal structure.</figcaption>
    
</figure>

<h4 id="reduced-units">Reduced Units</h4>
<ul>
<li>$\sigma = 0.20951 \text{ nm}$</li>
<li>$\epsilon = 50 \text{ kcal/mol} = 3.4723 \times 10^{-12} \text{ erg}$</li>
</ul>
<h4 id="two-body-term-v_2">Two-Body Term ($v_2$)</h4>
<p>$$
v_2(r_{ij}) = \epsilon A (B r_{ij}^{-p} - r_{ij}^{-q}) \exp[(r_{ij} - a)^{-1}] \quad \text{for } r_{ij} &lt; a
$$</p>
<p><em>(Vanishes for $r \geq a$)</em></p>
<h4 id="three-body-term-v_3">Three-Body Term ($v_3$)</h4>
<p>$$
v_3(r_i, r_j, r_k) = \epsilon [h(r_{ij}, r_{ik}, \theta_{jik}) + h(r_{ji}, r_{jk}, \theta_{ijk}) + h(r_{ki}, r_{kj}, \theta_{ikj})]
$$</p>
<p>where:</p>
<p>$$
h(r_{ij}, r_{ik}, \theta_{jik}) = \lambda \exp[\gamma(r_{ij}-a)^{-1} + \gamma(r_{ik}-a)^{-1}] (\cos\theta_{jik} + \frac{1}{3})^2
$$</p>
<p><em>(Each $h$ term vanishes if either of its distances is $\geq a$)</em></p>
<h4 id="parameters">Parameters</h4>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Parameter</th>
          <th style="text-align: left">Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">$A$</td>
          <td style="text-align: left">$7.049556277$</td>
      </tr>
      <tr>
          <td style="text-align: left">$B$</td>
          <td style="text-align: left">$0.6022245584$</td>
      </tr>
      <tr>
          <td style="text-align: left">$p$</td>
          <td style="text-align: left">$4$</td>
      </tr>
      <tr>
          <td style="text-align: left">$q$</td>
          <td style="text-align: left">$0$</td>
      </tr>
      <tr>
          <td style="text-align: left">$a$</td>
          <td style="text-align: left">$1.80$</td>
      </tr>
      <tr>
          <td style="text-align: left">$\lambda$</td>
          <td style="text-align: left">$21.0$</td>
      </tr>
      <tr>
          <td style="text-align: left">$\gamma$</td>
          <td style="text-align: left">$1.20$</td>
      </tr>
  </tbody>
</table>
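<p>A direct transcription of $v_2$ and $h$ with these parameters (reduced units, $\epsilon = \sigma = 1$) serves as a sanity check: the two-body well bottoms out at $-\epsilon$ near $r \approx 1.12\sigma$, and the angular penalty vanishes exactly at the tetrahedral angle ($\cos\theta = -1/3$).</p>

```python
import math

# Stillinger-Weber potential in reduced units (epsilon = sigma = 1),
# transcribed from the functional forms and parameter table above.
A, B, p, q = 7.049556277, 0.6022245584, 4, 0
a, lam, gamma = 1.80, 21.0, 1.20

def v2(r):
    """Two-body term; cut off smoothly (all derivatives vanish) at r = a."""
    if r >= a:
        return 0.0
    return A * (B * r**-p - r**-q) * math.exp(1.0 / (r - a))

def h(r_ij, r_ik, cos_theta):
    """Angular penalty entering v3; zero at the tetrahedral angle."""
    if r_ij >= a or r_ik >= a:
        return 0.0
    radial = math.exp(gamma / (r_ij - a) + gamma / (r_ik - a))
    return lam * radial * (cos_theta + 1.0 / 3.0) ** 2

print(v2(1.12))                 # well depth: ~ -1.0 (i.e. -epsilon)
print(h(1.0, 1.0, -1.0 / 3.0))  # 0.0 at cos(theta) = -1/3
```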
<h3 id="evaluation">Evaluation</h3>
<p>The paper evaluates the model against experimental diffraction data.</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Metric</th>
          <th style="text-align: left">Simulated Value</th>
          <th style="text-align: left">Experimental Value</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Melting Point ($T_m^*$)</strong></td>
          <td style="text-align: left">$\approx 0.080$</td>
          <td style="text-align: left">N/A</td>
          <td style="text-align: left">Reduced units. Requires $\epsilon \approx 42$ kcal/mol to match the real $T_m = 1410$ °C, vs 50 kcal/mol for the correct cohesive energy.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Coordination (Liquid)</strong></td>
          <td style="text-align: left">$8.07$</td>
          <td style="text-align: left">$\approx 6.4$</td>
          <td style="text-align: left">Evaluated at first $g(r)$ minimum ($r/\sigma = 1.625$). Simulated value is higher than experiment.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ First Peak</strong></td>
          <td style="text-align: left">$2.53$ $\AA^{-1}$</td>
          <td style="text-align: left">$2.80$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Shoulder</strong></td>
          <td style="text-align: left">$3.25$ $\AA^{-1}$</td>
          <td style="text-align: left">$3.25$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I. Exact match with experiment.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Second Peak</strong></td>
          <td style="text-align: left">$5.35$ $\AA^{-1}$</td>
          <td style="text-align: left">$5.75$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Third Peak</strong></td>
          <td style="text-align: left">$8.16$ $\AA^{-1}$</td>
          <td style="text-align: left">$8.50$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Fourth Peak</strong></td>
          <td style="text-align: left">$10.60$ $\AA^{-1}$</td>
          <td style="text-align: left">$11.20$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Entropy of Melting ($\Delta S / N k_B$)</strong></td>
          <td style="text-align: left">$\approx 3.7$</td>
          <td style="text-align: left">$3.25$</td>
          <td style="text-align: left">Simulated at constant volume; experimental at constant pressure (1 atm).</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Stillinger, F. H., &amp; Weber, T. A. (1985). Computer simulation of local order in condensed phases of silicon. <em>Physical Review B</em>, 31(8), 5262-5271. <a href="https://doi.org/10.1103/PhysRevB.31.5262">https://doi.org/10.1103/PhysRevB.31.5262</a></p>
<p><strong>Publication</strong>: Physical Review B, 1985</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{stillingerComputerSimulationLocal1985,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Computer Simulation of Local Order in Condensed Phases of Silicon}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Stillinger, Frank H. and Weber, Thomas A.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1985</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = apr,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Physical Review B}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{31}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{8}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{5262--5271}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{American Physical Society}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1103/PhysRevB.31.5262}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Second-Order Langevin Equation for Field Simulations</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/second-order-langevin-1987/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/second-order-langevin-1987/</guid><description>Hyperbolic Algorithm adds second-order derivatives to Langevin dynamics, reducing systematic errors to O(ε²) for lattice field simulations.</description><content:encoded><![CDATA[<h2 id="contribution-and-paper-type">Contribution and Paper Type</h2>
<p>This is a <strong>Methodological Paper</strong> ($\Psi_{\text{Method}}$). It proposes a novel stochastic algorithm, the Hyperbolic Algorithm (HA), and validates its superior efficiency against the existing Langevin Algorithm (LA) through formal error analysis and numerical simulation. It contains a significant theoretical derivation (Liouville dynamics) that serves primarily to justify the algorithmic performance claims.</p>
<h2 id="motivation-and-gaps-in-prior-work">Motivation and Gaps in Prior Work</h2>
<p>The standard Langevin Algorithm (LA) for numerical simulation of Euclidean field theories suffers from efficiency bottlenecks. The simplest Euler-discretization of the LA introduces systematic errors of $O(\epsilon)$ (where $\epsilon$ is the step size). To maintain accuracy, $\epsilon$ must be kept small, which increases the sweep-sweep correlation time (autocorrelation time), making simulations computationally expensive.</p>
<h2 id="core-novelty-second-order-dynamics">Core Novelty: Second-Order Dynamics</h2>
<p>The core contribution is the introduction of a <strong>second-order derivative in fictitious time</strong> to the stochastic equation. This converts the parabolic Langevin equation into a hyperbolic equation:</p>
<p>$$
\begin{aligned}
\frac{\partial^{2}\phi}{\partial t^{2}}+\gamma\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta
\end{aligned}
$$</p>
<h3 id="equation-comparison">Equation Comparison</h3>
<p>The key difference from the standard (first-order) Langevin equation:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Equation Type</th>
          <th style="text-align: left">Formula</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Hyperbolic (Second Order)</strong></td>
          <td style="text-align: left">$$\frac{\partial^{2}\phi}{\partial t^{2}}+\gamma\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta$$</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Langevin (First Order)</strong></td>
          <td style="text-align: left">$$\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta$$</td>
      </tr>
  </tbody>
</table>
<p>The standard Langevin equation corresponds to the overdamped limit, in which the acceleration term is absent. Physically, the Hyperbolic equation can be viewed as microcanonical equations of motion augmented with friction and noise terms.</p>
<h3 id="key-innovations">Key Innovations</h3>
<ul>
<li><strong>Higher Order Accuracy</strong>: The simplest discretization of this equation leads to systematic errors of only $O(\epsilon^2)$ compared to $O(\epsilon)$ for LA.</li>
<li><strong>Tunable Damping</strong>: The addition of the damping parameter $\gamma$ allows tuning to minimize autocorrelation tails.</li>
<li><strong>Uniform Evolution</strong>: The method evolves structures of different wavelengths more uniformly than LA due to the specific dissipation structure.</li>
</ul>
<h2 id="methodology-and-experiments">Methodology and Experiments</h2>
<p>The author validated the method using the <strong>XY Model</strong> on 2D lattices.</p>
<ul>
<li><strong>System</strong>: Euclidean action $S = -\sum_{x,\mu} \cos(\theta_{x+\mu} - \theta_x)$.</li>
<li><strong>Setup</strong>:
<ul>
<li>Lattice sizes: $15^2$ (helical boundary conditions) and $30^2$.</li>
<li>$\beta$ range: 0.9 to 1.2 (crossing the critical point $\approx 1.0$).</li>
<li>Run length: &gt;100,000 updates in equilibrium.</li>
</ul>
</li>
<li><strong>Metrics</strong>:
<ul>
<li><strong>Autocorrelation time ($\tau$)</strong>: Defined as the number of updates for the time-correlation function to drop to 10% of its initial value.</li>
<li><strong>Systematic Error</strong>: Measured via deviation of average action from Monte Carlo values.</li>
</ul>
</li>
</ul>
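<p>The paper&rsquo;s 10%-threshold definition of the autocorrelation time is simple to state in code. The helper below is our own sketch, not code from the paper:</p>

```python
def autocorr_time(series, threshold=0.1):
    """Sweep-sweep autocorrelation time as defined in the paper: the number
    of updates for the time-correlation function to drop to 10% of its
    initial (lag-0) value."""
    n = len(series)
    mean = sum(series) / n
    dev = [s - mean for s in series]
    c0 = sum(d * d for d in dev) / n
    for lag in range(1, n):
        c = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n - lag)
        if c < threshold * c0:
            return lag
    return n

print(autocorr_time([1.0, -1.0] * 50))  # 1: an alternating series decorrelates immediately
```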
<h2 id="results-and-conclusions">Results and Conclusions</h2>
<ul>
<li><strong>Efficiency</strong>: The Hyperbolic Algorithm (HA) is far more efficient. For equal systematic errors, sweep-sweep correlation times are significantly lower than LA.</li>
<li><strong>Error Scaling</strong>: Numerical results confirmed that HA step size $\epsilon_H = 0.1$ yields systematic errors comparable to LA step size $\epsilon_L \approx 0.008$ ($O(\epsilon^2)$ vs $O(\epsilon)$ scaling).</li>
<li><strong>Speedup</strong>: In the disordered phase, HA is roughly $\epsilon_H / \epsilon_L$ times faster (approximately a factor of 12.5 for $\epsilon_H = 0.1$, $\epsilon_L = 0.008$). In the ordered phase, efficiency gains increase with distance scale, reaching factors of 20 or more for long-range correlations.</li>
<li><strong>Optimal Damping</strong>: For the XY model, the optimal damping parameter was found to be $\gamma \approx 0.4$.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<p><strong>1. The Hyperbolic Algorithm (HA)</strong></p>
<p>The discretized update equations for scalar fields are:</p>
<p>$$
\begin{aligned}
\pi_{t+\epsilon} - \pi_{t} &amp;= -\epsilon\gamma\pi_{t} - \epsilon\frac{\partial S}{\partial\phi_{t}} + \sqrt{2\epsilon\gamma/\beta}\xi_{t} \\
\phi_{t+\epsilon} - \phi_{t} &amp;= \epsilon\pi_{t+\epsilon}
\end{aligned}
$$</p>
<ul>
<li><strong>Variables</strong>: $\phi$ is the field, $\pi$ is the conjugate momentum ($\dot{\phi}$).</li>
<li><strong>Parameters</strong>: $\epsilon$ (step size), $\gamma$ (damping constant).</li>
<li><strong>Noise</strong>: $\xi$ is Gaussian noise with $\langle\xi_x \xi_y\rangle = \delta_{x,y}$.</li>
<li><strong>Storage</strong>: Requires storing both $\phi$ and $\pi$ vectors.</li>
</ul>
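<p>The update equations above translate almost line-for-line into code. The sketch below is our own illustrative implementation (with the paper&rsquo;s $\epsilon_H = 0.1$ and $\gamma = 0.4$ as defaults), applied here to a 1D XY chain rather than the paper&rsquo;s 2D lattice:</p>

```python
import math
import random

def ha_step(theta, pi, grad_S, eps=0.1, gamma=0.4, beta=1.0):
    """One Hyperbolic Algorithm sweep: kick the momenta with friction, force,
    and Gaussian noise, then move the fields using the NEW momenta."""
    g = grad_S(theta)
    amp = math.sqrt(2.0 * eps * gamma / beta)
    pi = [p - eps * gamma * p - eps * gi + amp * random.gauss(0.0, 1.0)
          for p, gi in zip(pi, g)]
    theta = [t + eps * p for t, p in zip(theta, pi)]
    return theta, pi

# 1D XY chain with periodic boundaries: S = -sum_i cos(theta[i+1] - theta[i]).
def grad_S(theta):
    n = len(theta)
    return [math.sin(theta[i] - theta[(i + 1) % n])
            + math.sin(theta[i] - theta[(i - 1) % n]) for i in range(n)]

theta, pi = [0.0] * 16, [0.0] * 16
for _ in range(100):
    theta, pi = ha_step(theta, pi, grad_S)
```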
<p><strong>2. Non-Abelian Generalization</strong></p>
<p>For Lie group elements $U$ with generators $T^a$:</p>
<p>$$
\begin{aligned}
\pi_{t+\epsilon}^a - \pi_{t}^a &amp;= -\epsilon\gamma\pi_{t}^a - \epsilon\delta^a S[U_t] + \sqrt{2\epsilon\gamma/\beta}\xi_{t}^a \\
U_{t+\epsilon} &amp;= e^{i\epsilon\pi_{t+\epsilon}^a T^a} U_t
\end{aligned}
$$</p>
<h3 id="theoretical-proof-of-oepsilon2-accuracy">Theoretical Proof of $O(\epsilon^2)$ Accuracy</h3>
<p>The derivation relies on the generalized Liouville equation for the probability distribution $P[\phi, \pi; t]$.</p>
<ol>
<li><strong>Transition Probability</strong>: The transition $W$ for one iteration is defined.</li>
<li><strong>Effective Liouville Operator</strong>: The evolution is written as $P(t+\epsilon) = \exp(\epsilon L_{\text{eff}}) P(t)$.</li>
<li><strong>Baker-Hausdorff Expansion</strong>: Using normal ordering of operators, the equilibrium distribution $P_{\text{eq}}$ is derived through $O(\epsilon^2)$:</li>
</ol>
<p>$$
\begin{aligned}
P_{\text{eq}} &amp;= \exp\left\lbrace-\frac{1}{2}\beta_{1}\sum_{x}\pi_{x}^{2} - \beta S[\phi] + \frac{1}{2}\epsilon\beta\sum_{x}\pi_{x}S_{x} + \epsilon^{2}G + O(\epsilon^3)\right\rbrace
\end{aligned}
$$</p>
<p>where $\beta_1 = \beta\left(1 - \frac{1}{2}\epsilon\gamma\right)$.</p>
<ol start="4">
<li><strong>Effective Action</strong>: Integrating out $\pi$ yields the effective action for $\phi$:</li>
</ol>
<p>$$
\begin{aligned}
S_{\text{eff}}[\phi] &amp;= S[\phi] - \frac{1}{8}\epsilon^2 \sum_x S_x^2 + \dots
\end{aligned}
$$</p>
<p>The absence of $O(\epsilon)$ terms proves the higher-order accuracy.</p>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Model</strong>: XY Model (2D)</li>
<li><strong>Hamiltonian</strong>: $H = \frac{1}{2}\sum \pi^2 + S[\phi]$ where $S = -\sum \cos(\Delta \theta)$.</li>
<li><strong>Observables</strong>:
<ul>
<li>$\Gamma_n = \cos(\theta_{m+n} - \theta_m)$ (averaged over lattice $m$).</li>
</ul>
</li>
<li><strong>Comparisons</strong>:
<ul>
<li><strong>LA Step</strong>: $\epsilon_L \approx 0.005 - 0.02$.</li>
<li><strong>HA Step</strong>: $\epsilon_H \approx 0.1 - 0.2$.</li>
<li><strong>Equivalence</strong>: $\epsilon_H = 0.1$ matches error of $\epsilon_L \approx 0.008$.</li>
</ul>
</li>
</ul>
<hr>
<h2 id="terminology-note">Terminology Note</h2>
<p>The naming conventions in this paper differ from those commonly used in molecular dynamics (MD). The following table provides a cross-field mapping:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Concept</th>
          <th style="text-align: left"><strong>Field Theory (This Paper)</strong></th>
          <th style="text-align: left"><strong>Molecular Dynamics</strong></th>
          <th style="text-align: left"><strong>Mathematics</strong></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Equation 1</strong></td>
          <td style="text-align: left">&ldquo;Langevin Equation&rdquo;</td>
          <td style="text-align: left">Brownian Dynamics (BD)</td>
          <td style="text-align: left">Overdamped Langevin</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Equation 2</strong></td>
          <td style="text-align: left">&ldquo;Hyperbolic Equation&rdquo;</td>
          <td style="text-align: left">Langevin Dynamics (LD)</td>
          <td style="text-align: left">Underdamped Langevin</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Integrator 1</strong></td>
          <td style="text-align: left">Euler Discretization</td>
          <td style="text-align: left">Euler Integrator</td>
          <td style="text-align: left">Euler-Maruyama</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Integrator 2</strong></td>
          <td style="text-align: left">Hyperbolic Algorithm (HA)</td>
          <td style="text-align: left">Velocity Verlet / Leapfrog</td>
          <td style="text-align: left">Quasi-Symplectic Splitting</td>
      </tr>
  </tbody>
</table>
<p><strong>Key insight</strong>: The paper&rsquo;s &ldquo;Hyperbolic Algorithm&rdquo; is mathematically equivalent to Langevin Dynamics with a Leapfrog/Verlet integrator, commonly used in MD. The baseline &ldquo;Langevin Algorithm&rdquo; corresponds to Brownian Dynamics. The term &ldquo;Langevin equation&rdquo; is overloaded: field theorists often use it for overdamped dynamics (no inertia), while chemists assume it includes momentum ($F=ma$).</p>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Horowitz, A. M. (1987). The Second Order Langevin Equation and Numerical Simulations. <em>Nuclear Physics B</em>, 280, 510-522. <a href="https://doi.org/10.1016/0550-3213(87)90159-3">https://doi.org/10.1016/0550-3213(87)90159-3</a></p>
<p><strong>Publication</strong>: Nuclear Physics B 1987</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{horowitzSecondOrderLangevin1987,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The Second Order {{Langevin}} Equation and Numerical Simulations}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Horowitz, Alan M.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1987</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = jan,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Nuclear Physics B}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{280}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{510--522}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{05503213}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1016/0550-3213(87)90159-3}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Oscillatory CO Oxidation on Pt(110): Temporal Modeling</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/oscillatory-co-oxidation-pt110-1992/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/oscillatory-co-oxidation-pt110-1992/</guid><description>A kinetic model using coupled ODEs to explain temporal self-organization and mixed-mode oscillations in catalytic CO oxidation on Pt(110).</description><content:encoded><![CDATA[<p><strong>Related Work</strong>: This builds on <a href="/notes/chemistry/molecular-simulation/kinetic-oscillations-pt100-1985/">Kinetic Oscillations on Pt(100)</a>, which established that surface phase transitions drive oscillatory catalysis. The Pt(110) system exhibits richer dynamics including mixed-mode oscillations and chaos.</p>
<h2 id="method-presentation-modeling-temporal-self-organization">Method Presentation: Modeling Temporal Self-Organization</h2>
<p>This is primarily a <strong>Method</strong> paper, supported by <strong>Theory</strong>.</p>
<ul>
<li><strong>Method</strong>: The authors construct a specific computational architecture, a set of coupled Ordinary Differential Equations (ODEs), to simulate the catalytic oxidation of CO. They systematically &ldquo;ablate&rdquo; the model, starting with 2 variables (bistability only), adding a 3rd (simple oscillations), and finally a 4th (mixed-mode oscillations) to demonstrate the necessity of each physical component.</li>
<li><strong>Theory</strong>: The model is analyzed using formal bifurcation theory (continuation methods) to map the topology of the phase space (Hopf bifurcations, saddle-node loops, etc.).</li>
</ul>
<h2 id="motivation-bridging-microscopic-structure-and-macroscopic-dynamics">Motivation: Bridging Microscopic Structure and Macroscopic Dynamics</h2>
<p>The Pt(110) surface exhibits complex temporal behavior during CO oxidation, including bistability, sustained oscillations, mixed-mode oscillations (MMOs), and chaos. Previous simple models could explain bistability but failed to capture the oscillatory dynamics observed experimentally. There was a need for a &ldquo;realistic&rdquo; model that used physically derived parameters to quantitatively link microscopic surface changes (structural phase transitions) to macroscopic reaction rates.</p>
<h2 id="novelty-coupling-reaction-kinetics-and-surface-phase-transitions">Novelty: Coupling Reaction Kinetics and Surface Phase Transitions</h2>
<p>The core novelty is the <strong>&ldquo;Reconstruction Model&rdquo;</strong>, which couples the chemical kinetics (Langmuir-Hinshelwood mechanism) with the physical structural phase transition of the platinum surface ($1\times1 \leftrightarrow 1\times2$).</p>
<ul>
<li>They treat the surface structure as a dynamic variable ($w$).</li>
<li>They introduce a fourth variable ($z$) representing &ldquo;faceting&rdquo; to explain complex mixed-mode oscillations, identifying the interplay between two negative feedback loops on different time scales as the driver for this behavior.</li>
</ul>
<h2 id="methodology-experimental-parameters-and-bifurcation-topology">Methodology: Experimental Parameters and Bifurcation Topology</h2>
<p>The validation approach involved a tight loop between numerical simulation and physical experiment:</p>
<ol>
<li><strong>Parameter Determination</strong>: They experimentally measured individual rate constants (sticking coefficients, desorption energies) using Surface Science techniques (LEED, TDS) to ground the model in reality.</li>
<li><strong>Bifurcation Analysis</strong>: They used numerical continuation methods (AUTO package) to compute &ldquo;skeleton bifurcation diagrams,&rdquo; mapping the boundaries between stable states, simple oscillations, and chaos in parameter space ($p_{CO}$ vs $p_{O_2}$).</li>
<li><strong>Physical Validation</strong>: These diagrams were compared directly against experimental work function ($\Delta \phi$) measurements and LEED intensity profiles to verify the existence regions of different dynamic regimes.</li>
</ol>
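<p>The detection step behind such continuation runs can be illustrated in miniature (a generic sketch, not the AUTO computation): at a steady state, a Hopf bifurcation is flagged when a complex-conjugate eigenvalue pair of the Jacobian crosses the imaginary axis as a control parameter varies. The demo below uses the Hopf normal form, where the crossing is known to occur at $\mu = 0$:</p>

```python
import numpy as np

def jacobian(f, y, eps=1e-7):
    # Forward-difference Jacobian of f at y
    y = np.asarray(y, dtype=float)
    n = y.size
    J = np.empty((n, n))
    f0 = np.asarray(f(y))
    for j in range(n):
        yp = y.copy()
        yp[j] += eps
        J[:, j] = (np.asarray(f(yp)) - f0) / eps
    return J

def hopf_normal_form(mu):
    # Steady state at the origin; exact eigenvalues there are mu +/- i
    return lambda y: [mu * y[0] - y[1] - y[0] * (y[0]**2 + y[1]**2),
                      y[0] + mu * y[1] - y[1] * (y[0]**2 + y[1]**2)]

# Scan the control parameter and test stability of the steady state
results = []
for mu in (-0.5, 0.5):
    eigs = np.linalg.eigvals(jacobian(hopf_normal_form(mu), [0.0, 0.0]))
    results.append(bool(eigs.real.max() > 0))
print(results)  # → [False, True]: stable before the Hopf point, unstable after
```

<p>In the paper&rsquo;s setting the same stability test is applied to steady states of the $(u, v, w)$ system while $p_{CO}$ or $p_{O_2}$ is varied.</p>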
<h2 id="results-and-limitations-mixed-mode-oscillations-vs-spatiotemporal-chaos">Results and Limitations: Mixed-Mode Oscillations vs. Spatiotemporal Chaos</h2>
<ul>
<li><strong>Successes</strong>: The 3-variable model successfully reproduces bistability and simple oscillations (limit cycles). The extended 4-variable model qualitatively captures mixed-mode oscillations (MMOs).</li>
<li><strong>Mechanism</strong>: Oscillations arise from the delay between CO adsorption and the resulting surface phase transition (which changes oxygen sticking probabilities).</li>
<li><strong>Limitations</strong>: The 4-variable model only reproduces one type of MMO; certain experimental patterns (e.g., square-wave forms with small oscillations on both high and low work-function levels) were not obtained. The oscillatory region also does not extend to low temperatures as observed experimentally. More fundamentally, the ODE model fails to predict the period-doubling cascade to chaos or hyperchaos observed in experiments. The authors conclude these are likely spatiotemporal phenomena (involving wave propagation and pattern formation) that require Partial Differential Equations (PDEs).</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<p>The paper provides a complete set of equations and parameters required to reproduce the dynamics.</p>
<h3 id="data-parameters">Data (Parameters)</h3>
<p>The model uses kinetic parameters derived from Pt(110) experiments. Key constants for reproduction:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Parameter</th>
          <th style="text-align: left">Value</th>
          <th style="text-align: left">Description</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">$\kappa_c$</td>
          <td style="text-align: left">$3.135 \times 10^5 \, s^{-1} \, \text{mbar}^{-1}$</td>
          <td style="text-align: left">Rate of CO hitting surface</td>
      </tr>
      <tr>
          <td style="text-align: left">$s_c$</td>
          <td style="text-align: left">$1.0$</td>
          <td style="text-align: left">CO sticking coefficient</td>
      </tr>
      <tr>
          <td style="text-align: left">$q$</td>
          <td style="text-align: left">$3$</td>
          <td style="text-align: left">Mobility parameter of precursor adsorption</td>
      </tr>
      <tr>
          <td style="text-align: left">$u_s$</td>
          <td style="text-align: left">$1.0$</td>
          <td style="text-align: left">Saturation coverage ($CO$)</td>
      </tr>
      <tr>
          <td style="text-align: left">$\kappa_o$</td>
          <td style="text-align: left">$5.858 \times 10^5 \, s^{-1} \, \text{mbar}^{-1}$</td>
          <td style="text-align: left">Rate of $O_2$ hitting surface</td>
      </tr>
      <tr>
          <td style="text-align: left">$s_{o,1\times2}$</td>
          <td style="text-align: left">$0.4$</td>
          <td style="text-align: left">$O_2$ sticking coeff ($1\times2$ phase)</td>
      </tr>
      <tr>
          <td style="text-align: left">$s_{o,1\times1}$</td>
          <td style="text-align: left">$0.6$</td>
          <td style="text-align: left">$O_2$ sticking coeff ($1\times1$ phase)</td>
      </tr>
      <tr>
          <td style="text-align: left">$v_s$</td>
          <td style="text-align: left">$0.8$</td>
          <td style="text-align: left">Saturation coverage ($O$)</td>
      </tr>
      <tr>
          <td style="text-align: left">$k_{r}^{0}$</td>
          <td style="text-align: left">$3 \times 10^6 \, s^{-1}$</td>
          <td style="text-align: left">Reaction pre-exponential</td>
      </tr>
      <tr>
          <td style="text-align: left">$E_r$</td>
          <td style="text-align: left">$10 \, \text{kcal/mol}$</td>
          <td style="text-align: left">Reaction activation energy</td>
      </tr>
      <tr>
          <td style="text-align: left">$k_{d}^{0}$</td>
          <td style="text-align: left">$2 \times 10^{16} \, s^{-1}$</td>
          <td style="text-align: left">Desorption pre-exponential</td>
      </tr>
      <tr>
          <td style="text-align: left">$E_d$</td>
          <td style="text-align: left">$38 \, \text{kcal/mol}$</td>
          <td style="text-align: left">Desorption activation energy</td>
      </tr>
      <tr>
          <td style="text-align: left">$k_{p}^{0}$</td>
          <td style="text-align: left">$10^2 \, s^{-1}$</td>
          <td style="text-align: left">Phase transition pre-exponential</td>
      </tr>
      <tr>
          <td style="text-align: left">$E_p$</td>
          <td style="text-align: left">$7 \, \text{kcal/mol}$</td>
          <td style="text-align: left">Phase transition activation energy</td>
      </tr>
      <tr>
          <td style="text-align: left">$k_f$</td>
          <td style="text-align: left">$0.03 \, s^{-1}$</td>
          <td style="text-align: left">Rate of facet formation</td>
      </tr>
      <tr>
          <td style="text-align: left">$k_{t}^{0}$</td>
          <td style="text-align: left">$2.65 \times 10^5 \, s^{-1}$</td>
          <td style="text-align: left">Thermal annealing pre-exponential</td>
      </tr>
      <tr>
          <td style="text-align: left">$E_t$</td>
          <td style="text-align: left">$20 \, \text{kcal/mol}$</td>
          <td style="text-align: left">Thermal annealing activation energy</td>
      </tr>
      <tr>
          <td style="text-align: left">$s_{o,3}$</td>
          <td style="text-align: left">$0.2$</td>
          <td style="text-align: left">Increase of $s_o$ for max faceting ($z=1$)</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms-the-equations">Algorithms (The Equations)</h3>
<p>The system is defined by a set of coupled Ordinary Differential Equations (ODEs).</p>
<p><strong>1. Basic 3-Variable Model (Reconstruction Model)</strong></p>
<p>The core system couples three variables: CO coverage ($u$), oxygen coverage ($v$), and the fraction of the surface in the $1\times1$ phase ($w$):</p>
<p>$$
\begin{aligned}
\dot{u} &amp;= p_{CO} \kappa_c s_c \left(1 - \left(\frac{u}{u_s}\right)^q \right) - k_d u - k_r u v \\
\dot{v} &amp;= p_{O_2} \kappa_o s_o \left(1 - \frac{u}{u_s} - \frac{v}{v_s}\right)^2 - k_r u v \\
\dot{w} &amp;= k_p (w_{eq}(u) - w)
\end{aligned}
$$</p>
<p><em>Note:</em> The oxygen sticking coefficient $s_o$ dynamically depends on the structure $w$, calculated as $s_o = w \cdot s_{o,1\times1} + (1-w) \cdot s_{o,1\times2}$. The equilibrium function $w_{eq}(u)$ is a polynomial step function that activates the phase transition:</p>
<p>$$
w_{eq}(u) =
\begin{cases}
0 &amp; u \le 0.2 \\
\sum_{i=0}^3 r_i u^i &amp; 0.2 &lt; u &lt; 0.5 \\
1 &amp; u \ge 0.5
\end{cases}
$$</p>
<p>The polynomial coefficients from Table II are: $r_3 = -1/0.0135$, $r_2 = -1.05 r_3$, $r_1 = 0.3 r_3$, $r_0 = -0.026 r_3$.</p>
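<p>As a quick sanity check (ours, not the paper&rsquo;s), these coefficients make the cubic join the constant branches continuously, reaching $0$ at $u = 0.2$ and $1$ at $u = 0.5$:</p>

```python
# Table II coefficients for the middle branch of w_eq(u)
r3 = -1.0 / 0.0135
r2 = -1.05 * r3
r1 = 0.3 * r3
r0 = -0.026 * r3

def w_eq_cubic(u):
    return r0 + r1 * u + r2 * u**2 + r3 * u**3

# Continuity with the piecewise-constant branches
assert abs(w_eq_cubic(0.2)) < 1e-9        # joins the w_eq = 0 branch
assert abs(w_eq_cubic(0.5) - 1.0) < 1e-9  # joins the w_eq = 1 branch
```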
<p><strong>2. Extended 4-Variable Model (Faceting)</strong></p>
<p>To reproduce Mixed-Mode Oscillations, the model adds a faceting variable $z$:</p>
<p>$$
\begin{aligned}
s_o &amp;= w \cdot s_{o,1\times1} + (1-w) \cdot s_{o,1\times2} + s_{o,3} z \\
\dot{z} &amp;= k_f \cdot u \cdot v \cdot w \cdot (1-z) - k_t z (1-u)
\end{aligned}
$$</p>
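<p>Putting both pieces together, the 4-variable right-hand side can be sketched as follows. This is a minimal sketch: the smoothstep stands in for the Table II cubic, and the Arrhenius rates are evaluated at a fixed $T = 540$ K (a choice made here, not a prescription of the paper):</p>

```python
import numpy as np

# Parameter-table constants, with Arrhenius rates at T = 540 K
R, T = 0.001987, 540.0                    # kcal/(mol K), K
k_c, s_c, q = 3.135e5, 1.0, 3.0
k_o, s_o1, s_o2, s_o3 = 5.858e5, 0.6, 0.4, 0.2
u_s, v_s = 1.0, 0.8
k_d = 2.0e16 * np.exp(-38.0 / (R * T))    # CO desorption
k_r = 3.0e6 * np.exp(-10.0 / (R * T))     # surface reaction
k_p = 1.0e2 * np.exp(-7.0 / (R * T))      # 1x1/1x2 phase transition
k_t = 2.65e5 * np.exp(-20.0 / (R * T))    # thermal annealing of facets
k_f = 0.03                                # facet formation rate

def w_eq(u):
    # Smoothstep stand-in for the Table II cubic on 0.2..0.5
    if u <= 0.2:
        return 0.0
    if u >= 0.5:
        return 1.0
    s = (u - 0.2) / 0.3
    return 3 * s**2 - 2 * s**3

def rhs(t, y, p_CO=3.0e-5, p_O2=6.67e-5):
    u, v, w, z = y
    s_o = w * s_o1 + (1 - w) * s_o2 + s_o3 * z   # faceting raises O2 sticking
    r = k_r * u * v
    du = p_CO * k_c * s_c * (1 - (u / u_s)**q) - k_d * u - r
    dv = p_O2 * k_o * s_o * (1 - u / u_s - v / v_s)**2 - r
    dw = k_p * (w_eq(u) - w)
    dz = k_f * u * v * w * (1 - z) - k_t * z * (1 - u)
    return [du, dv, dw, dz]
```

<p>Note that the $z$ equation only grows facets while the surface is reactive ($u$, $v$, $w$ all nonzero) and anneals them once CO coverage drops, supplying the slow second feedback loop behind the mixed-mode oscillations.</p>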
<h3 id="models">Models</h3>
<p>The authors define two distinct configurations:</p>
<ol>
<li><strong>3-Variable (u, v, w)</strong>: Sufficient for bistability and simple oscillations (limit cycles).</li>
<li><strong>4-Variable (u, v, w, z)</strong>: Required for mixed-mode oscillations (small oscillations superimposed on large relaxation spikes).</li>
</ol>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Bifurcation Analysis</strong>: The system should be evaluated by computing steady states and detecting Hopf bifurcations as a function of $p_{CO}$ and $p_{O_2}$.</li>
<li><strong>Time Integration</strong>: Stiff ODE solvers (e.g., <code>scipy.integrate.odeint</code> or <code>solve_ivp</code> with &lsquo;Radau&rsquo; or &lsquo;BDF&rsquo; method) are recommended due to the differing time scales of reaction ($u,v$) and reconstruction ($w,z$).</li>
</ul>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Original</strong>: VAX 6800 and VAXstation 3100.</li>
<li><strong>Modern requirements</strong>: Minimal; the system integrates in milliseconds on any modern CPU using standard scientific libraries (Python/MATLAB).</li>
</ul>
<h3 id="reference-implementation">Reference Implementation</h3>
<p>The following Python script implements the 3-variable Reconstruction Model described in the paper, replicating the stable oscillations shown in Figure 7 (T=540K):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> numpy <span style="color:#66d9ef">as</span> np
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> scipy.integrate <span style="color:#f92672">import</span> odeint
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> matplotlib.pyplot <span style="color:#66d9ef">as</span> plt
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># --- 1. CONSTANTS &amp; PARAMETERS ---</span>
</span></span><span style="display:flex;"><span>R <span style="color:#f92672">=</span> <span style="color:#ae81ff">0.001987</span>
</span></span><span style="display:flex;"><span>k_c, s_c, q <span style="color:#f92672">=</span> <span style="color:#ae81ff">3.135e5</span>, <span style="color:#ae81ff">1.0</span>, <span style="color:#ae81ff">3.0</span>
</span></span><span style="display:flex;"><span>k_o, s_o1, s_o2 <span style="color:#f92672">=</span> <span style="color:#ae81ff">5.858e5</span>, <span style="color:#ae81ff">0.6</span>, <span style="color:#ae81ff">0.4</span>
</span></span><span style="display:flex;"><span>k_d0, E_d <span style="color:#f92672">=</span> <span style="color:#ae81ff">2.0e16</span>, <span style="color:#ae81ff">38.0</span>
</span></span><span style="display:flex;"><span>k_r0, E_r <span style="color:#f92672">=</span> <span style="color:#ae81ff">3.0e6</span>, <span style="color:#ae81ff">10.0</span>
</span></span><span style="display:flex;"><span>k_p0, E_p <span style="color:#f92672">=</span> <span style="color:#ae81ff">100.0</span>, <span style="color:#ae81ff">7.0</span>
</span></span><span style="display:flex;"><span>u_s, v_s <span style="color:#f92672">=</span> <span style="color:#ae81ff">1.0</span>, <span style="color:#ae81ff">0.8</span>
</span></span><span style="display:flex;"><span>T, p_CO, p_O2 <span style="color:#f92672">=</span> <span style="color:#ae81ff">540.0</span>, <span style="color:#ae81ff">3.0e-5</span>, <span style="color:#ae81ff">6.67e-5</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Calculate Arrhenius rates</span>
</span></span><span style="display:flex;"><span>k_d <span style="color:#f92672">=</span> k_d0 <span style="color:#f92672">*</span> np<span style="color:#f92672">.</span>exp(<span style="color:#f92672">-</span>E_d <span style="color:#f92672">/</span> (R <span style="color:#f92672">*</span> T))
</span></span><span style="display:flex;"><span>k_r <span style="color:#f92672">=</span> k_r0 <span style="color:#f92672">*</span> np<span style="color:#f92672">.</span>exp(<span style="color:#f92672">-</span>E_r <span style="color:#f92672">/</span> (R <span style="color:#f92672">*</span> T))
</span></span><span style="display:flex;"><span>k_p <span style="color:#f92672">=</span> k_p0 <span style="color:#f92672">*</span> np<span style="color:#f92672">.</span>exp(<span style="color:#f92672">-</span>E_p <span style="color:#f92672">/</span> (R <span style="color:#f92672">*</span> T))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">model</span>(y, t):
</span></span><span style="display:flex;"><span>    u, v, w <span style="color:#f92672">=</span> y
</span></span><span style="display:flex;"><span>    s_o <span style="color:#f92672">=</span> w <span style="color:#f92672">*</span> s_o1 <span style="color:#f92672">+</span> (<span style="color:#ae81ff">1</span> <span style="color:#f92672">-</span> w) <span style="color:#f92672">*</span> s_o2
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Smooth step function for Equilibrium w</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> u <span style="color:#f92672">&lt;=</span> <span style="color:#ae81ff">0.2</span>: weq <span style="color:#f92672">=</span> <span style="color:#ae81ff">0.0</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">elif</span> u <span style="color:#f92672">&gt;=</span> <span style="color:#ae81ff">0.5</span>: weq <span style="color:#f92672">=</span> <span style="color:#ae81ff">1.0</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">else</span>:
</span></span><span style="display:flex;"><span>        x <span style="color:#f92672">=</span> (u <span style="color:#f92672">-</span> <span style="color:#ae81ff">0.2</span>) <span style="color:#f92672">/</span> <span style="color:#ae81ff">0.3</span>
</span></span><span style="display:flex;"><span>        weq <span style="color:#f92672">=</span> <span style="color:#ae81ff">3</span><span style="color:#f92672">*</span>x<span style="color:#f92672">**</span><span style="color:#ae81ff">2</span> <span style="color:#f92672">-</span> <span style="color:#ae81ff">2</span><span style="color:#f92672">*</span>x<span style="color:#f92672">**</span><span style="color:#ae81ff">3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    r_reac <span style="color:#f92672">=</span> k_r <span style="color:#f92672">*</span> u <span style="color:#f92672">*</span> v
</span></span><span style="display:flex;"><span>    du <span style="color:#f92672">=</span> p_CO <span style="color:#f92672">*</span> k_c <span style="color:#f92672">*</span> s_c <span style="color:#f92672">*</span> (<span style="color:#ae81ff">1</span> <span style="color:#f92672">-</span> (u<span style="color:#f92672">/</span>u_s)<span style="color:#f92672">**</span>q) <span style="color:#f92672">-</span> k_d <span style="color:#f92672">*</span> u <span style="color:#f92672">-</span> r_reac
</span></span><span style="display:flex;"><span>    dv <span style="color:#f92672">=</span> p_O2 <span style="color:#f92672">*</span> k_o <span style="color:#f92672">*</span> s_o <span style="color:#f92672">*</span> (<span style="color:#ae81ff">1</span> <span style="color:#f92672">-</span> u<span style="color:#f92672">/</span>u_s <span style="color:#f92672">-</span> v<span style="color:#f92672">/</span>v_s)<span style="color:#f92672">**</span><span style="color:#ae81ff">2</span> <span style="color:#f92672">-</span> r_reac
</span></span><span style="display:flex;"><span>    dw <span style="color:#f92672">=</span> k_p <span style="color:#f92672">*</span> (weq <span style="color:#f92672">-</span> w)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> [du, dv, dw]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># --- 2. SIMULATION STRATEGY ---</span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Simulate for 300 seconds to kill transients</span>
</span></span><span style="display:flex;"><span>t_full <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linspace(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">300</span>, <span style="color:#ae81ff">3000</span>)
</span></span><span style="display:flex;"><span>y0 <span style="color:#f92672">=</span> [<span style="color:#ae81ff">0.1</span>, <span style="color:#ae81ff">0.1</span>, <span style="color:#ae81ff">0.0</span>]
</span></span><span style="display:flex;"><span>solution <span style="color:#f92672">=</span> odeint(model, y0, t_full)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># --- 3. SLICING FOR FIGURE 7 ---</span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Only take the last 60 seconds (stable limit cycle)</span>
</span></span><span style="display:flex;"><span>mask <span style="color:#f92672">=</span> (t_full <span style="color:#f92672">&gt;</span> <span style="color:#ae81ff">240</span>) <span style="color:#f92672">&amp;</span> (t_full <span style="color:#f92672">&lt;</span> <span style="color:#ae81ff">300</span>)
</span></span><span style="display:flex;"><span>t_plot <span style="color:#f92672">=</span> t_full[mask]
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Shift time axis to start at 10s (matching Fig 7 style)</span>
</span></span><span style="display:flex;"><span>t_display <span style="color:#f92672">=</span> t_plot <span style="color:#f92672">-</span> t_plot[<span style="color:#ae81ff">0</span>] <span style="color:#f92672">+</span> <span style="color:#ae81ff">10</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>u_plot <span style="color:#f92672">=</span> solution[mask, <span style="color:#ae81ff">0</span>]
</span></span><span style="display:flex;"><span>v_plot <span style="color:#f92672">=</span> solution[mask, <span style="color:#ae81ff">1</span>]
</span></span><span style="display:flex;"><span>w_plot <span style="color:#f92672">=</span> solution[mask, <span style="color:#ae81ff">2</span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># --- 4. VISUALIZATION ---</span>
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>figure(figsize<span style="color:#f92672">=</span>(<span style="color:#ae81ff">8</span>, <span style="color:#ae81ff">5</span>))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Plot CO (u) and Structure (w) on top (Primary Axis)</span>
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>plot(t_display, w_plot, <span style="color:#e6db74">&#39;g--&#39;</span>, label<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;1x1 Fraction (w)&#39;</span>, linewidth<span style="color:#f92672">=</span><span style="color:#ae81ff">1.5</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>plot(t_display, u_plot, <span style="color:#e6db74">&#39;k-&#39;</span>, label<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;CO Coverage (u)&#39;</span>, linewidth<span style="color:#f92672">=</span><span style="color:#ae81ff">2</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Plot Oxygen (v) on bottom</span>
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>plot(t_display, v_plot, <span style="color:#e6db74">&#39;r-.&#39;</span>, label<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;Oxygen (v)&#39;</span>, linewidth<span style="color:#f92672">=</span><span style="color:#ae81ff">1.5</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>title(<span style="color:#e6db74">&#39;Replication of Figure 7: Stable Oscillations&#39;</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>xlabel(<span style="color:#e6db74">&#39;Time (s)&#39;</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>ylabel(<span style="color:#e6db74">&#39;Coverage [ML]&#39;</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>legend(loc<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;upper center&#39;</span>, ncol<span style="color:#f92672">=</span><span style="color:#ae81ff">3</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>xlim(<span style="color:#ae81ff">10</span>, <span style="color:#ae81ff">60</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>ylim(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1.0</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>grid(<span style="color:#66d9ef">True</span>, alpha<span style="color:#f92672">=</span><span style="color:#ae81ff">0.3</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>show()
</span></span></code></pre></div>
<figure class="post-figure center ">
    <img src="/img/notes/oscillatory-co-pt110-replication.webp"
         alt="Replication of Figure 7 showing stable oscillations in CO oxidation on Pt(110)"
         title="Replication of Figure 7 showing stable oscillations in CO oxidation on Pt(110)"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Output of the reference implementation showing stable oscillations on Pt(110)</figcaption>
    
</figure>

<p>This plot faithfully replicates the stable limit cycle shown in <strong>Figure 7</strong> of the paper:</p>
<ul>
<li><strong>Timeframe</strong>: Shows a 50-second window (labeled 10-60s) after initial transients have died out.</li>
<li><strong>Period</strong>: Regular oscillations with a period of roughly 7-8 seconds.</li>
<li><strong>Phase Relationship</strong>: The surface phase reconstruction ($w$, green dashed) lags slightly behind the CO coverage ($u$, black solid). This delay is the crucial &ldquo;memory&rdquo; effect that enables the oscillation.</li>
<li><strong>Anticorrelation</strong>: The oxygen coverage ($v$, red dash-dot) spikes exactly when the surface is in the active $1\times1$ phase (high $w$) and CO is low, confirming the &ldquo;Langmuir-Hinshelwood&rdquo; reaction mechanism.</li>
</ul>
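<p>The quoted period can also be read off programmatically; below is a generic peak-spacing helper (illustrative, not part of the paper), demonstrated on a synthetic signal of known period:</p>

```python
import numpy as np

def estimate_period(t, x):
    # Mean spacing between successive strict local maxima of a 1-D signal
    interior = (x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])
    peaks = t[1:-1][interior]
    return np.diff(peaks).mean()

t = np.linspace(0.0, 60.0, 6001)   # 0.01 s sampling
x = np.sin(2 * np.pi * t / 7.37)   # synthetic oscillation with a 7.37 s period
print(round(estimate_period(t, x), 2))  # → 7.37
```

<p>Applied to <code>u_plot</code> from the reference implementation, the same helper gives the oscillation period directly.</p>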
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Krischer, K., Eiswirth, M., &amp; Ertl, G. (1992). Oscillatory CO oxidation on Pt(110): Modeling of temporal self-organization. <em>The Journal of Chemical Physics</em>, 96(12), 9161-9172. <a href="https://doi.org/10.1063/1.462226">https://doi.org/10.1063/1.462226</a></p>
<p><strong>Publication</strong>: Journal of Chemical Physics 1992</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{krischerOscillatoryCOOxidation1992,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Oscillatory {{CO}} Oxidation on {{Pt}}(110): {{Modeling}} of Temporal Self-organization}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">shorttitle</span> = <span style="color:#e6db74">{Oscillatory {{CO}} Oxidation on {{Pt}}(110)}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Krischer, K. and Eiswirth, M. and Ertl, G.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1992</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = jun,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{The Journal of Chemical Physics}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{96}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{12}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{9161--9172}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{0021-9606, 1089-7690}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1063/1.462226}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>MD Simulation of Self-Diffusion on Metal Surfaces (1994)</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/self-diffusion-metal-surfaces-1994/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/self-diffusion-metal-surfaces-1994/</guid><description>Molecular dynamics simulation of Iridium surface diffusion confirming atomic exchange mechanisms using EAM and many-body potentials.</description><content:encoded><![CDATA[<h2 id="scientific-typology-computational-discovery">Scientific Typology: Computational Discovery</h2>
<p>This is primarily a <strong>Discovery</strong> ($\Psi_{\text{Discovery}}$) paper, with strong supporting contributions as a <strong>Method</strong> ($\Psi_{\text{Method}}$) evaluation. The primary contribution is the validation and mechanistic visualization of the &ldquo;exchange mechanism&rdquo; for surface diffusion using computational methods (Molecular Dynamics with many-body potentials). This physical phenomenon was previously observed in Field Ion Microscope (FIM) experiments but difficult to characterize dynamically. The paper focuses on determining <em>how</em> atoms move, specifically distinguishing between hopping and exchange mechanisms.</p>
<h2 id="the-field-ion-microscope-fim-observation-gap">The Field Ion Microscope (FIM) Observation Gap</h2>
<p>Surface diffusion is critical for understanding phenomena like crystal growth, epitaxy, and catalysis. Experimental evidence from FIM on fcc(001) surfaces (specifically Pt and Ir) suggested an &ldquo;exchange mechanism&rdquo; where an adatom replaces a substrate atom, challenging the conventional wisdom that adatoms migrate by hopping over potential barriers (bridge sites) between binding sites. The authors sought to:</p>
<ol>
<li>Investigate whether this exchange mechanism could be reproduced dynamically in simulation.</li>
<li>Determine which interatomic potentials (EAM, Sutton-Chen, R-G-L) accurately describe these surface behaviors compared to bulk properties.</li>
</ol>
<h2 id="dynamic-visualization-of-atomic-exchange">Dynamic Visualization of Atomic Exchange</h2>
<p>The study provides a direct dynamic visualization of the &ldquo;concerted motion&rdquo; involved in exchange diffusion events, which happens on timescales too fast for experimental imaging. By comparing three different many-body potentials, the authors demonstrate that the choice of potential is critical for capturing surface phenomena; specifically, identifying that &ldquo;bulk&rdquo; derived potentials (like Sutton-Chen) may fail to capture specific surface exchange events that EAM and R-G-L potentials successfully model.</p>
<h2 id="simulation-protocol--evaluated-potentials">Simulation Protocol &amp; Evaluated Potentials</h2>
<p>The authors performed Molecular Dynamics (MD) simulations on Iridium (Ir) surfaces:</p>
<ul>
<li><strong>Surfaces</strong>: Channeled (110), densely packed (111), and loosely packed (001).</li>
<li><strong>Potentials</strong>: Three many-body models were tested: Embedded Atom Method (EAM), Sutton-Chen (S-C), and Rosato-Guillope-Legrand (R-G-L).</li>
<li><strong>Conditions</strong>: Simulations were primarily run at $T=800$ K to ensure sufficient sampling of diffusion events.</li>
<li><strong>Cross-Validation</strong>: The study extended the analysis to Cu, Rh, and Pt systems to verify the universality of the exchange mechanism against experimental data.</li>
</ul>
<h2 id="confirmation-of-concerted-motion-mechanisms">Confirmation of Concerted Motion Mechanisms</h2>
<ul>
<li><strong>Mechanism Confirmation</strong>: The study confirmed that diffusion on Ir(001) proceeds via an atomic exchange mechanism (concerted motion). The activation energy for exchange ($0.77$ eV) was found to be significantly lower than for hopping over bridge sites ($1.57$ eV).</li>
<li><strong>Surface Structure Dependence</strong>:
<ul>
<li><strong>Ir(111)</strong>: Diffusion is rapid (activation energy $V_a = 0.17$ eV from R-G-L Arrhenius plot) and occurs exclusively via hopping; no exchange events were observed due to the close-packed nature of the surface.</li>
<li><strong>Ir(110)</strong>: Diffusion is anisotropic; atoms hop <em>along</em> channels but use the exchange mechanism to move <em>across</em> channels.</li>
</ul>
</li>
<li><strong>Potential Validity</strong>: The R-G-L and EAM potentials successfully reproduced experimental exchange behaviors, whereas the Sutton-Chen potential failed to predict exchange on Ir(001). The authors attribute the S-C failure primarily to the use of &ldquo;bulk&rdquo; potential parameters to describe interactions at the surface.</li>
<li><strong>Cross-System Comparison</strong>: Across the Cu, Rh, and Pt systems, both S-C and R-G-L potentials correctly predicted the absence of exchange on all three Rh surfaces and on the (111) surfaces of Cu and Pt. Exchange events were correctly predicted on Cu(001), Cu(110), Pt(001), and Pt(110) by both potentials. The sole discrepancy was S-C failing to predict exchange on Ir(001), where R-G-L and EAM succeeded in agreement with experiment.</li>
</ul>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Integration</strong>: &ldquo;Velocity&rdquo; form of the Verlet algorithm.</li>
<li><strong>Time Step</strong>: $\Delta t = 0.01$ ps ($10^{-14}$ s).</li>
<li><strong>Simulation Protocol</strong>:
<ol>
<li><strong>Quenching</strong>: System relaxed to 0 K by zeroing velocities when $v \cdot F &lt; 0$.</li>
<li><strong>Equilibration</strong>: 5 ps constant-temperature run (renormalizing velocities every step).</li>
<li><strong>Production</strong>: 15 ps constant-energy (microcanonical) run where trajectories are collected.</li>
</ol>
</li>
</ul>
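<p>The quench step of the protocol above can be sketched with the velocity form of the Verlet algorithm. This is a minimal illustration using a 1D harmonic force as a stand-in for the many-body potentials; the spring constant, mass, and initial condition are illustrative, not the paper's Ir system:</p>

```python
import numpy as np

def velocity_verlet_step(x, v, f, mass, dt, force_fn):
    """One step of the 'velocity' form of the Verlet algorithm."""
    v_half = v + 0.5 * dt * f / mass            # first half-kick
    x_new = x + dt * v_half                     # drift
    f_new = force_fn(x_new)
    v_new = v_half + 0.5 * dt * f_new / mass    # second half-kick
    return x_new, v_new, f_new

def quench(x, v, f, mass, dt, force_fn, steps):
    """Relax toward 0 K by zeroing velocities whenever v . F < 0."""
    for _ in range(steps):
        x, v, f = velocity_verlet_step(x, v, f, mass, dt, force_fn)
        if np.dot(v.ravel(), f.ravel()) < 0.0:
            v = np.zeros_like(v)
    return x, v, f

# Illustrative 1D harmonic well (not the Ir potentials of the paper)
force_fn = lambda x: -1.0 * x
x, v = np.array([1.0]), np.zeros(1)
x, v, f = quench(x, v, force_fn(x), mass=1.0, dt=0.01,
                 force_fn=force_fn, steps=5000)
```

<p>In the paper's protocol this quench precedes the 5 ps thermostatted equilibration and the 15 ps microcanonical production run.</p>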
<h3 id="models">Models</h3>
<p>The study relies on three specific many-body potential formulations:</p>
<ol>
<li><strong>Embedded Atom Method (EAM)</strong>:
<ul>
<li>Total energy:
$$U_{tot} = \sum_i F_i(\rho_i) + \frac{1}{2} \sum_i \sum_{j \neq i} \phi_{ij}(r_{ij})$$</li>
</ul>
</li>
<li><strong>Sutton-Chen (S-C)</strong>:
<ul>
<li>Uses a square root density dependence and power-law pair repulsion $(a/r)^{n}$:
$$F(\rho) \propto \rho^{1/2}$$</li>
</ul>
</li>
<li><strong>Rosato-Guillope-Legrand (R-G-L)</strong>:
<ul>
<li>Born-Mayer type repulsion:
$$\phi_{ij}(r) = A \exp[-p(r/r_0 - 1)]$$</li>
<li>Attractive band energy:
$$F_i(\rho_i) = -\left(\sum_{j \neq i} \xi^2 \exp[-2q(r_{ij}/r_0 - 1)]\right)^{1/2}$$</li>
</ul>
</li>
</ol>
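<p>The R-G-L form above can be evaluated directly. The sketch below uses illustrative parameter values ($A$, $\xi$, $p$, $q$, $r_0$ are placeholders of plausible magnitude for an fcc transition metal, not the Ir parameters used in the paper) and applies no cutoff:</p>

```python
import numpy as np

def rgl_energy(positions, A=0.1156, xi=2.289, p=16.98, q=2.691, r0=2.715):
    """Total R-G-L energy: pairwise Born-Mayer repulsion plus a
    square-root many-body band term per atom (no cutoff applied)."""
    diff = positions[:, None, :] - positions[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(r, np.inf)                      # drop self terms
    repulsion = (A * np.exp(-p * (r / r0 - 1.0))).sum()
    band = xi**2 * np.exp(-2.0 * q * (r / r0 - 1.0))
    return repulsion - np.sqrt(band.sum(axis=1)).sum()

# Dimer at the reference distance r0: energy reduces to 2A - 2*xi
dimer = np.array([[0.0, 0.0, 0.0], [2.715, 0.0, 0.0]])
E = rgl_energy(dimer)
```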
<h3 id="data">Data</h3>
<ul>
<li><strong>System Size</strong>: 648 classical atoms.</li>
<li><strong>Geometry</strong>:
<ul>
<li>Cubic box with fixed volume.</li>
<li>Periodic boundary conditions in $x$ and $y$ (parallel to surface), free motion in $z$.</li>
<li>Substrate depth: 8, 12, or 9 atomic layers depending on orientation [(001), (110), (111)].</li>
</ul>
</li>
<li><strong>Cutoff Radius</strong>: 14 bohr ($\sim 7.4$ Å).</li>
<li><strong>Initial Conditions</strong>: Velocities initialized from a Maxwellian distribution.</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Diffusion Constant ($D$)</strong>: Calculated using the Einstein relation via Mean Square Displacement (MSD):
$$D = \lim_{t \to \infty} \frac{\langle \Delta r^2(t) \rangle}{2td}$$
where $d=2$ for surface diffusion.</li>
<li><strong>Activation Energy ($V_a$)</strong>: Extracted from the slope of Arrhenius plots ($\ln D$ vs $1/T$).</li>
<li><strong>Attempt Frequency ($\nu$)</strong>: Estimated via harmonic approximation: $\nu = \frac{1}{2\pi}\sqrt{c/M}$.</li>
</ul>
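<p>The evaluation pipeline (MSD slope $\to D$, Arrhenius slope $\to V_a$) can be sketched as follows; the synthetic data at the bottom is purely illustrative and constructed to be exactly linear/Arrhenius:</p>

```python
import numpy as np

def diffusion_constant(times, msd, d=2):
    """Einstein relation: <dr^2>(t) ~ 2*d*D*t, so D = slope / (2d)."""
    slope = np.polyfit(times, msd, 1)[0]
    return slope / (2.0 * d)

def arrhenius_fit(T, D, kB=8.617e-5):
    """Fit ln D vs 1/T; returns (V_a in eV, prefactor D0). kB in eV/K."""
    slope, intercept = np.polyfit(1.0 / np.asarray(T), np.log(D), 1)
    return -slope * kB, np.exp(intercept)

# Synthetic check: recover D and V_a from exact data
t = np.linspace(0.0, 15.0, 100)              # ps, like the production run
msd = 2 * 2 * 0.25 * t                       # d = 2 surface diffusion, D = 0.25
T = np.array([600.0, 700.0, 800.0, 900.0])
D_T = 1e-3 * np.exp(-0.17 / (8.617e-5 * T))  # V_a = 0.17 eV, as for Ir(111)
V_a, D0 = arrhenius_fit(T, D_T)
```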
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Shiang, K.-D., Wei, C. M., &amp; Tsong, T. T. (1994). A molecular dynamics study of self-diffusion on metal surfaces. <em>Surface Science</em>, 301(1-3), 136-150. <a href="https://doi.org/10.1016/0039-6028(94)91295-5">https://doi.org/10.1016/0039-6028(94)91295-5</a></p>
<p><strong>Publication</strong>: Surface Science 1994</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{shiang1994molecular,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{A molecular dynamics study of self-diffusion on metal surfaces}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Shiang, Keh-Dong and Wei, C.M. and Tsong, Tien T.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Surface Science}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{301}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{1-3}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{136--150}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1994}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Elsevier}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1016/0039-6028(94)91295-5}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Kinetic Oscillations in CO Oxidation on Pt(100): Theory</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/kinetic-oscillations-pt100-1985/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/kinetic-oscillations-pt100-1985/</guid><description>Theoretical model using coupled differential equations to explain CO oxidation oscillations via surface phase transitions on platinum.</description><content:encoded><![CDATA[
<figure class="post-figure center ">
    <img src="/img/notes/co-pt100-hollow.webp"
         alt="Carbon monoxide molecule adsorbed on Pt(100) FCC surface in hollow site configuration"
         title="Carbon monoxide molecule adsorbed on Pt(100) FCC surface in hollow site configuration"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">CO molecule adsorbed in hollow site on Pt(100) surface. The surface structure and CO binding configurations are central to understanding the oscillatory behavior.</figcaption>
    
</figure>

<h2 id="contribution-theoretical-modeling-of-kinetic-oscillations">Contribution: Theoretical Modeling of Kinetic Oscillations</h2>
<p>This is a <strong>Theory</strong> ($\Psi_{\text{Theory}}$) paper.</p>
<p>This paper derives a microscopic mechanism based on experimental kinetic data to explain observed kinetic oscillations. It relies heavily on <strong>formal analysis</strong>, including a <strong>Linear Stability Analysis</strong> of a simplified model to derive eigenvalues and characterize stationary points (stable nodes, saddle points, and foci) whose appearance and disappearance drive relaxation oscillations. The primary contribution is the mathematical formulation of the surface phase transition.</p>
<h2 id="motivation-explaining-periodicity-in-surface-reactions">Motivation: Explaining Periodicity in Surface Reactions</h2>
<p>Experimental studies had shown that the catalytic oxidation of Carbon Monoxide (CO) on Platinum (100) surfaces exhibits temporal oscillations and spatial wave patterns at low pressures ($10^{-4}$ Torr). While the individual elementary steps (adsorption, desorption, reaction) were known, the mechanism driving the periodicity was not understood. Prior models relied on indirect evidence; this work aimed to ground the theory in new LEED (Low-Energy Electron Diffraction) observations showing that the surface structure itself transforms periodically between a reconstructed <code>hex</code> phase and a bulk-like <code>1x1</code> phase.</p>
<h2 id="novelty-the-surface-phase-transition-model">Novelty: The Surface Phase Transition Model</h2>
<p>The core novelty is the <strong>Surface Phase Transition Model</strong>. The authors propose that the oscillations are driven by the reversible phase transition of the Pt surface atoms, which is triggered by critical adsorbate coverages:</p>
<ol>
<li><strong>State Dependent Kinetics</strong>: The <code>hex</code> and <code>1x1</code> phases have vastly different sticking coefficients for Oxygen (negligible on <code>hex</code>, high on <code>1x1</code>).</li>
<li><strong>Critical Coverage Triggers</strong>: The transition depends on whether local CO coverage exceeds a critical threshold ($U_{a,grow}$) or falls below another ($U_{a,crit}$).</li>
<li><strong>Trapping-Desorption</strong>: The model introduces a &ldquo;trapping&rdquo; term where CO diffuses from the weakly-binding <code>hex</code> phase to the strongly-binding <code>1x1</code> patches, creating a feedback loop.</li>
</ol>
<h2 id="methodology-reaction-diffusion-simulations">Methodology: Reaction-Diffusion Simulations</h2>
<p>As a theoretical paper, the &ldquo;experiments&rdquo; were computational simulations and mathematical derivations:</p>
<ul>
<li><strong>Linear Stability Analysis</strong>: They simplified the 4-variable model to a 3-variable system ($u$, $v$, $a$), then treated the phase fraction $a$ as a slowly varying parameter. This allowed them to perform a 2-variable stability analysis on the $u$-$v$ subsystem, identifying the conditions for oscillations through the appearance and disappearance of stationary points as $a$ varies.</li>
<li><strong>Hysteresis Simulation</strong>: They simulated temperature-programmed variations to match experimental CO adsorption hysteresis loops, fitting the critical coverage parameters ($U_{a,grow} \approx 0.5$).</li>
<li><strong>Reaction-Diffusion Simulation</strong>: They numerically integrated the full set of 4 coupled differential equations over a 1D spatial grid (40 compartments) to reproduce temporal oscillations and propagating wave fronts.</li>
</ul>
<h2 id="results-mechanisms-of-spatiotemporal-self-organization">Results: Mechanisms of Spatiotemporal Self-Organization</h2>
<ul>
<li><strong>Mechanism Validation</strong>: The model successfully reproduced the asymmetric oscillation waveform (a slow plateau followed by a steep breakdown) observed in work function and LEED measurements.</li>
<li><strong>Phase Transition Role</strong>: Confirmed that the &ldquo;slow&rdquo; step driving the oscillation period is the phase transformation, specifically the requirement for CO to build up to a critical level to nucleate the reactive <code>1x1</code> phase.</li>
<li><strong>Spatial Self-Organization</strong>: The addition of diffusion terms allowed the model to reproduce wave propagation, showing that defects at crystal edges can act as &ldquo;pacemakers&rdquo; or triggers for the rest of the surface.</li>
<li><strong>Chaotic Behavior</strong>: Under slightly different conditions (e.g., $T = 470$ K instead of 480 K), the coupled system produces irregular, chaotic work function oscillations. This arises when not every trigger compartment oscillation drives a wave into the bulk because the bulk has not yet recovered from the previous wave front. The authors note that such irregular behavior is the rule rather than the exception in experimental observations.</li>
<li><strong>Quantitative Limitations</strong>: The calculated oscillation periods are at least one order of magnitude shorter than experimental values (1 to 4 min). This discrepancy arises mainly from unrealistically high values of $k_5$ and $k_8$ used to reduce computational time. The model also restricts spatial analysis to a 1D grid, which oversimplifies the true 2D wave patterns seen in experiments. The authors note that microscopic adsorbate-adsorbate interactions and island formation are not included, which would require multi-scale modeling.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<p>To faithfully replicate this study, one must implement the system of four coupled differential equations. The hardware requirements are negligible by modern standards.</p>
<h3 id="models">Models</h3>
<p>The system tracks four state variables:</p>
<ol>
<li>$u_a$: CO coverage on the <code>1x1</code> phase (normalized to local area $a$)</li>
<li>$u_b$: CO coverage on the <code>hex</code> phase (normalized to local area $b$)</li>
<li>$v_a$: Oxygen coverage on the <code>1x1</code> phase (normalized to local area $a$)</li>
<li>$a$: Fraction of surface in <code>1x1</code> phase ($b = 1 - a$)</li>
</ol>
<p><strong>The Governing Equations:</strong></p>
<p><strong>CO coverage on 1x1 phase:</strong>
$$
\begin{aligned}
\frac{\partial u_a}{\partial t} = k_1 a p_{CO} - k_2 u_a + k_3 a u_b - k_4 u_a v_a / a + k_5 \nabla^2(u_a/a)
\end{aligned}
$$</p>
<p><strong>CO coverage on hex phase:</strong>
$$
\begin{aligned}
\frac{\partial u_b}{\partial t} = k_1 b p_{CO} - k_6 u_b - k_3 a u_b
\end{aligned}
$$</p>
<p><strong>Oxygen coverage on 1x1 phase:</strong>
$$
\begin{aligned}
\frac{\partial v_a}{\partial t} = k_7 a p_{O_2} \left[ \left(1 - 2 \frac{u_a}{a} - \frac{5}{3} \frac{v_a}{a}\right)^2 + \alpha \left(1 - \frac{5}{3}\frac{v_a}{a}\right)^2 \right] - k_4 u_a v_a / a
\end{aligned}
$$</p>
<p><strong>The Phase Transition Logic ($da/dt$):</strong></p>
<p>The growth of the <code>1x1</code> phase ($a$) is piecewise, defined by critical coverages:</p>
<ul>
<li>If $U_a &gt; U_{a,grow}$ and $\partial u_a/\partial t &gt; 0$: island growth with $\partial a/\partial t = (1/U_{a,grow}) \cdot \partial u_a/\partial t$</li>
<li>If $c = U_a/U_{a,crit} + V_a/V_{a,crit} &lt; 1$: decay to hex with $\partial a/\partial t = -k_8 a c$</li>
<li>Otherwise: $\partial a/\partial t = 0$</li>
</ul>
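<p>The single-compartment right-hand side (diffusion term omitted) together with the switching logic above can be sketched as follows. This is a reading of the equations as written, not the authors' code: the capitalized $U_a$, $V_a$ are interpreted as the local coverages $u_a/a$, $v_a/a$, and the guard against $a \to 0$ is an implementation choice of this sketch:</p>

```python
def rhs(state, k, p_co, p_o2, alpha,
        U_grow=0.5, U_crit=0.32, V_crit=0.4):
    """RHS of the single-compartment (no-diffusion) model.
    state = (u_a, u_b, v_a, a); k maps 'k1'..'k8' to rate constants
    (values as in the parameter table). The a -> 0 guard is a sketch
    choice, not part of the original formulation."""
    u_a, u_b, v_a, a = state
    a = max(a, 1e-9)                        # avoid division by zero
    b = 1.0 - a
    # CO on the 1x1 phase (the k5 diffusion term is omitted here)
    du_a = (k['k1'] * a * p_co - k['k2'] * u_a
            + k['k3'] * a * u_b - k['k4'] * u_a * v_a / a)
    # CO on the hex phase
    du_b = k['k1'] * b * p_co - k['k6'] * u_b - k['k3'] * a * u_b
    # Oxygen on the 1x1 phase (site blocking plus defect term alpha)
    site = 1.0 - 2.0 * u_a / a - (5.0 / 3.0) * v_a / a
    defect = 1.0 - (5.0 / 3.0) * v_a / a
    dv_a = (k['k7'] * a * p_o2 * (site**2 + alpha * defect**2)
            - k['k4'] * u_a * v_a / a)
    # Piecewise phase-transition logic for a
    U, V = u_a / a, v_a / a                 # local (per-area) coverages
    if U > U_grow and du_a > 0:
        da = du_a / U_grow                  # island growth tracks CO buildup
    else:
        c = U / U_crit + V / V_crit
        da = -k['k8'] * a * c if c < 1.0 else 0.0
    return du_a, du_b, dv_a, da

# Illustrative call with unit rate constants (not the paper's values)
k = {name: 1.0 for name in ('k1', 'k2', 'k3', 'k4', 'k6', 'k7', 'k8')}
du_a, du_b, dv_a, da = rhs((0.1, 0.1, 0.1, 0.5), k, 1.0, 1.0, 0.1)
```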
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Time Integration</strong>: Runge-Kutta-Merson routine.</li>
<li><strong>Spatial Integration</strong>: Crank-Nicolson algorithm for the diffusion term.</li>
<li><strong>Time Step</strong>: $\Delta t = 10^{-4}$ s.</li>
<li><strong>Spatial Grid</strong>: 1D array of 40 compartments, total length 0.4 cm (each compartment 0.01 cm).</li>
<li><strong>Boundary Conditions</strong>: Closed ends (no flux). Defects simulated by setting $\alpha$ higher in the first 3 &ldquo;edge&rdquo; compartments.</li>
</ul>
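<p>A sketch of the Crank-Nicolson update for the diffusion term alone on the 40-compartment grid with closed (no-flux) ends. The reaction terms would be added on top (e.g. by operator splitting), and the conservative boundary discretization here is one standard choice, not necessarily the authors':</p>

```python
import numpy as np

def crank_nicolson_matrices(n, D, dx, dt):
    """Matrices A, B with A u_new = B u_old for u_t = D u_xx,
    no-flux (closed) boundaries via a conservative 1D Laplacian."""
    r = D * dt / (2.0 * dx**2)
    L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1))
    L[0, 0] = L[-1, -1] = -1.0            # reflecting ends conserve mass
    I = np.eye(n)
    return I - r * L, I + r * L

# Grid and constants quoted in the text: 40 compartments of 0.01 cm,
# dt = 1e-4 s, k5 = 4e-4 cm^2/s
A, B = crank_nicolson_matrices(40, 4e-4, 0.01, 1e-4)
u = np.zeros(40)
u[:3] = 1.0                               # perturbation in the "defect" edge cells
for _ in range(1000):
    u = np.linalg.solve(A, B @ u)
```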
<h3 id="data">Data</h3>
<p>Replication requires the specific rate constants. Note: $k_3$ and $\alpha$ are fitting parameters.</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Symbol</th>
          <th>Value (at 480 K)</th>
          <th>Description</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>CO Stick</td>
          <td>$k_1$</td>
          <td>$2.94 \times 10^5$ ML/s/Torr</td>
          <td>Pre-exponential factor</td>
      </tr>
      <tr>
          <td>CO Desorp (1x1)</td>
          <td>$k_2$</td>
          <td>$1.5$ s$^{-1}$ ($U_a = 0.5$)</td>
          <td>$E_a = 37.3$ (low cov), $33.5$ kcal/mol (high cov)</td>
      </tr>
      <tr>
          <td>Trapping</td>
          <td>$k_3$</td>
          <td>$50 \pm 30$ s$^{-1}$</td>
          <td>Hex to 1x1 diffusion</td>
      </tr>
      <tr>
          <td>Reaction</td>
          <td>$k_4$</td>
          <td>$10^3 - 10^5$ ML$^{-1}$s$^{-1}$</td>
          <td>Langmuir-Hinshelwood</td>
      </tr>
      <tr>
          <td>Diffusion</td>
          <td>$k_5$</td>
          <td>$4 \times 10^{-4}$ cm$^2$/s</td>
<td>CO surface diffusion (elevated for computational speed; realistic: $10^{-7}$ to $10^{-5}$ cm$^2$/s)</td>
      </tr>
      <tr>
          <td>CO Desorp (hex)</td>
          <td>$k_6$</td>
          <td>$11$ s$^{-1}$</td>
          <td>$E_a = 27.5$ kcal/mol</td>
      </tr>
      <tr>
          <td>O2 Adsorption</td>
          <td>$k_7$</td>
          <td>$5.6 \times 10^5$ ML/s/Torr</td>
          <td>Only on 1x1 phase</td>
      </tr>
      <tr>
          <td>Phase Trans</td>
          <td>$k_8$</td>
          <td>$0.4 - 2.0$ s$^{-1}$</td>
          <td>Relaxation constant</td>
      </tr>
      <tr>
          <td>Defect Coeff</td>
          <td>$\alpha$</td>
          <td>$0.1 - 0.5$</td>
          <td>Fitting param for defects</td>
      </tr>
      <tr>
          <td>Crit Cov (Grow)</td>
          <td>$U_{a,grow}$</td>
          <td>$0.5 \pm 0.1$</td>
          <td>Trigger for hex to 1x1</td>
      </tr>
      <tr>
          <td>Crit Cov (Decay)</td>
          <td>$U_{a,crit}$</td>
          <td>$0.32$</td>
          <td>Trigger for 1x1 to hex (CO)</td>
      </tr>
      <tr>
          <td>Crit O Cov</td>
          <td>$V_{a,crit}$</td>
          <td>$0.4$</td>
          <td>Trigger for 1x1 to hex (O)</td>
      </tr>
  </tbody>
</table>
<h3 id="evaluation">Evaluation</h3>
<p>The model was evaluated by comparing the simulated temporal oscillations and spatial wave patterns against experimental work function measurements and LEED observations.</p>
<h3 id="hardware">Hardware</h3>
<p>The original simulations were likely performed on a mainframe or minicomputer of the era. Today, they can be run on any standard personal computer.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Imbihl, R., Cox, M. P., Ertl, G., Müller, H., &amp; Brenig, W. (1985). Kinetic oscillations in the catalytic CO oxidation on Pt(100): Theory. <em>The Journal of Chemical Physics</em>, 83(4), 1578-1587. <a href="https://doi.org/10.1063/1.449834">https://doi.org/10.1063/1.449834</a></p>
<p><strong>Publication</strong>: The Journal of Chemical Physics 1985</p>
<p><strong>Related Work</strong>: See also <a href="/notes/chemistry/molecular-simulation/oscillatory-co-oxidation-pt110-1992/">Oscillatory CO Oxidation on Pt(110)</a> for the same catalytic system on a different crystal face, demonstrating that surface phase transitions drive oscillatory behavior across multiple platinum surfaces.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{imbihl1985kinetic,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Kinetic oscillations in the catalytic CO oxidation on Pt(100): Theory}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Imbihl, R and Cox, MP and Ertl, G and M{\&#34;u}ller, H and Brenig, W}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{The Journal of Chemical Physics}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{83}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{4}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{1578--1587}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1985}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{American Institute of Physics}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>In Situ XRD of Oxidation-Reduction Oscillations on Pt/SiO2</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/oxidation-reduction-oscillations-pt-sio2-1994/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/oxidation-reduction-oscillations-pt-sio2-1994/</guid><description>In situ XRD validation of the oxide model driving kinetic rate oscillations in high-pressure CO oxidation on supported platinum.</description><content:encoded><![CDATA[<h2 id="experimental-validation-of-the-oxide-model">Experimental Validation of the Oxide Model</h2>
<p>This is a <strong>Discovery (Translational/Application)</strong> paper.</p>
<p>It is classified as such because the primary contribution is the experimental resolution of a long-standing scientific debate regarding the physical driving force of kinetic oscillations. The authors use established techniques (in situ X-ray diffraction and Debye Function Analysis) to falsify existing hypotheses (reconstruction model, carbon model) and validate a specific physical mechanism (the oxide model).</p>
<h2 id="the-missing-driving-force-in-high-pressure-co-oxidation">The Missing Driving Force in High-Pressure CO Oxidation</h2>
<p>The study addresses the debate surrounding the driving force of kinetic oscillations in CO oxidation on platinum catalysts at high pressures ($p &gt; 10^{-3}$ mbar). While low-pressure oscillations on single crystals were known to be caused by surface reconstruction, the mechanism for high-pressure oscillations on supported catalysts was unresolved. Three main models existed:</p>
<ul>
<li><strong>Reconstruction model</strong>: Structural changes of the substrate</li>
<li><strong>Carbon model</strong>: Periodic deactivation by carbon</li>
<li><strong>Oxide model</strong>: Periodic formation and reduction of surface oxides</li>
</ul>
<p>Prior to this work, there was no conclusive experimental proof demonstrating the periodic oxidation and reduction required by the oxide model.</p>
<h2 id="direct-in-situ-xrd-proof">Direct In Situ XRD Proof</h2>
<p>The core novelty is the <strong>first direct experimental evidence</strong> connecting periodic structural changes in the catalyst to rate oscillations. Using in situ X-ray diffraction (XRD), the authors demonstrated that the intensity of the Pt(111) Bragg peak oscillates in sync with the reaction rate.</p>
<p>By applying Debye Function Analysis (DFA) to the diffraction profiles, they quantitatively showed that the catalyst transitions between a metallic Pt state and a partially oxidized state (containing $\text{PtO}$ and $\text{Pt}_3\text{O}_4$). This definitively ruled out the reconstruction model (which would produce much smaller intensity variations) and confirmed the oxide model.</p>
<h2 id="in-situ-x-ray-diffraction-and-activity-monitoring">In Situ X-ray Diffraction and Activity Monitoring</h2>
<p>The authors performed <strong>in situ X-ray diffraction</strong> experiments on a supported Pt catalyst (EuroPt-1) during the CO oxidation reaction.</p>
<ul>
<li><strong>Reaction Monitoring</strong>: They cycled the temperature and gas flow rates (CO, $\text{O}_2$, He) to induce ignition, extinction, and oscillations.</li>
<li><strong>Activity Metrics</strong>: Catalytic activity was tracked via sample temperature (using thermocouples) and $\text{CO}_2$ production (using a quadrupole mass spectrometer).</li>
<li><strong>Structural Monitoring</strong>: They recorded the intensity of the Pt(111) Bragg peak continuously.</li>
<li><strong>Cluster Analysis</strong>: Detailed angular scans of diffracted intensity were taken at stationary points (active vs. inactive states) and analyzed using Debye functions to determine cluster size and composition.</li>
</ul>
<h2 id="periodic-oxidation-mechanism-and-reversibility">Periodic Oxidation Mechanism and Reversibility</h2>
<p><strong>Key Findings</strong>:</p>
<ul>
<li><strong>Oscillation Mechanism</strong>: Rate oscillations are accompanied by the periodic oxidation and reduction of the Pt catalyst.</li>
<li><strong>Phase Relationship</strong>: The X-ray intensity (oxide amount) oscillates approximately 120° ahead of the temperature (reaction rate), consistent with the oxide model: oxidation deactivates the surface → rate drops → CO reduces the surface → rate rises.</li>
<li><strong>Oxide Composition</strong>: The oxidized state consists of a mixture of metallic clusters, $\text{PtO}$, and $\text{Pt}_3\text{O}_4$. $\text{PtO}_2$ was not found.</li>
<li><strong>Extent of Oxidation</strong>: Approximately 20-30% of the metal atoms are oxidized, corresponding effectively to a shell of oxide on the surface of the nanoclusters.</li>
<li><strong>Reversibility</strong>: The transition between metallic and oxidized states is fully reversible with no sintering observed under the experimental conditions.</li>
<li><strong>Scope Limitation</strong>: The authors note that whether the oxide model also applies to kinetic oscillations on Pt foils or Pt wires remains to be verified, since small Pt clusters likely have a much higher tendency to form oxides than massive Pt metal.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The study used the <strong>EuroPt-1</strong> standard catalyst.</p>
<table>
  <thead>
      <tr>
          <th>Type</th>
          <th>Material</th>
          <th>Details</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Catalyst</strong></td>
          <td>EuroPt-1 ($\text{Pt/SiO}_2$)</td>
          <td>6.3% Pt loading on silica support</td>
      </tr>
      <tr>
          <td><strong>Particle Size</strong></td>
          <td>Pt Clusters</td>
          <td>Mean diameter ~15.5 Å; dispersion $65 \pm 5\%$</td>
      </tr>
      <tr>
          <td><strong>Sample Prep</strong></td>
          <td>Pellets</td>
          <td>40 mg of catalyst pressed into $15 \times 12 \times 0.3 \text{ mm}^3$ self-supporting pellets</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p><strong>Debye Function Analysis (DFA)</strong></p>
<p>The study used DFA to fit theoretical scattering curves to experimental intensity profiles. This method is suitable for randomly oriented clusters where standard crystallographic methods might fail due to finite size effects.</p>
<p>$$I_{N}(b)=\sum_{m,n=1}^{N}f_{m}f_{n}\frac{\sin(2\pi br_{mn})}{2\pi br_{mn}}$$</p>
<p>Where:</p>
<ul>
<li><strong>$b$</strong>: Scattering vector magnitude, $b=2 \sin \vartheta/\lambda$</li>
<li><strong>$f_m, f_n$</strong>: Atomic scattering amplitudes</li>
<li><strong>$r_{mn}$</strong>: Distance between atom pairs</li>
<li><strong>Shape Assumption</strong>: Cuboctahedral clusters (nearly spherical)</li>
</ul>
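<p>The Debye sum above is direct to evaluate. Below is a minimal implementation assuming a single element (scalar $f$), using NumPy's normalized sinc, where <code>np.sinc(y)</code> $= \sin(\pi y)/(\pi y)$, so <code>np.sinc(2*b*r)</code> equals the kernel $\sin(2\pi b r)/(2\pi b r)$ and handles the $r_{mn}=0$ self terms:</p>

```python
import numpy as np

def debye_intensity(positions, b, f=1.0):
    """I_N(b) = sum_{m,n} f_m f_n sin(2*pi*b*r_mn) / (2*pi*b*r_mn),
    with equal scattering amplitudes f (single-element cluster)."""
    diff = positions[:, None, :] - positions[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    kernel = np.sinc(2.0 * np.asarray(b)[:, None, None] * r)
    return (f * f * kernel).sum(axis=(1, 2))

# Two-atom check: at b -> 0 the sum gives N^2 f^2; at b = 1/(2*r_12)
# the cross terms vanish, leaving N f^2
pair = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
I = debye_intensity(pair, [1e-9, 0.5])
```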
<h3 id="models">Models</h3>
<p><strong>1. The Oxide Model (Physical Mechanism)</strong></p>
<p>Proposed by Sales, Turner, and Maple, validated here:</p>
<ol>
<li><strong>Oxidation</strong>: As oxygen coverage increases, the surface forms a catalytically inactive oxide layer ($\text{PtO}_x$).</li>
<li><strong>Deactivation</strong>: The reaction rate drops as the surface deactivates.</li>
<li><strong>Reduction</strong>: CO adsorption leads to the reduction of the oxide layer, restoring the metallic surface.</li>
<li><strong>Reactivation</strong>: The metallic surface is active for CO oxidation, increasing the rate until oxygen coverage builds up again.</li>
</ol>
<p><strong>2. Shell Model (Structural)</strong></p>
<p>The diffraction data was fit using a &ldquo;Shell Model&rdquo; where a metallic Pt core is surrounded by an oxide shell.</p>
<h3 id="evaluation">Evaluation</h3>
<p><strong>Key Experimental Signatures for Replication</strong>:</p>
<ul>
<li><strong>Ignition Point</strong>: A sharp increase in sample temperature accompanied by a steep 18% decrease in Bragg intensity. After the He flow was switched off, the intensity dropped further to a total decrease of 31.5%.</li>
<li><strong>Oscillation Regime</strong>: Observed at flow rates $\sim 100 \text{ ml/min}$ after cooling the sample to $\sim 375 \text{ K}$. Below $50 \text{ ml/min}$, only bistability is observed. Temperature oscillations had $\sim 50 \text{ K}$ peak-to-peak amplitude.</li>
<li><strong>Magnitude</strong>: Bragg intensity oscillations of ~11% amplitude.</li>
</ul>
<h3 id="hardware">Hardware</h3>
<p><strong>Experimental Setup</strong>:</p>
<ul>
<li><strong>Diffractometer</strong>: Commercial Guinier diffractometer (HUBER) with monochromatized Cu $K_{\alpha1}$ radiation (45° transmission geometry).</li>
<li><strong>Reactor Cell</strong>: Custom 115 $\text{cm}^3$ cell, evacuatable to $10^{-7}$ mbar, equipped with Kapton windows and a Be-cover.</li>
<li><strong>Gases</strong>: CO (4.7 purity), $\text{O}_2$ (4.5 purity), He (4.6 purity) regulated by flow controllers.</li>
<li><strong>Sensors</strong>: Two K-type thermocouples (surface and gas phase) and a differentially pumped Quadrupole Mass Spectrometer (QMS).</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Hartmann, N., Imbihl, R., &amp; Vogel, W. (1994). Experimental evidence for an oxidation/reduction mechanism in rate oscillations of catalytic CO oxidation on Pt/SiO2. <em>Catalysis Letters</em>, 28(2-4), 373-381. <a href="https://doi.org/10.1007/BF00806068">https://doi.org/10.1007/BF00806068</a></p>
<p><strong>Publication</strong>: Catalysis Letters 1994</p>
<p><strong>Related Work</strong>: This work complements <a href="/notes/chemistry/molecular-simulation/oscillatory-co-oxidation-pt110-1992/">Oscillatory CO Oxidation on Pt(110)</a>, which modeled oscillations via surface reconstruction. Here, the driving force is oxidation/reduction.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{hartmannExperimentalEvidenceOxidation1994,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Experimental Evidence for an Oxidation/Reduction Mechanism in Rate Oscillations of Catalytic {{CO}} Oxidation on {{Pt}}/{{SiO2}}}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Hartmann, N. and Imbihl, R. and Vogel, W.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1994</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Catalysis Letters}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{28}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{2-4}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{373--381}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{1011-372X, 1572-879X}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1007/BF00806068}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Evans 1986: Thermal Conductivity of Lennard-Jones Fluid</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/evans-thermal-conductivity-1986/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/evans-thermal-conductivity-1986/</guid><description>A 1986 validation of the Evans NEMD method for simulating heat flow, identifying long-time tail anomalies near the critical point.</description><content:encoded><![CDATA[<h2 id="methodological-validation-and-physical-discovery">Methodological Validation and Physical Discovery</h2>
<p>This is primarily a <strong>Methodological Paper ($\Psi_{\text{Method}}$)</strong>, with a significant secondary component of <strong>Discovery ($\Psi_{\text{Discovery}}$)</strong>.</p>
<p>It focuses on validating a specific algorithm (the &ldquo;Evans method&rdquo;) for Non-Equilibrium Molecular Dynamics (NEMD) by comparing its results against experimental benchmarks. However, it also uncovers physical anomalies, specifically &ldquo;long-time tails&rdquo; in the heat flux autocorrelation function that deviate significantly from theoretical predictions, marking a discovery about the physics of the Lennard-Jones fluid itself.</p>
<h2 id="flow-gradients-and-boundary-limitations">Flow Gradients and Boundary Limitations</h2>
<p>The primary motivation is to overcome the limitations of simulating heat flow using physical boundaries (e.g., walls at different temperatures), which causes severe interpretive difficulties due to density and temperature gradients.</p>
<p>The &ldquo;Evans method&rdquo; uses a fictitious external field to induce heat flow in a periodic, homogeneous system. This paper serves to:</p>
<ol>
<li>Validate this method across a wide range of state points (temperatures and densities) beyond the triple point.</li>
<li>Investigate the system&rsquo;s behavior near the critical point, where transport properties are known to be anomalous.</li>
</ol>
<h2 id="core-innovations-of-the-evans-algorithm">Core Innovations of the Evans Algorithm</h2>
<p>The core contribution is the rigorous stress-testing of the <strong>homogeneous heat flow algorithm</strong> (Evans method) combined with a <strong>Gaussian thermostat</strong>.</p>
<p>Specific novel insights include:</p>
<ul>
<li><strong>Linearity Validation</strong>: Establishing that, away from phase boundaries, the effective thermal conductivity is a monotonic, virtually linear function of the external field, justifying the extrapolation to zero field.</li>
<li><strong>Critical Anomaly Detection</strong>: Finding that near the critical point, conductivity becomes a non-monotonic function of the field, challenging standard simulation approaches in this regime.</li>
<li><strong>Tail Amplitude Discovery</strong>: Demonstrating that the &ldquo;long-time tails&rdquo; of the heat flux autocorrelation function have amplitudes roughly 6 times larger than those predicted by mode-coupling theory.</li>
</ul>
<h2 id="nemd-simulation-setup">NEMD Simulation Setup</h2>
<p>The author performed <strong>Non-Equilibrium Molecular Dynamics (NEMD)</strong> simulations using the Lennard-Jones potential.</p>
<ul>
<li><strong>System</strong>: Mostly $N=108$ particles, with some checks using $N=256$ to test size dependence.</li>
<li><strong>Thermostat</strong>: A Gaussian thermostat was used to keep the kinetic energy (temperature) constant.</li>
<li><strong>State Points</strong>:
<ul>
<li><strong>Critical Isotherm</strong>: $T=1.35$, varying density.</li>
<li><strong>Supercritical Isotherm</strong>: $T=2.0$.</li>
<li><strong>Freezing Line</strong>: Two points ($T=2.74, \rho=1.113$ and $T=2.0, \rho=1.04$).</li>
</ul>
</li>
<li><strong>Validation</strong>: Results were compared against <strong>experimental data for Argon</strong> (using standard LJ parameters).</li>
<li><strong>Ablation</strong>:
<ul>
<li><strong>Field Strength ($F$)</strong>: Varied to check for linearity/non-linearity.</li>
<li><strong>System Size ($N$)</strong>: Comparison between 108 and 256 particles to rule out finite-size artifacts.</li>
</ul>
</li>
</ul>
<h2 id="linearity-regimes-and-long-time-tail-anomalies">Linearity Regimes and Long-Time Tail Anomalies</h2>
<ul>
<li><strong>Agreement with Experiment</strong>: The Evans method yields thermal conductivities in broad agreement with experimental Argon data for most state points.</li>
<li><strong>Linearity</strong>: Away from the critical point, conductivity is a virtually linear function of the field strength $F$, allowing for accurate zero-field extrapolation.</li>
<li><strong>Critical Region Failure</strong>: Near the critical point ($T=1.35, \rho=0.4$), the method struggles; the conductivity is non-monotonic with respect to $F$, and the zero-field extrapolation underestimates the experimental value by ~11%.</li>
<li><strong>Long-Time Tails</strong>: The decay of the heat flux autocorrelation function follows a $t^{-3/2}$ tail (consistent with mode-coupling theory), but the <strong>amplitude is ~6x larger</strong> than predicted.</li>
<li><strong>Phase Hysteresis</strong>: In high-density regions near the freezing line, the system exhibits hysteresis and bi-stability between solid and liquid phases depending on the field strength.</li>
</ul>
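<p>The zero-field extrapolation described above can be sketched numerically. The field strengths and effective conductivities below are illustrative placeholders, not values from the paper; the point is only the linear fit and its intercept:</p>

```python
import numpy as np

# Hypothetical effective conductivities lambda(F) at several field strengths
# (reduced LJ units).  Away from the critical point the dependence is
# roughly linear, so a first-order fit gives the F -> 0 limit.
fields = np.array([0.05, 0.10, 0.15, 0.20])
lam_eff = np.array([6.95, 7.10, 7.26, 7.39])  # illustrative values only

slope, intercept = np.polyfit(fields, lam_eff, 1)
print(f"extrapolated zero-field conductivity: {intercept:.2f}")
```

<p>Near the critical point, where the paper reports non-monotonic behavior, this linear extrapolation is exactly what breaks down.</p>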
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The simulation relies on the Lennard-Jones (LJ) potential to model Argon. No external training data is used; the &ldquo;data&rdquo; consists of the physical constants defining the system.</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value/Description</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Potential</strong></td>
          <td>$\Phi(q)=4(q^{-12}-q^{-6})$</td>
          <td>Standard LJ 12-6 potential</td>
      </tr>
      <tr>
          <td><strong>Cutoff</strong></td>
          <td>$r_c = 2.5$</td>
          <td>Truncated at 2.5 distance units</td>
      </tr>
      <tr>
          <td><strong>Comparison</strong></td>
          <td>Argon Experimental Data</td>
          <td>Sourced from NBS recommended values</td>
      </tr>
  </tbody>
</table>
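<p>A minimal sketch of the potential in reduced units, with the paper&rsquo;s simple truncation at $r_c = 2.5$ (energy shifting and tail corrections are omitted here):</p>

```python
import numpy as np

def lj_truncated(q, rc=2.5):
    """Reduced-units LJ 12-6 potential Phi(q) = 4(q^-12 - q^-6),
    set to zero beyond the cutoff rc, as in the paper."""
    q = np.asarray(q, dtype=float)
    phi = 4.0 * (q**-12 - q**-6)
    return np.where(q < rc, phi, 0.0)

# Minimum at q = 2**(1/6) with depth -1 in reduced units.
print(lj_truncated(2**(1/6)))
```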
<h3 id="algorithms">Algorithms</h3>
<p>The core algorithm is the <strong>Evans Homogeneous Heat Flow</strong> method. To reproduce this, one must implement the specific Equations of Motion (EOM) derived from linear response theory.</p>
<p><strong>Equations of Motion:</strong></p>
<p>The trajectories are generated by:
$$
\begin{aligned}
\dot{q}_i &amp;= \frac{p_i}{m} \\
\dot{p}_i &amp;= F_i^{\text{inter}} + (E_i - \bar{E})F(t) - \sum_{j} F_{ij} q_{ij} \cdot F(t) + \frac{1}{2N} \sum_{j,k} F_{jk} q_{jk} \cdot F(t) - \alpha p_i
\end{aligned}
$$</p>
<p>Where:</p>
<ul>
<li>$F(t)$ is the fictitious external field driving heat flow.</li>
<li>$E_i$ is the instantaneous energy of particle $i$.</li>
<li>$\alpha$ is the <strong>Gaussian thermostat multiplier</strong>, recomputed at every step so that the kinetic energy (and hence the temperature) is strictly conserved. It projects the total non-thermostat force (interatomic plus field terms) onto the momenta:
$$\alpha = \frac{\sum_i \left(F_i^{\text{inter}} + \text{field terms}\right) \cdot p_i}{\sum_i p_i \cdot p_i}$$</li>
</ul>
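<p>The thermostat multiplier can be sketched as follows; <code>forces</code> stands for whatever total non-thermostat force the equations of motion produce at a given step, and the vectors in the check are made up:</p>

```python
import numpy as np

def gaussian_thermostat_alpha(forces, momenta):
    """Isokinetic multiplier alpha = (sum_i F_i . p_i) / (sum_i p_i . p_i).

    `forces` holds the total non-thermostat force on each particle;
    subtracting alpha * p_i makes d/dt sum(p_i^2) vanish, so kinetic
    energy is a constant of motion.
    """
    return np.sum(forces * momenta) / np.sum(momenta * momenta)

# Tiny illustrative check with made-up 2-particle vectors:
F = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
p = np.array([[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
alpha = gaussian_thermostat_alpha(F, p)
# With alpha subtracted, sum_i (F_i - alpha p_i) . p_i vanishes:
print(np.sum((F - alpha * p) * p))
```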
<p><strong>Conductivity Calculation:</strong></p>
<p>The zero-frequency limit is extrapolated as:
$$ \lambda = \lim_{F \to 0} \frac{J_Q}{FT} $$</p>
<p>The frequency-dependent conductivity relies on the heat-flux autocorrelation:
$$ \lambda(\omega) = \frac{V}{3k_B T^2} \int_0^\infty dt \, e^{i\omega t} \langle J_Q(t) \cdot J_Q(0) \rangle $$</p>
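<p>A sketch of the corresponding zero-frequency Green-Kubo estimate from a sampled heat-flux time series; the function name, lag range, and rectangular-rule discretization are illustrative choices, not details from the paper:</p>

```python
import numpy as np

def green_kubo_lambda(JQ, dt, V, T, kB=1.0):
    """Zero-frequency Green-Kubo conductivity
        lambda = V / (3 kB T^2) * integral_0^inf <J_Q(t) . J_Q(0)> dt
    estimated from a heat-flux time series JQ of shape (nsteps, 3).
    """
    nsteps = JQ.shape[0]
    nlags = nsteps // 2  # only resolve lags with decent origin averaging
    # Autocorrelation C(l*dt) = <J_Q(t) . J_Q(0)>, averaged over time origins.
    C = np.array([
        np.mean(np.sum(JQ[l:] * JQ[:nsteps - l], axis=1))
        for l in range(nlags)
    ])
    # Rectangular-rule time integral over the resolved lags.
    return V / (3.0 * kB * T**2) * C.sum() * dt
```

<p>In practice the slow $t^{-3/2}$ tail reported in the paper means this truncated integral converges poorly, which is precisely why the anomalous tail amplitude matters.</p>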
<h3 id="models">Models</h3>
<p>The &ldquo;model&rdquo; here is the physical simulation setup.</p>
<ul>
<li><strong>Particle Count</strong>: $N = 108$ (primary), $N = 256$ (validation).</li>
<li><strong>Boundary Conditions</strong>: Periodic Boundary Conditions (PBC).</li>
<li><strong>Thermostat</strong>: Gaussian Isokinetic (Temperature is a constant of motion).</li>
</ul>
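<p>The periodic setup relies on the minimum-image convention for computing interparticle separations; a minimal sketch for a cubic box (the box length <code>L</code> here is a free parameter, not a value from the paper):</p>

```python
import numpy as np

def minimum_image(dr, L):
    """Map a displacement vector into the minimum-image convention
    for a cubic periodic box of side L."""
    return dr - L * np.round(dr / L)

# A separation of 0.9 in a unit box is really -0.1 through the boundary:
print(minimum_image(np.array([0.9, 0.4, 0.0]), 1.0))
```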
<h3 id="evaluation">Evaluation</h3>
<p>The primary metric is the <strong>Thermal Conductivity</strong> ($\lambda$).</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Definition</th>
          <th>Baseline</th>
          <th>Result</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Thermal Conductivity</strong></td>
          <td>Ratio of heat flux $J_Q$ to field $F$ (extrapolated to $F=0$)</td>
          <td>Experimental Argon (NBS Data)</td>
          <td>Good agreement away from critical point</td>
      </tr>
      <tr>
          <td><strong>Tail Amplitude</strong></td>
          <td>Coefficient of the $\omega^{1/2}$ term in frequency-dependent conductivity</td>
          <td>Mode-Coupling Theory ($\approx 0.05$)</td>
          <td>Simulation value $\approx 0.3$ (6x larger)</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Requirements</strong>: While 1986 hardware is obsolete, reproducing this requires a standard MD code capable of non-conservative forces (NEMD).</li>
<li><strong>Compute Cost</strong>: Low by modern standards. 108 particles for $\sim 10^5$ to $10^6$ steps is trivial on modern CPUs.</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Evans, D. J. (1986). Thermal conductivity of the Lennard-Jones fluid. <em>Physical Review A</em>, 34(2), 1449-1453. <a href="https://doi.org/10.1103/PhysRevA.34.1449">https://doi.org/10.1103/PhysRevA.34.1449</a></p>
<p><strong>Publication</strong>: Physical Review A, 1986</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{PhysRevA.34.1449,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Thermal conductivity of the Lennard-Jones fluid}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Evans, Denis J.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Phys. Rev. A}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{34}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{2}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{1449--1453}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">numpages</span> = <span style="color:#e6db74">{0}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1986}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = <span style="color:#e6db74">{Aug}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{American Physical Society}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1103/PhysRevA.34.1449}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span> = <span style="color:#e6db74">{https://link.aps.org/doi/10.1103/PhysRevA.34.1449}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Embedded-Atom Method: Theory and Applications Review</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method-review-1993/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method-review-1993/</guid><description>Comprehensive 1993 review of the Embedded-Atom Method (EAM), covering theory, parameterization, and applications to metallic systems.</description><content:encoded><![CDATA[<h2 id="systematizing-the-embedded-atom-method">Systematizing the Embedded-Atom Method</h2>
<p>This is a <strong>Systematization (Review)</strong> paper. It consolidates the theoretical development, semi-empirical parameterization, and broad applications of the Embedded-Atom Method (EAM) into a unified framework. The paper systematizes the field by connecting the EAM to related theories (Effective Medium Theory, Finnis-Sinclair, &ldquo;glue&rdquo; models) and organizing phenomenological results across diverse physical regimes (bulk, surfaces, interfaces).</p>
<p>The authors explicitly frame the work as a survey, stating &ldquo;We review here the history, development, and application of the EAM&rdquo; and &ldquo;This review emphasizes the physical insight that motivated the EAM.&rdquo; The paper follows a classic survey structure, organizing the literature by application domains.</p>
<h2 id="the-failure-of-pair-potentials-in-metallic-systems">The Failure of Pair Potentials in Metallic Systems</h2>
<p>The primary motivation is the failure of pair-potential models to accurately describe metallic bonding, particularly at defects and interfaces.</p>
<p><strong>Physics Gap</strong>: Pair potentials assume bond strength is independent of environment, implying cohesive energy scales linearly with coordination ($Z$), whereas in reality it scales roughly as $\sqrt{Z}$.</p>
<p><strong>Empirical Failures</strong>: Pair potentials incorrectly predict the &ldquo;Cauchy relation&rdquo; ($C_{12} = C_{44}$) and predict a vacancy formation energy equal to the cohesive energy, contradicting experimental data for fcc metals.</p>
<p><strong>Practical Need</strong>: First-principles calculations (like DFT) were computationally too expensive for low-symmetry systems like grain boundaries and fracture tips, creating a need for an efficient, semi-empirical many-body potential.</p>
<h2 id="theoretical-unification--core-innovations">Theoretical Unification &amp; Core Innovations</h2>
<p>The paper&rsquo;s core contribution is the synthesis of the EAM as a practical computational tool that captures &ldquo;coordination-dependent bond strength&rdquo; without the cost of ab initio methods.</p>
<p><strong>Theoretical Unification</strong>: It demonstrates that the EAM ansatz can be derived from Density Functional Theory (DFT) by assuming the total electron density is a superposition of atomic densities.</p>
<p><strong>Environmental Dependence</strong>: It explicitly formulates how the &ldquo;effective&rdquo; pair interaction stiffens and shortens as coordination decreases (e.g., at surfaces), a feature naturally arising from the non-linearity of the embedding function.</p>
<p><strong>Broad Validation</strong>: It provides a centralized evaluation of the method across a vast array of metallic properties, establishing it as the standard for atomistic simulations of face-centered cubic (fcc) metals.</p>
<h2 id="validating-eam-across-application-domains">Validating EAM Across Application Domains</h2>
<p>The authors review computational experiments using Energy Minimization, Molecular Dynamics (MD), and Monte Carlo (MC) simulations across several domains:</p>
<p><strong>Bulk Properties</strong>: Calculation of phonon spectra, liquid structure factors, thermal expansion coefficients, and melting points for fcc metals (Ni, Pd, Pt, Cu, Ag, Au).</p>
<p><strong>Defects</strong>: Computation of vacancy formation/migration energies and self-interstitial geometries.</p>
<p><strong>Grain Boundaries</strong>: Calculation of grain boundary structures, energies, and elastic properties for twist and tilt boundaries in Au and Al. Computed structures show good agreement with X-ray diffraction and HRTEM experiments. The many-body interactions in the EAM produce somewhat better agreement than pair potentials, which tend to overestimate boundary expansion.</p>
<p><strong>Surfaces</strong>: Analysis of surface energies, relaxations, reconstructions (e.g., Au(110) missing row), and surface phonons.</p>
<p><strong>Alloys</strong>: Investigation of heat of solution, surface segregation profiles (e.g., Ni-Cu), and order-disorder transitions.</p>
<p><strong>Mechanical Properties</strong>: Simulation of dislocation mobility, pinning by defects (He bubbles), and crack tip plasticity (ductile vs. brittle fracture modes).</p>
<h2 id="key-outcomes-and-the-limits-of-eam">Key Outcomes and the Limits of EAM</h2>
<p><strong>Many-Body Success</strong>: The EAM successfully reproduces the breakdown of the Cauchy relation and the correct ratio of vacancy formation energy to cohesive energy (~0.35) for fcc metals.</p>
<p><strong>Surface Accuracy</strong>: It correctly predicts that surface bonds are shorter and stiffer than bulk bonds due to lower coordination. It accurately predicts surface reconstructions (e.g., Au(110) $(1 \times 2)$).</p>
<p><strong>Alloy Behavior</strong>: The method naturally captures segregation phenomena, including oscillating concentration profiles in Ni-Cu, driven by the embedding energy.</p>
<p><strong>Limitations</strong>: The method is less accurate for systems with strong directional bonding (covalent materials) or significant Fermi-surface effects, as it assumes spherically averaged electron densities.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p><strong>Fitting Data</strong>: The semi-empirical functions are fitted to basic bulk properties: lattice constants, cohesive energy, elastic constants ($C_{11}$, $C_{12}$, $C_{44}$), and vacancy formation energy.</p>
<p><strong>Universal Binding Curve</strong>: The cohesive energy as a function of lattice constant is constrained to follow the &ldquo;universal binding curve&rdquo; of Rose et al. to ensure accurate anharmonic behavior.</p>
<p><strong>Alloy Data</strong>: For binary alloys, dilute heats of alloying are used for fitting cross-interactions.</p>
<h3 id="algorithms">Algorithms</h3>
<p><strong>Core Ansatz</strong>: The total energy is defined as:</p>
<p>$$E_{coh} = \sum_{i} G_i\left( \sum_{j \neq i} \rho_j^a(R_{ij}) \right) + \frac{1}{2} \sum_{i, j (j \neq i)} U_{ij}(R_{ij})$$</p>
<p>where $G$ is the embedding energy (function of local electron density $\rho$), and $U$ is a pair interaction.</p>
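<p>The ansatz transcribes directly to code. The functions <code>G</code>, <code>rho_a</code>, and <code>U</code> below are toy placeholders (a square-root embedding in the Finnis-Sinclair spirit mentioned in the text, with exponential density and pair terms), not any of the published parameterizations:</p>

```python
import numpy as np

def eam_energy(positions, G, rho_a, U):
    """Total energy of the EAM ansatz for a single-species cluster
    (open boundaries).  G, rho_a, U stand in for the embedding function,
    atomic density, and pair interaction; a real calculation would plug
    in one of the published parameter sets.
    """
    n = len(positions)
    rho_bar = np.zeros(n)  # host density seen by each atom
    pair = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = np.linalg.norm(positions[i] - positions[j])
            rho_bar[i] += rho_a(r)  # superposition of atomic densities
            pair += 0.5 * U(r)      # each pair visited twice, halved
    return np.sum(G(rho_bar)) + pair

# Toy dimer at separation 1 with placeholder functional forms:
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
E = eam_energy(pos,
               G=lambda rho: -np.sqrt(rho),
               rho_a=lambda r: np.exp(-r),
               U=lambda r: np.exp(-2.0 * r))
```

<p>The non-linearity of <code>G</code> is what makes the effective bond strength depend on coordination, the key advantage over pair potentials discussed above.</p>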
<p><strong>Simulation Techniques</strong>:</p>
<ul>
<li><strong>Molecular Dynamics (MD)</strong>: Used for liquids, phonons, and fracture simulations.</li>
<li><strong>Monte Carlo (MC)</strong>: Used for phase diagrams and segregation profiles (e.g., approximately $10^5$ iterations per atom).</li>
<li><strong>Phonons</strong>: Calculated via the dynamical matrix derived from the force-constant tensor $K_{ij}$.</li>
<li><strong>Normal-Mode Analysis</strong>: Vibrational normal modes obtained by diagonalizing the dynamical matrix, feasible for unit cells of up to about 260 atoms.</li>
</ul>
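<p>The normal-mode step can be illustrated on a 1D harmonic chain, a minimal stand-in for diagonalizing the 3D EAM dynamical matrix (chain length, spring constant, and mass are arbitrary):</p>

```python
import numpy as np

# Dynamical matrix of a periodic 1D chain: D_ii = 2k/m, D_i,i+-1 = -k/m.
n, k, m = 8, 1.0, 1.0
D = np.zeros((n, n))
for i in range(n):
    D[i, i] = 2.0 * k / m
    D[i, (i + 1) % n] -= k / m
    D[i, (i - 1) % n] -= k / m

# Diagonalization gives squared phonon frequencies omega^2(q).
omega = np.sqrt(np.clip(np.linalg.eigvalsh(D), 0.0, None))
# Analytic dispersion for comparison: omega(q) = 2*sqrt(k/m)*|sin(q/2)|,
# with allowed wavevectors q = 2*pi*j/n.
```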
<h3 id="models">Models</h3>
<p><strong>Parameterizations</strong>: The review lists several specific function sets developed by the authors (Table 2), including:</p>
<ul>
<li><strong>Daw and Baskes</strong>: For Ni, Pd, H (elemental metals and H in solution/on surfaces)</li>
<li><strong>Foiles</strong>: For Cu, Ag, Au, Ni, Pd, Pt (elemental metals)</li>
<li><strong>Foiles</strong>: For Cu, Ni (tailored for the Ni-Cu alloy system)</li>
<li><strong>Foiles, Baskes and Daw</strong>: For Cu, Ag, Au, Ni, Pd, Pt (dilute alloys)</li>
<li><strong>Daw, Baskes, Bisson and Wolfer</strong>: For Ni, H (fracture, dislocations, H embrittlement)</li>
<li><strong>Foiles and Daw</strong>: For Ni, Al (Ni-rich end of the Ni-Al alloy system)</li>
<li><strong>Daw</strong>: For Ni (calculated from first principles, not semi-empirical)</li>
<li><strong>Hoagland, Daw, Foiles and Baskes</strong>: For Al (elemental Al)</li>
</ul>
<p>Many of these historical parameterizations are directly downloadable in machine-readable formats from the NIST Interatomic Potentials Repository (linked in the resources below).</p>
<p><strong>Transferability</strong>: EAM functions are generally <em>not</em> transferable between different parameterization sets; mixing functions from different sets (e.g., Daw-Baskes Ni with Foiles Pd) is invalid.</p>
<h3 id="evaluation">Evaluation</h3>
<p><strong>Bulk Validation</strong>: Phonon dispersion curves for Cu show excellent agreement with experiment across the full Brillouin zone.</p>
<p><strong>Thermal Properties</strong>: Linear thermal expansion coefficients match experiment well (e.g., Cu calculated: $16.4 \times 10^{-6}/K$ vs experimental: $16.7 \times 10^{-6}/K$).</p>
<p><strong>Defect Energetics</strong>: Vacancy migration energies and divacancy binding energies (~0.1-0.2 eV) align with experimental data.</p>
<p><strong>Surface Segregation</strong>: Correctly predicts segregation species for 18 distinct dilute alloy cases (e.g., Cu segregating in Ni).</p>
<h3 id="hardware">Hardware</h3>
<p><strong>Compute Scale</strong>: At the time of publication (1993), Molecular Dynamics simulations of up to 35,000 atoms were possible.</p>
<p><strong>Platforms</strong>: Calculations were performed on supercomputers like the <strong>CRAY-XMP</strong>, though smaller calculations were noted as feasible on high-performance workstations.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Daw, M. S., Foiles, S. M., &amp; Baskes, M. I. (1993). The embedded-atom method: a review of theory and applications. <em>Materials Science Reports</em>, 9(7-8), 251-310. <a href="https://doi.org/10.1016/0920-2307(93)90001-U">https://doi.org/10.1016/0920-2307(93)90001-U</a></p>
<p><strong>Publication</strong>: Materials Science Reports 1993</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{dawEmbeddedatomMethodReview1993,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The embedded-atom method: a review of theory and applications}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">shorttitle</span> = <span style="color:#e6db74">{The Embedded-Atom Method}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Daw, Murray S. and Foiles, Stephen M. and Baskes, Michael I.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1993</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = mar,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Materials Science Reports}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{9}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{7-8}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{251--310}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{0920-2307}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1016/0920-2307(93)90001-U}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method/">Original EAM Paper (1984)</a></li>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method-voter-1994/">EAM User Guide (1994)</a></li>
<li><a href="https://www.ctcms.nist.gov/potentials/">NIST Interatomic Potentials Repository</a></li>
</ul>
]]></content:encoded></item><item><title>Embedded-Atom Method User Guide: Voter's 1994 Chapter</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method-voter-1994/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method-voter-1994/</guid><description>Comprehensive user guide for the Embedded-Atom Method (EAM), covering theory, potential fitting, and applications to intermetallics.</description><content:encoded><![CDATA[<h2 id="contribution-systematizing-the-embedded-atom-method">Contribution: Systematizing the Embedded-Atom Method</h2>
<p>This is a <strong>Systematization</strong> paper (specifically a handbook chapter) with a strong secondary <strong>Method</strong> projection.</p>
<p>Its primary goal is to serve as a &ldquo;users&rsquo; guide&rdquo; to the Embedded-Atom Method (EAM). The text organizes existing knowledge:</p>
<ul>
<li>It traces the physical origins of EAM from Density Functional Theory (DFT) and Effective Medium Theory.</li>
<li>It synthesizes &ldquo;closely related methods&rdquo; (Second Moment Approximation, Glue Model), showing they are mathematically equivalent or very similar to EAM.</li>
<li>It provides a pedagogical, step-by-step methodology for fitting potentials to experimental data.</li>
</ul>
<h2 id="motivation-bridging-the-gap-between-dft-and-pair-potentials">Motivation: Bridging the Gap Between DFT and Pair Potentials</h2>
<p>The primary motivation is to bridge the gap between accurate, expensive electronic structure calculations and fast, inaccurate pair potentials.</p>
<ul>
<li><strong>Computational Efficiency</strong>: First-principles methods scale as $O(N^3)$ or worse, limiting simulations to $&lt;100$ atoms (in 1994). Pair potentials scale as $O(N)$ and fail to capture essential many-body physics of metals.</li>
<li><strong>Physical Accuracy</strong>: Simple pair potentials cannot accurately model metallic defects; they predict zero Cauchy pressure ($C_{12} - C_{44} = 0$) and equate vacancy formation energy to cohesive energy, both of which are incorrect for transition metals.</li>
<li><strong>Practical Utility</strong>: There was a need for a clear guide on how to construct and apply these potentials for large-scale simulations ($10^6+$ atoms) of fracture and defects.</li>
</ul>
<h2 id="novelty-a-unified-framework-and-robust-fitting-recipe">Novelty: A Unified Framework and Robust Fitting Recipe</h2>
<p>As a review chapter, the novelty lies in the synthesis and the specific, reproducible recipe for potential construction. Central to this synthesis is the core EAM energy functional:</p>
<p>$$E_{\text{tot}} = \sum_i \left( F(\bar{\rho}_i) + \frac{1}{2} \sum_{j \neq i} \phi(r_{ij}) \right)$$</p>
<p>where the total energy $E_{\text{tot}}$ depends on embedding an atom $i$ into a local background electron density $\bar{\rho}_i = \sum_{j \neq i} \rho(r_{ij})$, plus a repulsive pair interaction $\phi(r_{ij})$.</p>
<ul>
<li><strong>Unified Framework</strong>: It explicitly maps the &ldquo;Second Moment Approximation&rdquo; (Tight Binding) and the &ldquo;Glue Model&rdquo; onto the fundamental EAM framework above, clarifying that they differ primarily in terminology or specific functional choices (e.g., square root embedding functions).</li>
<li><strong>Cross-Potential Fitting Recipe</strong>: It details a robust method for fitting alloy potentials (specifically Ni-Al-B) by using &ldquo;transformation invariance&rdquo;, scaling the density and shifting the embedding function to fit alloy properties without disturbing pure element fits.</li>
<li><strong>Specific Parameters</strong>: It publishes optimized potential parameters for Ni, Al, and B that accurately reproduce properties like the Boron interstitial preference in $\text{Ni}_3\text{Al}$.</li>
</ul>
<h2 id="validation-computational-benchmarks-and-simulations">Validation: Computational Benchmarks and Simulations</h2>
<p>The &ldquo;experiments&rdquo; described are computational validations and simulations using the fitted Ni-Al-B potential:</p>
<ol>
<li>
<p><strong>Potential Fitting</strong>:</p>
<ul>
<li>Pure elements (Ni, Al) were fitted to elastic constants, vacancy formation energies, and diatomic data. The Ni fit achieved $\chi_{\text{rms}} = 0.75\%$ and the Al fit $\chi_{\text{rms}} = 3.85\%$.</li>
<li>Boron was fitted using hypothetical crystal structures (fcc, bcc) calculated via LMTO (Linear Muffin-Tin Orbital) since experimental data for fcc B does not exist.</li>
</ul>
</li>
<li>
<p><strong>Molecular Statics (Validation)</strong>:</p>
<ul>
<li><strong>Surface Relaxation</strong>: Demonstrated that EAM captures the oscillatory relaxation of atomic layers near a free surface, a many-body effect that pair potentials fail to capture.</li>
<li><strong>Defect Energetics</strong>: Calculated formation energies for Boron interstitials in $\text{Ni}_3\text{Al}$. Found the 6Ni-octahedral site is most stable ($-4.59$ eV relative to an isolated B atom and unperturbed crystal), followed by the 4Ni-2Al octahedral site ($-3.65$ eV) and the 3Ni-1Al tetrahedral site ($-2.99$ eV), consistent with channeling experiments.</li>
</ul>
</li>
<li>
<p><strong>Molecular Dynamics (Application)</strong>:</p>
<ul>
<li><strong>Grain Boundary (GB) Cleavage</strong>: Simulated the fracture of a (210) tilt grain boundary in $\text{Ni}_3\text{Al}$ at a strain rate of $5 \times 10^{10}$ s$^{-1}$.</li>
<li><strong>Comparison</strong>: Compared pure $\text{Ni}_3\text{Al}$ boundaries vs. those doped with Boron and substitutional Nickel.</li>
</ul>
</li>
</ol>
<h2 id="key-outcomes-eam-efficiency-and-boron-strengthening">Key Outcomes: EAM Efficiency and Boron Strengthening</h2>
<ul>
<li><strong>EAM Efficiency</strong>: Confirmed that EAM scales linearly with atom count ($N$), requiring only 2-5 times the computational work of pair potentials.</li>
<li><strong>Boron Strengthening Mechanism</strong>: The simulations suggested that Boron segregates to grain boundaries and, specifically when co-segregated with Ni, significantly increases cohesion.
<ul>
<li>The maximum stress for the enriched boundary was approximately 22 GPa, compared to approximately 19 GPa for the clean boundary.</li>
<li>The B-doped boundary required approximately 44% more work to cleave than the undoped boundary.</li>
<li>The fracture mode shifted from cleaving along the GB to failure in the bulk.</li>
</ul>
</li>
<li><strong>Grain Boundary Segregation</strong>: Molecular statics calculations found B interstitial energies at the GB as low as $-6.9$ eV, compared to $-4.59$ eV in the bulk, consistent with experimental observations of boron segregation to grain boundaries.</li>
<li><strong>Limitations</strong>: The author concludes that while EAM is excellent for metals, it lacks the angular dependence required for strongly covalent materials (like $\text{MoSi}_2$) or directional bonding.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<p>The chapter provides nearly all details required to implement the described potential from scratch.</p>
<h3 id="data">Data</h3>
<ul>
<li><strong>Experimental/Reference Data</strong>: Used for fitting the cost function $\chi_{\text{rms}}$.
<ul>
<li><strong>Pure Elements</strong>: Lattice constants ($a_0$), cohesive energy ($E_{\text{coh}}$), bulk modulus ($B$), elastic constants ($C_{11}, C_{12}, C_{44}$), vacancy formation energy ($E_{\text{vac}}^f$), and diatomic bond length/strength ($R_e, D_e$).</li>
<li><strong>Alloys</strong>: Heat of solution and defect energies (APB, SISF) for $\text{Ni}_3\text{Al}$.</li>
<li><strong>Hypothetical Data</strong>: LMTO first-principles data used for unobserved phases (e.g., fcc Boron, B2 NiB) to constrain the fit.</li>
</ul>
</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Component Functions</strong>:
<ul>
<li><strong>Pair Potential $\phi(r)$</strong>: Morse potential form:
$$\phi(r) = D_M \left\{1 - \exp[-\alpha_M(r - R_M)]\right\}^2 - D_M$$</li>
<li><strong>Density Function $\rho(r)$</strong>: Modified hydrogenic 4s orbital:
$$\rho(r) = r^6(e^{-\beta r} + 2^9 e^{-2\beta r})$$</li>
<li><strong>Embedding Function $F(\bar{\rho})$</strong>: Derived numerically to force the crystal energy to match the &ldquo;Universal Energy Relation&rdquo; (Rose et al.) as a function of lattice constant.</li>
</ul>
</li>
<li><strong>Fitting Strategy</strong>:
<ul>
<li><strong>Smooth Cutoff</strong>: A polynomial smoothing function ($h_{\text{smooth}}$) applied at $r_{\text{cut}}$ to ensure continuous derivatives.</li>
<li><strong>Simplex Algorithm</strong>: Used to optimize parameters ($D_M, R_M, \alpha_M, \beta, r_{\text{cut}}$).</li>
<li><strong>Alloy Invariance</strong>: Used transformations $F'(\rho) = F(\rho) + g\rho$ and $\rho'(r) = s\rho(r)$ to fit cross-potentials without altering pure-element properties.</li>
</ul>
</li>
</ul>
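<p>Taken together, the pieces above assemble into a complete energy model. The sketch below is my own illustration, not the chapter&rsquo;s code: it implements the Morse pair term and the hydrogenic density, and takes the embedding function $F$ as a callable, since the chapter derives $F$ numerically from the Universal Energy Relation. The $R_M$ and $\beta$ values one would pass in are placeholders here, not the fitted parameters of Tables 2 and 5.</p>

```python
import numpy as np

def morse_pair(r, D_M, alpha_M, R_M):
    """Morse pair potential: phi(r) = D_M {1 - exp[-alpha_M (r - R_M)]}^2 - D_M."""
    return D_M * (1.0 - np.exp(-alpha_M * (r - R_M))) ** 2 - D_M

def density(r, beta):
    """Modified hydrogenic 4s density: rho(r) = r^6 (e^{-beta r} + 2^9 e^{-2 beta r})."""
    return r ** 6 * (np.exp(-beta * r) + 2 ** 9 * np.exp(-2.0 * beta * r))

def eam_energy(positions, D_M, alpha_M, R_M, beta, F):
    """Total EAM energy: pairwise Morse terms plus embedding of summed densities."""
    n = len(positions)
    e_pair, rho_bar = 0.0, np.zeros(n)
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(positions[i] - positions[j])
            e_pair += morse_pair(r, D_M, alpha_M, R_M)
            rho_bar[i] += density(r, beta)
            rho_bar[j] += density(r, beta)
    return e_pair + sum(F(rb) for rb in rho_bar)
```

<p>With the embedding term switched off, a dimer at the Morse minimum separation recovers an energy of $-D_M$, a quick sanity check on the pair term. A production implementation would also apply the polynomial smoothing $h_{\text{smooth}}$ at $r_{\text{cut}}$.</p>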
<h3 id="models">Models</h3>
<ul>
<li><strong>Parameters</strong>: The text provides the exact optimized parameters for the Ni-Al-B potential in <strong>Table 2</strong> (Pure elements) and <strong>Table 5</strong> (Cross-potentials).
<ul>
<li>Example Ni parameters: $D_M=1.5335$ eV, $\alpha_M=1.7728$ Å$^{-1}$, $r_{\text{cut}}=4.7895$ Å.</li>
</ul>
</li>
</ul>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>1994 Context</strong>: Mentions that simulations of $10^6$ atoms were possible on the &ldquo;fastest computers available&rdquo;.</li>
<li><strong>Scaling</strong>: Explicitly notes computational work scales as $O(N)$, roughly 2-5x the cost of pair potentials.</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Voter, A. F. (1994). Chapter 4: The Embedded-Atom Method. In <em>Intermetallic Compounds: Vol. 1, Principles</em>, edited by J. H. Westbrook and R. L. Fleischer. John Wiley &amp; Sons Ltd.</p>
<p><strong>Publication</strong>: Intermetallic Compounds: Vol. 1, Principles (1994)</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@incollection</span>{voterEmbeddedAtomMethod1994,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The Embedded-Atom Method}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Voter, Arthur F.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{Intermetallic Compounds: Vol. 1, Principles}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">editor</span> = <span style="color:#e6db74">{Westbrook, J. H. and Fleischer, R. L.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1994}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{John Wiley &amp; Sons Ltd}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{77--90}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">chapter</span> = <span style="color:#e6db74">{4}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://www.ctcms.nist.gov/potentials/">NIST Interatomic Potentials Repository</a> (Modern repository often hosting EAM files)</li>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method/">Original EAM Paper (1984)</a></li>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method-review-1993/">EAM Review (1993)</a></li>
</ul>
]]></content:encoded></item><item><title>Dynamical Corrections to TST for Surface Diffusion</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/self-diffusion-lj-fcc111-1989/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/self-diffusion-lj-fcc111-1989/</guid><description>Application of dynamical corrections formalism to TST for LJ surface diffusion, revealing bounce-back recrossings at low T.</description><content:encoded><![CDATA[<h2 id="bridging-md-and-tst-for-surface-diffusion">Bridging MD and TST for Surface Diffusion</h2>
<p>This is primarily a <strong>Methodological Paper</strong> with a secondary contribution in <strong>Discovery</strong>.</p>
<p>The authors&rsquo; primary goal is to demonstrate the validity of the &ldquo;dynamical corrections formalism&rdquo; for calculating diffusion constants. They validate this by reproducing Molecular Dynamics (MD) results at high temperatures and then extending the method into low-temperature regimes where MD is infeasible.</p>
<p>By applying this method, they uncover a specific physical phenomenon, &ldquo;bounce-back recrossings&rdquo;, that causes a dip in the diffusion coefficient at low temperatures, a detail previously unobserved.</p>
<h2 id="timescale-limits-in-molecular-dynamics">Timescale Limits in Molecular Dynamics</h2>
<p>The authors aim to solve the timescale problem in simulating surface diffusion.</p>
<p><strong>Limit of MD</strong>: Molecular Dynamics (MD) is effective at high temperatures but becomes computationally infeasible at low temperatures because the time between diffusive hops increases drastically.</p>
<p><strong>Limit of TST</strong>: Standard Transition State Theory (TST) can handle long timescales but assumes all barrier crossings are successful, ignoring correlated dynamical events like immediate recrossings or multiple jumps.</p>
<p><strong>Goal</strong>: They seek to apply a formalism that corrects TST using short-time trajectory data, allowing for accurate calculation of diffusion constants across the entire temperature range.</p>
<h2 id="the-bounce-back-mechanism">The Bounce-Back Mechanism</h2>
<p>The core novelty is the rigorous application of the dynamical corrections formalism to a multi-site system (fcc/hcp sites) to characterize non-Arrhenius behavior at low temperatures.</p>
<p><strong>Unified Approach</strong>: They demonstrate that this method works for all temperatures, bridging the gap between the &ldquo;rare-event regime&rdquo; and the high-temperature regime dominated by fluid-like motion.</p>
<p><strong>Bounce-back Mechanism</strong>: They identify a specific &ldquo;dip&rdquo; in the dynamical correction factor ($f_d &lt; 1$) at low temperatures ($T \approx 0.038$), attributed to trajectories where the adatom collides with a substrate atom on the far side of the binding site and immediately recrosses the dividing surface.</p>
<h2 id="simulating-the-lennard-jones-fcc111-surface">Simulating the Lennard-Jones fcc(111) Surface</h2>
<p>The authors performed computational experiments on a Lennard-Jones fcc(111) surface cluster.</p>
<p><strong>System Setup</strong>: A single adatom on a 3-layer substrate (30 atoms/layer) with periodic boundary conditions.</p>
<p><strong>Baselines</strong>: They compared their high-temperature results against standard Molecular Dynamics simulations to validate the method.</p>
<p><strong>Ablation of Substrate Freedom</strong>: They ran a control experiment with a 6-layer substrate (top 3 free, 800 trajectories) to confirm the bounce-back effect persisted independently of the fixed deep layers, obtaining $D/D^{TST} = 0.75 \pm 0.06$, consistent with the original result.</p>
<p><strong>Trajectory Analysis</strong>: They analyzed the angular distribution of initial momenta to characterize the specific geometry of the bounce-back trajectories. Bounce-back trajectories were more strongly peaked at $\phi = 90°$ (perpendicular to the TST gate), confirming the effect arises from interaction with the substrate atom directly across the binding site.</p>
<p><strong>Temperature Range</strong>: The full calculation spanned $0.013 \leq T \leq 0.383$ in reduced units, bridging the rare-event regime and the high-temperature fluid-like regime.</p>
<h2 id="resolving-non-arrhenius-behavior">Resolving Non-Arrhenius Behavior</h2>
<p><strong>Arrhenius Behavior of TST</strong>: The uncorrected TST diffusion constant ($D^{TST}$) followed a near-perfect Arrhenius law, with a linear least-squares fit of $\ln(D^{TST}) = -1.8 - 0.30/T$.</p>
<p><strong>High-Temperature Correction</strong>: At high T, the dynamical correction factor $D/D^{TST} &gt; 1$, indicating correlated multiple forward jumps (long flights).</p>
<p><strong>Low-Temperature Dip</strong>: At low T, $D/D^{TST} &lt; 1$ for $T = 0.013, 0.026, 0.038, 0.051$ (minimum at $T = 0.038$), caused by the bounce-back mechanism.</p>
<p><strong>Validation</strong>: The method successfully reproduced high-T literature values while providing access to low-T dynamics inaccessible to direct MD.</p>
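<p>The reported fit and correction factor combine into a one-line recipe for $D$ at any temperature; a sketch in reduced units (the prefactor and slope are the paper&rsquo;s fitted values, the function names are mine):</p>

```python
import math

def d_tst(T, ln_prefactor=-1.8, activation=0.30):
    """Arrhenius fit to the TST diffusion constant (reduced units):
    ln(D^TST) = -1.8 - 0.30 / T, the linear fit reported in the paper."""
    return math.exp(ln_prefactor - activation / T)

def corrected_d(T, f_d):
    """Full diffusion constant: D = D^TST * (D / D^TST)."""
    return d_tst(T) * f_d

# Example: at T = 0.038, the correction factor is 0.82 +/- 0.04, an 18%
# reduction of the pure TST estimate due to bounce-back recrossings.
D_low = corrected_d(0.038, 0.82)
```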
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The paper does not use external datasets but generates simulation data based on the Lennard-Jones potential.</p>
<table>
  <thead>
      <tr>
          <th>Type</th>
          <th>Parameter</th>
          <th>Value</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Potential</strong></td>
          <td>$\epsilon, \sigma$</td>
          <td>1.0 (Reduced units)</td>
          <td>Standard Lennard-Jones 6-12</td>
      </tr>
      <tr>
          <td><strong>Cutoff</strong></td>
          <td>Spline</td>
          <td>$r_1=1.5\sigma, r_2=2.5\sigma$</td>
          <td>5th-order spline smooths potential to 0 at $r_2$</td>
      </tr>
      <tr>
          <td><strong>Geometry</strong></td>
          <td>Lattice Constant</td>
          <td>$a_0 = 1.549$</td>
          <td>Minimum energy for this potential</td>
      </tr>
      <tr>
          <td><strong>Cluster</strong></td>
          <td>Size</td>
          <td>3 layers, 30 atoms/layer</td>
          <td>Periodic boundary conditions parallel to surface</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>The diffusion constant $D$ is calculated as $D = D^{TST} \times (D/D^{TST})$.</p>
<p><strong>1. TST Rate Calculation ($D^{TST}$)</strong></p>
<ul>
<li><strong>Method</strong>: Monte Carlo integration of the flux through the dividing surface.</li>
<li><strong>Technique</strong>: Calculate free energy difference between the entire binding site and the TST dividing region.</li>
<li><strong>Dividing Surface</strong>: Defined geometrically with respect to equilibrium substrate positions (honeycomb boundaries around fcc/hcp sites).</li>
</ul>
<p><strong>2. Dynamical Correction Factor ($D/D^{TST}$)</strong></p>
<p>The method relies on evaluating the dynamical correction factor $f_d$, averaged over an ensemble of $N$ short trajectories launched from the dividing surface:</p>
<p>$$
\begin{aligned}
f_d(i\rightarrow j) = \frac{2}{N}\sum_{I=1}^{N}\eta_{ij}(I)
\end{aligned}
$$</p>
<ul>
<li><strong>Initialization</strong>:
<ul>
<li><strong>Position</strong>: Sampled via Metropolis walk restricted to the TST boundary region.</li>
<li><strong>Momentum</strong>: Maxwellian distribution for parallel components; Maxwellian-flux distribution for normal component.</li>
<li><strong>Symmetry</strong>: Trajectories entering hcp sites are generated by reversing momenta of those entering fcc sites.</li>
</ul>
</li>
<li><strong>Integration</strong>:
<ul>
<li><strong>Integrator</strong>: Adams-Bashforth-Moulton predictor-corrector formulas of orders 1 through 12.</li>
<li><strong>Duration</strong>: Integrated until time $t &gt; \tau_{corr}$ (approximately $\tau_{corr} \approx 13$ reduced time units).</li>
<li><strong>Sample Size</strong>: 1400 trajectories per temperature point (700 initially entering each type of site).</li>
</ul>
</li>
</ul>
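<p>The estimator itself is just a trajectory average; a schematic version of the expression above (names are mine; the $\eta_{ij}(I)$ values would come from the trajectory bookkeeping described in the Integration step):</p>

```python
import numpy as np

def dynamical_correction(eta):
    """f_d(i -> j) = (2 / N) * sum_I eta_ij(I), following the expression above."""
    eta = np.asarray(eta, dtype=float)
    return 2.0 * eta.sum() / eta.size

def with_error_bar(eta):
    """f_d together with a one-sigma statistical error over the N samples."""
    eta = np.asarray(eta, dtype=float)
    return 2.0 * eta.mean(), 2.0 * eta.std(ddof=1) / np.sqrt(eta.size)
```

<p>For example, if exactly half of the 1400 trajectories contribute $\eta_{ij} = 1$ and the rest 0, the estimator returns $f_d = 1$, i.e. no correction to TST.</p>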
<h3 id="models">Models</h3>
<ul>
<li><strong>System</strong>: Single component Lennard-Jones solid (Argon-like).</li>
<li><strong>Adsorbate</strong>: Single adatom on fcc(111) surface.</li>
<li><strong>Substrate Flexibility</strong>: Adatom plus top layer atoms are free to move. Layers 2 and 3 are fixed. (Validation run used 6 layers with top 3 free).</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<p>The primary metric is the Diffusion Constant $D$, analyzed via the Dynamical Correction Factor.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Value</th>
          <th>Baseline</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Slope ($E_a$)</strong></td>
          <td>0.30</td>
          <td>0.303 fcc / 0.316 hcp (Newton-Raphson)</td>
          <td>TST slope in good agreement with static barrier height.</td>
      </tr>
      <tr>
          <td><strong>$D/D^{TST}$ (Low T)</strong></td>
          <td>$0.82 \pm 0.04$</td>
          <td>1.0 (TST)</td>
          <td>At $T=0.038$. Indicates 18% reduction due to recrossing.</td>
      </tr>
      <tr>
          <td><strong>$D/D^{TST}$ (High T)</strong></td>
          <td>$&gt; 1.0$</td>
          <td>MD Literature</td>
          <td>Increases with T due to multiple jumps.</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<p>Specific hardware configurations (e.g., machine architectures, supercomputers) or run times were not specified in the original publication, which is typical for 1989 literature. Modern open-source MD engines (e.g., LAMMPS, ASE) could perform equivalent Lennard-Jones molecular dynamics integrations in negligible time on any consumer workstation.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Cohen, J. M., &amp; Voter, A. F. (1989). Self-diffusion on the Lennard-Jones fcc(111) surface: Effects of temperature on dynamical corrections. <em>The Journal of Chemical Physics</em>, 91(8), 5082-5086. <a href="https://doi.org/10.1063/1.457599">https://doi.org/10.1063/1.457599</a></p>
<p><strong>Publication</strong>: The Journal of Chemical Physics 1989</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{cohenSelfDiffusionLennard1989,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Self-diffusion on the {{Lennard}}-{{Jones}} Fcc(111) Surface: {{Effects}} of Temperature on Dynamical Corrections}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">shorttitle</span> = <span style="color:#e6db74">{Self-diffusion on the {{Lennard}}-{{Jones}} Fcc(111) Surface}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Cohen, J. M. and Voter, A. F.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1989}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = oct,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{The Journal of Chemical Physics}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{91}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{8}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{5082--5086}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{0021-9606, 1089-7690}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1063/1.457599}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">langid</span> = <span style="color:#e6db74">{english}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Correlations in the Motion of Atoms in Liquid Argon</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/correlations-motion-atoms-liquid-argon/</link><pubDate>Sat, 13 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/correlations-motion-atoms-liquid-argon/</guid><description>Rahman's 1964 MD simulation of 864 argon atoms with Lennard-Jones potential revealed the cage effect and validated classical molecular dynamics for liquids.</description><content:encoded><![CDATA[<h2 id="contribution-methodological-validation-of-md">Contribution: Methodological Validation of MD</h2>
<p>This is the archetypal <strong>Method</strong> paper (dominant classification with secondary <strong>Theory</strong> contribution). It establishes the architectural validity of Molecular Dynamics (MD) as a scientific tool. Rahman answers the question: &ldquo;Can a digital computer solving classical difference equations faithfully represent a physical liquid?&rdquo;</p>
<p>The paper utilizes specific rhetorical indicators of a methodological contribution:</p>
<ul>
<li><strong>Algorithmic Explication</strong>: A dedicated Appendix details the predictor-corrector difference equations.</li>
<li><strong>Validation against Ground Truth</strong>: Extensive comparison of calculated diffusion constants and pair-correlation functions against experimental neutron and X-ray scattering data.</li>
<li><strong>Robustness Checks</strong>: Ablation studies on the numerical integration stability (one vs. two corrector cycles).</li>
</ul>
<h2 id="motivation-bridging-neutron-scattering-and-many-body-theory">Motivation: Bridging Neutron Scattering and Many-Body Theory</h2>
<p>In the early 1960s, neutron scattering data provided insights into the dynamic structure of liquids, but theorists lacked concrete models to explain the observed two-body dynamical correlations. Analytic theories were limited by the difficulty of the many-body problem.</p>
<p>Rahman sought to bypass these analytical bottlenecks by assuming that <strong>classical dynamics</strong> with a simple 2-body potential (Lennard-Jones) could sufficiently describe the motion of atoms in liquid argon. The goal was to generate &ldquo;experimental&rdquo; data via simulation to test theoretical models (like the Vineyard convolution approximation) and provide a microscopic understanding of diffusion.</p>
<h2 id="core-innovation-system-stability-and-the-cage-effect">Core Innovation: System Stability and the Cage Effect</h2>
<p>This paper is widely considered the birth of modern molecular dynamics for continuous potentials. Its key novelties include:</p>
<ol>
<li><strong>System Size &amp; Stability</strong>: Successfully simulating 864 particles interacting via a continuous Lennard-Jones potential with stable temperature over the full simulation duration (approximately $10^{-11}$ sec, as confirmed by Table I in the paper).</li>
<li><strong>The &ldquo;Cage Effect&rdquo;</strong>: The discovery that the velocity autocorrelation function becomes negative after a short time:
$$ \langle \mathbf{v}(0) \cdot \mathbf{v}(t) \rangle &lt; 0 \quad \text{for } t &gt; 0.33 \times 10^{-12} \text{ s} $$
This proved that atoms in a liquid &ldquo;rattle&rdquo; against the cage of their nearest neighbors.</li>
<li><strong>Delayed Convolution</strong>: Proposing an improvement to the Vineyard approximation for the distinct Van Hove function $G_d(r,t)$ by introducing a time-delayed convolution to account for the persistence of local structure. Instead of convolving $g(r)$ with $G_s(r,t)$ at the same time $t$, Rahman convolves at a delayed time $t' &lt; t$, using a one-parameter function with $\tau = 1.0 \times 10^{-12}$ sec. This makes $G_d(r,t)$ decay as $t^4$ at short times (instead of $t^2$ in the Vineyard approximation) and as $t$ at long times.</li>
</ol>
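<p>The cage effect is read directly off the velocity autocorrelation function. A minimal modern estimator over a stored trajectory (a convenience sketch, not Rahman&rsquo;s code):</p>

```python
import numpy as np

def velocity_autocorrelation(v, max_lag):
    """Normalized VACF C(t) = <v(0) . v(t)> / <v(0) . v(0)> from a
    velocity trajectory v of shape (steps, atoms, 3), averaged over
    atoms and time origins."""
    steps = v.shape[0]
    c = np.empty(max_lag)
    for lag in range(max_lag):
        dots = np.sum(v[: steps - lag] * v[lag:], axis=-1)  # per-atom dot products
        c[lag] = dots.mean()
    return c / c[0]
```

<p>A negative $C(t)$, as Rahman found beyond $0.33 \times 10^{-12}$ s, means the average atom has reversed direction: the signature of rattling inside the neighbor cage.</p>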
<h2 id="methodology-simulating-864-argon-atoms">Methodology: Simulating 864 Argon Atoms</h2>
<p>Rahman performed a &ldquo;computer experiment&rdquo; (simulation) of <strong>Liquid Argon</strong>:</p>
<ul>
<li><strong>System</strong>: 864 particles in a cubic box of side $L=10.229\sigma$.</li>
<li><strong>Conditions</strong>: Temperature $94.4^\circ$K, Density $1.374 \text{ g cm}^{-3}$.</li>
<li><strong>Interaction</strong>: Lennard-Jones potential, truncated at $R=2.25\sigma$.</li>
<li><strong>Time Step</strong>: $\Delta t = 10^{-14}$ s (780 steps total, covering approximately $7.8 \times 10^{-12}$ s).</li>
<li><strong>Output Analysis</strong>:
<ul>
<li>Radial distribution function $g(r)$.</li>
<li>Mean square displacement $\langle r^2 \rangle$.</li>
<li>Velocity autocorrelation function $\langle v(0)\cdot v(t) \rangle$.</li>
<li>Van Hove space-time correlation functions $G_s(r,t)$ and $G_d(r,t)$.</li>
</ul>
</li>
</ul>
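<p>Of these observables, the pair-distribution function is the most direct structural check against scattering data. A compact histogram estimator for a periodic cubic box (an illustrative reimplementation; function and argument names are mine):</p>

```python
import numpy as np

def pair_distribution(positions, L, nbins=50, r_max=None):
    """g(r) from minimum-image pair distances in a cubic box of side L;
    positions is an (N, 3) array. Normalized against the ideal-gas count."""
    n = len(positions)
    if r_max is None:
        r_max = L / 2.0
    d = positions[:, None, :] - positions[None, :, :]
    d -= L * np.round(d / L)                       # minimum-image convention
    dist = np.sqrt((d ** 2).sum(-1))[np.triu_indices(n, k=1)]
    hist, edges = np.histogram(dist, bins=nbins, range=(0.0, r_max))
    shells = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    rho = n / L ** 3
    g = hist / (shells * rho * n / 2.0)            # ideal-gas normalization
    return 0.5 * (edges[1:] + edges[:-1]), g
```

<p>For an ideal gas this returns $g(r) \approx 1$ everywhere; for Rahman&rsquo;s liquid it would resolve the first-neighbor peak near $3.7$ Å.</p>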
<h2 id="results-validation-and-non-gaussian-diffusion-analysis">Results: Validation and Non-Gaussian Diffusion Analysis</h2>
<ul>
<li><strong>Validation</strong>: The calculated pair-distribution function $g(r)$ agreed well with X-ray scattering data from Eisenstein and Gingrich (at $91.8^\circ$K). The self-diffusion constant $D = 2.43 \times 10^{-5} \text{ cm}^2 \text{ sec}^{-1}$ at $94.4^\circ$K matched the experimental value from Naghizadeh and Rice at $90^\circ$K and the same density ($1.374 \text{ g cm}^{-3}$).</li>
<li><strong>Dynamics</strong>: The velocity autocorrelation has a negative region, contradicting simple exponential decay models (Langevin). Its frequency spectrum $f(\omega)$ shows a broad maximum at $\omega \approx 0.25 (k_BT/\hbar)$, reminiscent of solid-like behavior.</li>
<li><strong>Non-Gaussian Behavior</strong>: The self-diffusion function $G_s(r,t)$ attains its maximum departure from a Gaussian shape at about $t \approx 3.0 \times 10^{-12}$ s (with $\langle r^4 \rangle$ departing from its Gaussian value by about 13%), returning to Gaussian form by $\sim 10^{-11}$ s. At that time, the rms displacement ($3.8$ Angstrom) is close to the first-neighbor distance ($3.7$ Angstrom). This indicates that Fickian diffusion is an asymptotic limit and does not apply at short times.</li>
<li><strong>Fourier Transform Validation</strong>: The Fourier transform of $g(r)$ has peaks at $\kappa\sigma = 6.8$, 12.5, 18.5, 24.8, closely matching the X-ray scattering peaks at $\kappa\sigma = 6.8$, 12.3, 18.4, 24.4.</li>
<li><strong>Temperature Dependence</strong>: A second simulation at $130^\circ$K and $1.16 \text{ g cm}^{-3}$ yielded $D = 5.67 \times 10^{-5} \text{ cm}^2 \text{ sec}^{-1}$, compared to the experimental value of $6.06 \times 10^{-5} \text{ cm}^2 \text{ sec}^{-1}$ from Naghizadeh and Rice at $120^\circ$K and $1.16 \text{ g cm}^{-3}$. The paper notes that both calculated values are lower than experiment by about 20%, and suggests that allowing for a softer repulsive part in the interaction potential might reduce this discrepancy.</li>
<li><strong>Vineyard Approximation</strong>: The standard Vineyard convolution approximation ($G_d \approx g * G_s$) produces a too-rapid decay of $G_d(r,t)$ with time. The delayed convolution, matching pairs of $(t', t)$ in units of $10^{-12}$ sec as (0.2, 0.4), (0.5, 0.8), (1.0, 1.6), (1.5, 2.3), (2.0, 2.9), (2.5, 3.5), provides a substantially better fit.</li>
<li><strong>Conclusion</strong>: Classical N-body dynamics with a truncated pair potential is a sufficient model to reproduce both the structural and dynamical properties of simple liquids.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The simulation uses physical constants for Argon:</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Particle Mass ($M$)</td>
          <td>$39.95 \times 1.6747 \times 10^{-24}$ g</td>
          <td>Mass of Argon atom</td>
      </tr>
      <tr>
          <td>Potential Depth ($\epsilon/k_B$)</td>
          <td>$120^\circ$K</td>
          <td>Lennard-Jones parameter</td>
      </tr>
      <tr>
          <td>Potential Size ($\sigma$)</td>
          <td>$3.4$ Å</td>
          <td>Lennard-Jones parameter</td>
      </tr>
      <tr>
          <td>Cutoff Radius ($R$)</td>
          <td>$2.25\sigma$</td>
          <td>Potential truncated beyond this</td>
      </tr>
      <tr>
          <td>Density ($\rho$)</td>
          <td>$1.374$ g cm$^{-3}$</td>
          <td></td>
      </tr>
      <tr>
          <td>Particle Count ($N$)</td>
          <td>864</td>
          <td></td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>Rahman utilized a <strong>Predictor-Corrector</strong> scheme for solving the second-order differential equations of motion.</p>
<p><strong>Step Size</strong>: $\Delta t = 10^{-14}$ sec.</p>
<p><strong>The Algorithm:</strong></p>
<ol>
<li><strong>Predict</strong> positions $\bar{\xi}$ at $t + \Delta t$ based on previous steps:
$$\bar{\xi}_i^{(n+1)} = \xi_i^{(n-1)} + 2\Delta u \eta_i^{(n)}$$</li>
<li><strong>Calculate Forces</strong> (Accelerations $\alpha$) using predicted positions.</li>
<li><strong>Correct</strong> positions and velocities using the trapezoidal rule:
$$
\begin{aligned}
\eta_i^{(n+1)} &amp;= \eta_i^{(n)} + \frac{1}{2}\Delta u (\alpha_i^{(n+1)} + \alpha_i^{(n)}) \\
\xi_i^{(n+1)} &amp;= \xi_i^{(n)} + \frac{1}{2}\Delta u (\eta_i^{(n+1)} + \eta_i^{(n)})
\end{aligned}
$$</li>
</ol>
<p><em>Note: The paper compared one vs. two repetitions of the corrector step, finding that two passes improved precision slightly. The results presented in the paper were obtained using two passes.</em></p>
<h3 id="models">Models</h3>
<p><strong>Interaction Potential</strong>: Lennard-Jones 12-6
$$V(r_{ij}) = 4\epsilon \left[ \left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^6 \right]$$</p>
<p><strong>Boundary Conditions</strong>: Periodic Boundary Conditions (PBC) in 3 dimensions. When a particle moves out of the box ($x &gt; L$), it re-enters at $x - L$.</p>
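<p>A minimal rendering of the interaction model; the minimum-image wrap is the standard way to realize the stated periodic re-entry when computing pair distances (an assumption of this sketch, since the paper states only the re-entry rule):</p>

```python
import numpy as np

def lj(r, eps=1.0, sigma=1.0, r_cut=2.25):
    """Lennard-Jones 12-6 in reduced units, truncated at R = 2.25 sigma."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma / r) ** 6
    return np.where(r < r_cut * sigma, 4.0 * eps * (sr6 ** 2 - sr6), 0.0)

def minimum_image(dx, L):
    """Wrap a coordinate difference to the nearest periodic image, so a
    particle leaving at x > L interacts as if re-entered at x - L."""
    return dx - L * np.round(dx / L)
```

<p>At the potential minimum $r = 2^{1/6}\sigma$ the energy is $-\epsilon$, and the abrupt cutoff at $2.25\sigma$ matches the truncation Rahman used (later practice often shifts or splines the cutoff instead).</p>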
<h3 id="hardware">Hardware</h3>
<p>This is a historical benchmark for computational capability in 1964:</p>
<table>
  <thead>
      <tr>
          <th>Resource</th>
          <th>Specification</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Computer</strong></td>
          <td>CDC 3600</td>
          <td>Control Data Corporation mainframe</td>
      </tr>
      <tr>
          <td><strong>Compute Time</strong></td>
          <td>45 seconds / cycle</td>
          <td>Per predictor-corrector cycle for 864 particles (floating point)</td>
      </tr>
      <tr>
          <td><strong>Language</strong></td>
          <td>FORTRAN + Machine Language</td>
          <td>Machine language used for the most time-consuming parts</td>
      </tr>
  </tbody>
</table>
<p><em>Modern Context: Rahman&rsquo;s system (864 Argon atoms, LJ-potential) is highly reproducible today and serves as a classic pedagogical exercise. It can be simulated in standard MD frameworks (LAMMPS, OpenMM) in fractions of a second on consumer hardware.</em></p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Rahman, A. (1964). Correlations in the Motion of Atoms in Liquid Argon. <em>Physical Review</em>, 136(2A), A405-A411. <a href="https://doi.org/10.1103/PhysRev.136.A405">https://doi.org/10.1103/PhysRev.136.A405</a></p>
<p><strong>Publication</strong>: Physical Review 1964</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{rahman1964correlations,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Correlations in the motion of atoms in liquid argon}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Rahman, A.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Physical Review}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{136}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{2A}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{A405--A411}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1964}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{APS}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1103/PhysRev.136.A405}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Aneesur_Rahman">Aneesur Rahman - Wikipedia</a></li>
</ul>
]]></content:encoded></item><item><title>Adatom Dimer Diffusion on fcc(111) Crystal Surfaces</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/diffusion-adatom-dimers-1984/</link><pubDate>Sat, 13 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/diffusion-adatom-dimers-1984/</guid><description>A 1984 molecular dynamics study identifying simultaneous multiple jumps in adatom dimer diffusion on fcc(111) surfaces.</description><content:encoded><![CDATA[<h2 id="classification-discovery-of-diffusion-mechanisms">Classification: Discovery of Diffusion Mechanisms</h2>
<p><strong>Discovery (Translational Basis)</strong></p>
<p>This paper applies a computational method (Molecular Dynamics) to observe and characterize a physical phenomenon: the specific diffusion mechanisms of adatom dimers on a crystal surface. It focuses on the &ldquo;what was found&rdquo; (simultaneous multiple jumps).</p>
<p>Based on the <a href="/notes/interdisciplinary/research-methods/ai-physical-sciences-paper-taxonomy/">AI for Physical Sciences Paper Taxonomy</a>, this is best classified as $\Psi_{\text{Discovery}}$ with a minor superposition of $\Psi_{\text{Method}}$ (approximately 80% Discovery, 20% Method). The dominant contribution is the application of computational tools to observe physical phenomena, while secondarily demonstrating MD&rsquo;s capability for surface diffusion problems in an era when the technique was still developing.</p>
<h2 id="bridging-the-intermediate-temperature-data-gap">Bridging the Intermediate Temperature Data Gap</h2>
<p>The study aims to investigate the behavior of adatom dimers in an <strong>intermediate temperature range</strong> ($0.3T_m$ to $0.6T_m$). At the time, Field Ion Microscopy (FIM) provided data at low temperatures ($T \le 0.2T_m$), and previous simulations had studied single adatoms on various surfaces including (111), (110), and (100), but not dimers on (111). The authors sought to compare dimer mobility with single adatom mobility on the (111) surface, where single adatoms move almost like free particles.</p>
<h2 id="observation-of-simultaneous-multiple-jumps">Observation of Simultaneous Multiple Jumps</h2>
<p>The core contribution is the observation of <strong>simultaneous multiple jumps</strong> for dimers on the (111) surface at intermediate temperatures. The study reveals that:</p>
<ol>
<li>Dimers migrate as a single entity, with both atoms jumping simultaneously.</li>
<li>The mobility of the dimer center of mass is very close to that of single adatoms in this regime.</li>
</ol>
<h2 id="molecular-dynamics-simulation-design">Molecular Dynamics Simulation Design</h2>
<p>The authors performed <strong>Molecular Dynamics (MD) simulations</strong> of a face-centred cubic (fcc) crystallite:</p>
<ul>
<li><strong>System</strong>: A single crystallite of 192 atoms bounded by two free (111) surfaces</li>
<li><strong>Temperature Range</strong>: $0.22 \epsilon/k$ to $0.40 \epsilon/k$ (approximately $0.3T_m$ to $0.6T_m$)</li>
<li><strong>Duration</strong>: Integration over 50,000 time steps</li>
<li><strong>Comparison</strong>: Results were compared against single adatom diffusion data and Einstein&rsquo;s diffusion relation</li>
</ul>
<h2 id="outcomes-on-mobility-and-migration-dynamics">Outcomes on Mobility and Migration Dynamics</h2>
<ul>
<li><strong>Mechanism Transition</strong>: At low temperatures ($T^\ast=0.22$), diffusion occurs via discrete single jumps where adatoms rotate or extend bonds. At higher temperatures, the &ldquo;multiple jump&rdquo; mechanism becomes preponderant.</li>
<li><strong>Migration Style</strong>: The dimer migrates essentially by extending its bond along the $\langle 110 \rangle$ direction.</li>
<li><strong>Mobility</strong>: The diffusion coefficient of dimers is quantitatively similar to single adatoms.</li>
<li><strong>Qualitative Support</strong>: The results support Bonzel&rsquo;s hypothesis of delocalized diffusion involving energy transfer between translation and rotation. The authors attempted to quantify the coupling using the cross-correlation function:</li>
</ul>
<p>$$g(t') = C \langle E_T(t) \, E_R(t + t') \rangle$$</p>
<p>where $C$ is a normalization constant, $E_T$ is the translational energy of the center of mass, and $E_R$ is the rotational energy of the dimer. However, the average lifetime of a dimer (2% to 15% of the total calculation time in the studied temperature range) was too short to allow a statistically significant study of this coupling.</p>
<ul>
<li><strong>Dimer Concentration</strong>: The contribution of dimers to mass transport depends on their concentration. As a first approximation, the dimer concentration is expressed as:</li>
</ul>
<p>$$C = C_0 \exp\left[-\frac{2E_f - E_d}{k_B T}\right]$$</p>
<p>where $E_f$ is the formation energy of adatoms and $E_d$ is the binding energy of a dimer. If the binding energy is sufficiently strong, dimer contributions should be accounted for even in the intermediate temperature range ($0.3T_m$ to $0.6T_m$).</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data-simulation-setup">Data (Simulation Setup)</h3>
<p>Because this is an early computational study, &ldquo;data&rdquo; refers to the initial structural configuration. The simulation begins with an algorithmically generated generic fcc(111) lattice containing two adatoms as the initial state.</p>















<figure class="post-figure center ">
    <img src="/img/notes/chemistry/argon-dimer-diffusion.webp"
         alt="Visualization of argon dimer on fcc(111) surface"
         title="Visualization of argon dimer on fcc(111) surface"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Initial configuration showing an adatom dimer (two adatoms on neighboring sites) on an fcc(111) surface. The crystallite consists of 192 atoms with periodic boundary conditions in the x and y directions.</figcaption>
    
</figure>

<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Particles</strong></td>
          <td>192 atoms</td>
          <td>Single fcc crystallite</td>
      </tr>
      <tr>
          <td><strong>Dimensions</strong></td>
          <td>$4[110] \times 4[112]$</td>
          <td>Thickness of 6 planes</td>
      </tr>
      <tr>
          <td><strong>Boundary</strong></td>
          <td>Periodic (x, y)</td>
          <td>Free surface in z-direction</td>
      </tr>
      <tr>
          <td><strong>Initial State</strong></td>
          <td>Dimer on neighbor sites</td>
          <td>Starts with 2 adatoms</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>The simulation relies on standard Molecular Dynamics integration techniques. The original source code is not available, but the setup is fully reproducible today with modern open-source tools such as LAMMPS, using the standard <code>lj/cut</code> pair style and an NVE/NVT ensemble.</p>
<ul>
<li><strong>Integration Scheme</strong>: Central difference algorithm (Verlet algorithm)</li>
<li><strong>Time Step</strong>: $\Delta t^\ast = 0.01$ (reduced units)</li>
<li><strong>Total Steps</strong>: 50,000 integration steps</li>
<li><strong>Dimer Definition</strong>: Two adatoms are considered a dimer if their distance $r \le r_c = 2\sigma$</li>
</ul>
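<p>The central-difference (Verlet) scheme listed above advances positions as $x_{n+1} = 2x_n - x_{n-1} + a_n \Delta t^2$. A minimal one-dimensional sketch in reduced units, exercised here on a harmonic oscillator rather than the paper's Lennard-Jones crystallite:</p>

```python
import numpy as np

def verlet_trajectory(force, x0, v0, dt, n_steps, mass=1.0):
    """Central-difference (Verlet) integration: x_{n+1} = 2 x_n - x_{n-1} + a_n dt^2."""
    xs = np.empty(n_steps + 1)
    xs[0] = x0
    # Bootstrap the first step with a second-order Taylor expansion.
    xs[1] = x0 + v0 * dt + 0.5 * force(x0) / mass * dt**2
    for n in range(1, n_steps):
        xs[n + 1] = 2 * xs[n] - xs[n - 1] + force(xs[n]) / mass * dt**2
    return xs
```

<p>With $\Delta t^\ast = 0.01$, the paper's 50,000-step run corresponds to 500 reduced time units.</p>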
<h3 id="models-analytic-potential">Models (Analytic Potential)</h3>
<p>The physics is modeled with a classic Lennard-Jones potential.</p>
<p><strong>Potential Form</strong>: (12, 6) Lennard-Jones
$$ V(r) = 4\epsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^6 \right] $$</p>
<p><strong>Parameters (Argon-like)</strong>:</p>
<ul>
<li>$\epsilon/k = 119.5$ K</li>
<li>$\sigma = 3.4478$ Å</li>
<li>$m = 39.948$ amu</li>
<li>Cut-off radius: $2\sigma$</li>
</ul>
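<p>A minimal sketch of this truncated (12,6) potential in reduced units (plain numpy; not the original 1984 code):</p>

```python
import numpy as np

# Paper's argon parameters: epsilon/k_B = 119.5 K, sigma = 3.4478 Angstrom,
# cut-off at r_cut = 2 sigma. In reduced units, eps = sigma = 1 and r_cut = 2.

def lj_energy(r, eps=1.0, sigma=1.0, r_cut=2.0):
    """Truncated (12,6) Lennard-Jones energy; zero beyond the cut-off."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma / r) ** 6
    v = 4.0 * eps * (sr6**2 - sr6)
    return np.where(r < r_cut, v, 0.0)

def lj_force(r, eps=1.0, sigma=1.0, r_cut=2.0):
    """Radial force -dV/dr (positive values are repulsive)."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma / r) ** 6
    f = 24.0 * eps * (2.0 * sr6**2 - sr6) / r
    return np.where(r < r_cut, f, 0.0)
```

<p>In reduced units the minimum sits at $r = 2^{1/6}\sigma$ with depth $-\epsilon$, and both energy and force are set to zero beyond the $2\sigma$ cut-off.</p>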
<h3 id="evaluation">Evaluation</h3>
<p>Metrics used to quantify the diffusion behavior:</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Formula</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Diffusion Coefficient</strong></td>
          <td>$D = \frac{\langle R^2 \rangle}{4t}$</td>
          <td>Calculated from Mean Square Displacement of center of mass</td>
      </tr>
      <tr>
          <td><strong>Trajectory Analysis</strong></td>
          <td>Visual inspection</td>
          <td>Categorized into &ldquo;fast migration&rdquo; (multiple jumps) or &ldquo;discrete jumps&rdquo;</td>
      </tr>
  </tbody>
</table>
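<p>The diffusion coefficient in the table follows from the mean-square displacement of the center of mass; a sketch of the estimator (a simple single-origin version, not the windowed-origin averaging used in production MD analysis):</p>

```python
import numpy as np

def diffusion_coefficient_2d(positions, dt):
    """Estimate D = <R^2> / (4 t) from a 2D center-of-mass trajectory.

    positions: array of shape (n_steps, 2); displacements are measured
    from the initial position only (a deliberately simple estimator).
    """
    disp = positions - positions[0]
    msd = (disp**2).sum(axis=1)
    t = np.arange(len(positions)) * dt
    # Average of the per-sample <R^2>/t values, then divide by 4 (2D).
    return (msd[1:] / t[1:]).mean() / 4.0
```
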
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Specifics</strong>: Unspecified in the original text.</li>
<li><strong>Scale</strong>: 192 particles simulated for 50,000 steps is extremely lightweight by modern standards; a typical laptop CPU completes the run in under a second, in stark contrast to the mainframe resources required in 1984.</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Ghaleb, D. (1984). Diffusion of adatom dimers on (111) surface of face centred crystals: A molecular dynamics study. <em>Surface Science</em>, 137(2-3), L103-L108. <a href="https://doi.org/10.1016/0039-6028(84)90515-6">https://doi.org/10.1016/0039-6028(84)90515-6</a></p>
<p><strong>Publication</strong>: Surface Science 1984</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{ghalebDiffusionAdatomDimers1984,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Diffusion of Adatom Dimers on (111) Surface of Face Centred Crystals: A Molecular Dynamics Study}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Ghaleb, Dominique}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1984}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Surface Science}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{137}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{2-3}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{L103-L108}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1016/0039-6028(84)90515-6}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>The Müller-Brown Potential: A 2D Benchmark Surface</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/muller-brown-1979/</link><pubDate>Mon, 08 Sep 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/muller-brown-1979/</guid><description>The Müller-Brown potential is a classic 2D benchmark for testing optimization algorithms and molecular dynamics methods.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>The Müller-Brown potential is a primary benchmark system in computational chemistry: a two-dimensional analytical surface used to evaluate optimization algorithms. Introduced by Klaus Müller and Leo D. Brown in 1979 as a test system for their constrained simplex optimization algorithm, this potential energy function captures the essential topology of chemical reaction landscapes while preserving computational efficiency.</p>
<p><strong>Origin</strong>: Müller, K., &amp; Brown, L. D. (1979). Location of saddle points and minimum energy paths by a constrained simplex optimization procedure. <em>Theoretica Chimica Acta</em>, 53, 75-93. The potential is introduced in footnote 7 (p. 79) as a two-parametric model surface for testing the constrained simplex procedures.</p>
<h2 id="mathematical-definition">Mathematical Definition</h2>
<p>The Müller-Brown potential combines four two-dimensional Gaussian functions:</p>
<p>$$V(x,y) = \sum_{k=1}^{4} A_k \exp\left[a_k(x-x_k^0)^2 + b_k(x-x_k^0)(y-y_k^0) + c_k(y-y_k^0)^2\right]$$</p>
<p>Each Gaussian contributes a different &ldquo;bump&rdquo; or &ldquo;well&rdquo; to the landscape. The parameters control amplitude ($A_k$), width, orientation, and center position.</p>
<h3 id="standard-parameters">Standard Parameters</h3>
<p>The canonical parameter values that define the Müller-Brown surface are:</p>
<table>
  <thead>
      <tr>
          <th>k</th>
          <th>$A_k$</th>
          <th>$a_k$</th>
          <th>$b_k$</th>
          <th>$c_k$</th>
          <th>$x_k^0$</th>
          <th>$y_k^0$</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>1</td>
          <td>-200</td>
          <td>-1</td>
          <td>0</td>
          <td>-10</td>
          <td>1</td>
          <td>0</td>
      </tr>
      <tr>
          <td>2</td>
          <td>-100</td>
          <td>-1</td>
          <td>0</td>
          <td>-10</td>
          <td>0</td>
          <td>0.5</td>
      </tr>
      <tr>
          <td>3</td>
          <td>-170</td>
          <td>-6.5</td>
          <td>11</td>
          <td>-6.5</td>
          <td>-0.5</td>
          <td>1.5</td>
      </tr>
      <tr>
          <td>4</td>
          <td>15</td>
          <td>0.7</td>
          <td>0.6</td>
          <td>0.7</td>
          <td>-1</td>
          <td>1</td>
      </tr>
  </tbody>
</table>
<p>The first three terms have negative amplitudes (creating energy wells), while the fourth has a positive amplitude (creating a barrier). The cross-term $b_k$ in the third Gaussian creates the tilted orientation that gives the surface its characteristic curved pathways.</p>
<h3 id="analytical-gradients-forces">Analytical Gradients (Forces)</h3>
<p>To optimize paths or run molecular dynamics on this surface, the spatial derivatives (whose negatives are the forces) are needed, and they are straightforward to compute. Defining $G_k(x,y)$ as the argument of the $k$-th exponential, the partial derivatives with respect to $x$ and $y$ are:</p>
<p>$$ \frac{\partial V}{\partial x} = \sum_{k=1}^4 A_k \exp[G_k(x,y)] \cdot \left[ 2a_k(x-x_k^0) + b_k(y-y_k^0) \right] $$</p>
<p>$$ \frac{\partial V}{\partial y} = \sum_{k=1}^4 A_k \exp[G_k(x,y)] \cdot \left[ b_k(x-x_k^0) + 2c_k(y-y_k^0) \right] $$</p>
<h2 id="energy-landscape">Energy Landscape</h2>
<p>This simple formula creates a surprisingly rich topography with exactly the features needed to challenge optimization algorithms:</p>
<table>
  <thead>
      <tr>
          <th><strong>Stationary Point</strong></th>
          <th><strong>Coordinates</strong></th>
          <th><strong>Energy</strong></th>
          <th><strong>Type</strong></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>MA (Reactant)</td>
          <td>(-0.558, 1.442)</td>
          <td>-146.70</td>
          <td>Deep minimum</td>
      </tr>
      <tr>
          <td>MC (Intermediate)</td>
          <td>(-0.050, 0.467)</td>
          <td>-80.77</td>
          <td>Shallow minimum</td>
      </tr>
      <tr>
          <td>MB (Product)</td>
          <td>(0.623, 0.028)</td>
          <td>-108.17</td>
          <td>Medium minimum</td>
      </tr>
      <tr>
          <td>S1</td>
          <td>(-0.822, 0.624)</td>
          <td>-40.67</td>
          <td>First saddle point</td>
      </tr>
      <tr>
          <td>S2</td>
          <td>(0.212, 0.293)</td>
          <td>-72.25</td>
          <td>Second saddle point</td>
      </tr>
  </tbody>
</table>
<p>All values from Table 1 of Müller &amp; Brown (1979).</p>















<figure class="post-figure center ">
    <img src="/img/muller-brown/muller-brown-potential-surface.webp"
         alt="Müller-Brown Potential Energy Surface showing the three minima (dark blue regions) and two saddle points"
         title="Müller-Brown Potential Energy Surface showing the three minima (dark blue regions) and two saddle points"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The Müller-Brown potential energy surface showing the three minima (dark blue regions) and two saddle points.</figcaption>
    
</figure>

<h3 id="key-challenge-curved-reaction-pathways">Key Challenge: Curved Reaction Pathways</h3>
<p>The path from the deep reactant minimum (MA) to the product minimum (MB) follows a curved two-step pathway:</p>
<ol>
<li><strong>MA → S1 → MC</strong>: First transition over a lower barrier into an intermediate basin</li>
<li><strong>MC → S2 → MB</strong>: Second transition over a slightly higher barrier to the product</li>
</ol>
<p>This curved pathway breaks linear interpolation methods. Algorithms that draw a straight line from reactant to product miss both the intermediate minimum and the correct transition states, climbing over much higher energy regions instead.</p>
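<p>This failure mode is easy to demonstrate numerically: sampling the potential along the straight segment from MA to MB reaches energies far above either true saddle point ($-40.67$ and $-72.25$). A sketch, with the potential re-defined inline so the snippet is self-contained:</p>

```python
import numpy as np

PARAMS = [(-200.0, -1.0, 0.0, -10.0, 1.0, 0.0),
          (-100.0, -1.0, 0.0, -10.0, 0.0, 0.5),
          (-170.0, -6.5, 11.0, -6.5, -0.5, 1.5),
          (15.0, 0.7, 0.6, 0.7, -1.0, 1.0)]

def mb(x, y):
    return sum(A * np.exp(a * (x - x0)**2 + b * (x - x0) * (y - y0) + c * (y - y0)**2)
               for A, a, b, c, x0, y0 in PARAMS)

# Straight-line "path" from the reactant minimum MA to the product minimum MB.
ma, mb_min = np.array([-0.558, 1.442]), np.array([0.623, 0.028])
s = np.linspace(0.0, 1.0, 201)[:, None]
line = ma + s * (mb_min - ma)
energies = mb(line[:, 0], line[:, 1])

barrier = energies.max() - energies[0]  # apparent barrier from MA along the line
```

<p>The maximum along this line is positive, well above both saddle energies, so the apparent barrier exceeds the true first barrier (S1 minus MA, roughly 106 energy units) by a wide margin.</p>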
<h2 id="why-it-works-as-a-benchmark">Why It Works as a Benchmark</h2>
<p>The Müller-Brown potential has served as a computational chemistry benchmark for over four decades because of four key characteristics:</p>
<p><strong>Low dimensionality</strong>: As a 2D surface, it permits complete visualization of the landscape, clearly revealing why specific algorithms succeed or fail.</p>
<p><strong>Analytical form</strong>: Energy and gradient calculations cost virtually nothing, enabling exhaustive testing impossible with quantum mechanical surfaces.</p>
<p><strong>Non-trivial topology</strong>: The curved minimum energy path and shallow intermediate minimum challenge sophisticated methods while remaining manageable.</p>
<p><strong>Known ground truth</strong>: All minima and saddle points are precisely known, providing unambiguous success metrics.</p>
<h3 id="contrast-with-other-benchmarks">Contrast with Other Benchmarks</h3>
<p>The Müller-Brown potential probes different capabilities than other classic benchmarks. The Lennard-Jones potential, with its single energy minimum, is the standard for equilibrium properties; the Müller-Brown surface, by contrast, explicitly models reactive landscapes. Its multiple minima and connecting barriers form an evaluation environment for algorithms designed to discover transition states and reaction paths.</p>
<h2 id="historical-applications">Historical Applications</h2>
<p>The potential has evolved with the field&rsquo;s changing focus:</p>
<p><strong>1980s-1990s</strong>: Testing path-finding methods like Nudged Elastic Band (NEB), which creates discrete representations of reaction pathways and optimizes them to find minimum energy paths.</p>
<p><strong>2000s-2010s</strong>: Validating Transition Path Sampling (TPS) methods that harvest statistical ensembles of reactive trajectories.</p>
<p><strong>2020s</strong>: Benchmarking machine learning models and generative approaches that learn to sample transition paths or approximate potential energy surfaces.</p>
<h2 id="modern-applications-in-machine-learning">Modern Applications in Machine Learning</h2>
<p>The rise of machine learning has given the Müller-Brown potential renewed purpose. Modern <strong>Machine Learning Interatomic Potentials (MLIPs)</strong> aim to bridge the gap between quantum mechanical accuracy and classical force field efficiency by training flexible models on expensive quantum chemistry data.</p>
<p>The Müller-Brown potential provides an ideal benchmarking solution: an exactly known potential energy surface that can generate unlimited, noise-free training data. This enables researchers to ask fundamental questions:</p>
<ul>
<li>How well does a given architecture learn complex, curved surfaces?</li>
<li>How many training points are needed for acceptable accuracy?</li>
<li>How does the model behave when extrapolating beyond training data?</li>
<li>Can it correctly identify minima and saddle points?</li>
</ul>
<p>The potential serves as a consistent benchmark for measuring the learning capacity of AI models.</p>
<h2 id="extensions-and-variants">Extensions and Variants</h2>
<h3 id="higher-dimensional-extensions">Higher-Dimensional Extensions</h3>
<p>The canonical Müller-Brown potential can be extended beyond two dimensions to create more challenging test cases:</p>
<p><strong>Harmonic constraints</strong>: Add quadratic wells in orthogonal dimensions while preserving the complex 2D landscape:</p>
<p>$$V_{5D}(x_1, x_2, x_3, x_4, x_5) = V(x_1, x_3) + \kappa(x_2^2 + x_4^2 + x_5^2)$$</p>
<p><strong>Collective variables (CVs)</strong>: Collective variables are low-dimensional coordinates that capture the most important degrees of freedom in a high-dimensional system. By defining CVs that mix multiple dimensions, the original surface can be embedded in higher-dimensional spaces. For instance, the active 2D coordinates $x$ and $y$ can be projected as linear combinations of $N$ arbitrary degrees of freedom ($q_i$):</p>
<p>$$ x = \sum_{i=1}^N w_{x,i} q_i \quad \text{and} \quad y = \sum_{i=1}^N w_{y,i} q_i $$</p>
<p>This constructs a complex, high-dimensional problem where an algorithm must learn to isolate the relevant active subspace (the CVs) before it can effectively optimize the topology.</p>
<p>These extensions enable systematic testing of algorithm scaling with dimensionality while maintaining known ground truth in the active subspace.</p>
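<p>Both extensions can be sketched generically for any 2D base surface; the quadratic used in the check below is a stand-in placeholder, not the Müller-Brown function:</p>

```python
import numpy as np

def make_5d(v2d, kappa=1.0):
    """Harmonic extension: V5D(x1..x5) = v2d(x1, x3) + kappa*(x2^2 + x4^2 + x5^2)."""
    def v5d(x1, x2, x3, x4, x5):
        return v2d(x1, x3) + kappa * (x2**2 + x4**2 + x5**2)
    return v5d

def make_cv_embedded(v2d, w_x, w_y):
    """CV embedding: evaluate v2d at x = w_x . q and y = w_y . q for q in R^N."""
    def v_hd(q):
        q = np.asarray(q, dtype=float)
        return v2d(np.dot(w_x, q), np.dot(w_y, q))
    return v_hd
```

<p>An algorithm pointed at <code>v_hd</code> sees an $N$-dimensional problem, yet the ground-truth topology still lives in the known 2D active subspace.</p>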
<h2 id="limitations">Limitations</h2>
<p>Despite its utility, the Müller-Brown potential has fundamental limitations as a proxy for physical systems:</p>
<ul>
<li><strong>Lack of Realistic Scaling</strong>: As a purely mathematical 2D/analytical model, it cannot directly simulate the complexities of high-dimensional scaling found in many-body atomic systems.</li>
<li><strong>No Entropic Effects</strong>: In real chemical systems, entropic contributions heavily influence the free-energy landscape. The Müller-Brown potential maps energy precisely but lacks the thermal/entropic complexity of solvent or macromolecular environments.</li>
<li><strong>Trivial Topology Contrasts</strong>: While non-trivial compared to single wells, its global topology remains simpler than proper ab initio potential energy surfaces, missing features like complex bifurcations, multi-state crossings, or non-adiabatic couplings.</li>
</ul>
<h2 id="implementation-considerations">Implementation Considerations</h2>
<p>Modern implementations typically focus on:</p>
<ul>
<li><strong>Vectorized calculations</strong> for batch processing</li>
<li><strong>Analytical derivatives</strong> for gradient-based methods</li>
<li><strong>JIT compilation</strong> for performance optimization</li>
<li><strong>Automatic differentiation</strong> compatibility for machine learning frameworks</li>
</ul>
<p>The analytical nature of the potential makes it ideal for testing both classical optimization methods and modern machine learning approaches.</p>
<h2 id="resources-and-visualizations">Resources and Visualizations</h2>
<ul>
<li><a href="/muller-brown-optimized">Interactive Müller-Brown Potential Energy Surface</a> - Local visualization tool</li>
<li><a href="https://www.wolframcloud.com/objects/demonstrations/TrajectoriesOnTheMullerBrownPotentialEnergySurface-source.nb">Müller-Brown Potential Visualization (Wolfram)</a> - External Wolfram demonstration</li>
<li><a href="/posts/muller-brown-in-pytorch/">Implementing the Müller-Brown Potential in PyTorch</a> - Detailed implementation guide with performance analysis</li>
</ul>
<h2 id="related-systems">Related Systems</h2>
<p>The Müller-Brown potential belongs to a family of analytical benchmark systems used in computational chemistry. Other notable examples include:</p>
<ul>
<li><strong>Lennard-Jones potential</strong>: Single-minimum benchmark for equilibrium properties</li>
<li><strong>Double-well potentials</strong>: Simple models for bistable systems</li>
<li><strong>Eckart barrier</strong>: One-dimensional tunneling benchmark</li>
<li><strong>Wolfe-Quapp potential</strong>: Higher-dimensional extension with valley-ridge inflection points</li>
</ul>
<h2 id="conclusion">Conclusion</h2>
<p>The Müller-Brown potential demonstrates how a well-designed benchmark can evolve with a field. Born of 1970s computational constraints, when quantum chemistry calculations were prohibitively expensive, it has a topology that defeats naive linear-interpolation approaches while remaining essentially free to evaluate. For this reason, it remains a heavily used benchmark system today.</p>
<p>It serves specific purposes in the machine learning era by providing a controlled environment for developing methods targeted at complex realistic molecular systems. Its evolution from a practical surrogate model to a machine learning benchmark demonstrates the continued relevance of foundational analytical test cases in computational science.</p>
]]></content:encoded></item><item><title>DenoiseVAE: Adaptive Noise for Molecular Pre-training</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/denoise-vae/</link><pubDate>Sun, 24 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/denoise-vae/</guid><description>Liu et al.'s ICLR 2025 paper introducing DenoiseVAE, which learns adaptive, atom-specific noise distributions for better molecular force fields.</description><content:encoded><![CDATA[<h2 id="paper-contribution-type">Paper Contribution Type</h2>
<p>This is a <strong>method paper</strong> with a supporting theoretical component. It introduces a new pre-training framework, DenoiseVAE, that challenges the standard practice of using fixed, hand-crafted noise distributions in denoising-based molecular representation learning.</p>
<h2 id="motivation-the-inter--and-intra-molecular-variations-problem">Motivation: The Inter- and Intra-molecular Variations Problem</h2>
<p>The motivation is to create a more physically principled denoising pre-training task for 3D molecules. The core idea of denoising is to learn molecular force fields by corrupting an equilibrium conformation with noise and then learning to recover it. However, existing methods use a single, hand-crafted noise strategy (e.g., Gaussian noise of a fixed scale) for all atoms across all molecules. This is physically unrealistic for two main reasons:</p>
<ol>
<li><strong>Inter-molecular differences</strong>: Different molecules have unique Potential Energy Surfaces (PES), meaning the space of low-energy (i.e., physically plausible) conformations is highly molecule-specific.</li>
<li><strong>Intra-molecular differences (Anisotropy)</strong>: Within a single molecule, different atoms have different degrees of freedom. For instance, an atom in a rigid functional group can move much less than one connected by a single, rotatable bond.</li>
</ol>
<p>The authors argue that this &ldquo;one-size-fits-all&rdquo; noise approach leads to inaccurate force field learning because it samples many physically improbable conformations.</p>
<h2 id="novelty-a-learnable-atom-specific-noise-generator">Novelty: A Learnable, Atom-Specific Noise Generator</h2>
<p>The core novelty is a framework that learns to generate noise tailored to each specific molecule and atom. This is achieved through three key innovations:</p>
<ol>
<li><strong>Learnable Noise Generator</strong>: The authors introduce a Noise Generator module (a 4-layer Equivariant Graph Neural Network) that takes a molecule&rsquo;s equilibrium conformation $X$ as input and outputs a unique, atom-specific Gaussian noise distribution (i.e., a different variance $\sigma_i^2$ for each atom $i$). This directly addresses the issues of PES specificity and force field anisotropy.</li>
<li><strong>Variational Autoencoder (VAE) Framework</strong>: The Noise Generator (encoder) and a Denoising Module (a 7-layer EGNN decoder) are trained jointly within a VAE paradigm. The noisy conformation is sampled using the reparameterization trick:
$$
\begin{aligned}
\tilde{x}_i &amp;= x_i + \epsilon \sigma_i, \qquad \epsilon \sim \mathcal{N}(0, I)
\end{aligned}
$$</li>
<li><strong>Principled Optimization Objective</strong>: The training loss balances two competing goals:
$$
\begin{aligned}
\mathcal{L}_{DenoiseVAE} &amp;= \mathcal{L}_{Denoise} + \lambda \mathcal{L}_{KL}
\end{aligned}
$$
<ul>
<li>A denoising reconstruction loss ($\mathcal{L}_{Denoise}$) encourages the Noise Generator to produce physically plausible perturbations from which the original conformation can be recovered. This implicitly constrains the noise to respect the molecule&rsquo;s underlying force fields.</li>
<li>A KL divergence regularization term ($\mathcal{L}_{KL}$) pushes the generated noise distributions towards a predefined prior. This prevents the trivial solution of generating zero noise and encourages the model to explore a diverse set of low-energy conformations.</li>
</ul>
</li>
</ol>
<p>The authors also provide a theoretical analysis showing that optimizing their objective is equivalent to maximizing the Evidence Lower Bound (ELBO) on the log-likelihood of observing physically realistic conformations.</p>
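<p>The sampling step and the two-term loss can be sketched in a few lines (a numpy stand-in; the actual model uses EGNN encoder/decoder networks, and the closed-form KL below assumes zero-mean Gaussians for both the generated noise and the prior, which is an assumption of this sketch):</p>

```python
import numpy as np

SIGMA_PRIOR = 0.1  # prior std; the paper's ablations favor 0.1
LAMBDA_KL = 1.0    # KL weight; the paper's ablations favor 1.0

def kl_to_prior(sigma, sigma_prior=SIGMA_PRIOR):
    """Closed-form KL( N(0, sigma^2) || N(0, sigma_prior^2) ) per coordinate."""
    return np.log(sigma_prior / sigma) + sigma**2 / (2 * sigma_prior**2) - 0.5

def denoisevae_loss(x_eq, sigma, predict_noise, rng):
    """Loss on a single molecule.

    x_eq:  (n_atoms, 3) equilibrium coordinates
    sigma: (n_atoms,) atom-specific noise scales from the generator
    predict_noise: stand-in for the EGNN denoiser, mapping noisy
                   coordinates to a per-atom estimate of the added noise
    """
    eps = rng.standard_normal(x_eq.shape)
    x_noisy = x_eq + eps * sigma[:, None]  # reparameterization trick
    l_denoise = np.mean((predict_noise(x_noisy) - eps * sigma[:, None]) ** 2)
    l_kl = np.mean(kl_to_prior(sigma))
    return l_denoise + LAMBDA_KL * l_kl
```

<p>With $\sigma_i = \sigma_{\text{prior}}$ the KL term vanishes, and shrinking all $\sigma_i$ toward zero inflates it, which is the mechanism that blocks the trivial zero-noise solution.</p>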
<h2 id="methodology--experimental-baselines">Methodology &amp; Experimental Baselines</h2>
<p>The model was pretrained on the PCQM4Mv2 dataset (approximately 3.4 million organic molecules) and then evaluated on a comprehensive suite of downstream tasks to test the quality of the learned representations:</p>
<ol>
<li><strong>Molecular Property Prediction (QM9)</strong>: The model was evaluated on 12 quantum chemical property prediction tasks for small molecules (134k molecules; 100k train, 18k val, 13k test split). DenoiseVAE achieved state-of-the-art or second-best performance on 11 of the 12 tasks, with particularly significant gains on $C_v$ (heat capacity), indicating better capture of vibrational modes.</li>
<li><strong>Force Prediction (MD17)</strong>: The task was to predict atomic forces from molecular dynamics trajectories for 8 different small molecules (9,500 train, 500 val split). DenoiseVAE was the top performer on 5 of the 8 molecules (Aspirin, Benzene, Ethanol, Naphthalene, Toluene), though it underperformed Frad on Malonaldehyde, Salicylic Acid, and Uracil by significant margins.</li>
<li><strong>Ligand Binding Affinity (PDBBind v2019)</strong>: On the PDBBind dataset with 30% and 60% protein sequence identity splits, the model showed strong generalization, outperforming baselines like Uni-Mol particularly on the more stringent 30% split across RMSE, Pearson correlation, and Spearman correlation.</li>
<li><strong>PCQM4Mv2 Validation</strong>: DenoiseVAE achieved a validation MAE of 0.0777 on the PCQM4Mv2 HOMO-LUMO gap prediction task with only 1.44M parameters, competitive with models 10-40x larger (e.g., GPS++ at 44.3M params achieves 0.0778).</li>
<li><strong>Ablation Studies</strong>: The authors analyzed the sensitivity to key hyperparameters, namely the prior&rsquo;s standard deviation ($\sigma$) and the KL-divergence weight ($\lambda$), confirming that $\lambda=1$ and $\sigma=0.1$ are optimal. Removing the KL term leads to trivial solutions (near-zero noise). An additional ablation on the Noise Generator depth found 4 EGNN layers optimal over 2 layers. A comparison of independent (diagonal) versus non-independent (full covariance) noise sampling showed comparable results, suggesting the EGNN already captures inter-atomic dependencies implicitly.</li>
<li><strong>Case Studies</strong>: Visualizations of the learned noise variances for different molecules confirmed that the model learns chemically intuitive noise patterns. For example, it applies smaller perturbations to atoms in a rigid bicyclic norcamphor derivative and larger ones to atoms in flexible functional groups of a cyclopropane derivative. Even identical functional groups (e.g., hydroxyl) receive different noise scales in different molecular contexts.</li>
</ol>
<h2 id="key-findings-on-force-field-learning">Key Findings on Force Field Learning</h2>
<ul>
<li><strong>Primary Conclusion</strong>: Learning a <strong>molecule-adaptive and atom-specific</strong> noise distribution is a superior strategy for denoising-based pre-training compared to using fixed, hand-crafted heuristics. This more physically-grounded approach leads to representations that better capture molecular force fields.</li>
<li><strong>Strong Benchmark Performance</strong>: DenoiseVAE achieves best or second-best results on 11 of 12 QM9 tasks, 5 of 8 MD17 molecules, and leads on the stringent 30% LBA split. Performance is mixed on some MD17 molecules (Malonaldehyde, Salicylic Acid, Uracil), where it trails Frad.</li>
<li><strong>Effective Framework</strong>: The proposed VAE-based framework, which jointly trains a Noise Generator and a Denoising Module, is an effective and theoretically sound method for implementing this adaptive noise strategy. The interplay between the reconstruction loss and the KL-divergence regularization is key to its success.</li>
<li><strong>Limitation and Future Direction</strong>: The method is based on classical force field assumptions. The authors note that integrating more accurate force fields represents a promising direction for future work.</li>
</ul>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/Serendipity-r/DenoiseVAE">Serendipity-r/DenoiseVAE</a></td>
          <td>Code</td>
          <td>Unknown</td>
          <td>Official implementation</td>
      </tr>
  </tbody>
</table>
<h3 id="reproducibility-status">Reproducibility Status</h3>
<ul>
<li><strong>Source Code</strong>: The authors have released their code at <a href="https://github.com/Serendipity-r/DenoiseVAE">Serendipity-r/DenoiseVAE</a> on GitHub. No license is specified in the repository.</li>
<li><strong>Implementation</strong>: Hyperparameters and architectures are detailed in the paper&rsquo;s appendix (A.14), and the repository provides reference implementations.</li>
</ul>
<h3 id="data">Data</h3>
<ul>
<li><strong>Pre-training Dataset</strong>: <a href="https://ogb.stanford.edu/docs/lsc/pcqm4mv2/">PCQM4Mv2</a> (approximately 3.4 million organic molecules)</li>
<li><strong>Property Prediction</strong>: <a href="https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.datasets.QM9.html">QM9 dataset</a> (134k molecules; 100k train, 18k val, 13k test split) for 12 quantum chemical properties</li>
<li><strong>Force Prediction</strong>: <a href="http://www.sgdml.org/#datasets">MD17 dataset</a> (9,500 train, 500 val split) for 8 different small molecules</li>
<li><strong>Ligand Binding Affinity</strong>: PDBBind v2019 (4,463 protein-ligand complexes) with 30% and 60% sequence identity splits</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Noise Generator</strong>: 4-layer Equivariant Graph Neural Network (EGNN) that outputs atom-specific Gaussian noise distributions</li>
<li><strong>Denoising Module</strong>: 7-layer EGNN decoder</li>
<li><strong>Training Objective</strong>: $\mathcal{L}_{DenoiseVAE} = \mathcal{L}_{Denoise} + \lambda \mathcal{L}_{KL}$ with $\lambda=1$</li>
<li><strong>Noise Sampling</strong>: Reparameterization trick with $\tilde{x}_i = x_i + \sigma_i \odot \epsilon$, where $\epsilon \sim \mathcal{N}(0, I)$</li>
<li><strong>Prior Distribution</strong>: Zero-mean Gaussian with standard deviation $\sigma=0.1$</li>
</ul>
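<p>As a concrete illustration, here is a minimal, dependency-free Python sketch of the two pieces above: reparameterized noise sampling and the KL penalty toward the isotropic $\sigma=0.1$ prior. The function names are invented for illustration; this is not the authors' implementation, which predicts the per-atom scales with an EGNN.</p>

```python
import math
import random

PRIOR_SIGMA = 0.1   # prior standard deviation from the paper
LAMBDA_KL = 1.0     # KL weight from the paper

def sample_noisy_position(x, sigma):
    """Reparameterization trick: x_tilde = x + sigma * eps, eps ~ N(0, 1).
    `x` and `sigma` are per-atom coordinate lists; in training, gradients
    flow through sigma, not through eps."""
    return [xi + si * random.gauss(0.0, 1.0) for xi, si in zip(x, sigma)]

def kl_to_isotropic_prior(sigma, prior_sigma=PRIOR_SIGMA):
    """KL( N(0, diag(sigma^2)) || N(0, prior_sigma^2 I) ) for a zero-mean
    diagonal Gaussian: regularizes predicted noise scales toward the prior
    and penalizes the trivial near-zero-noise solution."""
    kl = 0.0
    for s in sigma:
        ratio = (s / prior_sigma) ** 2
        kl += 0.5 * (ratio - 1.0 - math.log(ratio))
    return kl
```

<p>A noise scale that exactly matches the prior incurs zero KL, while shrinking all scales toward zero (the trivial solution the ablation warns about) is penalized by the logarithmic term.</p>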
<h3 id="models">Models</h3>
<ul>
<li><strong>Model Size</strong>: 1.44M parameters total</li>
<li><strong>Fine-tuning Protocol</strong>: Noise Generator discarded after pre-training; only the pre-trained Denoising Module (7-layer EGNN) is retained for downstream fine-tuning</li>
<li><strong>Optimizer</strong>: AdamW with cosine learning rate decay (max LR of 0.0005)</li>
<li><strong>Batch Size</strong>: 128</li>
<li><strong>System Training</strong>: Fine-tuned end-to-end for specific tasks; force prediction involves computing the gradient of the predicted energy</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Ablation Studies</strong>: Sensitivity analysis confirmed $\lambda=1$ and $\sigma=0.1$ as optimal hyperparameters; removing the KL term leads to trivial solutions (near-zero noise)</li>
<li><strong>Noise Generator Depth</strong>: 4 EGNN layers outperformed 2 layers across both QM9 and MD17 benchmarks</li>
<li><strong>Covariance Structure</strong>: Full covariance matrix (non-independent noise sampling) yielded comparable results to diagonal variance (independent sampling), likely because the EGNN already integrates neighboring atom information</li>
<li><strong>O(3) Invariance</strong>: The method satisfies O(3) probabilistic invariance, meaning the noise distribution is unchanged under rotations and reflections</li>
</ul>
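<p>The O(3) probabilistic invariance property can be illustrated with a toy stand-in: any noise generator whose per-atom scales depend only on interatomic distances is automatically unchanged under rotations and reflections. The sketch below is hypothetical (the paper's generator is a learned EGNN) but demonstrates the invariance numerically.</p>

```python
import math

def pairwise_dists(coords):
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def toy_sigma(coords):
    """Hypothetical stand-in for the Noise Generator: per-atom sigma built
    from inverse distances to the other atoms. Any function of distances
    alone is invariant under rotations and reflections (O(3))."""
    d = pairwise_dists(coords)
    return [
        0.1 / (1.0 + sum(1.0 / dij for j, dij in enumerate(row) if j != i))
        for i, row in enumerate(d)
    ]

def rotate_z(coords, theta):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

mol = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.5, 0.3)]
before = toy_sigma(mol)
after = toy_sigma(rotate_z(mol, 0.7))  # rotations leave the scales unchanged
assert all(abs(a - b) < 1e-12 for a, b in zip(before, after))
```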
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>GPU Configuration</strong>: Experiments conducted on single NVIDIA RTX 3090 GPUs; six GPUs (144GB total memory) are sufficient for full reproduction</li>
<li><strong>CPU</strong>: Intel Xeon Gold 5318Y @ 2.10GHz</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Liu, Y., Chen, J., Jiao, R., Li, J., Huang, W., &amp; Su, B. (2025). DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training. <em>The Thirteenth International Conference on Learning Representations (ICLR)</em>.</p>
<p><strong>Publication</strong>: ICLR 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{liu2025denoisevae,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Yurou Liu and Jiahao Chen and Rui Jiao and Jiangmeng Li and Wenbing Huang and Bing Su}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{The Thirteenth International Conference on Learning Representations}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span>=<span style="color:#e6db74">{https://openreview.net/forum?id=ym7pr83XQr}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://iclr.cc/virtual/2025/poster/27701">ICLR 2025 poster page</a></li>
<li><a href="https://openreview.net/forum?id=ym7pr83XQr">OpenReview forum</a></li>
<li><a href="https://openreview.net/pdf?id=ym7pr83XQr">PDF on OpenReview</a></li>
</ul>
]]></content:encoded></item><item><title>eSEN: Smooth Interatomic Potentials (ICML Spotlight)</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/learning-smooth-interatomic-potentials/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/learning-smooth-interatomic-potentials/</guid><description>Fu et al. propose energy conservation as a key MLIP diagnostic and introduce eSEN, bridging test accuracy and real performance.</description><content:encoded><![CDATA[<h2 id="paper-overview">Paper Overview</h2>
<p>This is a <strong>method paper</strong>. It addresses a critical disconnect in the evaluation of Machine Learning Interatomic Potentials (MLIPs) and introduces a novel architecture, <strong>eSEN</strong>, designed based on insights from this analysis. The paper proposes a new standard for evaluating MLIPs beyond simple test-set errors.</p>
<h2 id="the-energy-conservation-gap-in-mlip-evaluation">The Energy Conservation Gap in MLIP Evaluation</h2>
<p>The paper is motivated by a well-known but under-addressed problem in the field: improvements in standard MLIP metrics (lower energy/force MAE on static test sets) do not reliably translate to better performance on complex downstream tasks like molecular dynamics (MD) simulations, materials stability prediction, or phonon calculations. The authors seek to understand why this gap exists and how to design models that are both accurate on test sets and physically reliable in practical scientific workflows.</p>
<h2 id="the-esen-architecture-and-continuous-representation">The eSEN Architecture and Continuous Representation</h2>
<p>The novelty is twofold, spanning both a conceptual framework for evaluation and a new model architecture:</p>
<ol>
<li>
<p><strong>Energy Conservation as a Diagnostic Test</strong>: The core conceptual contribution is using an MLIP&rsquo;s ability to conserve energy in out-of-distribution MD simulations as a crucial diagnostic test. The authors demonstrate that for models passing this test, a strong correlation between test-set error and downstream task performance is restored.</p>
</li>
<li>
<p><strong>The eSEN Architecture</strong>: The paper introduces the <strong>equivariant Smooth Energy Network (eSEN)</strong>, designed with specific choices to ensure a smooth and well-behaved Potential Energy Surface (PES):</p>
<ul>
<li><strong>Strictly Conservative Forces</strong>: Forces are computed exclusively as the negative gradient of energy ($F = -\nabla E$), using conservative force prediction instead of faster direct-force prediction heads.</li>
<li><strong>Continuous Representations</strong>: Maintains strict equivariance and smoothness by using equivariant gated non-linearities instead of discretizing spherical harmonic representations during nodewise processing.</li>
<li><strong>Smooth PES Construction</strong>: Critical design choices include using distance cutoffs, polynomial envelope functions ensuring derivatives go to zero at cutoffs, and limited radial basis functions to avoid overly sensitive PES.</li>
</ul>
</li>
<li>
<p><strong>Efficient Training Strategy</strong>: A two-stage training regimen with fast pre-training using a non-conservative direct-force model, followed by fine-tuning to enforce energy conservation. This captures the efficiency of direct-force training while ensuring physical robustness.</p>
</li>
</ol>
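<p>The "strictly conservative forces" choice means every force component is defined as the negative derivative of a single scalar energy. A minimal sketch with a toy one-dimensional pair potential makes the consequence concrete (an MLIP uses automatic differentiation rather than the finite differences used here for illustration):</p>

```python
def pair_energy(r):
    """Toy Lennard-Jones pair energy; a stand-in for a learned E(r)."""
    return 4.0 * (r ** -12 - r ** -6)

def energy(x):
    """Total energy of atoms on a line at positions x[i]."""
    return sum(
        pair_energy(abs(x[i] - x[j]))
        for i in range(len(x))
        for j in range(i + 1, len(x))
    )

def forces(x, h=1e-6):
    """Strictly conservative forces F_i = -dE/dx_i, here via central
    differences on the scalar energy."""
    f = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        f.append(-(energy(xp) - energy(xm)) / (2 * h))
    return f

# Because all forces derive from one scalar field, they sum to zero
# (translation invariance) -- a property direct-force heads do not guarantee.
assert abs(sum(forces([0.0, 1.1, 2.5]))) < 1e-5
```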
<h2 id="evaluating-ood-energy-conservation-and-physical-properties">Evaluating OOD Energy Conservation and Physical Properties</h2>
<p>The paper presents a comprehensive experimental validation:</p>
<ol>
<li>
<p><strong>Ablation Studies on Energy Conservation</strong>: MD simulations on out-of-distribution systems (TM23 and MD22 datasets) systematically tested key design choices (direct-force vs. conservative, representation discretization, neighbor limits, envelope functions). This empirically demonstrated which choices lead to energy drift despite negligible impact on test-set MAE.</p>
</li>
<li>
<p><strong>Physical Property Prediction Benchmarks</strong>: The eSEN model was evaluated on challenging downstream tasks:</p>
<ul>
<li><strong>Matbench-Discovery</strong>: Materials stability and thermal conductivity prediction, where eSEN achieved the highest F1 score among compliant models and excelled at both metrics simultaneously.</li>
<li><strong>MDR Phonon Benchmark</strong>: Predicting phonon properties that test accurate second and third-order derivatives of the PES. eSEN achieved state-of-the-art results, particularly outperforming direct-force models.</li>
<li><strong>SPICE-MACE-OFF</strong>: Standard energy and force prediction on organic molecules, demonstrating that the design choices made for physical plausibility also enhanced raw accuracy.</li>
</ul>
</li>
<li>
<p><strong>Correlation Analysis</strong>: Explicit plots of test-set energy MAE versus performance on downstream benchmarks showed weak overall correlation that becomes strong and predictive when restricted to models passing the energy conservation test.</p>
</li>
</ol>
<h2 id="outcomes-and-conclusions">Outcomes and Conclusions</h2>
<ul>
<li>
<p><strong>Primary Conclusion</strong>: Energy conservation is a critical, practical property for MLIPs. Using it as a filter re-establishes test-set error as a reliable proxy for model development, dramatically accelerating the innovation cycle. Models that are not conservative, even with low test error, are unreliable for many critical scientific applications.</p>
</li>
<li>
<p><strong>Model Performance</strong>: The eSEN architecture outperforms base models across diverse tasks, from energy/force prediction to geometry optimization, phonon calculations, and thermal conductivity prediction.</p>
</li>
<li>
<p><strong>Actionable Design Principles</strong>: The paper provides experimentally-validated architectural choices that promote physical plausibility. Seemingly minor details, like how atomic neighbors are selected, can have profound impacts on a model&rsquo;s utility in simulations.</p>
</li>
<li>
<p><strong>Efficient Path to Robust Models</strong>: The direct-force pre-training plus conservative fine-tuning strategy offers a practical method for developing physically robust models without incurring the full computational cost of conservative training from scratch.</p>
</li>
</ul>
<hr>
<h2 id="reproducibility">Reproducibility</h2>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/facebookresearch/fairchem">fairchem (GitHub)</a></td>
          <td>Code</td>
          <td>MIT</td>
          <td>Official implementation within FAIR Chemistry framework</td>
      </tr>
      <tr>
          <td><a href="https://huggingface.co/facebook/OMAT24">OMAT24 (Hugging Face)</a></td>
          <td>Model</td>
          <td>FAIR Acceptable Use Policy</td>
          <td>Pre-trained eSEN-30M-MP and eSEN-30M-OAM checkpoints</td>
      </tr>
      <tr>
          <td><a href="https://openreview.net/forum?id=R0PBjxIbgm">OpenReview</a></td>
          <td>Paper</td>
          <td>CC BY 4.0</td>
          <td>ICML 2025 camera-ready paper</td>
      </tr>
  </tbody>
</table>
<h3 id="models">Models</h3>
<p>The eSEN architecture builds on components from <strong>eSCN</strong> (Equivariant Spherical Channel Network) and <strong>Equiformer</strong>, combining them with design choices that prioritize smoothness and energy conservation. The implementation integrates into the standard <code>fairchem</code> Open Catalyst experimental framework.</p>
<h4 id="layer-structure">Layer Structure</h4>
<ul>
<li><strong>Edgewise Convolution</strong>: Uses <code>SO2</code> convolution layers (from eSCN) with an envelope function applied. Source and target embeddings are concatenated before convolution.</li>
<li><strong>Nodewise Feed-Forward</strong>: Two equivariant linear layers with an intermediate <strong>SiLU-based gated non-linearity</strong> (from Equiformer).</li>
<li><strong>Normalization</strong>: Equivariant Layer Normalization (from Equiformer).</li>
</ul>
<h4 id="smoothness-design-choices">Smoothness Design Choices</h4>
<p>Several architectural decisions distinguish eSEN from prior work:</p>
<ul>
<li><strong>No Grid Projection</strong>: eSEN performs operations directly in the spherical harmonic space to maintain equivariance and energy conservation, bypassing the projection of spherical harmonics to spatial grids for non-linearity.</li>
<li><strong>Distance Cutoff for Graph Construction</strong>: Uses a strict distance cutoff (6 Å for MPTrj models, 5 Å for SPICE models) rather than a maximum-neighbor limit, since neighbor limits introduce discontinuities that break energy conservation.</li>
<li><strong>Polynomial Envelope Functions</strong>: Ensures derivatives go to zero smoothly at the cutoff radius.</li>
</ul>
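<p>The envelope bullet can be made concrete with the DimeNet-style polynomial envelope, a common choice shown here as a stand-in (the paper's exact polynomial may differ): it equals 1 at zero separation and vanishes at the cutoff together with its first and second derivatives, so an atom crossing the cutoff contributes smoothly to the energy.</p>

```python
def poly_envelope(d, cutoff, p=5):
    """DimeNet-style polynomial envelope: u(0) = 1, and u, u', u'' all
    vanish at the cutoff, keeping the PES smooth as neighbors enter or
    leave the graph."""
    if d >= cutoff:
        return 0.0
    x = d / cutoff
    a = (p + 1) * (p + 2) / 2
    return 1.0 - a * x**p + p * (p + 2) * x**(p + 1) - p * (p + 1) / 2 * x**(p + 2)

def numerical_derivative(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

cutoff = 6.0  # Å, matching the MPTrj models
assert abs(poly_envelope(0.0, cutoff) - 1.0) < 1e-12
assert abs(poly_envelope(cutoff - 1e-9, cutoff)) < 1e-6
# The slope also vanishes at the cutoff: no discontinuous force contribution.
assert abs(numerical_derivative(lambda d: poly_envelope(d, cutoff), cutoff - 1e-3)) < 1e-3
```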
<h3 id="algorithms">Algorithms</h3>
<h4 id="two-stage-training-esen-30m-mp">Two-Stage Training (eSEN-30M-MP)</h4>
<ol>
<li><strong>Direct-Force Pre-training</strong> (60 epochs): Uses <strong>DeNS</strong> (Denoising Non-equilibrium Structures) to reduce overfitting. This stage is fast because it does not require backpropagation through energy gradients.</li>
<li><strong>Conservative Fine-tuning</strong> (40 epochs): The direct-force head is removed, and forces are calculated via gradients ($F = -\nabla E$). This enforces energy conservation.</li>
</ol>
<p><strong>Important</strong>: DeNS is used exclusively during the direct-force pre-training stage, with a noising probability of 0.5, a standard deviation of 0.1 Å for the added Gaussian noise, and a DeNS loss coefficient of 10. The fine-tuning strategy reduces the wall-clock time for model training by 40%.</p>
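<p>The pre-training-stage perturbation can be sketched as follows. The function names and data layout are invented for illustration, with the paper's hyperparameters (noising probability 0.5, noise std 0.1 Å, loss coefficient 10) plugged in; this is not the fairchem implementation.</p>

```python
import random

NOISE_PROB = 0.5   # probability a structure is perturbed (paper value)
NOISE_STD = 0.1    # Å, std of the added Gaussian noise (paper value)
DENS_COEF = 10.0   # weight of the denoising loss term (paper value)

def maybe_noise(positions, rng):
    """DeNS-style perturbation used only during direct-force pre-training:
    returns (possibly noised positions, per-coordinate noise or None)."""
    if rng.random() >= NOISE_PROB:
        return positions, None
    noise = [[rng.gauss(0.0, NOISE_STD) for _ in atom] for atom in positions]
    noised = [[x + dx for x, dx in zip(atom, d)] for atom, d in zip(positions, noise)]
    return noised, noise

def total_loss(base_loss, dens_loss, noise):
    """Add the denoising term only when the structure was actually noised."""
    return base_loss if noise is None else base_loss + DENS_COEF * dens_loss
```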
<h4 id="optimization">Optimization</h4>
<ul>
<li><strong>Optimizer</strong>: AdamW with cosine learning rate scheduler</li>
<li><strong>Max Learning Rate</strong>: $4 \times 10^{-4}$</li>
<li><strong>Batch Size</strong>: 512 (for MPTrj models)</li>
<li><strong>Weight Decay</strong>: $1 \times 10^{-3}$</li>
<li><strong>Gradient Clipping</strong>: Norm of 100</li>
<li><strong>Warmup</strong>: 0.1 epochs with a factor of 0.2</li>
</ul>
<h4 id="loss-function">Loss Function</h4>
<p>A composite loss combining per-atom energy MAE, force $L_2$ loss, and stress MAE:</p>
<p>$$
\begin{aligned}
\mathcal{L} = \lambda_{\text{e}} \frac{1}{N} \sum_{i=1}^N \lvert E_{i} - \hat{E}_{i} \rvert + \lambda_{\text{f}} \frac{1}{3N} \sum_{i=1}^N \lVert \mathbf{F}_{i} - \hat{\mathbf{F}}_{i} \rVert_2^2 + \lambda_{\text{s}} \lVert \mathbf{S} - \hat{\mathbf{S}} \rVert_1
\end{aligned}
$$</p>
<p>For MPTrj-30M, the weighting coefficients are set to $\lambda_{\text{e}} = 20$, $\lambda_{\text{f}} = 20$, and $\lambda_{\text{s}} = 5$.</p>
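<p>A minimal sketch of the composite objective for one batch, using the MPTrj-30M coefficients. The batching and reduction details here are simplified assumptions (pairs of true/predicted scalars rather than tensors), not the fairchem implementation.</p>

```python
LAMBDA_E, LAMBDA_F, LAMBDA_S = 20.0, 20.0, 5.0  # MPTrj-30M coefficients

def composite_loss(energies, forces, stress):
    """Weighted sum of per-atom energy MAE, mean squared force-component
    error (equivalent to the 1/(3N) sum of squared L2 norms), and stress L1.
    `energies`, `forces`, `stress` are lists of (true, pred) pairs:
    per-atom energies, flattened force components, and stress components."""
    e_mae = sum(abs(t - p) for t, p in energies) / len(energies)
    f_l2 = sum((t - p) ** 2 for t, p in forces) / len(forces)
    s_l1 = sum(abs(t - p) for t, p in stress)
    return LAMBDA_E * e_mae + LAMBDA_F * f_l2 + LAMBDA_S * s_l1
```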
<h3 id="data">Data</h3>
<h4 id="training-data">Training Data</h4>
<ul>
<li><strong>Inorganic</strong>: MPTrj (Materials Project Trajectory) dataset</li>
<li><strong>Organic</strong>: SPICE-MACE-OFF dataset</li>
</ul>
<h4 id="test-data-construction">Test Data Construction</h4>
<ul>
<li><strong>MPTrj Testing</strong>: Since MPTrj lacks an official test split, the authors created a test set using 5,000 random samples from the <strong>subsampled Alexandria (sAlex)</strong> dataset to ensure fair comparison.</li>
<li><strong>Out-of-Distribution Conservation Testing</strong>:
<ul>
<li><em>Inorganic</em>: <strong>TM23</strong> dataset (transition metal defects). Simulation: 100 ps, 5 fs timestep.</li>
<li><em>Organic</em>: <strong>MD22</strong> dataset (large molecules). Simulation: 100 ps, 1 fs timestep.</li>
</ul>
</li>
</ul>
<h3 id="hardware">Hardware</h3>
<p>Compute for training operations predominantly utilizes <strong>80GB NVIDIA A100 GPUs</strong>.</p>
<h4 id="inference-efficiency">Inference Efficiency</h4>
<p>For a periodic system of <strong>216 atoms</strong> on a single A100 (PyTorch 2.4.0, CUDA 12.1, no compile/torchscript), the 2-layer eSEN models achieve approximately <strong>0.8 million steps per day</strong> (3.2M parameters) and <strong>0.4 million steps per day</strong> (6.5M parameters); the smaller model is comparable to MACE-OFF-L at 0.7 million steps per day.</p>

<h3 id="evaluation">Evaluation</h3>
<p>The paper evaluated eSEN across three major benchmark tasks. Key evaluation metrics included energy MAE (meV/atom), force MAE (meV/Å), stress MAE (meV/Å/atom), F1 score for stability prediction, $\kappa_{\text{SRME}}$ for thermal conductivity, and phonon frequency accuracy.</p>
<h4 id="ablation-test-set-mae-table-1">Ablation Test-Set MAE (Table 1)</h4>
<p>Design choices that dramatically affect energy conservation have negligible impact on static test-set MAE, which is precisely why test-set error alone is misleading. All models are 2-layer with 3.2M parameters, $L_{\text{max}} = 2$, $M_{\text{max}} = 2$:</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Energy MAE</th>
          <th>Force MAE</th>
          <th>Stress MAE</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>eSEN (default)</td>
          <td>17.02</td>
          <td>43.96</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, direct-force</td>
          <td>18.66</td>
          <td>43.62</td>
          <td>0.16</td>
      </tr>
      <tr>
          <td>eSEN, neighbor limit</td>
          <td>17.30</td>
          <td>44.11</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, no envelope</td>
          <td>17.60</td>
          <td>44.69</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, $N_{\text{basis}} = 512$</td>
          <td>19.87</td>
          <td>48.29</td>
          <td>0.15</td>
      </tr>
      <tr>
          <td>eSEN, Bessel</td>
          <td>17.65</td>
          <td>44.83</td>
          <td>0.15</td>
      </tr>
      <tr>
          <td>eSEN, discrete, res=6</td>
          <td>17.05</td>
          <td>43.10</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, discrete, res=10</td>
          <td>17.11</td>
          <td>43.13</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, discrete, res=14</td>
          <td>17.12</td>
          <td>43.09</td>
          <td>0.14</td>
      </tr>
  </tbody>
</table>
<p>Energy MAE in meV/atom. Force MAE in meV/Å. Stress MAE in meV/Å/atom.</p>
<h4 id="matbench-discovery-tables-2-and-3">Matbench-Discovery (Tables 2 and 3)</h4>
<p><strong>Compliant models</strong> (trained only on MPTrj or its subset), unique prototype split:</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>F1</th>
          <th>DAF</th>
          <th>$\kappa_{\text{SRME}}$</th>
          <th>RMSD</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>eSEN-30M-MP</strong></td>
          <td><strong>0.831</strong></td>
          <td><strong>5.260</strong></td>
          <td><strong>0.340</strong></td>
          <td><strong>0.0752</strong></td>
      </tr>
      <tr>
          <td>eqV2-S-DeNS</td>
          <td>0.815</td>
          <td>5.042</td>
          <td>1.676</td>
          <td>0.0757</td>
      </tr>
      <tr>
          <td>MatRIS-MP</td>
          <td>0.809</td>
          <td>5.049</td>
          <td>0.861</td>
          <td>0.0773</td>
      </tr>
      <tr>
          <td>AlphaNet-MP</td>
          <td>0.799</td>
          <td>4.863</td>
          <td>1.31</td>
          <td>0.1067</td>
      </tr>
      <tr>
          <td>DPA3-v2-MP</td>
          <td>0.786</td>
          <td>4.822</td>
          <td>0.959</td>
          <td>0.0823</td>
      </tr>
      <tr>
          <td>ORB v2 MPtrj</td>
          <td>0.765</td>
          <td>4.702</td>
          <td>1.725</td>
          <td>0.1007</td>
      </tr>
      <tr>
          <td>SevenNet-13i5</td>
          <td>0.760</td>
          <td>4.629</td>
          <td>0.550</td>
          <td>0.0847</td>
      </tr>
      <tr>
          <td>GRACE-2L-MPtrj</td>
          <td>0.691</td>
          <td>4.163</td>
          <td>0.525</td>
          <td>0.0897</td>
      </tr>
      <tr>
          <td>MACE-MP-0</td>
          <td>0.669</td>
          <td>3.777</td>
          <td>0.647</td>
          <td>0.0915</td>
      </tr>
      <tr>
          <td>CHGNet</td>
          <td>0.613</td>
          <td>3.361</td>
          <td>1.717</td>
          <td>0.0949</td>
      </tr>
      <tr>
          <td>M3GNet</td>
          <td>0.569</td>
          <td>2.882</td>
          <td>1.412</td>
          <td>0.1117</td>
      </tr>
  </tbody>
</table>
<p>eSEN-30M-MP excels at both F1 and $\kappa_{\text{SRME}}$ simultaneously, while all previous models only achieve SOTA on one or the other.</p>
<p><strong>Non-compliant models</strong> (trained on additional datasets):</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>F1</th>
          <th>$\kappa_{\text{SRME}}$</th>
          <th>RMSD</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>eSEN-30M-OAM</strong></td>
          <td><strong>0.925</strong></td>
          <td><strong>0.170</strong></td>
          <td><strong>0.0608</strong></td>
      </tr>
      <tr>
          <td>eqV2-M-OAM</td>
          <td>0.917</td>
          <td>1.771</td>
          <td>0.0691</td>
      </tr>
      <tr>
          <td>ORB v3</td>
          <td>0.905</td>
          <td>0.210</td>
          <td>0.0750</td>
      </tr>
      <tr>
          <td>SevenNet-MF-ompa</td>
          <td>0.901</td>
          <td>0.317</td>
          <td>0.0639</td>
      </tr>
      <tr>
          <td>DPA3-v2-OpenLAM</td>
          <td>0.890</td>
          <td>0.687</td>
          <td>0.0679</td>
      </tr>
      <tr>
          <td>GRACE-2L-OAM</td>
          <td>0.880</td>
          <td>0.294</td>
          <td>0.0666</td>
      </tr>
      <tr>
          <td>MatterSim-v1-5M</td>
          <td>0.862</td>
          <td>0.574</td>
          <td>0.0733</td>
      </tr>
      <tr>
          <td>MACE-MPA-0</td>
          <td>0.852</td>
          <td>0.412</td>
          <td>0.0731</td>
      </tr>
  </tbody>
</table>
<p>The eSEN-30M-OAM model is pre-trained on the OMat24 dataset, then fine-tuned on the subsampled Alexandria (sAlex) dataset and MPTrj dataset.</p>
<h4 id="mdr-phonon-benchmark-table-4">MDR Phonon Benchmark (Table 4)</h4>
<p>Metrics: maximum phonon frequency MAE($\omega_{\text{max}}$) in K, vibrational entropy MAE($S$) in J/K/mol, Helmholtz free energy MAE($F$) in kJ/mol, heat capacity MAE($C_V$) in J/K/mol.</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>MAE($\omega_{\text{max}}$)</th>
          <th>MAE($S$)</th>
          <th>MAE($F$)</th>
          <th>MAE($C_V$)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>eSEN-30M-MP</strong></td>
          <td><strong>21</strong></td>
          <td><strong>13</strong></td>
          <td><strong>5</strong></td>
          <td><strong>4</strong></td>
      </tr>
      <tr>
          <td>SevenNet-13i5</td>
          <td>26</td>
          <td>28</td>
          <td>10</td>
          <td>5</td>
      </tr>
      <tr>
          <td>GRACE-2L (r6)</td>
          <td>40</td>
          <td>25</td>
          <td>9</td>
          <td>5</td>
      </tr>
      <tr>
          <td>SevenNet-0</td>
          <td>40</td>
          <td>48</td>
          <td>19</td>
          <td>9</td>
      </tr>
      <tr>
          <td>MACE</td>
          <td>61</td>
          <td>60</td>
          <td>24</td>
          <td>13</td>
      </tr>
      <tr>
          <td>CHGNet</td>
          <td>89</td>
          <td>114</td>
          <td>45</td>
          <td>21</td>
      </tr>
      <tr>
          <td>M3GNet</td>
          <td>98</td>
          <td>150</td>
          <td>56</td>
          <td>22</td>
      </tr>
  </tbody>
</table>
<p>Direct-force models show dramatically worse performance at the standard 0.01 Å displacement (e.g., eqV2-S-DeNS: 280/224/54/94) but improve at larger displacements (0.2 Å: 58/26/8/8), revealing that their PES is rough near energy minima.</p>
<h4 id="spice-mace-off-table-5">SPICE-MACE-OFF (Table 5)</h4>
<p>Test set MAE for organic molecule energy/force prediction. Energy MAE in meV/atom, force MAE in meV/Å:</p>
<table>
  <thead>
      <tr>
          <th>Dataset</th>
          <th>MACE-4.7M (E/F)</th>
          <th>EscAIP-45M* (E/F)</th>
          <th>eSEN-3.2M (E/F)</th>
          <th>eSEN-6.5M (E/F)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>PubChem</td>
          <td>0.88 / 14.75</td>
          <td>0.53 / 5.86</td>
          <td>0.22 / 6.10</td>
          <td><strong>0.15</strong> / <strong>4.21</strong></td>
      </tr>
      <tr>
          <td>DES370K M.</td>
          <td>0.59 / 6.58</td>
          <td>0.41 / 3.48</td>
          <td>0.17 / 1.85</td>
          <td><strong>0.13</strong> / <strong>1.24</strong></td>
      </tr>
      <tr>
          <td>DES370K D.</td>
          <td>0.54 / 6.62</td>
          <td>0.38 / 2.18</td>
          <td>0.20 / 2.77</td>
          <td><strong>0.15</strong> / <strong>2.12</strong></td>
      </tr>
      <tr>
          <td>Dipeptides</td>
          <td>0.42 / 10.19</td>
          <td>0.31 / 5.21</td>
          <td>0.10 / 3.04</td>
          <td><strong>0.07</strong> / <strong>2.00</strong></td>
      </tr>
      <tr>
          <td>Sol. AA</td>
          <td>0.98 / 19.43</td>
          <td>0.61 / 11.52</td>
          <td>0.30 / 5.76</td>
          <td><strong>0.25</strong> / <strong>3.68</strong></td>
      </tr>
      <tr>
          <td>Water</td>
          <td>0.83 / 13.57</td>
          <td>0.72 / 10.31</td>
          <td>0.24 / 3.88</td>
          <td><strong>0.15</strong> / <strong>2.50</strong></td>
      </tr>
      <tr>
          <td>QMugs</td>
          <td>0.45 / 16.93</td>
          <td>0.41 / 8.74</td>
          <td>0.16 / 5.70</td>
          <td><strong>0.12</strong> / <strong>3.78</strong></td>
      </tr>
  </tbody>
</table>
<p>*EscAIP-45M is a direct-force model. eSEN-6.5M outperforms MACE-OFF-L and EscAIP on all test splits. The smaller eSEN-3.2M has inference efficiency comparable to MACE-4.7M while achieving lower MAE.</p>
<hr>
<h2 id="why-these-design-choices-matter">Why These Design Choices Matter</h2>
<h3 id="bounded-energy-derivatives-and-the-verlet-integrator">Bounded Energy Derivatives and the Verlet Integrator</h3>
<p>The theoretical foundation for why smoothness matters comes from Theorem 5.1 of Hairer et al. (2003). For the Verlet integrator (the standard NVE integrator), the total energy drift satisfies:</p>
<p>$$
|E(\mathbf{r}_T, \mathbf{a}) - E(\mathbf{r}_0, \mathbf{a})| \leq C \Delta t^2 + C_N \Delta t^N T
$$</p>
<p>where $T$ is the total simulation time ($T \leq \Delta t^{-N}$), $N$ is the highest order for which the $N$th derivative of $E$ is continuously differentiable with bounded derivative, and $C$, $C_N$ are constants independent of $T$ and $\Delta t$. The first term is a time-independent fluctuation of $O(\Delta t^2)$; the second term governs long-term conservation. This means the PES must be continuously differentiable to high order, with bounded derivatives, for energy conservation in long-time simulations.</p>
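<p>The practical content of the bound is easy to see with the velocity Verlet integrator on a perfectly smooth PES (a harmonic oscillator): the total energy only fluctuates at $O(\Delta t^2)$ and shows no secular drift, even over very long runs. A self-contained sketch:</p>

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Standard velocity Verlet, the NVE integrator the bound refers to."""
    a = force(x) / mass
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt
        a_new = force(x) / mass
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return x, v

# Harmonic oscillator: E is smooth to all orders, so only the O(dt^2)
# fluctuation term survives and there is no long-term drift.
k, m, dt = 1.0, 1.0, 0.01
total_energy = lambda x, v: 0.5 * k * x * x + 0.5 * m * v * v
x0, v0 = 1.0, 0.0
x, v = velocity_verlet(x0, v0, lambda q: -k * q, m, dt, steps=100_000)
drift = abs(total_energy(x, v) - total_energy(x0, v0))
assert drift < 1e-3  # bounded fluctuation even after 100k steps
```

<p>Replacing the smooth force with anything discontinuous (e.g. a hard neighbor-list truncation) breaks the premise of the theorem, and the second term then grows with $T$.</p>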
<h3 id="architectural-choices-that-break-conservation">Architectural Choices That Break Conservation</h3>
<p>The authors provide theoretical justification for why specific architectural choices break energy conservation:</p>
<ul>
<li><strong>Max Neighbor Limit (KNN)</strong>: Introduces discontinuity in the PES. If a neighbor at distance $r$ moves to $r + \epsilon$ and drops out of the top-$K$, the energy changes discontinuously.</li>
<li><strong>Grid Discretization</strong>: Projecting spherical harmonics to a spatial grid introduces discretization errors in energy gradients that break conservation. This can be mitigated with higher-resolution grids but not eliminated.</li>
<li><strong>Direct-Force Prediction</strong>: Imposes no mathematical constraint that forces must be the gradient of an energy scalar field. In other words, $\nabla \times \mathbf{F} \neq 0$ is permitted, violating the requirement for a conservative force field.</li>
</ul>
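<p>The max-neighbor discontinuity is easy to demonstrate with a toy per-atom energy: when two neighbors carrying different (species-dependent) feature weights swap ranks at the $K$-th position, an infinitesimal displacement produces an $O(1)$ energy jump. The example below is hypothetical, not drawn from the paper:</p>

```python
def knn_energy(neighbors, k=2):
    """Toy per-atom energy from the K nearest neighbors only. Each neighbor
    is (distance, weight), where the weight stands in for learned,
    species-dependent features."""
    nearest = sorted(neighbors)[:k]
    return sum(w / d for d, w in nearest)

# Neighbor B sits just inside neighbor C; nudging B past C by 2e-9 swaps
# which one falls inside the top-K, and their different "features" make
# the energy jump discontinuously.
eps = 1e-9
before = knn_energy([(1.0, 1.0), (2.0, 1.0), (2.0 + eps, 5.0)], k=2)
after = knn_energy([(1.0, 1.0), (2.0 + 2 * eps, 1.0), (2.0 + eps, 5.0)], k=2)
jump = abs(after - before)
assert jump > 1.0  # O(1) energy change from an O(1e-9) displacement
```

<p>A distance cutoff with a smooth envelope avoids this entirely: a neighbor's contribution is already zero (with zero derivatives) by the time it leaves the graph.</p>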
<h3 id="displacement-sensitivity-in-phonon-calculations">Displacement Sensitivity in Phonon Calculations</h3>
<p>An important empirical finding concerns how displacement values affect phonon predictions. Conservative models (eSEN, MACE) show convergent phonon band structures as displacement decreases toward zero. In contrast, direct-force models (eqV2-S-DeNS) fail to converge, exhibiting missing acoustic branches and spurious imaginary frequencies at small displacements. While direct-force models achieve competitive thermodynamic property accuracy at large displacements (0.2 Å), this is deceptive: the underlying phonon band structures remain inaccurate, and the apparent accuracy comes from Boltzmann-weighted integrals smoothing over errors.</p>
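<p>The displacement sensitivity has a simple numerical analogue. Estimating a force constant by finite differences on a PES carrying a tiny high-frequency ripple (a stand-in for the roughness of a direct-force PES near minima) degrades as the displacement shrinks, while a large displacement averages the ripple away; on a smooth PES the estimate converges. This toy model is illustrative, not taken from the paper:</p>

```python
import math

def force_constant(pes, x0, h):
    """Finite-difference estimate of the second derivative (force constant)
    at displacement h, as in frozen-phonon calculations."""
    return (pes(x0 + h) - 2 * pes(x0) + pes(x0 - h)) / (h * h)

smooth = lambda x: 0.5 * x * x                           # ideal harmonic PES, k = 1
ripple = lambda x: smooth(x) + 1e-6 * math.sin(1e3 * x)  # tiny high-freq roughness

# Smooth PES: the estimate is accurate even at a small displacement.
assert abs(force_constant(smooth, 0.3, 0.01) - 1.0) < 1e-4
# Rough PES: the ripple's curvature (amplitude * freq^2 = 1.0) contaminates
# small-h estimates, while a large displacement averages it away.
err_small = abs(force_constant(ripple, 0.3, 0.001) - 1.0)
err_large = abs(force_constant(ripple, 0.3, 0.2) - 1.0)
assert err_large < err_small
```

<p>This mirrors the benchmark behavior: direct-force models look accurate at 0.2 Å displacements yet fail as the displacement approaches the physically meaningful small-displacement limit.</p>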
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Fu, X., Wood, B. M., Barroso-Luque, L., Levine, D. S., Gao, M., Dzamba, M., &amp; Zitnick, C. L. (2025). Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction. <em>Proceedings of the 42nd International Conference on Machine Learning (ICML)</em>, PMLR 267:17875–17893.</p>
<p><strong>Publication</strong>: ICML 2025 (Spotlight)</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{fu2025learning,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Fu, Xiang and Wood, Brandon M. and Barroso-Luque, Luis and Levine, Daniel S. and Gao, Meng and Dzamba, Misko and Zitnick, C. Lawrence}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 42nd International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{Proceedings of Machine Learning Research}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{267}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{17875--17893}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{PMLR}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://icml.cc/virtual/2025/poster/45302">ICML 2025 poster page</a></li>
<li><a href="https://openreview.net/forum?id=R0PBjxIbgm">OpenReview forum</a></li>
<li><a href="https://openreview.net/pdf?id=R0PBjxIbgm">PDF on OpenReview</a></li>
<li><a href="https://huggingface.co/facebook/OMAT24">OMAT24 model on Hugging Face</a></li>
<li><a href="https://github.com/facebookresearch/fairchem">Code on GitHub (fairchem)</a></li>
</ul>
]]></content:encoded></item><item><title>Efficient DFT Hamiltonian Prediction via Adaptive Sparsity</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/efficient-dft-hamiltonian-predicton-sphnet/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/efficient-dft-hamiltonian-predicton-sphnet/</guid><description>Luo et al. introduce SPHNet, using adaptive sparsity to achieve up to 7x speedup in SE(3)-equivariant Hamiltonian prediction.</description><content:encoded><![CDATA[<h2 id="core-innovation-adaptive-sparsity-in-se3-networks">Core Innovation: Adaptive Sparsity in SE(3) Networks</h2>
<p>This is a <strong>methodological paper</strong> introducing a novel architecture and training curriculum to solve efficiency bottlenecks in Geometric Deep Learning. It directly tackles the primary computational bottleneck in modern SE(3)-equivariant graph neural networks (the tensor product operation) and proposes a generalizable solution through adaptive network sparsification.</p>
<h2 id="the-computational-bottleneck-in-dft-hamiltonian-prediction">The Computational Bottleneck in DFT Hamiltonian Prediction</h2>
<p>SE(3)-equivariant networks are accurate but unscalable for DFT Hamiltonian prediction due to two key bottlenecks:</p>
<ul>
<li><strong>Atom Scaling</strong>: Tensor Product (TP) operations grow quadratically with atoms ($N^2$).</li>
<li><strong>Basis Set Scaling</strong>: Computational complexity grows with the sixth power of the angular momentum order ($L^6$). Larger basis sets (e.g., def2-TZVP) require higher orders ($L=6$), making them prohibitively slow.</li>
</ul>
<p>Existing SE(3)-equivariant models cannot handle large molecules (40-100 atoms) with high-quality basis sets, limiting their practical applicability in computational chemistry.</p>
<h2 id="sphnet-architecture-and-the-three-phase-sparsity-scheduler">SPHNet Architecture and the Three-Phase Sparsity Scheduler</h2>
<p><strong>SPHNet</strong> introduces <strong>Adaptive Sparsity</strong> to prune redundant computations at two levels:</p>
<ol>
<li><strong>Sparse Pair Gate</strong>: Learns which atom pairs to include in message passing, adapting the interaction graph based on importance.</li>
<li><strong>Sparse TP Gate</strong>: Filters which spherical harmonic triplets $(l_1, l_2, l_3)$ are computed in tensor product operations, pruning higher-order combinations that contribute less to accuracy.</li>
<li><strong>Three-Phase Sparsity Scheduler</strong>: A training curriculum (Random → Adaptive → Fixed) that enables stable convergence to high-performing sparse subnetworks.</li>
</ol>
<p>Key insight: The Sparse Pair Gate learns to preserve long-range interactions (16-25 Å) at higher rates than short-range ones. Short-range pairs are abundant and easier to learn, while rare long-range interactions require more samples for accurate representation, making them more critical to retain.</p>
<h2 id="benchmarks-and-ablation-studies">Benchmarks and Ablation Studies</h2>
<p>The authors evaluated SPHNet on three datasets (MD17, QH9, and PubChemQH) with varying molecule sizes and basis set complexities. Baselines include SchNOrb, PhiSNet, QHNet, and WANet. SchNOrb and PhiSNet results are limited to MD17, as those models are designed for trajectory datasets. WANet was not open-sourced, so only partial metrics from its paper are reported.</p>
<h3 id="evaluation-metrics">Evaluation Metrics</h3>
<ul>
<li><strong>Hamiltonian MAE ($H$)</strong>: Mean absolute error between predicted and DFT-computed Hamiltonian matrices, in Hartrees ($E_h$)</li>
<li><strong>Occupied Orbital Energy MAE ($\epsilon$)</strong>: Mean absolute error of all occupied molecular orbital energies derived from the predicted Hamiltonian</li>
<li><strong>Orbital Coefficient Similarity ($\psi$)</strong>: Cosine similarity of occupied molecular orbital coefficients between predicted and reference wavefunctions</li>
</ul>
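<p>For intuition, all three metrics can be computed from an eigendecomposition. A simplified numpy sketch (toy real symmetric matrices; this ignores the overlap matrix and generalized eigenproblem of actual DFT):</p>

```python
import numpy as np

rng = np.random.default_rng(42)
n, n_occ = 6, 3  # toy basis size and number of "occupied" orbitals

A = rng.standard_normal((n, n))
H_ref = (A + A.T) / 2                      # symmetric toy "Hamiltonian"
B = rng.standard_normal((n, n))
H_pred = H_ref + 1e-3 * (B + B.T) / 2      # a slightly perturbed "prediction"

# Hamiltonian MAE
mae_H = np.abs(H_pred - H_ref).mean()

# Orbital energies and coefficients (eigh returns eigenvalues in ascending order)
eps_ref, C_ref = np.linalg.eigh(H_ref)
eps_pred, C_pred = np.linalg.eigh(H_pred)
mae_eps = np.abs(eps_pred[:n_occ] - eps_ref[:n_occ]).mean()

# Coefficient similarity: |cosine| per occupied orbital (eigenvector signs are arbitrary)
psi = np.abs(np.sum(C_ref[:, :n_occ] * C_pred[:, :n_occ], axis=0)).mean()
print(mae_H, mae_eps, psi)
```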
<h3 id="ablation-studies">Ablation Studies</h3>
<p><strong>Sparse Gates</strong> (on PubChemQH):</p>
<table>
  <thead>
      <tr>
          <th>Configuration</th>
          <th>$H$ [$10^{-6} E_h$] $\downarrow$</th>
          <th>Memory [GB] $\downarrow$</th>
          <th>Speedup $\uparrow$</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Both gates</td>
          <td>97.31</td>
          <td>5.62</td>
          <td>7.09x</td>
      </tr>
      <tr>
          <td>Pair Gate only</td>
          <td>87.70</td>
          <td>6.98</td>
          <td>2.73x</td>
      </tr>
      <tr>
          <td>TP Gate only</td>
          <td>94.31</td>
          <td>8.04</td>
          <td>3.98x</td>
      </tr>
      <tr>
          <td>Neither gate</td>
          <td>86.35</td>
          <td>10.91</td>
          <td>1.73x</td>
      </tr>
  </tbody>
</table>
<p>The Sparse Pair Gate contributes a 78% speedup with 30% memory reduction. The Sparse TP Gate (pruning 70% of combinations) yields a 160% speedup. Both gates together achieve the highest speedup, though accuracy slightly decreases compared to no gating.</p>
<p><strong>Three-Phase Scheduler</strong>: Removing the random phase causes convergence to local optima ($112.68 \pm 10.75$ vs $97.31 \pm 0.52$). Removing the adaptive phase increases variance and lowers accuracy ($122.79 \pm 19.02$). Removing the fixed phase has minimal accuracy impact but reduces speedup from 7.09x to 5.45x due to dynamic graph overhead.</p>
<p><strong>Sparsity Rate</strong>: The critical sparsity threshold scales with system complexity: 30% for MD17 (small molecules), 40% for QH9 (medium), and 70% for PubChemQH (large). Beyond the threshold, MAE increases sharply. Computational cost decreases approximately linearly with sparsity rate.</p>
<h3 id="transferability-to-other-models">Transferability to Other Models</h3>
<p>To demonstrate the speedup is architecture-agnostic, the authors applied the Sparse Pair Gate and Sparse TP Gate to the QHNet baseline on PubChemQH:</p>
<table>
  <thead>
      <tr>
          <th>Configuration</th>
          <th>$H$ [$10^{-6} E_h$] $\downarrow$</th>
          <th>Memory [GB] $\downarrow$</th>
          <th>Speedup $\uparrow$</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>QHNet baseline</td>
          <td>123.74</td>
          <td>22.50</td>
          <td>1.00x</td>
      </tr>
      <tr>
          <td>+ TP Gate</td>
          <td>128.16</td>
          <td>12.68</td>
          <td>2.04x</td>
      </tr>
      <tr>
          <td>+ Pair Gate</td>
          <td>126.27</td>
          <td>10.07</td>
          <td>1.66x</td>
      </tr>
      <tr>
          <td>+ Both gates</td>
          <td>128.89</td>
          <td>8.46</td>
          <td>3.30x</td>
      </tr>
  </tbody>
</table>
<p>The gates reduced QHNet&rsquo;s memory by 62% and improved speed by 3.3x with modest accuracy trade-off, confirming the gates are portable modules applicable to other SE(3)-equivariant architectures.</p>
<h2 id="performance-results">Performance Results</h2>
<h3 id="qh9-134k-molecules-leq-20-atoms">QH9 (134k molecules, $\leq$ 20 atoms)</h3>
<p>SPHNet achieves 3.3x to 4.0x speedup over QHNet across all four QH9 splits, with improved Hamiltonian MAE and orbital energy MAE. Memory drops to 0.23 GB/sample (33% of QHNet&rsquo;s 0.70 GB). On the stable-iid split, Hamiltonian MAE improves from 76.31 to 45.48 ($10^{-6} E_h$).</p>
<h3 id="pubchemqh-50k-molecules-40-100-atoms">PubChemQH (50k molecules, 40-100 atoms)</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>$H$ [$10^{-6} E_h$] $\downarrow$</th>
          <th>$\epsilon$ [$E_h$] $\downarrow$</th>
          <th>$\psi$ [$10^{-2}$] $\uparrow$</th>
          <th>Memory [GB] $\downarrow$</th>
          <th>Speedup $\uparrow$</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>QHNet</td>
          <td>123.74</td>
          <td>3.33</td>
          <td>2.32</td>
          <td>22.5</td>
          <td>1.0x</td>
      </tr>
      <tr>
          <td>WANet</td>
          <td>99.98</td>
          <td><strong>1.17</strong></td>
          <td><strong>3.13</strong></td>
          <td>15.0</td>
          <td>2.4x</td>
      </tr>
      <tr>
          <td>SPHNet</td>
          <td><strong>97.31</strong></td>
          <td>2.16</td>
          <td>2.97</td>
          <td><strong>5.62</strong></td>
          <td><strong>7.1x</strong></td>
      </tr>
  </tbody>
</table>
<p>SPHNet achieves the best Hamiltonian MAE and efficiency, though WANet outperforms on orbital energy MAE and coefficient similarity. The higher speedup on PubChemQH (vs QH9) reflects greater computational redundancy in larger systems with higher-order basis sets ($L_{max} = 6$ for def2-TZVP vs $L_{max} = 4$ for def2-SVP).</p>
<h3 id="md17-small-molecule-trajectories">MD17 (Small Molecule Trajectories)</h3>
<p>SPHNet achieves accuracy comparable to QHNet and PhiSNet on four MD17 molecules (water, ethanol, malondialdehyde, uracil; 3-12 atoms). MD17 represents a simpler task where baseline models already perform well, leaving limited room for improvement. For water (3 atoms), the number of interaction combinations is inherently small, limiting the benefit of adaptive sparsification.</p>
<h3 id="scaling-limit">Scaling Limit</h3>
<p>SPHNet can train on systems with approximately 3000 atomic orbitals on a single A6000 GPU; the QHNet baseline runs out of memory at approximately 1800 orbitals. Memory consumption scales more favorably as molecule size increases.</p>
<h3 id="key-findings">Key Findings</h3>
<ul>
<li><strong>Adaptive sparsity scales with system complexity</strong>: The method is most effective for large systems where redundancy is high. For small molecules (e.g., water with only 3 atoms), every interaction is critical, so pruning hurts accuracy and yields negligible speedup.</li>
<li><strong>Long-range pair preservation</strong>: The Sparse Pair Gate selects long-range pairs (16-25 Å) at higher rates than short-range ones. Short-range pairs are numerous and easier to learn, while rare long-range interactions are harder to represent and thus more critical to retain.</li>
<li><strong>Generalizable components</strong>: The sparsification techniques are portable modules, demonstrated by successful integration into QHNet with 3.3x speedup.</li>
<li><strong>Architecture ablation</strong>: Removing one Vectorial Node Interaction block or Spherical Node Interaction block significantly hurts accuracy, confirming the importance of the progressive order-increase design. Removing one Pair Construction block has less impact, suggesting room for further speedup.</li>
</ul>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/microsoft/SPHNet">SPHNet (GitHub)</a></td>
          <td>Code</td>
          <td>MIT</td>
          <td>Official implementation; archived by Microsoft (Dec 2025), read-only</td>
      </tr>
      <tr>
          <td><a href="https://huggingface.co/datasets/EperLuo/PubChemQH">PubChemQH (Hugging Face)</a></td>
          <td>Dataset</td>
          <td>MIT</td>
          <td>50k molecules, 40-100 atoms, def2-TZVP basis</td>
      </tr>
  </tbody>
</table>
<p>No pre-trained model weights are provided. MD17 and QH9 are publicly available community datasets. Training requires 4x NVIDIA A100 (80GB) GPUs; benchmarking uses a single NVIDIA RTX A6000 (46GB).</p>
<h3 id="data">Data</h3>
<p>The experiments evaluated SPHNet on three datasets with different molecular sizes and basis set complexities. All datasets use DFT calculations as ground truth, with MD17 using the PBE exchange-correlation functional and QH9/PubChemQH using B3LYP.</p>
<table>
  <thead>
      <tr>
          <th>Dataset</th>
          <th>Molecules</th>
          <th>Molecule Size</th>
          <th>Basis Set</th>
          <th>$L_{max}$</th>
          <th>Functional</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>MD17</td>
          <td>4 systems</td>
          <td>3-12 atoms (water, ethanol, malondialdehyde, uracil)</td>
          <td>def2-SVP</td>
          <td>4</td>
          <td>PBE</td>
      </tr>
      <tr>
          <td>QH9</td>
          <td>134k</td>
          <td>$\leq$ 20 atoms (Stable/Dynamic splits)</td>
          <td>def2-SVP</td>
          <td>4</td>
          <td>B3LYP</td>
      </tr>
      <tr>
          <td>PubChemQH</td>
          <td>50k</td>
          <td>40-100 atoms</td>
          <td>def2-TZVP</td>
          <td>6</td>
          <td>B3LYP</td>
      </tr>
  </tbody>
</table>
<p><strong>Data Availability</strong>:</p>
<ul>
<li><strong>MD17 &amp; QH9</strong>: Publicly available</li>
<li><strong>PubChemQH</strong>: Publicly available on Hugging Face (<a href="https://huggingface.co/datasets/EperLuo/PubChemQH">EperLuo/PubChemQH</a>)</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<p><strong>Loss Function</strong>:</p>
<p>The model learns the <strong>residual</strong> $\Delta H$:</p>
<p>$$
\begin{aligned}
\Delta H &amp;= H_{\text{ref}} - H_{\text{init}} \\
\mathcal{L} &amp;= \text{MAE}(H_{\text{ref}}, H_{\text{pred}}) + \text{MSE}(H_{\text{ref}}, H_{\text{pred}})
\end{aligned}
$$</p>
<p>where $H_{\text{init}}$ is a computationally inexpensive initial guess computed via PySCF; adding the predicted residual back gives the final prediction, $H_{\text{pred}} = H_{\text{init}} + \Delta H$.</p>
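<p>A minimal numpy sketch of this objective (my reading of the setup; the implementation&rsquo;s exact reductions and weightings may differ):</p>

```python
import numpy as np

def sphnet_style_loss(H_ref, H_pred):
    """Combined MAE + MSE on the predicted Hamiltonian, as written above."""
    diff = H_pred - H_ref
    return np.abs(diff).mean() + (diff ** 2).mean()

# Residual learning: the network predicts Delta H on top of a cheap initial guess.
H_init = np.eye(3)                     # stand-in for the PySCF initial guess
delta_H = 0.1 * np.ones((3, 3))        # stand-in for the network output
H_pred = H_init + delta_H
H_ref = np.eye(3) + 0.1 * np.ones((3, 3))
print(sphnet_style_loss(H_ref, H_pred))  # 0.0 for a perfect residual
```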
<p><strong>Hyperparameters</strong>:</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>PubChemQH</th>
          <th>QH9</th>
          <th>MD17</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Batch Size</td>
          <td>8</td>
          <td>32</td>
          <td>10 (uracil: 5)</td>
      </tr>
      <tr>
          <td>Training Steps</td>
          <td>300k</td>
          <td>260k</td>
          <td>200k</td>
      </tr>
      <tr>
          <td>Warmup Steps</td>
          <td>1k</td>
          <td>1k</td>
          <td>1k</td>
      </tr>
      <tr>
          <td>Learning Rate</td>
          <td>1e-3</td>
          <td>1e-3</td>
          <td>5e-4</td>
      </tr>
      <tr>
          <td>Sparsity Rate</td>
          <td>0.7</td>
          <td>0.4</td>
          <td>0.1-0.3</td>
      </tr>
      <tr>
          <td>TSS Epoch $t$</td>
          <td>3</td>
          <td>3</td>
          <td>3</td>
      </tr>
  </tbody>
</table>
<p><strong>Sparse Pair Gate</strong>: Adapts the interaction graph. It concatenates zero-order features and inner products of atom pairs, then passes them through a linear layer $F_p$ with sigmoid activation to learn a weight $W_p^{ij}$ for every pair. Pairs are kept only if selected by the scheduler ($U_p^{TSS}$). The overhead comes primarily from the linear layer $F_p$.</p>
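<p>The gating logic can be sketched in a few lines of numpy. Feature shapes and scoring here are hypothetical; the real gate operates on equivariant features inside the network, and selection is controlled by the scheduler $U_p^{TSS}$:</p>

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sparse_pair_gate(pair_feats, W, b, sparsity):
    """Score each atom pair with a linear layer + sigmoid (the F_p of the text),
    then keep only the top (1 - sparsity) fraction of pairs."""
    weights = sigmoid(pair_feats @ W + b)          # learned pair weight W_p^{ij}
    n_keep = max(1, int(round((1 - sparsity) * len(weights))))
    keep = np.argsort(weights)[-n_keep:]           # indices the scheduler would select
    return np.sort(keep), weights

rng = np.random.default_rng(0)
P, d = 100, 8                                      # hypothetical pair count / feature dim
feats = rng.standard_normal((P, d))
W, b = rng.standard_normal(d), 0.0
kept, w = sparse_pair_gate(feats, W, b, sparsity=0.7)
print(len(kept))  # 30 of 100 pairs survive at 70% sparsity
```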
<p><strong>Sparse TP Gate</strong>: Filters triplets $(l_1, l_2, l_3)$ inside the TP operation. Higher-order combinations are more likely to be pruned. Complexity: $\mathcal{O}(L^3)$.</p>
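<p>To see how large the pruning space is, a short sketch enumerating the symmetry-allowed triplets (those satisfying the triangle inequality $|l_1 - l_2| \leq l_3 \leq l_1 + l_2$) up to a maximum order, i.e. the candidate set the Sparse TP Gate filters (ignoring channel multiplicities):</p>

```python
def tp_triplets(L_max):
    """Enumerate valid tensor-product paths (l1, l2, l3) with the triangle
    inequality |l1 - l2| <= l3 <= l1 + l2 and all orders <= L_max."""
    return [(l1, l2, l3)
            for l1 in range(L_max + 1)
            for l2 in range(L_max + 1)
            for l3 in range(abs(l1 - l2), min(l1 + l2, L_max) + 1)]

for L in (4, 6):
    print(L, len(tp_triplets(L)))
```

<p>Moving from $L_{max} = 4$ (def2-SVP) to $L_{max} = 6$ (def2-TZVP) grows the path count from 65 to 175, and each individual path also becomes more expensive, which is why pruning 70% of combinations pays off on PubChemQH.</p>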
<p><strong>Three-Phase Sparsity Scheduler</strong>: Training curriculum designed to optimize the sparse gates effectively:</p>
<ul>
<li><strong>Phase 1 (Random)</strong>: Each candidate is kept with probability $1-k$, where $k$ is the sparsity rate, ensuring unbiased weight updates. Complexity: $\mathcal{O}(|U|)$.</li>
<li><strong>Phase 2 (Adaptive)</strong>: Selects the top $(1-k)$ fraction of candidates by learned weight magnitude. Complexity: $\mathcal{O}(|U|\log|U|)$.</li>
<li><strong>Phase 3 (Fixed)</strong>: Freezes the connectivity mask for maximum inference speed. No overhead.</li>
</ul>
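<p>The three phases amount to three different mask-selection rules over the candidate set. A hypothetical sketch (names and shapes are mine):</p>

```python
import numpy as np

def scheduler_mask(weights, sparsity, phase, rng, frozen_mask=None):
    """Select a boolean keep-mask over |U| candidates, per scheduler phase."""
    n = len(weights)
    if phase == "random":                     # Phase 1: unbiased exploration
        return rng.random(n) < (1 - sparsity)
    if phase == "adaptive":                   # Phase 2: top (1 - sparsity) by magnitude
        n_keep = int(round((1 - sparsity) * n))
        mask = np.zeros(n, dtype=bool)
        mask[np.argsort(np.abs(weights))[-n_keep:]] = True
        return mask
    if phase == "fixed":                      # Phase 3: reuse the frozen mask, no overhead
        return frozen_mask
    raise ValueError(phase)

rng = np.random.default_rng(1)
w = rng.standard_normal(1000)
m_rand = scheduler_mask(w, 0.7, "random", rng)
m_adapt = scheduler_mask(w, 0.7, "adaptive", rng)
m_fixed = scheduler_mask(w, 0.7, "fixed", rng, frozen_mask=m_adapt)
print(int(m_adapt.sum()), int(m_fixed.sum()))
```

<p>Phase 1 keeps gradient flow to all candidates, Phase 2 concentrates training on the strongest ones, and Phase 3 removes the selection overhead entirely, consistent with the reported speedup gap (7.09x vs 5.45x without the fixed phase).</p>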
<p><strong>Weight Initialization</strong>: Learnable sparsity weights ($W$) initialized as all-ones vector.</p>
<h3 id="models">Models</h3>
<p>The model predicts the Hamiltonian matrix $H$ from atomic numbers $Z$ and coordinates $r$.</p>
<p><strong>Inputs</strong>: Atomic numbers ($Z$) and 3D coordinates.</p>
<p><strong>Backbone Structure</strong>:</p>
<ol>
<li><strong>Vectorial Node Interaction (x4)</strong>: Uses long-short range message passing. Extracts vectorial representations ($l=1$) without high-order TPs to save cost.</li>
<li><strong>Spherical Node Interaction (x2)</strong>: Projects features to high-order spherical harmonics (up to $L_{max}$). The first block increases the maximum order from 0 to $L_{max}$ without the Sparse Pair Gate; the second block applies the <strong>Sparse Pair Gate</strong> to filter node pairs.</li>
<li><strong>Pair Construction Block (x2)</strong>: Splits into <strong>Diagonal</strong> (self-interaction) and <strong>Non-Diagonal</strong> (cross-interaction) blocks. Both use the <strong>Sparse TP Gate</strong> to prune cross-order combinations $(l_1, l_2, l_3)$. The Non-Diagonal blocks also use the <strong>Sparse Pair Gate</strong> to filter atom pairs. The two Pair Construction blocks receive representations from the two Spherical Node Interaction blocks respectively, and their outputs are summed.</li>
<li><strong>Expansion Block</strong>: Reconstructs the full Hamiltonian matrix from the sparse irreducible representations, exploiting symmetry ($H_{ji} = H_{ij}^T$) to halve computations.</li>
</ol>
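<p>The symmetry trick in the Expansion Block is simple to sketch: only blocks with $i \leq j$ are computed, and the lower triangle is filled by transposition (the block layout here is hypothetical):</p>

```python
import numpy as np

def assemble_hamiltonian(blocks, sizes):
    """Assemble a full symmetric H from upper-triangular atom-pair blocks only,
    exploiting H_ji = H_ij^T to skip computing the lower triangle."""
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    n = offsets[-1]
    H = np.zeros((n, n))
    for (i, j), B in blocks.items():          # only blocks with i <= j are provided
        H[offsets[i]:offsets[i+1], offsets[j]:offsets[j+1]] = B
        if i != j:
            H[offsets[j]:offsets[j+1], offsets[i]:offsets[i+1]] = B.T
    return H

sizes = [2, 3]                                # hypothetical per-atom orbital counts
rng = np.random.default_rng(0)
diag0 = rng.standard_normal((2, 2)); diag0 = (diag0 + diag0.T) / 2
diag1 = rng.standard_normal((3, 3)); diag1 = (diag1 + diag1.T) / 2
off = rng.standard_normal((2, 3))
H = assemble_hamiltonian({(0, 0): diag0, (1, 1): diag1, (0, 1): off}, sizes)
print(np.allclose(H, H.T))  # True
```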
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Training</strong>: 4x NVIDIA A100 (80GB)</li>
<li><strong>Benchmarking</strong>: Single NVIDIA RTX A6000 (46GB)</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Luo, E., Wei, X., Huang, L., Li, Y., Yang, H., Xia, Z., Wang, Z., Liu, C., Shao, B., &amp; Zhang, J. (2025). Efficient and Scalable Density Functional Theory Hamiltonian Prediction through Adaptive Sparsity. <em>Proceedings of the 42nd International Conference on Machine Learning</em>, PMLR 267:41368&ndash;41390.</p>
<p><strong>Publication</strong>: ICML 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{luo2025efficient,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Efficient and Scalable Density Functional Theory Hamiltonian Prediction through Adaptive Sparsity}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Luo, Erpai and Wei, Xinran and Huang, Lin and Li, Yunyang and Yang, Han and Xia, Zaishuo and Wang, Zun and Liu, Chang and Shao, Bin and Zhang, Jia}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 42nd International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{41368--41390}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{267}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{Proceedings of Machine Learning Research}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{PMLR}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://icml.cc/virtual/2025/poster/45656">ICML 2025 poster page</a></li>
<li><a href="https://openreview.net/forum?id=K3lykWhXON">OpenReview forum</a></li>
<li><a href="https://openreview.net/pdf?id=K3lykWhXON">PDF on OpenReview</a></li>
<li><a href="https://github.com/microsoft/SPHNet">GitHub Repository</a> <em>(Note: The official repository was archived by Microsoft in December 2025. It is available for reference but no longer actively maintained.)</em></li>
</ul>
]]></content:encoded></item><item><title>Dark Side of Forces: Non-Conservative ML Force Models</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/dark-side-of-forces/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/dark-side-of-forces/</guid><description>Bigi et al. critique non-conservative force models in ML potentials, showing their simulation failures and proposing hybrid solutions.</description><content:encoded><![CDATA[<h2 id="contribution-systematic-assessment-of-non-conservative-ml-force-models">Contribution: Systematic Assessment of Non-Conservative ML Force Models</h2>
<p>This is a <strong>Systematization</strong> paper. It systematically catalogs the exact failure modes of existing non-conservative force approaches, quantifies them with a new diagnostic metric, and proposes a hybrid Multiple Time-Stepping solution combining the speed benefits of direct force prediction with the physical correctness of conservative models.</p>
<h2 id="motivation-the-speed-accuracy-trade-off-in-ml-force-fields">Motivation: The Speed-Accuracy Trade-off in ML Force Fields</h2>
<p>Many recent machine learning interatomic potential (MLIP) architectures predict forces directly ($F_\theta(r)$). This &ldquo;non-conservative&rdquo; approach avoids the computational overhead of automatic differentiation, yielding faster inference (typically 2-3x speedup) and faster training (up to 3x). However, it sacrifices energy conservation and rotational constraints, potentially destabilizing molecular dynamics simulations. The field lacks rigorous quantification of when this trade-off breaks down and how to mitigate the failures.</p>
<h2 id="novelty-jacobian-asymmetry-and-hybrid-architectures">Novelty: Jacobian Asymmetry and Hybrid Architectures</h2>
<p>Four key contributions:</p>
<ol>
<li>
<p><strong>Jacobian Asymmetry Metric ($\lambda$):</strong> A quantitative diagnostic for non-conservation. Since conservative forces derive from a scalar field, their Jacobian (the Hessian of energy) must be symmetric. The normalized norm of the antisymmetric part quantifies the degree of violation:
$$ \lambda = \frac{|| \mathbf{J}_{\text{anti}} ||_F}{|| \mathbf{J} ||_F} $$
where $\mathbf{J}_{\text{anti}} = (\mathbf{J} - \mathbf{J}^\top)/2$. Measured values range from $\lambda \approx 0.004$ (PET-NC) to $\lambda \approx 0.032$ (SOAP-BPNN-NC), with ORB at 0.015 and EquiformerV2 at 0.017. Notably, the pairwise $\lambda_{ij}$ approaches 1 at large interatomic distances, meaning non-conservative artifacts disproportionately affect long-range and collective interactions.</p>
</li>
<li>
<p><strong>Systematic Failure Mode Catalog:</strong> First comprehensive demonstration that non-conservative models cause runaway heating in NVE ensembles (temperature drifts of ~7,000 billion K/s for PET-NC and ~10x larger for ORB) and equipartition violations in NVT ensembles where different atom types equilibrate to different temperatures, a physical impossibility.</p>
</li>
<li>
<p><strong>Theoretical Analysis of Force vs. Energy Training:</strong> Force-only training overemphasizes high-frequency vibrational modes because force labels carry per-atom gradients that are dominated by stiff, short-range interactions. Energy labels provide a more balanced representation across the frequency spectrum. Additionally, conservative models benefit from backpropagation extending the effective receptive field to approximately 2x the interaction cutoff, while direct-force models are limited to the nominal cutoff radius.</p>
</li>
<li>
<p><strong>Hybrid Training and Inference Protocol:</strong> A practical workflow that combines fast direct-force prediction with conservative corrections:</p>
<ul>
<li><strong>Training:</strong> Pre-train on direct forces, then fine-tune on energy gradients (2-4x faster than training conservative models from scratch)</li>
<li><strong>Inference:</strong> Multiple Time-Stepping (MTS) where fast non-conservative forces are periodically corrected by slower conservative forces</li>
</ul>
</li>
</ol>
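<p>The asymmetry metric is easy to reproduce on toy force fields. A numpy sketch using a finite-difference Jacobian (the quadratic-energy and rotational fields are my own choices, not the paper&rsquo;s models):</p>

```python
import numpy as np

def jacobian(force, x, h=1e-5):
    """Finite-difference Jacobian J_ab = dF_a/dx_b of a force field."""
    n = len(x)
    J = np.zeros((n, n))
    for b in range(n):
        dx = np.zeros(n); dx[b] = h
        J[:, b] = (force(x + dx) - force(x - dx)) / (2 * h)
    return J

def asymmetry(J):
    """lambda = ||(J - J^T)/2||_F / ||J||_F, as defined above."""
    return np.linalg.norm((J - J.T) / 2) / np.linalg.norm(J)

# Conservative toy force: F = -grad E for E = 0.5 * x^T A x (A symmetric).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2
f_cons = lambda x: -A @ x

# Non-conservative toy force: add a rotational (curl-carrying) component.
R = np.array([[0.0, -1.0], [1.0, 0.0]])
S = np.kron(np.eye(2), R)                  # antisymmetric 4x4
f_nc = lambda x: -A @ x + 0.3 * S @ x

x0 = rng.standard_normal(4)
print(asymmetry(jacobian(f_cons, x0)))     # ~0: symmetric Jacobian
print(asymmetry(jacobian(f_nc, x0)))       # clearly nonzero
```

<p>The conservative field yields $\lambda \approx 0$ up to finite-difference error, while the rotational component immediately produces a nonzero $\lambda$, the same signature the paper measures in trained direct-force models.</p>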
<h2 id="methodology-systematic-failure-mode-analysis">Methodology: Systematic Failure Mode Analysis</h2>
<p>The evaluation systematically tests multiple state-of-the-art models across diverse simulation scenarios:</p>
<p><strong>Models tested:</strong></p>
<ul>
<li><strong>PET-C/PET-NC</strong> (Point Edge Transformer, conservative and non-conservative variants)</li>
<li><strong>PET-M</strong> (hybrid variant jointly predicting both conservative and non-conservative forces)</li>
<li><strong>ORB-v2</strong> (non-conservative, trained on Alexandria/MPtrj)</li>
<li><strong>EquiformerV2</strong> (non-conservative equivariant Transformer)</li>
<li><strong>MACE-MP-0</strong> (conservative message-passing)</li>
<li><strong>SevenNet</strong> (conservative message-passing)</li>
<li><strong>SOAP-BPNN-C/SOAP-BPNN-NC</strong> (descriptor-based baseline, both conservative and non-conservative variants)</li>
</ul>
<p><strong>Test scenarios:</strong></p>
<ol>
<li><strong>NVE stability tests</strong> on bulk liquid water, graphene, amorphous carbon, and FCC aluminum</li>
<li><strong>Thermostat artifact analysis</strong> with Langevin and GLE thermostats</li>
<li><strong>Geometry optimization</strong> on water snapshots and QM9 molecules using FIRE and L-BFGS</li>
<li><strong>MTS validation</strong> on OC20 catalysis dataset</li>
<li><strong>Species-resolved temperature measurements</strong> for equipartition testing</li>
</ol>
<p><strong>Key metrics:</strong></p>
<ul>
<li>Jacobian asymmetry ($\lambda$)</li>
<li>Kinetic temperature drift in NVE</li>
<li>Velocity-velocity correlations</li>
<li>Radial distribution functions</li>
<li>Species-resolved temperatures</li>
<li>Inference speed benchmarks</li>
</ul>
<h2 id="results-simulation-instability-and-hybrid-solutions">Results: Simulation Instability and Hybrid Solutions</h2>
<p>Purely non-conservative models are <strong>unsuitable for production simulations</strong> due to uncontrollable unphysical artifacts that no thermostat can correct. Key findings:</p>
<p><strong>Performance failures:</strong></p>
<ul>
<li>Non-conservative models exhibited catastrophic temperature drift in NVE simulations: ~7,000 billion K/s for PET-NC and ~70,000 billion K/s for ORB, with EquiformerV2 comparable to PET-NC</li>
<li>Strong Langevin thermostats ($\tau=10$ fs) damped diffusion by ~5x, negating the speed benefits of non-conservative models</li>
<li>Advanced GLE thermostats also failed to control non-conservative drift (ORB reached 1181 K vs. 300 K target)</li>
<li>Equipartition violations: under stochastic velocity rescaling, O and H atoms equilibrated at different temperatures. For ORB, H atoms reached 336 K and O atoms 230 K against a 300 K target. For PET-NC, deviations were smaller but still significant (H at 296 K, O at 310 K).</li>
<li>Geometry optimization was more fragile with non-conservative forces: inaccurate NC models (SOAP-BPNN-NC) failed catastrophically, while more accurate ones (PET-NC) could converge with FIRE but showed large force fluctuations with L-BFGS. Non-conservative models consistently had lower success rates across water and QM9 benchmarks.</li>
</ul>
<p><strong>Hybrid solution success:</strong></p>
<ul>
<li>MTS with non-conservative forces corrected every 8 steps ($M=8$) achieved conservative stability with only ~20% overhead compared to a purely non-conservative trajectory. Results were essentially indistinguishable from fully conservative simulations. Higher stride values ($M=16$) became unstable due to resonances between fast degrees of freedom and integration errors.</li>
<li>Conservative fine-tuning achieved the accuracy of from-scratch training in about 1/3 the total training time (2-4x resource reduction)</li>
<li>Validated on OC20 catalysis benchmark</li>
</ul>
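<p>The MTS idea can be sketched as an inner/outer integrator loop (a hypothetical RESPA-style splitting for illustration; the paper&rsquo;s actual protocol is implemented in i-PI):</p>

```python
import numpy as np

def mts_step(x, v, f_fast, f_cons, dt, M, mass=1.0):
    """One outer MTS step: M inner velocity-Verlet steps with the cheap (fast)
    force, bracketed by impulse kicks from the correction force f_cons - f_fast."""
    v = v + 0.5 * (M * dt) * (f_cons(x) - f_fast(x)) / mass
    for _ in range(M):                       # inner loop: fast force only
        v = v + 0.5 * dt * f_fast(x) / mass
        x = x + dt * v
        v = v + 0.5 * dt * f_fast(x) / mass
    v = v + 0.5 * (M * dt) * (f_cons(x) - f_fast(x)) / mass
    return x, v

# Sanity check: if the fast force equals the conservative force, the correction
# vanishes and energy stays near-constant for a harmonic oscillator.
f = lambda x: -x
x, v = np.array([1.0]), np.array([0.0])
for _ in range(1000):
    x, v = mts_step(x, v, f, f, dt=0.01, M=8)
energy = 0.5 * v[0]**2 + 0.5 * x[0]**2
print(energy)  # stays close to the initial 0.5
```

<p>In the paper&rsquo;s setting the inner force is the fast non-conservative model and the correction $F_{\text{cons}} - F_{\text{fast}}$ is evaluated only every $M$ steps, so the expensive conservative model contributes only $\sim 1/M$ of the force calls.</p>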
<p><strong>Scaling caveat:</strong> The authors note that as training datasets grow and models become more expressive, non-conservative artifacts should diminish because accurate models naturally exhibit less non-conservative behavior. However, they argue the best path forward is hybrid approaches rather than waiting for scale to solve the problem.</p>
<p><strong>Recommendation:</strong> The optimal production path is hybrid architectures using direct forces for acceleration (via MTS and pre-training) while anchoring models in conservative energy surfaces. This captures computational benefits without sacrificing physical reliability.</p>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p><strong>Primary training/evaluation:</strong></p>
<ul>
<li><strong>Bulk Liquid Water</strong> (Cheng et al., 2019): revPBE0-D3 calculations with over 250,000 force/energy targets, chosen for rigorous thermodynamic testing</li>
</ul>
<p><strong>Generalization tests:</strong></p>
<ul>
<li>Graphene, amorphous carbon, FCC aluminum (tested with general-purpose foundation models)</li>
</ul>
<p><strong>Benchmarks:</strong></p>
<ul>
<li><strong>QM9</strong>: Geometry optimization tests</li>
<li><strong>OC20</strong> (Open Catalyst): Oxygen on alloy surfaces for MTS validation</li>
</ul>
<p>All datasets publicly available through cited sources.</p>
<h3 id="models">Models</h3>
<p><strong>Point Edge Transformer (PET)</strong> variants:</p>
<ul>
<li><strong>PET-C (Conservative)</strong>: Forces via energy backpropagation</li>
<li><strong>PET-NC (Non-Conservative)</strong>: Direct force prediction head, slightly higher parameter count</li>
<li><strong>PET-M (Hybrid)</strong>: Jointly predicts both conservative and non-conservative forces, accuracy within ~10% of the best single-task models</li>
</ul>
<p><strong>Baseline comparisons:</strong></p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Type</th>
          <th>Training Data</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>ORB-v2</td>
          <td>Non-conservative</td>
          <td>Alexandria/MPtrj</td>
          <td>Rotationally unconstrained</td>
      </tr>
      <tr>
          <td>EquiformerV2</td>
          <td>Non-conservative</td>
          <td>Alexandria/MPtrj</td>
          <td>Equivariant Transformer</td>
      </tr>
      <tr>
          <td>MACE-MP-0</td>
          <td>Conservative</td>
          <td>MPtrj</td>
          <td>Equivariant message-passing</td>
      </tr>
      <tr>
          <td>SevenNet</td>
          <td>Conservative</td>
          <td>MPtrj</td>
          <td>Equivariant message-passing</td>
      </tr>
      <tr>
          <td>SOAP-BPNN-C</td>
          <td>Conservative</td>
          <td>Bulk water</td>
          <td>Descriptor-based baseline</td>
      </tr>
      <tr>
          <td>SOAP-BPNN-NC</td>
          <td>Non-conservative</td>
          <td>Bulk water</td>
          <td>Descriptor-based baseline</td>
      </tr>
  </tbody>
</table>
<p><strong>Training details:</strong></p>
<ul>
<li><strong>Loss functions</strong>: PET-C uses joint Energy + Force $L^2$ loss; PET-NC uses Force-only $L^2$ loss</li>
<li><strong>Fine-tuning protocol</strong>: PET-NC converted to conservative via energy head fine-tuning</li>
<li><strong>MTS configuration</strong>: Non-conservative forces with conservative corrections every 8 steps ($M=8$)</li>
</ul>
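<p>The MTS configuration above (cheap non-conservative forces at every inner step, a conservative correction applied every $M=8$ steps) can be sketched as a reversible r-RESPA-style integrator. This is an illustrative sketch, not the authors' i-PI implementation; <code>fast_force</code> and <code>slow_correction</code> are hypothetical callables, where the latter would return $F_\text{conservative} - F_\text{non-conservative}$.</p>

```python
import numpy as np

def mts_step(pos, vel, masses, fast_force, slow_correction, dt, M=8):
    """One outer step of multiple time-stepping (r-RESPA-style sketch).

    fast_force: cheap non-conservative force model, called every inner step.
    slow_correction: expensive conservative correction, called once per
    outer step of length M * dt.
    """
    # Outer half-kick with the slow correction, scaled to the outer step M*dt
    vel = vel + 0.5 * (M * dt) * slow_correction(pos) / masses[:, None]
    # M inner velocity-Verlet steps driven by the fast forces
    f = fast_force(pos)
    for _ in range(M):
        vel = vel + 0.5 * dt * f / masses[:, None]
        pos = pos + dt * vel
        f = fast_force(pos)
        vel = vel + 0.5 * dt * f / masses[:, None]
    # Closing outer half-kick
    vel = vel + 0.5 * (M * dt) * slow_correction(pos) / masses[:, None]
    return pos, vel
```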
<h3 id="evaluation">Evaluation</h3>
<p><strong>Metrics &amp; Software:</strong>
Molecular dynamics evaluations were performed using <strong>i-PI</strong>, while geometry optimizations used <strong>ASE (Atomic Simulation Environment)</strong>. Note that primary code reproducibility is provided via an archived Zenodo snapshot; the authors did not link a live, public GitHub repository.</p>
<ol>
<li><strong>Jacobian asymmetry</strong> ($\lambda$): Quantifies non-conservation via antisymmetric component</li>
<li><strong>Temperature drift</strong>: NVE ensemble stability</li>
<li><strong>Velocity-velocity correlation</strong> ($\hat{c}_{vv}(\omega)$): Thermostat artifact detection</li>
<li><strong>Radial distribution functions</strong> ($g(r)$): Structural accuracy</li>
<li><strong>Species-resolved temperature</strong>: Equipartition testing</li>
<li><strong>Inference speed</strong>: Wall-clock time per MD step</li>
</ol>
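<p>The first metric is easy to probe numerically: a conservative force field is a gradient, so its Jacobian is symmetric, and any antisymmetric component signals non-conservation. A finite-difference sketch of this idea (the paper's exact $\lambda$ normalization may differ):</p>

```python
import numpy as np

def jacobian_asymmetry(force_fn, pos, eps=1e-5):
    """Antisymmetric fraction of the finite-difference force Jacobian.

    For a conservative force F = -grad E the Jacobian is symmetric, so the
    returned ratio is ~0; a purely rotational force field gives ~1.
    """
    x = pos.ravel()
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        f_plus = force_fn((x + dx).reshape(pos.shape)).ravel()
        f_minus = force_fn((x - dx).reshape(pos.shape)).ravel()
        J[:, j] = (f_plus - f_minus) / (2 * eps)
    S = 0.5 * (J + J.T)  # symmetric (conservative) part
    A = 0.5 * (J - J.T)  # antisymmetric (non-conservative) part
    return np.linalg.norm(A) / (np.linalg.norm(S) + np.linalg.norm(A))
```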
<p><strong>Key results:</strong></p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Speed (ms/step)</th>
          <th>NVE Stability</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>PET-NC</td>
          <td>8.58</td>
          <td>Failed</td>
          <td>~7,000 billion K/s drift</td>
      </tr>
      <tr>
          <td>PET-C</td>
          <td>19.4</td>
          <td>Stable</td>
          <td>2.3x slower than PET-NC</td>
      </tr>
      <tr>
          <td>SevenNet</td>
          <td>52.8</td>
          <td>Stable</td>
          <td>Conservative baseline</td>
      </tr>
      <tr>
          <td><strong>PET Hybrid (MTS)</strong></td>
          <td><strong>~10.3</strong></td>
          <td><strong>Stable</strong></td>
          <td><strong>~20% overhead vs. pure NC</strong></td>
      </tr>
  </tbody>
</table>
<p><strong>Thermostat artifacts:</strong></p>
<ul>
<li>Langevin thermostatting ($\tau=10$ fs) damped diffusion by ~5x; weaker coupling at $\tau=100$ fs still reduced it by ~1.5x</li>
<li>GLE thermostats also failed to control non-conservative drift</li>
<li>Equipartition violations under SVR: ORB showed H at 336 K and O at 230 K (target 300 K); PET-NC showed smaller but significant species-resolved deviations</li>
</ul>
<p><strong>Optimization failures:</strong></p>
<ul>
<li>Non-conservative models showed lower geometry optimization success rates across water and QM9 benchmarks, with inaccurate NC models failing catastrophically</li>
</ul>
<h3 id="hardware">Hardware</h3>
<p><strong>Compute resources:</strong></p>
<ul>
<li><strong>Training</strong>: From-scratch baseline models were trained on 4x NVIDIA H100 GPUs for roughly two days.</li>
<li><strong>Fine-tuning</strong>: Conservative fine-tuning used a single NVIDIA H100 GPU for one day.</li>
<li>This hybrid fine-tuning approach achieved a 2-4x reduction in computational resources compared to training conservative models from scratch.</li>
</ul>
<p><strong>Reproduction resources:</strong></p>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://zenodo.org/records/14778891">Zenodo repository</a></td>
          <td>Code/Data</td>
          <td>Unknown</td>
          <td>Code and data to reproduce all results</td>
      </tr>
      <tr>
          <td><a href="https://atomistic-cookbook.org/examples/pet-mad-nc/pet-mad-nc.html">MTS inference tutorial</a></td>
          <td>Other</td>
          <td>Unknown</td>
          <td>Multiple time-stepping dynamics tutorial</td>
      </tr>
      <tr>
          <td><a href="https://atomistic-cookbook.org/examples/pet-finetuning/pet-ft-nc.html">Conservative fine-tuning tutorial</a></td>
          <td>Other</td>
          <td>Unknown</td>
          <td>Fine-tuning workflow tutorial</td>
      </tr>
  </tbody>
</table>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Bigi, F., Langer, M. F., &amp; Ceriotti, M. (2025). The dark side of the forces: assessing non-conservative force models for atomistic machine learning. <em>Proceedings of the 42nd International Conference on Machine Learning</em>, PMLR 267.</p>
<p><strong>Publication</strong>: ICML 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{bigi2025dark,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{The dark side of the forces: assessing non-conservative force models for atomistic machine learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Bigi, Filippo and Langer, Marcel F and Ceriotti, Michele}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 42nd International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{Proceedings of Machine Learning Research}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{267}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">address</span>=<span style="color:#e6db74">{Vancouver, Canada}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://icml.cc/virtual/2025/poster/45458">ICML 2025 poster page</a></li>
<li><a href="https://openreview.net/pdf?id=OEl3L8osas">PDF on OpenReview</a></li>
<li><a href="https://zenodo.org/records/14778891">Zenodo repository</a></li>
<li><a href="https://atomistic-cookbook.org/examples/pet-mad-nc/pet-mad-nc.html">MTS Inference Tutorial</a></li>
<li><a href="https://atomistic-cookbook.org/examples/pet-finetuning/pet-ft-nc.html">Conservative Fine-Tuning Tutorial</a></li>
</ul>
]]></content:encoded></item><item><title>Beyond Atoms: 3D Space Modeling for Molecular Pretraining</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/beyond-atoms/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/beyond-atoms/</guid><description>Lu et al. introduce SpaceFormer, a Transformer that models entire 3D molecular space including atoms for superior representations.</description><content:encoded><![CDATA[<h2 id="paper-typology-and-contribution">Paper Typology and Contribution</h2>
<p>This is a <strong>Method</strong> paper. It challenges the atom-centric paradigm of molecular representation learning by proposing a novel framework that models the continuous 3D space surrounding atoms. The core contribution is <strong>SpaceFormer</strong>, a Transformer-based architecture that discretizes molecular space into grids to capture physical phenomena (electron density, electromagnetic fields) often missed by traditional point-cloud models.</p>
<h2 id="the-physical-intuition-modeling-empty-space">The Physical Intuition: Modeling &ldquo;Empty&rdquo; Space</h2>
<p><strong>The Gap</strong>: Prior 3D molecular representation models, such as Uni-Mol, treat molecules as discrete sets of atoms, essentially point clouds in 3D space. However, from a quantum physics perspective, the &ldquo;empty&rdquo; space between atoms is far from empty. It is permeated by electron density distributions and electromagnetic fields that determine molecular properties.</p>
<p><strong>The Hypothesis</strong>: Explicitly modeling this continuous 3D space alongside discrete atom positions yields superior representations for downstream tasks, particularly for computational properties that depend on electronic structure, such as HOMO/LUMO energies and energy gaps.</p>
<h2 id="a-surprising-observation-virtual-points-improve-representations">A Surprising Observation: Virtual Points Improve Representations</h2>
<p>Before proposing SpaceFormer, the authors present a simple yet revealing experiment. They augment Uni-Mol by adding randomly sampled virtual points (VPs) from the 3D space within the circumscribed cuboid of each molecule. These VPs carry no chemical information whatsoever: they are purely random noise points.</p>
<p>The result is surprising: adding just 10 random VPs already yields a noticeable improvement in validation loss. The improvement remains consistent and gradually increases as the number of VPs grows, eventually reaching a plateau. This observation holds across downstream tasks as well, with Uni-Mol + VPs improving on several quantum property predictions (LUMO, E1-CC2, E2-CC2) compared to vanilla Uni-Mol.</p>
<p>The implication is that even uninformative spatial context helps the model learn better representations, motivating a principled framework for modeling the full 3D molecular space.</p>
<h2 id="spaceformer-voxelization-and-3d-positional-encodings">SpaceFormer: Voxelization and 3D Positional Encodings</h2>
<p>The key innovation is treating the molecular representation problem as <strong>3D space modeling</strong>. SpaceFormer follows these core steps:</p>
<ol>
<li><strong>Voxelizes the entire 3D space</strong> into a grid with cells of $0.49\text{\AA}$ (based on O-H bond length to ensure at most one atom per cell).</li>
<li><strong>Uses adaptive multi-resolution grids</strong> to efficiently handle empty space, keeping it fine-grained near atoms and coarse-grained far away.</li>
<li><strong>Applies Transformers to 3D spatial tokens</strong> with custom positional encodings that achieve linear complexity.</li>
</ol>
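<p>Step 1 (voxelization) amounts to integer division of shifted coordinates. A minimal sketch, assuming the grid origin sits at the corner of the molecule's bounding box (the authors' exact tokenization code is not released):</p>

```python
import numpy as np

CELL = 0.49  # cell side length in Angstrom, so each cell holds at most one atom

def voxelize(coords):
    """Map atom coordinates to integer cell indices plus inner-cell offsets."""
    origin = coords.min(axis=0)
    idx = np.floor((coords - origin) / CELL).astype(int)  # cell index per atom
    offsets = (coords - origin) - idx * CELL              # position within cell
    return idx, offsets
```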
<p>Specifically, the model utilizes two forms of 3D Positional Encoding:</p>
<p><strong>3D Directional PE (RoPE Extension)</strong>
They extend Rotary Positional Encoding (RoPE) to 3D continuous space by splitting the Query and Key vectors into three blocks (one for each spatial axis). The directional attention mechanism takes the form:</p>
<p>$$
\begin{aligned}
\mathbf{q}_{i}^{\top} \mathbf{k}_{j} = \sum_{s=1}^{3} \mathbf{q}_{i,s}^{\top} \mathbf{R}(c_{j,s} - c_{i,s}) \mathbf{k}_{j,s}
\end{aligned}
$$</p>
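<p>A minimal numeric sketch of this block-wise rotation (dimensions are hypothetical; the key property, inherited from 1D RoPE, is that the attention score depends only on coordinate differences and is therefore invariant to rigid translations):</p>

```python
import numpy as np

def rope_1d(vec, coord, freqs):
    """Rotate consecutive pairs of `vec` by angles coord * freqs
    (standard RoPE applied to a continuous scalar coordinate)."""
    v = vec.reshape(-1, 2)
    ang = coord * freqs
    cos, sin = np.cos(ang), np.sin(ang)
    return np.stack([v[:, 0] * cos - v[:, 1] * sin,
                     v[:, 0] * sin + v[:, 1] * cos], axis=1).ravel()

def rope_3d(vec, xyz, freqs):
    """Split the vector into three axis blocks and rotate each block by its
    coordinate; q.k then depends only on the coordinate difference."""
    blocks = np.split(vec, 3)
    return np.concatenate([rope_1d(b, c, freqs) for b, c in zip(blocks, xyz)])
```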
<p><strong>3D Distance PE (RFF Approximation)</strong>
To compute invariant geometric distance without incurring quadratic memory overhead, they use Random Fourier Features (RFF) to approximate a Gaussian kernel of pairwise distances:</p>
<p>$$
\begin{aligned}
\exp \left( - \frac{\lVert \mathbf{c}_i - \mathbf{c}_j \rVert_2^2}{2\sigma^2} \right) &amp;\approx z(\mathbf{c}_i)^\top z(\mathbf{c}_j) \\
z(\mathbf{c}_i) &amp;= \sqrt{\frac{2}{d}} \cos(\sigma^{-1} \mathbf{c}_i^\top \boldsymbol{\omega} + \mathbf{b})
\end{aligned}
$$</p>
<p>This approach enables the model to natively encode complex field-like phenomena without computing exhaustive $O(N^2)$ distance matrices.</p>
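<p>The quality of this approximation is easy to verify numerically. A sketch following the standard Random Fourier Features construction ($\boldsymbol{\omega}$ drawn i.i.d. standard normal, $\mathbf{b}$ uniform on $[0, 2\pi)$; the paper's feature dimension and $\sigma$ are not assumed here):</p>

```python
import numpy as np

def rff_features(coords, omega, b, sigma=1.0):
    """Random Fourier Features z(c) such that z(c_i) . z(c_j) approximates
    the Gaussian kernel exp(-||c_i - c_j||^2 / (2 sigma^2))."""
    d = omega.shape[1]  # number of random features
    return np.sqrt(2.0 / d) * np.cos(coords @ omega / sigma + b)
```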
<h2 id="experimental-setup-and-downstream-tasks">Experimental Setup and Downstream Tasks</h2>
<p><strong>Pretraining Data</strong>: 19 million unlabeled molecules from the same dataset used by Uni-Mol.</p>
<p><strong>Downstream Benchmarks</strong>: The authors propose a new benchmark of 15 tasks, motivated by known limitations of MoleculeNet: invalid structures, inconsistent chemical representations, data curation errors, and an inability to adequately distinguish model performance. The tasks split into two categories:</p>
<ol>
<li>
<p><strong>Computational Properties (Quantum Mechanics)</strong></p>
<ul>
<li>Subsets of <a href="/notes/chemistry/datasets/gdb-17/">GDB-17</a> (HOMO, LUMO, GAP energy prediction, 20K samples; E1-CC2, E2-CC2, f1-CC2, f2-CC2, 21.7K samples)</li>
<li>Cata-condensed polybenzenoid hydrocarbons (Dipole moment, adiabatic ionization potential, D3 dispersion correction, 8,678 samples)</li>
<li>Metric: Mean Absolute Error (MAE)</li>
</ul>
</li>
<li>
<p><strong>Experimental Properties (Pharma/Bio)</strong></p>
<ul>
<li>MoleculeNet tasks (BBBP, BACE for drug discovery)</li>
<li>Biogen ADME tasks (HLM, MME, Solubility)</li>
<li>Metrics: AUC for classification, MAE for regression</li>
</ul>
</li>
</ol>
<p><strong>Splitting Strategy</strong>: All datasets use 8:1:1 train/validation/test ratio with <strong>scaffold splitting</strong> to test out-of-distribution generalization.</p>
<p><strong>Training Setup</strong>:</p>
<ul>
<li><strong>Objective</strong>: Masked Auto-Encoder (MAE) with 30% random masking. Model predicts whether a cell contains an atom, and if so, regresses both atom type and precise offset position.</li>
<li><strong>Hardware</strong>: ~50 hours on 8 NVIDIA A100 GPUs</li>
<li><strong>Optimizer</strong>: Adam ($\beta_1=0.9, \beta_2=0.99$)</li>
<li><strong>Learning Rate</strong>: Peak 1e-4 with linear decay and 0.01 warmup ratio</li>
<li><strong>Batch Size</strong>: 128</li>
<li><strong>Total Updates</strong>: 1 million</li>
</ul>
<p><strong>Baseline Comparisons</strong>: GROVER (2D graph-based MPR), GEM (2D graph enhanced with 3D information), 3D Infomax (GNN with 3D information), Uni-Mol (3D MPR, primary baseline using the same pretraining dataset), and Mol-AE (extends Uni-Mol with atom-based MAE pretraining).</p>
<h2 id="results-and-analysis">Results and Analysis</h2>
<p><strong>Strong Overall Performance</strong>: SpaceFormer ranked 1st in 10 of 15 tasks and in the top 2 for 14 of 15 tasks. It surpassed the runner-up models by approximately 20% on quantum property tasks (HOMO, LUMO, GAP, E1-CC2, Dipmom), validating that modeling non-atom space captures electronic structure better than atom-only representations.</p>
<h3 id="key-results-on-quantum-properties">Key Results on Quantum Properties</h3>
<table>
  <thead>
      <tr>
          <th>Task</th>
          <th>GROVER</th>
          <th>GEM</th>
          <th>3D Infomax</th>
          <th>Uni-Mol</th>
          <th>Mol-AE</th>
          <th><strong>SpaceFormer</strong></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>HOMO (Ha)</td>
          <td>0.0075</td>
          <td>0.0068</td>
          <td>0.0065</td>
          <td>0.0052</td>
          <td>0.0050</td>
          <td><strong>0.0042</strong></td>
      </tr>
      <tr>
          <td>LUMO (Ha)</td>
          <td>0.0086</td>
          <td>0.0080</td>
          <td>0.0070</td>
          <td>0.0060</td>
          <td>0.0057</td>
          <td><strong>0.0040</strong></td>
      </tr>
      <tr>
          <td>GAP (Ha)</td>
          <td>0.0109</td>
          <td>0.0107</td>
          <td>0.0095</td>
          <td>0.0081</td>
          <td>0.0080</td>
          <td><strong>0.0064</strong></td>
      </tr>
      <tr>
          <td>E1-CC2 (eV)</td>
          <td>0.0101</td>
          <td>0.0090</td>
          <td>0.0089</td>
          <td>0.0067</td>
          <td>0.0070</td>
          <td><strong>0.0058</strong></td>
      </tr>
      <tr>
          <td>Dipmom (Debye)</td>
          <td>0.0752</td>
          <td>0.0289</td>
          <td>0.0291</td>
          <td>0.0106</td>
          <td>0.0113</td>
          <td><strong>0.0083</strong></td>
      </tr>
  </tbody>
</table>
<p>SpaceFormer&rsquo;s advantage is most pronounced on computational properties that depend on electronic structure. On experimental biological tasks (e.g., BBBP), where measurements are noisy, the advantage narrows or reverses: Uni-Mol achieves 0.9066 AUC on BBBP compared to SpaceFormer&rsquo;s 0.8605.</p>
<h3 id="ablation-studies">Ablation Studies</h3>
<p>The authors present several ablations that isolate the source of SpaceFormer&rsquo;s improvements:</p>
<p><strong>MAE vs. Denoising</strong>: SpaceFormer with MAE pretraining outperforms SpaceFormer with denoising on all four ablation tasks. The MAE objective requires predicting <em>whether</em> an atom exists in a masked voxel, which forces the model to learn global structural dependencies. In the denoising variant, only atom cells are masked so the model never needs to predict atom existence, reducing the task to coordinate regression.</p>
<p><strong>FLOPs Control</strong>: A SpaceFormer-Large model (4x width, atom-only) trained with comparable FLOPs still falls short of SpaceFormer with 1000 non-atom cells on most downstream tasks. This confirms the improvement comes from modeling 3D space, not from additional compute.</p>
<p><strong>Virtual Points vs. SpaceFormer</strong>: Adding up to 200 random virtual points to Uni-Mol improves some tasks but leaves a significant gap compared to SpaceFormer, demonstrating that principled space discretization outperforms naive point augmentation.</p>
<p><strong>Efficiency Validation</strong>: The Adaptive Grid Merging method reduces the number of cells by roughly 10x with virtually no performance degradation. The 3D positional encodings scale linearly with the number of cells, while Uni-Mol&rsquo;s pretraining cost scales quadratically.</p>
<h3 id="scope-and-future-directions">Scope and Future Directions</h3>
<p>SpaceFormer does not incorporate built-in SE(3) equivariance, relying instead on data augmentation (random rotations and random boundary padding) during training. The authors identify extending SpaceFormer to force field tasks and larger systems such as proteins and complexes as promising future directions.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="code-and-data-availability">Code and Data Availability</h3>
<ul>
<li><strong>Source Code</strong>: As of March 2026, the authors have not released the official source code or pre-trained weights.</li>
<li><strong>Datasets</strong>: Pretraining utilized the same 19M unlabeled molecule dataset as Uni-Mol. Downstream tasks use a newly curated internal benchmark built from subsets of GDB-17, MoleculeNet, and Biogen ADME. The exact customized scaffold splits for these evaluations are pending the official code release.</li>
<li><strong>Compute</strong>: Pretraining the base SpaceFormer encoder (~67.8M parameters, configured to merge level 3) required approximately 50 hours on 8 NVIDIA A100 GPUs.</li>
</ul>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Source code</td>
          <td>Code</td>
          <td>N/A</td>
          <td>Not publicly released as of March 2026</td>
      </tr>
      <tr>
          <td>Pre-trained weights</td>
          <td>Model</td>
          <td>N/A</td>
          <td>Not publicly released</td>
      </tr>
      <tr>
          <td>Pretraining data (19M molecules)</td>
          <td>Dataset</td>
          <td>Unknown</td>
          <td>Same dataset as Uni-Mol; not independently released</td>
      </tr>
      <tr>
          <td>Downstream benchmark splits</td>
          <td>Dataset</td>
          <td>N/A</td>
          <td>Custom scaffold splits pending code release</td>
      </tr>
  </tbody>
</table>
<h3 id="models">Models</h3>
<p>The model treats a molecule as a 3D &ldquo;image&rdquo; via voxelization, processed by a Transformer.</p>
<p><strong>Input Representation</strong>:</p>
<ul>
<li><strong>Discretization</strong>: 3D space divided into grid cells with length <strong>$0.49\text{\AA}$</strong> (based on O-H bond length to ensure at most one atom per cell)</li>
<li><strong>Tokenization</strong>: Tokens are pairs $(t_i, c_i)$ where $t_i$ is atom type (or NULL) and $c_i$ is the coordinate</li>
<li><strong>Embeddings</strong>: Continuous embeddings with dimension 512. Inner-cell positions discretized with $0.01\text{\AA}$ precision</li>
</ul>
<p><strong>Transformer Specifications</strong>:</p>
<table>
  <thead>
      <tr>
          <th>Component</th>
          <th>Layers</th>
          <th>Attention Heads</th>
          <th>Embedding Dim</th>
          <th>FFN Dim</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Encoder</strong></td>
          <td>16</td>
          <td>8</td>
          <td>512</td>
          <td>2048</td>
      </tr>
      <tr>
          <td><strong>Decoder</strong> (MAE)</td>
          <td>4</td>
          <td>4</td>
          <td>256</td>
          <td>1024</td>
      </tr>
  </tbody>
</table>
<p><strong>Attention Mechanism</strong>: FlashAttention for efficient handling of large sequence lengths.</p>
<p><strong>Positional Encodings</strong>:</p>
<ol>
<li><strong>3D Directional PE</strong>: Extension of Rotary Positional Embedding (RoPE) to 3D continuous space, capturing relative directionality</li>
<li><strong>3D Distance PE</strong>: Random Fourier Features (RFF) to approximate Gaussian kernel of pairwise distances with linear complexity</li>
</ol>
<h4 id="visualizing-rff-and-rope">Visualizing RFF and RoPE</h4>















<figure class="post-figure center ">
    <img src="/img/notes/spaceformer-rff-rope-visualization.webp"
         alt="Four-panel visualization showing RFF distance encoding and RoPE directional encoding mechanisms"
         title="Four-panel visualization showing RFF distance encoding and RoPE directional encoding mechanisms"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Visual intuition for SpaceFormer&rsquo;s positional encodings: Top row shows RFF distance encoding (Gaussian-like attention decay and high-frequency feature fingerprints). Bottom row shows RoPE directional encoding (vector rotation fields and resulting attention patterns).</figcaption>
    
</figure>

<p><strong>Top Row (Distance / RFF):</strong> Shows how the model learns &ldquo;closeness.&rdquo; Distance is represented by a complex &ldquo;fingerprint&rdquo; of waves that creates a Gaussian-like force field.</p>
<ul>
<li><strong>Top Left (The Force Field):</strong> The attention score (dot product) naturally forms a Gaussian curve. It is high when atoms are close and decays to zero as they move apart. This mimics physical forces without the model needing to learn that math from scratch.</li>
<li><strong>Top Right (The Fingerprint):</strong> Each dimension oscillates at a different frequency. A specific distance (e.g., $d=2$) has a unique combination of high and low values across these dimensions, creating a unique &ldquo;fingerprint&rdquo; for that exact distance.</li>
</ul>
<p><strong>Bottom Row (Direction / RoPE):</strong> Shows how the model learns &ldquo;relative position.&rdquo; It visualizes the vector rotation and how that creates a grid-like attention pattern.</p>
<ul>
<li><strong>Bottom Left (The Rotation):</strong> This visualizes the &ldquo;X-axis chunk&rdquo; of the vector. As you move from left ($x=-3$) to right ($x=3$), the arrows rotate. The model compares angles between atoms to determine relative positions.</li>
<li><strong>Bottom Right (The Grid):</strong> The resulting attention pattern when combining X-rotations and Y-rotations. The red/blue regions show where the model pays attention relative to the center, forming a grid-like interference pattern that distinguishes relative positions (e.g., &ldquo;top-right&rdquo; vs &ldquo;bottom-left&rdquo;).</li>
</ul>
<h4 id="adaptive-grid-merging">Adaptive Grid Merging</h4>
<p>To make the 3D grid approach computationally tractable, two key strategies are employed:</p>
<ol>
<li><strong>Grid Sampling</strong>: Randomly selecting 10-20% of empty cells during training</li>
<li><strong>Adaptive Grid Merging</strong>: Recursively merging $2 \times 2 \times 2$ blocks of empty cells into larger &ldquo;coarse&rdquo; cells, creating a multi-resolution view that is fine-grained near atoms and coarse-grained in empty space (merging set to Level 3)</li>
</ol>
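<p>The merging rule can be sketched on a boolean occupancy grid. This toy version assumes a power-of-two cube side and only counts the resulting tokens rather than building the actual token list; it is not the authors' implementation.</p>

```python
import numpy as np

def merge_empty_cells(occupied, max_level=3):
    """Count tokens after recursively merging 2x2x2 blocks of empty cells.

    occupied: boolean (n, n, n) grid with n a power of two.
    Atom cells always remain level-0 tokens; an empty cell becomes a token
    at the level where its 2x2x2 block can no longer be merged.
    """
    tokens = int(occupied.sum())  # atom cells stay at full resolution
    empty = ~occupied
    for _ in range(max_level):
        n = empty.shape[0]
        blocks = empty.reshape(n // 2, 2, n // 2, 2, n // 2, 2).all(axis=(1, 3, 5))
        # Empty cells whose 2x2x2 block is not fully empty become tokens here
        tokens += int(empty.sum() - 8 * blocks.sum())
        empty = blocks  # fully-empty blocks move up one level
    tokens += int(empty.sum())  # remaining empty blocks at the top level
    return tokens
```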
<p><strong>Visualizing Adaptive Grid Merging</strong>:</p>















<figure class="post-figure center ">
    <img src="/img/notes/spaceformer-adaptive-grid-merging.webp"
         alt="2D simulation of adaptive grid merging for an H2O molecule showing multi-resolution cells"
         title="2D simulation of adaptive grid merging for an H2O molecule showing multi-resolution cells"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Adaptive grid merging demonstrated on H₂O. Red cells (Level 0) contain atoms and remain at full resolution. Progressively darker blue cells represent merged empty regions at higher levels, covering the same volume with fewer tokens.</figcaption>
    
</figure>

<p>The adaptive grid process compresses empty space around molecules while maintaining high resolution near atoms:</p>
<ul>
<li><strong>Red Cells (Level 0):</strong> The smallest squares ($0.49$Å) containing atoms. These are kept at highest resolution because electron density changes rapidly here.</li>
<li><strong>Light Blue Cells (Level 0/1):</strong> Small empty regions close to atoms.</li>
<li><strong>Darker Blue Cells (Level 2/3):</strong> Large blocks of empty space further away.</li>
</ul>
<p>If we used a naive uniform grid, we would have to process thousands of empty &ldquo;Level 0&rdquo; cells containing almost zero information. By merging them into larger blocks (the dark blue squares), the model covers the same volume with significantly fewer input tokens, reducing the number of tokens by roughly <strong>10x</strong> compared to a dense grid.</p>















<figure class="post-figure center ">
    <img src="/img/notes/spaceformer-adaptive-grid-benzene.webp"
         alt="Adaptive grid merging visualization for benzene molecule showing hexagonal ring with multi-resolution grid cells"
         title="Adaptive grid merging visualization for benzene molecule showing hexagonal ring with multi-resolution grid cells"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Adaptive grid merging for benzene (C₆H₆). The model maintains maximum resolution (red Level 0 cells) only where atoms exist, while merging vast empty regions into large blocks (dark blue L3/L4 cells). This allows the model to focus computational power on chemically active zones.</figcaption>
    
</figure>

<p>The benzene example above demonstrates how this scales to larger molecules. The characteristic hexagonal ring of 6 carbon atoms (black) and 6 hydrogen atoms (white) occupies a small fraction of the total grid. The dark blue corners (L3, L4) represent massive merged blocks of empty space, allowing the model to focus 90% of its computational power on the red &ldquo;active&rdquo; zones where chemistry actually happens.</p>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Lu, S., Ji, X., Zhang, B., Yao, L., Liu, S., Gao, Z., Zhang, L., &amp; Ke, G. (2025). Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling. <em>Proceedings of the 42nd International Conference on Machine Learning (ICML)</em>, 267, 40491-40504. <a href="https://proceedings.mlr.press/v267/lu25e.html">https://proceedings.mlr.press/v267/lu25e.html</a></p>
<p><strong>Publication</strong>: ICML 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{lu2025beyond,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Lu, Shuqi and Ji, Xiaohong and Zhang, Bohang and Yao, Lin and Liu, Siyuan and Gao, Zhifeng and Zhang, Linfeng and Ke, Guolin}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 42nd International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{40491--40504}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{267}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{Proceedings of Machine Learning Research}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{PMLR}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://openreview.net/forum?id=Wd9KPQCKwq">OpenReview forum</a></li>
<li><a href="https://openreview.net/pdf?id=Wd9KPQCKwq">PDF on OpenReview</a></li>
<li><a href="https://icml.cc/virtual/2025/poster/45004">ICML 2025 poster page</a></li>
</ul>
]]></content:encoded></item><item><title>Embedded-Atom Method: Impurities and Defects in Metals</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method/</link><pubDate>Fri, 22 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/embedded-atom-method/</guid><description>Daw and Baskes's foundational 1984 paper introducing the Embedded-Atom Method (EAM), a many-body potential for metal simulations.</description><content:encoded><![CDATA[<h2 id="contribution-adaptive-many-body-potentials">Contribution: Adaptive Many-Body Potentials</h2>
<p>This is a foundational <strong>method paper</strong> that introduces a new class of semi-empirical, many-body interatomic potential: the <strong>Embedded-Atom Method (EAM)</strong>. It is designed for large-scale atomistic simulations of metallic systems, bridging the gap between computationally cheap (but physically limited) pair potentials and accurate (but expensive) quantum mechanical methods. The EAM achieves pair-potential speed while incorporating many-body physics inspired by density functional theory.</p>
<h2 id="motivation-the-geometric-limits-of-pair-potentials">Motivation: The Geometric Limits of Pair Potentials</h2>
<p>The authors sought to overcome the limitations of <strong>pair potentials</strong> (the dominant method of the time), which failed in three key areas:</p>
<ul>
<li><strong>Elastic Anisotropy:</strong> Pair potentials enforce the Cauchy relation ($C_{12} = C_{44}$), which is violated by most transition metals.</li>
<li><strong>Volume Ambiguity:</strong> Pair potentials require a volume-dependent energy term, which makes them unreliable at surfaces or cracks, where local volume is undefined.</li>
<li><strong>Chemical Incompatibility:</strong> Pair potentials cannot model chemically active impurities such as hydrogen.</li>
</ul>
<p>First-principles quantum mechanical methods (e.g., band theory) are limited by basis-set size and periodicity requirements, making them impractical for the large systems (thousands of atoms) needed to study defects, surfaces, and mechanical properties.</p>
<p>The goal was to create a new model that bridges this gap in accuracy and computational cost.</p>
<h2 id="core-innovation-the-embedding-energy-function">Core Innovation: The Embedding Energy Function</h2>
<p>The EAM postulates that the energy of an atom is determined by the local electron density of its neighbors. The total energy is:</p>
<p>$$E_{tot} = \sum_{i} F_i(\rho_{h,i}) + \frac{1}{2}\sum_{i \neq j} \phi_{ij}(R_{ij})$$</p>
<ul>
<li><strong>$F_i(\rho_{h,i})$ (Embedding Energy):</strong> The energy required to embed atom $i$ into the background electron density $\rho$ provided by its neighbors. This term is non-linear and captures many-body effects.</li>
<li><strong>$\phi_{ij}$ (Pair Potential):</strong> A short-range electrostatic repulsion between cores.</li>
<li><strong>$\rho_{h,i}$ (Host Density):</strong> Approximated as a linear superposition of atomic densities: $\rho_{h,i} = \sum_{j \neq i} \rho^a_j(R_{ij})$.</li>
</ul>
<p>The key innovations are:</p>
<ol>
<li><strong>The Embedding Energy</strong>: Each atom $i$ contributes an energy $F_i$ which is a non-linear function of the local electron density $\rho_{h,i}$ it is embedded in. This density is approximated as a simple linear superposition of the atomic electron densities of all its neighbors. This term captures the crucial many-body effects of metallic bonding.</li>
<li><strong>A Redefined Pair Potential</strong>: A short-range, two-body potential $\phi_{ij}$ is retained, but it primarily models the electrostatic core-core repulsion.</li>
<li><strong>Elimination of the &ldquo;Volume&rdquo; Problem</strong>: Because the embedding energy depends on the local electron density (a quantity that is always well-defined, even at a surface or a crack tip), the method circumvents the ambiguities of volume-dependent pair potentials.</li>
<li><strong>Intrinsic Many-Body Nature</strong>: The non-linearity of the embedding function $F(\rho)$ naturally accounts for why chemically active impurities (like hydrogen) cannot be described by pair potentials and correctly breaks the Cauchy relation for elastic constants.</li>
</ol>
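<p>As a direct translation of the energy expression above, here is a minimal NumPy sketch. The exponential forms for $F$, $\phi$, and $\rho^a$ are placeholders invented for illustration; the paper&rsquo;s actual functions are the cubic splines of Tables II and IV.</p>

```python
import numpy as np

# Illustrative stand-ins for the paper's spline-fit functions; the real
# F(rho), phi(r), and rho_a(r) are the cubic splines of Tables II and IV.
def rho_a(r):
    """Atomic electron density contributed by a neighbor at distance r."""
    return np.exp(-r)

def embed(rho):
    """Embedding energy F(rho); toy form with a single minimum (at rho = 1/e)."""
    return rho * np.log(rho)

def phi(r):
    """Short-range core-core repulsion."""
    return np.exp(-2.0 * r) / r

def eam_energy(positions):
    """E_tot = sum_i F(rho_h,i) + (1/2) sum_{i != j} phi(R_ij)."""
    n = len(positions)
    total = 0.0
    for i in range(n):
        host = 0.0                      # rho_h,i: linear superposition of neighbor densities
        for j in range(n):
            if i == j:
                continue
            r = np.linalg.norm(positions[i] - positions[j])
            host += rho_a(r)
            total += 0.5 * phi(r)       # each pair visited twice, hence the 1/2
        total += embed(host)
    return total

dimer = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
print(eam_energy(dimer))
```

<p>The embedding term is what makes the model many-body: doubling the neighbor density does not double $F(\rho)$, and that non-linearity is exactly what breaks the Cauchy relation.</p>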
<h2 id="experimental-design-robust-parameter-validation">Experimental Design: Robust Parameter Validation</h2>
<p>The authors validated EAM through a rigorous split between parameterization data and prediction tasks:</p>
<p><strong>Fitting Data (Bulk Properties Only):</strong></p>
<p>The model parameters were fitted exclusively to these experimental values for Ni and Pd:</p>
<ul>
<li>Lattice constant ($a_0$)</li>
<li>Elastic constants ($C_{11}, C_{12}, C_{44}$)</li>
<li>Sublimation energy ($E_s$)</li>
<li>Vacancy-formation energy ($E^F_{1V}$)</li>
<li>Hydrogen heat of solution (for fitting H parameters)</li>
</ul>
<p><strong>Validation Tests (No Further Fitting):</strong></p>
<p>The model was then evaluated on its ability to predict these properties without any additional parameter adjustments:</p>
<ul>
<li><strong>Surface Relaxations:</strong> Ni(110) surface contraction</li>
<li><strong>Surface Energy:</strong> Ni(100) surface energy</li>
<li><strong>Hydrogen Migration:</strong> H migration energy in Pd</li>
<li><strong>Fracture Mechanics:</strong> Hydrogen embrittlement in Ni slabs</li>
</ul>
<h2 id="results-extending-predictive-power-to-surfaces-and-defects">Results: Extending Predictive Power to Surfaces and Defects</h2>
<ol>
<li><strong>Many-Body Physics:</strong> The embedding function $F(\rho)$ successfully captures the volume-dependence of metallic cohesion, fixing the &ldquo;Cauchy discrepancy&rdquo; inherent in pair potentials.</li>
<li><strong>Surface Properties:</strong> A single set of functions, fitted only to bulk data, correctly reproduces surface relaxations within 0.1 Å of experiment across three faces (100), (110), and (111) for Ni. The Ni(100) surface energy (1550 erg/cm²) compares well with the measured crystal-vapor average (1725 erg/cm²).</li>
<li><strong>Hydrogen in Bulk:</strong> The method predicts H migration energy in Pd as 0.26 eV, matching experiment exactly. Hydride lattice expansions are also well reproduced: 4.5% for NiH (experiment: 5%) and 4% for PdH (experiment: 3.5% for PdH$_{0.6}$).</li>
<li><strong>Hydrogen on Surfaces:</strong> Calculated adsorption sites on all three Ni and Pd faces agree with experimentally determined sites. Adsorption energies on Ni surfaces are systematically about 0.25 eV too low, while on Pd surfaces the error is much smaller (about 0.05 eV too high on average).</li>
<li><strong>Fracture Mechanics:</strong> Static fracture calculations on Ni slabs demonstrate brittle fracture behavior and show that hydrogen lowers the fracture stress, providing a qualitative model of hydrogen embrittlement.</li>
</ol>
<h2 id="limitations">Limitations</h2>
<p>The authors acknowledge several limitations:</p>
<ul>
<li>The functions $F$ and $\phi$ are not uniquely determined by the empirical fitting procedure. The short-range pair potential (restricted to first neighbors in fcc metals) may not be the best choice for all crystal structures.</li>
<li>The choice of hydrogen embedding function (Puska et al. vs. Norskov&rsquo;s corrected function) remains undecided and may affect hydrogen binding energies.</li>
<li>The fracture calculations are static, and dynamical effects and plasticity play important roles in real fracture that are not captured.</li>
<li>The method has only been demonstrated for fcc metals (Ni and Pd). Extension to bcc metals and other crystal structures requires further investigation.</li>
</ul>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<p>To replicate the method, three specific algorithmic definitions are needed:</p>
<ol>
<li>
<p><strong>Atomic Density Construction</strong>: The electron density $\rho^a(r)$ is a weighted sum of Hartree-Fock $s$ and $d$ orbital densities (from Clementi &amp; Roetti tables), controlled by a parameter $N_s$ (the number of s-like electrons):
$$\rho^a(r) = N_s\rho_s^a(r) + (N-N_s)\rho_d^a(r)$$
For Ni, $N_s = 0.85$; for Pd, $N_s = 0.65$ (fitted to H solution heat).</p>
</li>
<li>
<p><strong>Pair Potential Form</strong>: The short-range pair interaction derives from an effective charge function $Z(r)$ to handle core repulsion:
$$\phi_{ij}(r) = \frac{Z_i(r)Z_j(r)}{r}$$
Splines for $Z(r)$ are provided in Table II.</p>
</li>
<li>
<p><strong>Analytic Forces</strong>: Because the embedding energy depends on the neighbor density, the force on atom $k$ is inherently many-body:
$$\vec{f}_{k} = -\sum_{j(\neq k)} \left(F'_{k}\,\rho'_{j} + F'_{j}\,\rho'_{k} + \phi'_{jk}\right) \hat{r}_{jk}$$</p>
</li>
</ol>
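<p>A standard sanity check for this expression is to compare the analytic force against a central finite difference of $E_{tot}$. The sketch below does that with toy stand-in functions (assumed exponential forms, not the paper&rsquo;s splines):</p>

```python
import numpy as np

# Toy stand-ins for the spline fits (illustrative only; see Tables II and IV).
rho_a  = lambda r: np.exp(-r)
drho_a = lambda r: -np.exp(-r)
phi    = lambda r: np.exp(-2.0 * r) / r
dphi   = lambda r: -np.exp(-2.0 * r) * (2.0 / r + 1.0 / r**2)
F      = lambda rho: rho * np.log(rho)
dF     = lambda rho: np.log(rho) + 1.0

def host_density(pos, i):
    return sum(rho_a(np.linalg.norm(pos[i] - pos[j]))
               for j in range(len(pos)) if j != i)

def energy(pos):
    n = len(pos)
    pair = sum(phi(np.linalg.norm(pos[i] - pos[j]))
               for i in range(n) for j in range(i + 1, n))
    return sum(F(host_density(pos, i)) for i in range(n)) + pair

def force(pos, k):
    """f_k = -sum_j [F'_k rho'_j + F'_j rho'_k + phi'_jk] r_hat_jk."""
    f = np.zeros(3)
    for j in range(len(pos)):
        if j == k:
            continue
        rvec = pos[k] - pos[j]
        r = np.linalg.norm(rvec)
        dEdr = (dF(host_density(pos, k)) * drho_a(r)
                + dF(host_density(pos, j)) * drho_a(r)
                + dphi(r))
        f -= dEdr * rvec / r            # r_hat points from j toward k
    return f

pos = np.array([[0.0, 0.0, 0.0], [1.4, 0.0, 0.0], [0.7, 1.1, 0.0]])
# Compare against a central finite difference of the total energy.
h, k = 1e-6, 0
num = np.zeros(3)
for d in range(3):
    p1, p2 = pos.copy(), pos.copy()
    p1[k, d] += h
    p2[k, d] -= h
    num[d] = -(energy(p1) - energy(p2)) / (2 * h)
print(np.allclose(force(pos, k), num, atol=1e-5))  # prints True
```

<p>Note the cross terms: moving atom $k$ changes not only its own host density but also the host densities of its neighbors, which is why two $F'\rho'$ terms appear per pair.</p>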
<h3 id="models">Models</h3>
<p>The functions $F(\rho)$ and $\phi(r)$ are modeled using <strong>cubic splines</strong>, with parameters fitted to reproduce bulk experimental constants. The embedding function $F(\rho)$ is constrained to have a single minimum and to be linear at high densities, matching the qualitative form of the first-principles calculations by Puska et al. Energy minimization uses the <strong>conjugate gradients</strong> technique. The paper explicitly lists spline knots, coefficients, and cutoffs in Tables II and IV, making the method fully reproducible.</p>















<figure class="post-figure center ">
    <img src="/img/notes/chemistry/eam-embedding-effective-charge.webp"
         alt="Reproduction of Figures 1 and 2 from Daw &amp; Baskes (1984) showing the embedding energy and effective charge functions for Ni and Pd"
         title="Reproduction of Figures 1 and 2 from Daw &amp; Baskes (1984) showing the embedding energy and effective charge functions for Ni and Pd"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption"><strong>Left:</strong> Dimensionless embedding energy ($E/E_s$) vs. normalized electron density ($\rho/\bar{\rho}$). The minimum near $\rho/\bar{\rho} \approx 1.0$ drives metallic cohesion. <strong>Right:</strong> Normalized effective charge ($Z/Z_0$) vs. normalized distance ($R/a_0$). The charge drops to zero near $R/a_0 = 0.85$, ensuring short-range interactions. Reproduced from Table II spline knots.</figcaption>
    
</figure>

<h3 id="evaluation">Evaluation</h3>
<p><strong>Fitting Data (Used for Parameterization):</strong> the bulk Ni and Pd experimental properties listed above under Experimental Design (lattice constant, elastic constants, sublimation energy, vacancy-formation energy, and the hydrogen heat of solution).</p>
<p><strong>Validation Results (Predictions Without Further Fitting):</strong></p>
<table>
  <thead>
      <tr>
          <th>Property</th>
          <th>Predicted</th>
          <th>Experimental</th>
          <th>Agreement</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Ni(110) surface contraction</td>
          <td>-0.11 Å</td>
          <td>-0.06 to -0.10 Å</td>
          <td>Within 0.1 Å</td>
      </tr>
      <tr>
          <td>Ni(100) surface energy</td>
          <td>1550 erg/cm²</td>
          <td>1725 erg/cm² (avg.)</td>
          <td>Close</td>
      </tr>
      <tr>
          <td>H migration in Pd</td>
          <td>0.26 eV</td>
          <td>0.26 eV</td>
          <td>Exact</td>
      </tr>
      <tr>
          <td>NiH lattice expansion</td>
          <td>4.5%</td>
          <td>5%</td>
          <td>Close</td>
      </tr>
      <tr>
          <td>PdH lattice expansion</td>
          <td>4%</td>
          <td>3.5% (PdH$_{0.6}$)</td>
          <td>Close</td>
      </tr>
      <tr>
          <td>H adsorption sites (Ni, Pd)</td>
          <td>Correct on all faces</td>
          <td>Matches experiment</td>
          <td>Exact</td>
      </tr>
      <tr>
          <td>H embrittlement in Ni</td>
          <td>Qualitative model</td>
          <td>-</td>
          <td>Qualitative</td>
      </tr>
  </tbody>
</table>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Daw, M. S., &amp; Baskes, M. I. (1984). Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals. <em>Physical Review B</em>, 29(12), 6443-6453. <a href="https://doi.org/10.1103/PhysRevB.29.6443">https://doi.org/10.1103/PhysRevB.29.6443</a></p>
<p><strong>Publication</strong>: Physical Review B, 1984</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{daw1984embedded,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Daw, Murray S and Baskes, Mike I}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Physical Review B}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{29}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{12}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{6443--6453}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1984}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{APS}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1103/PhysRevB.29.6443}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method-review-1993/">EAM Review (1993)</a></li>
<li><a href="/notes/chemistry/molecular-simulation/embedded-atom-method-voter-1994/">EAM User Guide (1994)</a></li>
<li><a href="https://www.ctcms.nist.gov/potentials/">NIST Interatomic Potentials Repository</a></li>
</ul>
]]></content:encoded></item><item><title>Umbrella Sampling: Monte Carlo Free-Energy Estimation</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/umbrella-sampling/</link><pubDate>Thu, 21 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/umbrella-sampling/</guid><description>Torrie and Valleau's 1977 paper introducing Umbrella Sampling, an importance sampling technique for Monte Carlo free-energy calculations.</description><content:encoded><![CDATA[<h2 id="a-methodological-shift-in-monte-carlo-simulations">A Methodological Shift in Monte Carlo Simulations</h2>
<p>This is a <strong>Method</strong> paper that introduces a novel computational technique for Monte Carlo simulations. It presents Umbrella Sampling, an importance sampling approach that uses non-physical distributions to calculate free energy differences in molecular systems.</p>
<h2 id="the-sampling-gap-in-phase-transitions">The Sampling Gap in Phase Transitions</h2>
<p>The paper addresses the failure of conventional Boltzmann-weighted Monte Carlo to estimate free energy differences.</p>
<ul>
<li><strong>The Problem</strong>: Free energy depends on the integral of configurations that are rare in the reference system. In a standard simulation, the relevant probability density $f_0(\Delta U^*)$ is too small to be sampled accurately by conventional Boltzmann-weighted Monte Carlo.</li>
<li><strong>Phase Transitions</strong>: Conventional &ldquo;thermodynamic integration&rdquo; fails near phase transitions because it requires a path of integration where ensemble averages can be reliably measured, which is difficult in unstable regions.</li>
</ul>
<h2 id="bridging-states-with-non-physical-distributions">Bridging States with Non-Physical Distributions</h2>
<p>The authors introduce a non-physical distribution $\pi(q^N)$ to bridge the gap between a reference system (0) and a system of interest (1).</p>
<ul>
<li><strong>Arbitrary Weights</strong>: They generate a Markov chain with a limiting distribution $\pi(q^N)$ that differs from the Boltzmann distribution of either system. This distribution is written as $\pi(q'^N) = w(q'^N) \exp(-U_0(q'^N)/kT_0) / Z$, where $w(q^N) = W(\Delta U^*)$ is a weighting function chosen to favor configurations with values of $\Delta U^*$ important to the free-energy integral.</li>
<li><strong>Reweighting Formula</strong>: The unbiased average of any property $\theta$ is recovered via the ratio of biased averages:</li>
</ul>
<p>$$\langle\theta\rangle_{0}=\frac{\langle\theta/w\rangle_{w}}{\langle1/w\rangle_{w}}$$</p>
<ul>
<li><strong>Overlap</strong>: The method allows sampling a range of $\Delta U^*$ up to <strong>three times</strong> that of a conventional Monte Carlo experiment, enabling accurate determination of values of $f_0(\Delta U^*)$ as small as $10^{-8}$. If a single weight function cannot span the entire gap, additional overlapping umbrella-sampling experiments are carried out with different weighting functions exploring successively overlapping ranges of $\Delta U^*$.</li>
</ul>
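<p>The reweighting identity is easy to demonstrate on a one-dimensional toy problem: sample the biased distribution $\pi(x) \propto w(x)e^{-U(x)}$ with a Metropolis chain, then recover an unbiased average from the ratio above. The potential and weight below are invented for illustration, not taken from the paper:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
U = lambda x: x**2                       # toy potential, kT = 1
w = lambda x: np.exp(0.5 * x**2)         # umbrella weight: flattens the well

# Metropolis chain with limiting distribution pi(x) proportional to w(x) exp(-U(x))
x, samples = 0.0, []
for _ in range(200_000):
    y = x + rng.normal(0.0, 0.8)
    if rng.random() < min(1.0, w(y) * np.exp(-U(y)) / (w(x) * np.exp(-U(x)))):
        x = y
    samples.append(x)
xs = np.array(samples[2000:])            # discard equilibration steps

# Unbiased average of theta = x^2 via the ratio of biased averages:
# <theta>_0 = <theta/w>_w / <1/w>_w
theta = xs**2
est = np.mean(theta / w(xs)) / np.mean(1.0 / w(xs))
print(est)   # exact <x^2> under exp(-x^2) is 0.5
```

<p>Here the bias deliberately broadens sampling beyond what the Boltzmann distribution would visit, mirroring how the paper&rsquo;s weights reach values of $f_0(\Delta U^*)$ as small as $10^{-8}$.</p>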
<h2 id="validation-on-lennard-jones-fluids">Validation on Lennard-Jones Fluids</h2>
<p>The authors validated Umbrella Sampling using Monte Carlo simulations of model fluids.</p>
<h3 id="experimental-setup">Experimental Setup</h3>
<ul>
<li><strong>System Specifications</strong>: The study used a <strong>Lennard-Jones (LJ)</strong> fluid and an <strong>inverse-12 &ldquo;soft-sphere&rdquo;</strong> fluid.</li>
<li><strong>System Size</strong>: Simulations were primarily performed with <strong>$N=32$ particles</strong>, with some validation runs at <strong>$N=108$ particles</strong> to check for size dependence.</li>
<li><strong>State Points</strong>: Calculations covered a wide range of densities ($N\sigma^3/V = 0.50$ to $0.85$) and temperatures ($kT/\epsilon = 0.7$ to $2.8$), including the gas-liquid coexistence region.</li>
</ul>
<h3 id="baselines">Baselines</h3>
<ul>
<li><strong>Baselines</strong>: Results were compared to thermodynamic integration data from <strong>Hansen</strong>, <strong>Levesque</strong>, and <strong>Verlet</strong>.</li>
<li><strong>Quantitative Success</strong>:
<ul>
<li><strong>Agreement</strong>: The free energy estimates agreed with pressure integration results to within statistical uncertainties (e.g., at $kT/\epsilon=1.35$, Umbrella Sampling gave -3.236 vs. Conventional -3.25).</li>
<li><strong>Precision</strong>: Free energy differences were obtained with high precision ($\pm 0.005 NkT$ for $N=108$).</li>
<li><strong>Efficiency</strong>: A single umbrella run could replace the &ldquo;numerous runs&rdquo; required for conventional $1/T$ integrations.</li>
</ul>
</li>
</ul>
<h2 id="temperature-scaling-via-reweighting">Temperature Scaling via Reweighting</h2>
<p>When the reference system has the same internal energy function as the system of interest (i.e., the same fluid at a different temperature), the free-energy expression simplifies to:</p>
<p>$$\frac{A(T)}{kT} = \frac{A(T_0)}{kT_0} - \ln \int f_0(U) \exp\left[-U\left(\frac{1}{kT} - \frac{1}{kT_0}\right)\right] dU$$</p>
<p>This is especially useful because a single determination of $f_0(U)$ over a wide energy range gives the free energy over a whole range of temperatures simultaneously. For 32 Lennard-Jones particles, only two umbrella-sampling experiments are needed to span the temperature range from the triple point ($kT/\epsilon = 0.7$) to twice the critical temperature ($kT/\epsilon = 2.8$). For 108 particles, four experiments suffice.</p>
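<p>A minimal sketch of this temperature reweighting, using a 1D harmonic oscillator (where $Z \propto \sqrt{kT}$, so the exact answer is known) in place of the paper&rsquo;s Lennard-Jones fluid:</p>

```python
import numpy as np

rng = np.random.default_rng(1)
kT0 = 1.0
# Boltzmann sample of a 1D harmonic oscillator, U = x^2 / 2, at kT0:
x = rng.normal(0.0, np.sqrt(kT0), size=500_000)
U = 0.5 * x**2

# A(T)/kT - A(T0)/kT0 = -ln < exp[-U (1/kT - 1/kT0)] >_0 :
# one sampled energy distribution f_0(U) serves a whole range of temperatures.
for kT in (0.8, 1.0, 1.25):
    dA = -np.log(np.mean(np.exp(-U * (1.0 / kT - 1.0 / kT0))))
    exact = -0.5 * np.log(kT / kT0)      # harmonic oscillator: Z proportional to sqrt(kT)
    print(kT, round(dA, 3), round(exact, 3))
```

<p>The estimate degrades once $kT$ drifts so far from $kT_0$ that the required energies are rarely sampled, which is precisely why the paper chains a few overlapping umbrella experiments to span the full temperature range.</p>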
<h2 id="mapping-the-liquid-gas-free-energy-surface">Mapping the Liquid-Gas Free Energy Surface</h2>
<ul>
<li><strong>Methodological Utility</strong>: The method successfully mapped the free energy of the LJ fluid across the liquid-gas transition, a region where conventional methods face convergence problems.</li>
<li><strong>N-Dependence</strong>: Comparison between $N=32$ and $N=108$ showed no statistically significant size dependence for free energy differences, suggesting small systems are sufficient for these estimates.</li>
<li><strong>Comparison with Gosling-Singer Method</strong>: The paper contrasts its results with free energies derived from Gosling and Singer&rsquo;s entropy estimation technique, finding discrepancies as large as $0.4N\epsilon$ (a 20% error in the nonideal entropy), equivalent to overestimating the configurational integral of a 108-particle system by a factor of $10^{16}$.</li>
<li><strong>Generality</strong>: While demonstrated on energy ($U$), the authors note the weighting function $w$ can be any function of the coordinates, generalizing the technique beyond simple free energy differences.</li>
</ul>
<h2 id="reproducibility">Reproducibility</h2>
<p>This 1977 paper predates modern code-sharing practices, and no source code or data files are publicly available. However, the paper provides sufficient algorithmic detail for reimplementation:</p>
<ul>
<li><strong>Constructing $W$</strong>: The paper does not derive $W$ analytically. It uses a <strong>trial-and-error procedure</strong>: start with a short Boltzmann-weighted experiment, then broaden the distribution in stages through short test runs, adjusting weights to flatten the probability density $f_w(\Delta U^*)$. The paper acknowledges this requires &ldquo;interaction between the trial computer results and human judgment.&rdquo;</li>
<li><strong>Specific Weights</strong>: Table I provides the exact numerical weights used for the 32-particle soft-sphere experiment at $N\sigma^3/V = 0.85$, $kT/\epsilon = 2.74$, with values spanning from $W=1{,}500{,}000$ at the lowest energies down to $W=1.0$ at the center and back up to $W=16.0$ at the highest energies.</li>
<li><strong>Potentials</strong>: The Lennard-Jones and inverse-twelve potentials are fully specified (Eqs. 8 and 9).</li>
<li><strong>State Points</strong>: Densities and temperatures are enumerated in Tables II and III.</li>
<li><strong>Block Averaging</strong>: Errors were estimated by treating sequences of $m$ steps as independent samples, where $m$ is determined by increasing block size until no systematic trends can be detected in either the average or the standard deviation of the mean.</li>
</ul>
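<p>The block-averaging step can be sketched on synthetic correlated data; the AR(1) series below stands in for a Monte Carlo time series and is not from the paper:</p>

```python
import numpy as np

rng = np.random.default_rng(2)
# Correlated series standing in for a Monte Carlo observable (AR(1) toy data).
n, a = 100_000, 0.95
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):
    x[t] = a * x[t - 1] + rng.normal()

def blocked_sem(x, m):
    """Standard error of the mean, treating blocks of m steps as independent."""
    nb = len(x) // m
    means = x[: nb * m].reshape(nb, m).mean(axis=1)
    return means.std(ddof=1) / np.sqrt(nb)

# Grow the block size until the estimate stops trending upward: the naive
# m = 1 value badly underestimates the error of a correlated chain.
for m in (1, 10, 100, 1000):
    print(m, round(blocked_sem(x, m), 4))
```

<p>The plateau criterion here is the same one the paper describes: increase $m$ until no systematic trend remains in the estimated standard deviation of the mean.</p>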
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Torrie, G. M., &amp; Valleau, J. P. (1977). Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. <em>Journal of Computational Physics</em>, 23(2), 187-199. <a href="https://doi.org/10.1016/0021-9991(77)90121-8">https://doi.org/10.1016/0021-9991(77)90121-8</a></p>
<p><strong>Publication</strong>: Journal of Computational Physics, 1977</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{torrie1977nonphysical,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Torrie, Glenn M and Valleau, John P}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Journal of Computational Physics}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{23}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{2}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{187--199}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1977}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Elsevier}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1016/0021-9991(77)90121-8}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Lennard-Jones on Adsorption and Diffusion on Surfaces</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/processes-of-adsorption/</link><pubDate>Sun, 17 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/processes-of-adsorption/</guid><description>Lennard-Jones's 1932 foundational paper introducing potential energy surface models to unify physical and chemical adsorption.</description><content:encoded><![CDATA[<h2 id="the-theoretical-foundation-of-adsorption-and-diffusion">The Theoretical Foundation of Adsorption and Diffusion</h2>
<p>This paper represents a foundational <strong>Theory</strong> contribution with dual elements of <strong>Systematization</strong>. It derives physical laws for adsorption potentials (Section 2) and diffusion kinetics (Section 4) from first principles, validating them against external experimental data (Ward, Benton). It bridges <strong>electronic structure theory</strong> (potential curves) and <strong>statistical mechanics</strong> (diffusion rates). It provides a unifying theoretical framework to explain a range of experimental observations.</p>
<h2 id="reconciling-physisorption-and-chemisorption">Reconciling Physisorption and Chemisorption</h2>
<p>The primary motivation was to reconcile conflicting experimental evidence regarding the nature of gas-solid interactions. At the time, it was observed that the same gas and solid could interact weakly at low temperatures (consistent with van der Waals forces) but exhibit strong, chemical-like bonding at higher temperatures, a process requiring significant activation energy. The paper seeks to provide a single, coherent model that can explain both &ldquo;physical adsorption&rdquo; (physisorption) and &ldquo;activated&rdquo; or &ldquo;chemical adsorption&rdquo; (chemisorption) and the transition between them.</p>
<h2 id="quantum-mechanical-potential-energy-surfaces-for-adsorption">Quantum Mechanical Potential Energy Surfaces for Adsorption</h2>
<p>The core novelty is the application of quantum mechanical potential energy surfaces to the problem of surface adsorption. The key conceptual breakthroughs are:</p>
<ol>
<li>
<p><strong>Dual Potential Energy Curves</strong>: The paper proposes that the state of the system must be described by at least two distinct potential energy curves as a function of the distance from the surface:</p>
<ul>
<li>One curve represents the interaction of the intact molecule with the surface (e.g., H₂ with a metal). This corresponds to weak, long-range van der Waals forces.</li>
<li>A second curve represents the interaction of the dissociated constituent atoms with the surface (e.g., 2H atoms with the metal). This corresponds to strong, short-range chemical bonds.</li>
</ul>
</li>
<li>
<p><strong>Activated Adsorption via Curve Crossing</strong>: The transition from the molecular (physisorbed) state to the atomic (chemisorbed) state occurs at the intersection of these two potential energy curves. For a molecule to dissociate and chemisorb, it must possess sufficient energy to reach this crossing point. This energy is identified as the <strong>energy of activation</strong>, which had been observed experimentally.</p>
</li>
<li>
<p><strong>Unified Model</strong>: This model unifies physisorption and chemisorption into a single continuous process. A molecule approaching the surface is first trapped in the shallow potential well of the physisorption curve. If it acquires enough thermal energy to overcome the activation barrier, it can transition to the much deeper potential well of the chemisorption state. This provides a clear physical picture for temperature-dependent adsorption phenomena.</p>
</li>
<li>
<p><strong>Quantum Mechanical Basis for Cohesion</strong>: To explain the nature of the chemisorption bond itself, Lennard-Jones draws on the then-recent quantum theory of metals (Sommerfeld, Bloch). In a metal, electrons are not bound to individual atoms but instead occupy shared energy states (bands) spread across the crystal. When an atom approaches the surface, local energy levels form in the gap between the bulk bands, creating sites where bonding can occur. The adsorption bond arises from the interaction between the valency electron of the approaching atom and conduction electrons of the metal, forming a closed shell analogous to a homopolar bond.</p>
</li>
</ol>
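<p>The two-curve picture is straightforward to reproduce numerically. The Morse-like forms and all parameters below are toy values chosen only for the qualitative shape of the Lennard-Jones diagram, not fitted to any real system:</p>

```python
import numpy as np

# Schematic 1D version of the Lennard-Jones diagram: a shallow molecular
# (van der Waals) curve and a deep atomic (chemical) curve offset far from
# the surface by the molecule's dissociation energy. Toy parameters only.
z = np.linspace(0.8, 6.0, 5001)                  # distance from the surface

def morse(z, depth, a, z0):
    return depth * (1.0 - np.exp(-a * (z - z0)))**2 - depth

V_mol = morse(z, 0.05, 1.0, 3.0)                 # physisorption: shallow, far out
V_atom = morse(z, 1.50, 1.5, 1.5) + 0.60         # chemisorption well + dissociation cost

# The activation energy is the height of the outermost curve crossing
# above the molecular (physisorbed) minimum.
crossings = np.nonzero(np.diff(np.sign(V_atom - V_mol)))[0]
i = crossings[-1]
E_act = V_mol[i] - V_mol.min()
print(round(z[i], 2), round(E_act, 3))
```

<p>Shifting either curve (e.g., deepening the atomic well to mimic a more reactive surface) moves the crossing and lowers $E_{act}$, which is the geometric content of the catalysis argument in the paper.</p>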
<h2 id="validating-theory-against-experimental-gas-solid-interactions">Validating Theory Against Experimental Gas-Solid Interactions</h2>
<p>This is a theoretical paper with no original experiments performed by the author. However, Lennard-Jones validates his theoretical framework against existing experimental data from other researchers:</p>
<ul>
<li><strong>Ward&rsquo;s data</strong>: Hydrogen absorption on copper, used to validate the square root time law for slow sorption kinetics (§4)</li>
<li><strong>Activated adsorption experiments</strong>: Benton and White (hydrogen on nickel), Taylor and Williamson, and Taylor and McKinney all provided isobar data showing temperature-dependent transitions between adsorption types (§3). Garner and Kingman documented three distinct adsorption regimes at different temperatures.</li>
<li><strong>van der Waals constant data</strong>: Used existing measurements of diamagnetic susceptibility to calculate predicted heats of adsorption (e.g., argon on copper yielding approximately 6000 cal/gram atom, nitrogen roughly 2500 cal/gram mol, hydrogen roughly 1300 cal/gram mol)</li>
<li><strong>KCl crystal calculations</strong>: Computed the full attractive potential field of argon above a KCl crystal lattice, accounting for the discrete ionic structure to produce detailed potential energy curves at different surface positions (§2)</li>
</ul>
<p>The validation approach involves deriving theoretical predictions from first principles and showing they match the functional form and magnitude of independently measured experimental results.</p>
<h2 id="the-lennard-jones-diagram-and-activated-adsorption">The Lennard-Jones Diagram and Activated Adsorption</h2>
<p><strong>Key Outcomes</strong>:</p>
<ul>
<li>The paper introduced the now-famous Lennard-Jones diagram for surface interactions, plotting potential energy versus distance from the surface for both molecular and dissociated atomic species. This graphical model became a cornerstone of surface science.</li>
<li>Derived the square root time law ($S \propto \sqrt{t}$) for slow sorption kinetics, validated against Ward&rsquo;s experimental data.</li>
<li>Established quantitative connection between adsorption potentials and measurable atomic properties (diamagnetic susceptibility).</li>
</ul>
<p><strong>Conclusions</strong>:</p>
<ul>
<li>The nature of adsorption is determined by the interplay between two distinct potential states (molecular and atomic).</li>
<li>&ldquo;Activated adsorption&rdquo; is the process of overcoming an energy barrier to transition from a physically adsorbed molecular state to a chemically adsorbed atomic state.</li>
<li>The model predicts that the specific geometry of the surface (i.e., the lattice spacing) and the orientation of the approaching molecule are critical, as they influence the shape of the potential energy surfaces and thus the magnitude of the activation energy.</li>
<li>The reverse process (recombination of atoms and desorption of a molecule) also requires activation energy to move from the chemisorbed state back to the molecular state.</li>
<li>This entire mechanism is proposed as a fundamental factor in heterogeneous <strong>catalysis</strong>, where the surface acts to lower the activation energy for molecular dissociation, facilitating chemical reactions.</li>
</ul>
<p><strong>Limitations</strong>:</p>
<ul>
<li>The initial &ldquo;method of images&rdquo; derivation assumes a perfectly continuous conducting surface, an approximation that breaks down at the atomic orbital level close to the surface.</li>
<li>While Lennard-Jones uses one-dimensional calculations to estimate initial potential well depths, he later qualitatively extends this to 3D &ldquo;contour tunnels&rdquo; to explain surface migration. However, these early geometric approximations lack the many-body, multi-dimensional complexity natively handled by modern Density Functional Theory (DFT) simulations.</li>
</ul>
<hr>
<h2 id="mathematical-derivations">Mathematical Derivations</h2>
<h3 id="van-der-waals-calculation-section-2">Van der Waals Calculation (Section 2)</h3>
<p>The paper derives the attractive force between a neutral atom and a metal surface using the <strong>classical method of electrical images</strong>. The key steps are:</p>
<ol>
<li><strong>Method of Images</strong>: Lennard-Jones models the metal as a continuum of perfectly mobile electric fluid (a perfectly polarisable system). When a neutral atom approaches, its instantaneous dipole moment induces image charges in the metal surface.</li>
</ol>
<figure class="post-figure center ">
    <img src="/img/notes/method-of-images-atom-surface.webp"
         alt="Diagram showing an atom with nucleus (&#43;Ne) and electrons (-e) at distance R from a conducting surface, with its electrical image reflected on the opposite side"
         title="Diagram showing an atom with nucleus (&#43;Ne) and electrons (-e) at distance R from a conducting surface, with its electrical image reflected on the opposite side"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">An atom and its electrical image in a conducting surface. The nucleus (+Ne) and electrons create mirror charges across the metal plane.</figcaption>
    
</figure>

<ol start="2">
<li><strong>The Interaction Potential</strong>: The resulting potential energy $W$ of an atom at distance $R$ from the metal surface is:</li>
</ol>
<p>$$W = -\frac{e^2 \overline{r^2}}{6R^3}$$</p>
<p>where $\overline{r^2}$ is the mean square distance of electrons from the nucleus.</p>
<ol start="3">
<li><strong>Connection to Measurable Properties</strong>: This theoretical potential can be calculated using <strong>diamagnetic susceptibility</strong> ($\chi$). The interaction simplifies to:</li>
</ol>
<p>$$W = \mu R^{-3}$$</p>
<p>where $\mu = mc^2\chi/L$, with $m$ the electron mass, $c$ the speed of light, $\chi$ the diamagnetic susceptibility, and $L$ Loschmidt&rsquo;s number ($6.06 \times 10^{23}$, the constant now called Avogadro&rsquo;s number). This connects the adsorption potential to measurable magnetic properties of the atom.</p>
<ol start="4">
<li><strong>Repulsive Forces and Equilibrium</strong>: By assuming repulsive forces account for approximately 40% of the potential at equilibrium, Lennard-Jones estimates heats of adsorption. For argon on copper, this yields approximately 6000 cal per gram atom. Similar calculations give roughly 2500 cal/gram mol for nitrogen on copper and 1300 cal/gram mol for hydrogen.</li>
</ol>
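<p>As a concrete illustration, the chain from susceptibility to heat of adsorption ($\chi \to \mu \to W \to Q$) can be traced numerically. The sketch below works in CGS units; the argon susceptibility and the equilibrium distance $R$ are illustrative modern values chosen by us, not figures quoted from the paper.</p>

```python
# Sketch of the susceptibility route to the adsorption potential (CGS units).
# The susceptibility and equilibrium distance are illustrative values,
# not numbers taken from the 1932 paper.

m = 9.109e-28    # electron mass (g)
c = 2.998e10     # speed of light (cm/s)
L = 6.06e23      # Loschmidt's (Avogadro's) number, as used in the paper
chi = 19.3e-6    # magnitude of argon's molar diamagnetic susceptibility (cm^3/mol)
R = 3.0e-8       # assumed equilibrium distance from the surface (cm)

mu = m * c**2 * chi / L    # mu = m c^2 chi / L  (erg cm^3 per atom)
W = mu / R**3              # attractive well depth, W = mu R^-3 (erg per atom)

# Convert to calories per gram-atom, then apply the paper's assumption that
# repulsion cancels roughly 40% of the attraction at equilibrium.
W_cal = W * L / 4.184e7    # erg per atom -> cal per gram-atom
Q = 0.6 * W_cal            # estimated heat of adsorption

print(f"W = {W_cal:.0f} cal per gram-atom")
print(f"Q = {Q:.0f} cal per gram-atom (after 40% repulsion)")
```

<p>With these placeholder inputs the estimate lands within an order of magnitude of the paper&rsquo;s argon-on-copper figure, which is about as much as a one-dimensional image-charge model can promise.</p>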
<hr>
<h2 id="kinetic-theory-of-slow-sorption-section-4">Kinetic Theory of Slow Sorption (Section 4)</h2>
<p>The paper extends beyond surface phenomena to model how gas <em>enters</em> the bulk solid (absorption). This section is critical for understanding time-dependent adsorption kinetics.</p>
<h3 id="the-cracks-hypothesis">The &ldquo;Cracks&rdquo; Hypothesis</h3>
<p>Lennard-Jones proposes that &ldquo;slow sorption&rdquo; is <strong>lateral diffusion along surface cracks</strong> (fissures between microcrystal boundaries) in the solid. The outer surface presents not a uniform plane but a network of narrow, deep crevasses where gas can penetrate. This reframes the problem: the rate-limiting step is diffusion along these crack walls, explaining why sorption rates differ from predictions based on bulk diffusion coefficients.</p>
<h3 id="the-diffusion-equation">The Diffusion Equation</h3>
<p>The problem is formulated using Fick&rsquo;s second law:</p>
<p>$$\frac{\partial n}{\partial t} = D \frac{\partial^{2}n}{\partial x^{2}}$$</p>
<p>where $n$ is the concentration of adsorbed atoms, $t$ is time, $D$ is the diffusion coefficient, and $x$ is the position along the crack.</p>
<h3 id="derivation-of-the-diffusion-coefficient">Derivation of the Diffusion Coefficient</h3>
<p>The diffusion coefficient is derived from kinetic theory:</p>
<p>$$D = \frac{\bar{c}^2 \tau^2}{2\tau^*}$$</p>
<p>where:</p>
<ul>
<li>$\bar{c}$ is the mean speed of mobile atoms moving parallel to the surface</li>
<li>$\tau$ is the time an atom spends in the mobile (activated) state</li>
<li>$\tau^*$ is the interval between activation events</li>
</ul>
<p>Atoms are &ldquo;activated&rdquo; to a mobile state with energy $E_0$, after which they can migrate along the surface.</p>
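<p>The formula can be evaluated directly. All parameter values below are illustrative placeholders for a surface-hopping process, not numbers drawn from the paper.</p>

```python
# Minimal sketch of the kinetic-theory diffusion coefficient
# D = c_bar^2 tau^2 / (2 tau_star). Parameter values are illustrative.

c_bar = 1.0e4      # mean speed of a mobile atom along the surface (cm/s)
tau = 1.0e-11      # time spent in the mobile (activated) state (s)
tau_star = 1.0e-6  # mean interval between activation events (s)

D = c_bar**2 * tau**2 / (2 * tau_star)   # cm^2/s
print(f"D = {D:.3e} cm^2/s")
```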
<h3 id="the-square-root-law">The Square Root Law</h3>
<p>Solving the diffusion equation for a semi-infinite crack yields the total amount of gas absorbed $S$ as a function of time:</p>
<p>$$S = 2n_0 \sqrt{\frac{Dt}{\pi}}$$</p>
<p>This predicts that <strong>absorption scales with the square root of time</strong>:</p>
<p>$$S \propto \sqrt{t}$$</p>
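<p>The square-root law can be checked numerically: integrate Fick&rsquo;s second law on a semi-infinite crack whose mouth is held at concentration $n_0$, then compare the absorbed amount against $2n_0\sqrt{Dt/\pi}$. The grid and diffusion coefficient below are arbitrary illustrative choices.</p>

```python
import math

# Numerical check of the square-root law: solve dn/dt = D d^2n/dx^2 on a
# semi-infinite crack with fixed mouth concentration n0, then compare the
# absorbed amount S(t) with the closed form S = 2 n0 sqrt(D t / pi).
# D, n0, and the grid are illustrative, not taken from the paper.

D = 1.0                        # diffusion coefficient (arbitrary units)
n0 = 1.0                       # concentration held at the crack mouth
dx = 0.05
dt = 0.4 * dx**2 / D           # explicit scheme is stable for dt <= dx^2/(2D)
nx = 2000                      # deep enough to look semi-infinite at t_end
n = [0.0] * nx
n[0] = n0                      # boundary condition at the mouth

t, t_end = 0.0, 1.0
while t < t_end:
    new = n[:]
    for i in range(1, nx - 1):
        new[i] = n[i] + D * dt / dx**2 * (n[i + 1] - 2 * n[i] + n[i - 1])
    n = new
    t += dt

# Trapezoid-rule integral of the concentration profile = total gas absorbed.
S_numeric = (0.5 * n[0] + sum(n[1:])) * dx
S_exact = 2 * n0 * math.sqrt(D * t / math.pi)
print(S_numeric, S_exact)      # the two should agree to within a few percent
```

<p>Because the exact solution is the complementary error function profile, the agreement improves as the grid is refined, while the $\sqrt{t}$ scaling holds at any resolution.</p>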
<h3 id="experimental-validation">Experimental Validation</h3>
<p>Lennard-Jones validates this derivation by re-analyzing Ward&rsquo;s experimental data on the Copper/Hydrogen system. Plotting the absorbed quantity against $\sqrt{t}$ produces linear curves, confirming the theoretical prediction. From the slope of the $\log_{10}(S^2/q^2t)$ vs. $1/T$ plot, Ward determined an activation energy of 14,100 cal per gram-molecule for the surface diffusion process.</p>
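<p>The Arrhenius analysis behind Ward&rsquo;s figure can be sketched as follows: if the sorption rate constant obeys $k(T) = A\,e^{-E/RT}$, then $\log_{10} k$ versus $1/T$ is a straight line whose slope yields $E$. The data below are synthetic, generated from an assumed activation energy purely to illustrate the fit.</p>

```python
import math

# Sketch of the Arrhenius slope analysis. The data are synthetic, generated
# from an assumed activation energy E_true; the fit should recover it.

R_gas = 1.987            # gas constant (cal / mol K)
E_true = 14100.0         # assumed activation energy used to generate the data
A = 1.0e6                # illustrative pre-exponential factor

temps = [500.0, 550.0, 600.0, 650.0, 700.0]          # K
xs = [1.0 / T for T in temps]                        # 1/T
ys = [math.log10(A * math.exp(-E_true / (R_gas * T))) for T in temps]

# Least-squares slope of log10(k) vs 1/T.
m = len(xs)
xbar, ybar = sum(xs) / m, sum(ys) / m
slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar) ** 2 for x in xs)

# slope = -E / (R ln 10), so:
E_fit = -slope * math.log(10) * R_gas
print(f"recovered E = {E_fit:.0f} cal/mol")
```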
<hr>
<h2 id="surface-topography-and-3d-contours">Surface Topography and 3D Contours</h2>
<p>The derivations above treat adsorption as a one-dimensional problem (distance from the surface). The paper explicitly expands this to three dimensions to explain surface migration.</p>
<h3 id="potential-tunnels">Potential &ldquo;Tunnels&rdquo;</h3>
<p>Lennard-Jones models the surface potential as <strong>3D contour surfaces</strong> resembling &ldquo;underground caverns&rdquo; or tunnels. The potential energy landscape above a crystalline surface has periodic minima and saddle points.</p>
<h3 id="surface-migration">Surface Migration</h3>
<p>Atoms migrate along &ldquo;tunnels&rdquo; of low potential energy between surface atoms. The activation energy for surface diffusion corresponds to the barrier height between adjacent potential wells on the surface. This geometric picture explains:</p>
<ul>
<li>Why certain crystallographic orientations are more reactive</li>
<li>The temperature dependence of surface diffusion rates</li>
<li>The role of surface defects in catalysis</li>
</ul>
<h2 id="reproducibility">Reproducibility</h2>
<p>This is a 1932 theoretical paper with no associated code, datasets, or models. The mathematical derivations are fully presented in the text and can be followed from first principles. The experimental data referenced (Ward&rsquo;s copper/hydrogen measurements, Benton and White&rsquo;s nickel/hydrogen isobars) are cited from independently published sources. No computational artifacts exist.</p>
<ul>
<li><strong>Status</strong>: Closed (theoretical paper, no reproducibility artifacts)</li>
<li><strong>Hardware</strong>: N/A (analytical derivations only)</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Lennard-Jones, J. E. (1932). Processes of Adsorption and Diffusion on Solid Surfaces. <em>Transactions of the Faraday Society</em>, 28, 333-359. <a href="https://doi.org/10.1039/tf9322800333">https://doi.org/10.1039/tf9322800333</a></p>
<p><strong>Publication</strong>: Transactions of the Faraday Society, 1932</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{lennardjones1932processes,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Processes of adsorption and diffusion on solid surfaces}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Lennard-Jones, John Edward}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Transactions of the Faraday Society}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{28}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{333--359}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1932}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Royal Society of Chemistry}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item></channel></rss>