<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Molecular-Dynamics on Hunter Heidenreich | Senior AI Research Scientist</title><link>https://hunterheidenreich.com/tags/molecular-dynamics/</link><description>Recent content in Molecular-Dynamics on Hunter Heidenreich | Senior AI Research Scientist</description><image><title>Hunter Heidenreich | Senior AI Research Scientist</title><url>https://hunterheidenreich.com/img/avatar.webp</url><link>https://hunterheidenreich.com/img/avatar.webp</link></image><generator>Hugo -- 0.147.7</generator><language>en-US</language><copyright>2026 Hunter Heidenreich</copyright><lastBuildDate>Sat, 30 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://hunterheidenreich.com/tags/molecular-dynamics/index.xml" rel="self" type="application/rss+xml"/><item><title>MB-nrg: CCSD(T)-Accurate Potentials for Polyalanine</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/mb-nrg-polyalanine-ccsdt/</link><pubDate>Sun, 12 Apr 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/mb-nrg-polyalanine-ccsdt/</guid><description>MB-nrg decomposes polyalanine into n-mer building blocks fit to DLPNO-CCSD(T) references, reaching coupled-cluster accuracy for gas-phase peptide dynamics.</description><content:encoded><![CDATA[<h2 id="a-modular-mb-nrg-method-for-biomolecular-potentials">A Modular MB-nrg Method for Biomolecular Potentials</h2>
<p>This is a <strong>Method</strong> paper. Zhou and colleagues extend the MB-nrg (many-body energy) formalism to covalently bonded biomolecules and build the first coupled-cluster-accurate potential energy function (PEF) for polyalanine in the gas phase. The contribution has three parts: a generalization of the MB-nrg decomposition from whole-molecule 1-mers to functional-group &ldquo;natural building blocks,&rdquo; a DLPNO-CCSD(T)/aug-cc-pVTZ training protocol driven by parallel-bias metadynamics sampling, and a demonstration that the resulting PEF reproduces alanine dipeptide energetics and AceAla$_9$Nme secondary-structure dynamics more faithfully than the Amber ff14SB and ff19SB force fields.</p>
<h2 id="why-empirical-force-fields-fall-short-for-protein-dynamics">Why Empirical Force Fields Fall Short for Protein Dynamics</h2>
<p>Protein dynamics span femtosecond vibrations to millisecond conformational changes, and capturing them at atomic resolution is central to understanding catalysis, allostery, and ligand binding. Classical force fields such as CHARMM, OPLS, and Amber approximate the potential energy surface with pairwise-additive analytical terms. This functional form struggles with the many-body interactions that shape disordered regions of proteins, including exchange-repulsion, charge transfer, charge penetration, and cooperative hydrogen bonding. Polarizable force fields add induced dipoles but remain empirically parameterized and fail to capture short-range many-body effects from electron-density overlap.</p>
<p>Quantum-mechanical methods avoid these approximations, but <a href="https://en.wikipedia.org/wiki/Coupled_cluster">coupled cluster theory</a> at the CCSD(T) level scales as $\mathcal{O}(N^7)$ with system size, and even DFT scales as $\mathcal{O}(N^3)$ to $\mathcal{O}(N^4)$, ruling out direct ab initio molecular dynamics for biomolecules. Fragmentation methods like molecular fractionation with conjugate caps (MFCC) mitigate the cost, but they truncate the many-body expansion at two bodies and miss long-range hydrogen bonding. <a href="/notes/chemistry/molecular-simulation/ml-potentials/dark-side-of-forces/">Machine-learned force fields (MLFFs)</a> reach near-QM accuracy at lower cost, yet they typically train on DFT data (inheriting delocalization errors and poor dispersion), struggle with interpretability, and extrapolate unreliably. Existing permutationally invariant polynomial (PIP) approaches scale factorially in the number of atoms, capping direct applicability at roughly ten to fifteen atoms per fragment.</p>
<p>MB-nrg PEFs based on the many-body expansion and PIPs have successfully modeled water, halides in water, carbon dioxide, methane, ammonia, nitrogen pentoxide, and N-methylacetamide. Extending them to covalently bonded biomolecules requires rethinking what counts as a &ldquo;body.&rdquo;</p>
<h2 id="building-polyalanine-from-functional-group-n-mers">Building Polyalanine from Functional-Group n-mers</h2>
<p>The MB-nrg formalism starts from the many-body expansion of the total energy,</p>
<p>$$
E_N(1, \dots, N) = \sum_{i=1}^{N} \varepsilon^{1\mathrm{B}}(i) + \sum_{i&lt;j}^{N} \varepsilon^{2\mathrm{B}}(i,j) + \sum_{i&lt;j&lt;k}^{N} \varepsilon^{3\mathrm{B}}(i,j,k) + \dots + \varepsilon^{N\mathrm{B}}(1, \dots, N)
$$</p>
<p>where each $n$-body contribution is defined recursively as the $n$-mer energy minus all lower-order terms. The full PEF combines physics-based and data-driven components,</p>
<p>$$
V_{\mathrm{MB\text{-}nrg}} = V_{\mathrm{ML}} + V_{\mathrm{phys}}
$$</p>
<p>with $V_{\mathrm{ML}} = V_{\mathrm{ML}}^{1\mathrm{B}} + V_{\mathrm{ML}}^{2\mathrm{B}} + V_{\mathrm{ML}}^{3\mathrm{B}}$ capturing short-range quantum-mechanical interactions, and $V_{\mathrm{phys}} = V_{\mathrm{elec}} + V_{\mathrm{disp}} + V_{\mathrm{rep}}$ supplying electrostatics, dispersion, and repulsion. Dispersion follows a Tang-Toennies damped $C_6/R^6$ form with XDM-derived coefficients; electrostatics uses a Thole-modified self-consistent polarization model inherited from MB-pol; the repulsion term is a Lennard-Jones $R^{-12}$ contribution borrowed from Amber ff14SB, activated only for non-bonded atom pairs not covered by a PIP.</p>
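<p>To make the dispersion term concrete, here is a minimal sketch of a Tang-Toennies damped $C_6/R^6$ pair energy. It uses the standard order-6 damping function $f_6(x) = 1 - e^{-x} \sum_{k=0}^{6} x^k / k!$ with $x = bR$. The function names and the numerical values of $C_6$ and $b$ are hypothetical placeholders, not parameters from the paper.</p>

```python
import math


def tang_toennies_f6(b: float, r: float) -> float:
    """Order-6 Tang-Toennies damping: 1 - exp(-bR) * sum_{k=0}^{6} (bR)^k / k!."""
    x = b * r
    partial_sum = sum(x**k / math.factorial(k) for k in range(7))
    return 1.0 - math.exp(-x) * partial_sum


def damped_dispersion(c6: float, b: float, r: float) -> float:
    """Damped pair dispersion energy, -f6(bR) * C6 / R^6."""
    return -tang_toennies_f6(b, r) * c6 / r**6


# Hypothetical parameters for illustration only (not taken from the paper).
C6 = 300.0  # kcal/mol * Angstrom^6
B = 3.0     # 1/Angstrom

for r in (1.0, 3.0, 6.0):
    print(r, tang_toennies_f6(B, r), damped_dispersion(C6, B, r))
```

<p>The damping factor rises from 0 at short range to 1 at long range, so the $R^{-6}$ singularity is suppressed where the PIPs are expected to carry the short-range physics.</p>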
<p>Each data-driven $n$-body term is expressed as</p>
<p>$$
V_{\mathrm{ML}}^{n\mathrm{B}} = \sum_{\mathrm{M}_1 &lt; \dots &lt; \mathrm{M}_n}^{N} s^{n\mathrm{B}}(\mathrm{M}_1, \dots, \mathrm{M}_n) \, V_{\mathrm{PIP}}^{n\mathrm{B}}(\mathrm{M}_1, \dots, \mathrm{M}_n)
$$</p>
<p>where $V_{\mathrm{PIP}}^{n\mathrm{B}}$ is a permutationally invariant polynomial in Morse-like variables $\xi_{ij} = \exp(-k_{\tau(ij)} R_{ij})$ and $s^{n\mathrm{B}}$ is a switching function.</p>
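<p>As a rough illustration of what permutational invariance means here, the sketch below builds Morse-like variables $\xi_{ij} = \exp(-k R_{ij})$ and averages a single monomial over all permutations of a set of equivalent atoms. It is a toy construction under stated assumptions: one decay constant $k$ is shared by every pair (the real fits use a type-dependent $k_{\tau(ij)}$), and the fragment, coordinates, and function names are hypothetical.</p>

```python
import math
from itertools import permutations


def morse_variable(k: float, r: float) -> float:
    """Morse-like variable xi = exp(-k * R)."""
    return math.exp(-k * r)


def symmetrized_monomial(coords, equivalent, powers, k=1.0):
    """Average prod_ij xi_ij**power over all permutations of equivalent atoms.

    coords     -- list of (x, y, z) tuples, one per atom
    equivalent -- indices of atoms that are chemically interchangeable
    powers     -- {(i, j): power} defining one monomial in the xi_ij
    k          -- single decay constant (simplification: one k for all pair types)
    """
    total = 0.0
    n_perms = 0
    for perm in permutations(equivalent):
        relabel = dict(zip(equivalent, perm))
        term = 1.0
        for (i, j), power in powers.items():
            a = coords[relabel.get(i, i)]
            b = coords[relabel.get(j, j)]
            term *= morse_variable(k, math.dist(a, b)) ** power
        total += term
        n_perms += 1
    return total / n_perms


# Hypothetical CH2-like fragment: atom 0 is carbon, atoms 1 and 2 are equivalent hydrogens.
coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (-0.4, 1.0, 0.0)]
swapped = [coords[0], coords[2], coords[1]]
monomial = {(0, 1): 2, (0, 2): 1}  # xi_01^2 * xi_02, not symmetric on its own

value = symmetrized_monomial(coords, [1, 2], monomial)
value_swapped = symmetrized_monomial(swapped, [1, 2], monomial)
print(value, value_swapped)
```

<p>Swapping the two hydrogens leaves the symmetrized value unchanged, which is the property a PIP basis enforces by construction. Production PIP codes generate the symmetrized basis once rather than looping over permutations at evaluation time.</p>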
<p>The key extension in this paper, building on earlier work on linear alkanes, is to treat functional groups (not whole molecules) as 1-mers. An Ace-capped, Nme-capped polyalanine chain decomposes into three distinct 1-mer types (-CH-, CH$_3$-, -CONH-), five distinct 2-mer types, and six distinct 3-mer types, for 14 unique PIPs that cover every $n$-mer appearing in any AceAla$_n$Nme chain. Cleaving covalent bonds between 1-mers would produce radicals, so the authors cap dangling valences with &ldquo;ghost&rdquo; hydrogen atoms at fixed C-H (1.14 Å) and N-H (1.09 Å) distances. Each $n$-mer energy is then referenced to its own optimized H-capped structure,</p>
<p>$$
E_n(1, \dots, n) = E_n^{\mathrm{H\text{-}capped}}(1, \dots, n) - E_n^{\mathrm{H\text{-}capped,opt}}(1, \dots, n).
$$</p>
<p>In the current implementation, only covalently bonded $n$-mers receive PIPs, the 2-body contribution from a dimer with one intervening 1-mer is folded into the corresponding 3-body term, and non-bonded 1-mers interact through the Lennard-Jones repulsion alone. Crucially, no whole-chain polyalanine data enters any stage of training: every PIP is parameterized on isolated $n$-mer configurations, and the total energy is reconstructed through the many-body expansion.</p>
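<p>The capping and referencing steps can be sketched in a few lines. This is an illustration only: it assumes the ghost hydrogen is placed along the direction of the cleaved bond, which the summary above does not state explicitly, and the coordinates and function names are hypothetical. Only the two cap distances come from the text.</p>

```python
import math

C_H_CAP = 1.14  # Angstrom, fixed C-H cap distance quoted in the text
N_H_CAP = 1.09  # Angstrom, fixed N-H cap distance quoted in the text


def cap_with_ghost_hydrogen(kept_atom, removed_atom, bond_length):
    """Place a capping H at a fixed distance along the kept -> removed direction.

    Assumption: the cap sits on the line of the cleaved covalent bond.
    """
    direction = [r - k for k, r in zip(kept_atom, removed_atom)]
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("atoms coincide; bond direction is undefined")
    return tuple(k + bond_length * d / norm for k, d in zip(kept_atom, direction))


def referenced_energy(e_capped, e_capped_opt):
    """n-mer energy relative to its own optimized H-capped structure."""
    return e_capped - e_capped_opt


# Hypothetical geometry: a carbon at the origin bonded to an atom 1.5 Angstrom away.
carbon = (0.0, 0.0, 0.0)
neighbor = (1.2, 0.9, 0.0)
ghost = cap_with_ghost_hydrogen(carbon, neighbor, C_H_CAP)
print(ghost, math.dist(carbon, ghost))
```

<p>Referencing each $n$-mer to its own optimized capped structure removes constant offsets, so the fitted energies measure distortion rather than absolute electronic energy.</p>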
<h2 id="training-on-dlpno-ccsdt-with-metadynamics-sampling">Training on DLPNO-CCSD(T) with Metadynamics Sampling</h2>
<p>Training sets are generated for each of the 14 $n$-mer types using <a href="https://en.wikipedia.org/wiki/Metadynamics">parallel-bias metadynamics (PBMetaD)</a> with partitioned families, biasing heavy-atom bonds, angles, and dihedrals across 300 K, 500 K, and 700 K in LAMMPS interfaced with PLUMED and modified OPLS/CM1A and Amber ff14SB force fields. For each $n$-mer, 200,000 candidate configurations are sampled, then reduced to roughly 10,000-20,000 training configurations (and about 1,000 test configurations) through Mini-batch K-means clustering on chemically equivalent pairwise distances. Reference energies are computed at the DLPNO-CCSD(T)/aug-cc-pVTZ level in ORCA.</p>
<p>Each PIP minimizes a weighted, ridge-regularized sum of squared errors,</p>
<p>$$
\chi^2 = \sum_{k \in \mathcal{S}} w_k \left[ V^{n\mathrm{B}}(k) - \varepsilon^{n\mathrm{B}}(k) \right]^2 + \Gamma^2 \sum_l c_l^2
$$</p>
<p>with $\Gamma = 0.0005$ throughout and low-energy bias weights</p>
<p>$$
w_k = \left( \frac{\delta E}{\varepsilon^{n\mathrm{B}}(k) - \varepsilon^{n\mathrm{B}}_{\min} + \delta E} \right)^2.
$$</p>
<p>MB-Fit handles the fit, combining simplex optimization for non-linear parameters $k_{\tau(ij)}$ with ridge regression for the linear coefficients $c_l$.</p>
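<p>For intuition about how the weights reshape the loss, the sketch below evaluates $w_k$ and $\chi^2$ for a handful of made-up energies. It is not MB-Fit code: the function names, the energies, and the value of $\delta E$ used in the example are hypothetical, and only $\Gamma = 0.0005$ is taken from the text.</p>

```python
def low_energy_weights(reference, delta_e):
    """w_k = (delta_e / (eps_k - eps_min + delta_e))**2 for each reference energy."""
    e_min = min(reference)
    return [(delta_e / (e - e_min + delta_e)) ** 2 for e in reference]


def chi_squared(predicted, reference, coefficients, delta_e, gamma=0.0005):
    """Weighted sum of squared errors plus the ridge penalty gamma^2 * sum(c_l^2)."""
    weights = low_energy_weights(reference, delta_e)
    misfit = sum(w * (p - r) ** 2 for w, p, r in zip(weights, predicted, reference))
    penalty = gamma**2 * sum(c * c for c in coefficients)
    return misfit + penalty


# Hypothetical n-body energies in kcal/mol and a hypothetical delta_e.
reference = [0.0, 10.0, 30.0]
predicted = [0.1, 10.2, 29.0]
coefficients = [1.0, -2.0]

weights = low_energy_weights(reference, delta_e=10.0)
loss = chi_squared(predicted, reference, coefficients, delta_e=10.0)
print(weights, loss)
```

<p>With these numbers the 1 kcal/mol error on the highest-energy point contributes only 0.0625 to the loss, so the fit tolerates larger errors far from the minimum in exchange for accuracy near it.</p>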
<p>Table 1 in the paper reports, for each of the 14 PIPs, the polynomial degree (5 for the smaller -CH- and CH$_3$- 1-mers, 3 for the larger -CONH- 1-mer and for all 2-mers and 3-mers), the number of symmetrized monomials (ranging from 635 for the -CH- and CH$_3$- 1-mers to 2871 for the -CONH-CH-CONH- 3-mer), the training-set size, and RMSDs for the train and test splits. All training RMSDs stay below 0.4 kcal/mol and all test RMSDs below 0.5 kcal/mol, with the smallest errors for the -CH- and CH$_3$- 1-mers (0.05 kcal/mol train, 0.14 kcal/mol test) and the largest test RMSD (0.47 kcal/mol) for the -CONH-CH- 2-mer.</p>
<p>MD validations run in LAMMPS interfaced with MBX and PLUMED. For alanine dipeptide metadynamics, bias potentials on the backbone $\varphi$ and $\psi$ angles are deposited every 500 steps with a 1.0 kJ/mol height and 11.46° width over 10 ns trajectories in the NVT ensemble, using the velocity-Verlet integrator with a 0.5 fs time step. Analogous MetaD runs with Amber ff14SB and ff19SB are performed in Amber23. The longer AceAla$_9$Nme trajectories start from fully extended structures and run in a 100 Å × 100 Å × 100 Å gas-phase box.</p>
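<p>The metadynamics settings translate into a few derived quantities that are easy to check by hand. The short calculation below uses only the numbers quoted above and assumes hills are deposited over the full 10 ns; the variable names are my own.</p>

```python
import math

time_step_fs = 0.5          # velocity-Verlet time step
trajectory_ns = 10.0        # length of each metadynamics run
deposit_every_steps = 500   # bias deposition stride
width_deg = 11.46           # Gaussian width on phi and psi
height_kj_mol = 1.0         # Gaussian height

total_steps = round(trajectory_ns * 1.0e6 / time_step_fs)   # 1 ns = 1e6 fs
n_gaussians = total_steps // deposit_every_steps
deposit_interval_fs = deposit_every_steps * time_step_fs
width_rad = math.radians(width_deg)
height_kcal_mol = height_kj_mol / 4.184                      # thermochemical calorie

print(total_steps, n_gaussians, deposit_interval_fs)
print(round(width_rad, 3), round(height_kcal_mol, 3))
```

<p>The width of 11.46° corresponds to 0.2 rad, and the run deposits one Gaussian every 250 fs.</p>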
<h2 id="ccsdt-energy-landscapes-free-energy-surfaces-and-helix-dynamics">CCSD(T) Energy Landscapes, Free-Energy Surfaces, and Helix Dynamics</h2>
<p><strong>Alanine dipeptide 2D PES.</strong> Alanine dipeptide geometries are optimized on a <a href="https://en.wikipedia.org/wiki/Ramachandran_plot">Ramachandran</a> grid with 10° spacing at the RI-MP2/def2-TZVP level and then evaluated at DLPNO-CCSD(T)/aug-cc-pVTZ. Despite never seeing a whole alanine dipeptide in training, MB-nrg closely matches the reference locations and relative energies of four minima ($m_1$ to $m_4$), three maxima ($M_1$ to $M_3$), and one saddle point ($X$). Amber ff14SB and ff19SB capture the minima reasonably but substantially overestimate the barriers: at $M_1$, MB-nrg deviates from the reference by -2.41 kcal/mol, while ff14SB and ff19SB overshoot it by +7.50 and +7.83 kcal/mol, respectively. The authors also note that ff19SB incorrectly orders the secondary minima by predicting $m_3$ lower than $m_2$.</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>RMSD overall (kcal/mol)</th>
          <th>RMSD $\leq 10$ kcal/mol</th>
          <th>RMSD $&gt; 10$ kcal/mol</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>MB-nrg</td>
          <td>1.27</td>
          <td>1.18</td>
          <td>1.59</td>
      </tr>
      <tr>
          <td>Amber ff14SB</td>
          <td>6.33</td>
          <td>5.72</td>
          <td>8.44</td>
      </tr>
      <tr>
          <td>Amber ff19SB</td>
          <td>5.23</td>
          <td>4.79</td>
          <td>6.81</td>
      </tr>
  </tbody>
</table>
<p>The authors attribute MB-nrg&rsquo;s residual high-energy error to terminal methyl groups approaching the backbone in conformations where non-bonded 1-mer interactions are modeled by the simple LJ repulsion rather than an explicit PIP.</p>
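<p>The three RMSD columns in the table can be reproduced from any pair of model and reference grids with a few lines of Python. The sketch below assumes that grid points are assigned to the low- or high-energy region by their reference energy relative to the global minimum, which is my reading of the table rather than something spelled out above. The energies in the example are made up.</p>

```python
import math


def rmsd(pairs):
    """Root-mean-square deviation over (model, reference) pairs; NaN if empty."""
    if not pairs:
        return float("nan")
    return math.sqrt(sum((m - r) ** 2 for m, r in pairs) / len(pairs))


def rmsd_by_region(model, reference, threshold=10.0):
    """Overall, low-energy, and high-energy RMSD, split on the reference energy."""
    pairs = list(zip(model, reference))
    low = [(m, r) for m, r in pairs if r <= threshold]
    high = [(m, r) for m, r in pairs if r > threshold]
    return {"overall": rmsd(pairs), "low": rmsd(low), "high": rmsd(high)}


# Hypothetical relative energies in kcal/mol (not values from the paper).
reference = [0.0, 4.0, 9.0, 12.0, 20.0]
model = [0.5, 3.0, 10.0, 14.0, 17.0]

result = rmsd_by_region(model, reference)
print(result)
```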
<p><strong>Harmonic vibrations.</strong> Normal modes for the $m_1$ and $m_4$ alanine dipeptide conformers, computed by diagonalizing the Hessian, match RI-MP2/def2-TZVP references with mean deviations of 17.41 cm$^{-1}$ and 21.07 cm$^{-1}$ across all 60 modes. The authors acknowledge that some of this discrepancy reflects differences in theoretical levels (MB-nrg is trained to CCSD(T)/aug-cc-pVTZ, while the reference normal modes are computed at RI-MP2/def2-TZVP).</p>
<p><strong>Free-energy surfaces.</strong> Well-tempered metadynamics at 300 K produces 2D free-energy surfaces over $(\varphi, \psi)$. MB-nrg yields a smoother FES whose extrema line up with the DLPNO-CCSD(T) reference PES. Amber ff14SB and ff19SB remain reasonable near the low-energy $m_1$ and $m_2$ minima but systematically overestimate the barriers near $M_1$, $M_2$, and $M_3$, which the authors argue artificially confines the dipeptide and suppresses conformational transitions.</p>
<p><strong>Secondary structure in AceAla$_9$Nme.</strong> In 600 ps NVT MD starting from a fully extended structure, the <a href="https://en.wikipedia.org/wiki/STRIDE_(algorithm)">STRIDE algorithm</a> tracks residue-level secondary structures. Amber ff14SB and ff19SB collapse into $\alpha$-helices at roughly 40 ps and 80 ps, respectively, with ff19SB remaining especially rigid. MB-nrg takes about 100 ps before helices begin to form and then exhibits continuous oscillations between $3_{10}$- and $\alpha$-helical conformations. Ramachandran plots over the nine alanine residues show MB-nrg exploring the &ldquo;bridge&rdquo; region ($\varphi &lt; 0°$, $-20° \leq \psi \leq 20°$) associated with $3_{10}$-helices and sampling the left-handed $\alpha_L$ region that Amber rarely visits. The authors tie this flexibility to experimental observations of alanine-rich peptides in the gas phase and to similar predictions from GEMS and MACE-OFF.</p>
<h2 id="transferability-without-whole-chain-training-data">Transferability Without Whole-Chain Training Data</h2>
<p>The paper demonstrates that a modular, bottom-up PEF built from functional-group $n$-mers can reach CCSD(T) accuracy for polyalanine in the gas phase without ever training on whole-chain data. Truncating explicit data-driven terms at the 3-body level appears to balance cost and fidelity, with long-range effects handled by many-body polarization in $V_{\mathrm{elec}}$ and by Amber-derived repulsion between non-bonded 1-mers. The 2D PES, harmonic frequencies, free-energy surface, and secondary-structure dynamics each validate a different facet of the model.</p>
<p>The authors are explicit about limitations. The current PEF applies only to gas-phase polyalanine; solvent effects and other amino acids remain open. The Lennard-Jones repulsion for non-bonded 1-mers is a placeholder for eventual 2-body PIPs that should capture short-range interactions during folding. Long-range hydrogen bonding in compact secondary structures (π-helices, $3_{10}$-helices, $\alpha$-helices) may produce non-negligible higher-order many-body contributions that the current 3-body truncation omits. The 2-body contribution from a dimer with one intervening monomer is currently folded into the 3-body term because of steric conflicts between capping hydrogens, and a systematic fix is flagged for future work. The authors position this paper as the first in a series (the &ldquo;I.&rdquo; in the title refers to &ldquo;Polyalanine in the Gas Phase&rdquo;) that will extend MB-nrg to broader biomolecular systems under physiological conditions. The follow-up, <a href="/notes/chemistry/molecular-simulation/ml-potentials/mb-nrg-polyalanine-water/">MB-nrg in Solution: Polyalanine in Water with CCSD(T) PEFs</a>, adds explicit 1-mer/water 2-body PIPs and benchmarks alanine dipeptide solvation.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Training</td>
          <td>Per $n$-mer pools from PBMetaD in LAMMPS/PLUMED</td>
          <td>200,000 configurations each, reduced to ~10-20k via Mini-batch K-means</td>
          <td>OPLS/CM1A and Amber ff14SB sampled at 300 K, 500 K, 700 K</td>
      </tr>
      <tr>
          <td>Training labels</td>
          <td>DLPNO-CCSD(T)/aug-cc-pVTZ in ORCA</td>
          <td>14 unique $n$-mer types</td>
          <td>Domain-based local pair natural orbital approximation to canonical CCSD(T)</td>
      </tr>
      <tr>
          <td>Test</td>
          <td>Held-out $n$-mer configurations</td>
          <td>~1,000 per $n$-mer</td>
          <td>Same clustering protocol</td>
      </tr>
      <tr>
          <td>Alanine dipeptide benchmark</td>
          <td>Ramachandran grid at 10° spacing, RI-MP2/def2-TZVP geometries</td>
          <td>1,296 grid points (approximate)</td>
          <td>Single-point energies at DLPNO-CCSD(T)/aug-cc-pVTZ, ff14SB, ff19SB, MB-nrg</td>
      </tr>
      <tr>
          <td>AceAla$_9$Nme dynamics</td>
          <td>600 ps NVT MD from fully extended start</td>
          <td>Single trajectory per model</td>
          <td>STRIDE for secondary-structure assignment</td>
      </tr>
  </tbody>
</table>
<p>Per the Data Availability statement, &ldquo;any data generated and analyzed in this study are available from the authors upon request.&rdquo; No public release is announced in the text.</p>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li>Many-body expansion of the energy with 1-, 2-, and 3-body data-driven terms.</li>
<li>Permutationally invariant polynomials in Morse-exponential variables $\xi_{ij} = \exp(-k_{\tau(ij)} R_{ij})$, symmetrized over chemically equivalent atoms.</li>
<li>&ldquo;Ghost&rdquo; H-capping at cleaved covalent bonds, with fixed C-H (1.14 Å) and N-H (1.09 Å) bond lengths and per-$n$-mer optimized-structure referencing.</li>
<li>Non-linear parameters fit by simplex minimization, linear coefficients by ridge regression with $\Gamma = 0.0005$.</li>
<li>Low-energy weighting in the loss through $w_k = (\delta E / (\varepsilon^{n\mathrm{B}}(k) - \varepsilon^{n\mathrm{B}}_{\min} + \delta E))^2$.</li>
<li>Tang-Toennies damped dispersion with XDM-derived $C_6$ and damping parameters, Thole-modified many-body polarization, and LJ repulsion borrowed from Amber ff14SB.</li>
</ul>
<h3 id="models">Models</h3>
<ul>
<li>14 PIPs total covering three 1-mer types, five 2-mer types, and six 3-mer types. Polynomial degree is 5 for the -CH- and CH$_3$- 1-mers, and 3 for the -CONH- 1-mer together with all 2-mers and 3-mers. Term counts range from 635 (-CH-, CH$_3$-) to 2871 (-CONH-CH-CONH-).</li>
<li>MB-nrg PEF implemented in the MBX code and exercised through LAMMPS and PLUMED.</li>
<li>Training set sizes per $n$-mer range from roughly 12,000 to 47,000 configurations (the -CONH- 1-mer dataset is the largest at 47,438).</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>MB-nrg</th>
          <th>Amber ff14SB</th>
          <th>Amber ff19SB</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>$n$-mer training RMSD</td>
          <td>$\leq 0.35$ kcal/mol</td>
          <td>n/a</td>
          <td>n/a</td>
      </tr>
      <tr>
          <td>$n$-mer test RMSD</td>
          <td>$\leq 0.47$ kcal/mol</td>
          <td>n/a</td>
          <td>n/a</td>
      </tr>
      <tr>
          <td>Alanine dipeptide 2D PES RMSD (overall)</td>
          <td>1.27 kcal/mol</td>
          <td>6.33 kcal/mol</td>
          <td>5.23 kcal/mol</td>
      </tr>
      <tr>
          <td>Same, $\leq 10$ kcal/mol region</td>
          <td>1.18 kcal/mol</td>
          <td>5.72 kcal/mol</td>
          <td>4.79 kcal/mol</td>
      </tr>
      <tr>
          <td>Same, $&gt; 10$ kcal/mol region</td>
          <td>1.59 kcal/mol</td>
          <td>8.44 kcal/mol</td>
          <td>6.81 kcal/mol</td>
      </tr>
      <tr>
          <td>Alanine dipeptide $m_1$ normal-mode mean deviation vs RI-MP2/def2-TZVP</td>
          <td>17.41 cm$^{-1}$</td>
          <td>n/a</td>
          <td>n/a</td>
      </tr>
      <tr>
          <td>Alanine dipeptide $m_4$ normal-mode mean deviation vs RI-MP2/def2-TZVP</td>
          <td>21.07 cm$^{-1}$</td>
          <td>n/a</td>
          <td>n/a</td>
      </tr>
      <tr>
          <td>AceAla$_9$Nme helix-formation onset (from extended start)</td>
          <td>~100 ps ($\alpha$/$3_{10}$ mix)</td>
          <td>~40 ps ($\alpha$)</td>
          <td>~80 ps ($\alpha$)</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<p>The work was funded by the Air Force Office of Scientific Research (FA9550-20-1-0351) and NSF award 2311260. Computational resources came from the DoD High Performance Computing Modernization Program, the San Diego Supercomputer Center via ACCESS allocation CHE240114, and NERSC (contract DE-AC02-05CH11231, award BES-ERCAP0030920). Specific wall-clock and node-hour figures are not reported in the main text.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Zhou, R., Bull-Vulpe, E. F., Pan, Y., &amp; Paesani, F. (2025). Data-Driven Many-Body Simulations of Biomolecules with CCSD(T) Accuracy: I. Polyalanine in the Gas Phase. <em>ChemRxiv</em>. <a href="https://doi.org/10.26434/chemrxiv-2025-b05k5">https://doi.org/10.26434/chemrxiv-2025-b05k5</a></p>
<p><strong>Publication</strong>: ChemRxiv preprint, 25 March 2025.</p>
<p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://github.com/paesanilab/MBX">MBX software (Paesani group)</a></li>
<li><a href="https://github.com/paesanilab/MB-Fit">MB-Fit (training pipeline)</a></li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@misc</span>{zhou2025data,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Data-Driven Many-Body Simulations of Biomolecules with CCSD(T) Accuracy: I. Polyalanine in the Gas Phase}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Zhou, Ruihan and Bull-Vulpe, Ethan F. and Pan, Yuanhui and Paesani, Francesco}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.26434/chemrxiv-2025-b05k5}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">howpublished</span>=<span style="color:#e6db74">{\url{https://doi.org/10.26434/chemrxiv-2025-b05k5}}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>MB-nrg in Solution: Polyalanine in Water with CCSD(T) PEFs</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/mb-nrg-polyalanine-water/</link><pubDate>Sun, 12 Apr 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/mb-nrg-polyalanine-water/</guid><description>Zhou and Paesani extend MB-nrg to peptide-water interactions, training 1-mer-water 2-body PIPs on DLPNO-CCSD(T) and benchmarking alanine dipeptide solvation.</description><content:encoded><![CDATA[<h2 id="extending-mb-nrg-from-gas-phase-polyalanine-to-aqueous-solution">Extending MB-nrg from Gas-Phase Polyalanine to Aqueous Solution</h2>
<p>This is a <strong>Method</strong> paper, the second installment in Zhou and Paesani&rsquo;s MB-nrg-for-biomolecules series. Paper I (covered in <a href="/notes/chemistry/molecular-simulation/ml-potentials/mb-nrg-polyalanine-ccsdt/">MB-nrg: CCSD(T)-Accurate Potentials for Polyalanine</a>) decomposed gas-phase polyalanine into functional-group $n$-mers and fit permutationally invariant polynomials (PIPs) to DLPNO-CCSD(T)/aug-cc-pVTZ reference data. This sequel adds the missing piece: explicit, machine-learned 2-body interactions between every polyalanine functional-group 1-mer and a water molecule, trained on the same <a href="https://en.wikipedia.org/wiki/Coupled_cluster">coupled-cluster</a> reference. The resulting PEF couples the gas-phase intramolecular MB-nrg term, the MB-pol water model, and a new MB-nrg ala-water cross term within a single modular many-body decomposition.</p>
<h2 id="why-empirical-force-fields-struggle-with-hydrated-peptides">Why Empirical Force Fields Struggle with Hydrated Peptides</h2>
<p>Biomolecular function in water emerges from a coupling of intramolecular flexibility with solvent-mediated interactions, including hydrogen-bond networks, cooperative polarization, dispersion, and short-range exchange-repulsion. Empirical force fields such as AMBER, CHARMM, and OPLS approximate the multidimensional PES with pairwise-additive analytical terms whose parameters are tuned to experimental observables or low-level quantum data. The authors note that this functional form leads to systematic errors in predicted conformational ensembles for short peptides and <a href="https://en.wikipedia.org/wiki/Intrinsically_disordered_proteins">intrinsically disordered proteins (IDPs)</a>, with reported overpopulation of polyproline II (pPII) basins and antiparallel $\beta$ regions for alanine residues, plus underrepresentation of the transitional $\beta$ basin compared to experiment.</p>
<p>Polarizable force fields recover dielectric and hydration trends through induced dipoles, but still lean on empirical functional forms and miss short-range quantum effects (charge transfer, charge penetration, exchange-repulsion) that arise from electron-density overlap. <a href="/notes/chemistry/molecular-simulation/ml-potentials/dark-side-of-forces/">Machine-learned force fields</a> like MACE-OFF, GEMS, and FeNNix-Bio1 have improved bio-organic accuracy, but they still depend critically on the diversity and quality of training data, struggle to decompose energies into physically interpretable components, and most rely on DFT references that inherit delocalization errors and incomplete long-range correlation. Local descriptors common to MLFFs also limit treatment of long-range electrostatics and many-body correlations, both essential for biomolecular solvation.</p>
<p>The MB-nrg formalism, originally developed for water and small molecules and recently extended to alkanes and gas-phase polyalanine, offers an alternative: a rigorous many-body expansion (MBE) of the energy combined with both data-driven $n$-body PIPs and physics-based long-range terms. Paper II asks whether this modular gas-phase scaffold can be cleanly extended to aqueous environments by adding only short-range peptide-water 2-body PIPs.</p>
<h2 id="a-modular-mb-nrg-pef-for-polyalanine-in-water">A Modular MB-nrg PEF for Polyalanine in Water</h2>
<p>The MBE writes the total energy of a system of $N$ 1-mers as</p>
<p>$$
E_N(1, \dots, N) = \sum_{i=1}^{N} \varepsilon^{1\mathrm{B}}(i) + \sum_{i&lt;j}^{N} \varepsilon^{2\mathrm{B}}(i,j) + \sum_{i&lt;j&lt;k}^{N} \varepsilon^{3\mathrm{B}}(i,j,k) + \dots + \varepsilon^{N\mathrm{B}}(1, \dots, N)
$$</p>
<p>with each $n$-body term defined recursively as the $n$-mer energy minus all lower-order contributions. The MBE converges quickly for insulating molecular systems with large electronic band gaps (such as water and peptides), so explicit PIP corrections are typically truncated at $n \leq 4$, with higher-order effects absorbed into many-body polarization.</p>
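<p>The recursive definition is easy to state in code. The sketch below computes $\varepsilon^{n\mathrm{B}}$ for every subset of monomers by subtracting all lower-order terms of its sub-subsets from the subset energy. The energy function is a toy model with made-up numbers, and the names are hypothetical; in practice each subset energy would come from an electronic-structure calculation.</p>

```python
from itertools import combinations


def many_body_terms(energy_of, n_monomers, max_order):
    """Return {subset: epsilon} with epsilon = E(subset) - all lower-order terms."""
    eps = {}
    for order in range(1, max_order + 1):
        for subset in combinations(range(n_monomers), order):
            lower = sum(
                eps[sub]
                for size in range(1, order)
                for sub in combinations(subset, size)
            )
            eps[subset] = energy_of(subset) - lower
    return eps


# Toy energies (hypothetical numbers, not from the paper).
ONE_BODY = [1.0, 2.0, 3.0]
PAIR = {(0, 1): -0.5, (0, 2): -0.2, (1, 2): -0.1}
TRIPLE = 0.05


def toy_energy(subset):
    """Energy of a subset built from known 1-, 2-, and 3-body pieces."""
    energy = sum(ONE_BODY[i] for i in subset)
    energy += sum(PAIR[p] for p in combinations(subset, 2))
    if len(subset) == 3:
        energy += TRIPLE
    return energy


terms = many_body_terms(toy_energy, n_monomers=3, max_order=3)
total = sum(terms.values())
print(terms[(0, 1)], terms[(0, 1, 2)], total)
```

<p>Because the expansion is carried to the full order here, the terms sum back to the total energy exactly. Truncating at a lower order drops the remaining terms, which is the approximation the physics-based polarization is meant to compensate for.</p>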
<p>For polyalanine in water, the total potential is partitioned into three modular blocks:</p>
<p>$$
V_{\mathrm{MB\text{-}nrg}}^{\mathrm{tot}} = V_{\mathrm{MB\text{-}nrg}}^{\mathrm{ala}} + V_{\mathrm{MB\text{-}pol}}^{\mathrm{wat}} + V_{\mathrm{MB\text{-}nrg}}^{\mathrm{ala\text{-}wat}}
$$</p>
<p>where $V_{\mathrm{MB\text{-}nrg}}^{\mathrm{ala}}$ is the gas-phase intramolecular polyalanine PEF from Paper I, $V_{\mathrm{MB\text{-}pol}}^{\mathrm{wat}}$ is the MB-pol water model, and $V_{\mathrm{MB\text{-}nrg}}^{\mathrm{ala\text{-}wat}}$ is the new peptide-water cross term. The cross term itself follows the MB-nrg recipe of splitting machine-learned and physics-based contributions:</p>
<p>$$
V_{\mathrm{MB\text{-}nrg}}^{\mathrm{ala\text{-}wat}} = V_{\mathrm{ML}} + V_{\mathrm{phys}}
$$</p>
<p>with $V_{\mathrm{ML}} = V_{\mathrm{ML}}^{2\mathrm{B}}$ (only 2-body PIPs in this implementation) and $V_{\mathrm{phys}} = V_{\mathrm{elec}} + V_{\mathrm{disp}}$. The 2-body machine-learned term sums switched PIPs over every (1-mer, water) dimer:</p>
<p>$$
V_{\mathrm{ML}}^{2\mathrm{B}} = \sum_{i=1}^{N} s^{2\mathrm{B}}(\mathrm{M}_i, \mathrm{WAT}) \, V_{\mathrm{PIP}}^{2\mathrm{B}}(\mathrm{M}_i, \mathrm{WAT})
$$</p>
<p>where $\mathrm{M}_i$ is the $i$-th polyalanine functional-group 1-mer (-CH-, CH$_3$-, or -CONH-), WAT is a water molecule, and $s^{2\mathrm{B}}$ is a cosine switching function</p>
<p>$$
s^{2\mathrm{B}}(x) = \begin{cases} 1 &amp; x &lt; 0 \\ (1 + \cos(\pi x))/2 &amp; 0 \leq x &lt; 1 \\ 0 &amp; 1 \leq x \end{cases}, \quad x = \frac{R - R_{\mathrm{in}}}{R_{\mathrm{out}} - R_{\mathrm{in}}}
$$</p>
<p>that smoothly attenuates the short-range PIP beyond a defined distance to preserve energy conservation in MD. The physics-based block uses a Thole-modified self-consistent polarization model (inherited from MB-pol) for $V_{\mathrm{elec}}$ and a Tang-Toennies damped dispersion sum</p>
<p>$$
V_{\mathrm{disp}} = -\sum_{\substack{\alpha \in 1\text{-mers} \\ \beta \in \mathrm{water}}} f(\mathrm{b}_{\alpha\beta} R_{\alpha\beta}) \, \frac{C_{6, \alpha\beta}}{R_{\alpha\beta}^{6}}
$$</p>
<p>with $C_{6, \alpha\beta}$ coefficients and atomic polarizabilities derived from the exchange-hole dipole moment (XDM) method, and atomic charges fit to reproduce the permanent multipole moments of each $n$-mer&rsquo;s optimized structure.</p>
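<p>A minimal sketch of the cosine switching function follows. It assumes the cosine argument is $\pi x$, so that the switch falls continuously from 1 at $R_{\mathrm{in}}$ to 0 at $R_{\mathrm{out}}$, consistent with the smooth attenuation described above. The cutoff values in the example are hypothetical, as is the function name.</p>

```python
import math


def switch_2b(r: float, r_in: float, r_out: float) -> float:
    """Cosine switch: 1 below r_in, 0 above r_out, (1 + cos(pi * x)) / 2 in between."""
    x = (r - r_in) / (r_out - r_in)
    if x < 0.0:
        return 1.0
    if x < 1.0:
        return 0.5 * (1.0 + math.cos(math.pi * x))
    return 0.0


# Hypothetical inner and outer cutoffs in Angstrom (not values from the paper).
R_IN, R_OUT = 5.0, 7.0

for r in (4.5, 5.0, 6.0, 6.999, 7.0, 8.0):
    print(r, switch_2b(r, R_IN, R_OUT))
```

<p>Both the function and its first derivative vanish at $R_{\mathrm{out}}$, so the switched PIP contributes no force discontinuity when a water molecule crosses the cutoff.</p>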
<p>The authors stress that explicit 3-body and higher peptide-water PIPs are deliberately omitted in this first implementation; their effects are absorbed into the classical polarization term. They flag that strongly hydrogen-bonded or cooperative configurations may benefit from adding higher-body corrections in future work, following the precedent of MB-pol(2023) for water.</p>
<h2 id="training-set-generation-and-dlpno-ccsdt-reference-data">Training Set Generation and DLPNO-CCSD(T) Reference Data</h2>
<p>Training pools for the three 1-mer/water dimers (CH$_3$-/H$_2$O, -CH-/H$_2$O, -CONH-/H$_2$O) extend the <a href="https://en.wikipedia.org/wiki/Metadynamics">parallel-bias metadynamics with partitioned families (PBMetaD+PFs)</a> protocol from Paper I. Covalent boundaries are capped with &ldquo;ghost&rdquo; hydrogens at fixed C-H (1.14 Å) and N-H (1.09 Å) distances to preserve closed-shell character; each 2-body energy is referenced to the corresponding optimized capped 1-mer/water geometry to remove constant offsets.</p>
<p>PBMetaD simulations are run in LAMMPS interfaced with PLUMED, using Amber ff14SB for the alanine 1-mers and TIP4P/2005f for water. Collective variables span all heavy-atom bonds, angles, and dihedrals in each dimer. To target distinct interaction regimes, three separate biased runs apply upper and lower walls on the 1-mer/water center-of-mass distance: 0-4 Å (short-range repulsion), 4-7 Å (mid-range attraction), and 7-10 Å (long-range orientation-dependent interactions). Each dimer yields about 600,000 configurations, reduced to roughly 40,000 training and 2,000 test configurations per type by K-means clustering.</p>
<p>Reference 2-body energies are computed at the DLPNO-CCSD(T)/aug-cc-pVTZ level in ORCA, using the aug-cc-pVTZ/C auxiliary basis, the RIJCOSX approximation, TightSCF, TightPNO, and the PModel pair-selection option. The counterpoise method corrects every 2-body energy for <a href="https://en.wikipedia.org/wiki/Basis_set_superposition_error">basis set superposition error</a>.</p>
<p>Each PIP is fit by minimizing a weighted, ridge-regularized least-squares objective over the training set $\mathcal{S}$:</p>
<p>$$
\chi^2 = \sum_{k \in \mathcal{S}} w_k \left[ V^{2\mathrm{B}}(k) - \varepsilon^{2\mathrm{B}}(k) \right]^2 + \Gamma^2 \sum_l c_l^2
$$</p>
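<p>where $V^{2\mathrm{B}}(k)$ is the fitted 2-body energy of configuration $k$, $\varepsilon^{2\mathrm{B}}(k)$ is the corresponding reference energy, $c_l$ are the linear PIP coefficients, and $\Gamma = 0.0005$ throughout. Training weights bias the fit toward low-energy configurations,</p>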
<p>$$
w_k = \left( \frac{\delta E}{\varepsilon^{2\mathrm{B}}(k) - \varepsilon_{\mathrm{min}}^{2\mathrm{B}} + \delta E} \right)^2
$$</p>
<p>with $\delta E = 40$ kcal/mol for all 1-mer-water pairs. MB-Fit handles the optimization, combining simplex minimization for non-linear parameters (Morse decay constants) with ridge regression for the linear coefficients.</p>
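<p>A minimal sketch of the weighting scheme and objective, written in plain Python for illustration. The function names and the toy energies are hypothetical and are not part of MB-Fit; the sketch only evaluates $w_k$ and $\chi^2$ for given predictions and does not perform the simplex or ridge optimization itself.</p>

```python
def training_weight(energy, energy_min, delta_e=40.0):
    """Weight w_k for a configuration with reference 2-body energy `energy` (kcal/mol)."""
    ratio = delta_e / (energy - energy_min + delta_e)
    return ratio * ratio


def chi_squared(predicted, reference, coefficients, gamma=0.0005, delta_e=40.0):
    """Weighted squared error plus the ridge penalty Gamma^2 * sum(c_l^2)."""
    energy_min = min(reference)
    data_term = sum(
        training_weight(ref, energy_min, delta_e) * (pred - ref) ** 2
        for pred, ref in zip(predicted, reference)
    )
    ridge_term = gamma ** 2 * sum(c * c for c in coefficients)
    return data_term + ridge_term


# Toy reference energies (kcal/mol), invented purely for illustration.
reference = [-5.0, 0.0, 35.0]
weights = [training_weight(e, min(reference)) for e in reference]
```

<p>With $\delta E = 40$ kcal/mol, the lowest-energy configuration receives weight 1, and a configuration 40 kcal/mol above the minimum receives weight 0.25, so high-energy repulsive geometries still contribute to the fit but with reduced influence.</p>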
<p>Table 1 reports the PIP specifications. All three PIPs use polynomial degree 3 with a complete, unscreened basis. The -CH- and CH$_3$- dimers each require 710 symmetrized terms; the chemically richer -CONH- dimer requires 1,267 terms to capture its dipolar character and directional hydrogen bonding. Training-set sizes range from 41,781 to 43,174 configurations.</p>
<table>
  <thead>
      <tr>
          <th>1-mer type</th>
          <th>PIP degree</th>
          <th>PIP terms</th>
          <th>Training configs</th>
          <th>Train RMSD (kcal/mol)</th>
          <th>Test RMSD (kcal/mol)</th>
          <th>Train MAE</th>
          <th>Test MAE</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>-CH-</td>
          <td>3</td>
          <td>710</td>
          <td>43,174</td>
          <td>0.07</td>
          <td>0.08</td>
          <td>0.06</td>
          <td>0.06</td>
      </tr>
      <tr>
          <td>CH$_3$-</td>
          <td>3</td>
          <td>710</td>
          <td>43,172</td>
          <td>0.08</td>
          <td>0.08</td>
          <td>0.05</td>
          <td>0.05</td>
      </tr>
      <tr>
          <td>-CONH-</td>
          <td>3</td>
          <td>1,267</td>
          <td>41,781</td>
          <td>0.18</td>
          <td>0.20</td>
          <td>0.13</td>
          <td>0.16</td>
      </tr>
  </tbody>
</table>
<p>All RMSDs are at or below 0.20 kcal/mol on both train and test splits, well within the 1 kcal/mol threshold conventionally taken as chemical accuracy.</p>
<h2 id="validation-dimer-scans-free-energy-surfaces-and-hydration">Validation: Dimer Scans, Free-Energy Surfaces, and Hydration</h2>
<p>The authors stage three validation studies of increasing complexity, each touching a distinct facet of the new PEF.</p>
<p><strong>Alanine dipeptide-water dimer scans.</strong> One-dimensional scans probe the interaction energy along four hydrogen-bonding coordinates of an alanine dipeptide-water dimer: O$_1$-H$_w$, H$_1$-O$_w$, O$_2$-H$_w$, and H$_2$-O$_w$, where subscripts 1 and 2 mark the acetyl and N-methyl termini. The dipeptide is constrained to four representative <a href="https://en.wikipedia.org/wiki/Ramachandran_plot">Ramachandran</a> conformations: C5 ($\varphi = -150°$, $\psi = 150°$), pPII ($\varphi = -80°$, $\psi = 150°$), C7$_{\mathrm{eq}}$ ($\varphi = -80°$, $\psi = 70°$), and right-handed $\alpha$-helix $\alpha_R$ ($\varphi = -80°$, $\psi = -30°$). MB-nrg closely tracks the DLPNO-CCSD(T)/aug-cc-pVTZ reference curves across all 16 (4 conformation $\times$ 4 site) scans, despite never seeing the full dipeptide-water surface during training. Amber ff14SB/TIP3P and ff19SB/OPC underestimate hydrogen-bond depths and miss curvature near equilibrium, with the ff14SB/TIP3P combination yielding slightly better overall agreement than ff19SB/OPC even though TIP3P is the less accurate water model.</p>
<p>Two specific failure modes of the empirical force fields stand out. In the pPII conformation, both ff14SB and ff19SB predict significantly deeper interaction wells than the reference, overstabilizing several hydrogen bonds. In the H$_2$-O$_w$ scan of the $\alpha_R$ conformation, both empirical FFs exhibit a spurious energy barrier between 2.5 and 4.0 Å that the authors trace to the simple Lennard-Jones repulsion between the acetyl carbonyl oxygen and water; MB-nrg and DLPNO-CCSD(T) instead show a smoothly decaying profile. The one MB-nrg deviation noted occurs in the C5 H$_1$-O$_w$ scan in the 1.5-2.5 Å range, where MB-nrg predicts a slightly more attractive interaction than the reference. Here the H$_1$-O$_2$ distance is 2.3 Å and water acts simultaneously as acceptor at H$_1$ and donor to O$_2$, a cooperative pattern the authors expect would require explicit 2-mer-water or 3-mer-water terms to fully reproduce.</p>
<p><strong>Free-energy surface in explicit MB-pol water.</strong> Four-walker well-tempered metadynamics (WT-MetaD) simulations explore the conformational landscape of alanine dipeptide as a function of $(\varphi, \psi)$, biasing the central alanine residue&rsquo;s backbone dihedrals every 500 steps with 1.0 kJ/mol Gaussians of 11.46° width. The free-energy section reports 2.5 ns per replica across four parallel walkers (10 ns aggregate, matching the Figure 6 caption); the methods section states 8 ns total, an internal inconsistency in the paper. The MB-nrg FES recovers all major low-energy conformers identified by NMR and prior MP2/DFT studies: a global minimum at $\alpha_R$, additional local minima in C5, $\beta_2$, and $\alpha_L$, and a metastable pPII basin. The C7$_{\mathrm{eq}}$ minimum that dominates the gas-phase Ramachandran surface in Paper I is significantly destabilized in solution, consistent with experiment.</p>
<p>Quantitatively, MB-nrg predicts $\alpha_R$ and $\beta_2$ as isoenergetic global minima, with C5 about 3 kcal/mol higher in free energy. Prior DFT-with-implicit-solvation studies (Mironov et al., Yang and Honig) report C5, $\alpha_R$, and $\beta_2$ as nearly isoenergetic, and the authors note that the discrepancy may reflect the explicit MB-pol water treatment, residual DFT errors in the reference, or both. They plan a systematic benchmark of MB-nrg PEFs for diverse polypeptides against both DFT and DLPNO-CCSD(T) data as future work. The Amber FESs over-stabilize pPII relative to C5/$\alpha_R$, contradicting experimental and DFT benchmarks; ff19SB/OPC also exhibits a spurious C7$_{\mathrm{eq}}$ minimum that is absent from MB-nrg.</p>
<p><strong>Hydration radial distribution functions.</strong> Site-site RDFs at 300 K for the same hydrogen-bond contacts (O$_1$-H$_w$, O$_2$-H$_w$, H$_1$-O$_w$, H$_2$-O$_w$) are computed from NVT MD trajectories. All three models reproduce well-defined first-shell peaks near 2.0 Å. For the O-H$_w$ pairs, MB-nrg shows a broader, slightly right-shifted second-shell peak, indicating less rigid water structure beyond the first shell. The amide-hydrogen RDFs are nearly identical between ff14SB/TIP3P and ff19SB/OPC, while MB-nrg reveals subtle first-shell shifts (shorter H$_1$-O$_w$, longer H$_2$-O$_w$) and weaker, less-defined second-shell features near 3.7-3.8 Å that are absent from the empirical force fields and consistent with prior ab initio MD on alanine dipeptide.</p>
<h2 id="a-modular-path-to-chemically-accurate-biomolecular-simulations">A Modular Path to Chemically Accurate Biomolecular Simulations</h2>
<p>Across these benchmarks, the same picture emerges: a modular, bottom-up MB-nrg PEF built from functional-group $n$-mers and trained only on isolated 1-mer-water dimers can reach DLPNO-CCSD(T) accuracy for both energetic and structural observables of alanine dipeptide in explicit water. The decomposition into a gas-phase intramolecular term, an MB-pol water model, and an MB-nrg cross term keeps each piece interpretable and individually replaceable; the gas-phase polyalanine PEF from Paper I drops in unchanged, and the new ala-water PIPs were fit without ever seeing the full alanine dipeptide-water PES.</p>
<p>The authors are explicit about limitations:</p>
<ul>
<li>The cross term currently includes only 2-body PIPs (one 1-mer with one water). Higher-body peptide-water terms ($n &gt; 2$) are folded into the classical polarization, which the authors expect will be inadequate for strongly cooperative configurations such as the C5 H$_1$-O$_w$ scan where one water bridges H$_1$ and O$_2$.</li>
<li>Quantitative differences between the MB-nrg FES and prior implicit-solvation DFT studies (relative depths of $\alpha_R$, $\beta_2$, and C5) remain to be reconciled through systematic benchmarking against higher-level reference data.</li>
<li>Only polyalanine is considered. The framework is designed to generalize to other amino acids and side-chain-water interactions, but sequence- and side-chain-specific PIPs are still to be fit.</li>
<li>No public release of the parameterized PEF or training data is announced; the data availability statement says &ldquo;available from the authors upon request.&rdquo;</li>
</ul>
<p>The paper positions MB-nrg as a transferable, interpretable strategy for chemically accurate biomolecular simulations in solution, with future work aimed at heteropolypeptides and explicit higher-order many-body cross terms.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Training pools</td>
          <td>PBMetaD+PFs in LAMMPS/PLUMED</td>
          <td>~600,000 configs per dimer, reduced to ~40,000</td>
          <td>ff14SB for alanine 1-mers, TIP4P/2005f for water; 300 K, 500 K, 700 K</td>
      </tr>
      <tr>
          <td>Distance regimes</td>
          <td>Walls on 1-mer/water COM distance</td>
          <td>0-4, 4-7, and 7-10 Å</td>
          <td>Short-range repulsion, mid-range attraction, long-range orientation</td>
      </tr>
      <tr>
          <td>Training labels</td>
          <td>DLPNO-CCSD(T)/aug-cc-pVTZ in ORCA</td>
          <td>3 unique 1-mer-water dimer types</td>
          <td>RIJCOSX, TightSCF, TightPNO, PModel; counterpoise BSSE correction</td>
      </tr>
      <tr>
          <td>Test sets</td>
          <td>Held-out clustered configs</td>
          <td>~2,000 per dimer</td>
          <td>Same K-means clustering protocol</td>
      </tr>
      <tr>
          <td>Alanine dipeptide-water scans</td>
          <td>1D scans along 4 H-bond coordinates in 4 conformations</td>
          <td>16 scans total</td>
          <td>C5, pPII, C7$_{\mathrm{eq}}$, and $\alpha_R$ conformations</td>
      </tr>
      <tr>
          <td>Alanine dipeptide FES</td>
          <td>WT-MetaD on $\varphi$, $\psi$ in MB-pol water</td>
          <td>4 walkers, 2.5 ns each (10 ns total per the results section and Figure 6 caption; methods section states 8 ns)</td>
          <td>1.0 kJ/mol height, 11.46° width, deposition every 500 steps</td>
      </tr>
      <tr>
          <td>Hydration RDFs</td>
          <td>NVT MD at 300 K</td>
          <td>Single trajectory per model</td>
          <td>Same H-bond sites as the dimer scans</td>
      </tr>
  </tbody>
</table>
<p>Per the data availability statement, &ldquo;any data generated and analyzed in this study, including the MB-nrg PEF, are available from the authors upon request.&rdquo; The MBX engine is publicly available on <a href="https://github.com/paesanilab/MBX">GitHub</a> under a UC Regents custom license that grants free use for educational, research, and non-profit purposes but restricts commercial use. No public release of the new ala-water PIPs is announced in the text.</p>
<h4 id="artifacts-table">Artifacts table</h4>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/paesanilab/MBX">MBX</a></td>
          <td>Code</td>
          <td>UC Regents custom (academic/non-profit only; no SPDX-recognized OSS license)</td>
          <td>C++ many-body potential engine; runs the MB-nrg PEF via LAMMPS and PLUMED</td>
      </tr>
      <tr>
          <td><a href="https://github.com/paesanilab/MB-Fit">MB-Fit</a></td>
          <td>Code</td>
          <td>Not stated in the paper; see the repository</td>
          <td>Training pipeline for PIP fitting; used to fit the new 1-mer-water PIPs</td>
      </tr>
      <tr>
          <td>MB-nrg ala-water PIPs (this paper)</td>
          <td>Model</td>
          <td>Not released</td>
          <td>&ldquo;Available from the authors upon request&rdquo; per the data availability statement</td>
      </tr>
      <tr>
          <td>DLPNO-CCSD(T) training/test sets</td>
          <td>Dataset</td>
          <td>Not released</td>
          <td>Same statement; ~600,000 raw configs per dimer reduced to ~40,000 train + ~2,000 test</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li>Many-body expansion of the energy partitioned into three modular blocks: $V_{\mathrm{MB\text{-}nrg}}^{\mathrm{ala}} + V_{\mathrm{MB\text{-}pol}}^{\mathrm{wat}} + V_{\mathrm{MB\text{-}nrg}}^{\mathrm{ala\text{-}wat}}$.</li>
<li>Cross term split into $V_{\mathrm{ML}}^{2\mathrm{B}}$ (PIPs over every 1-mer-water dimer) and $V_{\mathrm{phys}} = V_{\mathrm{elec}} + V_{\mathrm{disp}}$.</li>
<li>Permutationally invariant polynomials in Morse-exponential variables $\xi_{ij} = \exp(-k_{\tau(ij)} R_{ij})$, symmetrized over chemically equivalent atoms; same construction as the NMA-water PIPs.</li>
<li>Cosine switching function $s^{2\mathrm{B}}$ smoothly attenuates short-range PIPs between user-defined inner and outer cutoffs.</li>
<li>Dispersion: Tang-Toennies damped $C_6/R^6$ with XDM-derived coefficients and damping parameters.</li>
<li>Electrostatics: modified Thole model with self-consistent induced dipoles for many-body polarization; per-atom charges fit to reproduce permanent multipole moments of each $n$-mer&rsquo;s optimized structure.</li>
<li>Ghost-H capping at cleaved covalent boundaries with fixed C-H (1.14 Å) and N-H (1.09 Å) distances; per-dimer optimized-structure referencing.</li>
<li>Training with simplex minimization for non-linear parameters and ridge regression for linear coefficients via MB-Fit, with low-energy weighting and $\Gamma = 0.0005$, $\delta E = 40$ kcal/mol.</li>
<li>WT-MetaD with four parallel walkers for the alanine dipeptide FES.</li>
</ul>
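<p>Three of the ingredients listed above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the MBX implementation: the function names are hypothetical, the half-cosine form of the switch is an assumption (the list above only states that a cosine switch acts between inner and outer cutoffs), and the damping function is the textbook order-6 Tang-Toennies expression.</p>

```python
import math


def morse_variable(k, r):
    """PIP input variable xi = exp(-k * r) for one atom pair."""
    return math.exp(-k * r)


def cosine_switch(r, r_in, r_out):
    """Goes smoothly from 1 at r_in to 0 at r_out (assumed half-cosine form)."""
    if r_in >= r:
        return 1.0
    if r >= r_out:
        return 0.0
    x = (r - r_in) / (r_out - r_in)
    return 0.5 * (1.0 + math.cos(math.pi * x))


def tang_toennies_f6(b, r):
    """Order-6 Tang-Toennies damping: 1 - exp(-b r) * sum_{n=0}^{6} (b r)^n / n!."""
    x = b * r
    partial_sum = sum(x ** n / math.factorial(n) for n in range(7))
    return 1.0 - math.exp(-x) * partial_sum


def damped_dispersion(c6, b, r):
    """Damped pair dispersion energy -f6(b r) * C6 / r^6."""
    return -tang_toennies_f6(b, r) * c6 / r ** 6
```

<p>The Morse variables decay with distance, so the polynomial built from them vanishes at long range by construction. The switch then guarantees that the learned 2-body term is exactly zero beyond the outer cutoff, where only the damped dispersion and electrostatic terms remain.</p>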
<h3 id="models">Models</h3>
<ul>
<li>Three new 1-mer-water 2-body PIPs covering -CH-/H$_2$O, CH$_3$-/H$_2$O, and -CONH-/H$_2$O dimers.</li>
<li>All three PIPs use polynomial degree 3 with a complete, unscreened basis (no term screening).</li>
<li>Term counts: 710 for -CH-/H$_2$O and CH$_3$-/H$_2$O, 1,267 for -CONH-/H$_2$O.</li>
<li>Combined with the gas-phase polyalanine MB-nrg PEF from Paper I and the MB-pol water model, exercised through MBX, LAMMPS, and PLUMED.</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>MB-nrg</th>
          <th>Amber ff14SB/TIP3P</th>
          <th>Amber ff19SB/OPC</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>-CH-/H$_2$O 2-body train/test RMSD</td>
          <td>0.07 / 0.08 kcal/mol</td>
          <td>n/a</td>
          <td>n/a</td>
      </tr>
      <tr>
          <td>CH$_3$-/H$_2$O 2-body train/test RMSD</td>
          <td>0.08 / 0.08 kcal/mol</td>
          <td>n/a</td>
          <td>n/a</td>
      </tr>
      <tr>
          <td>-CONH-/H$_2$O 2-body train/test RMSD</td>
          <td>0.18 / 0.20 kcal/mol</td>
          <td>n/a</td>
          <td>n/a</td>
      </tr>
      <tr>
          <td>Alanine dipeptide-water 1D scans (qualitative)</td>
          <td>Tracks DLPNO-CCSD(T) curves across 16 scans</td>
          <td>Underestimates H-bond depths; spurious $\alpha_R$ H$_2$-O$_w$ barrier</td>
          <td>Same shape as ff14SB/TIP3P</td>
      </tr>
      <tr>
          <td>Alanine dipeptide FES global minima</td>
          <td>Isoenergetic $\alpha_R$ and $\beta_2$; C5 ~3 kcal/mol higher</td>
          <td>Over-stabilizes pPII</td>
          <td>Over-stabilizes pPII; spurious C7$_{\mathrm{eq}}$ minimum</td>
      </tr>
      <tr>
          <td>O-H$_w$ second shell</td>
          <td>Broader, right-shifted; finer detail consistent with prior AIMD</td>
          <td>Sharper, less detail</td>
          <td>Sharper, less detail</td>
      </tr>
      <tr>
          <td>H-O$_w$ second shell</td>
          <td>Weak features near 3.7-3.8 Å</td>
          <td>Absent</td>
          <td>Absent</td>
      </tr>
  </tbody>
</table>
<p>Quantitative RMSD or KL-divergence values for the FES and RDF benchmarks are not reported in the main text.</p>
<h3 id="hardware">Hardware</h3>
<p>The authors acknowledge support from the Air Force Office of Scientific Research (FA9550-20-1-0351, theoretical development) and NSF (award 2311260, MBX implementation). Computational resources came from the DoD High Performance Computing Modernization Program, the San Diego Supercomputer Center via ACCESS allocation CHE240114, and NERSC (contract DE-AC02-05CH11231, award BES-ERCAP0030920). Specific wall-clock and node-hour figures are not reported in the main text.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Zhou, R., &amp; Paesani, F. (2025). Toward Chemical Accuracy in Biomolecular Simulations through Data-Driven Many-Body Potentials: II. Polyalanine in Water. <em>ChemRxiv</em>. <a href="https://doi.org/10.26434/chemrxiv-2025-j6cwv-v2">https://doi.org/10.26434/chemrxiv-2025-j6cwv-v2</a></p>
<p><strong>Publication</strong>: ChemRxiv preprint (version 2), 10 October 2025.</p>
<p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://github.com/paesanilab/MBX">MBX software (Paesani group)</a></li>
<li><a href="https://github.com/paesanilab/MB-Fit">MB-Fit (training pipeline)</a></li>
<li>Companion paper: <a href="/notes/chemistry/molecular-simulation/ml-potentials/mb-nrg-polyalanine-ccsdt/">MB-nrg: CCSD(T)-Accurate Potentials for Polyalanine</a> (Paper I)</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{zhou2025toward,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Toward Chemical Accuracy in Biomolecular Simulations through Data-Driven Many-Body Potentials: II. Polyalanine in Water}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Zhou, Ruihan and Paesani, Francesco}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{ChemRxiv}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.26434/chemrxiv-2025-j6cwv-v2}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Ewald Message Passing for Molecular Graphs</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/ewald-message-passing-molecular-graphs/</link><pubDate>Tue, 07 Apr 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/ewald-message-passing-molecular-graphs/</guid><description>Ewald message passing augments GNNs with Fourier-space long-range interactions, improving energy predictions by 10-16% on OC20 and OE62 benchmarks.</description><content:encoded><![CDATA[<h2 id="a-fourier-space-long-range-correction-for-molecular-gnns">A Fourier-Space Long-Range Correction for Molecular GNNs</h2>
<p>This is a <strong>Method</strong> paper that introduces Ewald message passing (Ewald MP), a general framework for incorporating long-range interactions into message passing neural networks (MPNNs) for molecular <a href="/notes/chemistry/molecular-simulation/ml-potentials/learning-smooth-interatomic-potentials/">potential energy surface</a> prediction. The key contribution is a nonlocal Fourier-space message passing scheme, grounded in the classical <a href="https://en.wikipedia.org/wiki/Ewald_summation">Ewald summation</a> technique from computational physics, that complements the short-range message passing of existing GNN architectures.</p>
<h2 id="the-long-range-interaction-problem-in-molecular-gnns">The Long-Range Interaction Problem in Molecular GNNs</h2>
<p>Standard MPNNs for molecular property prediction rely on a spatial distance cutoff to define atomic neighborhoods. While this locality assumption enables favorable scaling with system size and provides a useful inductive bias, it fundamentally limits the model&rsquo;s ability to capture long-range interactions such as electrostatic forces and van der Waals (<a href="https://en.wikipedia.org/wiki/London_dispersion_force">London dispersion</a>) interactions. These interactions decay slowly with distance (e.g., electrostatic energy follows a $1/r$ power law), and truncating them with a distance cutoff can introduce severe artifacts in thermochemical predictions.</p>
<p>This problem is well-known in molecular dynamics, where empirical force fields explicitly separate bonded (short-range) and non-bonded (long-range) energy terms. The Ewald summation technique addresses this by decomposing interactions into a short-range part that converges quickly with a distance cutoff and a long-range part whose Fourier transform converges quickly with a frequency cutoff. The authors propose bringing this same strategy into the GNN paradigm.</p>
<h2 id="from-ewald-summation-to-learnable-fourier-space-messages">From Ewald Summation to Learnable Fourier-Space Messages</h2>
<p>The core insight is a formal analogy between the continuous-filter convolution used in MPNNs and the electrostatic potential computation in Ewald summation. In a standard continuous-filter convolution, the message sum for atom $i$ is:</p>
<p>$$
M_i^{(l+1)} = \sum_{j \in \mathcal{N}(i)} h_j^{(l)} \cdot \Phi^{(l)}(| \mathbf{x}_i - \mathbf{x}_j |)
$$</p>
<p>where $h_j^{(l)}$ are atom embeddings and $\Phi^{(l)}$ is a learned radial filter. Comparing this to the electrostatic potential $V_i^{\text{es}}(\mathbf{x}_i) = \sum_{j \neq i} q_j \cdot \Phi^{\text{es}}(| \mathbf{x}_i - \mathbf{x}_j |)$ reveals a direct correspondence: atom embeddings play the role of partial charges, and learned filters replace the $1/r$ kernel.</p>
<p>Ewald MP decomposes the learned filter into short-range and long-range components. The short-range part is handled by any existing GNN architecture with a distance cutoff. The long-range part is computed as a sum over Fourier frequencies:</p>
<p>$$
M^{\text{lr}}(\mathbf{x}_i) = \sum_{\mathbf{k}} \exp(i \mathbf{k}^T \mathbf{x}_i) \cdot s_{\mathbf{k}} \cdot \hat{\Phi}^{\text{lr}}(| \mathbf{k} |)
$$</p>
<p>where $s_{\mathbf{k}}$ are <strong><a href="https://en.wikipedia.org/wiki/Structure_factor">structure factor</a> embeddings</strong>, computed as:</p>
<p>$$
s_{\mathbf{k}} = \sum_{j \in \mathcal{S}} h_j \exp(-i \mathbf{k}^T \mathbf{x}_j)
$$</p>
<p>These structure factor embeddings are a Fourier-space representation of the atom embedding distribution, and truncating to low frequencies effectively coarse-grains the hidden model state while preserving long-range information. The frequency filters $\hat{\Phi}^{\text{lr}}$ are learned, making the entire scheme data-driven rather than tied to a fixed physical functional form.</p>
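<p>The two sums above can be sketched directly in plain Python. This is a toy illustration under stated assumptions, not the EwaldMP code: embeddings are single scalars instead of vectors, the positions and frequency vectors are invented, and a fixed Gaussian stands in for the learned filter $\hat{\Phi}^{\text{lr}}$. All function names are hypothetical.</p>

```python
import cmath
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def structure_factors(positions, embeddings, k_vectors):
    """s_k = sum_j h_j * exp(-i k.x_j), one complex number per frequency."""
    return [
        sum(h * cmath.exp(-1j * dot(k, x)) for h, x in zip(embeddings, positions))
        for k in k_vectors
    ]


def long_range_message(x_i, s, k_vectors, filter_fn):
    """M_lr(x_i) = sum_k exp(i k.x_i) * s_k * filter(|k|)."""
    total = 0j
    for k, s_k in zip(k_vectors, s):
        k_norm = math.sqrt(dot(k, k))
        total += cmath.exp(1j * dot(k, x_i)) * s_k * filter_fn(k_norm)
    return total


# Toy inputs, invented for illustration: three atoms with scalar embeddings.
positions = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (0.0, 2.0, 1.0)]
embeddings = [0.3, -1.2, 0.7]
# A frequency set that contains -k for every k, so the message is real-valued.
k_vectors = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
gaussian_filter = lambda k: math.exp(-0.5 * k * k)  # stand-in for the learned filter

s = structure_factors(positions, embeddings, k_vectors)
messages = [long_range_message(x, s, k_vectors, gaussian_filter) for x in positions]
```

<p>The structure factors are computed once per frequency and then reused for every atom, which is where the $N_{\text{at}} \times N_{\text{k}}$ cost comes from. Because each message depends only on position differences $\mathbf{x}_i - \mathbf{x}_j$, translating all atoms by the same vector leaves it unchanged.</p>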
<p>The method handles both <strong>periodic</strong> systems (where the <a href="https://en.wikipedia.org/wiki/Reciprocal_lattice">reciprocal lattice</a> provides a natural frequency discretization) and <strong>aperiodic</strong> systems (where the Fourier domain is discretized using a cubic voxel grid with SVD-based rotation alignment to preserve rotation invariance). The combined embedding update becomes:</p>
<p>$$
h_i^{(l+1)} = \frac{1}{\sqrt{3}} \left[ h_i^{(l)} + f_{\text{upd}}^{\text{sr}}(M_i^{\text{sr}}) + f_{\text{upd}}^{\text{lr}}(M_i^{\text{lr}}) \right]
$$</p>
<p>The computational complexity is $\mathcal{O}(N_{\text{at}} N_{\text{k}})$, and by fixing the number of frequency vectors $N_{\text{k}}$, linear scaling $\mathcal{O}(N_{\text{at}})$ is achievable.</p>
<h2 id="experiments-across-four-gnn-architectures-and-two-datasets">Experiments Across Four GNN Architectures and Two Datasets</h2>
<p>The authors test Ewald MP as an augmentation on four baseline architectures: <a href="/notes/chemistry/datasets/marcel/">SchNet, PaiNN, DimeNet++, and GemNet-T</a>. Two datasets are used:</p>
<ul>
<li><strong>OC20</strong> (Chanussot et al., 2021): ~265M periodic structures of adsorbate-catalyst systems with DFT-computed energies and forces. The OC20-2M subsplit is used for training.</li>
<li><strong>OE62</strong> (Stuke et al., 2020): ~62,000 large aperiodic organic molecules with DFT-computed energies that include a DFT-D3 dispersion correction for London dispersion interactions.</li>
</ul>
<p>All baselines use a 6 Å distance cutoff and 50 maximum neighbors. The Ewald modification is minimal: the long-range message sum is added as an additional skip connection term in each interaction block. Comparison studies include: (1) increasing the distance cutoff to match the computational cost of Ewald MP, (2) replacing the Ewald block with a SchNet interaction block at increased cutoff, and (3) increasing atom embedding dimensions to match Ewald MP&rsquo;s parameter count.</p>
<h3 id="key-energy-mae-results-on-oe62">Key Energy MAE Results on OE62</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Baseline (meV)</th>
          <th>Ewald MP (meV)</th>
          <th>Improvement</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>SchNet</td>
          <td>133.5</td>
          <td>79.2</td>
          <td>40.7%</td>
      </tr>
      <tr>
          <td>PaiNN</td>
          <td>61.4</td>
          <td>57.9</td>
          <td>5.7%</td>
      </tr>
      <tr>
          <td>DimeNet++</td>
          <td>51.2</td>
          <td>46.5</td>
          <td>9.2%</td>
      </tr>
      <tr>
          <td>GemNet-T</td>
          <td>51.5</td>
          <td>47.4</td>
          <td>8.0%</td>
      </tr>
  </tbody>
</table>
<h3 id="key-energy-mae-results-on-oc20-averaged-across-test-splits">Key Energy MAE Results on OC20 (Averaged Across Test Splits)</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Baseline (meV)</th>
          <th>Ewald MP (meV)</th>
          <th>Improvement</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>SchNet</td>
          <td>895</td>
          <td>830</td>
          <td>7.3%</td>
      </tr>
      <tr>
          <td>PaiNN</td>
          <td>448</td>
          <td>393</td>
          <td>12.3%</td>
      </tr>
      <tr>
          <td>DimeNet++</td>
          <td>496</td>
          <td>445</td>
          <td>10.4%</td>
      </tr>
      <tr>
          <td>GemNet-T</td>
          <td>346</td>
          <td>307</td>
          <td>11.3%</td>
      </tr>
  </tbody>
</table>
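<p>The improvement columns in the two tables above are relative reductions in energy MAE. The short script below recomputes them from the tabulated (rounded) MAEs as a sanity check; the function name is hypothetical. Values recomputed this way can differ from the listed percentages by about 0.1 to 0.2 percentage points, presumably because the reported figures derive from unrounded MAEs.</p>

```python
def relative_improvement(baseline, ewald):
    """Percentage reduction in energy MAE relative to the baseline."""
    return 100.0 * (baseline - ewald) / baseline


# Energy MAEs in meV, copied from the two tables above.
oe62 = {"SchNet": (133.5, 79.2), "PaiNN": (61.4, 57.9),
        "DimeNet++": (51.2, 46.5), "GemNet-T": (51.5, 47.4)}
oc20 = {"SchNet": (895, 830), "PaiNN": (448, 393),
        "DimeNet++": (496, 445), "GemNet-T": (346, 307)}

for name, table in (("OE62", oe62), ("OC20", oc20)):
    gains = {model: relative_improvement(*maes) for model, maes in table.items()}
    mean_gain = sum(gains.values()) / len(gains)
    print(name, {m: round(g, 1) for m, g in gains.items()}, "mean:", round(mean_gain, 1))
```

<p>The per-dataset mean is a plain average over the four architectures, so the large SchNet gain on OE62 pulls that mean well above the gains seen for the other three models.</p>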
<h2 id="robust-long-range-improvements-and-dispersion-recovery">Robust Long-Range Improvements and Dispersion Recovery</h2>
<p>Ewald MP achieves consistent improvements across all models and both datasets, averaging 16.1% on OE62 and 10.3% on OC20. Several findings stand out:</p>
<ol>
<li>
<p><strong>Robustness</strong>: Unlike the increased-cutoff and SchNet-LR alternatives, Ewald MP never produces detrimental effects in any tested configuration. The increased cutoff setting hurts SchNet and PaiNN on OE62, and the SchNet-LR block fails to improve DimeNet++ and GemNet-T.</p>
</li>
<li>
<p><strong>Long-range specificity</strong>: A binning analysis on OE62 groups molecules by the magnitude of their DFT-D3 dispersion correction. Ewald MP shows an outsize improvement for structures with large long-range energy contributions. It recovers or surpasses a &ldquo;cheating&rdquo; baseline that receives the exact DFT-D3 ground truth as an additional input.</p>
</li>
<li>
<p><strong>Efficiency on periodic systems</strong>: Ewald MP achieves similar relative improvements on OC20 at roughly half the relative computational cost compared to OE62, suggesting periodic structures as a particularly attractive application domain.</p>
</li>
<li>
<p><strong>Force predictions</strong>: Improvements in <a href="/notes/chemistry/molecular-simulation/ml-potentials/dark-side-of-forces/">force MAEs</a> are consistent but small, which is expected since the frequency truncation removes high-frequency contributions to the potential energy surface.</p>
</li>
<li>
<p><strong>Ablation studies</strong>: Results are robust across different frequency cutoffs, voxel resolutions, and filtering strategies, with the non-radial periodic filtering scheme outperforming radial alternatives on out-of-distribution generalization.</p>
</li>
</ol>
<p>Limitations include the current focus on scalar (invariant) embeddings only (PaiNN&rsquo;s equivariant vector embeddings are not augmented), and the potential for a &ldquo;gap&rdquo; of medium-range interactions when $N_{\text{k}}$ is fixed for linear scaling. The authors suggest adapting more efficient Ewald summation variants (e.g., particle mesh Ewald with $\mathcal{O}(N \log N)$ scaling) as future work.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Training (periodic)</td>
          <td>OC20-2M</td>
          <td>~2M structures</td>
          <td>Subsplit of OC20; PBC; DFT energies and forces</td>
      </tr>
      <tr>
          <td>Training (aperiodic)</td>
          <td>OE62</td>
          <td>~62,000 molecules</td>
          <td>Large organic molecules; DFT energies with D3 correction</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>OC20-test (4 splits: ID, OOD-ads, OOD-cat, OOD-both)</td>
          <td>Varies</td>
          <td>Evaluated via submission to OC20 evaluation server</td>
      </tr>
      <tr>
          <td>Evaluation</td>
          <td>OE62-val, OE62-test</td>
          <td>~6,000 each</td>
          <td>Direct evaluation</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li>Ewald message passing is integrated as an additional skip connection term in each interaction block</li>
<li>For periodic systems: non-radial filtering with fixed reciprocal lattice positions ($N_x, N_y, N_z$ hyperparameters)</li>
<li>For aperiodic systems: radial Gaussian basis function filtering with frequency cutoff $c_k$ and voxel resolution $\Delta = 0.2$ Å$^{-1}$</li>
<li>SVD-based coordinate alignment for rotation invariance in the aperiodic case</li>
<li>Bottleneck dimension $N_\downarrow = 16$ (GemNet-T) or $N_\downarrow = 8$ (others)</li>
<li>Update function: dense layer + $N_{\text{hidden}}$ residual layers ($N_{\text{hidden}} = 3$, except PaiNN with $N_{\text{hidden}} = 0$)</li>
</ul>
<h3 id="models">Models</h3>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Embedding Size (OE62)</th>
          <th>Interaction Blocks</th>
          <th>Ewald Params (OE62)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>SchNet</td>
          <td>512</td>
          <td>4</td>
          <td>12.2M total</td>
      </tr>
      <tr>
          <td>PaiNN</td>
          <td>512</td>
          <td>4</td>
          <td>15.7M total</td>
      </tr>
      <tr>
          <td>DimeNet++</td>
          <td>256</td>
          <td>3</td>
          <td>4.8M total</td>
      </tr>
      <tr>
          <td>GemNet-T</td>
          <td>256</td>
          <td>3</td>
          <td>16.1M total</td>
      </tr>
  </tbody>
</table>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li>Primary metric: Energy mean absolute error (EMAE) in meV</li>
<li>Secondary metric: Force MAE in meV/Å (OC20 only)</li>
<li>Loss: Linear combination of energy and force MAEs (Eq. 15) with model-specific force multipliers</li>
<li>Optimizer: Adam with weight decay ($\lambda = 0.01$)</li>
</ul>
<h3 id="hardware">Hardware</h3>
<ul>
<li>All runtime measurements on NVIDIA A100 GPUs</li>
<li>Runtimes measured after 50 warmup batches, averaged over 500 batches, minimum of 3 repetitions</li>
<li>Code: <a href="https://github.com/arthurkosmala/EwaldMP">EwaldMP</a> (Hippocratic License 3.0)</li>
</ul>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/arthurkosmala/EwaldMP">EwaldMP</a></td>
          <td>Code</td>
          <td>Hippocratic License 3.0 (new files) / MIT (OC20 base)</td>
          <td>Official implementation built on the Open Catalyst Project codebase</td>
      </tr>
      <tr>
          <td><a href="https://github.com/Open-Catalyst-Project/ocp/blob/main/DATASET.md">OC20</a></td>
          <td>Dataset</td>
          <td>CC-BY-4.0</td>
          <td>~265M periodic adsorbate-catalyst structures with DFT energies and forces</td>
      </tr>
      <tr>
          <td><a href="https://doi.org/10.1038/s41597-020-0385-y">OE62</a></td>
          <td>Dataset</td>
          <td>CC-BY-4.0</td>
          <td>~62,000 large organic molecules with DFT energies including D3 correction</td>
      </tr>
  </tbody>
</table>
<p><strong>Reproducibility status</strong>: Highly Reproducible. Source code, both datasets, and detailed hyperparameters (including per-model learning rates, batch sizes, and Ewald-specific settings) are all publicly available. Pre-trained model weights are not provided.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Kosmala, A., Gasteiger, J., Gao, N., &amp; Günnemann, S. (2023). Ewald-based Long-Range Message Passing for Molecular Graphs. In <em>Proceedings of the 40th International Conference on Machine Learning (ICML 2023)</em>.</p>
<p><strong>Publication</strong>: ICML 2023</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{kosmala2023ewald,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Ewald-based Long-Range Message Passing for Molecular Graphs}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Kosmala, Arthur and Gasteiger, Johannes and Gao, Nicholas and G{\&#34;u}nnemann, Stephan}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 40th International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2023}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{PMLR}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{202}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Kabsch-Horn Cookbook: Differentiable Alignment</title><link>https://hunterheidenreich.com/projects/kabsch-horn-cookbook/</link><pubDate>Fri, 20 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/kabsch-horn-cookbook/</guid><description>Differentiable Kabsch (SVD) and Horn (quaternion) alignment for NumPy, PyTorch, JAX, TensorFlow, and MLX with gradient-safe SVD.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>Aligning two sets of corresponding points means finding the optimal rotation (and optionally translation and scale) that maps one onto the other. It is a fundamental operation across scientific computing, appearing in molecular dynamics (superimposing protein conformations), robotics (sensor registration), and computer vision (shape matching). The two dominant algorithm families are the Kabsch (SVD-based) method and the Horn (quaternion-based) method.</p>
<p>The <strong>Kabsch-Horn Cookbook</strong> is a Python library that implements both algorithm families across five numerical frameworks: NumPy, PyTorch, JAX, TensorFlow, and MLX. Every backend shares the same API and supports per-point weights and arbitrary batch dimensions. The Kabsch (SVD) path accepts N-dimensional point sets on every backend except MLX, which handles 3D inputs only. The PyTorch, JAX, TensorFlow, and MLX backends are fully differentiable, with custom autograd rules that bypass the numerically unstable gradient of the standard SVD near degenerate singular values.</p>
<h2 id="features">Features</h2>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Kabsch</strong>: SVD-based optimal rotation for rigid alignment</li>
<li><strong>Kabsch-Umeyama</strong>: Kabsch with an additional optimal scaling factor $c$, solving $Q \approx cRP + t$</li>
<li><strong>Horn</strong>: Quaternion-based optimal rotation via the eigendecomposition of a $4 \times 4$ key matrix</li>
<li><strong>Horn + Scale</strong>: Horn&rsquo;s method extended with optimal isotropic scaling</li>
<li><strong>RMSD Wrappers</strong>: Convenience functions that return RMSD directly alongside the alignment parameters</li>
</ul>
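<p>For orientation, the plain Kabsch step can be sketched in a few lines of NumPy. This is a minimal, self-contained illustration for unweighted, unbatched 3D points; the name <code>kabsch_sketch</code> is hypothetical and the code is not the library&rsquo;s implementation (it has no weights, batching, or gradient safeguards).</p>

```python
import numpy as np

def kabsch_sketch(P, Q):
    """Return (R, t, rmsd) such that P @ R.T + t best matches Q.

    P, Q: (N, 3) arrays of corresponding points.
    """
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - p_mean, Q - q_mean           # remove translation
    H = Pc.T @ Qc                             # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so the result is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    diff = P @ R.T + t - Q
    rmsd = np.sqrt((diff ** 2).sum(axis=1).mean())
    return R, t, rmsd
```

<p>The sign correction <code>d</code> is what separates Kabsch from an unconstrained orthogonal Procrustes fit: without it, the best orthogonal matrix can be a mirror image when the point sets are noisy or nearly planar.</p>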
<h3 id="framework-support">Framework Support</h3>
<table>
  <thead>
      <tr>
          <th>Framework</th>
          <th style="text-align: center">Differentiable</th>
          <th style="text-align: center">Compile/JIT</th>
          <th>Versions</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>NumPy</td>
          <td style="text-align: center"></td>
          <td style="text-align: center"></td>
          <td>1.24+</td>
      </tr>
      <tr>
          <td>PyTorch</td>
          <td style="text-align: center">Yes</td>
          <td style="text-align: center"><code>torch.compile</code></td>
          <td>2.0+</td>
      </tr>
      <tr>
          <td>JAX</td>
          <td style="text-align: center">Yes</td>
          <td style="text-align: center"><code>jax.jit</code></td>
          <td>0.4+</td>
      </tr>
      <tr>
          <td>TensorFlow</td>
          <td style="text-align: center">Yes</td>
          <td style="text-align: center"></td>
          <td>2.13+</td>
      </tr>
      <tr>
          <td>MLX</td>
          <td style="text-align: center">Yes</td>
          <td style="text-align: center"></td>
          <td>0.1+</td>
      </tr>
  </tbody>
</table>
<p><code>torch.compile</code> and <code>jax.jit</code> are the tested compile/JIT paths. MLX supports 3D inputs only; the Kabsch (SVD) path is N-dimensional on the other four backends.</p>
<h3 id="numerical-robustness">Numerical Robustness</h3>
<p>Standard SVD and eigendecomposition backward passes produce <code>NaN</code> gradients when singular values collide or are near-zero. The library provides custom autograd primitives to handle these cases:</p>
<ul>
<li><strong>SafeSVD</strong> (PyTorch, JAX, TF, MLX): Custom backward pass that clamps the singular value gap, preventing division-by-zero in the gradient</li>
<li><strong>SafeEigh</strong> (PyTorch, JAX, TF, MLX): Analogous safe backward for the symmetric eigendecomposition used in Horn&rsquo;s method</li>
<li><strong>Per-point weights</strong>: Weighted centroids and weighted cross-covariance for mass-weighted or confidence-weighted alignment</li>
<li><strong>Batch dimensions</strong>: All functions broadcast over leading batch dimensions without explicit loops</li>
<li><strong>Mixed-dtype promotion</strong>: Inputs are promoted to a common floating-point dtype automatically</li>
</ul>
<h3 id="testing">Testing</h3>
<p>The test suite uses Hypothesis-based property testing across 13 modules covering:</p>
<ul>
<li>Round-trip correctness (align then compare)</li>
<li>Gradient finiteness and correctness (finite-difference checks)</li>
<li>Reflection handling (proper vs. improper rotations)</li>
<li>Weighted alignment consistency</li>
<li>Batch broadcasting</li>
<li>4 differentiable backends $\times$ 4 precisions (float32, float64, and where supported, float16, bfloat16)</li>
</ul>
<h2 id="usage">Usage</h2>
<p>This is a reference cookbook, so you can copy the framework folder you need from <code>src/kabsch_horn/&lt;framework&gt;/</code> directly into your project (the code has no runtime dependencies beyond the framework itself). To depend on it instead, install a pinned version from GitHub:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>pip install <span style="color:#e6db74">&#34;git+https://github.com/hunter-heidenreich/Kabsch-Cookbook.git@v0.4.1&#34;</span>
</span></span></code></pre></div><p>Basic alignment with NumPy:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> numpy <span style="color:#66d9ef">as</span> np
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> kabsch_horn <span style="color:#f92672">import</span> numpy <span style="color:#66d9ef">as</span> kh
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Two sets of corresponding 3D points</span>
</span></span><span style="display:flex;"><span>P <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>randn(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">3</span>)
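</span></span><span style="display:flex;"><span>R_true <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>qr(np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>randn(<span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">3</span>))[<span style="color:#ae81ff">0</span>]  <span style="color:#75715e"># random orthogonal matrix (det may be -1)</span>
</span></span><span style="display:flex;"><span>R_true[:, <span style="color:#ae81ff">0</span>] <span style="color:#f92672">*=</span> np<span style="color:#f92672">.</span>sign(np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(R_true))  <span style="color:#75715e"># flip one column if needed so det = +1 (proper rotation)</span>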
</span></span><span style="display:flex;"><span>Q <span style="color:#f92672">=</span> (P <span style="color:#f92672">@</span> R_true<span style="color:#f92672">.</span>T) <span style="color:#f92672">+</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>randn(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">3</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>R, t, rmsd <span style="color:#f92672">=</span> kh<span style="color:#f92672">.</span>kabsch(P, Q)
</span></span><span style="display:flex;"><span>aligned <span style="color:#f92672">=</span> P <span style="color:#f92672">@</span> R<span style="color:#f92672">.</span>T <span style="color:#f92672">+</span> t
</span></span></code></pre></div><p>RMSD loss for training in PyTorch:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> torch
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> kabsch_horn <span style="color:#f92672">import</span> pytorch <span style="color:#66d9ef">as</span> kh
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>pred_coords <span style="color:#f92672">=</span> model(input_features)   <span style="color:#75715e"># (B, N, 3), requires_grad=True</span>
</span></span><span style="display:flex;"><span>target_coords <span style="color:#f92672">=</span> batch[<span style="color:#e6db74">&#34;target&#34;</span>]       <span style="color:#75715e"># (B, N, 3)</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>rmsd <span style="color:#f92672">=</span> kh<span style="color:#f92672">.</span>kabsch_rmsd(pred_coords, target_coords)  <span style="color:#75715e"># (B,)</span>
</span></span><span style="display:flex;"><span>loss <span style="color:#f92672">=</span> rmsd<span style="color:#f92672">.</span>mean()
</span></span><span style="display:flex;"><span>loss<span style="color:#f92672">.</span>backward()  <span style="color:#75715e"># safe gradients via SafeSVD</span>
</span></span></code></pre></div><p>For the full API reference and additional examples, see the <a href="https://hunter-heidenreich.github.io/Kabsch-Cookbook/">documentation site</a>.</p>
<h2 id="results">Results</h2>
<h3 id="gradient-stability">Gradient Stability</h3>
<p>The standard SVD backward pass computes terms of the form $\frac{1}{\sigma_i^2 - \sigma_j^2}$, which diverge as two singular values approach each other. In molecular alignment this happens frequently: planar molecules, symmetric structures, and noisy coordinates can all produce near-degenerate singular values. The SafeSVD primitive floors the magnitude of that denominator at the dtype&rsquo;s machine epsilon (<code>finfo(dtype).eps</code>), so the gradients stay finite (if slightly biased) in these edge cases. Property-based tests confirm that gradients remain finite across thousands of random rotations, scales, and noise levels for all four differentiable backends.</p>
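<p>The clamping idea can be sketched in NumPy as follows. This only illustrates flooring the magnitude of $\sigma_i^2 - \sigma_j^2$ at machine epsilon; the helper name <code>safe_inverse_gaps</code> is hypothetical, and the library&rsquo;s actual backward passes may organise the computation differently.</p>

```python
import numpy as np

def safe_inverse_gaps(s, eps=None):
    """Matrix F with F[i, j] ~ 1 / (s_i^2 - s_j^2), kept finite at ties.

    s: 1-D array of singular values.
    eps: floor on |s_i^2 - s_j^2|; defaults to the dtype's machine epsilon.
    """
    s = np.asarray(s)
    if not np.issubdtype(s.dtype, np.floating):
        s = s.astype(np.float64)
    if eps is None:
        eps = np.finfo(s.dtype).eps
    gap = s[:, None] ** 2 - s[None, :] ** 2
    # Keep the sign of the gap but never let its magnitude drop below eps.
    sign = np.where(gap >= 0, 1, -1).astype(s.dtype)
    safe_gap = sign * np.maximum(np.abs(gap), eps)
    F = 1.0 / safe_gap
    np.fill_diagonal(F, 0.0)   # the i == j terms do not enter the gradient
    return F
```

<p>With exactly degenerate values, for example <code>safe_inverse_gaps([2.0, 1.0, 1.0])</code>, the tied entries become large but finite instead of infinite, while well-separated entries such as $1/(2^2 - 1^2) = 1/3$ are untouched.</p>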
<h3 id="framework-parity">Framework Parity</h3>
<p>All five backends produce numerically equivalent results (up to floating-point tolerance) on the same inputs. The shared API means switching from NumPy prototyping to PyTorch training requires changing only the import path.</p>
<h2 id="related-work">Related Work</h2>
<p>This project builds on the foundational alignment algorithms described in these papers:</p>
<ul>
<li><a href="/notes/biology/computational-biology/kabsch-algorithm/">Kabsch (1976)</a>: the original SVD-based rotation alignment</li>
<li><a href="/notes/biology/computational-biology/arun-svd-point-fitting/">Arun et al. (1987)</a>: SVD formulation for 3D point set fitting</li>
<li><a href="/notes/biology/computational-biology/horn-absolute-orientation/">Horn (1987)</a>: quaternion-based closed-form absolute orientation</li>
<li><a href="/notes/biology/computational-biology/horn-orthonormal-matrices/">Horn et al. (1988)</a>: orthonormal matrix (polar decomposition) approach</li>
<li><a href="/notes/biology/computational-biology/umeyama-similarity-transformation/">Umeyama (1991)</a>: extension to include optimal scaling</li>
</ul>
<p>For a detailed walkthrough of the Kabsch algorithm with code examples, see the companion blog post: <a href="/posts/kabsch-algorithm/">The Kabsch Algorithm</a>.</p>
]]></content:encoded></item><item><title>Stillinger-Weber Potential for Silicon Simulation</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/stillinger-weber-1985/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/stillinger-weber-1985/</guid><description>The 1985 paper introducing the Stillinger-Weber potential, a 3-body interaction model for molecular dynamics of tetrahedral semiconductors.</description><content:encoded><![CDATA[<h2 id="core-methodological-contribution">Core Methodological Contribution</h2>
<p>This is a <strong>Method</strong> paper.</p>
<p>Its primary contribution is the formulation of the <strong>Stillinger-Weber potential</strong>, a non-additive potential energy function designed to model tetrahedral semiconductors. The paper also uses molecular dynamics simulation to explore physical properties of silicon in both crystalline and liquid phases, but the methodological contribution (the potential architecture) is what enabled subsequent research on covalent materials.</p>
<h2 id="the-failure-of-pair-potentials-in-silicon">The Failure of Pair Potentials in Silicon</h2>
<p>The authors aimed to simulate the melting and liquid properties of tetrahedral semiconductors (silicon and germanium).</p>
<ul>
<li><strong>The Problem:</strong> Standard pair potentials (like Lennard-Jones) favor close-packed structures (12 nearest neighbors) and cannot stabilize the open diamond structure (4 nearest neighbors) of silicon.</li>
<li><strong>The Gap:</strong> Earlier classical potentials lacked the flexibility to describe the profound structural change on melting, where silicon shrinks (coordination number increases from 4 to &gt;6) and the semiconducting crystal becomes a metallic liquid.</li>
<li><strong>The Goal:</strong> To construct a potential that spans the entire configuration space, describing both the rigid crystal and the diffusive liquid, without requiring quantum mechanical calculations.</li>
</ul>
<h2 id="the-three-body-interaction-novelty">The Three-Body Interaction Novelty</h2>
<p>The core novelty is the introduction of a stabilizing <strong>three-body interaction term</strong> ($v_3$) to the potential energy function.</p>
<ul>
<li><strong>3-Body Term:</strong> Explicitly penalizes deviations from the ideal tetrahedral angle ($\cos \theta_t = -1/3$).</li>
<li><strong>Unified Model:</strong> This potential handles bond breaking and reforming, allowing for the simulation of melting and liquid diffusion. Previous &ldquo;Keating&rdquo; potentials model only small elastic deformations.</li>
<li><strong>Mapping Technique:</strong> The application of &ldquo;steepest-descent mapping&rdquo; to quench dynamical configurations into their underlying &ldquo;inherent structures&rdquo; (local minima), revealing the fundamental topology of the liquid energy landscape.</li>
</ul>
<h2 id="molecular-dynamics-validation">Molecular Dynamics Validation</h2>
<p>The authors performed Molecular Dynamics (MD) simulations using the proposed potential.</p>
<ul>
<li><strong>System:</strong> 216 silicon atoms in a cubic cell with periodic boundary conditions.</li>
<li><strong>State Points:</strong> Fixed density $\rho = 2.53 \text{ g/cm}^3$ (matching experimental liquid density at melting).</li>
<li><strong>Process:</strong>
<ol>
<li>Start with diamond crystal at low temperature.</li>
<li>Systematically heat to induce spontaneous nucleation and melting.</li>
<li>Equilibrate the liquid.</li>
<li>Periodically map configurations to potential minima (inherent structures) using steepest descent.</li>
</ol>
</li>
</ul>
<h2 id="phase-topology-and-inverse-lindemann-criterion">Phase Topology and Inverse Lindemann Criterion</h2>
<ul>
<li><strong>Validation:</strong> The potential successfully stabilizes the diamond structure as the global minimum at zero pressure.</li>
<li><strong>Liquid Structure:</strong> The simulated liquid pair-correlation function $g(r)$ and structure factor $S(k)$ qualitatively match experimental diffraction data, including the characteristic shoulder on the structure factor peak.</li>
<li><strong>Inherent Structure:</strong> The liquid possesses a temperature-independent inherent structure (amorphous network) hidden beneath thermal vibrations.</li>
<li><strong>Melting/Freezing Criteria:</strong> The study proposes an &ldquo;Inverse Lindemann Criterion&rdquo;: while crystals melt when vibration amplitude exceeds ~0.19 lattice spacings, liquids freeze when atom displacements from their inherent minima drop below ~0.30 neighbor spacings.</li>
</ul>
<h2 id="limitations-and-energy-scale-problem">Limitations and Energy Scale Problem</h2>
<p>The authors acknowledge a quantitative energy scale discrepancy. To match the observed melting temperature of Si ($1410°$C), $\epsilon$ would need to be approximately 42 kcal/mol, considerably less than the 50 kcal/mol required to reproduce the correct cohesive energy of the crystal. The authors suggest this could be resolved either by further optimization of $v_2$ and $v_3$, or by adding position-independent single-particle terms $v_1 \approx -16$ kcal/mol arising from the electronic structure. Adding $v_1$ terms only affects the temperature scale and has no influence on local structure at a given reduced temperature.</p>
<p>The simulated liquid coordination number (8.07) is also higher than the experimentally reported value of approximately 6.4, though the authors note that the experimental definition of &ldquo;nearest neighbors&rdquo; was not precisely stated.</p>
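<p>The energy-scale mismatch described above follows from the definition of the reduced temperature, $T^* = k_B T / \epsilon$. A short worked check, using the reduced melting temperature $T_m^* \approx 0.080$ reported for the simulation and the molar gas constant (the function names are illustrative only):</p>

```python
R_KCAL = 1.987204e-3            # gas constant in kcal/(mol K)
T_STAR_MELT = 0.080             # reduced melting temperature from the simulation
T_MELT_EXP_K = 1410.0 + 273.15  # experimental melting point of Si in kelvin

def reduced_to_kelvin(t_star, eps_kcal_per_mol):
    """Convert a reduced temperature T* = k_B T / eps to kelvin."""
    return t_star * eps_kcal_per_mol / R_KCAL

def epsilon_for_melting_point(t_kelvin, t_star=T_STAR_MELT):
    """Energy scale (kcal/mol) that maps t_star onto t_kelvin."""
    return R_KCAL * t_kelvin / t_star

print(reduced_to_kelvin(T_STAR_MELT, 50.0))     # roughly 2013 K, well above 1683 K
print(epsilon_for_melting_point(T_MELT_EXP_K))  # roughly 41.8 kcal/mol
```

<p>With $\epsilon = 50$ kcal/mol the model melts about 330 K too high, and matching the experimental melting point requires $\epsilon \approx 42$ kcal/mol, consistent with the figures quoted by the authors.</p>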
<h2 id="bonding-statistics-in-inherent-structures">Bonding Statistics in Inherent Structures</h2>
<p>Analysis of potential-energy minima (inherent structures) using a bond cutoff of $r/\sigma = 1.40$ reveals the coordination distribution in the liquid:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Coordination Number</th>
          <th style="text-align: left">Fraction of Atoms</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">4</td>
          <td style="text-align: left">0.201</td>
      </tr>
      <tr>
          <td style="text-align: left">5</td>
          <td style="text-align: left">0.568</td>
      </tr>
      <tr>
          <td style="text-align: left">6</td>
          <td style="text-align: left">0.205</td>
      </tr>
      <tr>
          <td style="text-align: left">7</td>
          <td style="text-align: left">0.024</td>
      </tr>
  </tbody>
</table>
<p>Five-coordinate atoms dominate the liquid&rsquo;s inherent structure, with four- and six-coordinate atoms each accounting for about 20% of the population. The three-body interactions prevent any occurrence of coordination numbers near 12 that would indicate local close packing.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Integration:</strong> Equations of motion integrated using a <strong>fifth-order Gear algorithm</strong>.</li>
<li><strong>Time Step:</strong> $\Delta t = 5 \times 10^{-3} \tau$ (approx $3.83 \times 10^{-16}$ s), where $\tau = \sigma(m/\epsilon)^{1/2} = 7.6634 \times 10^{-14}$ s.</li>
<li><strong>Minimization:</strong> Steepest-descent mapping utilized <strong>Newton&rsquo;s method</strong> to find limiting solutions ($\nabla \Phi = 0$).</li>
</ul>
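<p>The time unit can be checked directly from $\tau = \sigma (m/\epsilon)^{1/2}$ in CGS units. The sketch below assumes the atomic mass of $^{28}$Si (about 27.977 u); that choice is an assumption made here because it reproduces the quoted $\tau$, whereas the natural-abundance mass (28.0855 u) gives a value about 0.2% larger.</p>

```python
import math

SIGMA_CM = 0.20951e-7      # sigma = 0.20951 nm expressed in cm
EPS_ERG = 3.4723e-12       # epsilon = 50 kcal/mol expressed in erg (value from the paper)
AMU_G = 1.66053907e-24     # atomic mass unit in grams
M_SI28_AMU = 27.9769265    # assumed: mass of the 28Si isotope in u

def time_unit_seconds(sigma_cm, mass_g, eps_erg):
    """tau = sigma * sqrt(m / epsilon); cm * sqrt(g / erg) has units of seconds."""
    return sigma_cm * math.sqrt(mass_g / eps_erg)

tau = time_unit_seconds(SIGMA_CM, M_SI28_AMU * AMU_G, EPS_ERG)
dt = 5e-3 * tau            # integration step used in the paper

print(f"tau = {tau:.4e} s")   # close to 7.66e-14 s
print(f"dt  = {dt:.3e} s")    # close to 3.83e-16 s
```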
<h3 id="models">Models</h3>
<p>To reproduce this work, one must implement the potential $\Phi = \sum_{i &lt; j} v_2 + \sum_{i &lt; j &lt; k} v_3$ (sums over distinct pairs and triplets of atoms) with the exact functional forms and parameters given below.</p>

<figure class="post-figure center ">
    <img src="/img/notes/chemistry/stillinger-weber-potential.webp"
         alt="Stillinger-Weber potential visualization"
         title="Stillinger-Weber potential visualization"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Left: Two-body radial potential $v_2(r)$ showing the characteristic well at $r_{min} \approx 1.12\sigma$. Right: Three-body angular penalty $h(r_{min}, r_{min}, \theta)$ demonstrating the minimum at the tetrahedral angle (109.5°), which enforces the diamond crystal structure.</figcaption>
    
</figure>

<h4 id="reduced-units">Reduced Units</h4>
<ul>
<li>$\sigma = 0.20951 \text{ nm}$</li>
<li>$\epsilon = 50 \text{ kcal/mol} = 3.4723 \times 10^{-12} \text{ erg}$</li>
</ul>
<h4 id="two-body-term-v_2">Two-Body Term ($v_2$)</h4>
<p>$$
v_2(r_{ij}) = \epsilon A (B r_{ij}^{-p} - r_{ij}^{-q}) \exp[(r_{ij} - a)^{-1}] \quad \text{for } r_{ij} &lt; a
$$</p>
<p><em>(Vanishes for $r_{ij} \geq a$; distances are expressed in units of $\sigma$)</em></p>
<h4 id="three-body-term-v_3">Three-Body Term ($v_3$)</h4>
<p>$$
v_3(r_i, r_j, r_k) = \epsilon [h(r_{ij}, r_{ik}, \theta_{jik}) + h(r_{ji}, r_{jk}, \theta_{ijk}) + h(r_{ki}, r_{kj}, \theta_{ikj})]
$$</p>
<p>where:</p>
<p>$$
h(r_{ij}, r_{ik}, \theta_{jik}) = \lambda \exp[\gamma(r_{ij}-a)^{-1} + \gamma(r_{ik}-a)^{-1}] (\cos\theta_{jik} + \frac{1}{3})^2
$$</p>
<p><em>(Vanishes if distances $\geq a$)</em></p>
<h4 id="parameters">Parameters</h4>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Parameter</th>
          <th style="text-align: left">Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">$A$</td>
          <td style="text-align: left">$7.049556277$</td>
      </tr>
      <tr>
          <td style="text-align: left">$B$</td>
          <td style="text-align: left">$0.6022245584$</td>
      </tr>
      <tr>
          <td style="text-align: left">$p$</td>
          <td style="text-align: left">$4$</td>
      </tr>
      <tr>
          <td style="text-align: left">$q$</td>
          <td style="text-align: left">$0$</td>
      </tr>
      <tr>
          <td style="text-align: left">$a$</td>
          <td style="text-align: left">$1.80$</td>
      </tr>
      <tr>
          <td style="text-align: left">$\lambda$</td>
          <td style="text-align: left">$21.0$</td>
      </tr>
      <tr>
          <td style="text-align: left">$\gamma$</td>
          <td style="text-align: left">$1.20$</td>
      </tr>
  </tbody>
</table>
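<p>The functional forms and parameters above translate into a short NumPy sketch. Energies are in units of $\epsilon$ and distances in units of $\sigma$; the function names and the use of <code>np.where</code> to handle the cutoff are choices made here for illustration, not something prescribed by the paper.</p>

```python
import numpy as np

A, B, P_EXP, Q_EXP = 7.049556277, 0.6022245584, 4, 0
A_CUT, LAM, GAMMA = 1.80, 21.0, 1.20

def v2(r):
    """Two-body term in units of epsilon; r in units of sigma."""
    r = np.asarray(r, dtype=float)
    inside = r < A_CUT
    rs = np.where(inside, r, 1.0)   # dummy distance outside the cutoff avoids 1/0
    val = A * (B * rs ** -P_EXP - rs ** -Q_EXP) * np.exp(1.0 / (rs - A_CUT))
    return np.where(inside, val, 0.0)

def h(r1, r2, cos_theta):
    """Angular term for one vertex; zero if either bond reaches the cutoff."""
    r1 = np.asarray(r1, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    inside = (r1 < A_CUT) & (r2 < A_CUT)
    s1 = np.where(inside, r1, 1.0)
    s2 = np.where(inside, r2, 1.0)
    val = LAM * np.exp(GAMMA / (s1 - A_CUT) + GAMMA / (s2 - A_CUT))
    return np.where(inside, val * (cos_theta + 1.0 / 3.0) ** 2, 0.0)

def v3(xi, xj, xk):
    """Three-body term: sum of h over the three vertices of the triplet."""
    def vertex(c, a, b):
        u, w = a - c, b - c
        ru, rw = np.linalg.norm(u), np.linalg.norm(w)
        return h(ru, rw, (u @ w) / (ru * rw))
    return vertex(xi, xj, xk) + vertex(xj, xi, xk) + vertex(xk, xi, xj)
```

<p>Two quick sanity checks on this sketch: <code>v2(2 ** (1 / 6))</code> evaluates to about $-1$, the well depth at $r_{min} \approx 1.12\sigma$, and <code>h</code> vanishes whenever $\cos\theta = -1/3$, so an ideal tetrahedral bond pair carries no three-body penalty.</p>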
<h3 id="evaluation">Evaluation</h3>
<p>The paper evaluates the model against experimental diffraction data.</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Metric</th>
          <th style="text-align: left">Simulated Value</th>
          <th style="text-align: left">Experimental Value</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Melting Point ($T_m^*$)</strong></td>
          <td style="text-align: left">$\approx 0.080$</td>
          <td style="text-align: left">N/A</td>
          <td style="text-align: left">Reduced units. Requires $\epsilon \approx 42$ kcal/mol to match real $T_m = 1410°$C, vs 50 kcal/mol for correct cohesive energy.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Coordination (Liquid)</strong></td>
          <td style="text-align: left">$8.07$</td>
          <td style="text-align: left">$\approx 6.4$</td>
          <td style="text-align: left">Evaluated at first $g(r)$ minimum ($r/\sigma = 1.625$). Simulated value is higher than experiment.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ First Peak</strong></td>
          <td style="text-align: left">$2.53$ $\AA^{-1}$</td>
          <td style="text-align: left">$2.80$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Shoulder</strong></td>
          <td style="text-align: left">$3.25$ $\AA^{-1}$</td>
          <td style="text-align: left">$3.25$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I. Exact match with experiment.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Second Peak</strong></td>
          <td style="text-align: left">$5.35$ $\AA^{-1}$</td>
          <td style="text-align: left">$5.75$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Third Peak</strong></td>
          <td style="text-align: left">$8.16$ $\AA^{-1}$</td>
          <td style="text-align: left">$8.50$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>$S(k)$ Fourth Peak</strong></td>
          <td style="text-align: left">$10.60$ $\AA^{-1}$</td>
          <td style="text-align: left">$11.20$ $\AA^{-1}$</td>
          <td style="text-align: left">From Table I.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Entropy of Melting ($\Delta S / N k_B$)</strong></td>
          <td style="text-align: left">$\approx 3.7$</td>
          <td style="text-align: left">$3.25$</td>
          <td style="text-align: left">Simulated at constant volume; experimental at constant pressure (1 atm).</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Stillinger, F. H., &amp; Weber, T. A. (1985). Computer simulation of local order in condensed phases of silicon. <em>Physical Review B</em>, 31(8), 5262-5271. <a href="https://doi.org/10.1103/PhysRevB.31.5262">https://doi.org/10.1103/PhysRevB.31.5262</a></p>
<p><strong>Publication</strong>: Physical Review B, 1985</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{stillingerComputerSimulationLocal1985,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Computer Simulation of Local Order in Condensed Phases of Silicon}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Stillinger, Frank H. and Weber, Thomas A.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1985</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = apr,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Physical Review B}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{31}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{8}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{5262--5271}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{American Physical Society}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1103/PhysRevB.31.5262}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Second-Order Langevin Equation for Field Simulations</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/second-order-langevin-1987/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/second-order-langevin-1987/</guid><description>Hyperbolic Algorithm adds second-order derivatives to Langevin dynamics, reducing systematic errors to O(ε²) for lattice field simulations.</description><content:encoded><![CDATA[<h2 id="contribution-and-paper-type">Contribution and Paper Type</h2>
<p>This is a <strong>Methodological Paper</strong> ($\Psi_{\text{Method}}$). It proposes a novel stochastic algorithm, the Hyperbolic Algorithm (HA), and validates its superior efficiency against the existing Langevin Algorithm (LA) through formal error analysis and numerical simulation. It contains significant theoretical derivation (Liouville dynamics) that serves primarily to justify the algorithmic performance claims.</p>
<h2 id="motivation-and-gaps-in-prior-work">Motivation and Gaps in Prior Work</h2>
<p>The standard Langevin Algorithm (LA) for numerical simulation of Euclidean field theories suffers from efficiency bottlenecks. The simplest Euler discretization of the LA introduces systematic errors of $O(\epsilon)$ (where $\epsilon$ is the step size). To keep that error under control, $\epsilon$ must be small, but a small step lengthens the sweep-to-sweep correlation time (autocorrelation time), so simulations become computationally expensive.</p>
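<p>The $O(\epsilon)$ bias of the Euler-discretized LA can be seen in a one-variable toy case. The sketch assumes a Gaussian action $S = \omega^2 \phi^2 / 2$ and noise normalized so that the continuum equilibrium is $e^{-S}$ (variance $2\epsilon$ per Euler step); both are illustrative assumptions rather than details taken from the paper. For this linear update the stationary variance obeys an exact recursion, so no random numbers are needed.</p>

```python
def euler_langevin_variance(eps, omega2=1.0, n_steps=20_000):
    """Stationary <phi^2> of the Euler update
        phi <- phi - eps * omega2 * phi + sqrt(2 * eps) * xi,   xi ~ N(0, 1),
    obtained by iterating the exact variance recursion (needs eps * omega2 < 2).
    """
    a = 1.0 - eps * omega2
    var = 0.0
    for _ in range(n_steps):
        var = a * a * var + 2.0 * eps   # Var[a * phi] + Var[sqrt(2 eps) * xi]
    return var

def relative_bias(eps, omega2=1.0):
    """Relative error of the discretized variance against the exact 1 / omega2."""
    exact = 1.0 / omega2
    return (euler_langevin_variance(eps, omega2) - exact) / exact

for eps in (0.1, 0.05, 0.025):
    print(eps, relative_bias(eps))
```

<p>The fixed point of the recursion is $1 / [\omega^2 (1 - \epsilon\omega^2/2)]$, so the relative error is approximately $\epsilon\omega^2/2$ and halves each time the step size is halved, which is the first-order behaviour the hyperbolic algorithm is designed to improve on.</p>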
<h2 id="core-novelty-second-order-dynamics">Core Novelty: Second-Order Dynamics</h2>
<p>The core contribution is the introduction of a <strong>second-order derivative in fictitious time</strong> to the stochastic equation. This converts the parabolic Langevin equation into a hyperbolic equation:</p>
<p>$$
\begin{aligned}
\frac{\partial^{2}\phi}{\partial t^{2}}+\gamma\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta
\end{aligned}
$$</p>
<h3 id="equation-comparison">Equation Comparison</h3>
<p>The key difference from the standard (first-order) Langevin equation:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Equation Type</th>
          <th style="text-align: left">Formula</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Hyperbolic (Second Order)</strong></td>
          <td style="text-align: left">$$\frac{\partial^{2}\phi}{\partial t^{2}}+\gamma\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta$$</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Langevin (First Order)</strong></td>
          <td style="text-align: left">$$\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta$$</td>
      </tr>
  </tbody>
</table>
<p>The standard Langevin equation corresponds to the overdamped limit where the acceleration term is absent. Physically, the Hyperbolic equation can be viewed as microcanonical equations of motion with an added friction term.</p>
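<p>A short sketch of that limit, assuming the noise is normalized as in the discretized update given under Reproducibility Details (the explicit correlator and the rescaled time $\tau$ are notation introduced here): for large $\gamma$ the acceleration term is dropped, and rescaling time by $\gamma$ recovers the first-order equation.</p>
<p>$$
\begin{aligned}
\gamma\frac{\partial\phi}{\partial t} &amp;\approx -\frac{\partial S}{\partial\phi}+\eta, \qquad \langle\eta_x(t)\eta_y(t')\rangle = \frac{2\gamma}{\beta}\delta_{x,y}\delta(t-t') \\
\frac{\partial\phi}{\partial\tau} &amp;= -\frac{\partial S}{\partial\phi}+\eta, \qquad \langle\eta_x(\tau)\eta_y(\tau')\rangle = \frac{2}{\beta}\delta_{x,y}\delta(\tau-\tau'), \qquad \tau = t/\gamma
\end{aligned}
$$</p>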
<h3 id="key-innovations">Key Innovations</h3>
<ul>
<li><strong>Higher Order Accuracy</strong>: The simplest discretization of this equation leads to systematic errors of only $O(\epsilon^2)$ compared to $O(\epsilon)$ for LA.</li>
<li><strong>Tunable Damping</strong>: The addition of the damping parameter $\gamma$ allows tuning to minimize autocorrelation tails.</li>
<li><strong>Uniform Evolution</strong>: The method evolves structures of different wavelengths more uniformly than LA due to the specific dissipation structure.</li>
</ul>
<h2 id="methodology-and-experiments">Methodology and Experiments</h2>
<p>The author validated the method using the <strong>XY Model</strong> on 2D lattices.</p>
<ul>
<li><strong>System</strong>: Euclidean action $S = -\sum_{x,\mu} \cos(\theta_{x+\mu} - \theta_x)$.</li>
<li><strong>Setup</strong>:
<ul>
<li>Lattice sizes: $15^2$ (helical boundary conditions) and $30^2$.</li>
<li>$\beta$ range: 0.9 to 1.2 (crossing the critical point $\approx 1.0$).</li>
<li>Run length: &gt;100,000 updates in equilibrium.</li>
</ul>
</li>
<li><strong>Metrics</strong>:
<ul>
<li><strong>Autocorrelation time ($\tau$)</strong>: Defined as the number of updates for the time-correlation function to drop to 10% of its initial value.</li>
<li><strong>Systematic Error</strong>: Measured via deviation of average action from Monte Carlo values.</li>
</ul>
</li>
</ul>
<h2 id="results-and-conclusions">Results and Conclusions</h2>
<ul>
<li><strong>Efficiency</strong>: The Hyperbolic Algorithm (HA) is far more efficient than LA: at equal systematic error, its sweep-sweep correlation times are significantly shorter.</li>
<li><strong>Error Scaling</strong>: Numerical results confirmed that HA step size $\epsilon_H = 0.1$ yields systematic errors comparable to LA step size $\epsilon_L \approx 0.008$ ($O(\epsilon^2)$ vs $O(\epsilon)$ scaling).</li>
<li><strong>Speedup</strong>: In the disordered phase, HA is roughly $\epsilon_H / \epsilon_L$ times faster (approximately a factor of 12.5 for $\epsilon_H = 0.1$, $\epsilon_L = 0.008$). In the ordered phase, efficiency gains increase with distance scale, reaching factors of 20 or more for long-range correlations.</li>
<li><strong>Optimal Damping</strong>: For the XY model, the optimal damping parameter was found to be $\gamma \approx 0.4$.</li>
</ul>
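<p>The step-size matching can be read as simple arithmetic. The sketch below assumes the errors follow their leading-order scaling exactly; the prefactors $c_H$, $c_L$ and the error symbols $\delta_H$, $\delta_L$ are labels introduced here for illustration.</p>
<p>$$
\begin{aligned}
\delta_H &amp;\approx c_H\epsilon_H^{2} = 0.01\,c_H, \qquad \delta_L \approx c_L\epsilon_L = 0.008\,c_L \\
\delta_H = \delta_L \ &amp;\Rightarrow\ c_L/c_H \approx 1.25, \qquad \epsilon_H/\epsilon_L = 0.1/0.008 = 12.5
\end{aligned}
$$</p>
<p>Under the same assumption, halving the target error would require $\epsilon_L \approx 0.004$ but only $\epsilon_H \approx 0.1/\sqrt{2} \approx 0.071$, so the step-size ratio would grow to roughly 17.7. The advantage of the second-order scheme widens as the accuracy requirement tightens.</p>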
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<p><strong>1. The Hyperbolic Algorithm (HA)</strong></p>
<p>The discretized update equations for scalar fields are:</p>
<p>$$
\begin{aligned}
\pi_{t+\epsilon} - \pi_{t} &amp;= -\epsilon\gamma\pi_{t} - \epsilon\frac{\partial S}{\partial\phi_{t}} + \sqrt{2\epsilon\gamma/\beta}\xi_{t} \\
\phi_{t+\epsilon} - \phi_{t} &amp;= \epsilon\pi_{t+\epsilon}
\end{aligned}
$$</p>
<ul>
<li><strong>Variables</strong>: $\phi$ is the field, $\pi$ is the conjugate momentum ($\dot{\phi}$).</li>
<li><strong>Parameters</strong>: $\epsilon$ (step size), $\gamma$ (damping constant).</li>
<li><strong>Noise</strong>: $\xi$ is Gaussian noise with $\langle\xi_x \xi_y\rangle = \delta_{x,y}$.</li>
<li><strong>Storage</strong>: Requires storing both $\phi$ and $\pi$ vectors.</li>
</ul>
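<p>The update above can be sketched in a few lines of NumPy. This is a minimal illustration and not the paper&rsquo;s code: the function names <code>ha_step</code> and <code>mean_phi_squared</code> are hypothetical, and the quadratic toy action $S = \frac{1}{2}\sum_x \phi_x^2$ (independent sites) is an assumption chosen only because its exact average $\langle\phi^2\rangle = 1/\beta$ is known.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import numpy as np


def ha_step(phi, pi, grad_S, eps, gamma, beta, rng):
    """One HA update: kick pi with friction, force and noise, then drift phi."""
    xi = rng.standard_normal(phi.shape)  # Gaussian noise, unit variance per site
    pi = pi - eps * gamma * pi - eps * grad_S(phi) + np.sqrt(2.0 * eps * gamma / beta) * xi
    phi = phi + eps * pi  # the drift uses the already-updated momentum
    return phi, pi


def mean_phi_squared(n_sites=4000, n_steps=4000, eps=0.1, gamma=0.4, beta=2.0, seed=0):
    """Sample the toy action S = sum(phi**2) / 2 and return the average of phi**2."""
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_sites)
    pi = np.zeros(n_sites)
    grad_S = lambda field: field  # dS/dphi for the quadratic toy action (assumption)
    burn_in = n_steps // 2
    for _ in range(burn_in):  # discard the approach to equilibrium
        phi, pi = ha_step(phi, pi, grad_S, eps, gamma, beta, rng)
    total = 0.0
    for _ in range(n_steps - burn_in):  # accumulate the equilibrium average
        phi, pi = ha_step(phi, pi, grad_S, eps, gamma, beta, rng)
        total += float(np.mean(phi ** 2))
    return total / (n_steps - burn_in)
</code></pre></div>
<p>With these defaults the returned average should sit close to $1/\beta = 0.5$, up to statistical noise and the $O(\epsilon^2)$ step-size bias. Swapping <code>grad_S</code> for the gradient of another action (for example the XY action) leaves <code>ha_step</code> unchanged.</p>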
<p><strong>2. Non-Abelian Generalization</strong></p>
<p>For Lie group elements $U$ with generators $T^a$:</p>
<p>$$
\begin{aligned}
\pi_{t+\epsilon}^a - \pi_{t}^a &amp;= -\epsilon\gamma\pi_{t}^a - \epsilon\delta^a S[U_t] + \sqrt{2\epsilon\gamma/\beta}\xi_{t}^a \\
U_{t+\epsilon} &amp;= e^{i\epsilon\pi_{t+\epsilon}^a T^a} U_t
\end{aligned}
$$</p>
<h3 id="theoretical-proof-of-oepsilon2-accuracy">Theoretical Proof of $O(\epsilon^2)$ Accuracy</h3>
<p>The derivation relies on the generalized Liouville equation for the probability distribution $P[\phi, \pi; t]$.</p>
<ol>
<li><strong>Transition Probability</strong>: The transition $W$ for one iteration is defined.</li>
<li><strong>Effective Liouville Operator</strong>: The evolution is written as $P(t+\epsilon) = \exp(\epsilon L_{\text{eff}}) P(t)$.</li>
<li><strong>Baker-Hausdorff Expansion</strong>: Using normal ordering of operators, the equilibrium distribution $P_{\text{eq}}$ is derived through $O(\epsilon^2)$:</li>
</ol>
<p>$$
\begin{aligned}
P_{\text{eq}} &amp;= \exp\left\lbrace-\frac{1}{2}\beta_{1}\sum_{x}\pi_{x}^{2} - \beta S[\phi] + \frac{1}{2}\epsilon\beta\sum_{x}\pi_{x}S_{x} + \epsilon^{2}G + O(\epsilon^3)\right\rbrace
\end{aligned}
$$</p>
<p>where $\beta_1 = \beta\left(1 - \frac{1}{2}\epsilon\gamma\right)$.</p>
<ol start="4">
<li><strong>Effective Action</strong>: Integrating out $\pi$ yields the effective action for $\phi$:</li>
</ol>
<p>$$
\begin{aligned}
S_{\text{eff}}[\phi] &amp;= S[\phi] - \frac{1}{8}\epsilon^2 \sum_x S_x^2 + \dots
\end{aligned}
$$</p>
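<p>Because no $O(\epsilon)$ term survives in $S_{\text{eff}}$, the sampled distribution of $\phi$ deviates from the target only at $O(\epsilon^2)$, which establishes the higher-order accuracy.</p>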
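<p>The integration over $\pi$ in step 4 is a Gaussian integral. A sketch of the leading-order calculation, treating $S_x$ as independent of $\pi$ and leaving the $\epsilon^2 G$ contribution inside the dots:</p>
<p>$$
\begin{aligned}
\int d\pi_x \exp\left\lbrace-\frac{1}{2}\beta_{1}\pi_{x}^{2} + \frac{1}{2}\epsilon\beta\pi_{x}S_{x}\right\rbrace &amp;\propto \exp\left\lbrace\frac{\epsilon^{2}\beta^{2}}{8\beta_{1}}S_{x}^{2}\right\rbrace = \exp\left\lbrace\frac{1}{8}\epsilon^{2}\beta S_{x}^{2} + O(\epsilon^3)\right\rbrace \\
\Rightarrow\quad \beta S_{\text{eff}}[\phi] &amp;= \beta S[\phi] - \frac{1}{8}\epsilon^{2}\beta\sum_x S_x^2 + \dots
\end{aligned}
$$</p>
<p>The cross term that is linear in $\epsilon$ therefore contributes only at second order once the momenta are integrated out, since $\beta^2/\beta_1 = \beta\left(1 + O(\epsilon)\right)$.</p>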
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Model</strong>: XY Model (2D)</li>
<li><strong>Hamiltonian</strong>: $H = \frac{1}{2}\sum \pi^2 + S[\phi]$ where $S = -\sum \cos(\Delta \theta)$.</li>
<li><strong>Observables</strong>:
<ul>
<li>$\Gamma_n = \cos(\theta_{m+n} - \theta_m)$ (averaged over lattice $m$).</li>
</ul>
</li>
<li><strong>Comparisons</strong>:
<ul>
<li><strong>LA Step</strong>: $\epsilon_L \approx 0.005 - 0.02$.</li>
<li><strong>HA Step</strong>: $\epsilon_H \approx 0.1 - 0.2$.</li>
<li><strong>Equivalence</strong>: $\epsilon_H = 0.1$ matches error of $\epsilon_L \approx 0.008$.</li>
</ul>
</li>
</ul>
<hr>
<h2 id="terminology-note">Terminology Note</h2>
<p>The naming conventions in this paper differ from those commonly used in molecular dynamics (MD). The following table provides a cross-field mapping:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Concept</th>
          <th style="text-align: left"><strong>Field Theory (This Paper)</strong></th>
          <th style="text-align: left"><strong>Molecular Dynamics</strong></th>
          <th style="text-align: left"><strong>Mathematics</strong></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Equation 1</strong></td>
          <td style="text-align: left">&ldquo;Langevin Equation&rdquo;</td>
          <td style="text-align: left">Brownian Dynamics (BD)</td>
          <td style="text-align: left">Overdamped Langevin</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Equation 2</strong></td>
          <td style="text-align: left">&ldquo;Hyperbolic Equation&rdquo;</td>
          <td style="text-align: left">Langevin Dynamics (LD)</td>
          <td style="text-align: left">Underdamped Langevin</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Integrator 1</strong></td>
          <td style="text-align: left">Euler Discretization</td>
          <td style="text-align: left">Euler Integrator</td>
          <td style="text-align: left">Euler-Maruyama</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Integrator 2</strong></td>
          <td style="text-align: left">Hyperbolic Algorithm (HA)</td>
          <td style="text-align: left">Velocity Verlet / Leapfrog</td>
          <td style="text-align: left">Quasi-Symplectic Splitting</td>
      </tr>
  </tbody>
</table>
<p><strong>Key insight</strong>: The paper&rsquo;s &ldquo;Hyperbolic Algorithm&rdquo; is mathematically equivalent to Langevin Dynamics with a Leapfrog/Verlet integrator, commonly used in MD. The baseline &ldquo;Langevin Algorithm&rdquo; corresponds to Brownian Dynamics. The term &ldquo;Langevin equation&rdquo; is overloaded: field theorists often use it for overdamped dynamics (no inertia), while chemists assume it includes momentum ($F=ma$).</p>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Horowitz, A. M. (1987). The Second Order Langevin Equation and Numerical Simulations. <em>Nuclear Physics B</em>, 280, 510-522. <a href="https://doi.org/10.1016/0550-3213(87)90159-3">https://doi.org/10.1016/0550-3213(87)90159-3</a></p>
<p><strong>Publication</strong>: Nuclear Physics B 1987</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{horowitzSecondOrderLangevin1987,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The Second Order {{Langevin}} Equation and Numerical Simulations}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Horowitz, Alan M.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1987</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = jan,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Nuclear Physics B}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{280}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{510--522}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{05503213}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1016/0550-3213(87)90159-3}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>MD Simulation of Self-Diffusion on Metal Surfaces (1994)</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/surface-science/self-diffusion-metal-surfaces-1994/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/surface-science/self-diffusion-metal-surfaces-1994/</guid><description>Molecular dynamics simulation of Iridium surface diffusion confirming atomic exchange mechanisms using EAM and many-body potentials.</description><content:encoded><![CDATA[<h2 id="scientific-typology-computational-discovery">Scientific Typology: Computational Discovery</h2>
<p>This is primarily a <strong>Discovery</strong> ($\Psi_{\text{Discovery}}$) paper, with strong supporting contributions as a <strong>Method</strong> ($\Psi_{\text{Method}}$) evaluation. The primary contribution is the validation and mechanistic visualization of the &ldquo;exchange mechanism&rdquo; for surface diffusion using Molecular Dynamics with many-body potentials. This phenomenon had previously been observed in Field Ion Microscope (FIM) experiments but was difficult to characterize dynamically. The paper focuses on determining <em>how</em> atoms move, specifically distinguishing between hopping and exchange mechanisms.</p>
<h2 id="the-field-ion-microscope-fim-observation-gap">The Field Ion Microscope (FIM) Observation Gap</h2>
<p>Surface diffusion is critical for understanding phenomena like crystal growth, epitaxy, and catalysis. Experimental evidence from FIM on fcc(001) surfaces (specifically Pt and Ir) suggested an &ldquo;exchange mechanism&rdquo; where an adatom replaces a substrate atom, challenging the conventional wisdom that adatoms migrate by hopping over potential barriers (bridge sites) between binding sites. The authors sought to:</p>
<ol>
<li>Investigate whether this exchange mechanism could be reproduced dynamically in simulation.</li>
<li>Determine which interatomic potentials (EAM, Sutton-Chen, R-G-L) accurately describe these surface behaviors compared to bulk properties.</li>
</ol>
<h2 id="dynamic-visualization-of-atomic-exchange">Dynamic Visualization of Atomic Exchange</h2>
<p>The study provides a direct dynamic visualization of the &ldquo;concerted motion&rdquo; involved in exchange diffusion events, which occur on timescales too fast for experimental imaging. By comparing three many-body potentials, the authors also show that the choice of potential is critical for capturing surface phenomena: a potential fitted to &ldquo;bulk&rdquo; properties (Sutton-Chen) can miss surface exchange events that the EAM and R-G-L potentials reproduce.</p>
<h2 id="simulation-protocol--evaluated-potentials">Simulation Protocol &amp; Evaluated Potentials</h2>
<p>The authors performed Molecular Dynamics (MD) simulations on Iridium (Ir) surfaces:</p>
<ul>
<li><strong>Surfaces</strong>: Channeled (110), densely packed (111), and loosely packed (001).</li>
<li><strong>Potentials</strong>: Three many-body models were tested: Embedded Atom Method (EAM), Sutton-Chen (S-C), and Rosato-Guillope-Legrand (R-G-L).</li>
<li><strong>Conditions</strong>: Simulations were primarily run at $T=800$ K to ensure sufficient sampling of diffusion events.</li>
<li><strong>Cross-Validation</strong>: The study extended the analysis to Cu, Rh, and Pt systems to verify the universality of the exchange mechanism against experimental data.</li>
</ul>
<h2 id="confirmation-of-concerted-motion-mechanisms">Confirmation of Concerted Motion Mechanisms</h2>
<ul>
<li><strong>Mechanism Confirmation</strong>: The study confirmed that diffusion on Ir(001) proceeds via an atomic exchange mechanism (concerted motion). The activation energy for exchange ($0.77$ eV) was found to be significantly lower than for hopping over bridge sites ($1.57$ eV).</li>
<li><strong>Surface Structure Dependence</strong>:
<ul>
<li><strong>Ir(111)</strong>: Diffusion is rapid (activation energy $V_a = 0.17$ eV from R-G-L Arrhenius plot) and occurs exclusively via hopping; no exchange events were observed due to the close-packed nature of the surface.</li>
<li><strong>Ir(110)</strong>: Diffusion is anisotropic; atoms hop <em>along</em> channels but use the exchange mechanism to move <em>across</em> channels.</li>
</ul>
</li>
<li><strong>Potential Validity</strong>: The R-G-L and EAM potentials successfully reproduced experimental exchange behaviors, whereas the Sutton-Chen potential failed to predict exchange on Ir(001). The authors attribute the S-C failure primarily to the use of &ldquo;bulk&rdquo; potential parameters to describe interactions at the surface.</li>
<li><strong>Cross-System Comparison</strong>: The study extended the analysis to Cu, Rh, and Pt systems. Both S-C and R-G-L potentials correctly predicted the absence of exchange on all three Rh surfaces and on (111) surfaces of Cu and Pt. Exchange events were correctly predicted on Cu(001), Cu(110), Pt(001), and Pt(110) by both potentials. The sole discrepancy was S-C failing to predict exchange on Ir(001), where R-G-L and EAM succeeded in agreement with experiment.</li>
</ul>
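<p>To give a sense of scale for the two Ir(001) barriers, the following estimate assumes simple Arrhenius rates with equal prefactors for both mechanisms (an assumption made here, not a result from the paper) at the simulation temperature of 800 K, where $k_B T = 8.617 \times 10^{-5}\ \text{eV/K} \times 800\ \text{K} \approx 0.0689$ eV. The rate symbols $\Gamma$ are introduced for illustration.</p>
<p>$$
\begin{aligned}
\frac{\Gamma_{\text{exchange}}}{\Gamma_{\text{hop}}} \approx \exp\left(\frac{1.57\ \text{eV} - 0.77\ \text{eV}}{k_B T}\right) = \exp\left(\frac{0.80}{0.0689}\right) \approx 1 \times 10^{5}
\end{aligned}
$$</p>
<p>Under that assumption exchange events would outnumber bridge hops by about five orders of magnitude, consistent with exchange dominating the observed diffusion on Ir(001).</p>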
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Integration</strong>: &ldquo;Velocity&rdquo; form of the Verlet algorithm.</li>
<li><strong>Time Step</strong>: $\Delta t = 0.01$ ps ($10^{-14}$ s).</li>
<li><strong>Simulation Protocol</strong>:
<ol>
<li><strong>Quenching</strong>: System relaxed to 0 K by zeroing velocities when $v \cdot F &lt; 0$.</li>
<li><strong>Equilibration</strong>: 5 ps constant-temperature run (renormalizing velocities every step).</li>
<li><strong>Production</strong>: 15 ps constant-energy (microcanonical) run where trajectories are collected.</li>
</ol>
</li>
</ul>
<h3 id="models">Models</h3>
<p>The study relies on three specific many-body potential formulations:</p>
<ol>
<li><strong>Embedded Atom Method (EAM)</strong>:
<ul>
<li>Total energy:
$$U_{\text{tot}} = \sum_i F_i(\rho_i) + \frac{1}{2} \sum_{i} \sum_{j \neq i} \phi_{ij}(r_{ij})$$</li>
</ul>
</li>
<li><strong>Sutton-Chen (S-C)</strong>:
<ul>
<li>Uses a square root density dependence and power-law pair repulsion $(a/r)^{n}$:
$$F(\rho) \propto \rho^{1/2}$$</li>
</ul>
</li>
<li><strong>Rosato-Guillope-Legrand (R-G-L)</strong>:
<ul>
<li>Born-Mayer type repulsion:
$$\phi_{ij}(r_{ij}) = A \exp[-p(r_{ij}/r_0 - 1)]$$</li>
<li>Attractive band energy:
$$F_i(\rho_i) = -\left(\sum_{j \neq i} \xi^2 \exp[-2q(r_{ij}/r_0 - 1)]\right)^{1/2}$$</li>
</ul>
</li>
</ol>
<h3 id="data">Data</h3>
<ul>
<li><strong>System Size</strong>: 648 classical atoms.</li>
<li><strong>Geometry</strong>:
<ul>
<li>Cubic box with fixed volume.</li>
<li>Periodic boundary conditions in $x$ and $y$ (parallel to surface), free motion in $z$.</li>
<li>Substrate depth: 8, 12, or 9 atomic layers depending on orientation [(001), (110), (111)].</li>
</ul>
</li>
<li><strong>Cutoff Radius</strong>: 14 bohr ($\sim 7.4$ Å).</li>
<li><strong>Initial Conditions</strong>: Velocities initialized from a Maxwellian distribution.</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Diffusion Constant ($D$)</strong>: Calculated using the Einstein relation via Mean Square Displacement (MSD):
$$D = \lim_{t \to \infty} \frac{\langle \Delta r^2(t) \rangle}{2td}$$
where $d=2$ for surface diffusion.</li>
<li><strong>Activation Energy ($V_a$)</strong>: Extracted from the slope of Arrhenius plots ($\ln D$ vs $1/T$).</li>
<li><strong>Attempt Frequency ($\nu$)</strong>: Estimated via the harmonic approximation, $\nu = \frac{1}{2\pi}\sqrt{c/M}$, where $c$ is the force constant of the binding well and $M$ is the atomic mass.</li>
</ul>
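<p>The Einstein relation above can be turned into a small estimator. The sketch below is illustrative only: the function names, the zero-intercept least-squares fit, and the synthetic Brownian trajectory used as test data are all assumptions made here, and it expects unwrapped coordinates of shape <code>(n_frames, n_atoms, d)</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import numpy as np


def diffusion_constant(positions, dt, max_lag):
    """Fit MSD(t) = 2*d*D*t through the origin and return D.

    positions: array of shape (n_frames, n_atoms, d), unwrapped coordinates.
    """
    d = positions.shape[-1]
    lags = np.arange(1, max_lag + 1)
    msd = np.empty(max_lag)
    for k, lag in enumerate(lags):
        disp = positions[lag:] - positions[:-lag]  # every available time origin
        msd[k] = np.mean(np.sum(disp ** 2, axis=-1))
    t = lags * dt
    slope = np.dot(t, msd) / np.dot(t, t)  # least-squares slope with zero intercept
    return slope / (2 * d)


def synthetic_walk(n_frames, n_atoms, d, D, dt, seed=0):
    """Hypothetical test data: free Brownian motion with a known D."""
    rng = np.random.default_rng(seed)
    steps = rng.normal(0.0, np.sqrt(2.0 * D * dt), size=(n_frames - 1, n_atoms, d))
    start = np.zeros((1, n_atoms, d))
    return np.concatenate([start, np.cumsum(steps, axis=0)], axis=0)


traj = synthetic_walk(n_frames=1000, n_atoms=100, d=2, D=0.5, dt=0.01)
D_est = diffusion_constant(traj, dt=0.01, max_lag=100)
</code></pre></div>
<p>On the synthetic walk, <code>D_est</code> should recover the input value of 0.5 to within statistical noise. For a real adatom trajectory only a few tracer atoms are available, so the lag window and the number of time origins matter far more than they do in this toy case.</p>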
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Shiang, K.-D., Wei, C. M., &amp; Tsong, T. T. (1994). A molecular dynamics study of self-diffusion on metal surfaces. <em>Surface Science</em>, 301(1-3), 136-150. <a href="https://doi.org/10.1016/0039-6028(94)91295-5">https://doi.org/10.1016/0039-6028(94)91295-5</a></p>
<p><strong>Publication</strong>: Surface Science 1994</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{shiang1994molecular,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{A molecular dynamics study of self-diffusion on metal surfaces}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Shiang, Keh-Dong and Wei, C.M. and Tsong, Tien T.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Surface Science}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{301}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{1-3}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{136--150}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1994}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Elsevier}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1016/0039-6028(94)91295-5}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Evans 1986: Thermal Conductivity of Lennard-Jones Fluid</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/evans-thermal-conductivity-1986/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/evans-thermal-conductivity-1986/</guid><description>A 1986 validation of the Evans NEMD method for simulating heat flow, identifying long-time tail anomalies near the critical point.</description><content:encoded><![CDATA[<h2 id="methodological-validation-and-physical-discovery">Methodological Validation and Physical Discovery</h2>
<p>This is primarily a <strong>Methodological Paper ($\Psi_{\text{Method}}$)</strong>, with a significant secondary component of <strong>Discovery ($\Psi_{\text{Discovery}}$)</strong>.</p>
<p>It focuses on validating a specific algorithm (the &ldquo;Evans method&rdquo;) for Non-Equilibrium Molecular Dynamics (NEMD) by comparing its results against experimental benchmarks. However, it also uncovers physical anomalies, specifically &ldquo;long-time tails&rdquo; in the heat flux autocorrelation function that deviate significantly from theoretical predictions, marking a discovery about the physics of the Lennard-Jones fluid itself.</p>
<h2 id="flow-gradients-and-boundary-limitations">Flow Gradients and Boundary Limitations</h2>
<p>The primary motivation is to overcome the limitations of simulating heat flow using physical boundaries (e.g., walls at different temperatures), which causes severe interpretive difficulties due to density and temperature gradients.</p>
<p>The &ldquo;Evans method&rdquo; uses a fictitious external field to induce heat flow in a periodic, homogeneous system. This paper serves to:</p>
<ol>
<li>Validate this method across a wide range of state points (temperatures and densities) beyond the triple point.</li>
<li>Investigate the system&rsquo;s behavior near the critical point, where transport properties are known to be anomalous.</li>
</ol>
<h2 id="core-innovations-of-the-evans-algorithm">Core Innovations of the Evans Algorithm</h2>
<p>The core contribution is the rigorous stress-testing of the <strong>homogeneous heat flow algorithm</strong> (Evans method) combined with a <strong>Gaussian thermostat</strong>.</p>
<p>Specific novel insights include:</p>
<ul>
<li><strong>Linearity Validation</strong>: Establishing that, away from phase boundaries, the effective thermal conductivity is a monotonic, virtually linear function of the external field, justifying the extrapolation to zero field.</li>
<li><strong>Critical Anomaly Detection</strong>: Finding that near the critical point, conductivity becomes a non-monotonic function of the field, challenging standard simulation approaches in this regime.</li>
<li><strong>Tail Amplitude Discovery</strong>: Demonstrating that the &ldquo;long-time tails&rdquo; of the heat flux autocorrelation function have amplitudes roughly 6 times larger than those predicted by mode-coupling theory.</li>
</ul>
<h2 id="nemd-simulation-setup">NEMD Simulation Setup</h2>
<p>The author performed <strong>Non-Equilibrium Molecular Dynamics (NEMD)</strong> simulations using the Lennard-Jones potential.</p>
<ul>
<li><strong>System</strong>: Mostly $N=108$ particles, with some checks using $N=256$ to test size dependence.</li>
<li><strong>Thermostat</strong>: A Gaussian thermostat was used to keep the kinetic energy (temperature) constant.</li>
<li><strong>State Points</strong>:
<ul>
<li><strong>Critical Isotherm</strong>: $T=1.35$, varying density.</li>
<li><strong>Supercritical Isotherm</strong>: $T=2.0$.</li>
<li><strong>Freezing Line</strong>: Two points ($T=2.74, \rho=1.113$ and $T=2.0, \rho=1.04$).</li>
</ul>
</li>
<li><strong>Validation</strong>: Results were compared against <strong>experimental data for Argon</strong> (using standard LJ parameters).</li>
<li><strong>Ablation</strong>:
<ul>
<li><strong>Field Strength ($F$)</strong>: Varied to check for linearity/non-linearity.</li>
<li><strong>System Size ($N$)</strong>: Comparison between 108 and 256 particles to rule out finite-size artifacts.</li>
</ul>
</li>
</ul>
<h2 id="linearity-regimes-and-long-time-tail-anomalies">Linearity Regimes and Long-Time Tail Anomalies</h2>
<ul>
<li><strong>Agreement with Experiment</strong>: The Evans method yields thermal conductivities in broad agreement with experimental Argon data for most state points.</li>
<li><strong>Linearity</strong>: Away from the critical point, conductivity is a virtually linear function of the field strength $F$, allowing for accurate zero-field extrapolation.</li>
<li><strong>Critical Region Failure</strong>: Near the critical point ($T=1.35, \rho=0.4$), the method struggles; the conductivity is non-monotonic with respect to $F$, and the zero-field extrapolation underestimates the experimental value by ~11%.</li>
<li><strong>Long-Time Tails</strong>: The decay of the heat flux autocorrelation function follows a $t^{-3/2}$ tail (consistent with mode-coupling theory), but the <strong>amplitude is ~6x larger</strong> than predicted.</li>
<li><strong>Phase Hysteresis</strong>: In high-density regions near the freezing line, the system exhibits hysteresis and bi-stability between solid and liquid phases depending on the field strength.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The simulation relies on the Lennard-Jones (LJ) potential to model Argon. No external training data is used; the &ldquo;data&rdquo; consists of the physical constants defining the system.</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value/Description</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Potential</strong></td>
          <td>$\Phi(q)=4(q^{-12}-q^{-6})$</td>
          <td>Standard LJ 12-6 potential</td>
      </tr>
      <tr>
          <td><strong>Cutoff</strong></td>
          <td>$r_c = 2.5$</td>
          <td>Truncated at 2.5 distance units</td>
      </tr>
      <tr>
          <td><strong>Comparison</strong></td>
          <td>Argon Experimental Data</td>
          <td>Sourced from NBS recommended values</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>The core algorithm is the <strong>Evans Homogeneous Heat Flow</strong> method. To reproduce this, one must implement the specific Equations of Motion (EOM) derived from linear response theory.</p>
<p><strong>Equations of Motion:</strong></p>
<p>The trajectories are generated by:
$$
\begin{aligned}
\dot{q}_i &amp;= \frac{p_i}{m} \\
\dot{p}_i &amp;= F_i^{\text{inter}} + (E_i - \bar{E})F(t) - \frac{1}{2}\sum_{j} F_{ij} \left(q_{ij} \cdot F(t)\right) + \frac{1}{2N} \sum_{j,k} F_{jk} \left(q_{jk} \cdot F(t)\right) - \alpha p_i
\end{aligned}
$$</p>
<p>Where:</p>
<ul>
<li>$F(t)$ is the fictitious external field driving heat flow.</li>
<li>$E_i$ is the instantaneous energy of particle $i$.</li>
<li>$\alpha$ is the <strong>Gaussian Thermostat multiplier</strong> (calculated at every step to strictly conserve kinetic energy/Temperature):
$$\alpha = \frac{\sum_i [\dots]_{\text{force terms}} \cdot p_i}{\sum_i p_i \cdot p_i}$$</li>
</ul>
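<p>The form of the multiplier follows from demanding that the kinetic energy stay fixed. A short derivation, assuming equal masses $m$ and writing $F_i^{\text{tot}}$ (a label introduced here) for every term of $\dot{p}_i$ except $-\alpha p_i$:</p>
<p>$$
\begin{aligned}
\frac{d}{dt}\sum_i \frac{p_i \cdot p_i}{2m} &amp;= \frac{1}{m}\sum_i p_i \cdot \left(F_i^{\text{tot}} - \alpha p_i\right) = 0 \\
\Rightarrow\quad \alpha &amp;= \frac{\sum_i F_i^{\text{tot}} \cdot p_i}{\sum_i p_i \cdot p_i}
\end{aligned}
$$</p>
<p>Because $\alpha$ is recomputed from the instantaneous forces and momenta, it acts as a feedback term, not a fixed friction coefficient, and it can take either sign.</p>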
<p><strong>Conductivity Calculation:</strong></p>
<p>The steady-state (zero-frequency) conductivity is obtained by extrapolating to zero field:
$$ \lambda = \lim_{F \to 0} \frac{J_Q}{FT} $$</p>
<p>The frequency-dependent conductivity relies on the heat-flux autocorrelation:
$$ \lambda(\omega) = \frac{V}{3k_B T^2} \int_0^\infty dt \, e^{i\omega t} \langle J_Q(t) \cdot J_Q(0) \rangle $$</p>
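<p>This transform is what links a $t^{-3/2}$ long-time tail to an $\omega^{1/2}$ term in the frequency-dependent conductivity. As a sketch, suppose the autocorrelation function behaves as $a\,t^{-3/2}$ beyond some time $t_0$ (both $a$ and $t_0$ are symbols introduced here). Substituting $u = \omega t$ in the tail&rsquo;s contribution to $\lambda(\omega) - \lambda(0)$ gives:</p>
<p>$$
\begin{aligned}
\int_{t_0}^{\infty} dt \, \left(e^{i\omega t} - 1\right) a\, t^{-3/2} &amp;= a\,\omega^{1/2}\int_{\omega t_0}^{\infty} du \, \left(e^{iu} - 1\right) u^{-3/2} \\
&amp;\approx -(1 - i)\sqrt{2\pi}\; a\,\omega^{1/2} \qquad (\omega \to 0)
\end{aligned}
$$</p>
<p>The coefficient of $\omega^{1/2}$ is therefore proportional to the tail amplitude $a$, so comparing that coefficient between simulation and theory is equivalent to comparing tail amplitudes.</p>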
<h3 id="models">Models</h3>
<p>The &ldquo;model&rdquo; here is the physical simulation setup.</p>
<ul>
<li><strong>Particle Count</strong>: $N = 108$ (primary), $N = 256$ (validation).</li>
<li><strong>Boundary Conditions</strong>: Periodic Boundary Conditions (PBC).</li>
<li><strong>Thermostat</strong>: Gaussian Isokinetic (Temperature is a constant of motion).</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<p>The primary metric is the <strong>Thermal Conductivity</strong> ($\lambda$).</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Definition</th>
          <th>Baseline</th>
          <th>Result</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Thermal Conductivity</strong></td>
          <td>Ratio of heat flux $J_Q$ to field $F$ (extrapolated to $F=0$)</td>
          <td>Experimental Argon (NBS Data)</td>
          <td>Good agreement away from critical point</td>
      </tr>
      <tr>
          <td><strong>Tail Amplitude</strong></td>
          <td>Coefficient of the $\omega^{1/2}$ term in frequency-dependent conductivity</td>
          <td>Mode-Coupling Theory ($\approx 0.05$)</td>
          <td>Simulation value $\approx 0.3$ (6x larger)</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Requirements</strong>: While 1986 hardware is obsolete, reproducing this requires a standard MD code capable of non-conservative forces (NEMD).</li>
<li><strong>Compute Cost</strong>: Low by modern standards. 108 particles for $\sim 10^5$ to $10^6$ steps is trivial on modern CPUs.</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Evans, D. J. (1986). Thermal conductivity of the Lennard-Jones fluid. <em>Physical Review A</em>, 34(2), 1449-1453. <a href="https://doi.org/10.1103/PhysRevA.34.1449">https://doi.org/10.1103/PhysRevA.34.1449</a></p>
<p><strong>Publication</strong>: Physical Review A, 1986</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{PhysRevA.34.1449,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Thermal conductivity of the Lennard-Jones fluid}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Evans, Denis J.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Phys. Rev. A}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{34}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{2}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{1449--1453}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">numpages</span> = <span style="color:#e6db74">{0}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1986}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = <span style="color:#e6db74">{Aug}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{American Physical Society}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1103/PhysRevA.34.1449}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span> = <span style="color:#e6db74">{https://link.aps.org/doi/10.1103/PhysRevA.34.1449}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Embedded-Atom Method: Theory and Applications Review</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method-review-1993/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method-review-1993/</guid><description>Comprehensive 1993 review of the Embedded-Atom Method (EAM), covering theory, parameterization, and applications to metallic systems.</description><content:encoded><![CDATA[<h2 id="systematizing-the-embedded-atom-method">Systematizing the Embedded-Atom Method</h2>
<p>This is a <strong>Systematization (Review)</strong> paper. It consolidates the theoretical development, semi-empirical parameterization, and broad applications of the Embedded-Atom Method (EAM) into a unified framework. The paper systematizes the field by connecting the EAM to related theories (Effective Medium Theory, Finnis-Sinclair, &ldquo;glue&rdquo; models) and organizing phenomenological results across diverse physical regimes (bulk, surfaces, interfaces).</p>
<p>The authors explicitly frame the work as a survey, stating &ldquo;We review here the history, development, and application of the EAM&rdquo; and &ldquo;This review emphasizes the physical insight that motivated the EAM.&rdquo; The paper follows a classic survey structure, organizing the literature by application domains.</p>
<h2 id="the-failure-of-pair-potentials-in-metallic-systems">The Failure of Pair Potentials in Metallic Systems</h2>
<p>The primary motivation is the failure of pair-potential models to accurately describe metallic bonding, particularly at defects and interfaces.</p>
<p><strong>Physics Gap</strong>: Pair potentials assume bond strength is independent of environment, implying cohesive energy scales linearly with coordination ($Z$), whereas in reality it scales roughly as $\sqrt{Z}$.</p>
<p><strong>Empirical Failures</strong>: Pair potentials enforce the &ldquo;Cauchy relation&rdquo; ($C_{12} = C_{44}$) and predict a vacancy formation energy equal to the cohesive energy; both contradict experimental data for fcc metals.</p>
<p><strong>Practical Need</strong>: First-principles calculations (like DFT) were computationally too expensive for low-symmetry systems like grain boundaries and fracture tips, creating a need for an efficient, semi-empirical many-body potential.</p>
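<p>A bond-counting sketch makes the first two failures concrete. It assumes an unrelaxed lattice in which each atom&rsquo;s energy depends only on its coordination $Z$; the per-bond strength $\varepsilon$ and the prefactor $A$ are symbols introduced here for illustration and do not come from the review. Creating a vacancy lowers the coordination of $Z$ neighbors by one (the removed atom is returned to a bulk-like site), so</p>
<p>$$E_{vac}^{f} \approx Z \left[ E(Z-1) - E(Z) \right]$$</p>
<p>For a pair potential, $E(Z) = -\tfrac{1}{2} Z \varepsilon$, which gives</p>
<p>$$E_{coh} = \tfrac{1}{2} Z \varepsilon, \qquad E_{vac}^{f} = \tfrac{1}{2} Z \varepsilon = E_{coh}$$</p>
<p>For a square-root dependence, $E(Z) = -A\sqrt{Z}$, the same counting with $Z = 12$ (fcc) gives</p>
<p>$$\frac{E_{vac}^{f}}{E_{coh}} = \frac{Z\left(\sqrt{Z} - \sqrt{Z-1}\right)}{\sqrt{Z}} \approx 0.51$$</p>
<p>Even this crude estimate moves the ratio well below one, toward the experimental values, which is the qualitative point behind the $\sqrt{Z}$ scaling.</p>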
<h2 id="theoretical-unification--core-innovations">Theoretical Unification &amp; Core Innovations</h2>
<p>The paper&rsquo;s core contribution is the synthesis of the EAM as a practical computational tool that captures &ldquo;coordination-dependent bond strength&rdquo; without the cost of ab initio methods.</p>
<p><strong>Theoretical Unification</strong>: It demonstrates that the EAM ansatz can be derived from Density Functional Theory (DFT) by assuming the total electron density is a superposition of atomic densities.</p>
<p><strong>Environmental Dependence</strong>: It explicitly formulates how the &ldquo;effective&rdquo; pair interaction stiffens and shortens as coordination decreases (e.g., at surfaces), a feature naturally arising from the non-linearity of the embedding function.</p>
<p><strong>Broad Validation</strong>: It provides a centralized evaluation of the method across a vast array of metallic properties, establishing it as the standard for atomistic simulations of face-centered cubic (fcc) metals.</p>
<h2 id="validating-eam-across-application-domains">Validating EAM Across Application Domains</h2>
<p>The authors review computational experiments using Energy Minimization, Molecular Dynamics (MD), and Monte Carlo (MC) simulations across several domains:</p>
<p><strong>Bulk Properties</strong>: Calculation of phonon spectra, liquid structure factors, thermal expansion coefficients, and melting points for fcc metals (Ni, Pd, Pt, Cu, Ag, Au).</p>
<p><strong>Defects</strong>: Computation of vacancy formation/migration energies and self-interstitial geometries.</p>
<p><strong>Grain Boundaries</strong>: Calculation of grain boundary structures, energies, and elastic properties for twist and tilt boundaries in Au and Al. Computed structures show good agreement with X-ray diffraction and HRTEM experiments. The many-body interactions in the EAM produce somewhat better agreement than pair potentials, which tend to overestimate boundary expansion.</p>
<p><strong>Surfaces</strong>: Analysis of surface energies, relaxations, reconstructions (e.g., Au(110) missing row), and surface phonons.</p>
<p><strong>Alloys</strong>: Investigation of heat of solution, surface segregation profiles (e.g., Ni-Cu), and order-disorder transitions.</p>
<p><strong>Mechanical Properties</strong>: Simulation of dislocation mobility, pinning by defects (He bubbles), and crack tip plasticity (ductile vs. brittle fracture modes).</p>
<h2 id="key-outcomes-and-the-limits-of-eam">Key Outcomes and the Limits of EAM</h2>
<p><strong>Many-Body Success</strong>: The EAM successfully reproduces the breakdown of the Cauchy relation and the correct ratio of vacancy formation energy to cohesive energy (~0.35) for fcc metals.</p>
<p><strong>Surface Accuracy</strong>: It correctly predicts that surface bonds are shorter and stiffer than bulk bonds due to lower coordination. It accurately predicts surface reconstructions (e.g., Au(110) $(1 \times 2)$).</p>
<p><strong>Alloy Behavior</strong>: The method naturally captures segregation phenomena, including oscillating concentration profiles in Ni-Cu, driven by the embedding energy.</p>
<p><strong>Limitations</strong>: The method is less accurate for systems with strong directional bonding (covalent materials) or significant Fermi-surface effects, as it assumes spherically averaged electron densities.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p><strong>Fitting Data</strong>: The semi-empirical functions are fitted to basic bulk properties: lattice constants, cohesive energy, elastic constants ($C_{11}$, $C_{12}$, $C_{44}$), and vacancy formation energy.</p>
<p><strong>Universal Binding Curve</strong>: The cohesive energy as a function of lattice constant is constrained to follow the &ldquo;universal binding curve&rdquo; of Rose et al. to ensure accurate anharmonic behavior.</p>
<p><strong>Alloy Data</strong>: For binary alloys, dilute heats of alloying are used for fitting cross-interactions.</p>
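<p>For reference, the universal binding curve of Rose et al. is usually written in the form below. This is a sketch from the general EAM literature, not a transcription of the review&rsquo;s equations; the symbols $a^{*}$, $a_0$ (equilibrium lattice constant), $B$ (bulk modulus), and $\Omega$ (equilibrium atomic volume) are named here as assumptions, and $E_{coh}$ denotes the positive equilibrium cohesive energy per atom.</p>
<p>$$E(a) = -E_{coh} \left( 1 + a^{*} \right) e^{-a^{*}}, \qquad a^{*} = \frac{a/a_0 - 1}{\sqrt{E_{coh} / (9 B \Omega)}}$$</p>
<p>At $a = a_0$ this recovers $E = -E_{coh}$ with a curvature fixed by $B$, while the exponential tail sets the anharmonic behavior away from equilibrium.</p>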
<h3 id="algorithms">Algorithms</h3>
<p><strong>Core Ansatz</strong>: The total energy is defined as:</p>
<p>$$E_{coh} = \sum_{i} G_i\left( \sum_{j \neq i} \rho_j^a(R_{ij}) \right) + \frac{1}{2} \sum_{i, j (j \neq i)} U_{ij}(R_{ij})$$</p>
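<p>where $G_i$ is the embedding energy of atom $i$, a function of the local host electron density obtained by summing the atomic densities $\rho_j^a$ of its neighbors, $U_{ij}$ is a pair interaction, and $R_{ij}$ is the distance between atoms $i$ and $j$.</p>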
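<p>A minimal sketch of how the ansatz above is evaluated, assuming a single atomic species and a small, non-periodic cluster. The functions <code>toy_density</code>, <code>toy_embed</code>, and <code>toy_pair</code> are hypothetical stand-ins chosen for readability; they are not any of the parameterizations tabulated in the review.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import math


def eam_energy(positions, density, embed, pair):
    """Total energy: sum_i G(rho_i) + 1/2 * sum over ordered pairs of U(R_ij)."""
    n = len(positions)
    total = 0.0
    for i in range(n):
        rho_i = 0.0   # host electron density at atom i
        pair_i = 0.0  # pair energy seen by atom i
        for j in range(n):
            if j == i:
                continue
            r = math.dist(positions[i], positions[j])
            rho_i += density(r)
            pair_i += pair(r)
        # The factor 1/2 compensates for visiting each pair twice.
        total += embed(rho_i) + 0.5 * pair_i
    return total


# Toy functions for illustration only (NOT from the review).
def toy_density(r):
    return math.exp(-2.0 * r)


def toy_embed(rho):
    return -math.sqrt(rho)


def toy_pair(r):
    return math.exp(-4.0 * r)


dimer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(eam_energy(dimer, toy_density, toy_embed, toy_pair))
</code></pre></div>
<p>The double loop costs $O(N^2)$ and is only meant to mirror the formula; production MD codes restrict the inner sum to neighbors within a cutoff radius.</p>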
<p><strong>Simulation Techniques</strong>:</p>
<ul>
<li><strong>Molecular Dynamics (MD)</strong>: Used for liquids, phonons, and fracture simulations.</li>
<li><strong>Monte Carlo (MC)</strong>: Used for phase diagrams and segregation profiles (e.g., approximately $10^5$ iterations per atom).</li>
<li><strong>Phonons</strong>: Calculated via the dynamical matrix derived from the force-constant tensor $K_{ij}$.</li>
<li><strong>Normal-Mode Analysis</strong>: Vibrational normal modes obtained by diagonalizing the dynamical matrix, feasible for unit cells of up to about 260 atoms.</li>
</ul>
<h3 id="models">Models</h3>
<p><strong>Parameterizations</strong>: The review lists several specific function sets developed by the authors (Table 2), including:</p>
<ul>
<li><strong>Daw and Baskes</strong>: For Ni, Pd, H (elemental metals and H in solution/on surfaces)</li>
<li><strong>Foiles</strong>: For Cu, Ag, Au, Ni, Pd, Pt (elemental metals)</li>
<li><strong>Foiles</strong>: For Cu, Ni (tailored for the Ni-Cu alloy system)</li>
<li><strong>Foiles, Baskes and Daw</strong>: For Cu, Ag, Au, Ni, Pd, Pt (dilute alloys)</li>
<li><strong>Daw, Baskes, Bisson and Wolfer</strong>: For Ni, H (fracture, dislocations, H embrittlement)</li>
<li><strong>Foiles and Daw</strong>: For Ni, Al (Ni-rich end of the Ni-Al alloy system)</li>
<li><strong>Daw</strong>: For Ni (calculated from first principles, not semi-empirical)</li>
<li><strong>Hoagland, Daw, Foiles and Baskes</strong>: For Al (elemental Al)</li>
</ul>
<p>Many of these historical parameterizations are directly downloadable in machine-readable formats from the NIST Interatomic Potentials Repository (linked in the resources below).</p>
<p><strong>Transferability</strong>: EAM functions are generally <em>not</em> transferable between different parameterization sets; mixing functions from different sets (e.g., Daw-Baskes Ni with Foiles Pd) is invalid.</p>
<h3 id="evaluation">Evaluation</h3>
<p><strong>Bulk Validation</strong>: Phonon dispersion curves for Cu show excellent agreement with experiment across the full Brillouin zone.</p>
<p><strong>Thermal Properties</strong>: Linear thermal expansion coefficients match experiment well (e.g., Cu calculated: $16.4 \times 10^{-6}\ \text{K}^{-1}$ vs. experimental: $16.7 \times 10^{-6}\ \text{K}^{-1}$).</p>
<p><strong>Defect Energetics</strong>: Vacancy migration energies and divacancy binding energies (~0.1-0.2 eV) align with experimental data.</p>
<p><strong>Surface Segregation</strong>: Correctly predicts segregation species for 18 distinct dilute alloy cases (e.g., Cu segregating in Ni).</p>
<h3 id="hardware">Hardware</h3>
<p><strong>Compute Scale</strong>: At the time of publication (1993), Molecular Dynamics simulations of up to 35,000 atoms were possible.</p>
<p><strong>Platforms</strong>: Calculations were performed on supercomputers like the <strong>CRAY-XMP</strong>, though smaller calculations were noted as feasible on high-performance workstations.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Daw, M. S., Foiles, S. M., &amp; Baskes, M. I. (1993). The embedded-atom method: a review of theory and applications. <em>Materials Science Reports</em>, 9(7-8), 251-310. <a href="https://doi.org/10.1016/0920-2307(93)90001-U">https://doi.org/10.1016/0920-2307(93)90001-U</a></p>
<p><strong>Publication</strong>: Materials Science Reports, 1993</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{dawEmbeddedatomMethodReview1993,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The embedded-atom method: a review of theory and applications}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">shorttitle</span> = <span style="color:#e6db74">{The Embedded-Atom Method}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Daw, Murray S. and Foiles, Stephen M. and Baskes, Michael I.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1993</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = mar,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Materials Science Reports}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{9}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{7-8}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{251--310}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{0920-2307}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1016/0920-2307(93)90001-U}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method/">Original EAM Paper (1984)</a></li>
<li><a href="/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method-voter-1994/">EAM User Guide (1994)</a></li>
<li><a href="https://www.ctcms.nist.gov/potentials/">NIST Interatomic Potentials Repository</a></li>
</ul>
]]></content:encoded></item><item><title>Embedded-Atom Method User Guide: Voter's 1994 Chapter</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method-voter-1994/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method-voter-1994/</guid><description>Comprehensive user guide for the Embedded-Atom Method (EAM), covering theory, potential fitting, and applications to intermetallics.</description><content:encoded><![CDATA[<h2 id="contribution-systematizing-the-embedded-atom-method">Contribution: Systematizing the Embedded-Atom Method</h2>
<p>This is a <strong>Systematization</strong> paper (specifically a handbook chapter) with a strong secondary <strong>Method</strong> component.</p>
<p>Its primary goal is to serve as a &ldquo;users&rsquo; guide&rdquo; to the Embedded-Atom Method (EAM). The text organizes existing knowledge:</p>
<ul>
<li>It traces the physical origins of EAM from Density Functional Theory (DFT) and Effective Medium Theory.</li>
<li>It synthesizes &ldquo;closely related methods&rdquo; (Second Moment Approximation, Glue Model), showing they are mathematically equivalent or very similar to EAM.</li>
<li>It provides a pedagogical, step-by-step methodology for fitting potentials to experimental data.</li>
</ul>
<h2 id="motivation-bridging-the-gap-between-dft-and-pair-potentials">Motivation: Bridging the Gap Between DFT and Pair Potentials</h2>
<p>The primary motivation is to bridge the gap between accurate, expensive electronic structure calculations and fast, inaccurate pair potentials.</p>
<ul>
<li><strong>Computational Efficiency</strong>: First-principles methods scale as $O(N^3)$ or worse, limiting simulations to $&lt;100$ atoms (in 1994). Pair potentials scale as $O(N)$ but fail to capture essential many-body physics of metals.</li>
<li><strong>Physical Accuracy</strong>: Simple pair potentials cannot accurately model metallic defects; they predict zero Cauchy pressure ($C_{12} - C_{44} = 0$) and equate vacancy formation energy to cohesive energy, both of which are incorrect for transition metals.</li>
<li><strong>Practical Utility</strong>: There was a need for a clear guide on how to construct and apply these potentials for large-scale simulations ($10^6+$ atoms) of fracture and defects.</li>
</ul>
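<p>The Cauchy-pressure failure can be traced to the curvature of the embedding function. As a sketch, for a monatomic cubic crystal at equilibrium (an assumption, along with the symbols $\Omega_0$ for the atomic volume and $\mathbf{a}_m$ for the neighbor vectors, which are introduced here), differentiating an EAM energy of the form $F(\bar{\rho}) + \tfrac{1}{2}\sum \phi$ twice with respect to strain gives</p>
<p>$$C_{12} - C_{44} = \frac{F''(\bar{\rho})}{\Omega_0} \left[ \sum_{m} \rho'(a_m) \frac{(a_m^{x})^2}{a_m} \right]^2$$</p>
<p>A pure pair potential corresponds to a linear $F$, so $F'' = 0$ and the Cauchy relation is forced. Any embedding function with nonzero curvature breaks it, and $F'' &gt; 0$ yields the positive Cauchy pressure observed in most fcc metals.</p>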
<h2 id="novelty-a-unified-framework-and-robust-fitting-recipe">Novelty: A Unified Framework and Robust Fitting Recipe</h2>
<p>As a review chapter, the novelty lies in the synthesis and the specific, reproducible recipe for potential construction. Central to this synthesis is the core EAM energy functional:</p>
<p>$$E_{\text{tot}} = \sum_i \left( F(\bar{\rho}_i) + \frac{1}{2} \sum_{j \neq i} \phi(r_{ij}) \right)$$</p>
<p>where the total energy $E_{\text{tot}}$ depends on embedding an atom $i$ into a local background electron density $\bar{\rho}_i = \sum_{j \neq i} \rho(r_{ij})$, plus a repulsive pair interaction $\phi(r_{ij})$.</p>
<ul>
<li><strong>Unified Framework</strong>: It explicitly maps the &ldquo;Second Moment Approximation&rdquo; (Tight Binding) and the &ldquo;Glue Model&rdquo; onto the fundamental EAM framework above, clarifying that they differ primarily in terminology or specific functional choices (e.g., square root embedding functions).</li>
<li><strong>Cross-Potential Fitting Recipe</strong>: It details a robust method for fitting alloy potentials (specifically Ni-Al-B) by using &ldquo;transformation invariance&rdquo;, scaling the density and shifting the embedding function to fit alloy properties without disturbing pure element fits.</li>
<li><strong>Specific Parameters</strong>: It publishes optimized potential parameters for Ni, Al, and B that accurately reproduce properties like the Boron interstitial preference in $\text{Ni}_3\text{Al}$.</li>
</ul>
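<p>The invariance behind the cross-potential recipe can be checked directly from the single-element energy above. In the sketch below, the companion changes to $\phi$ and to the argument of $F$ are the standard choices that keep the pure-element energy fixed; they are stated here as assumptions rather than quoted from the chapter. Primes denote transformed functions (not derivatives), and $g$ and $s$ are arbitrary constants. Shifting the embedding function,</p>
<p>$$F'(\bar{\rho}) = F(\bar{\rho}) + g\bar{\rho}, \qquad \phi'(r) = \phi(r) - 2g\rho(r)$$</p>
<p>changes the total energy by</p>
<p>$$\sum_i g\bar{\rho}_i + \frac{1}{2} \sum_i \sum_{j \neq i} \left[ -2g\rho(r_{ij}) \right] = g \sum_i \sum_{j \neq i} \rho(r_{ij}) - g \sum_i \sum_{j \neq i} \rho(r_{ij}) = 0$$</p>
<p>Scaling the density works the same way: with</p>
<p>$$\rho'(r) = s\rho(r), \qquad F'(\bar{\rho}) = F(\bar{\rho}/s)$$</p>
<p>every background density becomes $s\bar{\rho}_i$ and the transformed embedding function returns $F(\bar{\rho}_i)$ unchanged. In an alloy, an atom of one element sits in density contributed by the other, so the cancellation no longer holds and $g$ and $s$ become genuine fitting parameters for the cross-interactions.</p>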
<h2 id="validation-computational-benchmarks-and-simulations">Validation: Computational Benchmarks and Simulations</h2>
<p>The &ldquo;experiments&rdquo; described are computational validations and simulations using the fitted Ni-Al-B potential:</p>
<ol>
<li>
<p><strong>Potential Fitting</strong>:</p>
<ul>
<li>Pure elements (Ni, Al) were fitted to elastic constants, vacancy formation energies, and diatomic data. The Ni fit achieved $\chi_{\text{rms}} = 0.75\%$ and the Al fit achieved $\chi_{\text{rms}} = 3.85\%$.</li>
<li>Boron was fitted using hypothetical crystal structures (fcc, bcc) calculated via LMTO (Linear Muffin-Tin Orbital) since experimental data for fcc B does not exist.</li>
</ul>
</li>
<li>
<p><strong>Molecular Statics (Validation)</strong>:</p>
<ul>
<li><strong>Surface Relaxation</strong>: Demonstrated that EAM captures the oscillatory relaxation of atomic layers near a free surface, a many-body effect that pair potentials fail to capture.</li>
<li><strong>Defect Energetics</strong>: Calculated formation energies for Boron interstitials in $\text{Ni}_3\text{Al}$. The 6Ni octahedral site is the most stable ($-4.59$ eV relative to an isolated B atom and unperturbed crystal), followed by the 4Ni-2Al octahedral site ($-3.65$ eV) and the 3Ni-1Al tetrahedral site ($-2.99$ eV), consistent with channeling experiments.</li>
</ul>
</li>
<li>
<p><strong>Molecular Dynamics (Application)</strong>:</p>
<ul>
<li><strong>Grain Boundary (GB) Cleavage</strong>: Simulated the fracture of a (210) tilt grain boundary in $\text{Ni}_3\text{Al}$ at a strain rate of $5 \times 10^{10}$ s$^{-1}$.</li>
<li><strong>Comparison</strong>: Compared pure $\text{Ni}_3\text{Al}$ boundaries vs. those doped with Boron and substitutional Nickel.</li>
</ul>
</li>
</ol>
<h2 id="key-outcomes-eam-efficiency-and-boron-strengthening">Key Outcomes: EAM Efficiency and Boron Strengthening</h2>
<ul>
<li><strong>EAM Efficiency</strong>: Confirmed that EAM scales linearly with atom count ($N$), requiring only 2-5 times the computational work of pair potentials.</li>
<li><strong>Boron Strengthening Mechanism</strong>: The simulations suggested that Boron segregates to grain boundaries and, specifically when co-segregated with Ni, significantly increases cohesion.
<ul>
<li>The maximum stress for the enriched boundary was approximately 22 GPa, compared to approximately 19 GPa for the clean boundary.</li>
<li>The B-doped boundary required approximately 44% more work to cleave than the undoped boundary.</li>
<li>The fracture mode shifted from cleaving along the GB to failure in the bulk.</li>
</ul>
</li>
<li><strong>Grain Boundary Segregation</strong>: Molecular statics calculations found B interstitial energies at the GB as low as $-6.9$ eV, compared to $-4.59$ eV in the bulk, consistent with experimental observations of boron segregation to grain boundaries.</li>
<li><strong>Limitations</strong>: The author concludes that while EAM is excellent for metals, it lacks the angular dependence required for strongly covalent materials (like $\text{MoSi}_2$) or directional bonding.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<p>The chapter provides nearly all details required to implement the described potential from scratch.</p>
<h3 id="data">Data</h3>
<ul>
<li><strong>Experimental/Reference Data</strong>: Used for fitting the cost function $\chi_{\text{rms}}$.
<ul>
<li><strong>Pure Elements</strong>: Lattice constants ($a_0$), cohesive energy ($E_{\text{coh}}$), bulk modulus ($B$), elastic constants ($C_{11}, C_{12}, C_{44}$), vacancy formation energy ($E_{\text{vac}}^f$), and diatomic bond length/strength ($R_e, D_e$).</li>
<li><strong>Alloys</strong>: Heat of solution and defect energies (APB, SISF) for $\text{Ni}_3\text{Al}$.</li>
<li><strong>Hypothetical Data</strong>: LMTO first-principles data used for unobserved phases (e.g., fcc Boron, B2 NiB) to constrain the fit.</li>
</ul>
</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Component Functions</strong>:
<ul>
<li><strong>Pair Potential $\phi(r)$</strong>: Morse potential form:
$$\phi(r) = D_M \left\{ 1 - \exp[-\alpha_M(r - R_M)] \right\}^2 - D_M$$</li>
<li><strong>Density Function $\rho(r)$</strong>: Modified hydrogenic 4s orbital:
$$\rho(r) = r^6(e^{-\beta r} + 2^9 e^{-2\beta r})$$</li>
<li><strong>Embedding Function $F(\bar{\rho})$</strong>: Derived numerically to force the crystal energy to match the &ldquo;Universal Energy Relation&rdquo; (Rose et al.) as a function of lattice constant.</li>
</ul>
</li>
<li><strong>Fitting Strategy</strong>:
<ul>
<li><strong>Smooth Cutoff</strong>: A polynomial smoothing function ($h_{\text{smooth}}$) applied at $r_{\text{cut}}$ to ensure continuous derivatives.</li>
<li><strong>Simplex Algorithm</strong>: Used to optimize parameters ($D_M, R_M, \alpha_M, \beta, r_{\text{cut}}$).</li>
<li><strong>Alloy Invariance</strong>: Used transformations $F'(\rho) = F(\rho) + g\rho$ and $\rho'(r) = s\rho(r)$ to fit cross-potentials without altering pure-element properties.</li>
</ul>
</li>
</ul>
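<p>The two analytic component functions listed above translate directly into code. The sketch below is illustrative only: $D_M$ and $\alpha_M$ are the Ni values quoted in these notes, while <code>R_M</code> and <code>BETA</code> are hypothetical placeholders because their fitted values are not quoted here. The cutoff smoothing and the numerically derived embedding function are omitted.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import math


def morse_pair(r, d_m, alpha_m, r_m):
    """Morse pair potential: D_M * (1 - exp(-alpha_M * (r - R_M)))**2 - D_M."""
    return d_m * (1.0 - math.exp(-alpha_m * (r - r_m))) ** 2 - d_m


def density_4s(r, beta):
    """Density function: r**6 * (exp(-beta * r) + 2**9 * exp(-2 * beta * r))."""
    return r ** 6 * (math.exp(-beta * r) + 2 ** 9 * math.exp(-2.0 * beta * r))


D_M = 1.5335      # eV, Ni value quoted in these notes
ALPHA_M = 1.7728  # 1/Angstrom, Ni value quoted in these notes
R_M = 2.0         # Angstrom, HYPOTHETICAL placeholder (not quoted in these notes)
BETA = 3.0        # 1/Angstrom, HYPOTHETICAL placeholder (not quoted in these notes)

# At r = R_M the Morse form sits at the bottom of its well, -D_M.
print(morse_pair(R_M, D_M, ALPHA_M, R_M))
print(density_4s(2.5, BETA))
</code></pre></div>
<p>Subtracting $D_M$ places the zero of energy at infinite separation, so the pair term decays to zero at large $r$ even before the smooth cutoff is applied.</p>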
<h3 id="models">Models</h3>
<ul>
<li><strong>Parameters</strong>: The text provides the exact optimized parameters for the Ni-Al-B potential in <strong>Table 2</strong> (Pure elements) and <strong>Table 5</strong> (Cross-potentials).
<ul>
<li>Example Ni parameters: $D_M=1.5335$ eV, $\alpha_M=1.7728$ Å$^{-1}$, $r_{\text{cut}}=4.7895$ Å.</li>
</ul>
</li>
</ul>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>1994 Context</strong>: The chapter notes that simulations of $10^6$ atoms were possible on the &ldquo;fastest computers available&rdquo;.</li>
<li><strong>Scaling</strong>: Explicitly notes computational work scales as $O(N)$, roughly 2-5x slower than pair potentials.</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Voter, A. F. (1994). Chapter 4: The Embedded-Atom Method. In <em>Intermetallic Compounds: Vol. 1, Principles</em>, edited by J. H. Westbrook and R. L. Fleischer. John Wiley &amp; Sons Ltd.</p>
<p><strong>Publication</strong>: Intermetallic Compounds: Vol. 1, Principles (1994)</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@incollection</span>{voterEmbeddedAtomMethod1994,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The Embedded-Atom Method}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Voter, Arthur F.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{Intermetallic Compounds: Vol. 1, Principles}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">editor</span> = <span style="color:#e6db74">{Westbrook, J. H. and Fleischer, R. L.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1994}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{John Wiley &amp; Sons Ltd}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{77--90}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">chapter</span> = <span style="color:#e6db74">{4}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://www.ctcms.nist.gov/potentials/">NIST Interatomic Potentials Repository</a> (Modern repository often hosting EAM files)</li>
<li><a href="/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method/">Original EAM Paper (1984)</a></li>
<li><a href="/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method-review-1993/">EAM Review (1993)</a></li>
</ul>
]]></content:encoded></item><item><title>Dynamical Corrections to TST for Surface Diffusion</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/surface-science/self-diffusion-lj-fcc111-1989/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/surface-science/self-diffusion-lj-fcc111-1989/</guid><description>Application of dynamical corrections formalism to TST for LJ surface diffusion, revealing bounce-back recrossings at low T.</description><content:encoded><![CDATA[<h2 id="bridging-md-and-tst-for-surface-diffusion">Bridging MD and TST for Surface Diffusion</h2>
<p>This is primarily a <strong>Methodological Paper</strong> with a secondary contribution in <strong>Discovery</strong>.</p>
<p>The authors&rsquo; primary goal is to demonstrate the validity of the &ldquo;dynamical corrections formalism&rdquo; for calculating diffusion constants. They validate this by reproducing Molecular Dynamics (MD) results at high temperatures and then extending the method into low-temperature regimes where MD is infeasible.</p>
<p>By applying this method, they uncover a specific physical phenomenon, &ldquo;bounce-back recrossings&rdquo;, that causes a dip in the diffusion coefficient at low temperatures, a detail previously unobserved.</p>
<h2 id="timescale-limits-in-molecular-dynamics">Timescale Limits in Molecular Dynamics</h2>
<p>The authors aim to solve the timescale problem in simulating surface diffusion.</p>
<p><strong>Limit of MD</strong>: Molecular Dynamics (MD) is effective at high temperatures but becomes computationally infeasible at low temperatures because the time between diffusive hops increases drastically.</p>
<p><strong>Limit of TST</strong>: Standard Transition State Theory (TST) can handle long timescales but assumes all barrier crossings are successful, ignoring correlated dynamical events like immediate recrossings or multiple jumps.</p>
<p><strong>Goal</strong>: They seek to apply a formalism that corrects TST using short-time trajectory data, allowing for accurate calculation of diffusion constants across the entire temperature range.</p>
<h2 id="the-bounce-back-mechanism">The Bounce-Back Mechanism</h2>
<p>The core novelty is the rigorous application of the dynamical corrections formalism to a multi-site system (fcc/hcp sites) to characterize non-Arrhenius behavior at low temperatures.</p>
<p><strong>Unified Approach</strong>: They demonstrate that this method works for all temperatures, bridging the gap between the &ldquo;rare-event regime&rdquo; and the high-temperature regime dominated by fluid-like motion.</p>
<p><strong>Bounce-back Mechanism</strong>: They identify a specific &ldquo;dip&rdquo; in the dynamical correction factor ($f_d &lt; 1$) at low temperatures ($T \approx 0.038$), attributed to trajectories where the adatom collides with a substrate atom on the far side of the binding site and immediately recrosses the dividing surface.</p>
<h2 id="simulating-the-lennard-jones-fcc111-surface">Simulating the Lennard-Jones fcc(111) Surface</h2>
<p>The authors performed computational experiments on a Lennard-Jones fcc(111) surface cluster.</p>
<p><strong>System Setup</strong>: A single adatom on a 3-layer substrate (30 atoms/layer) with periodic boundary conditions.</p>
<p><strong>Baselines</strong>: They compared their high-temperature results against standard Molecular Dynamics simulations to validate the method.</p>
<p><strong>Ablation of Substrate Freedom</strong>: They ran a control experiment with a 6-layer substrate (top 3 free, 800 trajectories) to confirm the bounce-back effect persisted independently of the fixed deep layers, obtaining $D/D^{TST} = 0.75 \pm 0.06$, consistent with the original result.</p>
<p><strong>Trajectory Analysis</strong>: They analyzed the angular distribution of initial momenta to characterize the specific geometry of the bounce-back trajectories. Bounce-back trajectories were more strongly peaked at $\phi = 90°$ (perpendicular to the TST gate), confirming the effect arises from interaction with the substrate atom directly across the binding site.</p>
<p><strong>Temperature Range</strong>: The full calculation spanned $0.013 \leq T \leq 0.383$ in reduced units, bridging the rare-event regime and the high-temperature fluid-like regime.</p>
<h2 id="resolving-non-arrhenius-behavior">Resolving Non-Arrhenius Behavior</h2>
<p><strong>Arrhenius Behavior of TST</strong>: The uncorrected TST diffusion constant ($D^{TST}$) followed a near-perfect Arrhenius law, with a linear least-squares fit of $\ln(D^{TST}) = -1.8 - 0.30/T$.</p>
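<p>To give a feel for the numbers, the fitted Arrhenius law can be evaluated across the temperature range studied. This is a small sketch in reduced units; the function names are illustrative, and no dynamical correction values are assumed (the caller supplies $D/D^{TST}$).</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import math


def d_tst(temperature):
    """Arrhenius fit from the notes: ln(D_TST) = -1.8 - 0.30 / T (reduced units)."""
    return math.exp(-1.8 - 0.30 / temperature)


def corrected_d(temperature, correction):
    """D = D_TST * (D / D_TST); the correction factor is supplied by the caller."""
    return d_tst(temperature) * correction


# Lowest, dip, and highest temperatures mentioned in the notes.
for t in (0.013, 0.038, 0.383):
    print(t, d_tst(t))

# Dynamic range of the TST diffusion constant over the full temperature span.
print(d_tst(0.383) / d_tst(0.013))
</code></pre></div>
<p>The fit implies that $D^{TST}$ varies by more than nine orders of magnitude between the two ends of the range, which is why direct MD cannot reach the low-temperature regime.</p>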
<p><strong>High-Temperature Correction</strong>: At high T, the dynamical correction factor $D/D^{TST} &gt; 1$, indicating correlated multiple forward jumps (long flights).</p>
<p><strong>Low-Temperature Dip</strong>: At low T, $D/D^{TST} &lt; 1$ for $T = 0.013, 0.026, 0.038, 0.051$ (minimum at $T = 0.038$), caused by the bounce-back mechanism.</p>
<p><strong>Validation</strong>: The method successfully reproduced high-T literature values while providing access to low-T dynamics inaccessible to direct MD.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The paper does not use external datasets but generates simulation data based on the Lennard-Jones potential.</p>
<table>
  <thead>
      <tr>
          <th>Type</th>
          <th>Parameter</th>
          <th>Value</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Potential</strong></td>
          <td>$\epsilon, \sigma$</td>
          <td>1.0 (Reduced units)</td>
          <td>Standard Lennard-Jones 6-12</td>
      </tr>
      <tr>
          <td><strong>Cutoff</strong></td>
          <td>Spline</td>
          <td>$r_1=1.5\sigma, r_2=2.5\sigma$</td>
          <td>5th-order spline smooths potential to 0 at $r_2$</td>
      </tr>
      <tr>
          <td><strong>Geometry</strong></td>
          <td>Lattice Constant</td>
          <td>$a_0 = 1.549$</td>
          <td>Minimum energy for this potential</td>
      </tr>
      <tr>
          <td><strong>Cluster</strong></td>
          <td>Size</td>
          <td>3 layers, 30 atoms/layer</td>
          <td>Periodic boundary conditions parallel to surface</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>The diffusion constant $D$ is calculated as $D = D^{TST} \times (D/D^{TST})$.</p>
<p><strong>1. TST Rate Calculation ($D^{TST}$)</strong></p>
<ul>
<li><strong>Method</strong>: Monte Carlo integration of the flux through the dividing surface.</li>
<li><strong>Technique</strong>: Calculate free energy difference between the entire binding site and the TST dividing region.</li>
<li><strong>Dividing Surface</strong>: Defined geometrically with respect to equilibrium substrate positions (honeycomb boundaries around fcc/hcp sites).</li>
</ul>
<p><strong>2. Dynamical Correction Factor ($D/D^{TST}$)</strong></p>
<p>The dynamical correction factor $f_d$ is estimated from $N$ trajectories started in the TST boundary region, with starting positions drawn from a Metropolis walk restricted to that region:</p>
<p>$$
\begin{aligned}
f_d(i\rightarrow j) = \frac{2}{N}\sum_{I=1}^{N}\eta_{ij}(I)
\end{aligned}
$$</p>
<ul>
<li><strong>Initialization</strong>:
<ul>
<li><strong>Position</strong>: Sampled via Metropolis walk restricted to the TST boundary region.</li>
<li><strong>Momentum</strong>: Maxwellian distribution for parallel components; Maxwellian-flux distribution for normal component.</li>
<li><strong>Symmetry</strong>: Trajectories entering hcp sites are generated by reversing momenta of those entering fcc sites.</li>
</ul>
</li>
<li><strong>Integration</strong>:
<ul>
<li><strong>Integrator</strong>: Adams-Bashforth-Moulton predictor-corrector formulas of orders 1 through 12.</li>
<li><strong>Duration</strong>: Integrated until time $t &gt; \tau_{corr}$ ($\tau_{corr} \approx 13$ reduced time units).</li>
<li><strong>Sample Size</strong>: 1400 trajectories per temperature point (700 initially entering each type of site).</li>
</ul>
</li>
</ul>
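<p>The momentum initialization and the final average can be sketched as follows. This is a simplified illustration that treats only three adatom momentum components; the function names are hypothetical, and the per-trajectory weights $\eta_{ij}(I)$ are taken as given inputs, since their definition comes from the paper. The flux-weighted normal component has density proportional to $p\,e^{-p^2/2mk_BT}$ for $p &gt; 0$, which is a Rayleigh distribution and can be sampled by inversion.</p>

```python
import math
import random


def sample_momentum(rng, kT=1.0, mass=1.0):
    """Draw (p_parallel_1, p_parallel_2, p_normal) for one trajectory start."""
    sigma = math.sqrt(mass * kT)
    # Maxwellian (Gaussian) components parallel to the dividing surface.
    p_par1 = rng.gauss(0.0, sigma)
    p_par2 = rng.gauss(0.0, sigma)
    # Maxwellian-flux (Rayleigh) component normal to the surface, always > 0.
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    p_normal = sigma * math.sqrt(-2.0 * math.log(u))
    return p_par1, p_par2, p_normal


def correction_factor(eta):
    """f_d(i -> j) = (2 / N) * sum of the per-trajectory weights eta_ij(I)."""
    return 2.0 * sum(eta) / len(eta)


rng = random.Random(0)
samples = [sample_momentum(rng) for _ in range(20000)]
mean_normal = sum(s[2] for s in samples) / len(samples)
print(f"mean normal momentum = {mean_normal:.3f}")
```

<p>Reversing the sampled momenta, as the symmetry rule above describes, yields the partner trajectory without a second Metropolis sample.</p>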
<h3 id="models">Models</h3>
<ul>
<li><strong>System</strong>: Single component Lennard-Jones solid (Argon-like).</li>
<li><strong>Adsorbate</strong>: Single adatom on fcc(111) surface.</li>
<li><strong>Substrate Flexibility</strong>: Adatom plus top layer atoms are free to move. Layers 2 and 3 are fixed. (Validation run used 6 layers with top 3 free).</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<p>The primary metric is the Diffusion Constant $D$, analyzed via the Dynamical Correction Factor.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Value</th>
          <th>Baseline</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Slope ($E_a$)</strong></td>
          <td>0.30</td>
          <td>0.303 fcc / 0.316 hcp (Newton-Raphson)</td>
          <td>TST slope in good agreement with static barrier height.</td>
      </tr>
      <tr>
          <td><strong>$D/D^{TST}$ (Low T)</strong></td>
          <td>$0.82 \pm 0.04$</td>
          <td>1.0 (TST)</td>
          <td>At $T=0.038$. Indicates 18% reduction due to recrossing.</td>
      </tr>
      <tr>
          <td><strong>$D/D^{TST}$ (High T)</strong></td>
          <td>$&gt; 1.0$</td>
          <td>MD Literature</td>
          <td>Increases with T due to multiple jumps.</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<p>Specific hardware configurations (e.g., node architectures, supercomputers) or training times were not specified in the original publication, which is typical for 1989 literature. Modern open-source MD engines (e.g., LAMMPS, ASE) could perform identical Lennard-Jones molecular dynamics integrations in negligible time on any consumer workstation.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Cohen, J. M., &amp; Voter, A. F. (1989). Self-diffusion on the Lennard-Jones fcc(111) surface: Effects of temperature on dynamical corrections. <em>The Journal of Chemical Physics</em>, 91(8), 5082-5086. <a href="https://doi.org/10.1063/1.457599">https://doi.org/10.1063/1.457599</a></p>
<p><strong>Publication</strong>: The Journal of Chemical Physics 1989</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{cohenSelfDiffusionLennard1989,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Self-diffusion on the {{Lennard}}-{{Jones}} Fcc(111) Surface: {{Effects}} of Temperature on Dynamical Corrections}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">shorttitle</span> = <span style="color:#e6db74">{Self-diffusion on the {{Lennard}}-{{Jones}} Fcc(111) Surface}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Cohen, J. M. and Voter, A. F.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1989}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = oct,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{The Journal of Chemical Physics}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{91}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{8}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{5082--5086}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{0021-9606, 1089-7690}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1063/1.457599}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">langid</span> = <span style="color:#e6db74">{english}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Correlations in the Motion of Atoms in Liquid Argon</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/correlations-motion-atoms-liquid-argon/</link><pubDate>Sat, 13 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/correlations-motion-atoms-liquid-argon/</guid><description>Rahman's 1964 MD simulation of 864 argon atoms with Lennard-Jones potential revealed the cage effect and validated classical molecular dynamics for liquids.</description><content:encoded><![CDATA[<h2 id="contribution-methodological-validation-of-md">Contribution: Methodological Validation of MD</h2>
<p>This is the archetypal <strong>Method</strong> paper (dominant classification with secondary <strong>Theory</strong> contribution). It establishes the architectural validity of Molecular Dynamics (MD) as a scientific tool. Rahman answers the question: &ldquo;Can a digital computer solving classical difference equations faithfully represent a physical liquid?&rdquo;</p>
<p>The paper utilizes specific rhetorical indicators of a methodological contribution:</p>
<ul>
<li><strong>Algorithmic Explication</strong>: A dedicated Appendix details the predictor-corrector difference equations.</li>
<li><strong>Validation against Ground Truth</strong>: Extensive comparison of calculated diffusion constants and pair-correlation functions against experimental neutron and X-ray scattering data.</li>
<li><strong>Robustness Checks</strong>: Ablation studies on the numerical integration stability (one vs. two corrector cycles).</li>
</ul>
<h2 id="motivation-bridging-neutron-scattering-and-many-body-theory">Motivation: Bridging Neutron Scattering and Many-Body Theory</h2>
<p>In the early 1960s, neutron scattering data provided insights into the dynamic structure of liquids, but theorists lacked concrete models to explain the observed two-body dynamical correlations. Analytic theories were limited by the difficulty of the many-body problem.</p>
<p>Rahman sought to bypass these analytical bottlenecks by assuming that <strong>classical dynamics</strong> with a simple 2-body potential (Lennard-Jones) could sufficiently describe the motion of atoms in liquid argon. The goal was to generate &ldquo;experimental&rdquo; data via simulation to test theoretical models (like the Vineyard convolution approximation) and provide a microscopic understanding of diffusion.</p>
<h2 id="core-innovation-system-stability-and-the-cage-effect">Core Innovation: System Stability and the Cage Effect</h2>
<p>This paper is widely considered the birth of modern molecular dynamics for continuous potentials. Its key novelties include:</p>
<ol>
<li><strong>System Size &amp; Stability</strong>: Successfully simulating 864 particles interacting via a continuous Lennard-Jones potential with stable temperature over the full simulation duration (approximately $10^{-11}$ sec, as confirmed by Table I in the paper).</li>
<li><strong>The &ldquo;Cage Effect&rdquo;</strong>: The discovery that the velocity autocorrelation function becomes negative after a short time:
$$ \langle \mathbf{v}(0) \cdot \mathbf{v}(t) \rangle &lt; 0 \quad \text{for } t &gt; 0.33 \times 10^{-12} \text{ s} $$
This proved that atoms in a liquid &ldquo;rattle&rdquo; against the cage of their nearest neighbors.</li>
<li><strong>Delayed Convolution</strong>: Proposing an improvement to the Vineyard approximation for the distinct Van Hove function $G_d(r,t)$ by introducing a time-delayed convolution to account for the persistence of local structure. Instead of convolving $g(r)$ with $G_s(r,t)$ at the same time $t$, Rahman convolves at a delayed time $t' &lt; t$, using a one-parameter function with $\tau = 1.0 \times 10^{-12}$ sec. This makes $G_d(r,t)$ decay as $t^4$ at short times (instead of $t^2$ in the Vineyard approximation) and as $t$ at long times.</li>
</ol>
<h2 id="methodology-simulating-864-argon-atoms">Methodology: Simulating 864 Argon Atoms</h2>
<p>Rahman performed a &ldquo;computer experiment&rdquo; (simulation) of <strong>Liquid Argon</strong>:</p>
<ul>
<li><strong>System</strong>: 864 particles in a cubic box of side $L=10.229\sigma$.</li>
<li><strong>Conditions</strong>: Temperature $94.4^\circ$K, Density $1.374 \text{ g cm}^{-3}$.</li>
<li><strong>Interaction</strong>: Lennard-Jones potential, truncated at $R=2.25\sigma$.</li>
<li><strong>Time Step</strong>: $\Delta t = 10^{-14}$ s (780 steps total, covering approximately $7.8 \times 10^{-12}$ s).</li>
<li><strong>Output Analysis</strong>:
<ul>
<li>Radial distribution function $g(r)$.</li>
<li>Mean square displacement $\langle r^2 \rangle$.</li>
<li>Velocity autocorrelation function $\langle v(0)\cdot v(t) \rangle$.</li>
<li>Van Hove space-time correlation functions $G_s(r,t)$ and $G_d(r,t)$.</li>
</ul>
</li>
</ul>
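<p>Of the quantities listed above, the velocity autocorrelation function is the one behind the cage effect. A minimal sketch of how it can be estimated from stored velocities is shown below; the data layout and function name are assumptions for illustration, and the toy input is a single oscillating &ldquo;atom&rdquo; whose velocity reverses, not argon data.</p>

```python
import math


def vacf(velocities, max_lag):
    """Average of v(t0) . v(t0 + lag) over atoms and time origins.

    velocities[t][i] is the (vx, vy, vz) tuple of atom i at frame t.
    """
    n_frames = len(velocities)
    n_atoms = len(velocities[0])
    result = []
    for lag in range(max_lag + 1):
        total = 0.0
        count = 0
        for t0 in range(n_frames - lag):
            for i in range(n_atoms):
                a = velocities[t0][i]
                b = velocities[t0 + lag][i]
                total += a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
                count += 1
        result.append(total / count)
    return result


# Toy trajectory: one particle rattling back and forth along x.
frames = [[(math.cos(0.1 * t), 0.0, 0.0)] for t in range(200)]
c = vacf(frames, max_lag=40)
print(f"C(0) = {c[0]:.3f}, C(31) = {c[31]:.3f}")
```

<p>For the toy input the correlation turns negative once the particle has reversed direction, which is the same signature Rahman reports for atoms rebounding from their neighbors.</p>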
<h2 id="results-validation-and-non-gaussian-diffusion-analysis">Results: Validation and Non-Gaussian Diffusion Analysis</h2>
<ul>
<li><strong>Validation</strong>: The calculated pair-distribution function $g(r)$ agreed well with X-ray scattering data from Eisenstein and Gingrich (at $91.8^\circ$K). The self-diffusion constant $D = 2.43 \times 10^{-5} \text{ cm}^2 \text{ sec}^{-1}$ at $94.4^\circ$K matched the experimental value from Naghizadeh and Rice at $90^\circ$K and the same density ($1.374 \text{ g cm}^{-3}$).</li>
<li><strong>Dynamics</strong>: The velocity autocorrelation has a negative region, contradicting simple exponential decay models (Langevin). Its frequency spectrum $f(\omega)$ shows a broad maximum at $\omega \approx 0.25 (k_BT/\hbar)$, reminiscent of solid-like behavior.</li>
<li><strong>Non-Gaussian Behavior</strong>: The self-diffusion function $G_s(r,t)$ attains its maximum departure from a Gaussian shape at about $t \approx 3.0 \times 10^{-12}$ s (with $\langle r^4 \rangle$ departing from its Gaussian value by about 13%), returning to Gaussian form by $\sim 10^{-11}$ s. At that time, the rms displacement ($3.8$ Å) is close to the first-neighbor distance ($3.7$ Å). This indicates that Fickian diffusion is an asymptotic limit and does not apply at short times.</li>
<li><strong>Fourier Transform Validation</strong>: The Fourier transform of $g(r)$ has peaks at $\kappa\sigma = 6.8$, 12.5, 18.5, 24.8, closely matching the X-ray scattering peaks at $\kappa\sigma = 6.8$, 12.3, 18.4, 24.4.</li>
<li><strong>Temperature Dependence</strong>: A second simulation at $130^\circ$K and $1.16 \text{ g cm}^{-3}$ yielded $D = 5.67 \times 10^{-5} \text{ cm}^2 \text{ sec}^{-1}$, compared to the experimental value of $6.06 \times 10^{-5} \text{ cm}^2 \text{ sec}^{-1}$ from Naghizadeh and Rice at $120^\circ$K and $1.16 \text{ g cm}^{-3}$. The paper notes that both calculated values are lower than experiment by about 20%, and suggests that allowing for a softer repulsive part in the interaction potential might reduce this discrepancy.</li>
<li><strong>Vineyard Approximation</strong>: The standard Vineyard convolution approximation ($G_d \approx g * G_s$) produces a too-rapid decay of $G_d(r,t)$ with time. The delayed convolution, matching pairs of $(t', t)$ in units of $10^{-12}$ sec as (0.2, 0.4), (0.5, 0.8), (1.0, 1.6), (1.5, 2.3), (2.0, 2.9), (2.5, 3.5), provides a substantially better fit.</li>
<li><strong>Conclusion</strong>: Classical N-body dynamics with a truncated pair potential is a sufficient model to reproduce both the structural and dynamical properties of simple liquids.</li>
</ul>
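<p>Two of the numbers above can be cross-checked with a few lines of arithmetic. The sketch below uses the pure Fickian estimate $\sqrt{6Dt}$ (ignoring any short-time offset) for the rms displacement at $10^{-11}$ s, and converts the spectral maximum $0.25\,k_BT/\hbar$ to an angular frequency at $94.4^\circ$K. The physical constants are standard CODATA values rather than numbers taken from the paper.</p>

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s

D = 2.43e-5              # simulated self-diffusion constant, cm^2/s
t = 1.0e-11              # time at which G_s is Gaussian again, s
T = 94.4                 # simulation temperature, K

# Fickian (long-time) estimate of the rms displacement in 3D.
rms_angstrom = math.sqrt(6.0 * D * t) * 1.0e8   # cm -> Angstrom

# Position of the broad maximum in the velocity spectrum.
omega_max = 0.25 * K_B * T / HBAR               # rad/s

print(f"rms displacement ~ {rms_angstrom:.2f} Angstrom")
print(f"omega_max ~ {omega_max:.2e} rad/s")
```

<p>The Fickian estimate lands at about $3.8$ Å, consistent with the rms displacement quoted above.</p>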
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The simulation uses physical constants for Argon:</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Particle Mass ($M$)</td>
          <td>$39.95 \times 1.6747 \times 10^{-24}$ g</td>
          <td>Mass of Argon atom</td>
      </tr>
      <tr>
          <td>Potential Depth ($\epsilon/k_B$)</td>
          <td>$120^\circ$K</td>
          <td>Lennard-Jones parameter</td>
      </tr>
      <tr>
          <td>Potential Size ($\sigma$)</td>
          <td>$3.4$ Å</td>
          <td>Lennard-Jones parameter</td>
      </tr>
      <tr>
          <td>Cutoff Radius ($R$)</td>
          <td>$2.25\sigma$</td>
          <td>Potential truncated beyond this</td>
      </tr>
      <tr>
          <td>Density ($\rho$)</td>
          <td>$1.374$ g cm$^{-3}$</td>
          <td></td>
      </tr>
      <tr>
          <td>Particle Count ($N$)</td>
          <td>864</td>
          <td></td>
      </tr>
  </tbody>
</table>
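<p>The box side $L = 10.229\sigma$ quoted in the methodology follows directly from the values in this table. The short check below reproduces it; variable names are illustrative only.</p>

```python
N = 864                    # particle count
ATOMIC_WEIGHT = 39.95      # argon
MASS_UNIT_G = 1.6747e-24   # g, the mass unit used in the table
DENSITY = 1.374            # g / cm^3
SIGMA_CM = 3.4e-8          # 3.4 Angstrom in cm

mass_g = ATOMIC_WEIGHT * MASS_UNIT_G
volume_cm3 = N * mass_g / DENSITY          # volume holding N atoms
box_cm = volume_cm3 ** (1.0 / 3.0)         # side of the cubic box
box_in_sigma = box_cm / SIGMA_CM

print(f"L = {box_cm * 1e8:.2f} Angstrom = {box_in_sigma:.3f} sigma")
```

<p>The cutoff of $2.25\sigma$ is well under half the box side, so each atom interacts with at most one periodic image of any other atom.</p>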
<h3 id="algorithms">Algorithms</h3>
<p>Rahman utilized a <strong>Predictor-Corrector</strong> scheme for solving the second-order differential equations of motion.</p>
<p><strong>Step Size</strong>: $\Delta t = 10^{-14}$ sec.</p>
<p><strong>The Algorithm:</strong></p>
<ol>
<li><strong>Predict</strong> positions $\bar{\xi}$ at $t + \Delta t$ based on previous steps:
$$\bar{\xi}_i^{(n+1)} = \xi_i^{(n-1)} + 2\Delta u \eta_i^{(n)}$$</li>
<li><strong>Calculate Forces</strong> (Accelerations $\alpha$) using predicted positions.</li>
<li><strong>Correct</strong> positions and velocities using the trapezoidal rule:
$$
\begin{aligned}
\eta_i^{(n+1)} &amp;= \eta_i^{(n)} + \frac{1}{2}\Delta u (\alpha_i^{(n+1)} + \alpha_i^{(n)}) \\
\xi_i^{(n+1)} &amp;= \xi_i^{(n)} + \frac{1}{2}\Delta u (\eta_i^{(n+1)} + \eta_i^{(n)})
\end{aligned}
$$</li>
</ol>
<p><em>Note: The paper compared one vs. two repetitions of the corrector step, finding that two passes improved precision slightly. The results presented in the paper were obtained using two passes.</em></p>
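<p>The three steps above can be sketched for a single degree of freedom as follows. This is an illustration, not Rahman&rsquo;s code: it assumes $\Delta u$ is the time step in reduced units (called <code>dt</code> here), uses a harmonic oscillator as a stand-in for the Lennard-Jones forces, and re-evaluates the acceleration at the final position as a convenience of this sketch.</p>

```python
import math


def predictor_corrector_step(x_prev, x, v, a, accel, dt, passes=2):
    """Advance one step and return (x_new, v_new, a_new)."""
    # Predictor: xi_bar(n+1) = xi(n-1) + 2 * dt * eta(n)
    x_new = x_prev + 2.0 * dt * v
    v_new = v
    for _ in range(passes):
        a_new = accel(x_new)                     # forces at current estimate
        v_new = v + 0.5 * dt * (a_new + a)       # trapezoidal velocity update
        x_new = x + 0.5 * dt * (v_new + v)       # trapezoidal position update
    # Sketch-only choice: refresh the acceleration at the final position.
    return x_new, v_new, accel(x_new)


def accel(x):
    """Unit-mass, unit-frequency harmonic oscillator (stand-in force law)."""
    return -x


dt = 0.01
n_steps = 628                                    # roughly one period
x, v = 1.0, 0.0
a = accel(x)
x_prev = x - dt * v + 0.5 * dt * dt * a          # Taylor start-up for x(-dt)
for _ in range(n_steps):
    x_next, v, a_next = predictor_corrector_step(x_prev, x, v, a, accel, dt)
    x_prev, x, a = x, x_next, a_next

energy = 0.5 * v * v + 0.5 * x * x
print(f"x = {x:.5f}, energy = {energy:.6f}")
```

<p>Setting <code>passes=1</code> reproduces the single-corrector variant mentioned in the note above.</p>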
<h3 id="models">Models</h3>
<p><strong>Interaction Potential</strong>: Lennard-Jones 12-6
$$V(r_{ij}) = 4\epsilon \left[ \left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^6 \right]$$</p>
<p><strong>Boundary Conditions</strong>: Periodic Boundary Conditions (PBC) in 3 dimensions. When a particle moves out of the box ($x &gt; L$), it re-enters at $x - L$.</p>
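<p>A minimal sketch of the truncated potential and the periodic wrap is given below, with energies expressed as $\epsilon/k_B$ in kelvin and lengths in Å. The function names are hypothetical, and the minimum-image helper is a standard companion to periodic boundaries that is added here for illustration rather than taken from the paper.</p>

```python
SIGMA = 3.4                 # Angstrom
EPS_OVER_KB = 120.0         # K
CUTOFF = 2.25 * SIGMA       # truncation radius
BOX = 10.229 * SIGMA        # side of the cubic box


def lj_energy(r):
    """Truncated Lennard-Jones 12-6 pair energy (in kelvin)."""
    if r > CUTOFF:
        return 0.0
    sr6 = (SIGMA / r) ** 6
    return 4.0 * EPS_OVER_KB * (sr6 * sr6 - sr6)


def wrap(x, box=BOX):
    """Map a coordinate back into [0, box): x > L re-enters at x - L."""
    return x % box


def minimum_image(dx, box=BOX):
    """Shortest separation between two atoms along one periodic axis."""
    return dx - box * round(dx / box)


print(lj_energy(2 ** (1 / 6) * SIGMA))   # well depth, about -120 K
print(wrap(BOX + 0.5))                   # about 0.5
```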
<h3 id="hardware">Hardware</h3>
<p>This is a historical benchmark for computational capability in 1964:</p>
<table>
  <thead>
      <tr>
          <th>Resource</th>
          <th>Specification</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Computer</strong></td>
          <td>CDC 3600</td>
          <td>Control Data Corporation mainframe</td>
      </tr>
      <tr>
          <td><strong>Compute Time</strong></td>
          <td>45 seconds / cycle</td>
          <td>Per predictor-corrector cycle for 864 particles (floating point)</td>
      </tr>
      <tr>
          <td><strong>Language</strong></td>
          <td>FORTRAN + Machine Language</td>
          <td>Machine language used for the most time-consuming parts</td>
      </tr>
  </tbody>
</table>
<p><em>Modern Context: Rahman&rsquo;s system (864 Argon atoms, LJ-potential) is highly reproducible today and serves as a classic pedagogical exercise. It can be simulated in standard MD frameworks (LAMMPS, OpenMM) in fractions of a second on consumer hardware.</em></p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Rahman, A. (1964). Correlations in the Motion of Atoms in Liquid Argon. <em>Physical Review</em>, 136(2A), A405-A411. <a href="https://doi.org/10.1103/PhysRev.136.A405">https://doi.org/10.1103/PhysRev.136.A405</a></p>
<p><strong>Publication</strong>: Physical Review 1964</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{rahman1964correlations,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Correlations in the motion of atoms in liquid argon}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Rahman, A.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Physical Review}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{136}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{2A}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{A405--A411}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1964}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{APS}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1103/PhysRev.136.A405}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Aneesur_Rahman">Aneesur Rahman - Wikipedia</a></li>
</ul>
]]></content:encoded></item><item><title>Adatom Dimer Diffusion on fcc(111) Crystal Surfaces</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/surface-science/diffusion-adatom-dimers-1984/</link><pubDate>Sat, 13 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/surface-science/diffusion-adatom-dimers-1984/</guid><description>A 1984 molecular dynamics study identifying simultaneous multiple jumps in adatom dimer diffusion on fcc(111) surfaces.</description><content:encoded><![CDATA[<h2 id="classification-discovery-of-diffusion-mechanisms">Classification: Discovery of Diffusion Mechanisms</h2>
<p><strong>Discovery (Translational Basis)</strong></p>
<p>This paper applies a computational method (Molecular Dynamics) to observe and characterize a physical phenomenon: the specific diffusion mechanisms of adatom dimers on a crystal surface. It focuses on the &ldquo;what was found&rdquo; (simultaneous multiple jumps).</p>
<p>Based on the <a href="/notes/interdisciplinary/research-methods/ai-physical-sciences-paper-taxonomy/">AI for Physical Sciences Paper Taxonomy</a>, this is best classified as $\Psi_{\text{Discovery}}$ with a minor superposition of $\Psi_{\text{Method}}$ (approximately 80% Discovery, 20% Method). The dominant contribution is the application of computational tools to observe physical phenomena, while secondarily demonstrating MD&rsquo;s capability for surface diffusion problems in an era when the technique was still developing.</p>
<h2 id="bridging-the-intermediate-temperature-data-gap">Bridging the Intermediate Temperature Data Gap</h2>
<p>The study aims to investigate the behavior of adatom dimers in an <strong>intermediate temperature range</strong> ($0.3T_m$ to $0.6T_m$). At the time, Field Ion Microscopy (FIM) provided data at low temperatures ($T \le 0.2T_m$), and previous simulations had studied single adatoms on various surfaces including (111), (110), and (100), but not dimers on (111). The authors sought to compare dimer mobility with single adatom mobility on the (111) surface, where single adatoms move almost like free particles.</p>
<h2 id="observation-of-simultaneous-multiple-jumps">Observation of Simultaneous Multiple Jumps</h2>
<p>The core contribution is the observation of <strong>simultaneous multiple jumps</strong> for dimers on the (111) surface at intermediate temperatures. The study reveals that:</p>
<ol>
<li>Dimers migrate as a whole entity, with both atoms jumping simultaneously.</li>
<li>The mobility of dimers (center of mass) is very close to that of single adatoms in this regime.</li>
</ol>
<h2 id="molecular-dynamics-simulation-design">Molecular Dynamics Simulation Design</h2>
<p>The authors performed <strong>Molecular Dynamics (MD) simulations</strong> of a face-centred cubic (fcc) crystallite:</p>
<ul>
<li><strong>System</strong>: A single crystallite of 192 atoms bounded by two free (111) surfaces</li>
<li><strong>Temperature Range</strong>: $0.22 \epsilon/k$ to $0.40 \epsilon/k$ (approximately $0.3T_m$ to $0.6T_m$)</li>
<li><strong>Duration</strong>: Integration over 50,000 time steps</li>
<li><strong>Comparison</strong>: Results were compared against single adatom diffusion data and Einstein&rsquo;s diffusion relation</li>
</ul>
<h2 id="outcomes-on-mobility-and-migration-dynamics">Outcomes on Mobility and Migration Dynamics</h2>
<ul>
<li><strong>Mechanism Transition</strong>: At low temperatures ($T^\ast=0.22$), diffusion occurs via discrete single jumps where adatoms rotate or extend bonds. At higher temperatures, the &ldquo;multiple jump&rdquo; mechanism becomes dominant.</li>
<li><strong>Migration Style</strong>: The dimer migrates essentially by extending its bond along the $\langle 110 \rangle$ direction.</li>
<li><strong>Mobility</strong>: The diffusion coefficient of dimers is quantitatively similar to single adatoms.</li>
<li><strong>Qualitative Support</strong>: The results support Bonzel&rsquo;s hypothesis of delocalized diffusion involving energy transfer between translation and rotation. The authors attempted to quantify the coupling using the cross-correlation function:</li>
</ul>
<p>$$g(t) = C \langle E_T(t) \, E_R(t + t') \rangle$$</p>
<p>where $C$ is a normalization constant, $E_T$ is the translational energy of the center of mass, and $E_R$ is the rotational energy of the dimer. However, the average lifetime of a dimer (2% to 15% of the total calculation time in the studied temperature range) was too short to allow a statistically significant study of this coupling.</p>
<ul>
<li><strong>Dimer Concentration</strong>: The contribution of dimers to mass transport depends on their concentration. As a first approximation, the dimer concentration is expressed as:</li>
</ul>
<p>$$C = C_0 \exp\left[\frac{-2E_f - E_d}{k_B T}\right]$$</p>
<p>where $E_f$ is the formation energy of adatoms and $E_d$ is the binding energy of a dimer. If the binding energy is sufficiently strong, dimer contributions should be accounted for even in the intermediate temperature range ($0.3T_m$ to $0.6T_m$).</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data-simulation-setup">Data (Simulation Setup)</h3>
<p>Because this is an early computational study, &ldquo;data&rdquo; refers to the initial structural configuration. The simulation begins with an algorithmically generated generic fcc(111) lattice containing two adatoms as the initial state.</p>
<figure class="post-figure center ">
    <img src="/img/notes/chemistry/argon-dimer-diffusion.webp"
         alt="Visualization of argon dimer on fcc(111) surface"
         title="Visualization of argon dimer on fcc(111) surface"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Initial configuration showing an adatom dimer (two adatoms on neighboring sites) on an fcc(111) surface. The crystallite consists of 192 atoms with periodic boundary conditions in the x and y directions.</figcaption>
    
</figure>

<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Particles</strong></td>
          <td>192 atoms</td>
          <td>Single fcc crystallite</td>
      </tr>
      <tr>
          <td><strong>Dimensions</strong></td>
          <td>$4[110] \times 4[112]$</td>
          <td>Thickness of 6 planes</td>
      </tr>
      <tr>
          <td><strong>Boundary</strong></td>
          <td>Periodic (x, y)</td>
          <td>Free surface in z-direction</td>
      </tr>
      <tr>
          <td><strong>Initial State</strong></td>
          <td>Dimer on neighbor sites</td>
          <td>Starts with 2 adatoms</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>The simulation relies on standard Molecular Dynamics integration techniques. The original source code is not available, but the setup can be reproduced with modern open-source tools such as LAMMPS, using the standard <code>lj/cut</code> pair style and NVE or NVT time integration.</p>
<ul>
<li><strong>Integration Scheme</strong>: Central difference algorithm (Verlet algorithm)</li>
<li><strong>Time Step</strong>: $\Delta t^\ast = 0.01$ (reduced units)</li>
<li><strong>Total Steps</strong>: 50,000 integration steps</li>
<li><strong>Dimer Definition</strong>: Two adatoms are considered a dimer if their distance $r \le r_c = 2\sigma$</li>
</ul>
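<p>The central-difference (position Verlet) update and the dimer criterion can be sketched as below. This is an illustration under simplifying assumptions: the test force is a harmonic oscillator, not the Lennard-Jones crystallite, the dimer check ignores periodic images, and all names are hypothetical.</p>

```python
import math


def verlet_step(x_prev, x, accel, dt):
    """Central difference: x(n+1) = 2 x(n) - x(n-1) + a(n) dt^2."""
    a = accel(x)
    return [2.0 * xi - xpi + ai * dt * dt for xi, xpi, ai in zip(x, x_prev, a)]


def is_dimer(pos_a, pos_b, sigma=1.0):
    """Two adatoms count as a dimer when r is at most r_c = 2 sigma."""
    return math.dist(pos_a, pos_b) <= 2.0 * sigma


def harmonic(x):
    """Stand-in force law with unit mass and unit frequency."""
    return [-xi for xi in x]


dt = 0.01                      # reduced time step used in the paper
n_steps = 628                  # roughly one oscillator period
x_prev = [math.cos(-dt)]       # exact position one step before t = 0
x = [1.0]
for _ in range(n_steps):
    x_prev, x = x, verlet_step(x_prev, x, harmonic, dt)

x_final = x[0]
print(f"x after {n_steps} steps = {x_final:.5f}")
```

<p>The update needs only positions at two successive steps, so velocities, when required for temperature or energy, have to be reconstructed from position differences.</p>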
<h3 id="models-analytic-potential">Models (Analytic Potential)</h3>
<p>The physics are modeled using a classic Lennard-Jones potential.</p>
<p><strong>Potential Form</strong>: (12, 6) Lennard-Jones
$$ V(r) = 4\epsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^6 \right] $$</p>
<p><strong>Parameters (Argon-like)</strong>:</p>
<ul>
<li>$\epsilon/k = 119.5$ K</li>
<li>$\sigma = 3.4478$ Å</li>
<li>$m = 39.948$ u (atomic mass units)</li>
<li>Cut-off radius: $2\sigma$</li>
</ul>
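<p>With these parameters the reduced quantities used elsewhere in this note can be converted to physical units. The sketch below assumes the usual Lennard-Jones conventions, $T = T^\ast \epsilon/k$ and a time unit $\tau = \sigma\sqrt{m/\epsilon}$; the constants are standard CODATA values.</p>

```python
import math

K_B = 1.380649e-23         # Boltzmann constant, J/K
AMU = 1.66053907e-27       # atomic mass unit, kg

EPS_OVER_K = 119.5         # K
SIGMA_M = 3.4478e-10       # m
MASS_KG = 39.948 * AMU

epsilon_j = EPS_OVER_K * K_B
tau_s = SIGMA_M * math.sqrt(MASS_KG / epsilon_j)   # LJ time unit

t_low_k = 0.22 * EPS_OVER_K        # lowest simulated temperature
t_high_k = 0.40 * EPS_OVER_K       # highest simulated temperature
dt_s = 0.01 * tau_s                # one integration step
total_s = 50_000 * dt_s            # full run

print(f"T range: {t_low_k:.1f} K to {t_high_k:.1f} K")
print(f"tau = {tau_s:.3e} s, dt = {dt_s:.3e} s, run = {total_s:.3e} s")
```

<p>Under these assumptions the run covers roughly a nanosecond of argon-like dynamics at temperatures between about 26 K and 48 K.</p>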
<h3 id="evaluation">Evaluation</h3>
<p>Metrics used to quantify the diffusion behavior:</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Formula</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Diffusion Coefficient</strong></td>
          <td>$D = \frac{\langle R^2 \rangle}{4t}$</td>
          <td>Calculated from Mean Square Displacement of center of mass</td>
      </tr>
      <tr>
          <td><strong>Trajectory Analysis</strong></td>
          <td>Visual inspection</td>
          <td>Categorized into &ldquo;fast migration&rdquo; (multiple jumps) or &ldquo;discrete jumps&rdquo;</td>
      </tr>
  </tbody>
</table>
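<p>The two-dimensional Einstein relation in the table can be applied to center-of-mass trajectories as sketched below. The synthetic random walks are an assumption used only to give the estimator a known answer; they are not data from the paper, and the function name is hypothetical.</p>

```python
import math
import random


def diffusion_coefficient_2d(trajectories, dt):
    """D = mean squared displacement / (4 t) for motion on a surface.

    trajectories[k] is a list of (x, y) center-of-mass positions sampled
    every dt; displacements are taken between first and last frames.
    """
    n_steps = len(trajectories[0]) - 1
    msd = 0.0
    for traj in trajectories:
        dx = traj[-1][0] - traj[0][0]
        dy = traj[-1][1] - traj[0][1]
        msd += dx * dx + dy * dy
    msd /= len(trajectories)
    return msd / (4.0 * n_steps * dt)


# Synthetic check: Gaussian random walks with a known diffusion constant.
rng = random.Random(1)
d_true, dt, n_steps = 0.05, 0.01, 100
step = math.sqrt(2.0 * d_true * dt)      # per-axis step standard deviation
walks = []
for _ in range(2000):
    x = y = 0.0
    traj = [(x, y)]
    for _ in range(n_steps):
        x += rng.gauss(0.0, step)
        y += rng.gauss(0.0, step)
        traj.append((x, y))
    walks.append(traj)

d_est = diffusion_coefficient_2d(walks, dt)
print(f"estimated D = {d_est:.4f} (input {d_true})")
```

<p>The factor of 4 (rather than 6) reflects that the dimer diffuses in the two in-plane directions only.</p>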
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Specifics</strong>: Unspecified in the original text.</li>
<li><strong>Scale</strong>: 192 particles simulated for 50,000 steps is extremely lightweight by modern standards. A standard laptop CPU should complete this workload in seconds or less, in sharp contrast to the mainframe computing resources required in 1984.</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Ghaleb, D. (1984). Diffusion of adatom dimers on (111) surface of face centred crystals: A molecular dynamics study. <em>Surface Science</em>, 137(2-3), L103-L108. <a href="https://doi.org/10.1016/0039-6028(84)90515-6">https://doi.org/10.1016/0039-6028(84)90515-6</a></p>
<p><strong>Publication</strong>: Surface Science 1984</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{ghalebDiffusionAdatomDimers1984,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Diffusion of Adatom Dimers on (111) Surface of Face Centred Crystals: A Molecular Dynamics Study}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Ghaleb, Dominique}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{1984}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Surface Science}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{137}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span> = <span style="color:#e6db74">{2-3}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{L103-L108}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1016/0039-6028(84)90515-6}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Müller-Brown Transition: Langevin Dynamics Simulation</title><link>https://hunterheidenreich.com/videos/muller-brown-transition-simulation/</link><pubDate>Wed, 27 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/videos/muller-brown-transition-simulation/</guid><description>Extended Langevin dynamics simulation showing particle transitions between different basins of the Müller-Brown potential energy surface.</description><content:encoded><![CDATA[<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/dVFe_4KZbps?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<p>This video shows an extended Langevin dynamics simulation demonstrating transitions between different basins of the Müller-Brown potential. The trajectory illustrates the particle&rsquo;s movement between energy minima, highlighting the energy barriers and pathways involved in reactive processes.</p>
<p>Details on the simulation and implementation can be found in the <a href="/posts/muller-brown-in-pytorch/">Implementing the Müller-Brown Potential in PyTorch</a> post.</p>
]]></content:encoded></item><item><title>Müller-Brown Potential: A PyTorch ML Testbed</title><link>https://hunterheidenreich.com/projects/muller-brown-pytorch/</link><pubDate>Wed, 27 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/muller-brown-pytorch/</guid><description>A PyTorch testbed for the Müller-Brown potential: BAOAB Langevin dynamics, torch.compile analytical forces, and a statistical-mechanics validation suite.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>This project implements the classic 2D Müller-Brown potential in PyTorch as a ground-truth testbed for machine-learning-in-molecular-dynamics (ML-MD) work. The potential is a <code>torch.nn.Module</code> that computes forces two ways: a hand-derived analytical gradient (the default, compiled with <code>torch.compile</code>) and <code>torch.autograd.grad</code> (a reference the analytical path is checked against). On an Apple M1 Max, the analytical kernel runs about 4x faster than autograd (3-7x depending on batch size; 100 warm-up iterations, then the median of 5 runs of 1000), because it skips autograd&rsquo;s graph construction inside the force loop.</p>
<p>The energy is deliberately left uncompiled so that second derivatives (the Hessian via autograd) keep working, since <code>torch.compile</code> does not support double-backward; the force, the hot path, is the compiled function.</p>
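<p>The check of a hand-derived force against an independent reference can be sketched without PyTorch. This is a minimal illustration, not the project&rsquo;s code: it assumes the standard Müller-Brown parameter set, uses central finite differences in place of <code>torch.autograd.grad</code>, and the function names are hypothetical.</p>

```python
import math

# Standard Mueller-Brown parameters (assumed here; not copied from the project).
A = (-200.0, -100.0, -170.0, 15.0)
a = (-1.0, -1.0, -6.5, 0.7)
b = (0.0, 0.0, 11.0, 0.6)
c = (-10.0, -10.0, -6.5, 0.7)
X0 = (1.0, 0.0, -0.5, -1.0)
Y0 = (0.0, 0.5, 1.5, 1.0)


def energy(x, y):
    """Potential energy: a sum of four exponential terms."""
    total = 0.0
    for k in range(4):
        dx, dy = x - X0[k], y - Y0[k]
        total += A[k] * math.exp(a[k] * dx * dx + b[k] * dx * dy + c[k] * dy * dy)
    return total


def analytical_force(x, y):
    """Force = -grad V, written out by hand with the chain rule."""
    fx = fy = 0.0
    for k in range(4):
        dx, dy = x - X0[k], y - Y0[k]
        term = A[k] * math.exp(a[k] * dx * dx + b[k] * dx * dy + c[k] * dy * dy)
        fx -= term * (2.0 * a[k] * dx + b[k] * dy)
        fy -= term * (b[k] * dx + 2.0 * c[k] * dy)
    return fx, fy


def reference_force(x, y, h=1e-6):
    """Central finite differences, standing in for the autograd reference."""
    fx = -(energy(x + h, y) - energy(x - h, y)) / (2.0 * h)
    fy = -(energy(x, y + h) - energy(x, y - h)) / (2.0 * h)
    return fx, fy


print(analytical_force(0.1, 0.3))
print(reference_force(0.1, 0.3))
```

<p>The two printed force vectors should agree to several decimal places; a mismatch points to an error in the hand derivation.</p>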
<h2 id="features">Features</h2>
<ul>
<li><strong>Dual force kernels</strong>: a hand-derived analytical gradient (compiled) for fast simulation, and an autograd mode for differentiation and as the correctness reference the analytical path is tested against.</li>
<li><strong>BAOAB Langevin integrator</strong>: the BAOAB splitting scheme (Leimkuhler &amp; Matthews, 2013), which solves the friction-plus-noise step exactly and samples the canonical distribution accurately (exactly so for a harmonic oscillator).</li>
<li><strong>Device-agnostic</strong>: potential, forces, and simulation are plain PyTorch tensor operations that run on CPU or CUDA; the included benchmark measures CPU.</li>
<li><strong>Modular architecture</strong>: physics (<code>MuellerBrownPotential</code>), numerics (<code>LangevinSimulator</code>), visualization, and HDF5 I/O are separated, with a CLI orchestrating demo, single-run, batch, and plot-regeneration modes.</li>
</ul>
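<p>The BAOAB splitting named above can be sketched for a single 1D degree of freedom. This is a minimal illustration under stated assumptions, not the project&rsquo;s <code>LangevinSimulator</code>: the function names, the unit mass default, and the harmonic test force are hypothetical. B is a half kick, A is a half drift, and O is the exact Ornstein-Uhlenbeck update of the velocity.</p>

```python
import math
import random


def baoab_step(x, v, force, dt, gamma, kT, mass=1.0, rng=random):
    """Advance one BAOAB step for a single 1D degree of freedom."""
    v += 0.5 * dt * force(x) / mass               # B: half kick
    x += 0.5 * dt * v                             # A: half drift
    c1 = math.exp(-gamma * dt)                    # O: exact friction decay
    c2 = math.sqrt(kT / mass * (1.0 - c1 * c1))   # O: matching noise amplitude
    v = c1 * v + c2 * rng.gauss(0.0, 1.0)
    x += 0.5 * dt * v                             # A: half drift
    v += 0.5 * dt * force(x) / mass               # B: half kick
    return x, v


def harmonic_force(x, k=1.0):
    """Force of a 1D harmonic well, used only as a test case."""
    return -k * x


# Short thermostatted run in a harmonic well.
x, v = 1.0, 0.0
for _ in range(1000):
    x, v = baoab_step(x, v, harmonic_force, dt=0.01, gamma=1.0, kT=1.0)
print(x, v)
```

<p>With <code>gamma=0</code> the O step leaves the velocity unchanged and the scheme reduces to velocity Verlet, which is one way to test the frictionless limit. A production version would reuse the force from the final B step instead of evaluating it twice per step.</p>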
<h2 id="usage">Usage</h2>
<p>The package installs editable with <code>uv sync</code> and imports as a normal package:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> muller_brown <span style="color:#f92672">import</span> MuellerBrownPotential, LangevinSimulator
</span></span></code></pre></div><p>It provides a fast, differentiable Müller-Brown potential and a Langevin sampler for testing ML-MD algorithms against a known-exact surface.</p>
<h2 id="results">Results</h2>
<h3 id="architecture">Architecture</h3>
<ul>
<li><strong>Physics module</strong>: the energy surface is a <code>torch.nn.Module</code> with the potential parameters held as registered buffers, so device and dtype move with the module.</li>
<li><strong>Analytical force kernel</strong>: the analytical gradient is implemented directly and compiled with <code>torch.compile(dynamic=True)</code>, bypassing autograd-graph construction during long simulations.</li>
<li><strong>Vectorized execution</strong>: kernel operations are vectorized over particles, so an ensemble runs in roughly the same wall time as a single particle (the per-step cost is dominated by the fixed force call and noise draw).</li>
<li><strong>Device-agnostic</strong>: all operations move to CUDA via native tensor handling; the benchmark and tests run on CPU.</li>
</ul>
<h3 id="performance">Performance</h3>
<p>A force-throughput benchmark (analytical vs autograd) across batch sizes from 2 to roughly 50,000 particles, on an Apple M1 Max:</p>
<ul>
<li>The analytical kernel is about 4x faster than autograd (3-7x across batch sizes).</li>
<li>Per-particle force time drops below 1 microsecond at large batch sizes.</li>
<li>Throughput rises with batch size and saturates for large ensembles.</li>
</ul>
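<p>The timing protocol (warm-up, then the median over repeated runs) can be sketched with the standard library. This is a hypothetical harness, not the project&rsquo;s benchmark script; the function name and the toy workload are placeholders.</p>

```python
import statistics
import time


def median_time_per_call(fn, warmup=100, runs=5, iters=1000):
    """Median over `runs` of the mean wall time per call, after a warm-up."""
    for _ in range(warmup):
        fn()                      # warm-up calls are not timed
    per_call = []
    for _ in range(runs):
        start = time.perf_counter()
        for _ in range(iters):
            fn()
        per_call.append((time.perf_counter() - start) / iters)
    return statistics.median(per_call)


# Toy workload standing in for a force call.
seconds = median_time_per_call(lambda: sum(range(100)))
print(f"{seconds * 1e6:.2f} microseconds per call")
```

<p>A speedup figure is then the ratio of two such medians measured at the same batch size. Taking the median keeps a single slow run from skewing the result.</p>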
<h3 id="validation">Validation</h3>
<p>The sampler is checked against statistical mechanics, not just run:</p>
<ul>
<li><strong>Deterministic tests</strong>: the documented minima and saddles have the correct Hessian signatures; the analytical force matches <code>torch.autograd.grad</code>; energy is conserved in the frictionless (NVE) limit; <code>float32</code> matches <code>float64</code>; and HDF5 round-trips preserve the data.</li>
<li><strong>Statistical tests</strong>: the sampler reproduces equipartition, the harmonic-oscillator distributions, and the Müller-Brown Boltzmann mean energy against a grid-integrated reference; a separate convergence study confirms the integrator&rsquo;s kinetic-temperature bias vanishes as the timestep squared.</li>
</ul>
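<p>A grid-integrated reference for the Boltzmann mean energy can be sketched as follows. This is a minimal illustration, not the project&rsquo;s test code. It uses a 2D harmonic well as the test potential because the exact answer is known there (kT/2 per degree of freedom, so kT in total); the function names and grid settings are assumptions.</p>

```python
import math


def boltzmann_mean_energy(potential, kT, lo, hi, n=201):
    """Grid estimate of the mean potential energy under exp(-V / kT)."""
    step = (hi - lo) / (n - 1)
    values = [
        potential(lo + i * step, lo + j * step)
        for i in range(n)
        for j in range(n)
    ]
    v_min = min(values)           # shift energies so the exponentials stay bounded
    num = den = 0.0
    for v in values:
        w = math.exp(-(v - v_min) / kT)
        num += v * w
        den += w
    return num / den              # cell area and the shift cancel in the ratio


def harmonic(x, y, k=1.0):
    """2D harmonic well, used only because its mean energy is known exactly."""
    return 0.5 * k * (x * x + y * y)


print(boltzmann_mean_energy(harmonic, kT=0.5, lo=-6.0, hi=6.0))
```

<p>For the harmonic well the printed value should sit close to <code>kT</code>. Swapping in the Müller-Brown energy and a bounding box around its basins would give the kind of reference a sampled mean energy can be compared against.</p>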
<h3 id="molecular-dynamics">Molecular Dynamics</h3>
<p>Langevin simulations on the surface show particle motion within energy basins, thermal fluctuations around the minima, and barrier-crossing transitions between wells, visualized as trajectories on the potential surface.</p>
<h2 id="simulation-videos">Simulation Videos</h2>
<p>These videos demonstrate Langevin dynamics simulations on the Müller-Brown potential surface:</p>
<p><div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/woVM90qXUQs?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<strong>A Basin Dynamics</strong>: Particle motion and thermal fluctuations around the A minimum.</p>
<p><div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/gdAHme07bGs?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<strong>B Basin Dynamics</strong>: Exploration of the B minimum, a well of intermediate depth that is shallower than the A minimum.</p>
<p><div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/dVFe_4KZbps?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<strong>Transition Path</strong>: Particle transitioning between energy basins, demonstrating barrier crossing.</p>
<h2 id="related-work">Related Work</h2>
<p>This implementation is documented in detail in:</p>
<ul>
<li><a href="/posts/muller-brown-in-pytorch/">Implementing the Müller-Brown Potential in PyTorch</a></li>
<li><a href="/videos/muller-brown-basin-ma-simulation/">Basin A Simulation</a></li>
<li><a href="/videos/muller-brown-basin-mb-simulation/">Basin B Simulation</a></li>
<li><a href="/videos/muller-brown-transition-simulation/">Transition Path Simulation</a></li>
</ul>
]]></content:encoded></item><item><title>Müller-Brown Basin MB: Langevin Dynamics Simulation</title><link>https://hunterheidenreich.com/videos/muller-brown-basin-mb-simulation/</link><pubDate>Wed, 27 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/videos/muller-brown-basin-mb-simulation/</guid><description>Langevin dynamics simulation showing particle motion in the product minimum (Basin MB) of the Müller-Brown potential energy surface.</description><content:encoded><![CDATA[<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/gdAHme07bGs?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<p>This video shows a Langevin dynamics simulation in Basin MB, the product minimum of the Müller-Brown potential. The particle shows moderate thermal motion around (0.623, 0.028): at -108.17 in reduced units, this well is intermediate in depth between the deep reactant well and the shallow intermediate basin.</p>
<p>Details on the simulation and implementation can be found in the <a href="/posts/muller-brown-in-pytorch/">Implementing the Müller-Brown Potential in PyTorch</a> post.</p>
]]></content:encoded></item><item><title>Müller-Brown Basin MA: Langevin Dynamics Simulation</title><link>https://hunterheidenreich.com/videos/muller-brown-basin-ma-simulation/</link><pubDate>Wed, 27 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/videos/muller-brown-basin-ma-simulation/</guid><description>Langevin dynamics simulation showing particle motion in the deep reactant minimum (Basin MA) of the Müller-Brown potential energy surface.</description><content:encoded><![CDATA[<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/woVM90qXUQs?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<p>This video shows a Langevin dynamics simulation in Basin MA, the deep reactant minimum of the Müller-Brown potential. The particle remains tightly confined around (-0.558, 1.442) due to the deep potential well (-146.70 in reduced units) and steep energy barriers.</p>
<p>Details on the simulation and implementation can be found in the <a href="/posts/muller-brown-in-pytorch/">Implementing the Müller-Brown Potential in PyTorch</a> post.</p>
]]></content:encoded></item><item><title>DenoiseVAE: Adaptive Noise for Molecular Pre-training</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/denoise-vae/</link><pubDate>Sun, 24 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/denoise-vae/</guid><description>Liu et al.'s ICLR 2025 paper introducing DenoiseVAE, which learns adaptive, atom-specific noise distributions for better molecular force fields.</description><content:encoded><![CDATA[<h2 id="paper-contribution-type">Paper Contribution Type</h2>
<p>This is a <strong>method paper</strong> with a supporting theoretical component. It introduces a new pre-training framework, DenoiseVAE, that challenges the standard practice of using fixed, hand-crafted noise distributions in denoising-based molecular representation learning.</p>
<h2 id="motivation-the-inter--and-intra-molecular-variations-problem">Motivation: The Inter- and Intra-molecular Variations Problem</h2>
<p>The motivation is to create a more physically principled denoising pre-training task for 3D molecules. The core idea of denoising is to learn molecular force fields by corrupting an equilibrium conformation with noise and then learning to recover it. However, existing methods use a single, hand-crafted noise strategy (e.g., Gaussian noise of a fixed scale) for all atoms across all molecules. This is physically unrealistic for two main reasons:</p>
<ol>
<li><strong>Inter-molecular differences</strong>: Different molecules have unique Potential Energy Surfaces (PES), meaning the space of low-energy (i.e., physically plausible) conformations is highly molecule-specific.</li>
<li><strong>Intra-molecular differences (Anisotropy)</strong>: Within a single molecule, different atoms have different degrees of freedom. For instance, an atom in a rigid functional group can move much less than one connected by a single, rotatable bond.</li>
</ol>
<p>The authors argue that this &ldquo;one-size-fits-all&rdquo; noise approach leads to inaccurate force field learning because it samples many physically improbable conformations.</p>
<h2 id="novelty-a-learnable-atom-specific-noise-generator">Novelty: A Learnable, Atom-Specific Noise Generator</h2>
<p>The core novelty is a framework that learns to generate noise tailored to each specific molecule and atom. This is achieved through three key innovations:</p>
<ol>
<li><strong>Learnable Noise Generator</strong>: The authors introduce a Noise Generator module (a 4-layer Equivariant Graph Neural Network) that takes a molecule&rsquo;s equilibrium conformation $X$ as input and outputs a unique, atom-specific Gaussian noise distribution (i.e., a different variance $\sigma_i^2$ for each atom $i$). This directly addresses the issues of PES specificity and force field anisotropy.</li>
<li><strong>Variational Autoencoder (VAE) Framework</strong>: The Noise Generator (encoder) and a Denoising Module (a 7-layer EGNN decoder) are trained jointly within a VAE paradigm. The noisy conformation is sampled using the reparameterization trick:
$$
\begin{aligned}
\tilde{x}_i &amp;= x_i + \epsilon \sigma_i
\end{aligned}
$$</li>
<li><strong>Principled Optimization Objective</strong>: The training loss balances two competing goals:
$$
\begin{aligned}
\mathcal{L}_{DenoiseVAE} &amp;= \mathcal{L}_{Denoise} + \lambda \mathcal{L}_{KL}
\end{aligned}
$$
<ul>
<li>A denoising reconstruction loss ($\mathcal{L}_{Denoise}$) encourages the Noise Generator to produce physically plausible perturbations from which the original conformation can be recovered. This implicitly constrains the noise to respect the molecule&rsquo;s underlying force fields.</li>
<li>A KL divergence regularization term ($\mathcal{L}_{KL}$) pushes the generated noise distributions towards a predefined prior. This prevents the trivial solution of generating zero noise and encourages the model to explore a diverse set of low-energy conformations.</li>
</ul>
</li>
</ol>
<p>The authors also provide a theoretical analysis showing that optimizing their objective is equivalent to maximizing the Evidence Lower Bound (ELBO) on the log-likelihood of observing physically realistic conformations.</p>
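<p>The sampling step and the shape of the objective can be sketched in plain Python. This is an illustration under assumptions, not the authors&rsquo; implementation: the KL term is written as the closed-form divergence between an isotropic Gaussian with per-atom scale $\sigma_i$ and an isotropic prior, summed over atoms, and the denoising term is left to the caller because its exact form is not given here. All function names are hypothetical.</p>

```python
import math
import random


def sample_noisy_coords(coords, sigmas, rng=random):
    """Reparameterized sample: x_i + eps * sigma_i with eps drawn from N(0, I)."""
    return [
        [xc + rng.gauss(0.0, 1.0) * s for xc in atom]
        for atom, s in zip(coords, sigmas)
    ]


def kl_to_prior(sigmas, prior_sigma=0.1, dim=3):
    """KL from N(0, sigma_i^2 I) to N(0, prior_sigma^2 I), summed over atoms."""
    total = 0.0
    for s in sigmas:
        ratio = (s / prior_sigma) ** 2
        total += 0.5 * dim * (ratio - 1.0 - math.log(ratio))
    return total


def denoise_vae_loss(denoise_loss, sigmas, lam=1.0, prior_sigma=0.1):
    """L = L_Denoise + lambda * L_KL, with the denoising term supplied by the caller."""
    return denoise_loss + lam * kl_to_prior(sigmas, prior_sigma)


coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
sigmas = [0.05, 0.2]              # hypothetical per-atom noise scales
print(sample_noisy_coords(coords, sigmas))
print(kl_to_prior(sigmas))
```

<p>In this form the KL term is zero when every $\sigma_i$ equals the prior scale and grows without bound as $\sigma_i \to 0$, which is the mechanism that rules out the zero-noise solution.</p>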
<h2 id="methodology--experimental-baselines">Methodology &amp; Experimental Baselines</h2>
<p>The model was pretrained on the PCQM4Mv2 dataset (approximately 3.4 million organic molecules) and then evaluated on a comprehensive suite of downstream tasks to test the quality of the learned representations:</p>
<ol>
<li><strong>Molecular Property Prediction (<a href="/notes/chemistry/datasets/qm9/">QM9</a>)</strong>: The model was evaluated on 12 quantum chemical property prediction tasks for small molecules (134k molecules; 100k train, 18k val, 13k test split). DenoiseVAE achieved state-of-the-art or second-best performance on 11 of the 12 tasks, with particularly significant gains on $C_v$ (heat capacity), indicating better capture of vibrational modes.</li>
<li><strong>Force Prediction (MD17)</strong>: The task was to predict atomic forces from molecular dynamics trajectories for 8 different small molecules (9,500 train, 500 val split). DenoiseVAE was the top performer on 5 of the 8 molecules (Aspirin, Benzene, Ethanol, Naphthalene, Toluene), though it underperformed Frad on Malonaldehyde, Salicylic Acid, and Uracil by significant margins.</li>
<li><strong>Ligand Binding Affinity (PDBBind v2019)</strong>: On the PDBBind dataset with 30% and 60% protein sequence identity splits, the model showed strong generalization, outperforming baselines like Uni-Mol particularly on the more stringent 30% split across RMSE, Pearson correlation, and Spearman correlation.</li>
<li><strong>PCQM4Mv2 Validation</strong>: DenoiseVAE achieved a validation MAE of 0.0777 on the PCQM4Mv2 HOMO-LUMO gap prediction task with only 1.44M parameters, competitive with models 10-40x larger (e.g., GPS++ at 44.3M params achieves 0.0778).</li>
<li><strong>Ablation Studies</strong>: The authors analyzed the sensitivity to key hyperparameters, namely the prior&rsquo;s standard deviation ($\sigma$) and the KL-divergence weight ($\lambda$), confirming that $\lambda=1$ and $\sigma=0.1$ are optimal. Removing the KL term leads to trivial solutions (near-zero noise). An additional ablation on the Noise Generator depth found 4 EGNN layers optimal over 2 layers. A comparison of independent (diagonal) versus non-independent (full covariance) noise sampling showed comparable results, suggesting the EGNN already captures inter-atomic dependencies implicitly.</li>
<li><strong>Case Studies</strong>: Visualizations of the learned noise variances for different molecules confirmed that the model learns chemically intuitive noise patterns. For example, it applies smaller perturbations to atoms in a rigid bicyclic norcamphor derivative and larger ones to atoms in flexible functional groups of a cyclopropane derivative. Even identical functional groups (e.g., hydroxyl) receive different noise scales in different molecular contexts.</li>
</ol>
<h2 id="key-findings-on-force-field-learning">Key Findings on Force Field Learning</h2>
<ul>
<li><strong>Primary Conclusion</strong>: Learning a <strong>molecule-adaptive and atom-specific</strong> noise distribution is a superior strategy for denoising-based pre-training compared to using fixed, hand-crafted heuristics. This more physically grounded approach leads to representations that better capture molecular force fields.</li>
<li><strong>Strong Benchmark Performance</strong>: DenoiseVAE achieves best or second-best results on 11 of 12 QM9 tasks, 5 of 8 MD17 molecules, and leads on the stringent 30% LBA split. Performance is mixed on some MD17 molecules (Malonaldehyde, Salicylic Acid, Uracil), where it trails Frad.</li>
<li><strong>Effective Framework</strong>: The proposed VAE-based framework, which jointly trains a Noise Generator and a Denoising Module, is an effective and theoretically sound method for implementing this adaptive noise strategy. The interplay between the reconstruction loss and the KL-divergence regularization is key to its success.</li>
<li><strong>Limitation and Future Direction</strong>: The method is based on classical force field assumptions. The authors note that integrating more accurate force fields represents a promising direction for future work.</li>
</ul>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/Serendipity-r/DenoiseVAE">Serendipity-r/DenoiseVAE</a></td>
          <td>Code</td>
          <td>Unknown</td>
          <td>Official implementation</td>
      </tr>
  </tbody>
</table>
<h3 id="reproducibility-status">Reproducibility Status</h3>
<ul>
<li><strong>Source Code</strong>: The authors have released their code at <a href="https://github.com/Serendipity-r/DenoiseVAE">Serendipity-r/DenoiseVAE</a> on GitHub. No license is specified in the repository.</li>
<li><strong>Implementation</strong>: Hyperparameters and architectures are detailed in the paper&rsquo;s appendix (A.14), and the repository provides reference implementations.</li>
</ul>
<h3 id="data">Data</h3>
<ul>
<li><strong>Pre-training Dataset</strong>: <a href="https://ogb.stanford.edu/docs/lsc/pcqm4mv2/">PCQM4Mv2</a> (approximately 3.4 million organic molecules)</li>
<li><strong>Property Prediction</strong>: <a href="https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.datasets.QM9.html">QM9 dataset</a> (134k molecules; 100k train, 18k val, 13k test split) for 12 quantum chemical properties</li>
<li><strong>Force Prediction</strong>: <a href="http://www.sgdml.org/#datasets">MD17 dataset</a> (9,500 train, 500 val split) for 8 different small molecules</li>
<li><strong>Ligand Binding Affinity</strong>: PDBBind v2019 (4,463 protein-ligand complexes) with 30% and 60% sequence identity splits</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Noise Generator</strong>: 4-layer Equivariant Graph Neural Network (EGNN) that outputs atom-specific Gaussian noise distributions</li>
<li><strong>Denoising Module</strong>: 7-layer EGNN decoder</li>
<li><strong>Training Objective</strong>: $\mathcal{L}_{DenoiseVAE} = \mathcal{L}_{Denoise} + \lambda \mathcal{L}_{KL}$ with $\lambda=1$</li>
<li><strong>Noise Sampling</strong>: Reparameterization trick with $\tilde{x}_i = x_i + \epsilon \sigma_i$</li>
<li><strong>Prior Distribution</strong>: Standard deviation $\sigma=0.1$</li>
</ul>
<h3 id="models">Models</h3>
<ul>
<li><strong>Model Size</strong>: 1.44M parameters total</li>
<li><strong>Fine-tuning Protocol</strong>: Noise Generator discarded after pre-training; only the pre-trained Denoising Module (7-layer EGNN) is retained for downstream fine-tuning</li>
<li><strong>Optimizer</strong>: AdamW with cosine learning rate decay (max LR of 0.0005)</li>
<li><strong>Batch Size</strong>: 128</li>
<li><strong>System Training</strong>: Fine-tuned end-to-end for specific tasks; force prediction involves computing the gradient of the predicted energy</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Ablation Studies</strong>: Sensitivity analysis confirmed $\lambda=1$ and $\sigma=0.1$ as optimal hyperparameters; removing the KL term leads to trivial solutions (near-zero noise)</li>
<li><strong>Noise Generator Depth</strong>: 4 EGNN layers outperformed 2 layers across both QM9 and MD17 benchmarks</li>
<li><strong>Covariance Structure</strong>: Full covariance matrix (non-independent noise sampling) yielded comparable results to diagonal variance (independent sampling), likely because the EGNN already integrates neighboring atom information</li>
<li><strong>O(3) Invariance</strong>: The method satisfies O(3) probabilistic invariance, meaning the noise distribution is unchanged under rotations and reflections</li>
</ul>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>GPU Configuration</strong>: Experiments were conducted on a single RTX A3090 GPU; 6 GPUs with 144 GB of total memory are sufficient for full reproduction</li>
<li><strong>CPU</strong>: Intel Xeon Gold 5318Y @ 2.10GHz</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Liu, Y., Chen, J., Jiao, R., Li, J., Huang, W., &amp; Su, B. (2025). DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training. <em>The Thirteenth International Conference on Learning Representations (ICLR)</em>.</p>
<p><strong>Publication</strong>: ICLR 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{liu2025denoisevae,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Yurou Liu and Jiahao Chen and Rui Jiao and Jiangmeng Li and Wenbing Huang and Bing Su}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{The Thirteenth International Conference on Learning Representations}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span>=<span style="color:#e6db74">{https://openreview.net/forum?id=ym7pr83XQr}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://iclr.cc/virtual/2025/poster/27701">ICLR 2025 poster page</a></li>
<li><a href="https://openreview.net/forum?id=ym7pr83XQr">OpenReview forum</a></li>
<li><a href="https://openreview.net/pdf?id=ym7pr83XQr">PDF on OpenReview</a></li>
</ul>
]]></content:encoded></item><item><title>Modernizing Rahman's 1964 Argon Simulation</title><link>https://hunterheidenreich.com/posts/rahman-1964-lammps-liquid-argon/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/posts/rahman-1964-lammps-liquid-argon/</guid><description>How I used modern software engineering (caching, vectorization, and dependency locking) to reproduce a 60-year-old physics milestone.</description><content:encoded><![CDATA[<p>Some papers invent entire fields. Aneesur Rahman&rsquo;s 1964 paper, <strong>&ldquo;Correlations in the Motion of Atoms in Liquid Argon&rdquo;</strong>, is the &ldquo;Hello World&rdquo; of molecular dynamics (MD). Using a computer with less memory than a modern microwave, Rahman solved Newton&rsquo;s equations for 864 atoms and proved that liquids have distinct, quantifiable structure.</p>
<p>The physics of liquid argon is a solved problem. We know the answer.</p>
<p>So, why replicate it in 2025? <strong>To apply modern engineering standards to legacy science.</strong></p>
<p>This project served as an exercise in <strong>software archaeology</strong>: taking a vintage scientific workflow and rebuilding it with a modular Python analysis pipeline. I wanted to see if I could replace Rahman&rsquo;s &ldquo;write-once&rdquo; Fortran mentality with modern reproducibility, type safety, and intelligent caching.</p>
<p>The full source code is available on <a href="https://github.com/hunter-heidenreich/argon-simulation">GitHub</a>. The complete project overview, including analysis results and pipeline architecture, is on the <a href="/projects/rahman-1964-replication/">Rahman 1964 Replication project page</a>.</p>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/KjFixUt6bnQ?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<hr>
<h2 id="engineering-the-pipeline">Engineering the Pipeline</h2>
<p>The most interesting part of this project isn&rsquo;t the simulation engine (LAMMPS handles that); it&rsquo;s the architecture of the analysis suite. MD analysis is computationally expensive ($O(N^2)$), and iterating on plots can be painfully slow if you re-compute trajectory data every time.</p>
<p>Why bother? Don&rsquo;t modern MD packages come with analysis tools?
Well, some say that writing is thinking.
Sometimes, by getting into the weeds of how an algorithm works or how an analysis is performed, you gain insights and a deeper understanding that a plug-and-play tool would obscure.</p>
<h3 id="intelligent-caching">Intelligent Caching</h3>
<p>I built the <code>argon_sim</code> package with a decorator-based caching layer. The system hashes the source file&rsquo;s modification time and the function&rsquo;s arguments to avoid re-calculating the Radial Distribution Function (RDF) or Van Hove correlations on every script run.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#a6e22e">@cached_computation</span>(<span style="color:#e6db74">&#34;gr&#34;</span>)
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">compute_radial_distribution</span>(filename: str, dr: float <span style="color:#f92672">=</span> <span style="color:#ae81ff">0.05</span>):
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># ... expensive O(N^2) distance calculations ...</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> r_values, g_r, density
</span></span></code></pre></div><p>If I tweak a plot axis, the script runs instantly, loading pre-computed arrays from disk instead of re-running the $O(N^2)$ computation. If I change the simulation trajectory, the cache invalidates automatically.</p>
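<p>One way such a decorator could work is sketched below. This is a hypothetical reconstruction, not the code in <code>argon_sim</code>: the cache directory, the key format, and the use of <code>pickle</code> are all assumptions. The key combines the file path, its modification time, and the call arguments, so touching the trajectory file produces a new key and forces a recompute.</p>

```python
import functools
import hashlib
import os
import pickle


def cached_computation(tag, cache_dir=".cache"):
    """Cache a function's result on disk, keyed by file mtime and arguments."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(filename, *args, **kwargs):
            mtime = os.path.getmtime(filename)
            key_src = repr(
                (os.path.abspath(filename), mtime, args, sorted(kwargs.items()))
            )
            key = hashlib.sha256(key_src.encode()).hexdigest()[:16]
            path = os.path.join(cache_dir, f"{tag}_{key}.pkl")
            if os.path.exists(path):
                with open(path, "rb") as fh:
                    return pickle.load(fh)       # cache hit: skip the computation
            result = func(filename, *args, **kwargs)
            os.makedirs(cache_dir, exist_ok=True)
            with open(path, "wb") as fh:
                pickle.dump(result, fh)          # cache miss: store for next time
            return result
        return wrapper
    return decorator
```

<p>Stale entries are never deleted in this sketch; they are simply no longer looked up once the modification time changes.</p>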
<h3 id="vectorization--memory-management">Vectorization &amp; Memory Management</h3>
<p>Rahman likely relied on nested loops, which are too slow in pure Python. I used <strong>NumPy broadcasting</strong> to vectorize the calculation of atomic displacements.</p>
<p>However, calculating an $864 \times 864$ distance matrix for 5,000 frames consumes significant RAM. I implemented a <strong>chunked MSD (Mean Square Displacement) algorithm</strong> that processes the trajectory in blocks. The chunking trades some vectorization speed for a bounded memory footprint, so the analysis is not capped by holding the full distance matrix in RAM.</p>
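<p>The idea of chunking can be sketched for the MSD averaged over time origins. This is a minimal illustration, not the <code>argon_sim</code> implementation: the function name, the chunk size, and the assumption of unwrapped coordinates in an array of shape <code>(n_frames, n_atoms, 3)</code> are all mine. Only one block of displacements is materialized at a time.</p>

```python
import numpy as np


def chunked_msd(positions, max_lag, chunk=256):
    """MSD(lag) averaged over atoms and time origins, in blocks of origins.

    positions: unwrapped coordinates with shape (n_frames, n_atoms, 3).
    """
    n_frames, n_atoms = positions.shape[0], positions.shape[1]
    msd = np.zeros(max_lag + 1)
    for lag in range(1, max_lag + 1):
        n_origins = n_frames - lag
        total = 0.0
        for start in range(0, n_origins, chunk):
            stop = min(start + chunk, n_origins)
            # Vectorized displacement for one block of time origins only.
            disp = positions[start + lag:stop + lag] - positions[start:stop]
            total += np.sum(disp * disp)
        msd[lag] = total / (n_origins * n_atoms)
    return msd


# Ballistic test trajectory: every atom moves with velocity (1, 2, 2).
frames = np.arange(50, dtype=float)[:, None, None]
positions = frames * np.array([1.0, 2.0, 2.0]) * np.ones((1, 4, 1))
print(chunked_msd(positions, max_lag=5, chunk=7))
```

<p>For the ballistic test trajectory the MSD grows as the squared speed times the squared lag, which gives a quick correctness check. The <code>chunk</code> parameter sets the peak size of the temporary displacement array.</p>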
<h3 id="reproducibility-as-a-feature">Reproducibility as a Feature</h3>
<p>Academic code is notorious for &ldquo;it works on my machine.&rdquo; To combat this, I used <strong><code>uv</code></strong> for dependency management, locking the exact environment state. The entire workflow (from simulation to final figure generation) is abstracted into a <code>Makefile</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span><span style="color:#75715e"># One command to run the physics, analyze data, and generate plots</span>
</span></span><span style="display:flex;"><span>make workflow
</span></span></code></pre></div><hr>
<h2 id="the-simulation-1964-vs-2025">The Simulation: 1964 vs. 2025</h2>
<p>I preserved Rahman&rsquo;s physical parameters exactly to ensure a fair comparison:</p>
<ul>
<li><strong>System</strong>: 864 Argon atoms</li>
<li><strong>Potential</strong>: Lennard-Jones ($\sigma = 3.4$ Å, $\epsilon/k_B = 120$ K)</li>
<li><strong>Target</strong>: 94.4 K, 1.374 g/cm³</li>
</ul>
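<p>As a worked conversion, the parameters above map to Lennard-Jones reduced units and a cubic box size as follows. The argon molar mass is an assumed standard value that is not stated in this post, so the printed numbers are approximate.</p>

```python
AVOGADRO = 6.02214076e23      # 1/mol
ARGON_MOLAR_MASS = 39.948     # g/mol (assumed standard value, not from the post)

sigma = 3.4                   # angstrom
eps_over_kb = 120.0           # K
temperature = 94.4            # K
density = 1.374               # g/cm^3
n_atoms = 864

# Number density in atoms per cubic angstrom (1 cm^3 = 1e24 cubic angstrom).
number_density = density / ARGON_MOLAR_MASS * AVOGADRO / 1e24

reduced_temperature = temperature / eps_over_kb           # T* = k_B T / epsilon
reduced_density = number_density * sigma ** 3             # rho* = rho sigma^3
box_length = (n_atoms / number_density) ** (1.0 / 3.0)    # cubic box edge, angstrom

print(f"T* = {reduced_temperature:.3f}")
print(f"rho* = {reduced_density:.3f}")
print(f"L = {box_length:.2f} angstrom = {box_length / sigma:.2f} sigma")
```

<p>This puts the state point at roughly $T^* \approx 0.79$ and $\rho^* \approx 0.81$, with a box edge of about $10.2\sigma$.</p>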
<p>However, I modernized the <em>numerical</em> methods to ensure stability:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Feature</th>
          <th style="text-align: left">Rahman (1964)</th>
          <th style="text-align: left">This Work (2025)</th>
          <th style="text-align: left">Why it Matters</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Integration</strong></td>
          <td style="text-align: left">Predictor-Corrector</td>
          <td style="text-align: left">Velocity Verlet</td>
          <td style="text-align: left">Better energy conservation over long runs</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Timestep</strong></td>
          <td style="text-align: left">10 fs</td>
          <td style="text-align: left">2 fs</td>
          <td style="text-align: left">Rahman&rsquo;s step was aggressive; 2 fs ensures numerical stability</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Equilibration</strong></td>
          <td style="text-align: left">Velocity Scaling</td>
          <td style="text-align: left">1 ns NVT</td>
          <td style="text-align: left">Rahman couldn&rsquo;t afford long equilibrations; I melted the crystal properly to remove bias</td>
      </tr>
  </tbody>
</table>
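<p>As a quick consistency check on the physical parameters above, the mass density and atom count fix the size of the cubic simulation box. The sketch below assumes the standard atomic mass of argon (39.948 g/mol) and the CODATA Avogadro constant, neither of which is quoted in this post. It gives a box edge of roughly 34.7 Å, or about 10.2σ.</p>

```python
# Physical constants (assumed values, not taken from the post).
AVOGADRO = 6.02214076e23      # 1/mol
ARGON_MOLAR_MASS = 39.948     # g/mol

# Parameters quoted in the list above.
N_ATOMS = 864
SIGMA_ANGSTROM = 3.4
MASS_DENSITY = 1.374          # g/cm^3

# Number density: atoms per cm^3, then per cubic angstrom (1 cm^3 = 1e24 A^3).
number_density_cm3 = MASS_DENSITY * AVOGADRO / ARGON_MOLAR_MASS
number_density_A3 = number_density_cm3 / 1e24

# Edge of the cubic box that holds 864 atoms at that density.
box_length = (N_ATOMS / number_density_A3) ** (1.0 / 3.0)

# Reduced (dimensionless) Lennard-Jones density, rho * sigma^3.
reduced_density = number_density_A3 * SIGMA_ANGSTROM ** 3

print(f"box length      = {box_length:.2f} A ({box_length / SIGMA_ANGSTROM:.2f} sigma)")
print(f"reduced density = {reduced_density:.3f}")
```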
<p>The production run lasted 10 ps in the NVE ensemble, generating 5,001 frames. The mean temperature stayed within 1% of the target, with a relative RMS fluctuation of 0.0165.</p>
<hr>
<h2 id="validation-results">Validation Results</h2>
<p>The replication was quantitatively successful. The analysis pipeline faithfully reproduced every key signature of liquid argon.</p>
<h3 id="the-cage-effect">The Cage Effect</h3>
<p>This is the paper&rsquo;s crown jewel. In a gas, velocity correlations decay exponentially. In a liquid, Rahman discovered that atoms get trapped by their neighbors and bounce back, causing the correlation to go <em>negative</em>.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-velocity-autocorrelation.webp"
         alt="Velocity Autocorrelation Function"
         title="Velocity Autocorrelation Function"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The VACF dips below zero at 0.3 ps. This &lsquo;negative correlation&rsquo; is the signature of the cage effect: atoms rattling against their neighbors.</figcaption>
    
</figure>

<p>My simulation captures this minimum at -0.083, matching Rahman&rsquo;s observation. The Fourier transform of this data (the frequency spectrum) reveals a peak at $\beta \approx 0.25$, physically representing the frequency of atomic collisions within the cage.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-vacf-frequency-spectrum.webp"
         alt="Frequency spectrum of the VACF showing characteristic peak from atomic caging effects"
         title="Frequency spectrum of the VACF showing characteristic peak from atomic caging effects"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Frequency spectrum of the VACF showing characteristic peak from atomic caging effects</figcaption>
    
</figure>
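<p>For reference, the velocity autocorrelation function is conventionally defined as an average over atoms and time origins, normalized by its value at zero lag. I am assuming this normalized form is the quantity plotted above, since the quoted minimum of -0.083 is dimensionless:</p>
<p>$$
\begin{aligned}
C(t) = \frac{\langle \mathbf{v}_i(0) \cdot \mathbf{v}_i(t) \rangle}{\langle \mathbf{v}_i(0) \cdot \mathbf{v}_i(0) \rangle}
\end{aligned}
$$</p>
<p>With this normalization $C(0) = 1$, and a negative value means that an atom&rsquo;s velocity has, on average, reversed direction relative to where it started.</p>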

<h3 id="structural-fingerprints">Structural Fingerprints</h3>
<p>The Radial Distribution Function $g(r)$ and its Fourier transform, the Structure Factor $S(k)$, are the &ldquo;fingerprints&rdquo; of a liquid&rsquo;s structure.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-radial-distribution-function.webp"
         alt="Radial Distribution Function and Structure Factor"
         title="Radial Distribution Function and Structure Factor"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The sharp first peak (3.82 Å) shows defined nearest neighbors, while the decay shows the lack of long-range order. My first peak matches Rahman&rsquo;s to within about 3%.</figcaption>
    
</figure>

<p>The agreement here is striking. My first peak appeared at <strong>3.82 Å</strong> (Rahman: 3.7 Å). The slight discrepancy is likely due to my improved equilibration method, which allowed the system to relax into a more natural liquid state than Rahman&rsquo;s 1960s hardware allowed.</p>
<h3 id="diffusion-and-non-gaussian-behavior">Diffusion and Non-Gaussian Behavior</h3>
<p>By calculating the Mean Square Displacement (MSD), I derived a diffusion coefficient of <strong>$D = 2.47 \times 10^{-5}$ cm²/s</strong>, which deviates by less than <strong>2%</strong> from Rahman&rsquo;s reported $2.43 \times 10^{-5}$ cm²/s.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-mean-square-displacement.webp"
         alt="Mean Square Displacement vs time showing ballistic to diffusive transition"
         title="Mean Square Displacement vs time showing ballistic to diffusive transition"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Mean Square Displacement vs. time showing ballistic to diffusive transition</figcaption>
    
</figure>
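<p>The step from MSD to diffusion coefficient uses the Einstein relation, $D = \lim_{t \to \infty} \text{MSD}(t) / 6t$ in three dimensions. Below is a minimal sketch that fits a line to the diffusive part of the curve and converts units. The function name and the choice of fitting window are illustrative assumptions, not the project&rsquo;s implementation.</p>

```python
def diffusion_coefficient(times_ps, msd_A2):
    """Einstein relation in 3D: D = slope / 6, slope from a least-squares line.

    times_ps: lag times in picoseconds (diffusive regime only).
    msd_A2:   mean square displacement in square angstroms.
    Returns D in cm^2/s.
    """
    n = len(times_ps)
    t_mean = sum(times_ps) / n
    m_mean = sum(msd_A2) / n
    num = sum((t - t_mean) * (m - m_mean) for t, m in zip(times_ps, msd_A2))
    den = sum((t - t_mean) ** 2 for t in times_ps)
    slope = num / den  # A^2 / ps
    # Unit conversion: 1 A^2/ps = 1e-16 cm^2 / 1e-12 s = 1e-4 cm^2/s
    return slope / 6.0 * 1e-4


# Synthetic data: a line with slope 1.482 A^2/ps corresponds to 2.47e-5 cm^2/s.
times = [1.0, 1.5, 2.0, 2.5, 3.0]
msd = [0.3 + 1.482 * t for t in times]
print(diffusion_coefficient(times, msd))
```

<p>The fit should exclude the short-time ballistic regime, where the MSD grows as $t^2$ and the slope is not yet constant.</p>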

<p>More interestingly, I reproduced the &ldquo;Non-Gaussian&rdquo; parameters. Standard diffusion assumes a Gaussian distribution of displacements. Rahman found (and I confirmed) that liquid atoms deviate from this. They exhibit &ldquo;jump&rdquo; and &ldquo;wait&rdquo; dynamics, a behavior that standard Brownian motion models fail to capture.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-non-gaussian-parameters.webp"
         alt="Non-Gaussian parameters showing deviation from simple diffusive behavior"
         title="Non-Gaussian parameters showing deviation from simple diffusive behavior"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Evidence that atoms do not follow a simple random walk. The non-zero alpha parameters indicate heterogeneous dynamics.</figcaption>
    
</figure>
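<p>The non-Gaussian parameters compare higher moments of the displacement distribution with what a Gaussian would give. A commonly used definition is $\alpha_n = \langle r^{2n} \rangle / (C_n \langle r^2 \rangle^n) - 1$ with $C_n = (2n+1)!!/3^n$, so that every $\alpha_n$ vanishes for a Gaussian in three dimensions. The sketch below assumes that definition; the function name and input layout are hypothetical.</p>

```python
def non_gaussian_parameter(displacements, n=2):
    """alpha_n = <r^(2n)> / (C_n <r^2>^n) - 1, with C_n = (2n+1)!! / 3^n.

    displacements: iterable of (dx, dy, dz) tuples at one lag time.
    For Gaussian-distributed displacements in 3D every alpha_n is zero.
    """
    r2 = [dx * dx + dy * dy + dz * dz for dx, dy, dz in displacements]
    mean_r2 = sum(r2) / len(r2)
    mean_r2n = sum(x ** n for x in r2) / len(r2)
    # C_n = 1 * 3 * 5 * ... * (2n + 1) / 3^n  (C_2 = 5/3, C_3 = 35/9)
    double_factorial = 1
    for odd in range(1, 2 * n + 2, 2):
        double_factorial *= odd
    c_n = double_factorial / 3 ** n
    return mean_r2n / (c_n * mean_r2 ** n) - 1.0
```

<p>As a sanity check, if every atom moved exactly the same distance, the function would return $3/5 - 1 = -0.4$ for $n = 2$, because the distribution is narrower than a Gaussian. Positive values indicate heavier tails.</p>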

<h3 id="advanced-analysis-van-hove-functions">Advanced Analysis: Van Hove Functions</h3>
<p>Rahman also explored advanced properties like the Van Hove correlation function $G(r,t)$, which describes how liquid structure evolves over time.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-van-hove-correlation.webp"
         alt="Van Hove distinct correlation function G_d(r,t) at two time points"
         title="Van Hove distinct correlation function G_d(r,t) at two time points"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Van Hove distinct correlation function showing how neighbor coordination shells &lsquo;melt&rsquo; as time progresses</figcaption>
    
</figure>

<p>At 1.0 ps, the structure remains well-defined with clear shells. By 2.5 ps, it becomes increasingly diffuse. Rahman compared this evolution to theoretical predictions (the Vineyard approximation) and found that theory predicted overly rapid structural decay. My results confirm this finding.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-delayed-convolution.webp"
         alt="Delayed convolution approximation testing Rahman&#39;s theoretical improvement"
         title="Delayed convolution approximation testing Rahman&#39;s theoretical improvement"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Testing Rahman&rsquo;s &lsquo;delayed convolution approximation&rsquo; (his proposed improvement over existing theory)</figcaption>
    
</figure>
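<p>For readers who want the formulas behind these two approximations, here is the form I understand them to take. The notation is mine and may differ from the paper&rsquo;s. The Vineyard approximation builds the distinct part of the Van Hove function by convolving the static structure $g(r)$ with the self part $G_s$ at the same time $t$:</p>
<p>$$
\begin{aligned}
G_d(\mathbf{r}, t) \approx \int g(\mathbf{r}') \, G_s(\mathbf{r} - \mathbf{r}', t) \, d\mathbf{r}'
\end{aligned}
$$</p>
<p>The delayed convolution keeps the same structure but evaluates the self part at an earlier, delayed time $t' \le t$, which slows the predicted decay of the neighbor shells:</p>
<p>$$
\begin{aligned}
G_d(\mathbf{r}, t) \approx \int g(\mathbf{r}') \, G_s(\mathbf{r} - \mathbf{r}', t') \, d\mathbf{r}'
\end{aligned}
$$</p>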

<hr>
<h2 id="system-validation">System Validation</h2>
<p>Before analyzing the physics, I ran basic sanity checks to confirm proper thermal equilibrium.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-temperature-stability.webp"
         alt="Temperature vs time plot showing excellent temperature control around 94.4 K target"
         title="Temperature vs time plot showing excellent temperature control around 94.4 K target"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Temperature vs. Time - 5001 frames showing excellent temperature control with mean 94.73 K</figcaption>
    
</figure>

<p>Mean temperature was 94.73 K (0.33 K off target) with a standard deviation of 1.56 K.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-maxwell-boltzmann-velocity.webp"
         alt="Maxwell-Boltzmann velocity distribution"
         title="Maxwell-Boltzmann velocity distribution"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Maxwell-Boltzmann velocity distribution from 12.9 million velocity components</figcaption>
    
</figure>

<p>The velocity distribution from 12.9 million velocity components produces a clean Maxwell-Boltzmann distribution, as expected for thermal equilibrium. The distribution widths at various heights closely match Rahman&rsquo;s results: 1.77, 2.48, and 3.56 compared to his 1.77, 2.52, and 3.52.</p>
<hr>
<h2 id="conclusion">Conclusion</h2>
<p>Replicating a 60-year-old paper might seem like a solved puzzle, but it teaches a valuable lesson in computational science. Rahman relied on brilliance and raw mathematical intuition because he lacked compute power. Today, pairing modern compute with disciplined software practices makes the same result reproducible and auditable.</p>
<p>Applying modern software engineering (<strong>modular architecture, caching, and automated workflows</strong>) to classical physics reproduces the past and builds a foundation that makes the <em>next</em> discovery easier, faster, and more reliable.</p>
<p>The quantitative agreement is striking: the diffusion coefficient within 2%, the first RDF peak within about 0.12 Å (roughly 3%), and the velocity-distribution widths within 2%, with the narrowest matching to three significant figures. This level of reproducibility, achieved with completely different hardware and software, validates something fundamental: Rahman&rsquo;s physical model was remarkably sound, and his computational methodology was scientifically rigorous despite 1960s constraints.</p>
<p>The cage effect, velocity correlations, and structural evolution are fundamental characteristics of how matter behaves at the atomic scale, as relevant today as they were six decades ago.</p>
]]></content:encoded></item><item><title>Modernizing Rahman's 1964 Argon Simulation</title><link>https://hunterheidenreich.com/projects/rahman-1964-replication/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/rahman-1964-replication/</guid><description>A high-fidelity replication of foundational molecular dynamics using modern software engineering practices: caching, vectorization, and strict reproducibility.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>This project is a &ldquo;digital restoration&rdquo; of Aneesur Rahman&rsquo;s seminal 1964 paper, <em>Correlations in the Motion of Atoms in Liquid Argon</em>. While the physics of liquid argon is a solved problem, the challenge lies in bridging the gap between 1960s mainframe constraints and 2025 software architecture.</p>
<p>I replicated the simulation using <strong>LAMMPS</strong> and built a <strong>Python analysis pipeline</strong> to process the trajectory data. The project demonstrates how modern tooling (<code>uv</code>, type hinting, vectorized NumPy) can transform academic &ldquo;write-once&rdquo; scripts into a reproducible research toolkit.</p>
<h2 id="features">Features</h2>
<h3 id="the-analysis-pipeline">The Analysis Pipeline</h3>
<p>I architected a modular Python package (<code>argon_sim</code>) designed for performance and maintainability.</p>
<ul>
<li><strong>Intelligent Caching System</strong>: MD analysis is compute-intensive ($O(N^2)$). I implemented a decorator-based caching layer (<code>@cached_computation</code>) that hashes source file modification times and function arguments. This ensures expensive calculations (like RDF or Van Hove correlations) are only re-run when the underlying trajectory or parameters actually change.</li>
<li><strong>Vectorization &amp; Optimization</strong>: To handle the $N^2$ complexity of pair-wise interactions without C++ extensions, I utilized NumPy broadcasting. For example, the Mean Square Displacement (MSD) calculation is fully vectorized, with a fallback &ldquo;chunked&rdquo; implementation to handle memory overflows on smaller machines.</li>
<li><strong>Modern Python Tooling</strong>:
<ul>
<li><strong>Dependency Management</strong>: Used <code>uv</code> for deterministic environment locking (sub-second resolution).</li>
<li><strong>Type Safety</strong>: Fully type-hinted codebase for static analysis compliance.</li>
<li><strong>Automation</strong>: A <code>Makefile</code> abstracts the workflow (simulation → analysis → figure generation) into single commands (e.g., <code>make figure-5</code>).</li>
</ul>
</li>
</ul>
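<p>The caching layer described above can be sketched in a few lines. This is an illustration of the idea only: the real <code>@cached_computation</code> signature is not shown in this write-up, so the <code>source_path</code> and <code>cache_dir</code> parameters, the pickle storage, and the use of <code>repr</code> to hash arguments are all assumptions.</p>

```python
import functools
import hashlib
import os
import pickle


def cached_computation(source_path, cache_dir=".cache"):
    """Cache a function's result on disk, keyed by file mtime and arguments."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The key changes if the source file is modified or the arguments differ.
            mtime = os.path.getmtime(source_path)
            raw = repr((func.__name__, source_path, mtime, args, sorted(kwargs.items())))
            key = hashlib.sha256(raw.encode()).hexdigest()
            cache_file = os.path.join(cache_dir, f"{key}.pkl")
            if os.path.exists(cache_file):
                with open(cache_file, "rb") as fh:
                    return pickle.load(fh)
            # Cache miss: run the expensive computation and store the result.
            result = func(*args, **kwargs)
            os.makedirs(cache_dir, exist_ok=True)
            with open(cache_file, "wb") as fh:
                pickle.dump(result, fh)
            return result
        return wrapper
    return decorator
```

<p>Hashing <code>repr</code> of the arguments only works when they have stable string representations (numbers, strings, tuples). Large arrays would need a content hash instead.</p>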
<h3 id="the-simulation-strategy">The Simulation Strategy</h3>
<p>I used LAMMPS for the MD engine but strictly adhered to Rahman&rsquo;s physical parameters while modernizing the stability mechanisms.</p>
<ul>
<li><strong>Integration</strong>: Replaced Rahman&rsquo;s predictor-corrector method with the modern standard <strong>Velocity Verlet</strong> algorithm (2 fs timestep).</li>
<li><strong>Equilibration</strong>: I implemented a 1 ns <strong>NVT equilibration</strong> phase (500,000 steps at the 2 fs timestep) to properly melt the FCC crystal structure before the NVE production run.</li>
<li><strong>Intellectual Honesty</strong>: The <code>in.argon</code> script explicitly documents every deviation from the original methodology (e.g., energy minimization), along with its justification in terms of numerical stability.</li>
</ul>
<h2 id="usage">Usage</h2>
<p>The project uses a <code>Makefile</code> to automate the workflow. Run <code>make all</code> to execute the LAMMPS simulation and generate all analysis figures.</p>
<h2 id="results">Results</h2>
<p>The replication achieved high quantitative agreement with the historical data, validating both the simulation parameters and the custom analysis code.</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Property</th>
          <th style="text-align: left">Rahman (1964)</th>
          <th style="text-align: left">This Work</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">Diffusion Coefficient ($D$)</td>
          <td style="text-align: left">$2.43 \times 10^{-5}$ cm²/s</td>
          <td style="text-align: left">$2.47 \times 10^{-5}$ cm²/s</td>
          <td style="text-align: left">Agreement within 2%</td>
      </tr>
      <tr>
          <td style="text-align: left">RDF First Peak</td>
          <td style="text-align: left">$3.7$ Å</td>
          <td style="text-align: left">$3.82$ Å</td>
          <td style="text-align: left">Slight shift</td>
      </tr>
      <tr>
          <td style="text-align: left">Velocity Dist. Width ($e^{-1/2}$)</td>
          <td style="text-align: left">$1.77$</td>
          <td style="text-align: left">$1.77$</td>
          <td style="text-align: left">Exact match to theoretical Maxwell-Boltzmann</td>
      </tr>
  </tbody>
</table>
<h3 id="visual-replication">Visual Replication</h3>
<p>I used Matplotlib to digitally recreate Rahman&rsquo;s hand-drawn plots, confirming signatures like the <strong>negative region in the Velocity Autocorrelation Function (VACF)</strong>, which provided the first evidence of the &ldquo;cage effect&rdquo; in simple liquids.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-velocity-autocorrelation.webp"
         alt="Velocity Autocorrelation Function comparison showing the characteristic negative region"
         title="Velocity Autocorrelation Function comparison showing the characteristic negative region"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The VACF&rsquo;s negative region (first evidence of the &lsquo;cage effect&rsquo; in liquids) reproduced 60 years later.</figcaption>
    
</figure>

<h2 id="challenges--learnings">Challenges &amp; Learnings</h2>
<ul>
<li><strong>Unit Hell</strong>: Rahman&rsquo;s paper uses a mix of reduced units and CGS. Mapping these to LAMMPS&rsquo;s <code>real</code> units required a dedicated <code>constants.py</code> module and rigorous unit testing to prevent dimensional errors.</li>
<li><strong>Fourier Transforms</strong>: Calculating the Structure Factor $S(k)$ from $g(r)$ required implementing a manual 3D Fourier transform for spherical symmetry, as standard FFT packages do not account for the radial shell integration implicit in liquid structure analysis.</li>
<li><strong>Code as a Liability</strong>: Early in the project, I realized that re-running analysis scripts was becoming a bottleneck. This drove the decision to build the caching infrastructure, reinforcing the lesson that investing in developer tooling pays off even in small-scale scientific projects.</li>
</ul>
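<p>For an isotropic liquid, the three-dimensional transform reduces to a one-dimensional radial integral, $S(k) = 1 + 4\pi\rho \int_0^\infty r^2 [g(r) - 1] \frac{\sin(kr)}{kr} \, dr$, where $\rho$ is the number density. The following is a minimal sketch of that integral using the trapezoid rule, with the upper limit truncated at the last tabulated $r$. Function and parameter names are illustrative and not taken from <code>argon_sim</code>.</p>

```python
import math


def structure_factor(r, g, k_values, number_density):
    """S(k) = 1 + 4*pi*rho * integral of r^2 (g(r) - 1) sin(kr)/(kr) dr.

    r, g: equally long sequences sampling g(r) on a grid with r[0] > 0.
    The integral is truncated at r[-1] and evaluated with the trapezoid rule.
    """
    s_of_k = []
    for k in k_values:
        integrand = [
            ri * ri * (gi - 1.0) * math.sin(k * ri) / (k * ri)
            for ri, gi in zip(r, g)
        ]
        integral = sum(
            0.5 * (integrand[i] + integrand[i + 1]) * (r[i + 1] - r[i])
            for i in range(len(r) - 1)
        )
        s_of_k.append(1.0 + 4.0 * math.pi * number_density * integral)
    return s_of_k
```

<p>Because $g(r)$ is only known out to a finite distance in a periodic box, the truncation introduces ripples at small $k$, so the low-$k$ region deserves some caution.</p>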
<h2 id="related-work">Related Work</h2>
<p>The full methodology and physics are documented in the companion blog post:</p>
<ul>
<li><a href="/posts/rahman-1964-lammps-liquid-argon/">Replicating Rahman&rsquo;s 1964 Liquid Argon Simulation</a></li>
</ul>
]]></content:encoded></item><item><title>Liquid Argon: LAMMPS Simulation</title><link>https://hunterheidenreich.com/videos/liquid-argon-lammps-simulation/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/videos/liquid-argon-lammps-simulation/</guid><description>LAMMPS molecular dynamics simulation of liquid argon demonstrating fundamental liquid-state behavior and molecular motion.</description><content:encoded><![CDATA[<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/KjFixUt6bnQ?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<p>Details on the simulation can be found in the <a href="/posts/rahman-1964-lammps-liquid-argon/">Replicating Rahman&rsquo;s 1964 Liquid Argon Simulation</a> post.</p>
]]></content:encoded></item><item><title>eSEN: Smooth Interatomic Potentials (ICML Spotlight)</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/learning-smooth-interatomic-potentials/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/learning-smooth-interatomic-potentials/</guid><description>Fu et al. propose energy conservation as a key MLIP diagnostic and introduce eSEN, bridging test accuracy and real performance.</description><content:encoded><![CDATA[<h2 id="paper-overview">Paper Overview</h2>
<p>This is a <strong>method paper</strong>. It addresses a critical disconnect in the evaluation of Machine Learning Interatomic Potentials (MLIPs) and introduces a novel architecture, <strong>eSEN</strong>, designed based on insights from this analysis. The paper proposes a new standard for evaluating MLIPs beyond simple test-set errors.</p>
<h2 id="the-energy-conservation-gap-in-mlip-evaluation">The Energy Conservation Gap in MLIP Evaluation</h2>
<p>The motivation is a well-known but under-examined problem in the field: improvements in standard MLIP metrics (lower energy/force MAE on static test sets) do not reliably translate to better performance on complex downstream tasks like molecular dynamics (MD) simulations, materials stability prediction, or phonon calculations. The authors seek to understand why this gap exists and how to design models that are both accurate on test sets and physically reliable in practical scientific workflows.</p>
<h2 id="the-esen-architecture-and-continuous-representation">The eSEN Architecture and Continuous Representation</h2>
<p>The novelty is twofold, spanning both a conceptual framework for evaluation and a new model architecture:</p>
<ol>
<li>
<p><strong>Energy Conservation as a Diagnostic Test</strong>: The core conceptual contribution is using an MLIP&rsquo;s ability to conserve energy in out-of-distribution MD simulations as a crucial diagnostic test. The authors demonstrate that for models passing this test, a strong correlation between test-set error and downstream task performance is restored.</p>
</li>
<li>
<p><strong>The eSEN Architecture</strong>: The paper introduces the <strong>equivariant Smooth Energy Network (eSEN)</strong>, designed with specific choices to ensure a smooth and well-behaved Potential Energy Surface (PES):</p>
<ul>
<li><strong>Strictly Conservative Forces</strong>: Forces are computed exclusively as the negative gradient of energy ($F = -\nabla E$), using conservative force prediction instead of faster direct-force prediction heads.</li>
<li><strong>Continuous Representations</strong>: Maintains strict equivariance and smoothness by using equivariant gated non-linearities instead of discretizing spherical harmonic representations during nodewise processing.</li>
<li><strong>Smooth PES Construction</strong>: Critical design choices include using distance cutoffs, polynomial envelope functions that ensure derivatives go to zero at the cutoff, and a limited number of radial basis functions to avoid an overly sensitive PES.</li>
</ul>
</li>
<li>
<p><strong>Efficient Training Strategy</strong>: A two-stage training regimen with fast pre-training using a non-conservative direct-force model, followed by fine-tuning to enforce energy conservation. This captures the efficiency of direct-force training while ensuring physical robustness.</p>
</li>
</ol>
<h2 id="evaluating-ood-energy-conservation-and-physical-properties">Evaluating OOD Energy Conservation and Physical Properties</h2>
<p>The paper presents a comprehensive experimental validation:</p>
<ol>
<li>
<p><strong>Ablation Studies on Energy Conservation</strong>: MD simulations on out-of-distribution systems (TM23 and MD22 datasets) systematically tested key design choices (direct-force vs. conservative, representation discretization, neighbor limits, envelope functions). This empirically demonstrated which choices lead to energy drift despite negligible impact on test-set MAE.</p>
</li>
<li>
<p><strong>Physical Property Prediction Benchmarks</strong>: The eSEN model was evaluated on challenging downstream tasks:</p>
<ul>
<li><strong>Matbench-Discovery</strong>: Materials stability and thermal conductivity prediction, where eSEN achieved the highest F1 score among compliant models and excelled at both metrics simultaneously.</li>
<li><strong>MDR Phonon Benchmark</strong>: Predicting phonon properties that test accurate second- and third-order derivatives of the PES. eSEN achieved state-of-the-art results, particularly outperforming direct-force models.</li>
<li><strong>SPICE-MACE-OFF</strong>: Standard energy and force prediction on organic molecules, demonstrating that physical plausibility design choices enhanced raw accuracy.</li>
</ul>
</li>
<li>
<p><strong>Correlation Analysis</strong>: Explicit plots of test-set energy MAE versus performance on downstream benchmarks showed weak overall correlation that becomes strong and predictive when restricted to models passing the energy conservation test.</p>
</li>
</ol>
<h2 id="outcomes-and-conclusions">Outcomes and Conclusions</h2>
<ul>
<li>
<p><strong>Primary Conclusion</strong>: Energy conservation is a critical, practical property for MLIPs. Using it as a filter re-establishes test-set error as a reliable proxy for model development, dramatically accelerating the innovation cycle. Models that are not conservative, even with low test error, are unreliable for many critical scientific applications.</p>
</li>
<li>
<p><strong>Model Performance</strong>: The eSEN architecture outperforms baseline models across diverse tasks, from energy/force prediction to geometry optimization, phonon calculations, and thermal conductivity prediction.</p>
</li>
<li>
<p><strong>Actionable Design Principles</strong>: The paper provides experimentally-validated architectural choices that promote physical plausibility. Seemingly minor details, like how atomic neighbors are selected, can have profound impacts on a model&rsquo;s utility in simulations.</p>
</li>
<li>
<p><strong>Efficient Path to Robust Models</strong>: The direct-force pre-training plus conservative fine-tuning strategy offers a practical method for developing physically robust models without incurring the full computational cost of conservative training from scratch.</p>
</li>
</ul>
<hr>
<h2 id="reproducibility">Reproducibility</h2>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://github.com/facebookresearch/fairchem">fairchem (GitHub)</a></td>
          <td>Code</td>
          <td>MIT</td>
          <td>Official implementation within FAIR Chemistry framework</td>
      </tr>
      <tr>
          <td><a href="https://huggingface.co/facebook/OMAT24">OMAT24 (Hugging Face)</a></td>
          <td>Model</td>
          <td>FAIR Acceptable Use Policy</td>
          <td>Pre-trained eSEN-30M-MP and eSEN-30M-OAM checkpoints</td>
      </tr>
      <tr>
          <td><a href="https://openreview.net/forum?id=R0PBjxIbgm">OpenReview</a></td>
          <td>Paper</td>
          <td>CC BY 4.0</td>
          <td>ICML 2025 camera-ready paper</td>
      </tr>
  </tbody>
</table>
<h3 id="models">Models</h3>
<p>The eSEN architecture builds on components from <strong>eSCN</strong> (Equivariant Spherical Channel Network) and <strong>Equiformer</strong>, combining them with design choices that prioritize smoothness and energy conservation. The implementation integrates into the standard <code>fairchem</code> Open Catalyst experimental framework.</p>
<h4 id="layer-structure">Layer Structure</h4>
<ul>
<li><strong>Edgewise Convolution</strong>: Uses <code>SO2</code> convolution layers (from eSCN) with an envelope function applied. Source and target embeddings are concatenated before convolution.</li>
<li><strong>Nodewise Feed-Forward</strong>: Two equivariant linear layers with an intermediate <strong>SiLU-based gated non-linearity</strong> (from Equiformer).</li>
<li><strong>Normalization</strong>: Equivariant Layer Normalization (from Equiformer).</li>
</ul>
<h4 id="smoothness-design-choices">Smoothness Design Choices</h4>
<p>Several architectural decisions distinguish eSEN from prior work:</p>
<ul>
<li><strong>No Grid Projection</strong>: eSEN performs operations directly in the spherical harmonic space to maintain equivariance and energy conservation, bypassing the projection of spherical harmonics to spatial grids for non-linearity.</li>
<li><strong>Distance Cutoff for Graph Construction</strong>: Uses a strict distance cutoff (6 Å for MPTrj models, 5 Å for SPICE models) instead of a maximum-neighbor limit, because neighbor limits introduce discontinuities that break energy conservation.</li>
<li><strong>Polynomial Envelope Functions</strong>: Ensures derivatives go to zero smoothly at the cutoff radius.</li>
</ul>
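<p>To make the envelope idea concrete, here is a sketch of one widely used polynomial envelope, the DimeNet-style form. Whether eSEN uses exactly this polynomial or this exponent is an assumption on my part. The function equals 1 at zero distance and reaches 0 at the cutoff with vanishing first and second derivatives, so edge contributions switch off smoothly.</p>

```python
def polynomial_envelope(distance, cutoff, p=5):
    """DimeNet-style envelope: 1 at d = 0, and 0 with zero slope at the cutoff."""
    if distance >= cutoff:
        return 0.0
    x = distance / cutoff
    # Coefficients chosen so u(1) = u'(1) = u''(1) = 0.
    a = -(p + 1) * (p + 2) / 2.0
    b = p * (p + 2)
    c = -p * (p + 1) / 2.0
    return 1.0 + a * x ** p + b * x ** (p + 1) + c * x ** (p + 2)
```

<p>For example, with a 6 Å cutoff the envelope is about 0.77 at 3 Å and falls continuously to zero at 6 Å, in contrast to a hard cutoff, which would make the energy jump whenever a neighbor crosses the boundary.</p>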
<h3 id="algorithms">Algorithms</h3>
<h4 id="two-stage-training-esen-30m-mp">Two-Stage Training (eSEN-30M-MP)</h4>
<ol>
<li><strong>Direct-Force Pre-training</strong> (60 epochs): Uses <strong>DeNS</strong> (Denoising Non-equilibrium Structures) to reduce overfitting. This stage is fast because it does not require backpropagation through energy gradients.</li>
<li><strong>Conservative Fine-tuning</strong> (40 epochs): The direct-force head is removed, and forces are calculated via gradients ($F = -\nabla E$). This enforces energy conservation.</li>
</ol>
<p><strong>Important</strong>: DeNS is used exclusively during the direct-force pre-training stage, with a noising probability of 0.5, a standard deviation of 0.1 Å for the added Gaussian noise, and a DeNS loss coefficient of 10. The fine-tuning strategy reduces the wall-clock time for model training by 40%.</p>
<h4 id="optimization">Optimization</h4>
<ul>
<li><strong>Optimizer</strong>: AdamW with cosine learning rate scheduler</li>
<li><strong>Max Learning Rate</strong>: $4 \times 10^{-4}$</li>
<li><strong>Batch Size</strong>: 512 (for MPTrj models)</li>
<li><strong>Weight Decay</strong>: $1 \times 10^{-3}$</li>
<li><strong>Gradient Clipping</strong>: Norm of 100</li>
<li><strong>Warmup</strong>: 0.1 epochs with a factor of 0.2</li>
</ul>
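<p>One plausible reading of these settings is a linear warmup that starts at 0.2 times the maximum learning rate and reaches the maximum after 0.1 epochs, followed by cosine decay. The sketch below encodes that reading. The decay-to-zero floor, the linear warmup shape, and the <code>steps_per_epoch</code> argument are assumptions, not details confirmed by the paper or by <code>fairchem</code>.</p>

```python
import math


def learning_rate(step, steps_per_epoch, total_epochs,
                  max_lr=4e-4, warmup_epochs=0.1, warmup_factor=0.2):
    """Linear warmup from warmup_factor * max_lr, then cosine decay to zero."""
    warmup_steps = warmup_epochs * steps_per_epoch
    total_steps = total_epochs * steps_per_epoch
    if step < warmup_steps:
        frac = step / warmup_steps
        return max_lr * (warmup_factor + (1.0 - warmup_factor) * frac)
    # Fraction of the post-warmup schedule completed, clamped to [0, 1].
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * max_lr * (1.0 + math.cos(math.pi * min(progress, 1.0)))
```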
<h4 id="loss-function">Loss Function</h4>
<p>A composite loss combining per-atom energy MAE, force $L_2$ loss, and stress MAE:</p>
<p>$$
\begin{aligned}
\mathcal{L} = \lambda_{\text{e}} \frac{1}{N} \sum_{i=1}^N \lvert E_{i} - \hat{E}_{i} \rvert + \lambda_{\text{f}} \frac{1}{3N} \sum_{i=1}^N \lVert \mathbf{F}_{i} - \hat{\mathbf{F}}_{i} \rVert_2^2 + \lambda_{\text{s}} \lVert \mathbf{S} - \hat{\mathbf{S}} \rVert_1
\end{aligned}
$$</p>
<p>For MPTrj-30M, the weighting coefficients are set to $\lambda_{\text{e}} = 20$, $\lambda_{\text{f}} = 20$, and $\lambda_{\text{s}} = 5$.</p>
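<p>The loss above can be transcribed term by term. The sketch below follows the formula as written, using the same $N$ for the energy and force sums and the MPTrj-30M weights as defaults. Function and argument names are hypothetical, and plain Python lists stand in for tensors.</p>

```python
def composite_loss(e_true, e_pred, f_true, f_pred, s_true, s_pred,
                   lam_e=20.0, lam_f=20.0, lam_s=5.0):
    """Weighted sum of energy MAE, force squared-L2, and stress L1 terms.

    e_*: N per-atom energies; f_*: N force vectors given as 3-tuples;
    s_*: flattened stress components.
    """
    n = len(e_true)
    # Energy term: mean absolute error.
    energy_term = sum(abs(a - b) for a, b in zip(e_true, e_pred)) / n
    # Force term: squared L2 norm of each error vector, averaged over 3N components.
    force_term = sum(
        sum((x - y) ** 2 for x, y in zip(fa, fb))
        for fa, fb in zip(f_true, f_pred)
    ) / (3 * n)
    # Stress term: L1 norm of the stress error.
    stress_term = sum(abs(a - b) for a, b in zip(s_true, s_pred))
    return lam_e * energy_term + lam_f * force_term + lam_s * stress_term
```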
<h3 id="data">Data</h3>
<h4 id="training-data">Training Data</h4>
<ul>
<li><strong>Inorganic</strong>: MPTrj (Materials Project Trajectory) dataset</li>
<li><strong>Organic</strong>: SPICE-MACE-OFF dataset</li>
</ul>
<h4 id="test-data-construction">Test Data Construction</h4>
<ul>
<li><strong>MPTrj Testing</strong>: Since MPTrj lacks an official test split, the authors created a test set using 5,000 random samples from the <strong>subsampled Alexandria (sAlex)</strong> dataset to ensure fair comparison.</li>
<li><strong>Out-of-Distribution Conservation Testing</strong>:
<ul>
<li><em>Inorganic</em>: <strong>TM23</strong> dataset (transition metal defects). Simulation: 100 ps, 5 fs timestep.</li>
<li><em>Organic</em>: <strong>MD22</strong> dataset (large molecules). Simulation: 100 ps, 1 fs timestep.</li>
</ul>
</li>
</ul>
<h3 id="hardware">Hardware</h3>
<p>Training ran predominantly on <strong>80GB NVIDIA A100 GPUs</strong>.</p>
<h4 id="inference-efficiency">Inference Efficiency</h4>
<p>For a periodic system of <strong>216 atoms</strong> on a single A100 (PyTorch 2.4.0, CUDA 12.1, no compile/torchscript), the 2-layer eSEN models achieve approximately <strong>0.4 million steps per day</strong> (3.2M parameters) and <strong>0.8 million steps per day</strong> (6.5M parameters), comparable to MACE-OFF-L at 0.7 million steps per day.</p>
<h3 id="evaluation">Evaluation</h3>
<p>The paper evaluated eSEN across three major benchmark tasks. Key evaluation metrics included energy MAE (meV/atom), force MAE (meV/Å), stress MAE (meV/Å/atom), F1 score for stability prediction, $\kappa_{\text{SRME}}$ for thermal conductivity, and phonon frequency accuracy.</p>
<h4 id="ablation-test-set-mae-table-1">Ablation Test-Set MAE (Table 1)</h4>
<p>Design choices that dramatically affect energy conservation have negligible impact on static test-set MAE, which is precisely why test-set error alone is misleading. All models are 2-layer with 3.2M parameters, $L_{\text{max}} = 2$, $M_{\text{max}} = 2$:</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Energy MAE</th>
          <th>Force MAE</th>
          <th>Stress MAE</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>eSEN (default)</td>
          <td>17.02</td>
          <td>43.96</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, direct-force</td>
          <td>18.66</td>
          <td>43.62</td>
          <td>0.16</td>
      </tr>
      <tr>
          <td>eSEN, neighbor limit</td>
          <td>17.30</td>
          <td>44.11</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, no envelope</td>
          <td>17.60</td>
          <td>44.69</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, $N_{\text{basis}} = 512$</td>
          <td>19.87</td>
          <td>48.29</td>
          <td>0.15</td>
      </tr>
      <tr>
          <td>eSEN, Bessel</td>
          <td>17.65</td>
          <td>44.83</td>
          <td>0.15</td>
      </tr>
      <tr>
          <td>eSEN, discrete, res=6</td>
          <td>17.05</td>
          <td>43.10</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, discrete, res=10</td>
          <td>17.11</td>
          <td>43.13</td>
          <td>0.14</td>
      </tr>
      <tr>
          <td>eSEN, discrete, res=14</td>
          <td>17.12</td>
          <td>43.09</td>
          <td>0.14</td>
      </tr>
  </tbody>
</table>
<p>Energy MAE in meV/atom. Force MAE in meV/Å. Stress MAE in meV/Å/atom.</p>
<h4 id="matbench-discovery-tables-2-and-3">Matbench-Discovery (Tables 2 and 3)</h4>
<p><strong>Compliant models</strong> (trained only on MPTrj or its subset), unique prototype split:</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>F1</th>
          <th>DAF</th>
          <th>$\kappa_{\text{SRME}}$</th>
          <th>RMSD</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>eSEN-30M-MP</strong></td>
          <td><strong>0.831</strong></td>
          <td><strong>5.260</strong></td>
          <td><strong>0.340</strong></td>
          <td><strong>0.0752</strong></td>
      </tr>
      <tr>
          <td>eqV2-S-DeNS</td>
          <td>0.815</td>
          <td>5.042</td>
          <td>1.676</td>
          <td>0.0757</td>
      </tr>
      <tr>
          <td>MatRIS-MP</td>
          <td>0.809</td>
          <td>5.049</td>
          <td>0.861</td>
          <td>0.0773</td>
      </tr>
      <tr>
          <td>AlphaNet-MP</td>
          <td>0.799</td>
          <td>4.863</td>
          <td>1.31</td>
          <td>0.1067</td>
      </tr>
      <tr>
          <td>DPA3-v2-MP</td>
          <td>0.786</td>
          <td>4.822</td>
          <td>0.959</td>
          <td>0.0823</td>
      </tr>
      <tr>
          <td>ORB v2 MPtrj</td>
          <td>0.765</td>
          <td>4.702</td>
          <td>1.725</td>
          <td>0.1007</td>
      </tr>
      <tr>
          <td>SevenNet-13i5</td>
          <td>0.760</td>
          <td>4.629</td>
          <td>0.550</td>
          <td>0.0847</td>
      </tr>
      <tr>
          <td>GRACE-2L-MPtrj</td>
          <td>0.691</td>
          <td>4.163</td>
          <td>0.525</td>
          <td>0.0897</td>
      </tr>
      <tr>
          <td>MACE-MP-0</td>
          <td>0.669</td>
          <td>3.777</td>
          <td>0.647</td>
          <td>0.0915</td>
      </tr>
      <tr>
          <td>CHGNet</td>
          <td>0.613</td>
          <td>3.361</td>
          <td>1.717</td>
          <td>0.0949</td>
      </tr>
      <tr>
          <td>M3GNet</td>
          <td>0.569</td>
          <td>2.882</td>
          <td>1.412</td>
          <td>0.1117</td>
      </tr>
  </tbody>
</table>
<p>eSEN-30M-MP excels at both F1 and $\kappa_{\text{SRME}}$ simultaneously, while all previous models only achieve SOTA on one or the other.</p>
<p><strong>Non-compliant models</strong> (trained on additional datasets):</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>F1</th>
          <th>$\kappa_{\text{SRME}}$</th>
          <th>RMSD</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>eSEN-30M-OAM</strong></td>
          <td><strong>0.925</strong></td>
          <td><strong>0.170</strong></td>
          <td><strong>0.0608</strong></td>
      </tr>
      <tr>
          <td>eqV2-M-OAM</td>
          <td>0.917</td>
          <td>1.771</td>
          <td>0.0691</td>
      </tr>
      <tr>
          <td>ORB v3</td>
          <td>0.905</td>
          <td>0.210</td>
          <td>0.0750</td>
      </tr>
      <tr>
          <td>SevenNet-MF-ompa</td>
          <td>0.901</td>
          <td>0.317</td>
          <td>0.0639</td>
      </tr>
      <tr>
          <td>DPA3-v2-OpenLAM</td>
          <td>0.890</td>
          <td>0.687</td>
          <td>0.0679</td>
      </tr>
      <tr>
          <td>GRACE-2L-OAM</td>
          <td>0.880</td>
          <td>0.294</td>
          <td>0.0666</td>
      </tr>
      <tr>
          <td>MatterSim-v1-5M</td>
          <td>0.862</td>
          <td>0.574</td>
          <td>0.0733</td>
      </tr>
      <tr>
          <td>MACE-MPA-0</td>
          <td>0.852</td>
          <td>0.412</td>
          <td>0.0731</td>
      </tr>
  </tbody>
</table>
<p>The eSEN-30M-OAM model is pre-trained on the OMat24 dataset, then fine-tuned on the subsampled Alexandria (sAlex) dataset and MPTrj dataset.</p>
<h4 id="mdr-phonon-benchmark-table-4">MDR Phonon Benchmark (Table 4)</h4>
<p>Metrics: maximum phonon frequency MAE($\omega_{\text{max}}$) in K, vibrational entropy MAE($S$) in J/K/mol, Helmholtz free energy MAE($F$) in kJ/mol, heat capacity MAE($C_V$) in J/K/mol.</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>MAE($\omega_{\text{max}}$)</th>
          <th>MAE($S$)</th>
          <th>MAE($F$)</th>
          <th>MAE($C_V$)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>eSEN-30M-MP</strong></td>
          <td><strong>21</strong></td>
          <td><strong>13</strong></td>
          <td><strong>5</strong></td>
          <td><strong>4</strong></td>
      </tr>
      <tr>
          <td>SevenNet-13i5</td>
          <td>26</td>
          <td>28</td>
          <td>10</td>
          <td>5</td>
      </tr>
      <tr>
          <td>GRACE-2L (r6)</td>
          <td>40</td>
          <td>25</td>
          <td>9</td>
          <td>5</td>
      </tr>
      <tr>
          <td>SevenNet-0</td>
          <td>40</td>
          <td>48</td>
          <td>19</td>
          <td>9</td>
      </tr>
      <tr>
          <td>MACE</td>
          <td>61</td>
          <td>60</td>
          <td>24</td>
          <td>13</td>
      </tr>
      <tr>
          <td>CHGNet</td>
          <td>89</td>
          <td>114</td>
          <td>45</td>
          <td>21</td>
      </tr>
      <tr>
          <td>M3GNet</td>
          <td>98</td>
          <td>150</td>
          <td>56</td>
          <td>22</td>
      </tr>
  </tbody>
</table>
<p>Direct-force models show dramatically worse performance at the standard 0.01 Å displacement (e.g., eqV2-S-DeNS: 280/224/54/94) but improve at larger displacements (0.2 Å: 58/26/8/8), revealing that their PES is rough near energy minima.</p>
<h4 id="spice-mace-off-table-5">SPICE-MACE-OFF (Table 5)</h4>
<p>Test set MAE for organic molecule energy/force prediction. Energy MAE in meV/atom, force MAE in meV/Å:</p>
<table>
  <thead>
      <tr>
          <th>Dataset</th>
          <th>MACE-4.7M (E/F)</th>
          <th>EscAIP-45M* (E/F)</th>
          <th>eSEN-3.2M (E/F)</th>
          <th>eSEN-6.5M (E/F)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>PubChem</td>
          <td>0.88 / 14.75</td>
          <td>0.53 / 5.86</td>
          <td>0.22 / 6.10</td>
          <td><strong>0.15</strong> / <strong>4.21</strong></td>
      </tr>
      <tr>
          <td>DES370K M.</td>
          <td>0.59 / 6.58</td>
          <td>0.41 / 3.48</td>
          <td>0.17 / 1.85</td>
          <td><strong>0.13</strong> / <strong>1.24</strong></td>
      </tr>
      <tr>
          <td>DES370K D.</td>
          <td>0.54 / 6.62</td>
          <td>0.38 / 2.18</td>
          <td>0.20 / 2.77</td>
          <td><strong>0.15</strong> / <strong>2.12</strong></td>
      </tr>
      <tr>
          <td>Dipeptides</td>
          <td>0.42 / 10.19</td>
          <td>0.31 / 5.21</td>
          <td>0.10 / 3.04</td>
          <td><strong>0.07</strong> / <strong>2.00</strong></td>
      </tr>
      <tr>
          <td>Sol. AA</td>
          <td>0.98 / 19.43</td>
          <td>0.61 / 11.52</td>
          <td>0.30 / 5.76</td>
          <td><strong>0.25</strong> / <strong>3.68</strong></td>
      </tr>
      <tr>
          <td>Water</td>
          <td>0.83 / 13.57</td>
          <td>0.72 / 10.31</td>
          <td>0.24 / 3.88</td>
          <td><strong>0.15</strong> / <strong>2.50</strong></td>
      </tr>
      <tr>
          <td>QMugs</td>
          <td>0.45 / 16.93</td>
          <td>0.41 / 8.74</td>
          <td>0.16 / 5.70</td>
          <td><strong>0.12</strong> / <strong>3.78</strong></td>
      </tr>
  </tbody>
</table>
<p>*EscAIP-45M is a direct-force model. eSEN-6.5M outperforms MACE-OFF-L and EscAIP on all test splits. The smaller eSEN-3.2M has inference efficiency comparable to MACE-4.7M while achieving lower MAE.</p>
<hr>
<h2 id="why-these-design-choices-matter">Why These Design Choices Matter</h2>
<h3 id="bounded-energy-derivatives-and-the-verlet-integrator">Bounded Energy Derivatives and the Verlet Integrator</h3>
<p>The theoretical foundation for why smoothness matters comes from Theorem 5.1 of Hairer et al. (2003). For the Verlet integrator (the standard NVE integrator), the total energy drift satisfies:</p>
<p>$$
|E(\mathbf{r}_T, \mathbf{a}) - E(\mathbf{r}_0, \mathbf{a})| \leq C \Delta t^2 + C_N \Delta t^N T
$$</p>
<p>where $T$ is the total simulation time ($T \leq \Delta t^{-N}$), $N$ is the highest order for which the $N$th derivative of $E$ is continuously differentiable with bounded derivative, and $C$, $C_N$ are constants independent of $T$ and $\Delta t$. The first term is a time-independent fluctuation of $O(\Delta t^2)$; the second term governs long-term conservation. This means the PES must be continuously differentiable to high order, with bounded derivatives, for energy conservation in long-time simulations.</p>
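<p>The $O(\Delta t^2)$ fluctuation term can be checked numerically on a toy problem. The following is a minimal sketch, not taken from the paper: it integrates a 1D harmonic oscillator (a smooth PES, with unit mass and force constant chosen here purely for illustration) with velocity Verlet and compares the largest energy error at two timesteps. Halving $\Delta t$ should shrink that error by roughly a factor of four.</p>

```python
def energy(x, v, k=1.0, m=1.0):
    """Total energy of a 1D harmonic oscillator."""
    return 0.5 * m * v * v + 0.5 * k * x * x

def max_energy_error(dt, total_time=50.0, k=1.0, m=1.0):
    """Run velocity Verlet and return the largest |E(t) - E(0)| observed."""
    x, v = 1.0, 0.0
    e0 = energy(x, v, k, m)
    f = -k * x
    worst = 0.0
    for _ in range(int(round(total_time / dt))):
        v += 0.5 * dt * f / m      # first half kick
        x += dt * v                # drift
        f = -k * x                 # force at the new position
        v += 0.5 * dt * f / m      # second half kick
        worst = max(worst, abs(energy(x, v, k, m) - e0))
    return worst

err_coarse = max_energy_error(0.10)
err_fine = max_energy_error(0.05)
ratio = err_coarse / err_fine
print(f"dt=0.10: {err_coarse:.3e}  dt=0.05: {err_fine:.3e}  ratio={ratio:.2f}")
```

<p>Because the harmonic potential is infinitely differentiable, the long-time term plays no visible role in this toy and only the bounded $O(\Delta t^2)$ fluctuation remains. A PES with discontinuities would not enjoy this guarantee.</p>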
<h3 id="architectural-choices-that-break-conservation">Architectural Choices That Break Conservation</h3>
<p>The authors provide theoretical justification for why specific architectural choices break energy conservation:</p>
<ul>
<li><strong>Max Neighbor Limit (KNN)</strong>: Introduces discontinuity in the PES. If a neighbor at distance $r$ moves to $r + \epsilon$ and drops out of the top-$K$, the energy changes discontinuously.</li>
<li><strong>Grid Discretization</strong>: Projecting spherical harmonics to a spatial grid introduces discretization errors in energy gradients that break conservation. This can be mitigated with higher-resolution grids but not eliminated.</li>
<li><strong>Direct-Force Prediction</strong>: Imposes no mathematical constraint that forces must be the gradient of an energy scalar field. In other words, $\nabla \times \mathbf{F} \neq 0$ is permitted, violating the requirement for a conservative force field.</li>
</ul>
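<p>The max-neighbor discontinuity in the list above can be reproduced with a toy energy. This is an illustrative sketch under stated assumptions, not the eSEN architecture: each neighbor contributes a hypothetical term <code>weight * exp(-r)</code>, where the weight stands in for species-dependent features, and <code>energy_knn</code> keeps only the $K$ nearest neighbors. Moving one neighbor by $2\epsilon$ across another neighbor's distance changes the top-$K$ set and makes the truncated energy jump, while the untruncated sum changes only by $O(\epsilon)$.</p>

```python
import math

def pair_term(r, weight):
    """Hypothetical per-neighbor contribution: species weight times exp(-r)."""
    return weight * math.exp(-r)

def energy_knn(neighbors, k):
    """Sum contributions of only the k nearest neighbors (max-neighbor limit)."""
    nearest = sorted(neighbors, key=lambda n: n[0])[:k]
    return sum(pair_term(r, w) for r, w in nearest)

def energy_all(neighbors):
    """Sum contributions of every neighbor (no neighbor limit)."""
    return sum(pair_term(r, w) for r, w in neighbors)

def scene(r_moving):
    """Three neighbors as (distance, weight); only the first one moves."""
    return [(r_moving, 1.0), (1.0, 1.0), (2.0, 3.0)]

eps = 1e-6
before, after = scene(2.0 - eps), scene(2.0 + eps)
jump_knn = abs(energy_knn(after, k=2) - energy_knn(before, k=2))
jump_all = abs(energy_all(after) - energy_all(before))
print(f"K=2 limit: jump {jump_knn:.4f}; no limit: jump {jump_all:.2e}")
```

<p>The jump only appears because the two competing neighbors carry different weights. With identical contributions the swap would be invisible, which is why the effect is tied to feature-dependent messages.</p>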
<h3 id="displacement-sensitivity-in-phonon-calculations">Displacement Sensitivity in Phonon Calculations</h3>
<p>An important empirical finding concerns how displacement values affect phonon predictions. Conservative models (eSEN, MACE) show convergent phonon band structures as displacement decreases toward zero. In contrast, direct-force models (eqV2-S-DeNS) fail to converge, exhibiting missing acoustic branches and spurious imaginary frequencies at small displacements. While direct-force models achieve competitive thermodynamic property accuracy at large displacements (0.2 Å), this is deceptive: the underlying phonon band structures remain inaccurate, and the apparent accuracy comes from Boltzmann-weighted integrals smoothing over errors.</p>
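<p>The displacement sensitivity described above can be illustrated with a one-dimensional toy. This is a sketch under assumptions of my own, not the paper's phonon workflow: <code>rough_force</code> adds a small, rapidly varying error (hypothetical amplitude and wavelength) to a harmonic force, and the force constant is estimated by central finite differences at the two displacements discussed in the text, 0.01 and 0.2.</p>

```python
import math

K_TRUE = 1.0  # hypothetical force constant

def smooth_force(x):
    """Force from a smooth harmonic PES: F = -k x."""
    return -K_TRUE * x

def rough_force(x, amp=0.002, wavelength=0.005):
    """Same force plus a small, rapidly varying error (hypothetical roughness)."""
    return -K_TRUE * x + amp * math.sin(x / wavelength)

def fd_force_constant(force, delta):
    """Central finite-difference estimate of k = -dF/dx at x = 0."""
    return -(force(delta) - force(-delta)) / (2.0 * delta)

errors = {}
for delta in (0.01, 0.2):
    errors[delta] = {
        "smooth": abs(fd_force_constant(smooth_force, delta) - K_TRUE),
        "rough": abs(fd_force_constant(rough_force, delta) - K_TRUE),
    }
    print(f"delta={delta}: smooth err={errors[delta]['smooth']:.1e}, "
          f"rough err={errors[delta]['rough']:.1e}")
```

<p>The finite difference divides the force error by the displacement, so a fixed-amplitude roughness is amplified at small displacements and averaged out at large ones. The smooth model gives the same answer at both.</p>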
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Fu, X., Wood, B. M., Barroso-Luque, L., Levine, D. S., Gao, M., Dzamba, M., &amp; Zitnick, C. L. (2025). Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction. <em>Proceedings of the 42nd International Conference on Machine Learning (ICML)</em>, PMLR 267:17875–17893.</p>
<p><strong>Publication</strong>: ICML 2025 (Spotlight)</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{fu2025learning,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Fu, Xiang and Wood, Brandon M. and Barroso-Luque, Luis and Levine, Daniel S. and Gao, Meng and Dzamba, Misko and Zitnick, C. Lawrence}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 42nd International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{Proceedings of Machine Learning Research}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{267}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{17875--17893}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{PMLR}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://icml.cc/virtual/2025/poster/45302">ICML 2025 poster page</a></li>
<li><a href="https://openreview.net/forum?id=R0PBjxIbgm">OpenReview forum</a></li>
<li><a href="https://openreview.net/pdf?id=R0PBjxIbgm">PDF on OpenReview</a></li>
<li><a href="https://huggingface.co/facebook/OMAT24">OMAT24 model on Hugging Face</a></li>
<li><a href="https://github.com/facebookresearch/fairchem">Code on GitHub (fairchem)</a></li>
</ul>
]]></content:encoded></item><item><title>Dark Side of Forces: Non-Conservative ML Force Models</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/dark-side-of-forces/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/ml-potentials/dark-side-of-forces/</guid><description>Bigi et al. critique non-conservative force models in ML potentials, showing their simulation failures and proposing hybrid solutions.</description><content:encoded><![CDATA[<h2 id="contribution-systematic-assessment-of-non-conservative-ml-force-models">Contribution: Systematic Assessment of Non-Conservative ML Force Models</h2>
<p>This is a <strong>Systematization</strong> paper. It systematically catalogs the exact failure modes of existing non-conservative force approaches, quantifies them with a new diagnostic metric, and proposes a hybrid Multiple Time-Stepping solution combining the speed benefits of direct force prediction with the physical correctness of conservative models.</p>
<h2 id="motivation-the-speed-accuracy-trade-off-in-ml-force-fields">Motivation: The Speed-Accuracy Trade-off in ML Force Fields</h2>
<p>Many recent machine learning interatomic potential (MLIP) architectures predict forces directly ($F_\theta(r)$). This &ldquo;non-conservative&rdquo; approach avoids the computational overhead of automatic differentiation, yielding faster inference (typically 2-3x speedup) and faster training (up to 3x). However, it sacrifices energy conservation and rotational constraints, potentially destabilizing molecular dynamics simulations. The field lacks rigorous quantification of when this trade-off breaks down and how to mitigate the failures.</p>
<h2 id="novelty-jacobian-asymmetry-and-hybrid-architectures">Novelty: Jacobian Asymmetry and Hybrid Architectures</h2>
<p>Four key contributions:</p>
<ol>
<li>
<p><strong>Jacobian Asymmetry Metric ($\lambda$):</strong> A quantitative diagnostic for non-conservation. Since conservative forces derive from a scalar field, their Jacobian (the negative Hessian of the energy) must be symmetric. The normalized norm of the antisymmetric part quantifies the degree of violation:
$$ \lambda = \frac{|| \mathbf{J}_{\text{anti}} ||_F}{|| \mathbf{J} ||_F} $$
where $\mathbf{J}_{\text{anti}} = (\mathbf{J} - \mathbf{J}^\top)/2$. Measured values range from $\lambda \approx 0.004$ (PET-NC) to $\lambda \approx 0.032$ (SOAP-BPNN-NC), with ORB at 0.015 and EquiformerV2 at 0.017. Notably, the pairwise $\lambda_{ij}$ approaches 1 at large interatomic distances, meaning non-conservative artifacts disproportionately affect long-range and collective interactions.</p>
</li>
<li>
<p><strong>Systematic Failure Mode Catalog:</strong> First comprehensive demonstration that non-conservative models cause runaway heating in NVE ensembles (temperature drifts of ~7,000 billion K/s for PET-NC and ~10x larger for ORB) and equipartition violations in NVT ensembles where different atom types equilibrate to different temperatures, a physical impossibility.</p>
</li>
<li>
<p><strong>Theoretical Analysis of Force vs. Energy Training:</strong> Force-only training overemphasizes high-frequency vibrational modes because force labels carry per-atom gradients that are dominated by stiff, short-range interactions. Energy labels provide a more balanced representation across the frequency spectrum. Additionally, conservative models benefit from backpropagation extending the effective receptive field to approximately 2x the interaction cutoff, while direct-force models are limited to the nominal cutoff radius.</p>
</li>
<li>
<p><strong>Hybrid Training and Inference Protocol:</strong> A practical workflow that combines fast direct-force prediction with conservative corrections:</p>
<ul>
<li><strong>Training:</strong> Pre-train on direct forces, then fine-tune on energy gradients (2-4x faster than training conservative models from scratch)</li>
<li><strong>Inference:</strong> Multiple Time-Stepping (MTS) where fast non-conservative forces are periodically corrected by slower conservative forces</li>
</ul>
</li>
</ol>
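<p>The Jacobian asymmetry metric from the first contribution can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the Jacobian is estimated by central finite differences, and the two force fields (<code>conservative</code>, <code>non_conservative</code>) are hypothetical 2D toys in which a small rotational term of strength <code>c</code> is added to a gradient field.</p>

```python
import math

def jacobian(force, point, h=1e-5):
    """Central-difference Jacobian J[i][j] = dF_i / dx_j of a force field."""
    n = len(point)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus = list(point)
        plus[j] += h
        minus = list(point)
        minus[j] -= h
        f_plus, f_minus = force(plus), force(minus)
        for i in range(n):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return jac

def frobenius(mat):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(x * x for row in mat for x in row))

def asymmetry(force, point):
    """lambda = ||J_anti||_F / ||J||_F with J_anti = (J - J^T) / 2."""
    jac = jacobian(force, point)
    n = len(jac)
    anti = [[0.5 * (jac[i][j] - jac[j][i]) for j in range(n)] for i in range(n)]
    return frobenius(anti) / frobenius(jac)

def conservative(p):
    """F = -grad E for the toy energy E = 0.5*x^2 + y^2 + 0.5*x*y."""
    x, y = p
    return [-x - 0.5 * y, -2.0 * y - 0.5 * x]

def non_conservative(p, c=0.05):
    """Conservative force plus a small rotational (curl-carrying) term."""
    fx, fy = conservative(p)
    x, y = p
    return [fx - c * y, fy + c * x]

point = [0.3, -0.7]
lam_c = asymmetry(conservative, point)
lam_nc = asymmetry(non_conservative, point)
print(f"lambda (conservative) = {lam_c:.2e}, lambda (non-conservative) = {lam_nc:.4f}")
```

<p>For a real model the Jacobian would more plausibly come from automatic differentiation of the predicted forces. Finite differences are used here only to keep the sketch dependency-free.</p>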
<h2 id="methodology-systematic-failure-mode-analysis">Methodology: Systematic Failure Mode Analysis</h2>
<p>The evaluation systematically tests multiple state-of-the-art models across diverse simulation scenarios:</p>
<p><strong>Models tested:</strong></p>
<ul>
<li><strong>PET-C/PET-NC</strong> (Point Edge Transformer, conservative and non-conservative variants)</li>
<li><strong>PET-M</strong> (hybrid variant jointly predicting both conservative and non-conservative forces)</li>
<li><strong>ORB-v2</strong> (non-conservative, trained on Alexandria/MPtrj)</li>
<li><strong>EquiformerV2</strong> (non-conservative equivariant Transformer)</li>
<li><strong>MACE-MP-0</strong> (conservative message-passing)</li>
<li><strong>SevenNet</strong> (conservative message-passing)</li>
<li><strong>SOAP-BPNN-C/SOAP-BPNN-NC</strong> (descriptor-based baseline, both conservative and non-conservative variants)</li>
</ul>
<p><strong>Test scenarios:</strong></p>
<ol>
<li><strong>NVE stability tests</strong> on bulk liquid water, graphene, amorphous carbon, and FCC aluminum</li>
<li><strong>Thermostat artifact analysis</strong> with Langevin and GLE thermostats</li>
<li><strong>Geometry optimization</strong> on water snapshots and <a href="/notes/chemistry/datasets/qm9/">QM9</a> molecules using FIRE and L-BFGS</li>
<li><strong>MTS validation</strong> on OC20 catalysis dataset</li>
<li><strong>Species-resolved temperature measurements</strong> for equipartition testing</li>
</ol>
<p><strong>Key metrics:</strong></p>
<ul>
<li>Jacobian asymmetry ($\lambda$)</li>
<li>Kinetic temperature drift in NVE</li>
<li>Velocity-velocity correlations</li>
<li>Radial distribution functions</li>
<li>Species-resolved temperatures</li>
<li>Inference speed benchmarks</li>
</ul>
<h2 id="results-simulation-instability-and-hybrid-solutions">Results: Simulation Instability and Hybrid Solutions</h2>
<p>Purely non-conservative models are <strong>unsuitable for production simulations</strong> due to uncontrollable unphysical artifacts that no thermostat can correct. Key findings:</p>
<p><strong>Performance failures:</strong></p>
<ul>
<li>Non-conservative models exhibited catastrophic temperature drift in NVE simulations: ~7,000 billion K/s for PET-NC and ~70,000 billion K/s for ORB, with EquiformerV2 comparable to PET-NC</li>
<li>Strong Langevin thermostats ($\tau=10$ fs) damped diffusion by ~5x, negating the speed benefits of non-conservative models</li>
<li>Advanced GLE thermostats also failed to control non-conservative drift (ORB reached 1181 K vs. 300 K target)</li>
<li>Equipartition violations: under stochastic velocity rescaling, O and H atoms equilibrated at different temperatures. For ORB, H atoms reached 336 K and O atoms 230 K against a 300 K target. For PET-NC, deviations were smaller but still significant (H at 296 K, O at 310 K).</li>
<li>Geometry optimization was more fragile with non-conservative forces: inaccurate NC models (SOAP-BPNN-NC) failed catastrophically, while more accurate ones (PET-NC) could converge with FIRE but showed large force fluctuations with L-BFGS. Non-conservative models consistently had lower success rates across water and QM9 benchmarks.</li>
</ul>
<p><strong>Hybrid solution success:</strong></p>
<ul>
<li>MTS with non-conservative forces corrected every 8 steps ($M=8$) achieved conservative stability with only ~20% overhead compared to a purely non-conservative trajectory. Results were essentially indistinguishable from fully conservative simulations. Higher stride values ($M=16$) became unstable due to resonances between fast degrees of freedom and integration errors.</li>
<li>Conservative fine-tuning achieved the accuracy of from-scratch training in about 1/3 the total training time (2-4x resource reduction)</li>
<li>Validated on OC20 catalysis benchmark</li>
</ul>
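<p>The multiple time-stepping idea can be sketched on a toy system. This is a simplified r-RESPA-style splitting under assumptions of my own, and the paper's exact scheme and i-PI setup may differ: the cheap force (<code>direct_force</code>, hypothetical) is a 2D harmonic force with a small rotational error, it drives every inner step, and the difference to the conservative force is applied as a kick once per <code>stride</code> steps.</p>

```python
def conservative_force(pos):
    """F = -grad E for the toy energy E = 0.5 * (x^2 + y^2), unit mass."""
    x, y = pos
    return (-x, -y)

def direct_force(pos, c=0.05):
    """Toy direct force: the conservative force plus a small rotational error."""
    x, y = pos
    return (-x - c * y, -y + c * x)

def kick(vel, force, dt):
    """Velocity update over dt for unit mass."""
    return (vel[0] + dt * force[0], vel[1] + dt * force[1])

def verlet_step(pos, vel, force_fn, dt):
    """One velocity Verlet step with the given force function."""
    vel = kick(vel, force_fn(pos), 0.5 * dt)
    pos = (pos[0] + dt * vel[0], pos[1] + dt * vel[1])
    vel = kick(vel, force_fn(pos), 0.5 * dt)
    return pos, vel

def correction(pos):
    """Slow correction: conservative force minus direct force."""
    fc, fd = conservative_force(pos), direct_force(pos)
    return (fc[0] - fd[0], fc[1] - fd[1])

def mts_step(pos, vel, dt, stride):
    """Outer step: half correction kick, `stride` cheap inner steps, half kick."""
    vel = kick(vel, correction(pos), 0.5 * stride * dt)
    for _ in range(stride):
        pos, vel = verlet_step(pos, vel, direct_force, dt)
    vel = kick(vel, correction(pos), 0.5 * stride * dt)
    return pos, vel

def total_energy(pos, vel):
    return 0.5 * (vel[0] ** 2 + vel[1] ** 2) + 0.5 * (pos[0] ** 2 + pos[1] ** 2)

dt, stride, n_outer = 0.05, 8, 250
start = ((1.0, 0.0), (0.0, 0.0))
e0 = total_energy(*start)

pos, vel = start
for _ in range(n_outer * stride):          # direct force only
    pos, vel = verlet_step(pos, vel, direct_force, dt)
e_direct = total_energy(pos, vel)

pos, vel = start
for _ in range(n_outer):                   # direct force + periodic correction
    pos, vel = mts_step(pos, vel, dt, stride)
e_mts = total_energy(pos, vel)

print(f"E0={e0:.3f}  direct-only={e_direct:.3f}  MTS (M={stride})={e_mts:.3f}")
```

<p>In this toy the direct-only trajectory gains energy steadily, while the corrected trajectory stays close to its initial energy even though the conservative force is evaluated only once per eight inner steps.</p>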
<p><strong>Scaling caveat:</strong> The authors note that as training datasets grow and models become more expressive, non-conservative artifacts should diminish because accurate models naturally exhibit less non-conservative behavior. However, they argue the best path forward is hybrid approaches rather than waiting for scale to solve the problem.</p>
<p><strong>Recommendation:</strong> The optimal production path is hybrid architectures using direct forces for acceleration (via MTS and pre-training) while anchoring models in conservative energy surfaces. This captures computational benefits without sacrificing physical reliability.</p>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p><strong>Primary training/evaluation:</strong></p>
<ul>
<li><strong>Bulk Liquid Water</strong> (Cheng et al., 2019): revPBE0-D3 calculations with over 250,000 force/energy targets, chosen for rigorous thermodynamic testing</li>
</ul>
<p><strong>Generalization tests:</strong></p>
<ul>
<li>Graphene, amorphous carbon, FCC aluminum (tested with general-purpose foundation models)</li>
</ul>
<p><strong>Benchmarks:</strong></p>
<ul>
<li><strong>QM9</strong>: Geometry optimization tests</li>
<li><strong>OC20</strong> (Open Catalyst): Oxygen on alloy surfaces for MTS validation</li>
</ul>
<p>All datasets publicly available through cited sources.</p>
<h3 id="models">Models</h3>
<p><strong>Point Edge Transformer (PET)</strong> variants:</p>
<ul>
<li><strong>PET-C (Conservative)</strong>: Forces via energy backpropagation</li>
<li><strong>PET-NC (Non-Conservative)</strong>: Direct force prediction head, slightly higher parameter count</li>
<li><strong>PET-M (Hybrid)</strong>: Jointly predicts both conservative and non-conservative forces, accuracy within ~10% of the best single-task models</li>
</ul>
<p><strong>Baseline comparisons:</strong></p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Type</th>
          <th>Training Data</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>ORB-v2</td>
          <td>Non-conservative</td>
          <td>Alexandria/MPtrj</td>
          <td>Rotationally unconstrained</td>
      </tr>
      <tr>
          <td>EquiformerV2</td>
          <td>Non-conservative</td>
          <td>Alexandria/MPtrj</td>
          <td>Equivariant Transformer</td>
      </tr>
      <tr>
          <td>MACE-MP-0</td>
          <td>Conservative</td>
          <td>MPtrj</td>
          <td>Equivariant message-passing</td>
      </tr>
      <tr>
          <td>SevenNet</td>
          <td>Conservative</td>
          <td>MPtrj</td>
          <td>Equivariant message-passing</td>
      </tr>
      <tr>
          <td>SOAP-BPNN-C</td>
          <td>Conservative</td>
          <td>Bulk water</td>
          <td>Descriptor-based baseline</td>
      </tr>
      <tr>
          <td>SOAP-BPNN-NC</td>
          <td>Non-conservative</td>
          <td>Bulk water</td>
          <td>Descriptor-based baseline</td>
      </tr>
  </tbody>
</table>
<p><strong>Training details:</strong></p>
<ul>
<li><strong>Loss functions</strong>: PET-C uses joint Energy + Force $L^2$ loss; PET-NC uses Force-only $L^2$ loss</li>
<li><strong>Fine-tuning protocol</strong>: PET-NC converted to conservative via energy head fine-tuning</li>
<li><strong>MTS configuration</strong>: Non-conservative forces with conservative corrections every 8 steps ($M=8$)</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<p><strong>Metrics &amp; Software:</strong>
Molecular dynamics evaluations were performed using <strong>i-PI</strong>, while geometry optimizations used <strong>ASE (Atomic Simulation Environment)</strong>. Code is available as an archived Zenodo snapshot; the authors did not link a live public GitHub repository.</p>
<ol>
<li><strong>Jacobian asymmetry</strong> ($\lambda$): Quantifies non-conservation via antisymmetric component</li>
<li><strong>Temperature drift</strong>: NVE ensemble stability</li>
<li><strong>Velocity-velocity correlation</strong> ($\hat{c}_{vv}(\omega)$): Thermostat artifact detection</li>
<li><strong>Radial distribution functions</strong> ($g(r)$): Structural accuracy</li>
<li><strong>Species-resolved temperature</strong>: Equipartition testing</li>
<li><strong>Inference speed</strong>: Wall-clock time per MD step</li>
</ol>
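<p>Of the metrics above, the species-resolved temperature is the simplest to compute: $T_s = \sum_{i \in s} m_i |\mathbf{v}_i|^2 / (3 N_s k_B)$. The sketch below is illustrative only. The velocities are synthetic, drawn from Maxwell-Boltzmann distributions at the two temperatures reported for ORB (336 K for H, 230 K for O), and the helper names are hypothetical. It also ignores corrections for constrained or removed degrees of freedom such as center-of-mass motion.</p>

```python
import math
import random

K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg

def sample_velocities(n_atoms, mass, temperature, rng):
    """Draw synthetic Maxwell-Boltzmann velocities (m/s) for one species."""
    sigma = math.sqrt(K_B * temperature / mass)
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n_atoms)]

def species_temperatures(species, masses, velocities):
    """Kinetic temperature per species: sum(m v^2) / (3 N_s k_B)."""
    sums, counts = {}, {}
    for s, m, (vx, vy, vz) in zip(species, masses, velocities):
        sums[s] = sums.get(s, 0.0) + m * (vx * vx + vy * vy + vz * vz)
        counts[s] = counts.get(s, 0) + 1
    return {s: sums[s] / (3.0 * counts[s] * K_B) for s in sums}

rng = random.Random(0)
n_h, n_o = 20000, 10000
m_h, m_o = 1.008 * AMU, 15.999 * AMU

velocities = (sample_velocities(n_h, m_h, 336.0, rng)
              + sample_velocities(n_o, m_o, 230.0, rng))
species = ["H"] * n_h + ["O"] * n_o
masses = [m_h] * n_h + [m_o] * n_o

temps = species_temperatures(species, masses, velocities)
print({s: round(t, 1) for s, t in temps.items()})
```

<p>In an equilibrated simulation with a conservative model, both entries should agree with the thermostat target within statistical noise. A persistent gap between species is the equipartition violation the paper reports.</p>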
<p><strong>Key results:</strong></p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Speed (ms/step)</th>
          <th>NVE Stability</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>PET-NC</td>
          <td>8.58</td>
          <td>Failed</td>
          <td>~7,000 billion K/s drift</td>
      </tr>
      <tr>
          <td>PET-C</td>
          <td>19.4</td>
          <td>Stable</td>
          <td>2.3x slower than PET-NC</td>
      </tr>
      <tr>
          <td>SevenNet</td>
          <td>52.8</td>
          <td>Stable</td>
          <td>Conservative baseline</td>
      </tr>
      <tr>
          <td><strong>PET Hybrid (MTS)</strong></td>
          <td><strong>~10.3</strong></td>
          <td><strong>Stable</strong></td>
          <td><strong>~20% overhead vs. pure NC</strong></td>
      </tr>
  </tbody>
</table>
<p><strong>Thermostat artifacts:</strong></p>
<ul>
<li>Langevin ($\tau=10$ fs) damped diffusion by ~5x (weaker coupling at $\tau=100$ fs reduced diffusion by ~1.5x)</li>
<li>GLE thermostats also failed to control non-conservative drift</li>
<li>Equipartition violations under SVR: ORB showed H at 336 K and O at 230 K (target 300 K); PET-NC showed smaller but significant species-resolved deviations</li>
</ul>
<p><strong>Optimization failures:</strong></p>
<ul>
<li>Non-conservative models showed lower geometry optimization success rates across water and QM9 benchmarks, with inaccurate NC models failing catastrophically</li>
</ul>
<h3 id="hardware">Hardware</h3>
<p><strong>Compute resources:</strong></p>
<ul>
<li><strong>Training</strong>: From-scratch baseline models were trained on 4x NVIDIA H100 GPUs for about two days.</li>
<li><strong>Fine-Tuning</strong>: Conservative fine-tuning used a single NVIDIA H100 GPU for one day.</li>
<li>The hybrid fine-tuning approach achieved a 2-4x reduction in computational resources compared to training conservative models from scratch.</li>
</ul>
<p><strong>Reproduction resources:</strong></p>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://zenodo.org/records/14778891">Zenodo repository</a></td>
          <td>Code/Data</td>
          <td>Unknown</td>
          <td>Code and data to reproduce all results</td>
      </tr>
      <tr>
          <td><a href="https://atomistic-cookbook.org/examples/pet-mad-nc/pet-mad-nc.html">MTS inference tutorial</a></td>
          <td>Other</td>
          <td>Unknown</td>
          <td>Multiple time-stepping dynamics tutorial</td>
      </tr>
      <tr>
          <td><a href="https://atomistic-cookbook.org/examples/pet-finetuning/pet-ft-nc.html">Conservative fine-tuning tutorial</a></td>
          <td>Other</td>
          <td>Unknown</td>
          <td>Fine-tuning workflow tutorial</td>
      </tr>
  </tbody>
</table>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Bigi, F., Langer, M. F., &amp; Ceriotti, M. (2025). The dark side of the forces: assessing non-conservative force models for atomistic machine learning. <em>Proceedings of the 42nd International Conference on Machine Learning</em>, PMLR 267.</p>
<p><strong>Publication</strong>: ICML 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{bigi2025dark,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{The dark side of the forces: assessing non-conservative force models for atomistic machine learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Bigi, Filippo and Langer, Marcel F and Ceriotti, Michele}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{Proceedings of the 42nd International Conference on Machine Learning}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">series</span>=<span style="color:#e6db74">{Proceedings of Machine Learning Research}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{267}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">address</span>=<span style="color:#e6db74">{Vancouver, Canada}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2025}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://icml.cc/virtual/2025/poster/45458">ICML 2025 poster page</a></li>
<li><a href="https://openreview.net/pdf?id=OEl3L8osas">PDF on OpenReview</a></li>
<li><a href="https://zenodo.org/records/14778891">Zenodo repository</a></li>
<li><a href="https://atomistic-cookbook.org/examples/pet-mad-nc/pet-mad-nc.html">MTS Inference Tutorial</a></li>
<li><a href="https://atomistic-cookbook.org/examples/pet-finetuning/pet-ft-nc.html">Conservative Fine-Tuning Tutorial</a></li>
</ul>
]]></content:encoded></item><item><title>Embedded-Atom Method: Impurities and Defects in Metals</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method/</link><pubDate>Fri, 22 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method/</guid><description>Daw and Baskes's foundational 1984 paper introducing the Embedded-Atom Method (EAM), a many-body potential for metal simulations.</description><content:encoded><![CDATA[<h2 id="contribution-adaptive-many-body-potentials">Contribution: Adaptive Many-Body Potentials</h2>
<p>This is a foundational <strong>method paper</strong> that introduces a new class of semi-empirical, many-body interatomic potential: the <strong>Embedded-Atom Method (EAM)</strong>. It is designed for large-scale atomistic simulations of metallic systems, bridging the gap between computationally cheap (but physically limited) pair potentials and accurate (but expensive) quantum mechanical methods. The EAM achieves pair-potential speed while incorporating many-body physics inspired by density functional theory.</p>
<h2 id="motivation-the-geometric-limits-of-pair-potentials">Motivation: The Geometric Limits of Pair Potentials</h2>
<p>The authors sought to overcome the limitations of <strong>pair potentials</strong> (the dominant method of the time), which failed in three key areas:</p>
<ul>
<li><strong>Elastic Anisotropy:</strong> Pair potentials enforce the Cauchy relation ($C_{12} = C_{44}$), which is violated by most transition metals.</li>
<li><strong>Volume Ambiguity:</strong> Pair potentials require a volume-dependent energy term, making them impossible to use accurately on surfaces or cracks where local volume is undefined.</li>
<li><strong>Chemical Incompatibility:</strong> Pair potentials cannot model chemically active impurities like Hydrogen.</li>
</ul>
<p>First-principles quantum mechanical methods (e.g., band theory) are limited by basis-set size and periodicity requirements, making them impractical for the large systems (thousands of atoms) needed to study defects, surfaces, and mechanical properties.</p>
<p>The goal was to create a new model that bridges this gap in accuracy and computational cost.</p>
<h2 id="core-innovation-the-embedding-energy-function">Core Innovation: The Embedding Energy Function</h2>
<p>The EAM postulates that the energy of an atom is determined by the local electron density of its neighbors. The total energy is:</p>
<p>$$E_{tot} = \sum_{i} F_i(\rho_{h,i}) + \frac{1}{2}\sum_{i \neq j} \phi_{ij}(R_{ij})$$</p>
<ul>
<li><strong>$F_i(\rho_{h,i})$ (Embedding Energy):</strong> The energy required to embed atom $i$ into the background electron density $\rho$ provided by its neighbors. This term is non-linear and captures many-body effects.</li>
<li><strong>$\phi_{ij}$ (Pair Potential):</strong> A short-range electrostatic repulsion between cores.</li>
<li><strong>$\rho_{h,i}$ (Host Density):</strong> Approximated as a linear superposition of atomic densities: $\rho_{h,i} = \sum_{j \neq i} \rho^a_j(R_{ij})$.</li>
</ul>
<p>The key innovations are:</p>
<ol>
<li><strong>The Embedding Energy</strong>: Each atom $i$ contributes an energy $F_i$ which is a non-linear function of the local electron density $\rho_{h,i}$ it is embedded in. This density is approximated as a simple linear superposition of the atomic electron densities of all its neighbors. This term captures the crucial many-body effects of metallic bonding.</li>
<li><strong>A Redefined Pair Potential</strong>: A short-range, two-body potential $\phi_{ij}$ is retained, but it primarily models the electrostatic core-core repulsion.</li>
<li><strong>Elimination of the &ldquo;Volume&rdquo; Problem</strong>: Because the embedding energy depends on the local electron density (a quantity that is always well-defined, even at a surface or a crack tip), the method circumvents the ambiguities of volume-dependent pair potentials.</li>
<li><strong>Intrinsic Many-Body Nature</strong>: The non-linearity of the embedding function $F(\rho)$ naturally accounts for why chemically active impurities (like hydrogen) cannot be described by pair potentials and correctly breaks the Cauchy relation for elastic constants.</li>
</ol>
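<p>The energy expression above can be sketched in a few lines of NumPy. The functional forms below (<code>atomic_density</code>, <code>embedding_energy</code>, <code>pair_potential</code>) are toy stand-ins chosen for illustration, not the fitted Ni or Pd functions from the paper. The sketch compares a dimer with an equilateral trimer to show that a non-linear $F(\rho)$ makes the energy non-additive over bonds.</p>

```python
import numpy as np

# Toy functional forms: assumptions for illustration, not the paper's fitted splines.
def atomic_density(r):
    """Hypothetical atomic electron density rho^a(r)."""
    return np.exp(-2.0 * r)

def embedding_energy(rho):
    """Hypothetical non-linear embedding function F(rho)."""
    return -np.sqrt(rho)

def pair_potential(r):
    """Hypothetical short-range core-core repulsion phi(r)."""
    return 0.5 * np.exp(-4.0 * r) / r

def eam_energy(positions):
    """E_tot = sum_i F(rho_h,i) + 1/2 sum_{i != j} phi(R_ij)."""
    diff = positions[:, None, :] - positions[None, :, :]
    off_diag = ~np.eye(len(positions), dtype=bool)
    # Put a dummy distance of 1 on the diagonal so phi(0) is never evaluated.
    dist = np.where(off_diag, np.linalg.norm(diff, axis=-1), 1.0)
    # Host density at atom i: linear superposition over all neighbors j != i.
    rho_host = (atomic_density(dist) * off_diag).sum(axis=1)
    pair_term = 0.5 * (pair_potential(dist) * off_diag).sum()
    return embedding_energy(rho_host).sum() + pair_term

dimer = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
trimer = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.5, np.sqrt(3.0) / 2.0, 0.0]])

e_dimer = eam_energy(dimer)
e_trimer = eam_energy(trimer)
# A purely pairwise-additive model would give 3 * e_dimer for three unit-length bonds.
print(e_dimer, e_trimer, 3.0 * e_dimer)
```

<p>With these toy functions the trimer is less strongly bound than three independent dimer bonds would be, because each atom sits in twice the host density while $F$ grows sublinearly. A pair potential alone cannot produce this coordination dependence.</p>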
<h2 id="experimental-design-robust-parameter-validation">Experimental Design: Robust Parameter Validation</h2>
<p>The authors validated EAM through a rigorous split between parameterization data and prediction tasks:</p>
<p><strong>Fitting Data (Bulk Properties Only):</strong></p>
<p>The model parameters were fitted exclusively to these experimental values for Ni and Pd:</p>
<ul>
<li>Lattice constant ($a_0$)</li>
<li>Elastic constants ($C_{11}, C_{12}, C_{44}$)</li>
<li>Sublimation energy ($E_s$)</li>
<li>Vacancy-formation energy ($E^F_{1V}$)</li>
<li>Hydrogen heat of solution (for fitting H parameters)</li>
</ul>
<p><strong>Validation Tests (No Further Fitting):</strong></p>
<p>The model was then evaluated on its ability to predict these properties without any additional parameter adjustments:</p>
<ul>
<li><strong>Surface Relaxations:</strong> Ni(110) surface contraction</li>
<li><strong>Surface Energy:</strong> Ni(100) surface energy</li>
<li><strong>Hydrogen Migration:</strong> H migration energy in Pd</li>
<li><strong>Fracture Mechanics:</strong> Hydrogen embrittlement in Ni slabs</li>
</ul>
<h2 id="results-extending-predictive-power-to-surfaces-and-defects">Results: Extending Predictive Power to Surfaces and Defects</h2>
<ol>
<li><strong>Many-Body Physics:</strong> The embedding function $F(\rho)$ successfully captures the volume-dependence of metallic cohesion, fixing the &ldquo;Cauchy discrepancy&rdquo; inherent in pair potentials.</li>
<li><strong>Surface Properties:</strong> A single set of functions, fitted only to bulk data, reproduces surface relaxations to within 0.1 Å of experiment on three Ni faces: (100), (110), and (111). The Ni(100) surface energy (1550 erg/cm²) compares well with the measured crystal-vapor average (1725 erg/cm²).</li>
<li><strong>Hydrogen in Bulk:</strong> The method predicts H migration energy in Pd as 0.26 eV, matching experiment exactly. Hydride lattice expansions are also well reproduced: 4.5% for NiH (experiment: 5%) and 4% for PdH (experiment: 3.5% for PdH$_{0.6}$).</li>
<li><strong>Hydrogen on Surfaces:</strong> Calculated adsorption sites on all three Ni and Pd faces agree with experimentally determined sites. Adsorption energies on Ni surfaces are systematically about 0.25 eV too low, while on Pd surfaces the error is much smaller (about 0.05 eV too high on average).</li>
<li><strong>Fracture Mechanics:</strong> Static fracture calculations on Ni slabs demonstrate brittle fracture behavior and show that hydrogen lowers the fracture stress, providing a qualitative model of hydrogen embrittlement.</li>
</ol>
<h2 id="limitations">Limitations</h2>
<p>The authors acknowledge several limitations:</p>
<ul>
<li>The functions $F$ and $\phi$ are not uniquely determined by the empirical fitting procedure. The short-range pair potential (restricted to first neighbors in fcc metals) may not be the best choice for all crystal structures.</li>
<li>The choice of hydrogen embedding function (Puska et al. vs. Norskov&rsquo;s corrected function) remains undecided and may affect hydrogen binding energies.</li>
<li>The fracture calculations are static, and dynamical effects and plasticity play important roles in real fracture that are not captured.</li>
<li>The method has only been demonstrated for fcc metals (Ni and Pd). Extension to bcc metals and other crystal structures requires further investigation.</li>
</ul>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<p>To replicate the method, three specific algorithmic definitions are needed:</p>
<ol>
<li>
<p><strong>Atomic Density Construction</strong>: The electron density $\rho^a(r)$ is a weighted sum of Hartree-Fock $s$ and $d$ orbital densities (from Clementi &amp; Roetti tables), controlled by a parameter $N_s$ (the number of s-like electrons):
$$\rho^a(r) = N_s\rho_s^a(r) + (N-N_s)\rho_d^a(r)$$
For Ni, $N_s = 0.85$; for Pd, $N_s = 0.65$ (fitted to H solution heat).</p>
</li>
<li>
<p><strong>Pair Potential Form</strong>: The short-range pair interaction derives from an effective charge function $Z(r)$ to handle core repulsion:
$$\phi_{ij}(r) = \frac{Z_i(r)Z_j(r)}{r}$$
Splines for $Z(r)$ are provided in Table II.</p>
</li>
<li>
<p><strong>Analytic Forces</strong>: Because the embedding energy depends on the neighbor density, the force on atom $k$ is many-body. With primes denoting derivatives with respect to each function's argument and $\hat{r}_{jk}$ the unit vector along the $j$-$k$ separation:
$$\vec{f}_{k} = -\sum_{j(\neq k)} (F'_{k} \rho'_{j} + F'_{j} \rho'_{k} + \phi'_{jk}) \hat{r}_{jk}$$</p>
</li>
</ol>
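<p>As a hedged check of the force expression, the sketch below evaluates it for a single-species cluster and compares the result against central finite differences of the energy. All functional forms are toy assumptions with hand-written derivatives, not the paper's splines. The unit vector is taken to point from atom $j$ to atom $k$, which is the convention that makes the formula equal to $-\partial E / \partial \vec{r}_k$.</p>

```python
import numpy as np

# Toy functions and their derivatives (assumptions, not the paper's splines).
def rho_a(r):
    return np.exp(-2.0 * r)

def d_rho_a(r):
    return -2.0 * np.exp(-2.0 * r)

def F(rho):
    return -np.sqrt(rho)

def dF(rho):
    return -0.5 / np.sqrt(rho)

def phi(r):
    return 0.5 * np.exp(-4.0 * r) / r

def d_phi(r):
    return -0.5 * np.exp(-4.0 * r) * (4.0 / r + 1.0 / r**2)

def pairwise(positions):
    diff = positions[:, None, :] - positions[None, :, :]   # r_k - r_j
    mask = ~np.eye(len(positions), dtype=bool)
    dist = np.where(mask, np.linalg.norm(diff, axis=-1), 1.0)  # dummy 1 on diagonal
    return diff, dist, mask

def energy(positions):
    _, dist, mask = pairwise(positions)
    rho = (rho_a(dist) * mask).sum(axis=1)
    return F(rho).sum() + 0.5 * (phi(dist) * mask).sum()

def forces(positions):
    diff, dist, mask = pairwise(positions)
    rho = (rho_a(dist) * mask).sum(axis=1)
    dF_rho = dF(rho)
    # Pair prefactor for (k, j): F'_k rho'_j + F'_j rho'_k + phi'_jk
    pref = ((dF_rho[:, None] + dF_rho[None, :]) * d_rho_a(dist) + d_phi(dist)) * mask
    unit = diff / dist[..., None]                            # zero on the diagonal
    return -(pref[..., None] * unit).sum(axis=1)

def numerical_forces(positions, h=1e-6):
    out = np.zeros_like(positions)
    for k in range(positions.shape[0]):
        for axis in range(positions.shape[1]):
            plus = positions.copy()
            minus = positions.copy()
            plus[k, axis] += h
            minus[k, axis] -= h
            out[k, axis] = -(energy(plus) - energy(minus)) / (2.0 * h)
    return out

cluster = np.array([[0.0, 0.0, 0.0], [1.0, 0.1, 0.0], [0.2, 1.1, 0.0],
                    [0.1, 0.3, 0.9], [1.0, 1.0, 1.0]])
f_analytic = forces(cluster)
f_numeric = numerical_forces(cluster)
print(np.abs(f_analytic - f_numeric).max())
```

<p>Because the pair prefactor is symmetric in $j$ and $k$ while the unit vector is antisymmetric, the forces also sum to zero over the cluster, which is a quick sanity check for any implementation.</p>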
<h3 id="models">Models</h3>
<p>The functions $F(\rho)$ and $\phi(r)$ are modeled using <strong>cubic splines</strong>, with parameters fitted to reproduce bulk experimental constants. The embedding function $F(\rho)$ is constrained to have a single minimum and to be linear at high densities, matching the qualitative form of the first-principles calculations by Puska et al. Energy minimization uses the <strong>conjugate-gradient</strong> method. The paper explicitly lists spline knots, coefficients, and cutoffs in Tables II and IV, making the method fully reproducible.</p>

<figure class="post-figure center ">
    <img src="/img/notes/chemistry/eam-embedding-effective-charge.webp"
         alt="Reproduction of Figures 1 and 2 from Daw &amp; Baskes (1984) showing the embedding energy and effective charge functions for Ni and Pd"
         title="Reproduction of Figures 1 and 2 from Daw &amp; Baskes (1984) showing the embedding energy and effective charge functions for Ni and Pd"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption"><strong>Left:</strong> Dimensionless embedding energy ($E/E_s$) vs. normalized electron density ($\rho/\bar{\rho}$). The minimum near $\rho/\bar{\rho} \approx 1.0$ drives metallic cohesion. <strong>Right:</strong> Normalized effective charge ($Z/Z_0$) vs. normalized distance ($R/a_0$). The charge drops to zero near $R/a_0 = 0.85$, ensuring short-range interactions. Reproduced from Table II spline knots.</figcaption>
    
</figure>

<h3 id="evaluation">Evaluation</h3>
<p><strong>Fitting Data (Used for Parameterization):</strong></p>
<p>Bulk experimental properties for Ni and Pd only:</p>
<ul>
<li>Lattice constant ($a_0$)</li>
<li>Elastic constants ($C_{11}, C_{12}, C_{44}$)</li>
<li>Sublimation energy ($E_s$)</li>
<li>Vacancy-formation energy ($E^F_{1V}$)</li>
<li>Hydrogen heat of solution (for fitting H parameters)</li>
</ul>
<p><strong>Validation Results (Predictions Without Further Fitting):</strong></p>
<table>
  <thead>
      <tr>
          <th>Property</th>
          <th>Predicted</th>
          <th>Experimental</th>
          <th>Agreement</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Ni(110) surface contraction</td>
          <td>-0.11 Å</td>
          <td>-0.06 to -0.10 Å</td>
          <td>Within 0.1 Å</td>
      </tr>
      <tr>
          <td>Ni(100) surface energy</td>
          <td>1550 erg/cm²</td>
          <td>1725 erg/cm² (avg.)</td>
          <td>Close</td>
      </tr>
      <tr>
          <td>H migration in Pd</td>
          <td>0.26 eV</td>
          <td>0.26 eV</td>
          <td>Exact</td>
      </tr>
      <tr>
          <td>NiH lattice expansion</td>
          <td>4.5%</td>
          <td>5%</td>
          <td>Close</td>
      </tr>
      <tr>
          <td>PdH lattice expansion</td>
          <td>4%</td>
          <td>3.5% (PdH$_{0.6}$)</td>
          <td>Close</td>
      </tr>
      <tr>
          <td>H adsorption sites (Ni, Pd)</td>
          <td>Correct on all faces</td>
          <td>Matches experiment</td>
          <td>Exact</td>
      </tr>
      <tr>
          <td>H embrittlement in Ni</td>
          <td>Qualitative model</td>
          <td>-</td>
          <td>Qualitative</td>
      </tr>
  </tbody>
</table>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Daw, M. S., &amp; Baskes, M. I. (1984). Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals. <em>Physical Review B</em>, 29(12), 6443-6453. <a href="https://doi.org/10.1103/PhysRevB.29.6443">https://doi.org/10.1103/PhysRevB.29.6443</a></p>
<p><strong>Publication</strong>: Physical Review B, 1984</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{daw1984embedded,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Daw, Murray S and Baskes, Mike I}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Physical Review B}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{29}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{12}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{6443--6453}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1984}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{APS}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1103/PhysRevB.29.6443}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method-review-1993/">EAM Review (1993)</a></li>
<li><a href="/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method-voter-1994/">EAM User Guide (1994)</a></li>
<li><a href="https://www.ctcms.nist.gov/potentials/">NIST Interatomic Potentials Repository</a></li>
</ul>
]]></content:encoded></item><item><title>Umbrella Sampling: Monte Carlo Free-Energy Estimation</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/umbrella-sampling/</link><pubDate>Thu, 21 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/umbrella-sampling/</guid><description>Torrie and Valleau's 1977 paper introducing Umbrella Sampling, an importance sampling technique for Monte Carlo free-energy calculations.</description><content:encoded><![CDATA[<h2 id="a-methodological-shift-in-monte-carlo-simulations">A Methodological Shift in Monte Carlo Simulations</h2>
<p>This is a <strong>Method</strong> paper that introduces a novel computational technique for Monte Carlo simulations. It presents Umbrella Sampling, an importance sampling approach that uses non-physical distributions to calculate free energy differences in molecular systems.</p>
<h2 id="the-sampling-gap-in-phase-transitions">The Sampling Gap in Phase Transitions</h2>
<p>The paper addresses the failure of conventional Boltzmann-weighted Monte Carlo to estimate free energy differences.</p>
<ul>
<li><strong>The Problem</strong>: The free-energy difference is an integral dominated by configurations that are rare in the reference system. In a conventional Boltzmann-weighted run, the probability density $f_0(\Delta U^*)$ in that region is too small to be sampled accurately.</li>
<li><strong>Phase Transitions</strong>: Conventional &ldquo;thermodynamic integration&rdquo; fails near phase transitions because it requires a path of integration where ensemble averages can be reliably measured, which is difficult in unstable regions.</li>
</ul>
<h2 id="bridging-states-with-non-physical-distributions">Bridging States with Non-Physical Distributions</h2>
<p>The authors introduce a non-physical distribution $\pi(q^N)$ to bridge the gap between a reference system (0) and a system of interest (1).</p>
<ul>
<li><strong>Arbitrary Weights</strong>: They generate a Markov chain with a limiting distribution $\pi(q^N)$ that differs from the Boltzmann distribution of either system. This distribution is written as $\pi(q'^N) = w(q'^N) \exp(-U_0(q'^N)/kT_0) / Z$, where $w(q^N) = W(\Delta U^*)$ is a weighting function chosen to favor configurations with values of $\Delta U^*$ important to the free-energy integral.</li>
<li><strong>Reweighting Formula</strong>: The unbiased average of any property $\theta$ is recovered via the ratio of biased averages:</li>
</ul>
<p>$$\langle\theta\rangle_{0}=\frac{\langle\theta/w\rangle_{w}}{\langle1/w\rangle_{w}}$$</p>
<ul>
<li><strong>Overlap</strong>: The method allows sampling a range of $\Delta U^*$ up to <strong>three times</strong> that of a conventional Monte Carlo experiment, enabling accurate determination of values of $f_0(\Delta U^*)$ as small as $10^{-8}$. If a single weight function cannot span the entire gap, additional overlapping umbrella-sampling experiments are carried out with different weighting functions exploring successively overlapping ranges of $\Delta U^*$.</li>
</ul>
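<p>A minimal sketch of the reweighting formula on a toy one-dimensional system (an assumption for illustration, not one of the paper's fluids). The reference density is a unit Gaussian, the weight $w(x) = e^{x^2/4}$ broadens it into a Gaussian of variance 2, and samples are drawn directly from that biased density instead of from a Markov chain.</p>

```python
import numpy as np

def reweighted_average(theta, w):
    """Unbiased average from biased samples: mean(theta / w) / mean(1 / w)."""
    return np.mean(theta / w) / np.mean(1.0 / w)

rng = np.random.default_rng(0)

# Biased sampling: pi(x) is proportional to w(x) * exp(-x^2 / 2) = exp(-x^2 / 4),
# which is a Gaussian with variance 2.
x = rng.normal(0.0, np.sqrt(2.0), size=200_000)
w = np.exp(x**2 / 4.0)

# Recover reference-ensemble averages under the unit Gaussian.
mean_x2 = reweighted_average(x**2, w)                       # exact value: 1
tail_prob = reweighted_average((x > 3.0).astype(float), w)  # rare-event probability
print(mean_x2, tail_prob)
```

<p>The broadened distribution visits the tail beyond $x = 3$ roughly ten times more often than the reference would, which is the same mechanism that lets umbrella sampling resolve very small values of $f_0(\Delta U^*)$.</p>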
<h2 id="validation-on-lennard-jones-fluids">Validation on Lennard-Jones Fluids</h2>
<p>The authors validated Umbrella Sampling using Monte Carlo simulations of model fluids.</p>
<h3 id="experimental-setup">Experimental Setup</h3>
<ul>
<li><strong>System Specifications</strong>: The study used a <strong>Lennard-Jones (LJ)</strong> fluid and an <strong>inverse-12 &ldquo;soft-sphere&rdquo;</strong> fluid.</li>
<li><strong>System Size</strong>: Simulations were primarily performed with <strong>$N=32$ particles</strong>, with some validation runs at <strong>$N=108$ particles</strong> to check for size dependence.</li>
<li><strong>State Points</strong>: Calculations covered a wide range of densities ($N\sigma^3/V = 0.50$ to $0.85$) and temperatures ($kT/\epsilon = 0.7$ to $2.8$), including the gas-liquid coexistence region.</li>
</ul>
<h3 id="baselines">Baselines</h3>
<ul>
<li><strong>Baselines</strong>: Results were compared to thermodynamic integration data from <strong>Hansen</strong>, <strong>Levesque</strong>, and <strong>Verlet</strong>.</li>
<li><strong>Quantitative Success</strong>:
<ul>
<li><strong>Agreement</strong>: The free energy estimates agreed with pressure integration results to within statistical uncertainties (e.g., at $kT/\epsilon=1.35$, Umbrella Sampling gave -3.236 vs. Conventional -3.25).</li>
<li><strong>Precision</strong>: Free energy differences were obtained with high precision ($\pm 0.005 NkT$ for $N=108$).</li>
<li><strong>Efficiency</strong>: A single umbrella run could replace the &ldquo;numerous runs&rdquo; required for conventional $1/T$ integrations.</li>
</ul>
</li>
</ul>
<h2 id="temperature-scaling-via-reweighting">Temperature Scaling via Reweighting</h2>
<p>When the reference system has the same internal energy function as the system of interest (i.e., the same fluid at a different temperature), the free-energy expression simplifies to:</p>
<p>$$\frac{A(T)}{kT} = \frac{A(T_0)}{kT_0} - \ln \int f_0(U) \exp\left[-U\left(\frac{1}{kT} - \frac{1}{kT_0}\right)\right] dU$$</p>
<p>This is especially useful because a single determination of $f_0(U)$ over a wide energy range gives the free energy over a whole range of temperatures simultaneously. For 32 Lennard-Jones particles, only two umbrella-sampling experiments are needed to span the temperature range from the triple point ($kT/\epsilon = 0.7$) to twice the critical temperature ($kT/\epsilon = 2.8$). For 108 particles, four experiments suffice.</p>
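<p>The temperature-reweighting expression can be sketched on a toy system with a known answer. The example assumes $d$ independent harmonic degrees of freedom with $U = \frac{1}{2}\sum_i x_i^2$, for which $A(T)/kT - A(T_0)/kT_0 = -\frac{d}{2}\ln(T/T_0)$ exactly. The integral over $f_0(U)$ is replaced by a sample mean over energies drawn at $T_0$; the function name <code>free_energy_shift</code> is hypothetical.</p>

```python
import numpy as np

def free_energy_shift(U_samples, kT0, kT):
    """Estimate A(T)/kT - A(T0)/kT0 from energies sampled at temperature T0."""
    delta_beta = 1.0 / kT - 1.0 / kT0
    # The sample mean is a Monte Carlo estimate of the integral over f_0(U).
    return -np.log(np.mean(np.exp(-U_samples * delta_beta)))

rng = np.random.default_rng(0)
d, kT0 = 6, 1.0

# Exact Boltzmann samples of the toy harmonic system at T0.
x = rng.normal(0.0, np.sqrt(kT0), size=(100_000, d))
U = 0.5 * (x**2).sum(axis=1)

# One set of samples yields the free energy at several temperatures.
for kT in (0.8, 1.0, 1.2):
    estimate = free_energy_shift(U, kT0, kT)
    exact = -0.5 * d * np.log(kT / kT0)
    print(f"kT={kT}: estimate={estimate:.4f}, exact={exact:.4f}")
```

<p>Plain Boltzmann samples suffice here only because the temperatures are close to $T_0$. Further from $T_0$ the exponential factor is dominated by energies that are rarely visited, which is where the umbrella weighting becomes necessary.</p>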
<h2 id="mapping-the-liquid-gas-free-energy-surface">Mapping the Liquid-Gas Free Energy Surface</h2>
<ul>
<li><strong>Methodological Utility</strong>: The method successfully mapped the free energy of the LJ fluid across the liquid-gas transition, a region where conventional methods face convergence problems.</li>
<li><strong>N-Dependence</strong>: Comparison between $N=32$ and $N=108$ showed no statistically significant size dependence for free energy differences, suggesting small systems are sufficient for these estimates.</li>
<li><strong>Comparison with Gosling-Singer Method</strong>: The paper contrasts its results with free energies derived from Gosling and Singer&rsquo;s entropy estimation technique, finding discrepancies as large as $0.4N\epsilon$ (a 20% error in the nonideal entropy), equivalent to overestimating the configurational integral of a 108-particle system by a factor of $10^{16}$.</li>
<li><strong>Generality</strong>: While demonstrated on energy ($U$), the authors note the weighting function $w$ can be any function of the coordinates, generalizing the technique beyond simple free energy differences.</li>
</ul>
<h2 id="reproducibility">Reproducibility</h2>
<p>This 1977 paper predates modern code-sharing practices, and no source code or data files are publicly available. However, the paper provides sufficient algorithmic detail for reimplementation:</p>
<ul>
<li><strong>Constructing $W$</strong>: The paper does not derive $W$ analytically. It uses a <strong>trial-and-error procedure</strong>: start with a short Boltzmann-weighted experiment, then broaden the distribution in stages through short test runs, adjusting weights to flatten the probability density $f_w(\Delta U^*)$. The paper acknowledges this requires &ldquo;interaction between the trial computer results and human judgment.&rdquo;</li>
<li><strong>Specific Weights</strong>: Table I provides the exact numerical weights used for the 32-particle soft-sphere experiment at $N\sigma^3/V = 0.85$, $kT/\epsilon = 2.74$, with values spanning from $W=1{,}500{,}000$ at the lowest energies down to $W=1.0$ at the center and back up to $W=16.0$ at the highest energies.</li>
<li><strong>Potentials</strong>: The Lennard-Jones and inverse-twelve potentials are fully specified (Eqs. 8 and 9).</li>
<li><strong>State Points</strong>: Densities and temperatures are enumerated in Tables II and III.</li>
<li><strong>Block Averaging</strong>: Errors were estimated by treating sequences of $m$ steps as independent samples, where $m$ is determined by increasing block size until no systematic trends can be detected in either the average or the standard deviation of the mean.</li>
</ul>
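<p>The block-averaging idea in the last item can be sketched as follows. The correlated series is a hypothetical stand-in for a Monte Carlo observable, and the block sizes scanned are arbitrary choices; the paper does not prescribe either.</p>

```python
import numpy as np

def block_standard_error(samples, block_size):
    """Standard error of the mean, treating each block of steps as one independent sample."""
    n_blocks = len(samples) // block_size
    blocks = samples[: n_blocks * block_size].reshape(n_blocks, block_size)
    return blocks.mean(axis=1).std(ddof=1) / np.sqrt(n_blocks)

# Hypothetical correlated series: each step remembers 90% of the previous value.
rng = np.random.default_rng(0)
n_steps, memory = 2**16, 0.9
noise = rng.normal(size=n_steps)
series = np.empty(n_steps)
series[0] = noise[0]
for step in range(1, n_steps):
    series[step] = memory * series[step - 1] + noise[step]

# Increase the block size m until the error estimate stops growing systematically.
for m in (1, 4, 16, 64, 256, 1024):
    print(m, block_standard_error(series, m))
```

<p>With $m = 1$ the estimate ignores correlations and is too small. As $m$ grows past the correlation length, the estimate levels off, and that plateau is the error to report.</p>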
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Torrie, G. M., &amp; Valleau, J. P. (1977). Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. <em>Journal of Computational Physics</em>, 23(2), 187-199. <a href="https://doi.org/10.1016/0021-9991(77)90121-8">https://doi.org/10.1016/0021-9991(77)90121-8</a></p>
<p><strong>Publication</strong>: Journal of Computational Physics, 1977</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{torrie1977nonphysical,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Torrie, Glenn M and Valleau, John P}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Journal of Computational Physics}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{23}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{2}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{187--199}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1977}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Elsevier}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1016/0021-9991(77)90121-8}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Kabsch Algorithm: NumPy, PyTorch, TensorFlow, and JAX</title><link>https://hunterheidenreich.com/posts/kabsch-algorithm/</link><pubDate>Tue, 03 Oct 2023 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/posts/kabsch-algorithm/</guid><description>Learn about the Kabsch algorithm for optimal point alignment with implementations in NumPy, PyTorch, TensorFlow, and JAX for ML applications.</description><content:encoded><![CDATA[<h2 id="what-is-the-kabsch-algorithm">What is the Kabsch Algorithm?</h2>
<p>In computer vision and scientific computing, a common problem arises: given two sets of paired points, what is the optimal rigid-body transformation that aligns them? The Kabsch algorithm solves this in closed form using a singular value decomposition.</p>

<figure class="post-figure center ">
    <img src="/img/scientific-computing/kabsch-alignment-before-and-after.webp"
         alt="Visualization of two point sets before and after Kabsch alignment"
         title="Visualization of two point sets before and after Kabsch alignment"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The Kabsch algorithm optimally rotates and translates the blue points to align with the red points.</figcaption>
    
</figure>

<p>What are some concrete situations where this crops up?</p>
<ul>
<li><strong>Molecular Dynamics</strong>: Your points are a set of atoms (with physically relevant types), and you want to compare two molecular conformations. Are they the same structure with minor noise or rotation? Or are they different conformations, like a different folding of a protein? This is especially helpful when applying generative models to chemical structures. For example, if you are building a <a href="/notes/chemistry/molecular-simulation/ml-potentials/denoise-vae/">3D Molecular VAE</a> in PyTorch or working with <a href="/notes/machine-learning/generative-models/flow-matching-for-generative-modeling/">Flow Matching models</a>, Kabsch alignment ensures your generative loss function remains rotationally invariant.</li>
<li><strong>Computer Vision</strong>: You have two point clouds from 3D scans of an object taken from different angles. You want to align them to reconstruct the full shape. Or perhaps you&rsquo;re generating 3D shapes from 2D images and need to compare the generated shape to a ground truth scan. Anytime a 3D system is represented as a point cloud, the Kabsch algorithm can help with alignment.</li>
</ul>
<p>Of course, existing libraries implement this algorithm. However, I often find it beneficial to implement algorithms from scratch to build intuition. Furthermore, modern machine learning applications require automatic differentiation, so we will implement the algorithm in PyTorch, TensorFlow, and JAX.</p>
<p>Below, we&rsquo;ll cover the math behind the Kabsch algorithm (and its scaling variant, the <strong>Kabsch-Umeyama</strong> algorithm) and provide complete, differentiable implementations in <strong>NumPy</strong>, <strong>PyTorch</strong>, <strong>TensorFlow</strong>, and <strong>JAX</strong>, demonstrating both single-pair and batched computations for ML applications.</p>
<h2 id="the-math">The Math</h2>

<figure class="post-figure center ">
    <img src="/img/scientific-computing/kabsch-algorithm-basic-animation.webp"
         alt="Animation showing the iterative steps of centroid alignment and rotation"
         title="Animation showing the iterative steps of centroid alignment and rotation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Visualizing the alignment process: first centering the datasets, then finding the optimal rotation.</figcaption>
    
</figure>

<p>Let&rsquo;s say we have two sets of paired points,
$P=\{\mathbf{p}_i\} \in \mathbb{R}^{N \times D}$ and $Q=\{\mathbf{q}_i\} \in \mathbb{R}^{N \times D}$, for $i = 1, \dots, N$
(where $D$ is the dimensionality and $N$ is the number of points).
We want to find a translation vector $\mathbf{t}$ and rotation matrix $R$ to transform $P$ to align with $Q$.</p>
<p>The optimization problem is:</p>
<p>$$
\min_{\mathbf{t}, \ R} \mathcal{L}(\mathbf{t}, R) = \frac{1}{2} \sum_{i=1}^N \| \mathbf{q}_i - (R\mathbf{p}_i + \mathbf{t}) \|^2
$$</p>
<p>where $\mathbf{t}^\ast \in \mathbb{R}^D$ and $R^\ast \in \mathbb{R}^{D \times D}$ are the optimal translation and rotation.</p>
<p>Often we use a weighted version with weights $w_i$ (e.g., atomic masses in molecular dynamics):</p>
<p>$$
\min_{\mathbf{t}, \ R} \mathcal{L}(\mathbf{t}, R) = \frac{1}{2} \sum_{i=1}^N w_i \| \mathbf{q}_i - (R\mathbf{p}_i + \mathbf{t}) \|^2
$$</p>
<h3 id="the-translation">The Translation</h3>
<p>The translation and rotation are coupled, but they separate cleanly once we work in centroid-centered coordinates. Compute the centroids (averages) of both point sets:</p>
<p>$$
\bar{\mathbf{p}} = \frac{1}{N} \sum_{i=1}^N \mathbf{p}_i \quad \text{and} \quad \bar{\mathbf{q}} = \frac{1}{N} \sum_{i=1}^N \mathbf{q}_i
$$</p>
<p>For any fixed rotation $R$, the translation that minimizes $\mathcal{L}$ is found by setting $\partial \mathcal{L} / \partial \mathbf{t} = 0$. It maps the rotated source centroid onto the target centroid:</p>
<p>$$
\mathbf{t} = \bar{\mathbf{q}} - R\bar{\mathbf{p}}
$$</p>
<p>A tempting shortcut is to write $\mathbf{t} = \bar{\mathbf{q}} - \bar{\mathbf{p}}$, but that is only correct when $R = I$. In general the translation depends on the rotation, so we compute it <em>after</em> solving for $R$. Substituting this optimal $\mathbf{t}$ back into the objective cancels the centroids and leaves a rotation-only problem in the centered coordinates $\mathbf{p}_i^\prime = \mathbf{p}_i - \bar{\mathbf{p}}$ and $\mathbf{q}_i^\prime = \mathbf{q}_i - \bar{\mathbf{q}}$:</p>
<p>$$
\mathcal{L}(R) = \frac{1}{2} \sum_{i=1}^N \| \mathbf{q}_i^\prime - R\mathbf{p}_i^\prime \|^2
$$</p>
<p>which is what the next section solves.</p>
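<p>To make the coupling between translation and rotation concrete, here is a small NumPy sketch under assumed synthetic data. Points are stored as rows, the rotation is a hypothetical 60-degree turn about the $z$-axis, and the helper <code>loss</code> is simply the objective $\mathcal{L}$ written out.</p>

```python
import numpy as np

def loss(P, Q, R, t):
    """0.5 * sum_i ||q_i - (R p_i + t)||^2, with points stored as rows."""
    residual = Q - (P @ R.T + t)
    return 0.5 * np.sum(residual**2)

rng = np.random.default_rng(0)
P = rng.normal(size=(50, 3)) + np.array([5.0, -2.0, 1.0])   # off-center source

theta = np.pi / 3.0
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t_true = np.array([1.0, 2.0, 3.0])
Q = P @ R.T + t_true                                        # exact rigid transform

p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
t_opt = q_bar - R @ p_bar     # optimal translation for this R
t_naive = q_bar - p_bar       # only correct when R is the identity

loss_opt = loss(P, Q, R, t_opt)
loss_naive = loss(P, Q, R, t_naive)
# Centering both sets removes the translation from the problem entirely.
loss_centered = loss(P - p_bar, Q - q_bar, R, np.zeros(3))
print(loss_opt, loss_naive, loss_centered)
```

<p>The optimal translation recovers <code>t_true</code> and drives the loss to zero (up to floating-point error), the naive shortcut leaves a large residual, and the centered problem has the same loss as the original problem evaluated at the optimal translation.</p>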
<h3 id="the-rotation-matrix">The Rotation Matrix</h3>
<p>We now minimize $\mathcal{L}(R)$ over rotations, using the centered points $\mathbf{p}_i^\prime$ and $\mathbf{q}_i^\prime$ from above. Compute the cross-covariance matrix between the centered sets:</p>
<p>$$
C = P^{\prime T} Q^\prime = \sum_{i=1}^N \mathbf{p}_i^{\prime} \mathbf{q}_i^{\prime T} \in \mathbb{R}^{D \times D}
$$</p>
<p>This is a fairly lightweight operation since $D$ is typically small (e.g., 3 for 3D points), even if $N$ is large.</p>
<p>With $C$ in hand, we want to compute its Singular Value Decomposition (SVD):</p>
<p>$$
C = U \Sigma V^T
$$</p>
<p>The SVD is the heaviest piece of linear algebra in the algorithm: it scales cubically with $D$ (i.e., $O(D^3)$).
However, since we&rsquo;re usually interested in cases where $D$ is small (e.g., 2D or 3D points), this is manageable.</p>
<p>Next, we check for improper rotations (i.e., reflections) and correct for them where necessary:</p>
<p>$$
d = \text{sign}(\det(V U^T))
$$</p>
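<p>If $d = -1$, the unconstrained solution $V U^T$ is a reflection, so we flip the sign of the last column of $V$ when forming the final rotation matrix.</p>
<p>Let $B = \text{diag}(1, 1, d)$ (for general $D$, the identity matrix with its last diagonal entry replaced by $d$).
The optimal rotation matrix is then:</p>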
<p>$$
R^\ast = V B U^T
$$</p>
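<p>To see the reflection correction in action, here is a minimal sketch on a deliberately mirrored point set. The data and variable names are illustrative assumptions. Because the target is a mirror image of the source, the best orthogonal map has determinant $-1$, and the $B$ matrix is what forces a proper rotation.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import numpy as np

rng = np.random.default_rng(1)

# Hypothetical example: Q is a mirror image of P (x-coordinate negated),
# so the best orthogonal map is a reflection, which is not a valid rotation.
P = rng.normal(size=(20, 3))
Q = P * np.array([-1.0, 1.0, 1.0])

p_c = P - P.mean(axis=0)
q_c = Q - Q.mean(axis=0)

C = p_c.T @ q_c                 # cross-covariance, 3 x 3
U, S, Vt = np.linalg.svd(C)     # C = U diag(S) Vt
V = Vt.T

d = np.sign(np.linalg.det(V @ U.T))
B = np.diag([1.0, 1.0, d])

R_uncorrected = V @ U.T
R = V @ B @ U.T

print(d)                              # -1.0 for mirrored data
print(np.linalg.det(R_uncorrected))   # close to -1: a reflection
print(np.linalg.det(R))               # close to +1: a proper rotation
</code></pre></div>
<p>The corrected $R$ no longer aligns the mirrored set exactly (no rotation can), but it is the best alignment among proper rotations.</p>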
<h3 id="summary">Summary</h3>
<p>In a nutshell, the Kabsch algorithm boils down to:</p>
<ol>
<li>Compute centroids of $P$ and $Q$ ($\bar{\mathbf{p}}$ and $\bar{\mathbf{q}}$)</li>
<li>Center both point sets by subtracting centroids: $P^\prime$ and $Q^\prime$</li>
<li>Compute cross-covariance matrix $C = P^{\prime T} Q^\prime$</li>
<li>Compute SVD: $C = U \Sigma V^T$ (<em>expensive step</em>)</li>
<li>Compute $d = \text{sign}(\det(V U^T))$ and $B = \text{diag}(1, 1, d)$</li>
<li>Optimal rotation: $R^\ast = V B U^T$</li>
<li>Optimal translation (using the rotation from step 6): $\mathbf{t}^\ast = \bar{\mathbf{q}} - R^\ast\bar{\mathbf{p}}$</li>
</ol>
<p>The resulting root-mean-square deviation (RMSD) between aligned point sets is</p>
<p>$$
\text{RMSD} = \sqrt{\frac{1}{N} \sum_{i=1}^N | \mathbf{q}_i - (R^\ast\mathbf{p}_i + \mathbf{t}^\ast) |^2}
$$</p>















<figure class="post-figure center ">
    <img src="/img/scientific-computing/kabsch-algorithm-visualized-rmsd.webp"
         alt="Diagram illustrating Root Mean Square Deviation (RMSD) distances"
         title="Diagram illustrating Root Mean Square Deviation (RMSD) distances"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">RMSD measures the average distance between the aligned points.</figcaption>
    
</figure>

<p>RMSD is frequently used as a measure of similarity between molecular structures and as a term in loss functions for ML applications.</p>
<h3 id="the-kabsch-umeyama-algorithm-scaling">The Kabsch-Umeyama Algorithm (Scaling)</h3>
<p>While the standard Kabsch algorithm solves for optimal rotation and translation, the <strong>Kabsch-Umeyama algorithm</strong> extends this by also finding an optimal <strong>scaling factor</strong> $c$. This is essential when aligning structures of different scales, such as a 3D scan versus a ground truth model.</p>
<p><em>(Note: This is sometimes misspelled as the &ldquo;Absch-Umeyama algorithm.&rdquo; The extension is due to Shinji Umeyama, building on Wolfgang Kabsch&rsquo;s work.)</em></p>
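<p>The method estimates the transformation $\mathbf{q}_i \approx c R \mathbf{p}_i + \mathbf{t}$. The optimal scale is the sum of the reflection-corrected singular values of the cross-covariance divided by the total squared distance of the source points from their centroid: $c = \text{tr}(\Sigma B) / \sum_{i=1}^N | \mathbf{p}_i^\prime |^2$. (Umeyama normalizes both the covariance and the source variance by $1/N$, which gives the same ratio.) The translation then becomes $\mathbf{t} = \bar{\mathbf{q}} - c R \bar{\mathbf{p}}$. See the <a href="/notes/biology/computational-biology/umeyama-similarity-transformation/">Umeyama paper notes</a> for the full derivation.</p>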
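<p>The scale estimate can be sketched in a few lines of NumPy. This is an unweighted, illustrative version written to match the notation of this post. The function name <code>kabsch_umeyama_numpy</code> and the synthetic test data are assumptions, not the API of any particular library.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import numpy as np


def kabsch_umeyama_numpy(P, Q):
    """
    Illustrative sketch: similarity transform Q ~ c * R @ P + t (unweighted).

    :param P: A Nx3 matrix of source points
    :param Q: A Nx3 matrix of target points
    :return: Scale c, rotation R (3x3), translation t (3,)
    """
    centroid_P = P.mean(axis=0)
    centroid_Q = Q.mean(axis=0)
    p = P - centroid_P
    q = Q - centroid_Q

    # Unnormalized cross-covariance and its SVD
    C = p.T @ q
    U, S, Vt = np.linalg.svd(C)

    # Reflection correction, stored as the diagonal of B
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    B = np.array([1.0, 1.0, d])

    # Scale the columns of V by B, then multiply by U^T
    R = (Vt.T * B) @ U.T

    # Scale: reflection-corrected singular values over the summed squared
    # norms of the centered source points
    c = np.sum(S * B) / np.sum(p ** 2)

    # The translation has to account for the scale as well
    t = centroid_Q - c * (R @ centroid_P)
    return c, R, t


# Synthetic check: apply a known scale, rotation, and shift, then recover them
rng = np.random.default_rng(2)
P = rng.normal(size=(100, 3))
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta), np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
Q = 2.5 * P @ R_true.T + np.array([1.0, -2.0, 0.5])

c, R, t = kabsch_umeyama_numpy(P, Q)
print(c)   # approximately 2.5
</code></pre></div>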
<p><strong>A Note on SVD and Automatic Differentiation</strong></p>
<p>Modern frameworks can backpropagate through the Singular Value Decomposition (SVD), but the gradient has a known stability issue. If the cross-covariance matrix has identical (degenerate) singular values, which can occur if the point clouds are perfectly aligned or have certain symmetries, the gradient of the SVD diverges and produces <code>NaN</code> values during backpropagation. If you plan to use this algorithm inside a loss function for a neural network, it is often necessary to add a tiny epsilon to the matrix before computing the SVD, or to use a patched SVD gradient. The <a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> library provides a SafeSVD primitive that floors the singular-value-gap denominator at machine epsilon in the backward pass, producing finite gradients at degenerate inputs across PyTorch, JAX, TensorFlow, and MLX.</p>
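<p>To make the failure mode concrete: the standard SVD backward formulas contain off-diagonal factors of the form $1 / (\sigma_j^2 - \sigma_i^2)$, which blow up when two singular values coincide. The NumPy sketch below only illustrates that mechanism and one way to floor the denominator. The function name <code>gap_factors</code> is hypothetical, and this is not the implementation of SafeSVD or of any framework&rsquo;s autograd.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import numpy as np


def gap_factors(S, eps=None):
    """
    Off-diagonal factors 1 / (s_j^2 - s_i^2) of the kind that appear in the
    SVD backward pass. If eps is given, the magnitude of each denominator is
    floored at eps (sign preserved) so the result stays finite.
    """
    s2 = S ** 2
    gap = s2[None, :] - s2[:, None]
    if eps is not None:
        gap = np.copysign(np.maximum(np.abs(gap), eps), gap)
    np.fill_diagonal(gap, 1.0)   # placeholder to avoid 0/0 on the diagonal
    F = 1.0 / gap
    np.fill_diagonal(F, 0.0)     # diagonal terms are not used
    return F


S = np.array([2.0, 2.0, 1.0])    # two identical singular values
eps = np.finfo(S.dtype).eps

with np.errstate(divide="ignore"):
    F_raw = gap_factors(S)
F_safe = gap_factors(S, eps=eps)

print(np.isfinite(F_raw).all())    # False: the degenerate pair gives inf
print(np.isfinite(F_safe).all())   # True
</code></pre></div>
<p>Flooring keeps the backward pass finite, but the floored entries are still very large, so the gradient near a degeneracy should be treated as a numerical safeguard and not as a meaningful direction.</p>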
<h2 id="implementation">Implementation</h2>
<p>Let&rsquo;s implement the algorithm in different frameworks. Note that for simplicity, the following implementations cover the <strong>unweighted</strong> Kabsch algorithm. If your application (like molecular dynamics) requires weights (e.g., atomic masses), the <a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> library provides per-point weighted alignment out of the box.</p>
<h3 id="numpy">NumPy</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> numpy <span style="color:#66d9ef">as</span> np
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_numpy</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>dot(p<span style="color:#f92672">.</span>T, q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Validate right-handed coordinate system</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(np<span style="color:#f92672">.</span>dot(Vt<span style="color:#f92672">.</span>T, U<span style="color:#f92672">.</span>T)) <span style="color:#f92672">&lt;</span> <span style="color:#ae81ff">0.0</span>:
</span></span><span style="display:flex;"><span>        Vt[<span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>, :] <span style="color:#f92672">*=</span> <span style="color:#f92672">-</span><span style="color:#ae81ff">1.0</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal rotation</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>dot(Vt<span style="color:#f92672">.</span>T, U<span style="color:#f92672">.</span>T)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q <span style="color:#f92672">-</span> np<span style="color:#f92672">.</span>dot(R, centroid_P)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>sqrt(np<span style="color:#f92672">.</span>sum(np<span style="color:#f92672">.</span>square(np<span style="color:#f92672">.</span>dot(p, R<span style="color:#f92672">.</span>T) <span style="color:#f92672">-</span> q)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">0</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><p>Here&rsquo;s a quick test to verify correctness:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">test_numpy</span>():
</span></span><span style="display:flex;"><span>    np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>seed(<span style="color:#ae81ff">12345</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    P <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>randn(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">3</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    alpha <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>rand() <span style="color:#f92672">*</span> <span style="color:#ae81ff">2</span> <span style="color:#f92672">*</span> np<span style="color:#f92672">.</span>pi
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>array([[np<span style="color:#f92672">.</span>cos(alpha), <span style="color:#f92672">-</span>np<span style="color:#f92672">.</span>sin(alpha), <span style="color:#ae81ff">0</span>],
</span></span><span style="display:flex;"><span>                    [np<span style="color:#f92672">.</span>sin(alpha), np<span style="color:#f92672">.</span>cos(alpha), <span style="color:#ae81ff">0</span>],
</span></span><span style="display:flex;"><span>                    [<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>]])
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>randn(<span style="color:#ae81ff">3</span>) <span style="color:#f92672">*</span> <span style="color:#ae81ff">10</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    Q <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>dot(P, R<span style="color:#f92672">.</span>T) <span style="color:#f92672">+</span> t
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    R_opt, t_opt, rmsd <span style="color:#f92672">=</span> kabsch_numpy(P, Q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;RMSD: </span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(rmsd))
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;R:</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(R))
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;R_opt:</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(R_opt))
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;t:</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(t))
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;t_opt:</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(t_opt))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    l2_t <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>norm(t <span style="color:#f92672">-</span> t_opt)
</span></span><span style="display:flex;"><span>    l2_R <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>norm(R <span style="color:#f92672">-</span> R_opt)
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;l2_t: </span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(l2_t))
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;l2_R: </span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(l2_R))
</span></span></code></pre></div><p>Running this test shows the algorithm correctly recovers the rotation and translation:</p>
<pre><code>RMSD: 3.2111501877699246e-15
R:
[[-0.8475392 -0.5307328  0.       ]
 [ 0.5307328 -0.8475392  0.       ]
 [ 0.         0.         1.       ]]
R_opt:
[[-8.47539198e-01 -5.30732803e-01 -2.95434260e-16]
 [ 5.30732803e-01 -8.47539198e-01  2.92859649e-16]
 [ 0.00000000e+00 -2.77555756e-16  1.00000000e+00]]
t:
[ 5.99726796  1.50078468 -3.34633977]
t_opt:
[ 5.99726796  1.50078468 -3.34633977]
l2_t: 2.7012892057857038e-15
l2_R: 8.028174304721057e-16
</code></pre>
<p>Both the rotation and the translation are recovered to within floating-point precision (the residuals <code>l2_t</code> and <code>l2_R</code> are on the order of <code>1e-15</code>).</p>
<p>For batch processing:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_numpy_batched</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>matmul(p<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>), q)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Validate right-handed coordinate system</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(np<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>), U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>)))
</span></span><span style="display:flex;"><span>    flip <span style="color:#f92672">=</span> d <span style="color:#f92672">&lt;</span> <span style="color:#ae81ff">0.0</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> flip<span style="color:#f92672">.</span>any():
</span></span><span style="display:flex;"><span>        Vt[flip, <span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>, :] <span style="color:#f92672">*=</span> <span style="color:#f92672">-</span><span style="color:#ae81ff">1.0</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal rotation</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>), U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>))  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>) <span style="color:#f92672">-</span> np<span style="color:#f92672">.</span>matmul(centroid_P, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>))<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>)  <span style="color:#75715e"># Bx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>sqrt(np<span style="color:#f92672">.</span>sum(np<span style="color:#f92672">.</span>square(np<span style="color:#f92672">.</span>matmul(p, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>)) <span style="color:#f92672">-</span> q), axis<span style="color:#f92672">=</span>(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">1</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><h3 id="pytorch">PyTorch</h3>


<p><details >
  <summary markdown="span">📝 Important Update (February 15, 2026)</summary>
  <strong>Bug Fix Notice:</strong> The PyTorch implementation has been updated to use the &ldquo;B-matrix&rdquo; broadcasting approach. This eliminates in-place tensor modification (which breaks <code>autograd</code>) and data-dependent control flow (which breaks <code>torch.compile</code> and <code>torch.vmap</code>).
</details></p>

<p>The PyTorch implementation now uses broadcasting to ensure differentiability:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> torch
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_torch</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>mean(P, dim<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>mean(Q, dim<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>matmul(p<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>), q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate determinant</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>det(torch<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>), U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>)))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build diagonal B tensor without in-place mutation</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># We use stack to preserve gradients and graph connections</span>
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>stack([torch<span style="color:#f92672">.</span>tensor(<span style="color:#ae81ff">1.0</span>, device<span style="color:#f92672">=</span>d<span style="color:#f92672">.</span>device, dtype<span style="color:#f92672">=</span>d<span style="color:#f92672">.</span>dtype),
</span></span><span style="display:flex;"><span>                          torch<span style="color:#f92672">.</span>tensor(<span style="color:#ae81ff">1.0</span>, device<span style="color:#f92672">=</span>d<span style="color:#f92672">.</span>device, dtype<span style="color:#f92672">=</span>d<span style="color:#f92672">.</span>dtype),
</span></span><span style="display:flex;"><span>                          torch<span style="color:#f92672">.</span>sign(d)])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of Vt.T via broadcasting, then multiply by U^T</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Vt.T: (3, 3). B_diag: (3) -&gt; B_diag[None, :]: (1, 3)</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>) <span style="color:#f92672">*</span> B_diag[<span style="color:#66d9ef">None</span>, :], U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q <span style="color:#f92672">-</span> centroid_P <span style="color:#f92672">@</span> R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>sqrt(torch<span style="color:#f92672">.</span>sum(torch<span style="color:#f92672">.</span>square(torch<span style="color:#f92672">.</span>matmul(p, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>)) <span style="color:#f92672">-</span> q)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">0</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><p>And our batched version:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_torch_batched</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD, in a batched manner.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>mean(P, dim<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>mean(Q, dim<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>matmul(p<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>), q)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate batched determinant</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>det(torch<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>), U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)))  <span style="color:#75715e"># B</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build batched B_diag without in-place mutation or control flow</span>
</span></span><span style="display:flex;"><span>    ones <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>ones_like(d)
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>stack([ones, ones, torch<span style="color:#f92672">.</span>sign(d)], dim<span style="color:#f92672">=-</span><span style="color:#ae81ff">1</span>) <span style="color:#75715e"># Bx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of Vt.T and multiply</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Vt.T: (B, 3, 3). B_diag: (B, 3). B_diag[:, None, :]: (B, 1, 3).</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>) <span style="color:#f92672">*</span> B_diag[:, <span style="color:#66d9ef">None</span>, :], U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>) <span style="color:#f92672">-</span> torch<span style="color:#f92672">.</span>matmul(centroid_P, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>))<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>)  <span style="color:#75715e"># Bx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>sqrt(torch<span style="color:#f92672">.</span>sum(torch<span style="color:#f92672">.</span>square(torch<span style="color:#f92672">.</span>matmul(p, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)) <span style="color:#f92672">-</span> q), dim<span style="color:#f92672">=</span>(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">1</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><h3 id="tensorflow">TensorFlow</h3>
<p>The <code>tf.linalg.svd</code> function returns the singular values <code>S</code> first, followed by <code>U</code> and <code>V</code> (not $V^T$), so <code>V</code> needs no transpose. Because tensors are immutable and the function may be compiled (e.g., via <code>@tf.function</code>), we avoid explicit conditional branching by constructing the diagonal of a correction matrix $B$ and broadcasting it.</p>
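<p>To see why the broadcast can stand in for the branch, here is a minimal sketch in plain NumPy (used only for brevity; the function names <code>rotation_with_branch</code> and <code>rotation_branch_free</code> are hypothetical). It builds a mirrored point set so that the reflection case is triggered, then checks that both forms give the same proper rotation.</p>

```python
import numpy as np


def rotation_with_branch(H):
    """Reference form: explicit if-statement plus an in-place column flip."""
    U, _, Vt = np.linalg.svd(H)
    V = Vt.T.copy()
    d = np.linalg.det(V @ U.T)
    if np.sign(d) == -1.0:
        V[:, -1] *= -1.0  # flip the last column to undo the reflection
    return V @ U.T


def rotation_branch_free(H):
    """Same correction as a broadcast, with no branch and no mutation."""
    U, _, Vt = np.linalg.svd(H)
    d = np.linalg.det(Vt.T @ U.T)
    B_diag = np.array([1.0, 1.0, np.sign(d)])
    # (3, 3) * (1, 3): column j of V is multiplied by B_diag[j]
    return (Vt.T * B_diag[None, :]) @ U.T


# Illustrative data: Q is the mirror image of P, so det(V U^T) is negative
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 3))
Q = P * np.array([1.0, 1.0, -1.0])
p = P - P.mean(axis=0)
q = Q - Q.mean(axis=0)
H = p.T @ q

R_branch = rotation_with_branch(H)
R_free = rotation_branch_free(H)
```

<p>Both calls return the same matrix with determinant +1. The TensorFlow implementation applies the same broadcast:</p>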
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> tensorflow <span style="color:#66d9ef">as</span> tf
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_tensorflow</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    P <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>convert_to_tensor(P, dtype<span style="color:#f92672">=</span>tf<span style="color:#f92672">.</span>float32)
</span></span><span style="display:flex;"><span>    Q <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>convert_to_tensor(Q, dtype<span style="color:#f92672">=</span>tf<span style="color:#f92672">.</span>float32)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>reduce_mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>reduce_mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>matmul(tf<span style="color:#f92672">.</span>transpose(p), q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    S, U, V <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate determinant</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Note: V in TF SVD is V, not V^T.</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># R = V * U^T. Det(R) = Det(V * U^T)</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(tf<span style="color:#f92672">.</span>matmul(V, tf<span style="color:#f92672">.</span>transpose(U)))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build diagonal B tensor: [1.0, 1.0, sign(d)]</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Use static shape 3 if possible, or infer from D. Assuming D=3 here.</span>
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>stack([<span style="color:#ae81ff">1.0</span>, <span style="color:#ae81ff">1.0</span>, tf<span style="color:#f92672">.</span>sign(d)])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of V via broadcasting (V * B_diag), then multiply by U^T</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># V is DxD, B_diag is D. V * B_diag[None, :] multiplies each column j by B_diag[j]</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>matmul(V <span style="color:#f92672">*</span> B_diag[<span style="color:#66d9ef">None</span>, :], tf<span style="color:#f92672">.</span>transpose(U))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q <span style="color:#f92672">-</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>matvec(R, centroid_P)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>sqrt(tf<span style="color:#f92672">.</span>reduce_sum(tf<span style="color:#f92672">.</span>square(tf<span style="color:#f92672">.</span>matmul(p, tf<span style="color:#f92672">.</span>transpose(R)) <span style="color:#f92672">-</span> q)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">0</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><p>And a batched version:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_tensorflow_batched</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    P <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>convert_to_tensor(P, dtype<span style="color:#f92672">=</span>tf<span style="color:#f92672">.</span>float32)
</span></span><span style="display:flex;"><span>    Q <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>convert_to_tensor(Q, dtype<span style="color:#f92672">=</span>tf<span style="color:#f92672">.</span>float32)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>reduce_mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>reduce_mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>matmul(tf<span style="color:#f92672">.</span>transpose(p, perm<span style="color:#f92672">=</span>[<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>]), q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    S, U, V <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate batched determinant</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(tf<span style="color:#f92672">.</span>matmul(V, tf<span style="color:#f92672">.</span>transpose(U, perm<span style="color:#f92672">=</span>[<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>])))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build batched B_diag: shape (B, 3)</span>
</span></span><span style="display:flex;"><span>    ones <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>ones_like(d)
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>stack([ones, ones, tf<span style="color:#f92672">.</span>sign(d)], axis<span style="color:#f92672">=-</span><span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of V (indexing with None adds the middle dimension for broadcasting)</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># V: (B, 3, 3), B_diag: (B, 3) -&gt; B_diag[:, None, :]: (B, 1, 3)</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>matmul(V <span style="color:#f92672">*</span> B_diag[:, <span style="color:#66d9ef">None</span>, :], tf<span style="color:#f92672">.</span>transpose(U, perm<span style="color:#f92672">=</span>[<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>]))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>squeeze(centroid_Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>) <span style="color:#f92672">-</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>matvec(R, tf<span style="color:#f92672">.</span>squeeze(centroid_P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>))  <span style="color:#75715e"># Bx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>sqrt(tf<span style="color:#f92672">.</span>reduce_sum(tf<span style="color:#f92672">.</span>square(tf<span style="color:#f92672">.</span>matmul(p, tf<span style="color:#f92672">.</span>transpose(R, perm<span style="color:#f92672">=</span>[<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>])) <span style="color:#f92672">-</span> q), axis<span style="color:#f92672">=</span>(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">1</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><h3 id="jax">JAX</h3>
<p>The JAX implementation closely mirrors NumPy, replacing <code>np</code> with <code>jnp</code>. However, we again avoid <code>if</code> statements on computed values (which fail when the function is traced by <code>jax.jit</code>) and in-place assignment (which immutable JAX arrays do not support) by using the broadcasting B-matrix approach.</p>
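<p>As a rough illustration of both restrictions, the sketch below assumes a recent JAX release; the function names and the test matrix <code>H</code> are hypothetical. It shows a Python branch on the determinant failing under <code>jax.jit</code>, item assignment being rejected, and the branch-free form compiling without complaint.</p>

```python
import jax
import jax.numpy as jnp


def rotation_with_branch(H):
    """Python if-statement on a computed value; this cannot be traced."""
    U, _, Vt = jnp.linalg.svd(H)
    V = Vt.T
    d = jnp.linalg.det(V @ U.T)
    if jnp.sign(d) == -1.0:
        # Functional update, since V[:, -1] *= -1 is not allowed
        V = V.at[:, -1].multiply(-1.0)
    return V @ U.T


def rotation_branch_free(H):
    """Broadcast form: no branch, no mutation, so it is safe to jit."""
    U, _, Vt = jnp.linalg.svd(H)
    d = jnp.linalg.det(Vt.T @ U.T)
    B_diag = jnp.array([1.0, 1.0, jnp.sign(d)])
    return (Vt.T * B_diag[None, :]) @ U.T


# Illustrative covariance matrix with a negative determinant
H = jnp.array([[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -0.5]])

R_jit = jax.jit(rotation_branch_free)(H)

try:
    jax.jit(rotation_with_branch)(H)
    branch_traced = True
except jax.errors.ConcretizationTypeError:
    branch_traced = False  # the if-statement needs a concrete boolean

try:
    H[0, 0] = 1.0
    assigned_in_place = True
except TypeError:
    assigned_in_place = False  # JAX arrays are immutable
```

<p>The branching version still runs eagerly, outside <code>jax.jit</code>; it is only tracing that rejects it. The JAX implementation uses the branch-free form throughout:</p>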
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> jax.numpy <span style="color:#66d9ef">as</span> jnp
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_jax</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    P <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>array(P)
</span></span><span style="display:flex;"><span>    Q <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>array(Q)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>dot(p<span style="color:#f92672">.</span>T, q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate determinant</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(jnp<span style="color:#f92672">.</span>dot(Vt<span style="color:#f92672">.</span>T, U<span style="color:#f92672">.</span>T))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build diagonal B array</span>
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>array([<span style="color:#ae81ff">1.0</span>, <span style="color:#ae81ff">1.0</span>, jnp<span style="color:#f92672">.</span>sign(d)])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of Vt.T and multiply by U.T</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Vt.T is V.</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>dot(Vt<span style="color:#f92672">.</span>T <span style="color:#f92672">*</span> B_diag[<span style="color:#66d9ef">None</span>, :], U<span style="color:#f92672">.</span>T)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q <span style="color:#f92672">-</span> jnp<span style="color:#f92672">.</span>dot(R, centroid_P)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>sqrt(jnp<span style="color:#f92672">.</span>sum(jnp<span style="color:#f92672">.</span>square(jnp<span style="color:#f92672">.</span>dot(p, R<span style="color:#f92672">.</span>T) <span style="color:#f92672">-</span> q)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">0</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><p>And the batched version:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_jax_batched</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    P <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>array(P)
</span></span><span style="display:flex;"><span>    Q <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>array(Q)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>matmul(p<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>), q)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate batched determinant</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(jnp<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>), U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>)))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build batched B_diag</span>
</span></span><span style="display:flex;"><span>    ones <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>ones_like(d)
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>stack([ones, ones, jnp<span style="color:#f92672">.</span>sign(d)], axis<span style="color:#f92672">=-</span><span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of Vt.T and multiply by U.T</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Vt.T: (B, 3, 3). B_diag: (B, 3).</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>) <span style="color:#f92672">*</span> B_diag[:, <span style="color:#66d9ef">None</span>, :], U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>) <span style="color:#f92672">-</span> jnp<span style="color:#f92672">.</span>matmul(centroid_P, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>))<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>)  <span style="color:#75715e"># Bx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>sqrt(jnp<span style="color:#f92672">.</span>sum(jnp<span style="color:#f92672">.</span>square(jnp<span style="color:#f92672">.</span>matmul(p, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>)) <span style="color:#f92672">-</span> q), axis<span style="color:#f92672">=</span>(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">1</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div>














<figure class="post-figure center ">
    <img src="/img/scientific-computing/kabsch-animated-protein-conformational-alignment-analysis.webp"
         alt="Animation of a protein structure being aligned using the Kabsch algorithm"
         title="Animation of a protein structure being aligned using the Kabsch algorithm"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Real-world application: Aligning protein conformations to analyze structural changes.</figcaption>
    
</figure>

<h2 id="extensions">Extensions</h2>
<p>The Kabsch algorithm has several important extensions that go beyond the formulation covered here:</p>
<ul>
<li><strong>Quaternion Form</strong>: The algorithm can be reformulated using quaternions for better numerical stability, particularly useful in applications requiring high precision.</li>
<li><strong>Iterative Versions</strong>: More robust variants that handle noise better and scale more favorably to large point sets, which can also help on setups with limited computational resources.</li>
<li><strong>Weighted Kabsch</strong>: Extensions that incorporate point weights (e.g., atomic masses in molecular dynamics). While SciPy provides a <a href="https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.transform.Rotation.align_vectors.html#scipy.spatial.transform.Rotation.align_vectors">weighted version</a>, it lacks batch processing capabilities.</li>
<li><strong>The Umeyama Algorithm</strong>: If your point sets are rotated, translated, and scaled differently, the Umeyama algorithm is the direct extension of Kabsch. It solves the same optimization problem but introduces a scaling factor $c$, finding the optimal alignment for $Q \approx c R P + t$.</li>
</ul>
<p>Several of these extensions are implemented in the <a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> library, which provides differentiable Kabsch, Horn, and Umeyama alignment across NumPy, PyTorch, JAX, TensorFlow, and MLX.</p>
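<p>As a rough illustration of the Umeyama item above, here is a minimal NumPy sketch that recovers scale, rotation, and translation for $Q \approx c R P + t$. It uses the same row-vector convention as the Kabsch code in this post. The function name <code>umeyama_numpy</code> and the synthetic test data are illustrative assumptions, not the API of any library mentioned here, and degenerate inputs (collinear or coincident points) are not handled.</p>

```python
import numpy as np


def umeyama_numpy(P, Q):
    """Estimate c, R, t such that Q ~= c * P @ R.T + t (rows are points)."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    assert P.shape == Q.shape, "Matrix dimensions must match"
    n = P.shape[0]

    # Center both point sets
    mu_P = P.mean(axis=0)
    mu_Q = Q.mean(axis=0)
    p = P - mu_P
    q = Q - mu_Q

    # Cross-covariance (same orientation as H in the Kabsch code, divided by N)
    H = p.T @ q / n
    U, S, Vt = np.linalg.svd(H)

    # Reflection guard, as in Kabsch
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.array([1.0, 1.0, d])
    R = (Vt.T * D) @ U.T

    # Optimal scale: sign-corrected singular values over the variance of P
    var_P = np.sum(p ** 2) / n
    c = np.sum(S * D) / var_P

    # Translation maps the scaled, rotated centroid of P onto that of Q
    t = mu_Q - c * (R @ mu_P)
    return c, R, t


# Synthetic check: scale by 2.5, rotate 30 degrees about z, then shift
rng = np.random.default_rng(0)
P = rng.normal(size=(20, 3))
theta = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([1.0, -2.0, 0.5])
Q = 2.5 * P @ R_true.T + t_true

c, R, t = umeyama_numpy(P, Q)
print(round(float(c), 6))
```

<p>Setting $c = 1$ reduces this to the plain Kabsch solution, which is why the two are usually implemented side by side.</p>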
<h2 id="further-reading">Further Reading</h2>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Kabsch_algorithm">Wikipedia, Kabsch Algorithm</a></li>
<li><a href="https://zalo.github.io/blog/kabsch/">Zalo on Kabsch</a>: An interactive shape matching demo.</li>
</ul>
<h3 id="original-papers">Original Papers</h3>
<ul>
<li><strong>[Kabsch 1976]</strong> Kabsch, W. (1976). &ldquo;A solution for the best rotation to relate two sets of vectors.&rdquo; <em>Acta Crystallographica Section A</em>, 32(5), 922-923. <a href="https://doi.org/10.1107/S0567739476001873">DOI: 10.1107/S0567739476001873</a>
<em>The original paper: a closed-form, non-iterative optimal-rotation solution derived via Lagrange multipliers and eigendecomposition of $\tilde{R}R$ (the SVD reformulation came later; see Arun et al. 1987).</em> See also: <a href="/notes/biology/computational-biology/kabsch-algorithm/">paper notes</a>.</li>
<li><strong>[Kabsch 1978]</strong> Kabsch, W. (1978). &ldquo;A discussion of the solution for the best rotation to relate two sets of vectors.&rdquo; <em>Acta Crystallographica Section A</em>, 34(5), 827-828. <a href="https://doi.org/10.1107/S0567739478001680">DOI: 10.1107/S0567739478001680</a>
<em>The follow-up paper correcting for improper rotations (reflections).</em></li>
<li><strong>[Arun et al. 1987]</strong> Arun, K. S., Huang, T. S., &amp; Blostein, S. D. (1987). &ldquo;Least-Squares Fitting of Two 3-D Point Sets.&rdquo; <em>IEEE Transactions on Pattern Analysis and Machine Intelligence</em>, PAMI-9(5), 698-700. <a href="https://doi.org/10.1109/TPAMI.1987.4767965">DOI: 10.1109/TPAMI.1987.4767965</a>
<em>The first SVD-based formulation for 3D point set alignment.</em> See also: <a href="/notes/biology/computational-biology/arun-svd-point-fitting/">paper notes</a>.</li>
<li><strong>[Horn et al. 1988]</strong> Horn, B. K. P., Hilden, H. M., &amp; Negahdaripour, S. (1988). &ldquo;Closed-form solution of absolute orientation using orthonormal matrices.&rdquo; <em>Journal of the Optical Society of America A</em>, 5(7), 1127-1135. <a href="https://doi.org/10.1364/JOSAA.5.001127">DOI: 10.1364/JOSAA.5.001127</a>
<em>The matrix square root (polar decomposition) approach to the same problem.</em> See also: <a href="/notes/biology/computational-biology/horn-orthonormal-matrices/">paper notes</a>.</li>
<li><strong>[Horn 1987]</strong> Horn, B. K. P. (1987). &ldquo;Closed-form solution of absolute orientation using unit quaternions.&rdquo; <em>Journal of the Optical Society of America A</em>, 4(4), 629-642. <a href="https://doi.org/10.1364/JOSAA.4.000629">DOI: 10.1364/JOSAA.4.000629</a>
<em>An alternative quaternion-based closed-form solution that also handles scale.</em> See also: <a href="/notes/biology/computational-biology/horn-absolute-orientation/">paper notes</a>.</li>
<li><strong>[Umeyama 1991]</strong> Umeyama, S. (1991). &ldquo;Least-squares estimation of transformation parameters between two point patterns.&rdquo; <em>IEEE Transactions on Pattern Analysis and Machine Intelligence</em>, 13(4), 376-380. <a href="https://doi.org/10.1109/34.88573">DOI: 10.1109/34.88573</a>
<em>The extension of the algorithm to include optimal scaling in addition to rotation and translation.</em> See also: <a href="/notes/biology/computational-biology/umeyama-similarity-transformation/">paper notes</a>.</li>
</ul>
]]></content:encoded></item><item><title>Platinum Adatom Diffusion on Pt(100): LAMMPS Simulation</title><link>https://hunterheidenreich.com/videos/pt-adatom-diffusion/</link><pubDate>Wed, 27 Sep 2023 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/videos/pt-adatom-diffusion/</guid><description>LAMMPS molecular dynamics simulation of platinum adatom diffusion on a Pt(100) surface, showing atomic mobility mechanisms.</description><content:encoded><![CDATA[<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/1hhf5cQh56w?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<p>Details on the simulation can be found in the <a href="/posts/adatom-cu-diffusion/">LAMMPS Tutorial: Copper and Platinum Adatom Diffusion</a> post and the <a href="/projects/lammps-adatom-diffusion/">Automated Adatom Diffusion Workflow</a> project page.</p>
]]></content:encoded></item><item><title>LAMMPS Tutorial: Copper and Platinum Adatom Diffusion</title><link>https://hunterheidenreich.com/posts/adatom-cu-diffusion/</link><pubDate>Wed, 27 Sep 2023 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/posts/adatom-cu-diffusion/</guid><description>LAMMPS tutorial for copper and platinum surface diffusion simulation and ML training data generation. Includes setup, analysis, and Ovito visualization.</description><content:encoded><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Understanding how individual atoms move on crystal surfaces is fundamental to materials science, catalysis, and nanotechnology. This atomic-scale motion, called adatom diffusion, drives processes like thin film growth and surface chemical reactions.</p>
<p>While learning molecular dynamics simulations for my graduate work, I discovered these simulations generate valuable training data for machine learning models. This tutorial walks through simulating copper adatom diffusion on a Cu(100) surface using LAMMPS, building on Eric N. Hahn&rsquo;s excellent <a href="https://www.ericnhahn.com/tutorials/lammps-tutorials/adatom">adatom tutorial</a>.</p>
<p><strong>What you&rsquo;ll learn:</strong></p>
<ul>
<li>Setting up LAMMPS for surface diffusion simulations</li>
<li>Understanding simulation parameters and their impact</li>
<li>Visualizing results with Ovito</li>
<li>Analyzing trajectory data for ML applications</li>
<li>Connecting simulation data to machine learning workflows</li>
</ul>
<p>In this tutorial, we will explore both Copper (Cu) and Platinum (Pt) to show how atomic properties affect diffusion behavior, generating data for training element-aware ML models.</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>Before starting this tutorial, you&rsquo;ll need:</p>
<ul>
<li><strong>LAMMPS</strong> with EAM potential support (version 2020 or later recommended)</li>
<li><strong>Python 3.x</strong> with matplotlib for analysis scripts</li>
<li><strong>Ovito</strong> (free version) for trajectory visualization</li>
<li><strong>Cu01.eam.alloy</strong> potential file from the <a href="https://www.ctcms.nist.gov/potentials/">NIST repository</a></li>
<li>Basic familiarity with molecular dynamics concepts (atoms, forces, timesteps)</li>
</ul>
<h2 id="understanding-adatoms-and-surface-diffusion">Understanding Adatoms and Surface Diffusion</h2>
<h3 id="what-is-an-adatom">What is an Adatom?</h3>
<p>An <strong>adatom</strong> (adsorbed atom) sits on a crystal surface but isn&rsquo;t incorporated into the bulk structure. Adatoms have fewer bonds than fully coordinated bulk atoms, making them highly mobile and reactive.</p>















<figure class="post-figure center ">
    <img src="/img/posts/crystal-surface.webp"
         alt="Ball model representation of a real (atomically rough) crystal surface with steps, kinks, adatoms, and vacancies in a closely-packed crystalline material. Adsorbed molecules, substitutional and interstitial atoms are also illustrated."
         title="Ball model representation of a real (atomically rough) crystal surface with steps, kinks, adatoms, and vacancies in a closely-packed crystalline material. Adsorbed molecules, substitutional and interstitial atoms are also illustrated."
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Ball model representation of a real (atomically rough) crystal surface with steps, kinks, adatoms, and vacancies in a closely-packed crystalline material. Adsorbed molecules, substitutional and interstitial atoms are also illustrated. (<a href="https://creativecommons.org/licenses/by-sa/4.0/deed.en">CC-BY-SA-4.0: ShutterWaves</a>)</figcaption>
    
</figure>

<h3 id="why-study-adatom-diffusion">Why Study Adatom Diffusion?</h3>
<p>Adatom diffusion is important for several technological processes:</p>
<ul>
<li><strong>Thin film growth</strong>: Adatoms are the building blocks of deposited films</li>
<li><strong>Catalysis</strong>: Many reactions happen at these mobile surface atoms</li>
<li><strong>Corrosion</strong>: How surface atoms move affects material degradation</li>
<li><strong>Self-assembly</strong>: Adatom movement enables formation of ordered structures</li>
</ul>
<p>From a <strong>machine learning perspective</strong>, adatom diffusion is an ideal test case because:</p>
<ul>
<li>Well-understood physics provides ground truth for validation</li>
<li>Small system size enables extensive simulation</li>
<li>Behavior varies significantly with temperature and atomic species</li>
<li>Data can be generated systematically across different conditions</li>
</ul>
<h3 id="why-cu100">Why Cu(100)?</h3>
<p>Cu(100) surfaces are well-studied in literature, making them excellent benchmarks. The face-centered cubic (fcc) structure creates clear diffusion pathways, and copper&rsquo;s moderate binding energy lets us observe diffusion at reasonable temperatures without extreme computational demands.</p>
<h2 id="simulation-overview">Simulation Overview</h2>
<p>Before diving into the code details, let&rsquo;s understand the simulation design:</p>
<h3 id="key-simulation-parameters">Key Simulation Parameters</h3>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value</th>
          <th>Why this choice</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>System size</strong></td>
          <td>$8 \times 8 \times 6$ unit cells</td>
          <td>Large enough to avoid edge effects while keeping simulation time reasonable</td>
      </tr>
      <tr>
          <td><strong>Ensemble</strong></td>
          <td>NVT (constant volume, temperature)</td>
          <td>Appropriate for surface studies where pressure isn&rsquo;t the focus</td>
      </tr>
      <tr>
          <td><strong>Potential</strong></td>
          <td>EAM (Embedded Atom Method)</td>
          <td>Captures metallic bonding better than simple pair potentials</td>
      </tr>
      <tr>
          <td><strong>Time step</strong></td>
          <td>5 fs</td>
          <td>Small enough for numerical stability while allowing reasonable run times</td>
      </tr>
      <tr>
          <td><strong>Duration</strong></td>
          <td>500 ps</td>
          <td>Long enough to see multiple diffusion events</td>
      </tr>
      <tr>
          <td><strong>Temperature</strong></td>
          <td>600 K initial seed; 850 K thermostat on the bottom reservoir layer</td>
          <td>Drives thermal energy up from the substrate into the free surface where the adatom diffuses</td>
      </tr>
  </tbody>
</table>
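<p>The system size and duration in this table can be cross-checked against the constants in the LAMMPS script used in this tutorial. A small Python sketch of that arithmetic follows; the constant names are illustrative, and the frame count is approximate because it ignores whether the first and last steps are written.</p>

```python
# Values copied from the LAMMPS script in this tutorial
LATTICE_A = 3.614      # Cu lattice constant in angstrom
CUBEL = 4              # half-width of the slab in lattice units
FIXER1 = CUBEL + 2     # slab bottom at z = -6 lattice units
TIMESTEP_PS = 0.005    # metal units measure time in ps, so this is 5 fs
N_STEPS = 100_000
DUMP_EVERY = 5

cells_xy = 2 * CUBEL           # region spans -cubel..cubel in x and y
cells_z = FIXER1               # crystal region spans -fixer1..0 in z

width = cells_xy * LATTICE_A   # lateral box edge in angstrom
depth = cells_z * LATTICE_A    # slab thickness in angstrom
duration_ps = N_STEPS * TIMESTEP_PS
approx_frames = N_STEPS // DUMP_EVERY

print(f"{cells_xy} x {cells_xy} x {cells_z} unit cells")
print(f"lateral edge {width:.3f} A, slab depth {depth:.3f} A")
print(f"duration {duration_ps:.0f} ps, about {approx_frames} trajectory frames")
```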
<h3 id="simulation-strategy">Simulation Strategy</h3>
<p>The approach uses a <strong>thermal gradient setup</strong>:</p>
<ul>
<li>Bottom layers: Fixed to represent bulk crystal</li>
<li>Middle layers: Heated to 850 K for thermal energy</li>
<li>Top layers and adatom: Equilibrate to $\sim 600$ K for diffusion</li>
<li>This lets thermal energy propagate up from the heated reservoir to the free surface where the adatom diffuses</li>
</ul>
<p>The complete LAMMPS script implementing this approach:</p>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">### Original Created by Eric N. Hahn  ###
### ericnhahn@gmail.com ###

### Modifications by Hunter Heidenreich, CSE lab (Harvard, 2023)
### hheidenreich@g.harvard.edu
### 2023-09-01

### Simulating adatoms ###
### Version 0.2 ###


units metal
dimension 3
boundary p p s
atom_style atomic

lattice fcc 3.614
variable cubel equal 4
variable fixer1 equal &#34;v_cubel+2&#34;
variable fixer2 equal &#34;v_cubel+1.49&#34;
region  box block -${cubel} ${cubel} -${cubel} ${cubel} -${fixer1} 1 units lattice
region cbox block -${cubel} ${cubel} -${cubel} ${cubel} -${fixer1} 0 units lattice
create_box 1 box
create_atoms 1 region cbox
create_atoms 1 single -0.5 0 0.5 units lattice
region hold block INF INF INF INF -${fixer1} -${fixer2} units lattice
region temp block INF INF INF INF -${fixer2} -${cubel} units lattice
group hold region hold
group temp region temp

pair_style eam/alloy
pair_coeff * * Cu01.eam.alloy Cu

timestep        0.005
compute         new all temp
velocity        temp create 600 12345
fix heater temp temp/rescale 1 850 850 5 1
fix nve all nve
fix freeze hold setforce 0 0 0

variable e     equal pe
variable k     equal ke
variable t     equal etotal
variable T     equal temp
fix energy all ave/time 1 50 50 v_k v_e v_t v_T file energy_avg.txt

minimize 1.0e-4 1.0e-6 1000 10000

dump eve all custom 5 dump.lammpstrj id type xu yu zu   # fx fy fz  # uncomment for forces
dump_modify eve sort id

thermo 50
run 100000  # 100_000 * 5 fs = 500 ps
</code></pre><h2 id="line-by-line-breakdown">Line-by-Line Breakdown</h2>
<p>Let&rsquo;s examine each part of the LAMMPS script:</p>
<h3 id="simulation-setup">Simulation Setup</h3>
<h4 id="units">Units</h4>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">units metal
</code></pre><p>Sets simulation units to &ldquo;metal&rdquo; units (a standard choice for metallic systems). Key conversions: length in $\text{\AA}$, energy in eV, time in ps. Full details in the <a href="https://docs.lammps.org/units.html">LAMMPS documentation</a>.</p>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">dimension 3
</code></pre><p>Sets 3D simulation.</p>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">boundary p p s
</code></pre><p>Boundary conditions: periodic in x,y (infinite surface) and shrink-wrapped in z (finite surface height). The shrink-wrapped z boundary moves to enclose the atoms however far they travel, so the adatom can lift off the surface without leaving the box.</p>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">atom_style atomic
</code></pre><p>Uses the &ldquo;atomic&rdquo; style: atoms are point masses without internal structure, the standard choice for metallic systems.</p>
<h4 id="lattice">Lattice</h4>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">lattice fcc 3.614
</code></pre><p>Defines face-centered cubic lattice with experimental Cu lattice constant ($3.614 \text{ \AA}$).</p>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">variable cubel equal 4
variable fixer1 equal &#34;v_cubel+2&#34;
variable fixer2 equal &#34;v_cubel+1.49&#34;
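</code></pre><p>Define variables for the simulation box dimensions, all in lattice units. <code>cubel=4</code> is the half-width of the slab in x and y and the depth at which the heated region ends, while <code>fixer1</code> (6) and <code>fixer2</code> (5.49) set the bottom of the slab and the boundary between the frozen and heated regions.</p>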
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">region  box block -${cubel} ${cubel} -${cubel} ${cubel} -${fixer1} 1 units lattice
region cbox block -${cubel} ${cubel} -${cubel} ${cubel} -${fixer1} 0 units lattice
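</code></pre><p>Define regions: <code>box</code> for the entire simulation volume and <code>cbox</code> for crystal creation. <code>cbox</code> stops at $z = 0$, one lattice unit below the top of <code>box</code>, which leaves empty space above the surface where we&rsquo;ll place the adatom.</p>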
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">create_box 1 box
create_atoms 1 region cbox
create_atoms 1 single -0.5 0 0.5 units lattice
</code></pre><p>Create the simulation box, populate <code>cbox</code> with Cu atoms, then add a single adatom at $(-0.5, 0, 0.5)$ in lattice units, half a lattice constant above the $z = 0$ plane.</p>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">region hold block INF INF INF INF -${fixer1} -${fixer2} units lattice
region temp block INF INF INF INF -${fixer2} -${cubel} units lattice
group hold region hold
group temp region temp
</code></pre><p>Define atom groups: <code>hold</code> (frozen bottom layers) and <code>temp</code> (heated middle layers for thermal energy).</p>
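<p>To see which atomic layers end up in each group, the sketch below lists the ideal fcc (100) layer heights (one layer every half lattice unit) and applies the two region definitions. This is idealized geometry only: it assumes perfect lattice positions and ignores any floating-point round-off at region boundaries inside LAMMPS. The helper <code>in_region</code> is a hypothetical name for illustration.</p>

```python
CUBEL = 4
FIXER1 = CUBEL + 2       # 6
FIXER2 = CUBEL + 1.49    # 5.49

# Ideal fcc (100) layers sit every half lattice unit from z = -6 up to z = 0
layers = [k / 2 for k in range(-2 * FIXER1, 1)]


def in_region(z, lo, hi):
    """True if z lies inside the closed interval [lo, hi]."""
    return lo <= z <= hi


hold = [z for z in layers if in_region(z, -FIXER1, -FIXER2)]
temp = [z for z in layers if in_region(z, -FIXER2, -CUBEL)]
free = [z for z in layers if z not in hold and z not in temp]

print("hold:", hold)
print("temp:", temp)
print("free:", free)
```

<p>In this idealized picture the 1.49 offset puts the layer at $z = -5.5$ into <code>hold</code> instead of <code>temp</code>, giving two frozen layers, three thermostatted layers, and eight free layers below the adatom.</p>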
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">pair_style eam/alloy
pair_coeff * * Cu01.eam.alloy Cu
</code></pre><p>Use <a href="/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method/">Embedded Atom Method (EAM)</a> potential for metallic bonding. The Cu01.eam.alloy potential from <a href="https://doi.org/10.1103/PhysRevB.63.224106">Mishin et al.</a> is available from the <a href="https://www.ctcms.nist.gov/potentials/testing/entry/2001--Mishin-Y-Mehl-M-J-Papaconstantopoulos-D-A-et-al--Cu-1/">NIST repository</a>.</p>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">timestep        0.005
</code></pre><p>5 femtosecond timestep (small enough for numerical stability).</p>
<h4 id="initial-conditions">Initial Conditions</h4>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">velocity        temp create 600 12345
</code></pre><p>Initialize velocities for the atoms in the <code>temp</code> group at 600 K using random seed 12345; atoms outside that group start at rest.</p>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">fix heater temp temp/rescale 1 850 850 5 1
fix nve all nve
fix freeze hold setforce 0 0 0
</code></pre><p>Three fixes control dynamics:</p>
<ul>
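<li><code>heater</code>: Maintains 850 K in the middle layers (the <code>temp</code> group), checking every timestep and rescaling velocities whenever the group temperature is more than 5 K from the target</li>
<li><code>nve</code>: Velocity Verlet integration for all atoms</li>
<li><code>freeze</code>: Sets forces to zero for the bottom atoms (the <code>hold</code> group), which never receive initial velocities and therefore stay in place</li>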
</ul>
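<p>The rescaling rule behind the <code>heater</code> fix can be sketched in a few lines of Python. This is a simplified reading of the documented <code>temp/rescale</code> behavior (window check, then a fractional move toward the target), not LAMMPS code; the function name <code>rescale_factor</code> is made up for illustration.</p>

```python
import math


def rescale_factor(t_current, t_target, window, fraction):
    """Velocity scale factor for one temp/rescale-style update.

    Returns 1.0 (no change) when the temperature is within the window.
    """
    if abs(t_current - t_target) <= window:
        return 1.0
    # Move the temperature a given fraction of the way back to the target
    t_new = t_current + fraction * (t_target - t_current)
    # Kinetic temperature scales with v**2, so velocities scale with sqrt
    return math.sqrt(t_new / t_current)


# Parameters from: fix heater temp temp/rescale 1 850 850 5 1
print(rescale_factor(600.0, 850.0, window=5.0, fraction=1.0))
print(rescale_factor(852.0, 850.0, window=5.0, fraction=1.0))
```

<p>With <code>fraction</code> set to 1, as in the script, any excursion beyond the 5 K window is corrected back to the target in a single step.</p>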
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">variable e     equal pe
variable k     equal ke
variable t     equal etotal
variable T     equal temp
fix energy all ave/time 1 50 50 v_k v_e v_t v_T file energy_avg.txt
</code></pre><p>Track kinetic, potential, and total energy plus temperature, sampling every timestep and writing the 50-step average to <code>energy_avg.txt</code> every 50 timesteps.</p>
<h3 id="execution">Execution</h3>
<h4 id="minimization">Minimization</h4>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">minimize 1.0e-4 1.0e-6 1000 10000
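</code></pre><p>Relax the initial structure, stopping at a relative energy tolerance of $10^{-4}$, a force tolerance of $10^{-6}$ eV/$\text{\AA}$, 1000 iterations, or 10000 force evaluations, whichever comes first. It should converge quickly, indicating the system is already well-optimized.</p>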
<h4 id="output-setup">Output Setup</h4>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">dump eve all custom 5 dump.lammpstrj id type xu yu zu   # fx fy fz  # uncomment for forces
dump_modify eve sort id
</code></pre><p>Write atomic positions every 5 timesteps, sorted by atom ID. Uncomment force components if needed for analysis.</p>
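<p>For analysis outside Ovito, the text dump can be read with a short Python generator. The sketch below assumes the standard <code>ITEM:</code> section layout of a LAMMPS custom dump with an orthogonal box, and it assumes the adatom carries the highest atom ID because it was created last. The function name <code>read_lammps_dump</code> and the two-atom sample frame are placeholders for illustration.</p>

```python
def read_lammps_dump(lines):
    """Yield (timestep, atoms) per frame; atoms maps id -> (xu, yu, zu)."""
    it = iter(lines)
    for line in it:
        if not line.startswith("ITEM: TIMESTEP"):
            continue
        timestep = int(next(it))
        next(it)                      # ITEM: NUMBER OF ATOMS
        n_atoms = int(next(it))
        next(it)                      # ITEM: BOX BOUNDS ...
        for _ in range(3):
            next(it)                  # x, y, z bounds
        columns = next(it).split()[2:]  # names after "ITEM: ATOMS"
        ix, iy, iz = (columns.index(c) for c in ("xu", "yu", "zu"))
        i_id = columns.index("id")
        atoms = {}
        for _ in range(n_atoms):
            fields = next(it).split()
            atoms[int(fields[i_id])] = (
                float(fields[ix]), float(fields[iy]), float(fields[iz])
            )
        yield timestep, atoms


# Tiny hand-written frame in the same layout (coordinates are placeholders)
sample = """ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
2
ITEM: BOX BOUNDS pp pp ss
-14.456 14.456
-14.456 14.456
-21.684 1.807
ITEM: ATOMS id type xu yu zu
1 1 0.0 0.0 0.0
2 1 -1.807 0.0 1.807
""".splitlines()

for step, atoms in read_lammps_dump(sample):
    adatom_id = max(atoms)  # assumption: the adatom was created last
    print(step, adatom_id, atoms[adatom_id])
```

<p>For a real run, pass an open file handle for <code>dump.lammpstrj</code> instead of <code>sample</code>; the generator streams one frame at a time, which matters for a trajectory written every 5 steps.</p>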
<h4 id="production-run">Production Run</h4>
<pre tabindex="0"><code class="language-lammps" data-lang="lammps">thermo 50
run 100000  # 100_000 * 5 fs = 500 ps
</code></pre><p>Run simulation for 500 ps with thermo output every 50 steps.</p>
<h2 id="visualization-and-analysis">Visualization and Analysis</h2>
<p>Visualize results using <a href="https://www.ovito.org/">Ovito</a>, a free atomistic visualization tool:</p>
<ol>
<li>Open the trajectory file in Ovito</li>
<li>Color atoms by z-coordinate</li>
<li>Restrict height range to $0\text{-}2 \text{ \AA}$ for surface focus</li>
<li>Animate to observe diffusion events</li>
</ol>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/nIdbNqEEPys?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<h2 id="analysis-results">Analysis Results</h2>
<p>The simulation generates rich data for machine learning applications:</p>
<h3 id="energy-analysis">Energy Analysis</h3>
<p>Energy fluctuations reveal thermal motion patterns:</p>















<figure class="post-figure center ">
    <img src="/img/adatom_cu_energy_avg.webp"
         alt="Average kinetic energy, potential energy, total energy, and temperature over time."
         title="Average kinetic energy, potential energy, total energy, and temperature over time."
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Energy and temperature evolution over 500 ps simulation.</figcaption>
    
</figure>

<p>Skipping the first 30 logged data points (each averaged over 50 timesteps, so the first ~1500 timesteps / 7.5 ps of equilibration), these fluctuations enable:</p>
<ul>
<li><strong>Anomaly detection</strong>: Identifying unusual diffusion events</li>
<li><strong>Temperature prediction</strong>: Estimating local temperature from atomic motion</li>
<li><strong>Stability analysis</strong>: Detecting equilibrium states</li>
</ul>
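<p>The skip-and-summarize step described above can be sketched with the standard library alone. The parser assumes the usual <code>fix ave/time</code> text layout (comment lines starting with <code>#</code>, then one row per output with the timestep followed by the four values in the order <code>v_k v_e v_t v_T</code>). The demo rows are made-up placeholders, not simulation output, and the function names are illustrative.</p>

```python
def load_ave_time(lines, skip_rows=30):
    """Parse rows of 'step ke pe etotal temp', dropping the first skip_rows."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # header and blank lines
        step, ke, pe, etot, temp = line.split()[:5]
        rows.append((int(step), float(ke), float(pe), float(etot), float(temp)))
    return rows[skip_rows:]


def mean_and_std(values):
    """Population mean and standard deviation of a sequence."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var ** 0.5


# Placeholder rows in the expected layout (numbers are made up for the demo)
demo = [
    "# Time-averaged data for fix energy",
    "# TimeStep v_k v_e v_t v_T",
    "50 10.0 -100.0 -90.0 500.0",
    "100 11.0 -101.0 -90.0 550.0",
    "150 12.0 -102.0 -90.0 600.0",
    "200 12.0 -102.0 -90.0 620.0",
]
rows = load_ave_time(demo, skip_rows=2)
temps = [r[4] for r in rows]
print(mean_and_std(temps))
```

<p>On the real file, keep the default <code>skip_rows=30</code> to drop the first 7.5 ps before computing any statistics.</p>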
<h3 id="trajectory-analysis">Trajectory Analysis</h3>
<p>Adatom motion reveals diffusion mechanisms:</p>















<figure class="post-figure center ">
    <img src="/img/adatom_cu_xy.webp"
         alt="x and y coordinates of the adatom over time."
         title="x and y coordinates of the adatom over time."
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Adatom surface trajectory showing random walk behavior.</figcaption>
    
</figure>

<p>This data enables:</p>
<ul>
<li><strong>Path prediction</strong>: Training models for future position forecasting</li>
<li><strong>Diffusion coefficient estimation</strong>: Learning temperature-mobility relationships</li>
<li><strong>Transition state identification</strong>: Detecting hops between stable sites</li>
</ul>
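<p>One common starting point for the diffusion coefficient item is the mean squared displacement (MSD) together with the two-dimensional Einstein relation $D = \text{MSD} / (4t)$. The sketch below is a minimal version under stated assumptions: the input is the adatom&rsquo;s unwrapped x,y positions per dumped frame, frames are evenly spaced, and the demo path is a synthetic placeholder rather than simulation output. Function names are illustrative.</p>

```python
def mean_squared_displacement(xy, lag):
    """Average squared 2D displacement over all frame pairs `lag` apart."""
    total = 0.0
    count = len(xy) - lag
    for i in range(count):
        dx = xy[i + lag][0] - xy[i][0]
        dy = xy[i + lag][1] - xy[i][1]
        total += dx * dx + dy * dy
    return total / count


def diffusion_coefficient_2d(xy, lag, frame_dt_ps):
    """Einstein estimate D = MSD / (4 t) in angstrom^2 per ps."""
    return mean_squared_displacement(xy, lag) / (4.0 * lag * frame_dt_ps)


# Frames are 5 steps * 0.005 ps apart with the dump settings in this tutorial
FRAME_DT_PS = 5 * 0.005

# Placeholder path: steady drift of 0.1 A per frame along x (not real data)
path = [(0.1 * i, 0.0) for i in range(11)]
print(mean_squared_displacement(path, lag=1))
print(diffusion_coefficient_2d(path, lag=1, frame_dt_ps=FRAME_DT_PS))
```

<p>The unwrapped coordinates (<code>xu</code>, <code>yu</code>) in the dump are what make this calculation valid across periodic boundaries. A single adatom over 500 ps gives a noisy estimate, so treat any value from one run as a rough figure.</p>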















<figure class="post-figure center ">
    <img src="/img/adatom_cu_z.webp"
         alt="z coordinate of the adatom over time."
         title="z coordinate of the adatom over time."
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Height fluctuations revealing exchange events with surface atoms.</figcaption>
    
</figure>

<p>Z-coordinate data shows <strong>exchange events</strong> where the adatom swaps with surface atoms (crucial for surface chemistry understanding). This enables:</p>
<ul>
<li><strong>Event classification</strong>: Distinguishing diffusion vs. exchange mechanisms</li>
<li><strong>Activation barrier estimation</strong>: Learning energy landscapes from fluctuations</li>
<li><strong>Surface coordination analysis</strong>: Correlating height with local environment</li>
</ul>
<h3 id="machine-learning-applications">Machine Learning Applications</h3>
<p>This simulation produces multiple data types for ML training:</p>
<ol>
<li><strong>Coordinate trajectories</strong>: Neural network potential inputs or graph neural network features</li>
<li><strong>Energy time series</strong>: Regression model features for system property prediction</li>
<li><strong>Event annotations</strong>: Supervised learning labels for diffusion mechanism classification</li>
<li><strong>Environmental descriptors</strong>: Local atomic arrangement features</li>
</ol>
<p>Systematic MD simulations generate large, labeled datasets across varied conditions.</p>
<h2 id="extending-to-platinum-mass-and-bonding-effects">Extending to Platinum: Mass and Bonding Effects</h2>
<p>To understand how different elements behave, we can extend this framework to platinum (Pt). Platinum&rsquo;s higher atomic mass and stronger metallic bonding create notably different diffusion behavior, providing comparative data for machine learning.</p>
<h3 id="key-differences-from-copper">Key Differences from Copper</h3>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Copper (Cu)</th>
          <th>Platinum (Pt)</th>
          <th>Impact</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Atomic mass</strong></td>
          <td>63.5 u</td>
          <td>195.1 u</td>
          <td>Slower diffusion, longer correlation times</td>
      </tr>
      <tr>
          <td><strong>Lattice const.</strong></td>
          <td>3.614 Å</td>
          <td>3.96 Å</td>
          <td>Longer hop distances, different pathways</td>
      </tr>
      <tr>
          <td><strong>Potential</strong></td>
          <td>Mishin et al.</td>
          <td>Zhou et al.</td>
          <td>Different interaction strengths</td>
      </tr>
      <tr>
          <td><strong>Melting point</strong></td>
          <td>1358 K</td>
          <td>2041 K</td>
          <td>Stronger surface binding</td>
      </tr>
  </tbody>
</table>
<h3 id="modifying-the-lammps-script">Modifying the LAMMPS Script</h3>
<p>The platinum simulation uses the exact same framework as the copper case, with three simple element-specific modifications:</p>
<ol>
<li><strong>Lattice constant</strong>: Change <code>lattice fcc 3.614</code> to <code>lattice fcc 3.96</code></li>
<li><strong>Potential file</strong>: Change <code>Cu01.eam.alloy</code> to <code>Pt_Zhou04.eam.alloy</code> (available from the <a href="https://www.ctcms.nist.gov/potentials/testing/entry/2004--Zhou-X-W-Johnson-R-A-Wadley-H-N-G--Pt/">NIST repository</a>)</li>
<li><strong>Element specification</strong>: Change <code>Cu</code> to <code>Pt</code> in the <code>pair_coeff</code> line</li>
</ol>
<p>These simple changes capture the essential physics differences between elements while maintaining the same simulation protocol, which is ideal for generating comparative datasets for ML training.</p>
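<p>If you generate many element variants, the three edits can be scripted. The helper below is a hypothetical sketch: the name <code>cu_to_pt</code> and the three-line input excerpt are assumptions for illustration, and your actual copper script will contain many more commands that pass through unchanged.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import re

def cu_to_pt(script_text):
    """Apply the three element-specific edits to a Cu LAMMPS input script."""
    out = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("lattice"):
            line = line.replace("3.614", "3.96")  # 1. lattice constant
        if stripped.startswith("pair_coeff"):
            line = line.replace("Cu01.eam.alloy", "Pt_Zhou04.eam.alloy")  # 2. potential file
            line = re.sub(r"\bCu\b", "Pt", line)  # 3. element name
        out.append(line)
    return "\n".join(out) + "\n"

# Assumed excerpt of the copper script; only these lines need to change.
cu_excerpt = """lattice fcc 3.614
pair_style eam/alloy
pair_coeff * * Cu01.eam.alloy Cu
"""
pt_excerpt = cu_to_pt(cu_excerpt)
print(pt_excerpt)
</code></pre></div>
<p>Restricting each replacement to the relevant command keeps the helper from touching comments or file names elsewhere in the script.</p>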
<h3 id="expected-behavior-vs-copper">Expected Behavior vs. Copper</h3>
<p>When you run the analysis scripts on the platinum trajectory, you should expect to see:</p>
<ul>
<li><strong>Slower motion</strong>: Heavier atoms move more slowly at the same temperature. Platinum&rsquo;s ~3x greater mass reduces diffusion rates.</li>
<li><strong>Higher energy barriers</strong>: Stronger metallic bonding creates deeper potential wells, requiring more thermal energy for diffusion hops.</li>
<li><strong>Different pathways</strong>: The larger lattice constant changes the energy landscape, potentially favoring different diffusion mechanisms.</li>
</ul>
<p>Comparing Cu and Pt trajectories enables training element-aware models that account for atomic mass effects, binding strengths, and temperature scaling across different metals.</p>
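<p>To put a number on the mass effect alone, the short calculation below compares root-mean-square thermal speeds, which scale as 1/√m at fixed temperature. The 300 K temperature is an arbitrary illustrative choice. The ratio says nothing about barrier heights, which also control hop rates.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053906660e-27  # atomic mass unit, kg

def rms_speed(mass_u, temperature_k):
    """Root-mean-square thermal speed sqrt(3 k_B T / m) in m/s."""
    return math.sqrt(3 * K_B * temperature_k / (mass_u * AMU))

T = 300.0  # K, illustrative temperature (assumption)
v_cu = rms_speed(63.5, T)
v_pt = rms_speed(195.1, T)
print(f"Cu: {v_cu:.0f} m/s, Pt: {v_pt:.0f} m/s, ratio: {v_cu / v_pt:.2f}")
</code></pre></div>
<p>The ratio is √(195.1/63.5) ≈ 1.75 at any temperature, so mass alone accounts for less than a factor of two. Larger differences in diffusion rate between the two metals would have to come from the energy barriers.</p>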
<h2 id="code-and-data">Code and Data</h2>
<p>The complete simulation scripts and analysis tools are available for reproducibility:</p>
<h3 id="energy-analysis-script">Energy Analysis Script</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#75715e"># Hunter Heidenreich, 2023</span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Plots the energy of a simulation over time.</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> matplotlib.pyplot <span style="color:#66d9ef">as</span> plt
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> argparse <span style="color:#f92672">import</span> ArgumentParser
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">if</span> __name__ <span style="color:#f92672">==</span> <span style="color:#e6db74">&#39;__main__&#39;</span>:
</span></span><span style="display:flex;"><span>    parser <span style="color:#f92672">=</span> ArgumentParser()
</span></span><span style="display:flex;"><span>    parser<span style="color:#f92672">.</span>add_argument(<span style="color:#e6db74">&#39;--input&#39;</span>, type<span style="color:#f92672">=</span>str, required<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span><span style="display:flex;"><span>    parser<span style="color:#f92672">.</span>add_argument(<span style="color:#e6db74">&#39;--output&#39;</span>, type<span style="color:#f92672">=</span>str, required<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span><span style="display:flex;"><span>    parser<span style="color:#f92672">.</span>add_argument(<span style="color:#e6db74">&#39;--skip&#39;</span>, type<span style="color:#f92672">=</span>int, default<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span>    args <span style="color:#f92672">=</span> parser<span style="color:#f92672">.</span>parse_args()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Parse energy data</span>
</span></span><span style="display:flex;"><span>    data <span style="color:#f92672">=</span> {<span style="color:#e6db74">&#39;ts&#39;</span>: [], <span style="color:#e6db74">&#39;kes&#39;</span>: [], <span style="color:#e6db74">&#39;pes&#39;</span>: [], <span style="color:#e6db74">&#39;tes&#39;</span>: [], <span style="color:#e6db74">&#39;Ts&#39;</span>: []}
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">with</span> open(args<span style="color:#f92672">.</span>input, <span style="color:#e6db74">&#39;r&#39;</span>) <span style="color:#66d9ef">as</span> f:
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">for</span> line <span style="color:#f92672">in</span> f:
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">if</span> line<span style="color:#f92672">.</span>startswith(<span style="color:#e6db74">&#39;#&#39;</span>) <span style="color:#f92672">or</span> <span style="color:#f92672">not</span> line<span style="color:#f92672">.</span>strip():
</span></span><span style="display:flex;"><span>                <span style="color:#66d9ef">continue</span>
</span></span><span style="display:flex;"><span>            t, v_k, v_e, v_t, v_T <span style="color:#f92672">=</span> map(float, line<span style="color:#f92672">.</span>split())
</span></span><span style="display:flex;"><span>            data[<span style="color:#e6db74">&#39;ts&#39;</span>]<span style="color:#f92672">.</span>append(t)
</span></span><span style="display:flex;"><span>            data[<span style="color:#e6db74">&#39;kes&#39;</span>]<span style="color:#f92672">.</span>append(v_k)
</span></span><span style="display:flex;"><span>            data[<span style="color:#e6db74">&#39;pes&#39;</span>]<span style="color:#f92672">.</span>append(v_e)
</span></span><span style="display:flex;"><span>            data[<span style="color:#e6db74">&#39;tes&#39;</span>]<span style="color:#f92672">.</span>append(v_t)
</span></span><span style="display:flex;"><span>            data[<span style="color:#e6db74">&#39;Ts&#39;</span>]<span style="color:#f92672">.</span>append(v_T)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Skip initial equilibration</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> key <span style="color:#f92672">in</span> data:
</span></span><span style="display:flex;"><span>        data[key] <span style="color:#f92672">=</span> data[key][args<span style="color:#f92672">.</span>skip:]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Create subplots</span>
</span></span><span style="display:flex;"><span>    fig, axs <span style="color:#f92672">=</span> plt<span style="color:#f92672">.</span>subplots(<span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">2</span>, figsize<span style="color:#f92672">=</span>(<span style="color:#ae81ff">16</span>, <span style="color:#ae81ff">12</span>))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    plots <span style="color:#f92672">=</span> [(<span style="color:#e6db74">&#39;Kinetic Energy&#39;</span>, <span style="color:#e6db74">&#39;kes&#39;</span>), (<span style="color:#e6db74">&#39;Potential Energy&#39;</span>, <span style="color:#e6db74">&#39;pes&#39;</span>),
</span></span><span style="display:flex;"><span>             (<span style="color:#e6db74">&#39;Total Energy&#39;</span>, <span style="color:#e6db74">&#39;tes&#39;</span>), (<span style="color:#e6db74">&#39;Temperature&#39;</span>, <span style="color:#e6db74">&#39;Ts&#39;</span>)]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> ax, (title, key) <span style="color:#f92672">in</span> zip(axs<span style="color:#f92672">.</span>flat, plots):
</span></span><span style="display:flex;"><span>        ax<span style="color:#f92672">.</span>plot(data[<span style="color:#e6db74">&#39;ts&#39;</span>], data[key])
</span></span><span style="display:flex;"><span>        ax<span style="color:#f92672">.</span>set_xlabel(<span style="color:#e6db74">&#39;TimeStep&#39;</span>)
</span></span><span style="display:flex;"><span>        ax<span style="color:#f92672">.</span>set_ylabel(title)
</span></span><span style="display:flex;"><span>        ax<span style="color:#f92672">.</span>set_title(title)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    plt<span style="color:#f92672">.</span>tight_layout()
</span></span><span style="display:flex;"><span>    plt<span style="color:#f92672">.</span>savefig(args<span style="color:#f92672">.</span>output, dpi<span style="color:#f92672">=</span><span style="color:#ae81ff">300</span>, bbox_inches<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;tight&#39;</span>)
</span></span></code></pre></div><h3 id="trajectory-analysis-script">Trajectory Analysis Script</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#75715e"># Hunter Heidenreich, 2023</span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Plots the coordinates of the adatom.</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> matplotlib.pyplot <span style="color:#66d9ef">as</span> plt
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> argparse <span style="color:#f92672">import</span> ArgumentParser
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">if</span> __name__ <span style="color:#f92672">==</span> <span style="color:#e6db74">&#39;__main__&#39;</span>:
</span></span><span style="display:flex;"><span>    parser <span style="color:#f92672">=</span> ArgumentParser()
</span></span><span style="display:flex;"><span>    parser<span style="color:#f92672">.</span>add_argument(<span style="color:#e6db74">&#39;--input&#39;</span>, type<span style="color:#f92672">=</span>str, required<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span><span style="display:flex;"><span>    parser<span style="color:#f92672">.</span>add_argument(<span style="color:#e6db74">&#39;--output&#39;</span>, type<span style="color:#f92672">=</span>str, required<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span><span style="display:flex;"><span>    parser<span style="color:#f92672">.</span>add_argument(<span style="color:#e6db74">&#39;--id&#39;</span>, type<span style="color:#f92672">=</span>int, default<span style="color:#f92672">=</span><span style="color:#ae81ff">1665</span>,
</span></span><span style="display:flex;"><span>                       help<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;Atom ID to track (the adatom is the last created atom)&#39;</span>)
</span></span><span style="display:flex;"><span>    parser<span style="color:#f92672">.</span>add_argument(<span style="color:#e6db74">&#39;--do_z&#39;</span>, action<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;store_true&#39;</span>,
</span></span><span style="display:flex;"><span>                       help<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;Plot z-coordinate instead of xy scatter&#39;</span>)
</span></span><span style="display:flex;"><span>    args <span style="color:#f92672">=</span> parser<span style="color:#f92672">.</span>parse_args()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    coords <span style="color:#f92672">=</span> {<span style="color:#e6db74">&#39;x&#39;</span>: [], <span style="color:#e6db74">&#39;y&#39;</span>: [], <span style="color:#e6db74">&#39;z&#39;</span>: []}
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">with</span> open(args<span style="color:#f92672">.</span>input, <span style="color:#e6db74">&#39;r&#39;</span>) <span style="color:#66d9ef">as</span> f:
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">for</span> line <span style="color:#f92672">in</span> f:
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">if</span> line<span style="color:#f92672">.</span>startswith(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#39;</span><span style="color:#e6db74">{</span>args<span style="color:#f92672">.</span>id<span style="color:#e6db74">}</span><span style="color:#e6db74"> &#39;</span>):
</span></span><span style="display:flex;"><span>                x, y, z <span style="color:#f92672">=</span> map(float, line<span style="color:#f92672">.</span>split()[<span style="color:#ae81ff">2</span>:<span style="color:#ae81ff">5</span>])
</span></span><span style="display:flex;"><span>                coords[<span style="color:#e6db74">&#39;x&#39;</span>]<span style="color:#f92672">.</span>append(x)
</span></span><span style="display:flex;"><span>                coords[<span style="color:#e6db74">&#39;y&#39;</span>]<span style="color:#f92672">.</span>append(y)
</span></span><span style="display:flex;"><span>                coords[<span style="color:#e6db74">&#39;z&#39;</span>]<span style="color:#f92672">.</span>append(z)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    plt<span style="color:#f92672">.</span>figure(figsize<span style="color:#f92672">=</span>(<span style="color:#ae81ff">10</span>, <span style="color:#ae81ff">8</span>))
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> args<span style="color:#f92672">.</span>do_z:
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>plot(range(len(coords[<span style="color:#e6db74">&#39;z&#39;</span>])), coords[<span style="color:#e6db74">&#39;z&#39;</span>], <span style="color:#e6db74">&#39;b-&#39;</span>, linewidth<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>xlabel(<span style="color:#e6db74">&#39;Frame Index&#39;</span>)
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>ylabel(<span style="color:#e6db74">&#39;Z Coordinate (Å)&#39;</span>)
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>title(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#39;Height vs. Time for Adatom </span><span style="color:#e6db74">{</span>args<span style="color:#f92672">.</span>id<span style="color:#e6db74">}</span><span style="color:#e6db74">&#39;</span>)
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>grid(<span style="color:#66d9ef">True</span>, alpha<span style="color:#f92672">=</span><span style="color:#ae81ff">0.3</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">else</span>:
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>scatter(coords[<span style="color:#e6db74">&#39;x&#39;</span>], coords[<span style="color:#e6db74">&#39;y&#39;</span>], s<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, alpha<span style="color:#f92672">=</span><span style="color:#ae81ff">0.7</span>, c<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;red&#39;</span>)
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>xlabel(<span style="color:#e6db74">&#39;X Coordinate (Å)&#39;</span>)
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>ylabel(<span style="color:#e6db74">&#39;Y Coordinate (Å)&#39;</span>)
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>title(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#39;XY Trajectory for Adatom </span><span style="color:#e6db74">{</span>args<span style="color:#f92672">.</span>id<span style="color:#e6db74">}</span><span style="color:#e6db74">&#39;</span>)
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>axis(<span style="color:#e6db74">&#39;equal&#39;</span>)
</span></span><span style="display:flex;"><span>        plt<span style="color:#f92672">.</span>grid(<span style="color:#66d9ef">True</span>, alpha<span style="color:#f92672">=</span><span style="color:#ae81ff">0.3</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    plt<span style="color:#f92672">.</span>savefig(args<span style="color:#f92672">.</span>output, dpi<span style="color:#f92672">=</span><span style="color:#ae81ff">300</span>, bbox_inches<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;tight&#39;</span>)
</span></span></code></pre></div><h2 id="summary-and-next-steps">Summary and Next Steps</h2>
<p>This tutorial demonstrates how molecular dynamics generates valuable ML training data for materials science. Adatom diffusion provides an ideal starting point because it:</p>
<ul>
<li><strong>Has interpretable physics</strong>: Well-understood mechanisms enable ML validation</li>
<li><strong>Shows diverse behaviors</strong>: Temperature-dependent dynamics create rich datasets</li>
<li><strong>Scales efficiently</strong>: Small systems allow extensive parameter exploration</li>
<li><strong>Connects to applications</strong>: Direct relevance to catalysis and surface engineering</li>
</ul>
<h3 id="whats-next">What&rsquo;s Next</h3>
<p>Future posts will extend this framework:</p>
<ol>
<li><strong>Mixed-metal surfaces</strong>: Alloy effects on diffusion pathways</li>
<li><strong>Stepped surfaces</strong>: How defects alter atomic mobility</li>
<li><strong>ML implementation</strong>: Training neural networks on simulation data</li>
</ol>
<h3 id="broader-applications">Broader Applications</h3>
<p>These simulation techniques enable various ML applications:</p>
<ul>
<li><strong>Neural network potentials</strong>: Replacing expensive quantum calculations with trained models</li>
<li><strong>Rare event sampling</strong>: ML-enhanced diffusion pathway identification</li>
<li><strong>Catalyst design</strong>: Predicting surface modification effects on reactivity</li>
<li><strong>Materials discovery</strong>: Screening alloy compositions for desired properties</li>
</ul>
<h3 id="getting-started">Getting Started</h3>
<p>To reproduce these simulations:</p>
<ol>
<li>Install LAMMPS with EAM potential support</li>
<li>Download <code>Cu01.eam.alloy</code> from the <a href="https://www.ctcms.nist.gov/potentials/entry/2001--Mishin-Y-Mehl-M-J-Papaconstantopoulos-D-A-et-al--Cu-1/">NIST repository</a> and place it in your working directory</li>
<li>Save the LAMMPS script as <code>adatom_cu.lammps</code> and run:
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>lammps -in adatom_cu.lammps
</span></span></code></pre></div></li>
<li>Analyze the results with the Python scripts:
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>python plot_energy.py --input energy_avg.txt --output energy.png --skip <span style="color:#ae81ff">30</span>
</span></span><span style="display:flex;"><span>python plot_trajectory.py --input dump.lammpstrj --output trajectory_xy.png
</span></span><span style="display:flex;"><span>python plot_trajectory.py --input dump.lammpstrj --output trajectory_z.png --do_z
</span></span></code></pre></div></li>
<li>Visualize in Ovito by opening <code>dump.lammpstrj</code></li>
<li>Experiment with different temperatures, orientations, or elements</li>
</ol>
<hr>
<p>The full project, including the simulation architecture and automated analysis pipeline, is documented on the <a href="/projects/lammps-adatom-diffusion/">Automated Adatom Diffusion Workflow project page</a>.</p>
<p><em>Questions about the simulation setup or interested in applying these techniques to your research? Feel free to reach out. I&rsquo;m always happy to discuss molecular dynamics and machine learning applications.</em></p>
<h2 id="references">References</h2>
<ul>
<li><a href="https://www.lammps.org/">LAMMPS</a></li>
<li><a href="https://www.ovito.org/">Ovito</a></li>
<li><a href="https://www.ctcms.nist.gov/potentials/">NIST Interatomic Potentials Repository</a></li>
<li><a href="https://doi.org/10.1103/PhysRevB.63.224106">Mishin et al.</a></li>
</ul>
]]></content:encoded></item><item><title>Copper Adatom Diffusion on Cu(100): LAMMPS Simulation</title><link>https://hunterheidenreich.com/videos/cu-adatom-diffusion/</link><pubDate>Wed, 27 Sep 2023 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/videos/cu-adatom-diffusion/</guid><description>LAMMPS molecular dynamics simulation of copper adatom diffusion on a Cu(100) surface, showing atomic mobility mechanisms.</description><content:encoded><![CDATA[<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/nIdbNqEEPys?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<p>Details on the simulation can be found in the <a href="/posts/adatom-cu-diffusion/">Cu Adatom Diffusion on Cu(100)</a> post and the <a href="/projects/lammps-adatom-diffusion/">Automated Adatom Diffusion Workflow</a> project page.</p>
]]></content:encoded></item><item><title>Generating Mini-Protein Trajectories with GROMACS</title><link>https://hunterheidenreich.com/posts/mini-proteins/</link><pubDate>Thu, 21 Sep 2023 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/posts/mini-proteins/</guid><description>Systematic GROMACS workflows for simulating mini-proteins across multiple amino acids to generate diverse MD trajectories for ML applications.</description><content:encoded><![CDATA[<h2 id="introduction">Introduction</h2>
<p>When developing machine learning models for protein dynamics, I needed training data, lots of it. Most researchers start with alanine dipeptide, a tiny system that, despite its name, contains a single alanine residue capped at both ends. It&rsquo;s become the &ldquo;hello world&rdquo; of protein simulation: small enough to simulate quickly but complex enough to show interesting conformational behavior.</p>
<p>I wanted more diversity in my training data. Different amino acid side chains behave differently, and I was curious how this would affect model performance. So I extended the typical alanine dipeptide approach to include eight other amino acids, creating a small collection of &ldquo;mini-proteins&rdquo; for ML studies.</p>
<p>These dipeptides give a controlled testbed for studying how different chemical properties (aromatic rings, flexibility, branching) affect molecular dynamics, and for generating training data that varies those properties systematically.</p>
<h2 id="what-are-mini-proteins">What Are Mini-Proteins?</h2>
<p>In this context, &ldquo;mini-proteins&rdquo; are single amino acid residues capped with acetyl and N-methyl groups (Ace-X-Nme, where X is the amino acid). These systems act as the simplest possible models that still capture essential protein-like behavior.</p>
<p>These systems are popular in computational studies because they:</p>
<ul>
<li>Simulate quickly (seconds to minutes instead of hours)</li>
<li>Have well-characterized behavior for validation</li>
<li>Show enough complexity to be interesting</li>
<li>Can be systematically varied to study different chemical effects</li>
</ul>
<h2 id="getting-started">Getting Started</h2>
<p>The complete workflow and scripts are available on GitHub: <a href="https://github.com/hunter-heidenreich/mini-proteins/">mini-proteins</a>. The full project overview is on the <a href="/projects/mini-protein-trajectories/">Mini-Protein Trajectory Generation project page</a>.</p>
<h3 id="requirements">Requirements</h3>
<ul>
<li>Linux system with GROMACS installed</li>
<li>Python 3 with numpy and matplotlib</li>
<li>Basic familiarity with molecular dynamics concepts</li>
</ul>
<h3 id="quick-start">Quick Start</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>git clone https://github.com/hunter-heidenreich/mini-proteins.git
</span></span><span style="display:flex;"><span>cd mini-proteins
</span></span><span style="display:flex;"><span>ID<span style="color:#f92672">=</span>ala sh scripts/run.sh
</span></span></code></pre></div><p>This runs the complete pipeline: energy minimization, solvation, equilibration, and production simulation. The default settings generate 1 ns of trajectory data saved every 100 fs. I chose high temporal resolution for my ML models, but you can adjust this in <code>config/md_langevin.mdp</code>.</p>
<p>For longer production runs (recommended for most applications), increase the simulation time to ~100 ns and reduce the save frequency to manage file sizes.</p>
<h2 id="the-collection">The Collection</h2>
<p>I&rsquo;ve included nine different amino acid dipeptides, each with distinct chemical properties:</p>
<p><strong>Flexible systems</strong>: Glycine (smallest side chain), Alanine (methyl group)</p>
<p><strong>Branched systems</strong>: Valine, Isoleucine, Leucine (different branching patterns)</p>
<p><strong>Aromatic systems</strong>: Phenylalanine, Tryptophan (different ring structures)</p>
<p><strong>Special cases</strong>: Proline (ring constraint), Methionine (sulfur chemistry)</p>
<p>This systematic set allows studying how different chemical features affect dynamics:</p>
<ul>
<li>Does the flexibility of glycine lead to more diverse conformational sampling?</li>
<li>How do aromatic rings in tryptophan affect conformational transitions?</li>
<li>Does the ring constraint in proline create different energy landscapes?</li>
</ul>
<p>These fundamental questions provide systematic data to test ML models against known chemical intuition, building confidence in the approach.</p>
<p>Ideally, a neural network trained on this dataset should learn physical <em>invariances</em> instead of memorizing a single molecule. Training on both aliphatic (Val, Leu, Ile) and aromatic (Phe, Trp) systems exposes the model to how differences in electronic structure (π-systems vs. σ-bonds) shape local potential energy surfaces.</p>
<h3 id="generating-ml-ready-trajectory-data">Generating ML-Ready Trajectory Data</h3>
<p>Generating raw coordinates is easy; generating <strong>ML-ready data</strong> requires specific configurations. Standard MD workflows usually write compressed trajectories (GROMACS <code>.xtc</code> files) to save space, which keep only reduced-precision coordinates and discard velocities and forces. To train Neural Network Potentials (NNPs), I configured the GROMACS pipeline differently.</p>
<p>The fastest way to generate trajectory data is using the <code>run.sh</code> script:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>ID<span style="color:#f92672">=</span>ala sh scripts/run.sh
</span></span></code></pre></div><p>where <code>ID</code> is the three-letter amino acid code (here, <code>ala</code> for alanine).</p>
<p>This script performs energy minimization, solvation, neutralization, NVT equilibration, NPT equilibration, and production simulation. The resulting trajectory saves to the <code>out/ID/data</code> directory.</p>
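<p>To generate the whole collection in one go, the same command can be looped over residue codes. This is a sketch under assumptions: only <code>ala</code> appears explicitly in this post, so the other lowercase IDs are guesses at the repository&rsquo;s naming, and the helper names are hypothetical. It defaults to a dry run that only prints the commands.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import os
import subprocess

# Three-letter codes for the nine residues. Only `ala` is confirmed by the post;
# the remaining IDs are assumed to follow the same lowercase convention.
RESIDUES = ["gly", "ala", "val", "ile", "leu", "phe", "trp", "pro", "met"]

def build_job(residue):
    """Return (command, environment) equivalent to `ID=residue sh scripts/run.sh`."""
    env = dict(os.environ, ID=residue)
    return ["sh", "scripts/run.sh"], env

def run_all(residues=RESIDUES, dry_run=True):
    for residue in residues:
        cmd, env = build_job(residue)
        if dry_run:
            print(f"ID={residue} {' '.join(cmd)}")
        else:
            subprocess.run(cmd, env=env, check=True)  # stop on the first failure

if __name__ == "__main__":
    run_all(dry_run=True)  # use dry_run=False from inside the cloned repository
</code></pre></div>
<p>Running the jobs sequentially keeps the output directories separate (<code>out/ID/data</code> for each residue) and makes a failed system easy to spot.</p>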
<h4 id="why-this-pipeline-differs-from-standard-tutorials">Why This Pipeline Differs from Standard Tutorials</h4>
<p>A key deviation from standard tutorials is the use of <strong>Stochastic Dynamics (Langevin)</strong> as the integrator. This adds friction and noise terms to the equations of motion, ensuring correct thermodynamic sampling:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-ini" data-lang="ini"><span style="display:flex;"><span><span style="color:#75715e">; config/md_langevin.mdp</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">integrator</span>  <span style="color:#f92672">=</span> <span style="color:#e6db74">sd        ; Stochastic dynamics (Langevin)</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">dt</span>          <span style="color:#f92672">=</span> <span style="color:#e6db74">0.001     ; 1 fs timestep</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">nstxout</span>     <span style="color:#f92672">=</span> <span style="color:#e6db74">100       ; Save coordinates every 100 steps</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">nstvout</span>     <span style="color:#f92672">=</span> <span style="color:#e6db74">100       ; Save velocities every 100 steps</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">nstfout</span>     <span style="color:#f92672">=</span> <span style="color:#e6db74">100       ; Save forces every 100 steps</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">tc-grps</span>     <span style="color:#f92672">=</span> <span style="color:#e6db74">Protein Non-Protein</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">tau_t</span>       <span style="color:#f92672">=</span> <span style="color:#e6db74">0.1  0.1  ; Inverse friction constant (ps)</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">ref_t</span>       <span style="color:#f92672">=</span> <span style="color:#e6db74">298  298  ; Reference temperature (K)</span>
</span></span></code></pre></div><p>The critical settings for ML applications:</p>
<ol>
<li><strong>Langevin Dynamics (<code>sd</code>)</strong>: Ensures proper canonical (NVT) sampling, providing a robust alternative to the velocity-rescaling thermostat often used in tutorials</li>
<li><strong>Uncompressed Force Output (<code>nstfout = 100</code>)</strong>: Writing to the full-precision <code>.trr</code> format records the force on every atom, which is essential for force matching in NNP training</li>
<li><strong>High-Frequency Sampling (0.1 ps)</strong>: Saving frames every 100 fs densely samples fast motions such as bond vibrations, which the 10 ps snapshots common in tutorials sample only sparsely</li>
</ol>
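<p>For reference, the friction and noise terms that <code>sd</code> adds can be written as the standard Langevin equation. This is the textbook form, not something taken from the pipeline itself; it assumes the friction rate $\gamma_i$ is the inverse of <code>tau_t</code>, which is how GROMACS defines the parameter for this integrator:</p>
<p>$$m_i \frac{d^2 \mathbf{r}_i}{dt^2} = \mathbf{F}_i(\mathbf{r}) - m_i \gamma_i \frac{d \mathbf{r}_i}{dt} + \boldsymbol{\eta}_i(t), \qquad \langle \eta_i(t)\, \eta_j(t+s) \rangle = 2 m_i \gamma_i k_B T\, \delta(s)\, \delta_{ij}$$</p>
<p>Under that assumption, <code>tau_t = 0.1</code> ps corresponds to $\gamma = 10$ ps$^{-1}$. The noise amplitude is tied to the friction and to <code>ref_t</code>, which is what keeps the system at the target temperature.</p>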
<p><strong>Note</strong>: A production simulation currently runs for 1 nanosecond, with frames saved every 0.1 picoseconds (100 fs). For most applications, increase this to 100 nanoseconds and reduce the save frequency to avoid large data files. I targeted 100 fs because I needed time-correlated data for ML models; other applications can usually get by with less frequent output.</p>
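<p>To see why the save frequency matters, here is a rough back-of-the-envelope sketch in Python. The function names, the 2,000-atom system size, and the 4-bytes-per-value (single precision) figure are all assumptions for illustration; the estimate also ignores frame headers, so treat the result as an order of magnitude only.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"># Rough output-size estimate for a .trr trajectory (illustrative numbers only).

def trajectory_frames(sim_time_ps, dt_ps, nstout):
    """Number of frames written when saving every `nstout` steps."""
    n_steps = round(sim_time_ps / dt_ps)
    return n_steps // nstout


def trr_size_gb(n_frames, n_atoms, bytes_per_value=4, vectors_per_atom=3):
    """Approximate size with positions, velocities, and forces (3 values each)."""
    bytes_per_frame = n_atoms * vectors_per_atom * 3 * bytes_per_value
    return n_frames * bytes_per_frame / 1e9


N_ATOMS = 2_000  # hypothetical solvated dipeptide box, not a measured value

frames_1ns = trajectory_frames(sim_time_ps=1_000, dt_ps=0.001, nstout=100)
frames_100ns = trajectory_frames(sim_time_ps=100_000, dt_ps=0.001, nstout=100)

print(frames_1ns, round(trr_size_gb(frames_1ns, N_ATOMS), 2))
print(frames_100ns, round(trr_size_gb(frames_100ns, N_ATOMS), 2))
</code></pre></div>
<p>Under these assumptions, 1 ns at 100 fs output is 10,000 frames (roughly 0.7 GB), while 100 ns at the same frequency is a million frames (roughly 72 GB), which is why the save interval should grow along with the run length.</p>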
<p>You can also run each step individually (see <code>scripts/run.sh</code> for examples).</p>
<h2 id="the-systems">The Systems</h2>
<p>Here are the nine amino acid dipeptides I&rsquo;ve included, each chosen for different chemical properties:</p>
<h3 id="alanine-dipeptide-the-standard">Alanine Dipeptide: The Standard</h3>















<figure class="post-figure center ">
    <img src="/img/alanine-dipeptide-molecular-dynamics.webp"
         alt="Alanine dipeptide molecular dynamics simulation animation"
         title="Alanine dipeptide molecular dynamics simulation animation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Alanine Dipeptide</figcaption>
    
</figure>

<p>The classic starting point for protein folding studies. The small methyl side chain provides a simple yet challenging system.</p>
<h3 id="glycine-dipeptide-maximum-flexibility">Glycine Dipeptide: Maximum Flexibility</h3>















<figure class="post-figure center ">
    <img src="/img/glycine-dipeptide-molecular-dynamics.webp"
         alt="Glycine dipeptide molecular dynamics simulation animation"
         title="Glycine dipeptide molecular dynamics simulation animation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Glycine Dipeptide</figcaption>
    
</figure>

<p>No side chain means maximum backbone flexibility. Great as a baseline for studying how side-chain constraints affect conformational sampling.</p>
<h3 id="proline-dipeptide-built-in-rigidity">Proline Dipeptide: Built-in Rigidity</h3>















<figure class="post-figure center ">
    <img src="/img/proline-dipeptide-molecular-dynamics.webp"
         alt="Proline dipeptide molecular dynamics simulation animation"
         title="Proline dipeptide molecular dynamics simulation animation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Proline Dipeptide</figcaption>
    
</figure>

<p>The ring structure creates backbone constraints. Interesting comparison to glycine&rsquo;s flexibility.</p>
<h3 id="aromatic-systems">Aromatic Systems</h3>















<figure class="post-figure center ">
    <img src="/img/phenylalanine-dipeptide-molecular-dynamics.webp"
         alt="Phenylalanine dipeptide molecular dynamics simulation animation"
         title="Phenylalanine dipeptide molecular dynamics simulation animation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Phenylalanine Dipeptide</figcaption>
    
</figure>

<p><strong>Phenylalanine</strong>: Simple benzene ring for studying aromatic interactions.</p>















<figure class="post-figure center ">
    <img src="/img/tryptophan-dipeptide-molecular-dynamics.webp"
         alt="Tryptophan dipeptide molecular dynamics simulation animation"
         title="Tryptophan dipeptide molecular dynamics simulation animation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Tryptophan Dipeptide</figcaption>
    
</figure>

<p><strong>Tryptophan</strong>: Larger indole ring system with more complex aromatic chemistry.</p>
<h3 id="branched-aliphatic-systems">Branched Aliphatic Systems</h3>















<figure class="post-figure center ">
    <img src="/img/valine-dipeptide-molecular-dynamics.webp"
         alt="Valine dipeptide molecular dynamics simulation animation"
         title="Valine dipeptide molecular dynamics simulation animation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Valine Dipeptide</figcaption>
    
</figure>

<p><strong>Valine</strong>: β-branched, creates steric constraints near the backbone.</p>















<figure class="post-figure center ">
    <img src="/img/isoleucine-dipeptide-molecular-dynamics.webp"
         alt="Isoleucine dipeptide molecular dynamics simulation animation"
         title="Isoleucine dipeptide molecular dynamics simulation animation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Isoleucine Dipeptide</figcaption>
    
</figure>

<p><strong>Isoleucine</strong>: Also β-branched, but with a longer sec-butyl side chain that gives a different steric profile than valine.</p>















<figure class="post-figure center ">
    <img src="/img/leucine-dipeptide-molecular-dynamics.webp"
         alt="Leucine dipeptide molecular dynamics simulation animation"
         title="Leucine dipeptide molecular dynamics simulation animation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Leucine Dipeptide</figcaption>
    
</figure>

<p><strong>Leucine</strong>: γ-branched, so the branch point sits one carbon farther from the backbone, leaving more conformational freedom.</p>
<h3 id="special-chemistry">Special Chemistry</h3>















<figure class="post-figure center ">
    <img src="/img/methionine-dipeptide-molecular-dynamics.webp"
         alt="Methionine dipeptide molecular dynamics simulation animation"
         title="Methionine dipeptide molecular dynamics simulation animation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Methionine Dipeptide</figcaption>
    
</figure>

<p><strong>Methionine</strong>: The only sulfur-containing residue in the set; its thioether side chain makes it useful for studying heteroatom effects.</p>
<h2 id="whats-next">What&rsquo;s Next?</h2>
<p>These mini-protein simulations have been useful for my ML work, providing systematic training data with controlled chemical variation. These simple systems have helped me understand how different amino acid properties affect molecular behavior, knowledge that&rsquo;s valuable when working with larger, more complex proteins.</p>
<p>The primary value of this pipeline lies in the <strong>force extraction</strong> workflow. Having atomic forces alongside coordinates enables training NNPs via force matching. Forces are a richer training signal than energies alone: each frame supplies three force components per atom but only a single energy. Tools like <a href="https://github.com/torchmd/torchmd-net">TorchMD-Net</a>, <a href="https://github.com/mir-group/nequip">NequIP</a>, and <a href="https://github.com/ACEsuit/mace">MACE</a> can train on this kind of coordinate-and-force data once it is converted to their expected input formats.</p>
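<p>In its simplest form, force matching minimizes the squared difference between predicted and reference forces. The following is a minimal pure-Python sketch of that loss; the function name and the toy numbers are hypothetical, and real training code would use a tensor library and whatever loss each tool defines.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"># Force-matching loss sketch: mean squared error over all force components.

def force_matching_loss(pred_forces, ref_forces):
    """Both arguments: list of frames, each a list of (fx, fy, fz) per atom."""
    total, count = 0.0, 0
    for pred_frame, ref_frame in zip(pred_forces, ref_forces):
        for pred_atom, ref_atom in zip(pred_frame, ref_frame):
            for p, r in zip(pred_atom, ref_atom):
                total += (p - r) ** 2
                count += 1
    return total / count


# Toy data: one frame, two atoms (made-up values, in kJ/mol/nm).
reference = [[(1.0, 0.0, -2.0), (0.5, 0.5, 0.0)]]
predicted = [[(1.0, 0.0, -1.0), (0.5, 0.5, 0.0)]]

loss = force_matching_loss(predicted, reference)
print(loss)
</code></pre></div>
<p>Only one of the six components differs here (by 1.0), so the loss is 1/6. Every atom in every frame contributes three terms to the sum, which is where the extra training signal comes from.</p>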
<p>The scripts are designed to be easily modified for different amino acids or simulation conditions. I&rsquo;ve tried to make the workflow straightforward while keeping it flexible.</p>
<p>This work complements my other molecular dynamics projects:</p>
<ul>
<li><a href="/posts/adatom-cu-diffusion/">LAMMPS Tutorial: Copper and Platinum Adatom Diffusion</a>: Learning LAMMPS for surface simulations and extending to different elements</li>
</ul>
<p>Together, these projects have given me a solid foundation in MD simulations for generating ML training data across different molecular systems.</p>
<hr>
<p><em>Find the complete code and documentation on <a href="https://github.com/hunter-heidenreich/mini-proteins">GitHub</a>. Questions or suggestions? I&rsquo;d love to hear from you, especially if you&rsquo;ve found interesting ways to extend or improve the approach.</em></p>
<h2 id="acknowledgements">Acknowledgements</h2>
<p>The scripts build on the <a href="https://cbp-unitn.gitlab.io/qcb22-23/QCB/tutorial2_gromacs">GROMACS tutorial</a> by Luca Tubiana at the University of Trento.</p>
]]></content:encoded></item><item><title>Automated Adatom Diffusion Workflow</title><link>https://hunterheidenreich.com/projects/lammps-adatom-diffusion/</link><pubDate>Thu, 21 Sep 2023 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/lammps-adatom-diffusion/</guid><description>Python-wrapped reference implementation for surface diffusion simulations using LAMMPS and EAM potentials, with automated analysis pipelines.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>This project provides an &ldquo;input-to-analysis&rdquo; workflow for simulating adatom diffusion on FCC metal surfaces. It demonstrates how to set up surface diffusion simulations in LAMMPS, manage EAM potentials, and parse trajectory data into energy and trajectory plots using Python. The LAMMPS input scripts are adapted from Eric N. Hahn&rsquo;s adatom tutorial; the Python analysis layer (<code>plot_energy.py</code>, <code>plot_xy.py</code>) is my own, written while in CSElab (Harvard, 2023).</p>
<p>The workflow covers two material systems, copper (Cu) and platinum (Pt), providing comparative datasets that highlight how atomic mass and bonding strength affect surface dynamics.</p>
<h2 id="features">Features</h2>
<h3 id="simulation-architecture">Simulation Architecture</h3>
<p>The project is organized by material system, and within each directory the simulation inputs are kept separate from the analysis code:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Directory</th>
          <th style="text-align: left">Description</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong><code>/adatom_cu</code></strong></td>
          <td style="text-align: left">Copper adatom diffusion on Cu(100)</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong><code>/adatom_pt</code></strong></td>
          <td style="text-align: left">Platinum adatom diffusion on Pt(100)</td>
      </tr>
  </tbody>
</table>
<p>Each directory contains:</p>
<ul>
<li><strong>LAMMPS input scripts</strong> (<code>.in</code> files) defining the physics</li>
<li><strong>EAM potential files</strong> for metallic bonding (the Cu potential is committed; the Pt potential must be downloaded separately from the NIST Interatomic Potentials Repository, so the Pt system does not run as checked out)</li>
<li><strong>Python analysis scripts</strong> for trajectory and energy parsing</li>
</ul>
<h3 id="key-features">Key Features</h3>
<ul>
<li><strong>EAM Potentials</strong>: Uses Embedded Atom Method (EAM) alloy potentials, whose density-dependent embedding term models metallic bonding and surface energies better than simple pairwise Lennard-Jones potentials</li>
<li><strong>Automated Analysis</strong>: Python pipeline (<code>plot_energy.py</code>, <code>plot_xy.py</code>) that parses raw thermodynamic logs and trajectory dumps to generate &ldquo;health check&rdquo; dashboards</li>
<li><strong>Workflow Orchestration</strong>: Demonstrates the &ldquo;Input → Simulation → Analysis&rdquo; loop, automating the transition from raw <code>.lammpstrj</code> files to publication-ready plots</li>
<li><strong>Kokkos Support</strong>: Includes Kokkos execution commands for GPU/multi-threaded runs</li>
</ul>
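<p>As an illustration of the log-parsing step, here is a minimal sketch of reading thermodynamic output from a LAMMPS log. It is not the code in <code>plot_energy.py</code>; the function name and the sample log are hypothetical, and it assumes the default one-line thermo style, where a header row starting with <code>Step</code> is followed by numeric rows until a <code>Loop time</code> line.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">def parse_thermo(log_text):
    """Return a list of {column: value} dicts from LAMMPS thermo output."""
    rows, header = [], None
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Step":        # thermo header line starts a block
            header = fields
        elif fields[0] == "Loop":      # "Loop time of ..." ends the block
            header = None
        elif header and len(fields) == len(header):
            try:
                rows.append(dict(zip(header, map(float, fields))))
            except ValueError:
                pass                   # skip warnings and other text lines
    return rows


# Made-up sample, not output from the actual Cu or Pt runs.
SAMPLE_LOG = """\
Step Temp PotEng TotEng
0 300.0 -3540.1 -3501.3
5 298.7 -3539.8 -3501.2
Loop time of 0.01 on 1 procs for 5 steps with 1000 atoms
"""

thermo = parse_thermo(SAMPLE_LOG)
print([row["TotEng"] for row in thermo])
</code></pre></div>
<p>The resulting list of dictionaries can be handed to any plotting library to check, for example, that total energy stays flat during the NVE stage.</p>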
<h3 id="simulation-parameters">Simulation Parameters</h3>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Parameter</th>
          <th style="text-align: left">Value</th>
          <th style="text-align: left">Purpose</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Ensemble</strong></td>
          <td style="text-align: left">NVT → NVE</td>
          <td style="text-align: left">Equilibration followed by energy conservation checks</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Potential</strong></td>
          <td style="text-align: left">EAM/alloy</td>
          <td style="text-align: left">Accurate metallic bonding for surface dynamics</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Minimization</strong></td>
          <td style="text-align: left">CG (1.0e-4)</td>
          <td style="text-align: left">Remove steric overlaps before dynamics</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Timestep</strong></td>
          <td style="text-align: left">5 fs (metal units)</td>
          <td style="text-align: left">EAM-appropriate integration step</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Trajectory dump</strong></td>
          <td style="text-align: left">every 5 steps (25 fs)</td>
          <td style="text-align: left">Tracks adatom site-to-site hops</td>
      </tr>
  </tbody>
</table>
<h2 id="usage">Usage</h2>
<p>The repository includes LAMMPS input scripts and Python analysis scripts. Run the LAMMPS scripts to generate trajectory data, then use the Python scripts to visualize the results.</p>
<h2 id="results">Results</h2>
<p>This workflow is documented in detail in a companion blog post:</p>
<ul>
<li><a href="/posts/adatom-cu-diffusion/">LAMMPS Tutorial: Copper and Platinum Adatom Diffusion</a> - Complete setup walkthrough with line-by-line script explanation and comparison of how heavier atoms behave differently on surfaces</li>
</ul>
]]></content:encoded></item><item><title>Mini-Protein Trajectory Generation</title><link>https://hunterheidenreich.com/projects/mini-protein-trajectories/</link><pubDate>Tue, 01 Aug 2023 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/mini-protein-trajectories/</guid><description>Automated GROMACS pipeline generating MD trajectories with atomic force extraction for Neural Network Potential training.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>I developed an automated GROMACS pipeline to generate molecular dynamics (MD) datasets for machine learning applications. The workflow automates the simulation of capped dipeptides across nine distinct residue types, creating a diverse training set suitable for Neural Network Potentials (NNPs). The pipeline builds on Luca Tubiana&rsquo;s GROMACS tutorial (University of Trento); the Python analysis layer and the curated dipeptide dataset are my own.</p>
<h2 id="features">Features</h2>
<h3 id="automated-simulation-pipeline">Automated Simulation Pipeline</h3>
<ul>
<li><strong>End-to-End Scripting</strong>: Bash-automated workflow handling topology generation (<code>pdb2gmx</code>), solvation, ionization, and equilibration</li>
<li><strong>Langevin Dynamics</strong>: Implemented Stochastic Dynamics (SD) integration to ensure proper canonical (NVT) ensemble sampling</li>
<li><strong>High-Resolution Output</strong>: Configured to capture <strong>0.1 ps (100 fs) resolution</strong> trajectories, critical for capturing fast bond vibrations</li>
<li><strong>Force Extraction</strong>: Writes output in full-precision <code>.trr</code> format, which stores atomic forces alongside coordinates (the compressed <code>.xtc</code> format stores coordinates only), a key requirement for force matching in ML potentials</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-ini" data-lang="ini"><span style="display:flex;"><span><span style="color:#75715e">; md_langevin.mdp</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">integrator</span>  <span style="color:#f92672">=</span> <span style="color:#e6db74">sd        ; Stochastic dynamics for proper sampling</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">dt</span>          <span style="color:#f92672">=</span> <span style="color:#e6db74">0.001     ; 1 fs timestep</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">nstxout</span>     <span style="color:#f92672">=</span> <span style="color:#e6db74">100       ; Output every 100 steps = 0.1 ps resolution</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">tc-grps</span>     <span style="color:#f92672">=</span> <span style="color:#e6db74">Protein Non-Protein</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">tau_t</span>       <span style="color:#f92672">=</span> <span style="color:#e6db74">0.1  0.1  ; Inverse friction constant (ps)</span>
</span></span></code></pre></div><h3 id="chemical-diversity-suite">Chemical Diversity Suite</h3>
<p>Designed to stress-test ML models against varied kinematic constraints:</p>
<table>
  <thead>
      <tr>
          <th>Category</th>
          <th>Residues</th>
          <th>Dynamics Challenge</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Aromatic</strong></td>
          <td>Phe, Trp</td>
          <td>π-stacking, bulky side chains</td>
      </tr>
      <tr>
          <td><strong>Constrained</strong></td>
          <td>Pro</td>
          <td>Cyclic backbone restrictions</td>
      </tr>
      <tr>
          <td><strong>Flexible</strong></td>
          <td>Gly, Ala</td>
          <td>High conformational entropy</td>
      </tr>
      <tr>
          <td><strong>Branched</strong></td>
          <td>Val, Ile, Leu</td>
          <td>Steric clashes, rotamer preferences</td>
      </tr>
      <tr>
          <td><strong>Sulfur-Containing</strong></td>
          <td>Met</td>
          <td>Flexible thioether linkage</td>
      </tr>
  </tbody>
</table>
<h2 id="usage">Usage</h2>
<p>The pipeline runs via bash scripts and requires a working GROMACS installation.</p>
<h2 id="results">Results</h2>
<ul>
<li><strong>Data Volume vs. Fidelity</strong>: Balanced high-frequency force output (every 100 steps) against storage constraints by automating a post-processing step that extracts forces into lightweight <code>.xvg</code> files</li>
<li><strong>Force Field Consistency</strong>: Standardized on the Amber03 force field and TIP3P water model across all residues to ensure consistent potential energy surfaces for downstream model training</li>
</ul>
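<p>The <code>.xvg</code> files are plain text, so they are easy to load without GROMACS. Below is a minimal parsing sketch: lines starting with <code>#</code> are comments, lines starting with <code>@</code> are plot directives, and everything else is whitespace-separated numbers. The function name and sample content are hypothetical, and the column layout of the pipeline&rsquo;s actual files is not assumed.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">def parse_xvg(text):
    """Split .xvg text into (legend labels, numeric rows)."""
    legends, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                               # blank or comment line
        if line.startswith("@"):
            if line.startswith("@ s") and " legend " in line:
                legends.append(line.split('"')[1])  # text between the quotes
            continue                               # ignore other directives
        rows.append([float(value) for value in line.split()])
    return legends, rows


# Made-up sample, not real simulation output.
SAMPLE_XVG = """\
# sample header comment
@ title "Force"
@ xaxis label "Time (ps)"
@ s0 legend "atom 1 X"
@ s1 legend "atom 1 Y"
0.0 12.5 -3.25
0.1 11.0 -2.75
"""

legends, rows = parse_xvg(SAMPLE_XVG)
print(legends, rows[-1])
</code></pre></div>
<p>The first value in each row is the time; the remaining values line up with the legend labels, which makes it straightforward to reshape the rows into per-atom force arrays.</p>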
<blockquote>
<p><strong>Note</strong>: This pipeline uses Amber03 for consistency across residue types. For production ML potentials, consider swapping to CHARMM36m or a similar modern force field.</p></blockquote>
<h2 id="retrospective">Retrospective</h2>
<ul>
<li><strong>Demonstrative, not production-scale</strong>: the 1 ns trajectories exercise the pipeline and capture fast bond vibrations, but proper conformational sampling needs 100 ns to 1 µs runs. This is a working reference, not a finished dataset.</li>
<li><strong>Dated force field</strong>: Amber03 / TIP3P keeps the potential energy surface consistent across residues, but it is not state-of-the-art for ML-potential training; CHARMM36m or Amber ff19SB would be the upgrade path.</li>
<li><strong>Paused, not abandoned</strong>: a candidate to revive and extend (more residues, longer trajectories, Ramachandran analysis) for future force-matching work.</li>
</ul>
<h2 id="related-work">Related Work</h2>
<ul>
<li><a href="/posts/mini-proteins/">Mini-Protein Dynamics</a> - Detailed blog post on the simulation methodology</li>
</ul>
]]></content:encoded></item></channel></rss>