<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Scientific Computing on Hunter Heidenreich | Senior AI Research Scientist</title><link>https://hunterheidenreich.com/categories/scientific-computing/</link><description>Recent content in Scientific Computing on Hunter Heidenreich | Senior AI Research Scientist</description><image><title>Hunter Heidenreich | Senior AI Research Scientist</title><url>https://hunterheidenreich.com/img/avatar.webp</url><link>https://hunterheidenreich.com/img/avatar.webp</link></image><generator>Hugo -- 0.147.7</generator><language>en-US</language><copyright>2026 Hunter Heidenreich</copyright><lastBuildDate>Sat, 30 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://hunterheidenreich.com/categories/scientific-computing/index.xml" rel="self" type="application/rss+xml"/><item><title>nauty and Traces: Graph Isomorphism Algorithms</title><link>https://hunterheidenreich.com/notes/interdisciplinary/graph-theory/nauty-traces-graph-isomorphism/</link><pubDate>Sat, 11 Apr 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/interdisciplinary/graph-theory/nauty-traces-graph-isomorphism/</guid><description>nauty and Traces use individualization-refinement with search tree pruning for graph isomorphism testing and canonical labeling.</description><content:encoded><![CDATA[<h2 id="a-method-paper-on-practical-graph-isomorphism">A Method Paper on Practical Graph Isomorphism</h2>
<p>This is a <strong>Method</strong> paper that brings the published description of nauty (version 2.5) up to date and introduces Traces (version 2.0), a new program for graph isomorphism testing and canonical labeling. The paper provides a unified theoretical framework for the individualization-refinement paradigm that underpins all leading graph isomorphism programs, then details the distinct implementation strategies of nauty and Traces. Extensive benchmarks compare both programs against saucy, Bliss, and conauto across graph families ranging from easy to extremely difficult.</p>
<h2 id="the-graph-isomorphism-problem-in-practice">The Graph Isomorphism Problem in Practice</h2>
<p>An isomorphism between two graphs is a bijection between their vertex sets that preserves adjacency. The graph isomorphism problem (GI) asks whether such a bijection exists. GI is in NP, but it is not known to be in co-NP, and it has been proven neither NP-complete nor solvable in polynomial time. NP-completeness is considered unlikely, as it would imply collapse of the <a href="https://en.wikipedia.org/wiki/Polynomial_hierarchy">polynomial-time hierarchy</a>. At the time of the paper, the best proven worst-case running time had stood for three decades at $e^{O(\sqrt{n \log n})}$.</p>
<p>In practice, direct isomorphism testing is poorly suited for common tasks like removing duplicates from large graph collections or looking up graphs in databases. The standard approach is <strong>canonical labeling</strong>: relabeling a graph so that isomorphic graphs become identical after relabeling. This allows sorting algorithms and standard data structures to handle isomorph rejection and retrieval.</p>
<p>The dominant practical approach is the <strong>individualization-refinement paradigm</strong>, introduced by Parris and Read (1969) and developed by Corneil and Gotlieb (1970). McKay&rsquo;s nauty (1978, 1980) was the first program to handle both structurally regular graphs with hundreds of vertices and graphs with large <a href="https://en.wikipedia.org/wiki/Automorphism_group">automorphism groups</a>. Its key innovation was using discovered automorphisms to prune the search tree. nauty dominated the field for decades until competitors like saucy (2004), Bliss (2007), and conauto (2009) introduced sparse data structures, early refinement abort, and other improvements.</p>
<h2 id="the-individualization-refinement-framework">The Individualization-Refinement Framework</h2>
<p>The paper provides a general formal framework encompassing all leading graph isomorphism algorithms. The core idea has three components: vertex colorings, a search tree built by individualizing vertices, and pruning via node invariants and automorphisms.</p>
<h3 id="colorings-and-refinement">Colorings and Refinement</h3>
<p>A <strong>colouring</strong> of vertex set $V$ is a surjective function $\pi: V \to \{1, 2, \ldots, k\}$. A colouring is <strong>equitable</strong> if any two vertices of the same colour are adjacent to the same number of vertices of each colour. Given any colouring $\pi$, there exists a unique coarsest equitable colouring $\pi'$ with $\pi' \preceq \pi$ (meaning $\pi'$ is finer than or equal to $\pi$). Computing this equitable refinement is the primary computational bottleneck.</p>
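<p>The definition of equitable refinement can be sketched in a few lines of Python. This is a naive fixed-point iteration written for illustration, not the cell-splitting scheme of the paper&rsquo;s Algorithm 1; the function name and the dict-based graph representation are assumptions made here.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">from collections import defaultdict


def equitable_refinement(adj, colour):
    """Return the coarsest equitable refinement of `colour`.

    adj    : dict mapping each vertex to the set of its neighbours
    colour : dict mapping each vertex to an integer colour
    """
    colour = dict(colour)
    while True:
        # Signature = old colour plus the number of neighbours of each colour.
        signature = {}
        for v in adj:
            counts = defaultdict(int)
            for w in adj[v]:
                counts[colour[w]] += 1
            signature[v] = (colour[v], tuple(sorted(counts.items())))
        # Rank the distinct signatures to obtain the new colours 1..k.
        rank = {s: i + 1 for i, s in enumerate(sorted(set(signature.values())))}
        refined = {v: rank[signature[v]] for v in adj}
        if len(set(refined.values())) == len(set(colour.values())):
            return refined  # no cell was split, so the colouring is equitable
        colour = refined


# Path 0 - 1 - 2 - 3 with all vertices initially the same colour.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(equitable_refinement(path, {v: 1 for v in path}))  # {0: 1, 1: 2, 2: 2, 3: 1}
</code></pre></div>
<p>The two end vertices (degree 1) separate from the two inner vertices (degree 2), and no further split is possible. Because the old colour is part of each signature, every round can only split cells, never merge them.</p>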
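<p><strong>Individualization</strong> gives a single vertex $v$ its own colour by shifting up the colours of the other vertices in its cell and of every later cell; refinement is then applied to the result:</p>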
<p>$$
I(\pi, v)(w) = \begin{cases} \pi(w), &amp; \text{if } \pi(w) &lt; \pi(v) \text{ or } w = v \\ \pi(w) + 1, &amp; \text{otherwise} \end{cases}
$$</p>
<p>The refinement function $R(G, \pi_0, \nu)$ applies equitable refinement after each individualization step for a sequence of vertices $\nu = (v_1, v_2, \ldots)$.</p>
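<p>As a concrete reading of the formula for $I(\pi, v)$, here is a minimal sketch that assumes the colouring is stored as a Python dict from vertex to colour in $1..k$. The function name is illustrative, and the refinement step that would follow is left out.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">def individualize(colour, v):
    """Apply I(pi, v) to a colouring given as a dict from vertex to colour in 1..k."""
    cv = colour[v]
    k = max(colour.values())
    shifted = set(range(cv, k + 1))  # colours cv, cv + 1, ..., k move up by one
    return {
        w: c + 1 if (w != v and c in shifted) else c
        for w, c in colour.items()
    }


pi = {0: 1, 1: 2, 2: 2, 3: 1}
print(individualize(pi, 1))  # {0: 1, 1: 2, 2: 3, 3: 1}
</code></pre></div>
<p>Vertex 1 keeps colour 2 while its former cell-mate, vertex 2, moves to colour 3, so the result is again a surjection onto consecutive colours.</p>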
<h3 id="search-tree-and-canonical-forms">Search Tree and Canonical Forms</h3>
<p>The search tree $\mathcal{T}(G, \pi_0)$ is a rooted tree whose nodes are vertex sequences. Starting from the empty sequence at the root, each node extends the sequence by choosing a vertex from a <strong>target cell</strong> (a non-singleton cell of the current colouring). Leaves correspond to discrete colourings (permutations of $V$).</p>
<p>A <strong>canonical form</strong> is a function $C: \mathcal{G} \times \Pi \to \mathcal{G} \times \Pi$ satisfying:</p>
<ul>
<li>$C(G, \pi) \cong (G, \pi)$ (the canonical form is isomorphic to the input)</li>
<li>$C(G^g, \pi^g) = C(G, \pi)$ for all $g \in S_n$ (label-invariance)</li>
</ul>
<p>The canonical form is computed by finding the leaf $\nu^*$ maximizing the node invariant $\phi(G, \pi_0, \nu)$, then applying the corresponding discrete colouring.</p>
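<p>To make the two defining properties concrete, the sketch below computes a canonical form by brute force, taking the greatest relabelled edge list over all $n!$ permutations. This is only an illustration for tiny graphs with a trivial colouring; it is not how nauty or Traces search, and the function name is an assumption made here.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">from itertools import permutations


def canonical_form(n, edges):
    """Greatest sorted edge list over all relabellings of the vertices 0..n-1."""
    return max(
        tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
        for p in permutations(range(n))
    )


# Two labellings of the same 3-vertex path, and a graph with a single edge.
path_a = [(0, 1), (1, 2)]
path_b = [(0, 2), (2, 1)]
one_edge = [(0, 1)]

print(canonical_form(3, path_a) == canonical_form(3, path_b))    # True
print(canonical_form(3, path_a) == canonical_form(3, one_edge))  # False
</code></pre></div>
<p>Every candidate is a relabelling of the input, so the result is isomorphic to it, and the set of candidates does not depend on the input labelling, which gives label-invariance. The individualization-refinement search aims for the same guarantee while visiting only a small part of the permutation space.</p>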
<h3 id="tree-pruning">Tree Pruning</h3>
<p>Three pruning operations keep the search tractable:</p>
<ul>
<li><strong>$P_A(\nu, \nu')$</strong>: Remove subtree at $\nu'$ if $\phi(G, \pi_0, \nu) &gt; \phi(G, \pi_0, \nu')$ (invariant comparison)</li>
<li><strong>$P_B(\nu, \nu')$</strong>: Remove subtree at $\nu'$ if $\phi(G, \pi_0, \nu) \neq \phi(G, \pi_0, \nu')$ (inequivalence)</li>
<li><strong>$P_C(\nu, g)$</strong>: Remove subtree at $\nu^g$ if $g \in \text{Aut}(G, \pi_0)$ and $\nu &lt; \nu^g$ (automorphism pruning)</li>
</ul>
<p>Theorem 5 in the paper guarantees that after any sequence of these pruning operations, at least one canonical leaf survives and the discovered automorphisms generate the full automorphism group.</p>
<h2 id="implementation-nauty-vs-traces">Implementation: nauty vs. Traces</h2>
<p>While both programs operate within the same individualization-refinement framework, their implementation strategies differ substantially.</p>
<h3 id="refinement-strategies">Refinement Strategies</h3>
<p>Both nauty and Traces compute equitable colourings using Algorithm 1, which iteratively splits cells based on adjacency counts. For regular graphs (where all vertices have equal degree), the initial colouring is trivially equitable, so refinement alone cannot split any cell, which makes these graphs difficult. nauty addresses this with a library of stronger partitioning functions (e.g., triangle counting), which require user expertise to select. Traces instead uses a richer node invariant that often makes stronger refinements unnecessary.</p>
<h3 id="target-cell-selection">Target Cell Selection</h3>
<p>nauty has two strategies: using the first non-singleton cell regardless of size, or choosing the first cell with the most non-trivial joins to other cells (where a non-trivial join means more than 0 edges and less than the maximum possible between two cells). An earlier version of nauty preferred the smallest non-singleton cell, hypothesizing it would more likely correspond to a group orbit, but experiments showed the first non-singleton cell performs better in most cases. Traces prefers <strong>large</strong> target cells, which produce shallower search trees. Specifically, Traces selects the first largest non-singleton cell that is a subset of the parent node&rsquo;s target cell. If no non-singleton cells satisfy this, it falls back to the grandparent node&rsquo;s target cell, and so on.</p>
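<p>The two simplest selectors described above can be sketched as follows, assuming a colouring is stored as an ordered list of non-empty cells. The function names are illustrative, and the final fallback in the Traces-style selector (used when no ancestor target cell contains a non-singleton cell) is an assumption of this sketch.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">def first_non_singleton(cells):
    """nauty-style default: the first cell holding more than one vertex, or None."""
    return next((c for c in cells if len(c) != 1), None)


def first_largest_within(cells, ancestor_targets):
    """Traces-style: first largest non-singleton cell inside the nearest ancestor target.

    ancestor_targets lists target cells (as sets) from the parent upwards.
    """
    for target in ancestor_targets:
        inside = [c for c in cells if len(c) != 1 and set(c).issubset(target)]
        if inside:
            return max(inside, key=len)  # max keeps the first of equally large cells
    # Assumed fallback: first largest non-singleton cell of the whole colouring.
    candidates = [c for c in cells if len(c) != 1]
    return max(candidates, key=len) if candidates else None


cells = [[0], [1, 2], [3, 4, 5], [6, 7, 8]]
print(first_non_singleton(cells))                        # [1, 2]
print(first_largest_within(cells, [set(range(3, 9))]))   # [3, 4, 5]
</code></pre></div>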
<h3 id="node-invariants-the-trace">Node Invariants: The Trace</h3>
<p>The most consequential difference is in node invariants. nauty computes a single integer $f(\nu)$ at each node, forming a vector $(f([\nu]_0), f([\nu]_1), \ldots, f(\nu))$ for lexicographic comparison. Traces defines $f(\nu)$ as a <strong>vector</strong> encoding the sizes and positions of cells in the order they are created during refinement. This vector-of-vectors structure (the &ldquo;trace,&rdquo; hence the program&rsquo;s name) enables comparison while refinement is still incomplete. For many difficult graph families, only a fraction of refinement operations need to finish before pruning can occur.</p>
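<p>The benefit of comparing during refinement can be illustrated with a lazily produced trace. The sketch below checks only for equality with the best trace seen so far (a real implementation would also need to know which trace is greater), and the generator is a hypothetical stand-in for refinement rather than anything taken from Traces.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">def matches_best(best_trace, steps):
    """Compare a lazily produced trace with the best trace, stopping at the first difference.

    Returns (matched, number of steps consumed).
    """
    consumed = 0
    for expected, got in zip(best_trace, steps):
        consumed += 1
        if got != expected:
            return False, consumed  # abandon this node; the rest of refinement never runs
    return True, consumed


def refinement_steps(log):
    # Hypothetical stand-in for refinement: emits (cell size, cell position) pairs
    # in the order cells are created, recording each step it actually performs.
    for step in [(2, 0), (1, 1), (1, 3), (1, 2)]:
        log.append(step)
        yield step


best = [(2, 0), (1, 2), (1, 3), (1, 1)]
performed = []
print(matches_best(best, refinement_steps(performed)), len(performed))  # (False, 2) 2
</code></pre></div>
<p>Only two of the four refinement steps are carried out before the mismatch is detected, which is the saving the trace is designed to deliver.</p>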
<h3 id="tree-scanning-order">Tree Scanning Order</h3>
<p>This is the fundamental architectural difference. nauty uses <strong>depth-first</strong> search, keeping the lexicographically least leaf $\nu_1$ and the leaf $\nu^*$ with the greatest invariant discovered so far. Pruning applies when a node&rsquo;s invariant matches neither.</p>
<p>Traces uses <strong>breadth-first</strong> search, processing all nodes at each level $k$ and retaining only those with the greatest invariant value. By property $(\phi 1)$, the best nodes at level $k$ are children of the best nodes at level $k-1$, so no backtracking is needed. This maximizes pruning operation $P_A$.</p>
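<p>A minimal sketch of this level-by-level scan, assuming the tree is exposed through a <code>children</code> function and a per-node <code>invariant</code> function (both hypothetical names), and ignoring automorphism pruning and experimental paths:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">def breadth_first_best(root, children, invariant):
    """Scan level by level, keeping only the nodes with the greatest invariant."""
    level = [root]
    while True:
        next_level = [child for node in level for child in children(node)]
        if not next_level:
            return level  # nothing left to expand: these are the surviving leaves
        top = max(invariant(node) for node in next_level)
        level = [node for node in next_level if invariant(node) == top]


# Toy tree: nodes are vertex sequences, invariants are made-up integers.
tree = {(): [(0,), (1,)], (0,): [(0, 2), (0, 3)], (1,): [(1, 2), (1, 3)]}
score = {(0,): 5, (1,): 3, (0, 2): 7, (0, 3): 7, (1, 2): 9, (1, 3): 9}
print(breadth_first_best((), lambda n: tree.get(n, []), score.get))  # [(0, 2), (0, 3)]
</code></pre></div>
<p>The subtree under <code>(1,)</code> is never visited even though its children carry larger made-up scores: the invariant is compared level by level, so a node that loses at level 1 cannot lead to the best leaf.</p>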
<p>To compensate for the fact that breadth-first search delays automorphism discovery (which requires leaves), Traces generates <strong>experimental paths</strong>: random paths from each node down to a leaf. Random experimental paths tend to find automorphisms generating larger subgroups, making more of the group available early for pruning. Both programs maintain discovered automorphisms using the <a href="https://en.wikipedia.org/wiki/Schreier%E2%80%93Sims_algorithm">random Schreier method</a> for efficient orbit computation.</p>
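<p>Orbit computation itself is simple once generators are known. The sketch below is a plain union-find over the generators, not the random Schreier method (which additionally maintains stabilizer information for deeper nodes); the function name is an assumption made here.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">def orbits(n, generators):
    """Partition 0..n-1 into orbits of the group generated by the given permutations."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for g in generators:
        for v in range(n):
            parent[find(v)] = find(g[v])  # v and its image g[v] share an orbit

    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


# The path 0 - 1 - 2 - 3 has one non-trivial automorphism: the reflection.
reflection = [3, 2, 1, 0]
print(orbits(4, [reflection]))  # [[0, 3], [1, 2]]
</code></pre></div>
<p>At the root of the search tree, only one vertex from each orbit inside the target cell needs to be explored, since the subtrees of the others are images of it under an automorphism.</p>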
<h3 id="low-degree-vertex-handling">Low-Degree Vertex Handling</h3>
<p>Traces includes special handling for vertices of degree 0, 1, 2, or $n-1$. After the initial refinement, vertices with equal colours also have equal degrees. The target cell selector never selects cells containing vertices of these degrees, and nodes whose non-trivial cells consist only of such vertices are not expanded further. Instead, special-purpose code produces generators for the automorphism group fixed by that node and, if needed, a unique discrete colouring. This technique is effective for graphs with many small components and tree-like structures (as in constraint satisfaction problems), though the authors note that such graphs could also benefit from preprocessing that factors out tree-like appendages and replaces vertices with identical neighborhoods.</p>
<h3 id="automorphism-detection">Automorphism Detection</h3>
<p>Beyond leaf comparison, saucy introduced early detection of automorphisms higher in the search tree by checking whether partial mappings between equivalent colourings extend trivially. Traces extends this idea with a heuristic that attempts non-trivial extensions. When computing only the automorphism group (not canonical labeling), Traces employs a strategy where it finds all discrete children of one node and then checks each remaining node for a single matching discrete child, further reducing search effort.</p>
<h2 id="performance-benchmarks">Performance Benchmarks</h2>
<p>The authors compare nauty 2.5, Traces 2.0, saucy 3.0, Bliss 7.2, and conauto 2.0.1 on a MacBook Pro with a 2.66 GHz Intel i7 processor. All graphs were randomly labeled before processing to avoid artifacts from input ordering. The benchmark covers both automorphism group computation and canonical labeling.</p>
<table>
  <thead>
      <tr>
          <th>Graph Family</th>
          <th>Best Program(s)</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Random graphs ($p = 1/2$)</td>
          <td>nauty, Traces</td>
          <td>All programs fast; easy class</td>
      </tr>
      <tr>
          <td>Random graphs ($p = n^{-1/2}$)</td>
          <td>nauty</td>
          <td>Sparse random graphs</td>
      </tr>
      <tr>
          <td>Random cubic graphs</td>
          <td>nauty (with invariant)</td>
          <td>nauty benefits from distance invariant</td>
      </tr>
      <tr>
          <td><a href="https://en.wikipedia.org/wiki/Hypercube_graph">Hypercubes</a></td>
          <td>Traces</td>
          <td>Vertex-transitive; Traces dramatically faster</td>
      </tr>
      <tr>
          <td>Misc. vertex-transitive</td>
          <td>Traces</td>
          <td>Large automorphism groups</td>
      </tr>
      <tr>
          <td>Unions of tripartite graphs</td>
          <td>conauto, Bliss</td>
          <td>Special handling for disjoint components</td>
      </tr>
      <tr>
          <td>Small strongly-regular graphs</td>
          <td>Traces, nauty</td>
          <td>Both competitive</td>
      </tr>
      <tr>
          <td>Large strongly-regular graphs</td>
          <td>Traces</td>
          <td>Orders of magnitude faster</td>
      </tr>
      <tr>
          <td>Hadamard matrix graphs</td>
          <td>Traces</td>
          <td>Among the hardest known classes</td>
      </tr>
      <tr>
          <td>Random trees</td>
          <td>nauty</td>
          <td>Low-degree preprocessing helps</td>
      </tr>
      <tr>
          <td>Cai-Furer-Immerman graphs</td>
          <td>Traces</td>
          <td>Designed to defeat refinement; Traces still efficient</td>
      </tr>
      <tr>
          <td>Miyazaki graphs</td>
          <td>Traces</td>
          <td>Another hard class; dramatic advantage</td>
      </tr>
      <tr>
          <td><a href="https://en.wikipedia.org/wiki/Projective_plane">Projective planes</a> (order 16)</td>
          <td>Traces</td>
          <td>Large automorphism groups on bipartite graphs</td>
      </tr>
      <tr>
          <td>Combinatorial graphs</td>
          <td>Mixed</td>
          <td>Performance varies by instance; Traces generally competitive</td>
      </tr>
  </tbody>
</table>
<p>The results show that nauty is generally fastest for small graphs and some easier families, while Traces dominates on most difficult graph classes, sometimes by orders of magnitude. The breadth-first tree scanning strategy of Traces, combined with its richer node invariant, provides the largest gains on graphs with complex symmetry structure (<a href="https://en.wikipedia.org/wiki/Strongly_regular_graph">strongly-regular graphs</a>, <a href="https://en.wikipedia.org/wiki/Hadamard_matrix">Hadamard matrix</a> graphs, <a href="https://en.wikipedia.org/wiki/Vertex-transitive_graph">vertex-transitive graphs</a>). The exception is graph families with many disjoint or minimally-overlapping components, where conauto and Bliss have specialized handling that nauty and Traces lack.</p>
<h2 id="key-findings-and-limitations">Key Findings and Limitations</h2>
<p>The paper establishes several findings:</p>
<ol>
<li>The breadth-first tree scanning approach in Traces, combined with experimental paths for early automorphism discovery, provides large efficiency gains on difficult graph classes.</li>
<li>Traces&rsquo; richer node invariant (the trace) enables early pruning during incomplete refinement, reducing dependence on user-selected invariant functions compared to nauty.</li>
<li>No single program dominates all graph classes. nauty remains preferred for mass processing of small graphs.</li>
<li>The random Schreier method for maintaining the automorphism group is effective in both programs, enabling more complete pruning via orbit computation.</li>
</ol>
<p>Limitations acknowledged by the authors include: nauty and Traces lack specialized code for graphs consisting of disjoint or minimally-overlapping components (where conauto and Bliss excel), and the choice of refinement function in nauty still requires user expertise for certain difficult graph classes.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Benchmarking</td>
          <td>Bliss test collection</td>
          <td>Multiple families</td>
          <td>Graphs ranging from easy to very difficult</td>
      </tr>
      <tr>
          <td>Benchmarking</td>
          <td>nauty/Traces website collection</td>
          <td>Multiple families</td>
          <td>All test graphs available at the project website</td>
      </tr>
  </tbody>
</table>
<p>All test graphs are publicly available at the nauty and Traces website. Graphs were randomly labeled before processing to avoid non-typical behavior from input labeling.</p>
<h3 id="algorithms">Algorithms</h3>
<p>The core algorithms are described formally with proofs of correctness (Theorem 5 guarantees pruning validity). Key implementation choices:</p>
<ul>
<li><strong>Refinement</strong>: Equitable colouring via Algorithm 1 (iterated cell splitting by adjacency counts)</li>
<li><strong>Target cell selection</strong>: nauty uses first non-singleton or most non-trivially joined cell; Traces uses first largest cell within parent&rsquo;s target</li>
<li><strong>Tree scanning</strong>: nauty uses depth-first; Traces uses breadth-first with experimental paths</li>
<li><strong>Group maintenance</strong>: Random Schreier method for orbit computation in both programs</li>
</ul>
<h3 id="software">Software</h3>
<table>
  <thead>
      <tr>
          <th>Program</th>
          <th>Version</th>
          <th>Canonical Labeling</th>
          <th>Open Source</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>nauty</td>
          <td>2.5</td>
          <td>Yes</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>Traces</td>
          <td>2.0</td>
          <td>Yes</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>saucy</td>
          <td>3.0</td>
          <td>No (v3.0)</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>Bliss</td>
          <td>7.2</td>
          <td>Yes</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>conauto</td>
          <td>2.0.1</td>
          <td>No</td>
          <td>Yes</td>
      </tr>
  </tbody>
</table>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th>Artifact</th>
          <th>Type</th>
          <th>License</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="http://pallini.di.uniroma1.it/">nauty and Traces</a></td>
          <td>Code</td>
          <td>Apache 2.0</td>
          <td>Official distribution (v2.9.3 as of 2026); includes gtools graph utilities</td>
      </tr>
      <tr>
          <td><a href="http://pallini.di.uniroma1.it/">Test graphs</a></td>
          <td>Dataset</td>
          <td>Apache 2.0</td>
          <td>All benchmark graphs from the paper, available at the project website</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<p>Benchmarks run on a MacBook Pro with 2.66 GHz Intel i7 processor, compiled with gcc 4.7, single-threaded execution.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: McKay, B. D., &amp; Piperno, A. (2013). Practical graph isomorphism, II. <em>Journal of Symbolic Computation</em>, 60, 94-112.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{mckay2013practical,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Practical graph isomorphism, {II}}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{McKay, Brendan D. and Piperno, Adolfo}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Journal of Symbolic Computation}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{60}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{94--112}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2013}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Elsevier BV}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1016/j.jsc.2013.09.003}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Kabsch-Horn Cookbook: Differentiable Alignment</title><link>https://hunterheidenreich.com/projects/kabsch-horn-cookbook/</link><pubDate>Fri, 20 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/kabsch-horn-cookbook/</guid><description>Differentiable Kabsch (SVD) and Horn (quaternion) alignment for NumPy, PyTorch, JAX, TensorFlow, and MLX with gradient-safe SVD.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>Aligning two sets of corresponding points, finding the optimal rotation (and optionally translation and scale) that maps one onto the other, is a fundamental operation across scientific computing. It appears in molecular dynamics (superimposing protein conformations), robotics (sensor registration), and computer vision (shape matching). The two dominant algorithm families are the Kabsch (SVD-based) method and the Horn (quaternion-based) method.</p>
<p>The <strong>Kabsch-Horn Cookbook</strong> is a Python library that implements both algorithm families across five numerical frameworks: NumPy, PyTorch, JAX, TensorFlow, and MLX. Every backend shares the same API and supports per-point weights and arbitrary batch dimensions. The Kabsch (SVD) path accepts N-dimensional point sets on every backend except MLX, which is limited to 3D inputs. The PyTorch, JAX, TensorFlow, and MLX backends are fully differentiable, with custom autograd rules that bypass the numerically unstable gradient of the standard SVD near degenerate singular values.</p>
<h2 id="features">Features</h2>
<h3 id="algorithms">Algorithms</h3>
<ul>
<li><strong>Kabsch</strong>: SVD-based optimal rotation for rigid alignment</li>
<li><strong>Kabsch-Umeyama</strong>: Kabsch with an additional optimal scaling factor $c$, solving $Q \approx cRP + t$</li>
<li><strong>Horn</strong>: Quaternion-based optimal rotation via the eigendecomposition of a $4 \times 4$ key matrix</li>
<li><strong>Horn + Scale</strong>: Horn&rsquo;s method extended with optimal isotropic scaling</li>
<li><strong>RMSD Wrappers</strong>: Convenience functions that return RMSD directly alongside the alignment parameters</li>
</ul>
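<p>For orientation, here is a minimal NumPy sketch of the Kabsch-Umeyama solution to $Q \approx cRP + t$ for a single pair of $(N, D)$ point sets. It is written from the standard SVD formulation, not copied from the library; the function name <code>kabsch_umeyama</code> and its return order are assumptions made for this example, and weights, batching, and safe gradients are omitted.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import numpy as np


def kabsch_umeyama(P, Q, with_scale=False):
    """Find c, R, t minimising sum ||c R p_i + t - q_i||^2 for (N, D) arrays P, Q."""
    mu_P, mu_Q = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - mu_P, Q - mu_Q
    H = Qc.T @ Pc / len(P)                   # (D, D) cross-covariance
    U, S, Vt = np.linalg.svd(H)
    d = np.ones(P.shape[1])
    d[-1] = np.sign(np.linalg.det(U @ Vt))   # reflection guard: force det(R) = +1
    R = (U * d) @ Vt
    c = (S * d).sum() / Pc.var(axis=0).sum() if with_scale else 1.0
    t = mu_Q - c * R @ mu_P
    return c, R, t


# Synthetic check: rotate about z, scale by 2, translate, then recover.
rng = np.random.default_rng(0)
P = rng.normal(size=(50, 3))
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta), np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
Q = 2.0 * P @ R_true.T + np.array([1.0, -2.0, 0.5])

c, R, t = kabsch_umeyama(P, Q, with_scale=True)
print(np.allclose(c * P @ R.T + t, Q))
</code></pre></div>
<p>The sign correction on the last singular direction is what separates a proper rotation from a reflection; without it, mirrored inputs would yield an orthogonal matrix with determinant $-1$.</p>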
<h3 id="framework-support">Framework Support</h3>
<table>
  <thead>
      <tr>
          <th>Framework</th>
          <th style="text-align: center">Differentiable</th>
          <th style="text-align: center">Compile/JIT</th>
          <th>Versions</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>NumPy</td>
          <td style="text-align: center"></td>
          <td style="text-align: center"></td>
          <td>1.24+</td>
      </tr>
      <tr>
          <td>PyTorch</td>
          <td style="text-align: center">Yes</td>
          <td style="text-align: center"><code>torch.compile</code></td>
          <td>2.0+</td>
      </tr>
      <tr>
          <td>JAX</td>
          <td style="text-align: center">Yes</td>
          <td style="text-align: center"><code>jax.jit</code></td>
          <td>0.4+</td>
      </tr>
      <tr>
          <td>TensorFlow</td>
          <td style="text-align: center">Yes</td>
          <td style="text-align: center"></td>
          <td>2.13+</td>
      </tr>
      <tr>
          <td>MLX</td>
          <td style="text-align: center">Yes</td>
          <td style="text-align: center"></td>
          <td>0.1+</td>
      </tr>
  </tbody>
</table>
<p><code>torch.compile</code> and <code>jax.jit</code> are the tested compile/JIT paths. MLX supports 3D inputs only; the Kabsch (SVD) path is N-dimensional on the other four backends.</p>
<h3 id="numerical-robustness">Numerical Robustness</h3>
<p>Standard SVD and eigendecomposition backward passes produce <code>NaN</code> gradients when singular values collide or are near-zero. The library provides custom autograd primitives for these cases, along with several input-handling features:</p>
<ul>
<li><strong>SafeSVD</strong> (PyTorch, JAX, TF, MLX): Custom backward pass that clamps the singular value gap, preventing division-by-zero in the gradient</li>
<li><strong>SafeEigh</strong> (PyTorch, JAX, TF, MLX): Analogous safe backward for the symmetric eigendecomposition used in Horn&rsquo;s method</li>
<li><strong>Per-point weights</strong>: Weighted centroids and weighted cross-covariance for mass-weighted or confidence-weighted alignment</li>
<li><strong>Batch dimensions</strong>: All functions broadcast over leading batch dimensions without explicit loops</li>
<li><strong>Mixed-dtype promotion</strong>: Inputs are promoted to a common floating-point dtype automatically</li>
</ul>
<h3 id="testing">Testing</h3>
<p>The test suite uses Hypothesis-based property testing across 13 modules covering:</p>
<ul>
<li>Round-trip correctness (align then compare)</li>
<li>Gradient finiteness and correctness (finite-difference checks)</li>
<li>Reflection handling (proper vs. improper rotations)</li>
<li>Weighted alignment consistency</li>
<li>Batch broadcasting</li>
<li>4 differentiable backends $\times$ 4 precisions (float32, float64, and where supported, float16, bfloat16)</li>
</ul>
<h2 id="usage">Usage</h2>
<p>This is a reference cookbook, so you can copy the framework folder you need from <code>src/kabsch_horn/&lt;framework&gt;/</code> directly into your project (the code has no runtime dependencies beyond the framework itself). To depend on it instead, install a pinned version from GitHub:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>pip install <span style="color:#e6db74">&#34;git+https://github.com/hunter-heidenreich/Kabsch-Cookbook.git@v0.4.1&#34;</span>
</span></span></code></pre></div><p>Basic alignment with NumPy:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> numpy <span style="color:#66d9ef">as</span> np
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> kabsch_horn <span style="color:#f92672">import</span> numpy <span style="color:#66d9ef">as</span> kh
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Two sets of corresponding 3D points</span>
</span></span><span style="display:flex;"><span>P <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>randn(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">3</span>)
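</span></span><span style="display:flex;"><span>R_true <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>qr(np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>randn(<span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">3</span>))[<span style="color:#ae81ff">0</span>]  <span style="color:#75715e"># random orthogonal matrix (det may be -1)</span>
</span></span><span style="display:flex;"><span>R_true[:, <span style="color:#ae81ff">0</span>] <span style="color:#f92672">*=</span> np<span style="color:#f92672">.</span>sign(np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(R_true))  <span style="color:#75715e"># flip one column if needed so det = +1 (proper rotation)</span>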
</span></span><span style="display:flex;"><span>Q <span style="color:#f92672">=</span> (P <span style="color:#f92672">@</span> R_true<span style="color:#f92672">.</span>T) <span style="color:#f92672">+</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>randn(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">3</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>R, t, rmsd <span style="color:#f92672">=</span> kh<span style="color:#f92672">.</span>kabsch(P, Q)
</span></span><span style="display:flex;"><span>aligned <span style="color:#f92672">=</span> P <span style="color:#f92672">@</span> R<span style="color:#f92672">.</span>T <span style="color:#f92672">+</span> t
</span></span></code></pre></div><p>RMSD loss for training in PyTorch:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> torch
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> kabsch_horn <span style="color:#f92672">import</span> pytorch <span style="color:#66d9ef">as</span> kh
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>pred_coords <span style="color:#f92672">=</span> model(input_features)   <span style="color:#75715e"># (B, N, 3), requires_grad=True</span>
</span></span><span style="display:flex;"><span>target_coords <span style="color:#f92672">=</span> batch[<span style="color:#e6db74">&#34;target&#34;</span>]       <span style="color:#75715e"># (B, N, 3)</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>rmsd <span style="color:#f92672">=</span> kh<span style="color:#f92672">.</span>kabsch_rmsd(pred_coords, target_coords)  <span style="color:#75715e"># (B,)</span>
</span></span><span style="display:flex;"><span>loss <span style="color:#f92672">=</span> rmsd<span style="color:#f92672">.</span>mean()
</span></span><span style="display:flex;"><span>loss<span style="color:#f92672">.</span>backward()  <span style="color:#75715e"># safe gradients via SafeSVD</span>
</span></span></code></pre></div><p>For the full API reference and additional examples, see the <a href="https://hunter-heidenreich.github.io/Kabsch-Cookbook/">documentation site</a>.</p>
<h2 id="results">Results</h2>
<h3 id="gradient-stability">Gradient Stability</h3>
<p>The standard SVD backward pass computes terms of the form $\frac{1}{\sigma_i^2 - \sigma_j^2}$, which diverge as two singular values approach each other. In molecular alignment this happens frequently: planar molecules, symmetric structures, and noisy coordinates can all produce near-degenerate singular values. The SafeSVD primitive floors the magnitude of that denominator at the dtype&rsquo;s machine epsilon (<code>finfo(dtype).eps</code>), producing finite (if slightly biased) gradients in these edge cases. Property-based tests confirm that gradients remain finite across thousands of random rotations, scales, and noise levels for all four differentiable backends.</p>
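<p>The flooring idea can be shown in isolation with NumPy. This is a sketch of the clamping step only, under the assumption that the sign of the gap is preserved; it is not the library&rsquo;s SafeSVD code, and the function name is illustrative.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">import numpy as np


def safe_inverse_gaps(sigma):
    """Return F[i, j] = 1 / (sigma[i]^2 - sigma[j]^2) with |denominator| floored at eps."""
    eps = np.finfo(sigma.dtype).eps
    gaps = sigma[:, None] ** 2 - sigma[None, :] ** 2
    # Keep the sign of each gap but never let its magnitude fall below eps.
    safe = np.copysign(np.maximum(np.abs(gaps), eps), gaps)
    F = 1.0 / safe
    np.fill_diagonal(F, 0.0)  # the i == j terms do not appear in the SVD gradient
    return F


sigma = np.array([1.0, 1.0, 0.5])  # two colliding singular values
F = safe_inverse_gaps(sigma)
print(np.isfinite(F).all())
</code></pre></div>
<p>With the naive formula, the colliding pair would produce a division by zero; here the corresponding entries are large but finite, which is the trade-off of finite but slightly biased gradients described above.</p>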
<h3 id="framework-parity">Framework Parity</h3>
<p>All five backends produce numerically equivalent results (up to floating-point tolerance) on the same inputs. The shared API means switching from NumPy prototyping to PyTorch training requires changing only the import path.</p>
<h2 id="related-work">Related Work</h2>
<p>This project builds on the foundational alignment algorithms described in these papers:</p>
<ul>
<li><a href="/notes/biology/computational-biology/kabsch-algorithm/">Kabsch (1976)</a>: the original SVD-based rotation alignment</li>
<li><a href="/notes/biology/computational-biology/arun-svd-point-fitting/">Arun et al. (1987)</a>: SVD formulation for 3D point set fitting</li>
<li><a href="/notes/biology/computational-biology/horn-absolute-orientation/">Horn (1987)</a>: quaternion-based closed-form absolute orientation</li>
<li><a href="/notes/biology/computational-biology/horn-orthonormal-matrices/">Horn et al. (1988)</a>: orthonormal matrix (polar decomposition) approach</li>
<li><a href="/notes/biology/computational-biology/umeyama-similarity-transformation/">Umeyama (1991)</a>: extension to include optimal scaling</li>
</ul>
<p>For a detailed walkthrough of the Kabsch algorithm with code examples, see the companion blog post: <a href="/posts/kabsch-algorithm/">The Kabsch Algorithm</a>.</p>
]]></content:encoded></item><item><title>Second-Order Langevin Equation for Field Simulations</title><link>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/second-order-langevin-1987/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/molecular-simulation/classical-methods/second-order-langevin-1987/</guid><description>Hyperbolic Algorithm adds second-order derivatives to Langevin dynamics, reducing systematic errors to O(ε²) for lattice field simulations.</description><content:encoded><![CDATA[<h2 id="contribution-and-paper-type">Contribution and Paper Type</h2>
<p>This is a <strong>Methodological Paper</strong> ($\Psi_{\text{Method}}$). It proposes a novel stochastic algorithm, the Hyperbolic Algorithm (HA), and validates its superior efficiency against the existing Langevin Algorithm (LA) through formal error analysis and numerical simulation. It contains significant theoretical derivation (Liouville dynamics) that serves primarily to justify the algorithmic performance claims.</p>
<h2 id="motivation-and-gaps-in-prior-work">Motivation and Gaps in Prior Work</h2>
<p>The standard Langevin Algorithm (LA) for numerical simulation of Euclidean field theories suffers from efficiency bottlenecks. The simplest Euler discretization of the LA introduces systematic errors of $O(\epsilon)$, where $\epsilon$ is the step size. To maintain accuracy, $\epsilon$ must be kept small, so more updates are needed to decorrelate successive configurations: the sweep-to-sweep correlation time (autocorrelation time) grows, making simulations computationally expensive.</p>
<h2 id="core-novelty-second-order-dynamics">Core Novelty: Second-Order Dynamics</h2>
<p>The core contribution is the introduction of a <strong>second-order derivative in fictitious time</strong> to the stochastic equation. This converts the parabolic Langevin equation into a hyperbolic equation:</p>
<p>$$
\begin{aligned}
\frac{\partial^{2}\phi}{\partial t^{2}}+\gamma\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta
\end{aligned}
$$</p>
<h3 id="equation-comparison">Equation Comparison</h3>
<p>The key difference from the standard (first-order) Langevin equation:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Equation Type</th>
          <th style="text-align: left">Formula</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Hyperbolic (Second Order)</strong></td>
          <td style="text-align: left">$$\frac{\partial^{2}\phi}{\partial t^{2}}+\gamma\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta$$</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Langevin (First Order)</strong></td>
          <td style="text-align: left">$$\frac{\partial\phi}{\partial t}=-\frac{\partial S}{\partial\phi}+\eta$$</td>
      </tr>
  </tbody>
</table>
<p>The standard Langevin equation corresponds to the overdamped limit where the acceleration term is absent. Physically, the hyperbolic equation can be viewed as microcanonical equations of motion with added friction ($\gamma\,\partial\phi/\partial t$) and noise ($\eta$) terms.</p>
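<p>Writing $\pi = \partial\phi/\partial t$ makes the microcanonical reading explicit. The first-order system below is only a restatement of the hyperbolic equation above, with nothing assumed beyond that substitution:</p>
<p>$$
\begin{aligned}
\frac{\partial\phi}{\partial t} &amp;= \pi \\
\frac{\partial\pi}{\partial t} &amp;= -\frac{\partial S}{\partial\phi} - \gamma\pi + \eta
\end{aligned}
$$</p>
<p>Setting $\gamma = 0$ and $\eta = 0$ leaves Hamilton&rsquo;s equations for $H = \frac{1}{2}\sum\pi^2 + S[\phi]$, which is the microcanonical dynamics; the $-\gamma\pi$ term supplies the friction.</p>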
<h3 id="key-innovations">Key Innovations</h3>
<ul>
<li><strong>Higher Order Accuracy</strong>: The simplest discretization of this equation leads to systematic errors of only $O(\epsilon^2)$ compared to $O(\epsilon)$ for LA.</li>
<li><strong>Tunable Damping</strong>: The addition of the damping parameter $\gamma$ allows tuning to minimize autocorrelation tails.</li>
<li><strong>Uniform Evolution</strong>: The method evolves structures of different wavelengths more uniformly than LA due to the specific dissipation structure.</li>
</ul>
<h2 id="methodology-and-experiments">Methodology and Experiments</h2>
<p>The author validated the method using the <strong>XY Model</strong> on 2D lattices.</p>
<ul>
<li><strong>System</strong>: Euclidean action $S = -\sum_{x,\mu} \cos(\theta_{x+\mu} - \theta_x)$.</li>
<li><strong>Setup</strong>:
<ul>
<li>Lattice sizes: $15^2$ (helical boundary conditions) and $30^2$.</li>
<li>$\beta$ range: 0.9 to 1.2 (crossing the critical point $\approx 1.0$).</li>
<li>Run length: &gt;100,000 updates in equilibrium.</li>
</ul>
</li>
<li><strong>Metrics</strong>:
<ul>
<li><strong>Autocorrelation time ($\tau$)</strong>: Defined as the number of updates for the time-correlation function to drop to 10% of its initial value.</li>
<li><strong>Systematic Error</strong>: Measured via deviation of average action from Monte Carlo values.</li>
</ul>
</li>
</ul>
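<p>The 10% criterion for $\tau$ can be sketched in a few lines of pure Python. The function name, the simple estimator of the autocorrelation function, and the synthetic AR(1) test series are all illustrative assumptions; the paper&rsquo;s own measurement procedure may differ in detail.</p>

```python
import random


def autocorrelation_time(series, threshold=0.1, max_lag=None):
    """Smallest lag at which the normalized autocorrelation falls to `threshold` or below.

    Returns None if that never happens within max_lag.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered) / n
    if max_lag is None:
        max_lag = n // 2
    for lag in range(1, max_lag + 1):
        cov = sum(centered[t] * centered[t + lag] for t in range(n - lag)) / (n - lag)
        if threshold >= cov / variance:
            return lag
    return None


# Synthetic AR(1) series x_{t+1} = a * x_t + noise, whose autocorrelation decays as a**lag.
rng = random.Random(0)
a = 0.9
x, series = 0.0, []
for _ in range(50_000):
    x = a * x + rng.gauss(0.0, 1.0)
    series.append(x)

tau = autocorrelation_time(series, max_lag=200)
print(tau)  # should land close to log(0.1) / log(0.9), roughly 22
```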
<h2 id="results-and-conclusions">Results and Conclusions</h2>
<ul>
<li><strong>Efficiency</strong>: The Hyperbolic Algorithm (HA) is far more efficient than LA. For equal systematic errors, its sweep-sweep correlation times are significantly lower.</li>
<li><strong>Error Scaling</strong>: Numerical results confirmed that an HA step size of $\epsilon_H = 0.1$ yields systematic errors comparable to those of an LA step size of $\epsilon_L \approx 0.008$, consistent with $O(\epsilon^2)$ vs $O(\epsilon)$ scaling.</li>
<li><strong>Speedup</strong>: In the disordered phase, HA is roughly $\epsilon_H / \epsilon_L$ times faster (approximately a factor of 12.5 for $\epsilon_H = 0.1$, $\epsilon_L = 0.008$). In the ordered phase, efficiency gains increase with distance scale, reaching factors of 20 or more for long-range correlations.</li>
<li><strong>Optimal Damping</strong>: For the XY model, the optimal damping parameter was found to be $\gamma \approx 0.4$.</li>
</ul>
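<p>The $O(\epsilon)$ vs $O(\epsilon^2)$ scaling can be checked without any sampling noise on a toy problem. The sketch below assumes a single harmonic mode, $S = \frac{1}{2}\omega^2\phi^2$, which is not the XY model used in the paper, and iterates the exact second-moment recursions of the two discretized updates until they reach their stationary values. The LA noise variance $2\epsilon/\beta$ is an assumption, chosen so that the continuum equilibrium is $e^{-\beta S}$. The value $\gamma = 0.4$ is borrowed from the XY result above and has no special meaning for this toy action.</p>

```python
def la_variance(eps, omega=1.0, beta=1.0, n_iter=20_000):
    """Stationary phi-variance of the Euler-discretized Langevin update."""
    a = 1.0 - eps * omega**2
    var = 0.0
    for _ in range(n_iter):
        # phi' = a * phi + sqrt(2 eps / beta) * xi
        var = a * a * var + 2.0 * eps / beta
    return var


def ha_variance(eps, gamma=0.4, omega=1.0, beta=1.0, n_iter=20_000):
    """Stationary phi-variance of the HA update for the same action."""
    a, b, c2 = 1.0 - eps * gamma, eps * omega**2, 2.0 * eps * gamma / beta
    A = B = C = 0.0  # E[phi^2], E[phi * pi], E[pi^2]
    for _ in range(n_iter):
        C_new = a * a * C - 2.0 * a * b * B + b * b * A + c2  # pi' = a pi - b phi + noise
        cross = a * B - b * A                                  # E[phi * pi']
        # phi' = phi + eps * pi'
        A, B, C = A + 2.0 * eps * cross + eps**2 * C_new, cross + eps * C_new, C_new
    return A


exact = 1.0  # 1 / (beta * omega^2) with beta = omega = 1
for eps in (0.1, 0.05, 0.025):
    print(eps, la_variance(eps) / exact - 1.0, ha_variance(eps) / exact - 1.0)
```

<p>In this toy setting, halving $\epsilon$ roughly halves the LA error and roughly quarters the HA error. The absolute numbers are specific to the harmonic mode and are not expected to reproduce the paper&rsquo;s $\epsilon_H = 0.1 \leftrightarrow \epsilon_L \approx 0.008$ correspondence.</p>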
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="algorithms">Algorithms</h3>
<p><strong>1. The Hyperbolic Algorithm (HA)</strong></p>
<p>The discretized update equations for scalar fields are:</p>
<p>$$
\begin{aligned}
\pi_{t+\epsilon} - \pi_{t} &amp;= -\epsilon\gamma\pi_{t} - \epsilon\frac{\partial S}{\partial\phi_{t}} + \sqrt{2\epsilon\gamma/\beta}\xi_{t} \\
\phi_{t+\epsilon} - \phi_{t} &amp;= \epsilon\pi_{t+\epsilon}
\end{aligned}
$$</p>
<ul>
<li><strong>Variables</strong>: $\phi$ is the field, $\pi$ is the conjugate momentum ($\dot{\phi}$).</li>
<li><strong>Parameters</strong>: $\epsilon$ (step size), $\gamma$ (damping constant).</li>
<li><strong>Noise</strong>: $\xi$ is Gaussian noise with $\langle\xi_x \xi_y\rangle = \delta_{x,y}$.</li>
<li><strong>Storage</strong>: Requires storing both $\phi$ and $\pi$ vectors.</li>
</ul>
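<p>The update equations above translate into code fairly directly. The following is a minimal pure-Python sketch for the XY action on an $L \times L$ lattice, not the author&rsquo;s implementation: the function names are hypothetical, and periodic boundary conditions are assumed for simplicity (the paper&rsquo;s $15^2$ lattice used helical boundaries).</p>

```python
import math
import random


def xy_grad(theta):
    """dS/dtheta for S = -sum cos(theta_{x+mu} - theta_x) on a periodic L x L lattice."""
    L = len(theta)
    grad = [[0.0] * L for _ in range(L)]
    for i in range(L):
        for j in range(L):
            t = theta[i][j]
            # Negative indices wrap around in Python, giving the periodic neighbors.
            neighbors = (theta[(i + 1) % L][j], theta[i - 1][j],
                         theta[i][(j + 1) % L], theta[i][j - 1])
            grad[i][j] = sum(math.sin(t - nb) for nb in neighbors)
    return grad


def ha_step(theta, pi, eps, gamma, beta, rng):
    """One in-place HA update: momentum kick with friction and noise, then drift."""
    L = len(theta)
    grad = xy_grad(theta)  # evaluated at phi_t for every site before any update
    noise_scale = math.sqrt(2.0 * eps * gamma / beta)
    for i in range(L):
        for j in range(L):
            pi[i][j] += (-eps * gamma * pi[i][j] - eps * grad[i][j]
                         + noise_scale * rng.gauss(0.0, 1.0))
            theta[i][j] += eps * pi[i][j]  # uses the freshly updated momentum


rng = random.Random(0)
L = 8
theta = [[rng.uniform(-math.pi, math.pi) for _ in range(L)] for _ in range(L)]
pi = [[0.0] * L for _ in range(L)]
for _ in range(100):
    ha_step(theta, pi, eps=0.1, gamma=0.4, beta=1.0, rng=rng)
```

<p>The field update uses $\pi_{t+\epsilon}$, not $\pi_t$, matching the second equation above; this ordering is what makes the scheme leapfrog-like.</p>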
<p><strong>2. Non-Abelian Generalization</strong></p>
<p>For Lie group elements $U$ with generators $T^a$:</p>
<p>$$
\begin{aligned}
\pi_{t+\epsilon}^a - \pi_{t}^a &amp;= -\epsilon\gamma\pi_{t}^a - \epsilon\delta^a S[U_t] + \sqrt{2\epsilon\gamma/\beta}\xi_{t}^a \\
U_{t+\epsilon} &amp;= e^{i\epsilon\pi_{t+\epsilon}^a T^a} U_t
\end{aligned}
$$</p>
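<p>For a concrete instance of the group update, the sketch below takes $SU(2)$ with generators $T^a = \sigma^a/2$ (Pauli matrices). That normalization is a convention assumed here and is not stated in the text. With it, the exponential has the closed form $\cos\alpha\,\mathbb{1} + i\sin\alpha\,(\hat{n}\cdot\sigma)$, where $\alpha = \epsilon|\pi|/2$ and $\hat{n} = \pi/|\pi|$. The function names are hypothetical.</p>

```python
import math


def su2_exp(pi_vec, eps):
    """Closed form of exp(i * eps * pi^a * T^a) with T^a = sigma^a / 2."""
    norm = math.sqrt(sum(p * p for p in pi_vec))
    if norm == 0.0:
        return [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    n1, n2, n3 = (p / norm for p in pi_vec)
    alpha = 0.5 * eps * norm
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c + 1j * s * n3, 1j * s * (n1 - 1j * n2)],
            [1j * s * (n1 + 1j * n2), c - 1j * s * n3]]


def matmul2(X, Y):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def group_step(U, pi_vec, eps):
    """U_{t+eps} = exp(i eps pi^a T^a) U_t, applied after the momentum update."""
    return matmul2(su2_exp(pi_vec, eps), U)


identity = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
U = group_step(identity, (0.3, -0.2, 0.5), eps=0.1)
```

<p>Because the update multiplies by an exact group element, $U$ stays unitary with unit determinant up to rounding, with no reprojection step.</p>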
<h3 id="theoretical-proof-of-oepsilon2-accuracy">Theoretical Proof of $O(\epsilon^2)$ Accuracy</h3>
<p>The derivation relies on the generalized Liouville equation for the probability distribution $P[\phi, \pi; t]$.</p>
<ol>
<li><strong>Transition Probability</strong>: The transition probability $W$ for one iteration of the update is written down.</li>
<li><strong>Effective Liouville Operator</strong>: The evolution is written as $P(t+\epsilon) = \exp(\epsilon L_{\text{eff}}) P(t)$.</li>
<li><strong>Baker-Hausdorff Expansion</strong>: Using normal ordering of operators, the equilibrium distribution $P_{\text{eq}}$ is derived through $O(\epsilon^2)$:</li>
</ol>
<p>$$
\begin{aligned}
P_{\text{eq}} &amp;= \exp\left\lbrace-\frac{1}{2}\beta_{1}\sum_{x}\pi_{x}^{2} - \beta S[\phi] + \frac{1}{2}\epsilon\beta\sum_{x}\pi_{x}S_{x} + \epsilon^{2}G + O(\epsilon^3)\right\rbrace
\end{aligned}
$$</p>
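<p>where $\beta_1 = \beta\left(1 - \frac{1}{2}\epsilon\gamma\right)$, $S_x$ denotes $\partial S/\partial\phi_x$, and $G$ collects the remaining $O(\epsilon^2)$ corrections.</p>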
<ol start="4">
<li><strong>Effective Action</strong>: Integrating out $\pi$ yields the effective action for $\phi$:</li>
</ol>
<p>$$
\begin{aligned}
S_{\text{eff}}[\phi] &amp;= S[\phi] - \frac{1}{8}\epsilon^2 \sum_x S_x^2 + \dots
\end{aligned}
$$</p>
<p>No $O(\epsilon)$ term appears in $S_{\text{eff}}$, so the leading systematic error in $\phi$ observables is $O(\epsilon^2)$.</p>
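<p>The $-\frac{1}{8}\epsilon^2$ coefficient can be checked by completing the square in $\pi_x$. The sketch below assumes that $S_x$ denotes $\partial S/\partial\phi_x$ (so it does not depend on $\pi$) and leaves the contribution of $\epsilon^2 G$ inside the ellipsis:</p>
<p>$$
\begin{aligned}
\int d\pi_x \exp\left\lbrace-\frac{1}{2}\beta_1\pi_x^2 + \frac{1}{2}\epsilon\beta\pi_x S_x\right\rbrace &amp;\propto \exp\left\lbrace\frac{\epsilon^2\beta^2}{8\beta_1}S_x^2\right\rbrace \\
&amp;= \exp\left\lbrace\frac{1}{8}\epsilon^2\beta S_x^2 + O(\epsilon^3)\right\rbrace
\end{aligned}
$$</p>
<p>The second line uses $\beta^2/\beta_1 = \beta\left(1 + O(\epsilon)\right)$. Writing the marginal distribution of $\phi$ as $\exp\lbrace-\beta S_{\text{eff}}[\phi]\rbrace$ then gives $S_{\text{eff}} = S - \frac{1}{8}\epsilon^2\sum_x S_x^2 + \dots$, matching the expression above.</p>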
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Model</strong>: XY Model (2D)</li>
<li><strong>Hamiltonian</strong>: $H = \frac{1}{2}\sum \pi^2 + S[\phi]$ where $S = -\sum \cos(\Delta \theta)$.</li>
<li><strong>Observables</strong>:
<ul>
<li>$\Gamma_n = \cos(\theta_{m+n} - \theta_m)$ (averaged over lattice sites $m$).</li>
</ul>
</li>
<li><strong>Comparisons</strong>:
<ul>
<li><strong>LA Step</strong>: $\epsilon_L \approx 0.005 - 0.02$.</li>
<li><strong>HA Step</strong>: $\epsilon_H \approx 0.1 - 0.2$.</li>
<li><strong>Equivalence</strong>: $\epsilon_H = 0.1$ matches error of $\epsilon_L \approx 0.008$.</li>
</ul>
</li>
</ul>
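<p>The observable $\Gamma_n$ can be sketched as follows. Taking the displacement $n$ along the first lattice axis and wrapping periodically are assumptions made for illustration; the function name is hypothetical.</p>

```python
import math


def gamma_n(theta, n):
    """Average of cos(theta_{m+n} - theta_m) over all sites m, shifting along the first axis."""
    L = len(theta)
    total = sum(math.cos(theta[(i + n) % L][j] - theta[i][j])
                for i in range(L) for j in range(L))
    return total / (L * L)


# A fully aligned configuration is perfectly correlated at every separation.
aligned = [[0.7] * 4 for _ in range(4)]
print(gamma_n(aligned, 1))
```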
<hr>
<h2 id="terminology-note">Terminology Note</h2>
<p>The naming conventions in this paper differ from those commonly used in molecular dynamics (MD). The following table provides a cross-field mapping:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Concept</th>
          <th style="text-align: left"><strong>Field Theory (This Paper)</strong></th>
          <th style="text-align: left"><strong>Molecular Dynamics</strong></th>
          <th style="text-align: left"><strong>Mathematics</strong></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Equation 1</strong></td>
          <td style="text-align: left">&ldquo;Langevin Equation&rdquo;</td>
          <td style="text-align: left">Brownian Dynamics (BD)</td>
          <td style="text-align: left">Overdamped Langevin</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Equation 2</strong></td>
          <td style="text-align: left">&ldquo;Hyperbolic Equation&rdquo;</td>
          <td style="text-align: left">Langevin Dynamics (LD)</td>
          <td style="text-align: left">Underdamped Langevin</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Integrator 1</strong></td>
          <td style="text-align: left">Euler Discretization</td>
          <td style="text-align: left">Euler Integrator</td>
          <td style="text-align: left">Euler-Maruyama</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Integrator 2</strong></td>
          <td style="text-align: left">Hyperbolic Algorithm (HA)</td>
          <td style="text-align: left">Velocity Verlet / Leapfrog</td>
          <td style="text-align: left">Quasi-Symplectic Splitting</td>
      </tr>
  </tbody>
</table>
<p><strong>Key insight</strong>: The paper&rsquo;s &ldquo;Hyperbolic Algorithm&rdquo; is mathematically equivalent to Langevin Dynamics with a Leapfrog/Verlet integrator, commonly used in MD. The baseline &ldquo;Langevin Algorithm&rdquo; corresponds to Brownian Dynamics. The term &ldquo;Langevin equation&rdquo; is overloaded: field theorists often use it for overdamped dynamics (no inertia), while chemists assume it includes momentum ($F=ma$).</p>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Horowitz, A. M. (1987). The Second Order Langevin Equation and Numerical Simulations. <em>Nuclear Physics B</em>, 280, 510-522. <a href="https://doi.org/10.1016/0550-3213(87)90159-3">https://doi.org/10.1016/0550-3213(87)90159-3</a></p>
<p><strong>Publication</strong>: Nuclear Physics B 1987</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{horowitzSecondOrderLangevin1987,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The Second Order {{Langevin}} Equation and Numerical Simulations}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Horowitz, Alan M.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">1987</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = jan,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span> = <span style="color:#e6db74">{Nuclear Physics B}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{280}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{510--522}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">issn</span> = <span style="color:#e6db74">{05503213}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1016/0550-3213(87)90159-3}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>AI &amp; Physical Sciences Taxonomy: A Seven-Vector Framework</title><link>https://hunterheidenreich.com/notes/interdisciplinary/research-methods/ai-physical-sciences-paper-taxonomy/</link><pubDate>Sat, 13 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/interdisciplinary/research-methods/ai-physical-sciences-paper-taxonomy/</guid><description>Seven-vector framework for classifying research papers at the nexus of AI and physical sciences by dominant contribution type.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>This is a personal working taxonomy, a lens I use to orient myself when reading papers, not a formal classification scheme. The categories are fuzzy in practice, and reasonable people will classify the same paper differently. The goal is clarity of thought, not consensus.</p>
<p>The taxonomy uses a <strong>superposition model</strong> where each paper is viewed as a linear combination of seven fundamental contribution types (basis vectors). Examples throughout draw primarily from chemistry and materials science, though the framework applies across physical sciences.</p>
<p>The framework helps answer: &ldquo;What is this paper&rsquo;s primary contribution?&rdquo; by identifying rhetorical patterns and structural elements that signal different research paradigms.</p>
<h2 id="core-principle-the-superposition-model">Core Principle: The Superposition Model</h2>
<p>All papers in this domain can be viewed as a <strong>superposition</strong> of fundamental contribution vectors. This is an analogy, not a formal mathematical claim: the seven types below are not linearly independent in the strict sense, and most papers project onto more than one.</p>
<p>The resulting <strong>profile</strong> blends contribution types (e.g., Method + Theory). One vector usually provides the primary narrative thrust; secondary vectors supply the supporting evidence.</p>
<p>I tend to classify a paper by identifying its <strong>Primary Projection</strong> (the dominant contribution) and <strong>Secondary Projections</strong> (supporting work). When a paper seems to split evenly across two types, I look at which type&rsquo;s indicators dominate the abstract and section headings, and I treat the secondary projection as context that supports the primary narrative, not a co-equal claim.</p>
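<p>As a purely illustrative sketch of the primary/secondary split, a paper&rsquo;s profile can be held as a dictionary of subjective weights. The weights, the <code>secondary_cutoff</code> value, and the function name are all hypothetical; in practice I make this judgment by reading, not by computing.</p>

```python
BASIS = ("Method", "Theory", "Resource", "Systematization",
         "Position", "Discovery", "Application")


def classify(profile, secondary_cutoff=0.2):
    """Return (primary, secondaries) from a dict of subjective weights over BASIS."""
    weights = {name: float(profile.get(name, 0.0)) for name in BASIS}
    primary = max(BASIS, key=lambda name: weights[name])
    secondaries = [name for name in BASIS
                   if name != primary and weights[name] >= secondary_cutoff]
    return primary, secondaries


# Hypothetical weights for a paper that is mostly Method with supporting Theory.
print(classify({"Method": 0.7, "Theory": 0.3}))  # ('Method', ['Theory'])
```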
<h2 id="the-seven-basis-vectors-psi">The Seven Basis Vectors ($\Psi$)</h2>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Basis Vector</th>
          <th style="text-align: left">Alias/Focus</th>
          <th style="text-align: left">Core Question</th>
          <th style="text-align: left">Primary Output</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>1. $\Psi_{\text{Method}}$</strong></td>
          <td style="text-align: left">The Methodological Basis (Architecture/Algorithm)</td>
          <td style="text-align: left"><strong>What new mechanism does this introduce?</strong></td>
          <td style="text-align: left">New algorithm, architecture, or approximation</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>2. $\Psi_{\text{Theory}}$</strong></td>
          <td style="text-align: left">The Theoretical Basis (Formal Analysis)</td>
          <td style="text-align: left"><strong>Why does this work?</strong></td>
          <td style="text-align: left">Formal proof, generalization bound, or physical derivation</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>3. $\Psi_{\text{Resource}}$</strong></td>
          <td style="text-align: left">The Infrastructure Basis (Data/Software)</td>
          <td style="text-align: left"><strong>What resources are available?</strong></td>
          <td style="text-align: left">Dataset, benchmark, or open-source software ecosystem</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>4. $\Psi_{\text{Systematization}}$</strong></td>
          <td style="text-align: left">The Review Basis (Synthesis)</td>
          <td style="text-align: left"><strong>What do we know?</strong></td>
          <td style="text-align: left">Comprehensive survey or new organizing taxonomy (Systematization of Knowledge, SoK)</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>5. $\Psi_{\text{Position}}$</strong></td>
          <td style="text-align: left">The Sociological Basis (Perspective)</td>
          <td style="text-align: left"><strong>Where should the field go?</strong></td>
          <td style="text-align: left">Opinion piece, perspective, or critique of community practice</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>6. $\Psi_{\text{Discovery}}$</strong></td>
          <td style="text-align: left">The Empirical Discovery Basis</td>
          <td style="text-align: left"><strong>What new thing did we find?</strong></td>
          <td style="text-align: left">Experimentally or computationally validated material, molecule, or physical phenomenon</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>7. $\Psi_{\text{Application}}$</strong></td>
          <td style="text-align: left">The Transfer Basis (Domain Application)</td>
          <td style="text-align: left"><strong>Does this approach work here?</strong></td>
          <td style="text-align: left">Empirical evaluation of an existing method in a new scientific domain</td>
      </tr>
  </tbody>
</table>
<h2 id="assessment-guide-rhetorical-indicators">Assessment Guide: Rhetorical Indicators</h2>
<p>These rhetorical patterns tend to signal which vector dominates. They are heuristics, not rules; the goal is to notice what the paper is primarily arguing for, not to tick boxes.</p>
<h3 id="1-psi_textmethod-the-methodological-paper">1. $\Psi_{\text{Method}}$: The Methodological Paper</h3>
<p>Focuses on proposing a <strong>novel</strong> mechanism, architecture, or approximation (e.g., a new Transformer variant, a GNN with symmetry, a new DFT functional).</p>
<p><strong>Rhetorical Indicators:</strong></p>
<ul>
<li><strong>Ablation Study:</strong> Authors systematically remove components of their system to prove their specific innovation drives the performance gain. In physics venues, this often appears as parameter sensitivity analysis or direct comparison to prior potentials and functionals rather than a classic ablation table.</li>
<li><strong>Baseline Comparison:</strong> A prominent table comparing the new method against the <strong>State-of-the-Art (SOTA)</strong></li>
<li><strong>Pseudo-code:</strong> An explicit block detailing the algorithmic steps (e.g., for training, sampling, or inference)</li>
</ul>
<p><strong>Examples in the notes:</strong> <a href="/notes/machine-learning/generative-models/flow-matching-for-generative-modeling/">Flow Matching for Generative Modeling</a> (introduces a new continuous normalizing flow mechanism with ablations), <a href="/notes/chemistry/molecular-simulation/ml-potentials/mofflow/">MOFFlow</a> (novel SE(3) flow matching architecture for metal-organic frameworks with SOTA comparisons).</p>
<h3 id="2-psi_texttheory-the-theoretical-paper">2. $\Psi_{\text{Theory}}$: The Theoretical Paper</h3>
<p>Focuses on mathematical guarantees, proofs, or derivations from first principles.</p>
<p><strong>Rhetorical Indicators:</strong></p>
<ul>
<li><strong>Mathematical Proof Sections:</strong> Sections titled &ldquo;Theorem 1,&rdquo; &ldquo;Proof of Equivariance,&rdquo; or &ldquo;Formal Bounds&rdquo;</li>
<li><strong>Analysis of Limits/Capacity:</strong> Investigates the <strong>expressivity</strong> (e.g., comparing a GNN to the Weisfeiler-Lehman Test) or analyzes the geometry of the optimization landscape</li>
<li><strong>Generalization/OOD:</strong> Derives <strong>generalization bounds</strong> on test error or formally defines &ldquo;chemical space coverage&rdquo; for out-of-distribution (OOD) behavior</li>
<li><strong>Exact Constraints:</strong> Derives exact conditions that true physical functions (like the universal Density Functional) must satisfy</li>
</ul>
<p><strong>Examples in the notes:</strong> <a href="/notes/biology/computational-biology/funnels-pathways-energy-landscape/">Funnels, Pathways, and the Energy Landscape</a> (derives a formal theoretical basis for protein folding dynamics from energy landscape theory), <a href="/notes/machine-learning/generative-models/convexity-principle-interacting-gases/">The Convexity Principle and Interacting Gases</a> (formal theoretical analysis connecting flow-based generative models to thermodynamic principles).</p>
<h3 id="3-psi_textresource-the-infrastructure-paper">3. $\Psi_{\text{Resource}}$: The Infrastructure Paper</h3>
<p>Focuses on creating and sharing foundational tools for the community.</p>
<p><strong>Rhetorical Indicators:</strong></p>
<ul>
<li><strong>Curation Description:</strong> Detailed steps on how data was generated, filtered, or curated (e.g., describing millions of CPU-hours of DFT calculations for a dataset like QM9).</li>
<li><strong>&ldquo;Datasheets&rdquo; and &ldquo;Data Cards&rdquo;:</strong> Inclusion of formal documentation detailing provenance, copyright, and potential biases in the data. In physics contexts, this often appears as DFT protocol descriptions, detailed computational settings, functional choices, and convergence criteria rather than a named datasheet document.</li>
<li><strong>Benchmark Definition:</strong> Argues that &ldquo;Metric X on Dataset Y&rdquo; is the correct proxy for progress in a specific scientific task</li>
</ul>
<p><strong>Examples in the notes:</strong> <a href="/notes/chemistry/molecular-design/generation/evaluation/molecular-sets-moses/">Molecular Sets (MOSES)</a> (releases a benchmark dataset and evaluation suite for molecular generation), <a href="/notes/chemistry/molecular-representations/notations/selfies-2023/">SELFIES 2023</a> (releases an updated molecular string representation standard with tooling and validation).</p>
<h3 id="4-psi_textsystematization-the-review-paper">4. $\Psi_{\text{Systematization}}$: The Review Paper</h3>
<p>Focuses on organizing and synthesizing existing literature.</p>
<p><strong>Rhetorical Indicators:</strong></p>
<ul>
<li><strong>Survey Structure:</strong> Follows a linear, often chronological, progression or is grouped by architecture (e.g., Variational Autoencoders (VAEs), GANs, Diffusion models)</li>
<li><strong>Systematization of Knowledge (SoK):</strong> A higher-order contribution that proposes a <strong>new taxonomy or a unified framework</strong> to connect disparate concepts</li>
<li><strong>Citation Density:</strong> References a large fraction of the relevant prior literature; often includes a comparison table spanning many prior works across years or venues</li>
</ul>
<p><strong>Examples in the notes:</strong> <a href="/notes/interdisciplinary/planetary-science/venus-evolution-through-time/">Venus Evolution Through Time</a> (synthesizes multi-decade observations into a unified timeline), <a href="/notes/chemistry/molecular-simulation/classical-methods/embedded-atom-method-review-1993/">Embedded Atom Method Review (1993)</a> (organizes and contextualizes the body of work on EAM potentials into a coherent reference).</p>
<h3 id="5-psi_textposition-the-sociological-paper">5. $\Psi_{\text{Position}}$: The Sociological Paper</h3>
<p>Focuses on meta-science, arguing for a change in community norms, or critiquing systemic issues.</p>
<p>I see two subtypes come up often. The first is the <strong>roadmap/perspective</strong> paper: a constructive argument for where the field should focus, often written by senior researchers after a period of reflection. The second is the <strong>critique</strong> paper: a more adversarial argument that something the community is currently doing is wrong or counterproductive. Both share the same core signal (the argument itself is the contribution), but they have different tones and are often targeted at different audiences.</p>
<p><strong>Rhetorical Indicators:</strong></p>
<ul>
<li><strong>Venue/Track:</strong> Often found in &ldquo;Position Tracks&rdquo; or called &ldquo;Blue Sky&rdquo; or &ldquo;Forward Looking&rdquo; papers</li>
<li><strong>Argumentative Tone:</strong> Uses qualitative or quantitative analysis (meta-analysis) to argue for a shift in how research is conducted or funded (e.g., a paper arguing that AI contracts the focus of science)</li>
<li><strong>Argument as Contribution:</strong> The paper presents no new experimental findings; the primary contribution is the argument itself</li>
</ul>
<p><strong>Examples in the notes:</strong> <a href="/notes/biology/computational-biology/fold-graciously/">Fold Graciously</a> (position paper arguing for a shift in how the protein folding community evaluates progress), <a href="/notes/chemistry/molecular-representations/notations/selfies-2022/">SELFIES 2022</a> (makes a normative argument for adopting SELFIES over SMILES for robust molecular generation; the paper also has Method and Resource characteristics, but the argument-first framing and advocacy tone push it toward Position for me).</p>
<h3 id="6-psi_textdiscovery-the-empirical-discovery-paper">6. $\Psi_{\text{Discovery}}$: The Empirical Discovery Paper</h3>
<p>Focuses on the discovery of novel scientific artifacts using AI/ML tools. The key criterion is experimental or higher-fidelity confirmation of the AI&rsquo;s prediction (wet-lab synthesis, physical characterization, or a higher-fidelity simulation), rather than performance on a held-out test set.</p>
<p><strong>Rhetorical Indicators:</strong></p>
<ul>
<li><strong>Structure:</strong> Follows a workflow: (1) Computational Screening (AI selects candidates), (2) <strong>Validation</strong> (wet-lab synthesis, physical characterization, or independent computational confirmation)</li>
<li><strong>Core Claim:</strong> The primary contribution is a <strong>new material, molecule, or physical finding</strong>, with the AI/ML part serving as the necessary first step</li>
<li><strong>Key Question:</strong> Does the AI&rsquo;s prediction hold true against an independent ground truth (e.g., a physical experiment or a higher-fidelity simulation)?</li>
</ul>
<p><strong>Examples in the notes:</strong> <a href="/notes/chemistry/molecular-simulation/surface-science/oxidation-reduction-oscillations-pt-sio2-1994/">Oxidation-Reduction Oscillations on Pt/SiO2 (1994)</a> (validates a computationally-predicted dynamic phenomenon against physical experiment), <a href="/notes/biology/evolutionary-biology/nature-of-luca-early-earth-system/">Nature of LUCA and the Early Earth System</a> (uses computational inference to recover a validated empirical claim about early life).</p>
<h3 id="7-psi_textapplication-the-application-paper">7. $\Psi_{\text{Application}}$: The Application Paper</h3>
<p>Applies an existing method, architecture, or technique to a new scientific domain or task without introducing a new mechanism, dataset, or experimentally confirmed finding. The primary contribution is demonstrating feasibility or utility of transfer.</p>
<p><strong>Rhetorical Indicators:</strong></p>
<ul>
<li><strong>&ldquo;We Apply X to Y&rdquo; framing:</strong> The abstract explicitly names an existing method and a new domain; the contribution is the connection, not the method itself.</li>
<li><strong>Benchmark-style Results:</strong> Performance reported on domain-specific tasks using existing metrics; baselines are often simple (classical methods, prior domain tools) rather than ML SOTA.</li>
<li><strong>Absence of novelty claims:</strong> The paper does not argue for a new architecture, does not release a new dataset, and does not report a validated experimental discovery.</li>
</ul>
<p><strong>Examples in the notes:</strong> <a href="/notes/chemistry/llm-applications/benchmarking-llms-molecule-prediction/">Benchmarking LLMs for Molecular Property Prediction</a> (evaluates GPT-3.5, GPT-4, and Llama-2 on six existing OGB molecular tasks without modifying the models), <a href="/notes/chemistry/molecular-design/property-prediction/maxsmi-smiles-augmentation-property-prediction/">Maxsmi: SMILES Augmentation for Property Prediction</a> (systematic evaluation of five augmentation strategies with existing CNN/RNN architectures), <a href="/notes/chemistry/molecular-design/generation/evaluation/molecular-language-models-rnns-or-transformer/">RNNs vs Transformers for Molecular Generation Tasks</a> (empirical comparison of existing architectures on SMILES and SELFIES generation).</p>
<h2 id="edge-cases-and-disambiguation">Edge Cases and Disambiguation</h2>
<p>These are the pairs I find myself second-guessing most often. I&rsquo;ve written down how I tend to resolve them, not as rules, but as a record of my reasoning.</p>
<p><strong>Method vs. Discovery:</strong> The key question I ask is whether there is independent validation beyond a held-out test set. If the paper reports a wet-lab confirmation or a higher-fidelity simulation verifying the AI&rsquo;s output, I lean toward Discovery: the finding is real and the AI was the tool. If the architecture is the thing being argued for, and validation is limited to benchmarks, I lean toward Method. One edge case I see often in computational chemistry is ML-predicts + DFT-validates: no wet lab, but DFT is genuinely independent of the ML model. I treat this as Discovery if the DFT result is the claimed contribution, and Method if the ML model&rsquo;s performance against DFT is what&rsquo;s being argued for.</p>
<p><strong>Resource vs. Systematization:</strong> I think of this as artifact-first vs. interpretation-first. If the paper&rsquo;s main output is something you can download and use (a benchmark, a curated dataset, a software library), it&rsquo;s Resource. If the paper&rsquo;s main output is a new way of thinking about a body of literature (a taxonomy, a conceptual unification), it&rsquo;s Systematization. The presence of a clear benchmark definition with metrics tends to push toward Resource even when there is substantial review content.</p>
<p><strong>Application vs. Method:</strong> The cleanest distinguisher here is whether anything new was designed. If an existing method is taken off the shelf and pointed at a new domain, that&rsquo;s Application. If the domain&rsquo;s requirements motivated a modification to the method (even a small one), then there may be a legitimate Method component. Application papers often read as feasibility demonstrations; Method papers read as capability expansions.</p>
<p><strong>When I&rsquo;m still stuck:</strong> I look at three things in roughly this order: the abstract&rsquo;s final sentence (in my experience, especially in ML venues, authors tend to signal what they most want credit for in that last sentence), section heading vocabulary (presence of &ldquo;theorem,&rdquo; &ldquo;algorithm,&rdquo; &ldquo;dataset,&rdquo; &ldquo;survey,&rdquo; or &ldquo;we find&rdquo; is usually diagnostic), and venue track (position/perspective tracks, SoK tracks, and empirical tracks each carry strong prior signal). I treat the result as a best guess, not a verdict.</p>
<h2 id="applications">Applications</h2>
<p>This framework has been useful to me in a few concrete ways:</p>
<ul>
<li><strong>Organizing literature reviews:</strong> Knowing a paper&rsquo;s vector profile tells me where it belongs in a review&rsquo;s structure: whether it fits in the &ldquo;methods&rdquo; or &ldquo;findings&rdquo; or &ldquo;community context&rdquo; section, rather than forcing everything into chronological order.</li>
<li><strong>Understanding conference and journal acceptance criteria:</strong> Different venues weight vectors differently. Physics journals favor Discovery; ML venues favor Method and Theory; interdisciplinary venues like <em>Nature</em> or <em>Science</em> often weight Discovery heavily but also publish high-impact Position papers. Knowing this helps calibrate expectations.</li>
<li><strong>Identifying gaps in research portfolios:</strong> A collection that is heavy on Method but light on Resource or Discovery points to work that may lack empirical grounding or shared infrastructure. The vector breakdown makes these gaps visible.</li>
<li><strong>Recognizing different types of scientific contribution:</strong> Not everything that looks like a methods paper is one, and not everything that looks like a survey is one. The rhetorical patterns here help me notice when my first impression was wrong and reconsider what is actually being claimed.</li>
</ul>
<h2 id="on-site-enforcement">On-Site Enforcement</h2>
<p>Every <code>note_type: paper</code> entry on this site declares its Primary Projection through a single-valued <code>paper_types</code> frontmatter list. A schema validator rejects notes that omit this field or use values outside the seven canonical vectors, and the classification is surfaced as a Hugo taxonomy at <a href="/paper-types/">/paper-types/</a>. Browsing a vector landing page such as <a href="/paper-types/method/">/paper-types/method/</a> or <a href="/paper-types/systematization/">/paper-types/systematization/</a> is the fastest way to pull up every note that shares a contribution type.</p>
<p>Enforcement is deliberately single-select: it captures the Primary Projection only. Secondary vectors still exist in the text of each note, but the tag axis is chosen to discriminate between classes rather than to describe a paper exhaustively.</p>
]]></content:encoded></item><item><title>Molecular String Renderer: Chemical Visualization Library</title><link>https://hunterheidenreich.com/projects/molecular-string-renderer/</link><pubDate>Sun, 30 Nov 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/molecular-string-renderer/</guid><description>A type-safe Python library for converting chemical strings (SMILES, SELFIES, InChI) into publication-quality molecular images.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>In computational chemistry and AI drug discovery, visualization pipelines are often brittle, breaking on edge cases or failing silently when processing millions of molecules for training data.</p>
<p>I built <code>molecular-string-renderer</code> to treat molecular visualization as a strict software engineering problem. It is a highly configurable wrapper around RDKit that standardizes the conversion of text-based chemical representations (SMILES, <a href="/notes/chemistry/molecular-representations/notations/inchi-2013/">InChI</a>, SELFIES) into raster and vector graphics, degrading gracefully on inputs RDKit cannot vectorize.</p>
<h2 id="features">Features</h2>
<p>This library differentiates itself from standard plotting scripts through strict architectural patterns designed for reliability:</p>
<h3 id="1-strategy-pattern-for-svg-generation">1. Strategy Pattern for SVG Generation</h3>
<p>RDKit&rsquo;s vector rendering can sometimes fail on complex molecular topologies. I implemented a <strong>Hybrid Strategy</strong> so that a single molecule RDKit cannot vectorize does not fail the batch:</p>
<ul>
<li><strong>Vector Strategy</strong>: Attempts to generate a true, scalable vector graphic.</li>
<li><strong>Raster Fallback</strong>: If the vector engine fails, the system automatically renders a high-res PNG and embeds it transparently into the SVG container.</li>
</ul>
<h3 id="2-native-generative-ai-support">2. Native Generative AI Support</h3>
<p>With the rise of Large Language Models in chemistry, <strong>SELFIES</strong> (Self-Referencing Embedded Strings) has become a standard output format. This library handles SELFIES natively, managing the decoding and sanitization lifecycle internally so that ML training loops can simply &ldquo;pass strings and get images.&rdquo;</p>
<h3 id="3-strict-configuration-contracts">3. Strict Configuration Contracts</h3>
<p>The library uses <strong>Pydantic</strong> models (<code>RenderConfig</code>, <code>ParserConfig</code>, <code>OutputConfig</code>) to enforce strict data contracts. This ensures that visualization parameters are validated before any heavy computation begins, preventing runtime errors deep in a batch job.</p>
<h2 id="usage">Usage</h2>
<p>The library provides a simple Python API for rendering single molecules or batches of molecules from various string formats.</p>
<h2 id="results">Results</h2>
<ul>
<li><strong>Type Safety</strong>: The codebase runs with strict <code>mypy</code> settings, ensuring type safety across the entire pipeline.</li>
<li><strong>Grid Auto-Fitting</strong>: Implemented smart layout algorithms that automatically adjust grid dimensions based on the input batch size.</li>
<li><strong>Format Agnostic</strong>: Decouples the <em>parsing</em> logic (SMILES vs. MolBlock vs. SELFIES) from the <em>rendering</em> logic, making it trivial to add support for new proprietary formats.</li>
</ul>
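<p>As a rough illustration of grid auto-fitting, here is a minimal sketch that picks a near-square layout from the batch size alone. The function name <code>fit_grid</code> and the <code>max_cols</code> parameter are hypothetical and are not taken from the library&rsquo;s API; the actual layout logic may differ.</p>

```python
import math


def fit_grid(n_items, max_cols=None):
    """Return (rows, cols) for a near-square grid holding n_items cells.

    Hypothetical helper for illustration only; not the library's API.
    """
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    if max_cols is not None and max_cols < 1:
        raise ValueError("max_cols must be at least 1")
    # Smallest column count whose square covers n_items, i.e. ceil(sqrt(n_items)).
    cols = math.isqrt(n_items - 1) + 1
    if max_cols is not None:
        cols = min(cols, max_cols)
    # Ceiling division: enough rows to place every item.
    rows = -(-n_items // cols)
    return rows, cols


for n in (1, 5, 12, 100):
    print(n, fit_grid(n))
```

<p>Integer square roots keep the column count exact for large batches, and the optional cap covers the case where the output width is fixed.</p>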
<h2 id="reliability">Reliability</h2>
<p>When rendering large batches of generated molecules, a single hard-to-draw structure should not fail the whole job. The raster fallback and the strict Pydantic and mypy contracts exist so the pipeline degrades gracefully on edge cases rather than crashing or failing silently, the common failure mode of ad hoc RDKit plotting scripts.</p>
<h2 id="related-work">Related Work</h2>
<ul>
<li><a href="/posts/visualizing-smiles-and-selfies-strings/">Visualizing SMILES and SELFIES Strings</a>: walkthrough of the visualization pipeline this library implements</li>
<li><a href="/projects/isomer-dataset-generation/">Isomer Dataset Generation</a>: related project generating molecular datasets using SMILES/SELFIES representations</li>
</ul>
]]></content:encoded></item><item><title>Exponential Random Numbers: Two Classic Algorithms</title><link>https://hunterheidenreich.com/posts/random-number-tricks/</link><pubDate>Sun, 31 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/posts/random-number-tricks/</guid><description>Compare inverse transform sampling and von Neumann's rejection method for exponential random numbers with Python implementations and performance.</description><content:encoded><![CDATA[<h2 id="introduction">Introduction</h2>
<p>In the early days of computing, generating random numbers was a significant computational challenge. In a landmark 1951 paper, mathematician John von Neumann detailed various &ldquo;cooking recipes&rdquo; for producing and using random numbers on machines like the ENIAC. While much of the paper focuses on generating <em>uniform</em> random digits, he also described ingenious methods for generating numbers from more complex, non-uniform probability distributions.</p>
<p>One of the most fundamental needs in scientific simulation (from modeling radioactive decay to calculating particle free-paths in molecular dynamics) is sampling from an <strong>exponential distribution</strong> with probability density function:</p>
<p>$$f(x) = e^{-x} \quad \text{for } x \ge 0$$</p>
<p>Today&rsquo;s standard approach is elegant and direct, but it requires computing a natural logarithm (a computationally expensive operation on early hardware). To sidestep this limitation, von Neumann described a fascinating alternative that uses only basic comparisons, resembling what he called &ldquo;a well known game of chance Twenty-One, or Black Jack.&rdquo;</p>
<p>In this post, we&rsquo;ll explore both methods: the modern inverse transform approach and von Neumann&rsquo;s ingenious comparison-based algorithm. We&rsquo;ll implement them in Python, verify their correctness, and compare their performance, empirically testing the trade-offs von Neumann identified nearly 75 years ago.</p>
<hr>
<h2 id="method-1-the-standard-approach-inverse-transform-sampling">Method 1: The Standard Approach (Inverse Transform Sampling)</h2>
<p>The most common method for sampling from a given distribution is <strong>inverse transform sampling</strong>. This method relies on a fundamental principle: if you have a uniform random variable $U$ on the interval (0, 1), you can transform it into a random variable $X$ with any desired cumulative distribution function (CDF) $F(x)$ by applying:</p>
<p>$$X = F^{-1}(U)$$</p>
<p>For the exponential distribution, the CDF is $F(x) = 1 - e^{-x}$. To find the inverse, we set $U = 1 - e^{-X}$ and solve for $X$:</p>
<p>$$
\begin{align}
e^{-X} &amp;= 1 - U \\
-X &amp;= \ln(1 - U) \\
X &amp;= -\ln(1 - U)
\end{align}
$$</p>
<p>Here&rsquo;s a useful simplification: since $U$ is uniformly distributed on (0, 1), the quantity $(1 - U)$ is also uniformly distributed on (0, 1). Therefore, we can use the simpler formula:</p>
<p>$$X = -\ln(U)$$</p>
<p>This gives us an efficient method for generating exponentially distributed numbers, provided the logarithm function is computationally accessible.</p>
<h3 id="python-implementation">Python Implementation</h3>
<p>Here&rsquo;s a straightforward implementation using NumPy:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> numpy <span style="color:#66d9ef">as</span> np
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">exponential_inverse_transform</span>(n_samples<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Generate samples from an exponential distribution using inverse transform sampling.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Args:
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">        n_samples (int): Number of samples to generate.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Returns:
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">        np.ndarray: Array of exponentially distributed samples.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Generate uniform random numbers</span>
</span></span><span style="display:flex;"><span>    U <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>rand(n_samples)
</span></span><span style="display:flex;"><span>
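</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Apply the inverse transform. rand() samples [0, 1), so use 1 - U,</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># which lies in (0, 1] and never passes 0 to the logarithm.</span>
</span></span><span style="display:flex;"><span>    X <span style="color:#f92672">=</span> <span style="color:#f92672">-</span>np<span style="color:#f92672">.</span>log(<span style="color:#ae81ff">1.0</span> <span style="color:#f92672">-</span> U)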
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> X
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Generate 100,000 samples for testing</span>
</span></span><span style="display:flex;"><span>n_samples <span style="color:#f92672">=</span> <span style="color:#ae81ff">100000</span>
</span></span><span style="display:flex;"><span>inverse_samples <span style="color:#f92672">=</span> exponential_inverse_transform(n_samples)
</span></span></code></pre></div><hr>
<h2 id="method-2-von-neumanns-ingenious-trick-rejection-sampling">Method 2: Von Neumann&rsquo;s Ingenious Trick (Rejection Sampling)</h2>
<p>Von Neumann proposed a clever alternative that avoids transcendental functions entirely. His procedure, which he noted &ldquo;resembles a well known game of chance Twenty-One, or Black Jack,&rdquo; generates sequences of uniform random numbers and accepts or rejects them based on simple comparison rules.</p>
<p>The algorithm works as follows to generate a single exponential sample $X$:</p>
<ol>
<li>
<p><strong>Initialize</strong>: Start with an integer offset <code>k = 0</code>, which will form the integer part of the final result.</p>
</li>
<li>
<p><strong>Generate a trial sequence</strong>:</p>
<ul>
<li>Generate uniform random numbers $Y_1, Y_2, Y_3, \ldots$ from (0, 1)</li>
<li>Find the smallest integer <code>n</code> such that the sequence is no longer strictly decreasing</li>
<li>That is, find <code>n</code> where $Y_1 &gt; Y_2 &gt; \cdots &gt; Y_n$ but $Y_n \leq Y_{n+1}$</li>
</ul>
</li>
<li>
<p><strong>Accept or reject</strong>:</p>
<ul>
<li>If <code>n</code> is <strong>odd</strong>: Accept the trial. Return $X = Y_1 + k$ and terminate.</li>
<li>If <code>n</code> is <strong>even</strong>: Reject the trial. Increment <code>k</code> by 1 and start a new trial.</li>
</ul>
</li>
</ol>
<p>This process terminates with probability 1 (each trial is accepted with probability $1 - e^{-1} \approx 0.632$, so the number of trials is geometric) and produces samples that follow the exponential distribution exactly. As von Neumann put it, the machine has &ldquo;in effect computed a logarithm by performing only discriminations on the relative magnitude of numbers.&rdquo;</p>
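<p>A short derivation of why the parity rule yields $e^{-x}$, sketched under the notation of the steps above, with $N$ denoting the run length <code>n</code> of a single trial. Conditioned on $Y_1 = x$, the run reaches length $n$ only if $Y_2, \ldots, Y_n$ all fall below $x$ (probability $x^{n-1}$) and arrive in decreasing order (probability $1/(n-1)!$):</p>
<p>$$
\begin{align}
P(N \ge n \mid Y_1 = x) &amp;= \frac{x^{n-1}}{(n-1)!} \\
P(N = n \mid Y_1 = x) &amp;= \frac{x^{n-1}}{(n-1)!} - \frac{x^{n}}{n!} \\
P(N \text{ odd} \mid Y_1 = x) &amp;= 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots = e^{-x}
\end{align}
$$</p>
<p>So an accepted $Y_1$ has density proportional to $e^{-x}$ on (0, 1), and a trial is rejected with probability $1 - \int_0^1 e^{-x}\,dx = e^{-1}$. The offset therefore satisfies $P(k = j) = e^{-j}(1 - e^{-1})$, and combining the two gives the density of $X = k + Y_1$ at the point $j + y$:</p>
<p>$$e^{-j}(1 - e^{-1}) \cdot \frac{e^{-y}}{1 - e^{-1}} = e^{-(j + y)}$$</p>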
<h3 id="python-implementation-1">Python Implementation</h3>
<p>This implementation requires more careful state management due to the nested trial structure:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> numpy <span style="color:#66d9ef">as</span> np
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">exponential_von_neumann</span>(n_samples<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Generate samples from an exponential distribution using von Neumann&#39;s
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    comparison-based rejection sampling method.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Args:
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">        n_samples (int): Number of samples to generate.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Returns:
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">        tuple[np.ndarray, float]: Array of samples and average uniform draws per sample.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    samples <span style="color:#f92672">=</span> []
</span></span><span style="display:flex;"><span>    total_uniform_draws <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> _ <span style="color:#f92672">in</span> range(n_samples):
</span></span><span style="display:flex;"><span>        k <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>  <span style="color:#75715e"># Integer offset</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">while</span> <span style="color:#66d9ef">True</span>:  <span style="color:#75715e"># Trial loop</span>
</span></span><span style="display:flex;"><span>            <span style="color:#75715e"># Generate decreasing sequence</span>
</span></span><span style="display:flex;"><span>            y_prev <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>rand()
</span></span><span style="display:flex;"><span>            total_uniform_draws <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>            y1 <span style="color:#f92672">=</span> y_prev  <span style="color:#75715e"># Store first value</span>
</span></span><span style="display:flex;"><span>            n <span style="color:#f92672">=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>            <span style="color:#75715e"># Find length of decreasing sequence</span>
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">while</span> <span style="color:#66d9ef">True</span>:
</span></span><span style="display:flex;"><span>                y_curr <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>rand()
</span></span><span style="display:flex;"><span>                total_uniform_draws <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>                <span style="color:#66d9ef">if</span> y_prev <span style="color:#f92672">&lt;=</span> y_curr:
</span></span><span style="display:flex;"><span>                    <span style="color:#66d9ef">break</span>  <span style="color:#75715e"># Sequence no longer decreasing</span>
</span></span><span style="display:flex;"><span>                y_prev <span style="color:#f92672">=</span> y_curr
</span></span><span style="display:flex;"><span>                n <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>            <span style="color:#75715e"># Accept if n is odd, reject if even</span>
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">if</span> n <span style="color:#f92672">%</span> <span style="color:#ae81ff">2</span> <span style="color:#f92672">==</span> <span style="color:#ae81ff">1</span>:  <span style="color:#75715e"># Accept</span>
</span></span><span style="display:flex;"><span>                samples<span style="color:#f92672">.</span>append(y1 <span style="color:#f92672">+</span> k)
</span></span><span style="display:flex;"><span>                <span style="color:#66d9ef">break</span>
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">else</span>:  <span style="color:#75715e"># Reject</span>
</span></span><span style="display:flex;"><span>                k <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    avg_draws <span style="color:#f92672">=</span> total_uniform_draws <span style="color:#f92672">/</span> n_samples
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> np<span style="color:#f92672">.</span>array(samples), avg_draws
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Generate samples using von Neumann&#39;s method</span>
</span></span><span style="display:flex;"><span>von_neumann_samples, avg_draws <span style="color:#f92672">=</span> exponential_von_neumann(n_samples)
</span></span><span style="display:flex;"><span>print(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Von Neumann method used </span><span style="color:#e6db74">{</span>avg_draws<span style="color:#e6db74">:</span><span style="color:#e6db74">.2f</span><span style="color:#e6db74">}</span><span style="color:#e6db74"> uniform draws per sample on average.&#34;</span>)
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>Von Neumann method used 4.30 uniform draws per sample on average.
</span></span></code></pre></div><p>The algorithm requires approximately <strong>4.3</strong> uniform draws per exponential sample, matching the theoretical value $e^2/(e-1) \approx 4.30$.</p>
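<p>The value $e^2/(e-1)$ can be recovered in a few lines. This is a sketch, writing $N$ for the run length <code>n</code> of one trial; it uses $P(N \ge n \mid Y_1 = x) = x^{n-1}/(n-1)!$, which holds because $Y_2, \ldots, Y_n$ must all fall below $x$ and arrive in decreasing order:</p>
<p>$$
\begin{align}
E[N \mid Y_1 = x] &amp;= \sum_{n \ge 1} \frac{x^{n-1}}{(n-1)!} = e^{x} \\
E[N] &amp;= \int_0^1 e^{x}\,dx = e - 1 \\
E[\text{draws per trial}] &amp;= E[N] + 1 = e \\
E[\text{trials per sample}] &amp;= \frac{1}{1 - e^{-1}} = \frac{e}{e - 1} \\
E[\text{draws per sample}] &amp;= e \cdot \frac{e}{e - 1} = \frac{e^2}{e - 1} \approx 4.30
\end{align}
$$</p>
<p>The extra draw in the third line is the final uniform that breaks the decreasing run. The last line multiplies the two expectations, which is justified by Wald&rsquo;s identity because the trials are independent and identically distributed.</p>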
<hr>
<h2 id="verification-and-comparison">Verification and Comparison</h2>
<p>The critical test: do both methods actually produce the same distribution? And how do their performance characteristics compare?</p>
<h3 id="visual-verification">Visual Verification</h3>
<p>Let&rsquo;s plot histograms of samples from both methods alongside the theoretical probability density function $f(x) = e^{-x}$:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> matplotlib.pyplot <span style="color:#66d9ef">as</span> plt
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> seaborn <span style="color:#66d9ef">as</span> sns
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Configure plot aesthetics</span>
</span></span><span style="display:flex;"><span>sns<span style="color:#f92672">.</span>set_style(<span style="color:#e6db74">&#34;whitegrid&#34;</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>figure(figsize<span style="color:#f92672">=</span>(<span style="color:#ae81ff">12</span>, <span style="color:#ae81ff">7</span>))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Plot histograms for both methods</span>
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>hist(inverse_samples, bins<span style="color:#f92672">=</span><span style="color:#ae81ff">50</span>, density<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>, alpha<span style="color:#f92672">=</span><span style="color:#ae81ff">0.7</span>,
</span></span><span style="display:flex;"><span>         label<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;Inverse Transform&#39;</span>, color<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;skyblue&#39;</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>hist(von_neumann_samples, bins<span style="color:#f92672">=</span><span style="color:#ae81ff">50</span>, density<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>, alpha<span style="color:#f92672">=</span><span style="color:#ae81ff">0.7</span>,
</span></span><span style="display:flex;"><span>         label<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Von Neumann&#39;s Method&#34;</span>, color<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;lightcoral&#39;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Overlay theoretical PDF</span>
</span></span><span style="display:flex;"><span>x <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linspace(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">8</span>, <span style="color:#ae81ff">400</span>)
</span></span><span style="display:flex;"><span>pdf <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>exp(<span style="color:#f92672">-</span>x)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>plot(x, pdf, <span style="color:#e6db74">&#39;r-&#39;</span>, linewidth<span style="color:#f92672">=</span><span style="color:#ae81ff">2</span>, label<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;Theoretical PDF ($e^{-x}$)&#39;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>title(<span style="color:#e6db74">&#39;Exponential Sampling Methods vs. Theoretical Distribution&#39;</span>, fontsize<span style="color:#f92672">=</span><span style="color:#ae81ff">16</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>xlabel(<span style="color:#e6db74">&#39;x&#39;</span>, fontsize<span style="color:#f92672">=</span><span style="color:#ae81ff">12</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>ylabel(<span style="color:#e6db74">&#39;Density&#39;</span>, fontsize<span style="color:#f92672">=</span><span style="color:#ae81ff">12</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>legend()
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>xlim(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">8</span>)
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>tight_layout()
</span></span><span style="display:flex;"><span>plt<span style="color:#f92672">.</span>show()
</span></span></code></pre></div>
<figure class="post-figure center ">
    <img src="/img/exponential_random_gens.webp"
         alt="Comparison of exponential sampling methods showing histograms from both inverse transform and von Neumann methods overlaid with the theoretical exponential distribution"
         title="Comparison of exponential sampling methods showing histograms from both inverse transform and von Neumann methods overlaid with the theoretical exponential distribution"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Both sampling methods reproduce the exponential distribution $f(x) = e^{-x}$</figcaption>
    
</figure>

<p>The histograms from both methods track the theoretical curve closely, which is consistent with both algorithms sampling the target exponential distribution. A histogram is only a visual check, so it cannot rule out small systematic deviations.</p>
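<p>For a numerical check to go with the plot, here is a minimal sketch that compares the first two moments with their theoretical values (both equal to 1) and computes the Kolmogorov-Smirnov distance to the CDF $F(x) = 1 - e^{-x}$. The helper name <code>ks_distance_to_exponential</code> is hypothetical, and the sketch draws its own seeded samples so that it runs on its own.</p>

```python
import numpy as np


def ks_distance_to_exponential(samples):
    """Largest gap between the empirical CDF and F(x) = 1 - exp(-x)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    cdf = 1.0 - np.exp(-x)
    # The empirical CDF jumps from (i - 1)/n to i/n at the i-th sorted sample,
    # so check the gap on both sides of every jump.
    gap_above = np.arange(1, n + 1) / n - cdf
    gap_below = cdf - np.arange(0, n) / n
    return float(max(gap_above.max(), gap_below.max()))


rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
check_samples = -np.log(1.0 - rng.random(100_000))

print(f"mean        = {check_samples.mean():.4f} (theory: 1)")
print(f"variance    = {check_samples.var():.4f} (theory: 1)")
print(f"KS distance = {ks_distance_to_exponential(check_samples):.4f}")
```

<p>The same function can be applied to <code>inverse_samples</code> and <code>von_neumann_samples</code>. As a rule of thumb, the asymptotic 5% critical value for the KS distance is about $1.36/\sqrt{n}$, which is roughly 0.0043 for 100,000 samples.</p>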
<h3 id="performance-analysis">Performance Analysis</h3>
<p>Mathematical elegance often diverges from computational efficiency. Von Neumann himself observed that on the ENIAC, it was actually &ldquo;slightly quicker to use a truncated power series for log(1-T)&rdquo; than to perform all the comparisons his method required.</p>
<p>Let&rsquo;s benchmark both approaches in a modern Python environment:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> time
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Benchmark inverse transform method (perf_counter is a monotonic, high-resolution timer)</span>
</span></span><span style="display:flex;"><span>start_time <span style="color:#f92672">=</span> time<span style="color:#f92672">.</span>perf_counter()
</span></span><span style="display:flex;"><span>_ <span style="color:#f92672">=</span> exponential_inverse_transform(n_samples)
</span></span><span style="display:flex;"><span>inverse_time <span style="color:#f92672">=</span> time<span style="color:#f92672">.</span>perf_counter() <span style="color:#f92672">-</span> start_time
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Benchmark von Neumann method</span>
</span></span><span style="display:flex;"><span>start_time <span style="color:#f92672">=</span> time<span style="color:#f92672">.</span>perf_counter()
</span></span><span style="display:flex;"><span>_ <span style="color:#f92672">=</span> exponential_von_neumann(n_samples)
</span></span><span style="display:flex;"><span>vn_time <span style="color:#f92672">=</span> time<span style="color:#f92672">.</span>perf_counter() <span style="color:#f92672">-</span> start_time
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>print(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Inverse Transform:  </span><span style="color:#e6db74">{</span>inverse_time<span style="color:#e6db74">:</span><span style="color:#e6db74">.4f</span><span style="color:#e6db74">}</span><span style="color:#e6db74"> seconds&#34;</span>)
</span></span><span style="display:flex;"><span>print(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Von Neumann Method: </span><span style="color:#e6db74">{</span>vn_time<span style="color:#e6db74">:</span><span style="color:#e6db74">.4f</span><span style="color:#e6db74">}</span><span style="color:#e6db74"> seconds&#34;</span>)
</span></span><span style="display:flex;"><span>print(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Speedup factor: </span><span style="color:#e6db74">{</span>vn_time <span style="color:#f92672">/</span> inverse_time<span style="color:#e6db74">:</span><span style="color:#e6db74">.1f</span><span style="color:#e6db74">}</span><span style="color:#e6db74">x&#34;</span>)
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>Inverse Transform:  0.0018 seconds
</span></span><span style="display:flex;"><span>Von Neumann Method: 0.1860 seconds
</span></span><span style="display:flex;"><span>Speedup factor: 103.3x
</span></span></code></pre></div><p>The gap is large. The vectorized NumPy implementation of inverse transform sampling, leveraging a highly optimized C-backed logarithm function, outperforms the Python-looped von Neumann implementation by roughly two orders of magnitude (about 100x in this run). While a vectorized or JIT-compiled version of von Neumann&rsquo;s method would narrow this gap by removing Python interpreter overhead, the inverse transform remains the practical winner on modern hardware with fast floating-point units. This is consistent with von Neumann&rsquo;s prescient observation: the &ldquo;theoretically elegant&rdquo; method avoiding transcendental functions often yields to direct computation.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This exploration offers a window into the ingenuity of early computational mathematics. Von Neumann&rsquo;s comparison-based algorithm demonstrates remarkable mathematical creativity (showing how to &ldquo;compute a logarithm&rdquo; using only basic machine operations). Our implementation reproduces the algorithm, producing samples whose histogram and moments match the exponential distribution.</p>
<p>The performance comparison validates von Neumann&rsquo;s own pragmatic assessment. His rejection sampling method is intellectually elegant and historically significant. The direct logarithmic approach proves far more efficient on both early and modern hardware. It serves as a timeless reminder in scientific computing: theoretical beauty often diverges from computational practicality.</p>
<p>The enduring value of von Neumann&rsquo;s work lies in the fundamental insight that creative mathematical thinking can circumvent apparent computational limitations. Understanding alternative methods deepens our appreciation for the rich landscape of algorithmic possibilities, even when the direct approach proves superior.</p>
]]></content:encoded></item><item><title>Modernizing Rahman's 1964 Argon Simulation</title><link>https://hunterheidenreich.com/posts/rahman-1964-lammps-liquid-argon/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/posts/rahman-1964-lammps-liquid-argon/</guid><description>How I used modern software engineering (caching, vectorization, and dependency locking) to reproduce a 60-year-old physics milestone.</description><content:encoded><![CDATA[<p>Some papers invent entire fields. Aneesur Rahman&rsquo;s 1964 paper, <strong>&ldquo;Correlations in the Motion of Atoms in Liquid Argon&rdquo;</strong>, is the &ldquo;Hello World&rdquo; of molecular dynamics (MD). Using a computer with less memory than a modern microwave, Rahman solved Newton&rsquo;s equations for 864 atoms and proved that liquids have distinct, quantifiable structure.</p>
<p>The physics of liquid argon is a solved problem. We know the answer.</p>
<p>So, why replicate it in 2025? <strong>To apply modern engineering standards to legacy science.</strong></p>
<p>This project served as an exercise in <strong>software archaeology</strong>: taking a vintage scientific workflow and rebuilding it with a modular Python analysis pipeline. I wanted to see if I could replace Rahman&rsquo;s &ldquo;write-once&rdquo; Fortran mentality with modern reproducibility, type safety, and intelligent caching.</p>
<p>The full source code is available on <a href="https://github.com/hunter-heidenreich/argon-simulation">GitHub</a>. The complete project overview, including analysis results and pipeline architecture, is on the <a href="/projects/rahman-1964-replication/">Rahman 1964 Replication project page</a>.</p>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube-nocookie.com/embed/KjFixUt6bnQ?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<hr>
<h2 id="engineering-the-pipeline">Engineering the Pipeline</h2>
<p>The most interesting part of this project isn&rsquo;t the simulation engine (LAMMPS handles that); it&rsquo;s the architecture of the analysis suite. MD analysis is computationally expensive ($O(N^2)$), and iterating on plots can be painfully slow if you re-compute trajectory data every time.</p>
<p>Why bother? Don&rsquo;t modern MD packages come with analysis tools?
Well, some say that writing is thinking.
When you get into the weeds of how an algorithm works or how an analysis is performed, you gain insight and a deeper understanding that a plug-and-play tool might obscure.</p>
<h3 id="intelligent-caching">Intelligent Caching</h3>
<p>I built the <code>argon_sim</code> package with a decorator-based caching layer. The system hashes the source file&rsquo;s modification time and the function&rsquo;s arguments to avoid re-calculating the Radial Distribution Function (RDF) or Van Hove correlations on every script run.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#a6e22e">@cached_computation</span>(<span style="color:#e6db74">&#34;gr&#34;</span>)
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">compute_radial_distribution</span>(filename: str, dr: float <span style="color:#f92672">=</span> <span style="color:#ae81ff">0.05</span>):
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># ... expensive O(N^2) distance calculations ...</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> r_values, g_r, density
</span></span></code></pre></div><p>If I tweak a plot axis, the script runs instantly, loading pre-computed arrays from disk instead of re-running the $O(N^2)$ computation. If I change the simulation trajectory, the cache invalidates automatically.</p>
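<p>The post does not show the decorator itself, so the following is only a minimal sketch of how such a caching layer could work. It assumes the first argument is the trajectory filename, stores results with <code>pickle</code>, and uses a hypothetical cache directory (<code>CACHE_DIR</code>); none of these details are confirmed to match the real <code>argon_sim</code> code.</p>

```python
import functools
import hashlib
import os
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location, chosen only for this sketch.
CACHE_DIR = Path(tempfile.gettempdir()) / "argon_sim_cache_sketch"


def cached_computation(name):
    """Cache a function's result on disk, keyed by file mtime and arguments."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(filename, *args, **kwargs):
            # The key changes whenever the source file is modified
            # or any argument takes a different value.
            mtime = os.path.getmtime(filename)
            raw = repr((name, str(filename), mtime, args, sorted(kwargs.items())))
            key = hashlib.sha256(raw.encode()).hexdigest()[:16]
            path = CACHE_DIR / f"{name}_{key}.pkl"

            if path.exists():
                # Cache hit: load the stored result instead of recomputing.
                with path.open("rb") as fh:
                    return pickle.load(fh)

            # Cache miss: compute, then store for the next run.
            result = func(filename, *args, **kwargs)
            CACHE_DIR.mkdir(parents=True, exist_ok=True)
            with path.open("wb") as fh:
                pickle.dump(result, fh)
            return result

        return wrapper

    return decorator
```

<p>Including the modification time in the key is what makes invalidation automatic: a new trajectory file produces a new key, and the stale entry is simply never read again.</p>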
<h3 id="vectorization--memory-management">Vectorization &amp; Memory Management</h3>
<p>Rahman likely relied on nested loops. Pure-Python loops are too slow for that, so I used <strong>NumPy broadcasting</strong> to vectorize the calculation of atomic displacements.</p>
<p>However, calculating an $864 \times 864$ distance matrix for each of 5,001 frames consumes significant RAM. I implemented a <strong>chunked MSD (Mean Square Displacement) algorithm</strong> that processes the trajectory in blocks. The chunking trades some vectorization speed for a bounded memory footprint, so the analysis is never limited by having to hold every full-size intermediate array in RAM at once.</p>
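<p>To make the chunking idea concrete, here is a minimal sketch of a block-wise MSD, not the project&rsquo;s actual implementation. The function name <code>msd_chunked</code>, the <code>chunk_size</code> parameter, and the assumption that positions are already unwrapped across periodic boundaries are all illustrative choices.</p>

```python
import numpy as np


def msd_chunked(positions, max_lag, chunk_size=256):
    """Mean square displacement averaged over atoms and time origins.

    positions: array of shape (n_frames, n_atoms, 3), assumed unwrapped.
    Time origins are processed in blocks of `chunk_size` frames so the
    temporary displacement array never exceeds chunk_size * n_atoms * 3 floats.
    """
    n_frames, n_atoms = positions.shape[0], positions.shape[1]
    msd = np.zeros(max_lag + 1)

    for lag in range(1, max_lag + 1):
        n_origins = n_frames - lag
        total = 0.0
        for start in range(0, n_origins, chunk_size):
            stop = min(start + chunk_size, n_origins)
            # Vectorized over the origins in this block and over all atoms.
            disp = positions[start + lag:stop + lag] - positions[start:stop]
            total += np.sum(disp * disp)
        msd[lag] = total / (n_origins * n_atoms)

    return msd
```

<p>A larger <code>chunk_size</code> means fewer Python-level iterations but a bigger temporary array; the result is the same either way.</p>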
<h3 id="reproducibility-as-a-feature">Reproducibility as a Feature</h3>
<p>Academic code is notorious for &ldquo;it works on my machine.&rdquo; To combat this, I used <strong><code>uv</code></strong> for dependency management, locking the exact environment state. The entire workflow (from simulation to final figure generation) is abstracted into a <code>Makefile</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span><span style="color:#75715e"># One command to run the physics, analyze data, and generate plots</span>
</span></span><span style="display:flex;"><span>make workflow
</span></span></code></pre></div><hr>
<h2 id="the-simulation-1964-vs-2025">The Simulation: 1964 vs. 2025</h2>
<p>I preserved Rahman&rsquo;s physical parameters exactly to ensure a fair comparison:</p>
<ul>
<li><strong>System</strong>: 864 Argon atoms</li>
<li><strong>Potential</strong>: Lennard-Jones ($\sigma = 3.4$ Å, $\epsilon/k_B = 120$ K)</li>
<li><strong>Target</strong>: 94.4 K, 1.374 g/cm³</li>
</ul>
<p>However, I modernized the <em>numerical</em> methods to ensure stability:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Feature</th>
          <th style="text-align: left">Rahman (1964)</th>
          <th style="text-align: left">This Work (2025)</th>
          <th style="text-align: left">Why it Matters</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Integration</strong></td>
          <td style="text-align: left">Predictor-Corrector</td>
          <td style="text-align: left">Velocity Verlet</td>
          <td style="text-align: left">Better energy conservation over long runs</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Timestep</strong></td>
          <td style="text-align: left">10 fs</td>
          <td style="text-align: left">2 fs</td>
          <td style="text-align: left">Rahman&rsquo;s step was aggressive; 2 fs ensures numerical stability</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Equilibration</strong></td>
          <td style="text-align: left">Velocity Scaling</td>
          <td style="text-align: left">1 ns NVT</td>
          <td style="text-align: left">Rahman couldn&rsquo;t afford long equilibrations; I melted the crystal properly to remove bias</td>
      </tr>
  </tbody>
</table>
<p>The production run lasted 10 ps in the NVE ensemble, generating 5,001 frames. The mean temperature stayed within 1% of the target, with a relative RMS fluctuation of 0.0165 (about 1.65% of the mean).</p>
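<p>As a quick sanity check on these parameters, the arithmetic below converts the target state point into Lennard-Jones reduced units and recomputes the step counts. It is a sketch only: the argon molar mass (39.948 g/mol) is an assumed standard value that the post does not state, and the frame count assumes one frame is written per step plus the initial configuration.</p>

```python
# Physical constants
AVOGADRO = 6.02214076e23     # 1/mol
ARGON_MOLAR_MASS = 39.948    # g/mol (assumption, not stated in the post)

# Parameters quoted in the post
N_ATOMS = 864
SIGMA_A = 3.4                # Angstrom
EPS_OVER_KB = 120.0          # K
TEMPERATURE_K = 94.4
DENSITY_G_CM3 = 1.374
TIMESTEP_FS = 2.0

# Mass density to number density (1 cm^3 = 1e24 cubic Angstrom)
number_density = DENSITY_G_CM3 / ARGON_MOLAR_MASS * AVOGADRO / 1e24

# Cubic box edge that holds 864 atoms at this density
box_length = (N_ATOMS / number_density) ** (1.0 / 3.0)

# Lennard-Jones reduced units
reduced_density = number_density * SIGMA_A**3
reduced_temperature = TEMPERATURE_K / EPS_OVER_KB

# Step counts implied by the 2 fs timestep
production_steps = round(10_000.0 / TIMESTEP_FS)        # 10 ps
equilibration_steps = round(1_000_000.0 / TIMESTEP_FS)  # 1 ns

print(f"box length          : {box_length:.2f} Angstrom")
print(f"reduced density     : {reduced_density:.3f}")
print(f"reduced temperature : {reduced_temperature:.3f}")
print(f"production frames   : {production_steps + 1}")
print(f"equilibration steps : {equilibration_steps}")
```

<p>Under these assumptions the state point sits near a reduced temperature of 0.79 and a reduced density of 0.81, in a box roughly 34.7 Å on a side.</p>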
<hr>
<h2 id="validation-results">Validation Results</h2>
<p>The replication was quantitatively successful. The analysis pipeline faithfully reproduced every key signature of liquid argon.</p>
<h3 id="the-cage-effect">The Cage Effect</h3>
<p>This is the paper&rsquo;s crown jewel. In a gas, velocity correlations decay exponentially. In a liquid, Rahman discovered that atoms get trapped by their neighbors and bounce back, causing the correlation to go <em>negative</em>.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-velocity-autocorrelation.webp"
         alt="Velocity Autocorrelation Function"
         title="Velocity Autocorrelation Function"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The VACF dips below zero at 0.3 ps. This &lsquo;negative correlation&rsquo; is the signature of the cage effect: atoms rattling against their neighbors.</figcaption>
    
</figure>

<p>My simulation captures this minimum at -0.083, matching Rahman&rsquo;s observation. The Fourier transform of this data (the frequency spectrum) reveals a peak at $\beta \approx 0.25$, physically representing the frequency of atomic collisions within the cage.</p>
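<p>For readers who want to see what goes into such a curve, a normalized velocity autocorrelation function can be sketched as below. This is an illustrative version, not the project&rsquo;s code: the function name <code>vacf</code> and the array layout <code>(n_frames, n_atoms, 3)</code> are assumptions.</p>

```python
import numpy as np


def vacf(velocities, max_lag):
    """Normalized velocity autocorrelation function.

    velocities: array of shape (n_frames, n_atoms, 3).
    Returns C(lag) / C(0) for lag = 0..max_lag, averaging the dot product
    v(t0) . v(t0 + lag) over all atoms and all available time origins t0.
    """
    n_frames = velocities.shape[0]
    out = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Dot products between each frame and the frame `lag` steps later.
        dots = np.sum(velocities[: n_frames - lag] * velocities[lag:], axis=-1)
        out[lag] = dots.mean()
    return out / out[0]
```

<p>With this normalization the curve starts at 1, and a negative value at some lag means that velocities at that delay point, on average, against their original direction.</p>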















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-vacf-frequency-spectrum.webp"
         alt="Frequency spectrum of the VACF showing characteristic peak from atomic caging effects"
         title="Frequency spectrum of the VACF showing characteristic peak from atomic caging effects"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Frequency spectrum of the VACF showing characteristic peak from atomic caging effects</figcaption>
    
</figure>

<h3 id="structural-fingerprints">Structural Fingerprints</h3>
<p>The Radial Distribution Function $g(r)$ and its Fourier transform, the Structure Factor $S(k)$, are the &ldquo;fingerprints&rdquo; of a liquid&rsquo;s structure.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-radial-distribution-function.webp"
         alt="Radial Distribution Function and Structure Factor"
         title="Radial Distribution Function and Structure Factor"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The sharp first peak (3.82 Å) shows defined nearest neighbors, while the decay shows the lack of long-range order. My calculated peaks match Rahman&rsquo;s to within about 3%.</figcaption>
    
</figure>

<p>The agreement here is striking. My first peak appeared at <strong>3.82 Å</strong> (Rahman: 3.7 Å). The slight discrepancy is likely due to my improved equilibration method, which allowed the system to relax into a more natural liquid state than Rahman&rsquo;s 1960s hardware allowed.</p>
<h3 id="diffusion-and-non-gaussian-behavior">Diffusion and Non-Gaussian Behavior</h3>
<p>By calculating the Mean Square Displacement (MSD), I derived a diffusion coefficient of <strong>$D = 2.47 \times 10^{-5}$ cm²/s</strong>, which deviates by less than <strong>2%</strong> from Rahman&rsquo;s reported $2.43 \times 10^{-5}$ cm²/s.</p>
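<p>The post does not spell out the fitting procedure, so the sketch below assumes the standard Einstein relation in three dimensions, $\text{MSD}(t) \approx 6Dt$ at long times, and fits a straight line to the diffusive part of the curve. The function name, the <code>fit_start_ps</code> cutoff, and the input units (Å² and ps) are illustrative assumptions.</p>

```python
import numpy as np


def diffusion_coefficient(time_ps, msd_A2, fit_start_ps):
    """Estimate D in cm^2/s from the long-time slope of the MSD.

    time_ps: times in picoseconds; msd_A2: MSD in square Angstrom.
    Only points with time at or beyond fit_start_ps are fitted, which
    skips the short-time ballistic regime.
    """
    mask = time_ps >= fit_start_ps
    slope, _intercept = np.polyfit(time_ps[mask], msd_A2[mask], 1)
    # Einstein relation in 3D: slope = 6 D.
    # Unit conversion: 1 Angstrom^2/ps = 1e-16 cm^2 / 1e-12 s = 1e-4 cm^2/s.
    return slope / 6.0 * 1e-4
```

<p>In these units, a slope of about 1.48 Å²/ps corresponds to $D \approx 2.47 \times 10^{-5}$ cm²/s.</p>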















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-mean-square-displacement.webp"
         alt="Mean Square Displacement vs time showing ballistic to diffusive transition"
         title="Mean Square Displacement vs time showing ballistic to diffusive transition"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Mean Square Displacement vs. time showing ballistic to diffusive transition</figcaption>
    
</figure>

<p>More interestingly, I reproduced the &ldquo;Non-Gaussian&rdquo; parameters. Standard diffusion assumes a Gaussian distribution of displacements. Rahman found (and I confirmed) that liquid atoms deviate from this. They exhibit &ldquo;jump&rdquo; and &ldquo;wait&rdquo; dynamics, a behavior that standard Brownian motion models fail to capture.</p>
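<p>One way to quantify this deviation is through moment ratios of the displacement distribution. The sketch below uses the definition $\alpha_n = \langle r^{2n} \rangle / (C_n \langle r^2 \rangle^n) - 1$ with $C_n = 1 \cdot 3 \cdot 5 \cdots (2n+1) / 3^n$, which vanishes for a three-dimensional Gaussian. The function name and array layout are assumptions, not the project&rsquo;s API.</p>

```python
import numpy as np


def non_gaussian_parameter(displacements, n):
    """alpha_n for a set of 3D displacement vectors at one time lag.

    displacements: array of shape (..., 3).
    Returns 0 (up to sampling noise) when the displacements are Gaussian.
    """
    r2 = np.sum(displacements**2, axis=-1)
    # C_n = 1 * 3 * 5 * ... * (2n + 1) / 3^n, e.g. C_2 = 5/3 and C_3 = 35/9.
    c_n = np.prod(np.arange(1, 2 * n + 2, 2)) / 3.0**n
    return np.mean(r2**n) / (c_n * np.mean(r2) ** n) - 1.0
```

<p>A population in which some atoms have barely moved while others have jumped gives a positive $\alpha_2$, which is the kind of heterogeneity this parameter is meant to detect.</p>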















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-non-gaussian-parameters.webp"
         alt="Non-Gaussian parameters showing deviation from simple diffusive behavior"
         title="Non-Gaussian parameters showing deviation from simple diffusive behavior"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Evidence that atoms do not follow a simple random walk. The non-zero alpha parameters indicate heterogeneous dynamics.</figcaption>
    
</figure>

<h3 id="advanced-analysis-van-hove-functions">Advanced Analysis: Van Hove Functions</h3>
<p>Rahman also explored advanced properties like the Van Hove correlation function $G(r,t)$, which describes how liquid structure evolves over time.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-van-hove-correlation.webp"
         alt="Van Hove distinct correlation function G_d(r,t) at two time points"
         title="Van Hove distinct correlation function G_d(r,t) at two time points"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Van Hove distinct correlation function showing how neighbor coordination shells &lsquo;melt&rsquo; as time progresses</figcaption>
    
</figure>

<p>At 1.0 ps, the structure remains well-defined with clear shells. By 2.5 ps, it becomes increasingly diffuse. Rahman compared this evolution to theoretical predictions (the Vineyard approximation) and found that theory predicted overly rapid structural decay. My results confirm this finding.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-delayed-convolution.webp"
         alt="Delayed convolution approximation testing Rahman&#39;s theoretical improvement"
         title="Delayed convolution approximation testing Rahman&#39;s theoretical improvement"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Testing Rahman&rsquo;s &lsquo;delayed convolution approximation&rsquo; (his proposed improvement over existing theory)</figcaption>
    
</figure>

<hr>
<h2 id="system-validation">System Validation</h2>
<p>Before analyzing physics, basic sanity checks confirmed proper thermal equilibrium.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-temperature-stability.webp"
         alt="Temperature vs time plot showing excellent temperature control around 94.4 K target"
         title="Temperature vs time plot showing excellent temperature control around 94.4 K target"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Temperature vs. Time - 5001 frames showing excellent temperature control with mean 94.73 K</figcaption>
    
</figure>

<p>Mean temperature was 94.73 K (0.33 K off target) with a standard deviation of 1.56 K.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-maxwell-boltzmann-velocity.webp"
         alt="Maxwell-Boltzmann velocity distribution"
         title="Maxwell-Boltzmann velocity distribution"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Maxwell-Boltzmann velocity distribution from 12.9 million velocity components</figcaption>
    
</figure>

<p>The velocity distribution from 12.9 million velocity components produces a clean Maxwell-Boltzmann distribution, as expected for thermal equilibrium. The distribution widths at various heights closely match Rahman&rsquo;s results: 1.77, 2.48, and 3.56 compared to his 1.77, 2.52, and 3.52.</p>
<hr>
<h2 id="conclusion">Conclusion</h2>
<p>Replicating a 60-year-old paper might seem like a solved puzzle, but it teaches a valuable lesson in computational science. Rahman relied on brilliance and raw mathematical intuition because he lacked compute power. Today, pairing modern compute with disciplined software practices makes the same result reproducible and auditable.</p>
<p>Applying modern software engineering (<strong>modular architecture, caching, and automated workflows</strong>) to classical physics reproduces the past and builds a foundation that makes the <em>next</em> discovery easier, faster, and more reliable.</p>
<p>The quantitative agreement is striking: diffusion coefficients within 2%, the first structural peak within about 0.1 Å, and the velocity distribution width at $e^{-1/2}$ of the peak matching to three significant figures. This level of reproducibility, achieved with completely different hardware and software, validates something fundamental: Rahman&rsquo;s physical model was remarkably sound, and his computational methodology was scientifically rigorous despite 1960s constraints.</p>
<p>The cage effect, velocity correlations, and structural evolution are fundamental characteristics of how matter behaves at the atomic scale, as relevant today as they were six decades ago.</p>
]]></content:encoded></item><item><title>Modernizing Rahman's 1964 Argon Simulation</title><link>https://hunterheidenreich.com/projects/rahman-1964-replication/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/rahman-1964-replication/</guid><description>A high-fidelity replication of foundational molecular dynamics using modern software engineering practices: caching, vectorization, and strict reproducibility.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>This project is a &ldquo;digital restoration&rdquo; of Aneesur Rahman&rsquo;s seminal 1964 paper, <em>Correlations in the Motion of Atoms in Liquid Argon</em>. While the physics of liquid argon is a solved problem, the challenge lies in bridging the gap between 1960s mainframe constraints and 2025 software architecture.</p>
<p>I replicated the simulation using <strong>LAMMPS</strong> and built a <strong>Python analysis pipeline</strong> to process the trajectory data. The project demonstrates how modern tooling (<code>uv</code>, type hinting, vectorized NumPy) can transform academic &ldquo;write-once&rdquo; scripts into a reproducible research toolkit.</p>
<h2 id="features">Features</h2>
<h3 id="the-analysis-pipeline">The Analysis Pipeline</h3>
<p>I architected a modular Python package (<code>argon_sim</code>) designed for performance and maintainability.</p>
<ul>
<li><strong>Intelligent Caching System</strong>: MD analysis is compute-intensive ($O(N^2)$). I implemented a decorator-based caching layer (<code>@cached_computation</code>) that hashes source file modification times and function arguments. This ensures expensive calculations (like RDF or Van Hove correlations) are only re-run when the underlying trajectory or parameters actually change.</li>
<li><strong>Vectorization &amp; Optimization</strong>: To handle the $N^2$ complexity of pair-wise interactions without C++ extensions, I utilized NumPy broadcasting. For example, the Mean Square Displacement (MSD) calculation is fully vectorized, with a fallback &ldquo;chunked&rdquo; implementation to handle memory overflows on smaller machines.</li>
<li><strong>Modern Python Tooling</strong>:
<ul>
<li><strong>Dependency Management</strong>: Used <code>uv</code> for deterministic environment locking (sub-second resolution).</li>
<li><strong>Type Safety</strong>: Fully type-hinted codebase for static analysis compliance.</li>
<li><strong>Automation</strong>: A <code>Makefile</code> abstracts the workflow (simulation → analysis → figure generation) into single commands (e.g., <code>make figure-5</code>).</li>
</ul>
</li>
</ul>
<h3 id="the-simulation-strategy">The Simulation Strategy</h3>
<p>I used LAMMPS for the MD engine but strictly adhered to Rahman&rsquo;s physical parameters while modernizing the stability mechanisms.</p>
<ul>
<li><strong>Integration</strong>: Replaced Rahman&rsquo;s predictor-corrector method with the modern standard <strong>Velocity Verlet</strong> algorithm (2 fs timestep).</li>
<li><strong>Equilibration</strong>: I implemented a 1 ns <strong>NVT equilibration</strong> phase (500,000 steps at the 2 fs timestep) to properly melt the FCC crystal structure before the NVE production run.</li>
<li><strong>Intellectual Honesty</strong>: The <code>in.argon</code> script explicitly documents every deviation from the original methodology (e.g., energy minimization) and the justification for ensuring numerical stability.</li>
</ul>
<h2 id="usage">Usage</h2>
<p>The project uses a <code>Makefile</code> to automate the workflow. Run <code>make all</code> to execute the LAMMPS simulation and generate all analysis figures.</p>
<h2 id="results">Results</h2>
<p>The replication achieved high quantitative agreement with the historical data, validating both the simulation parameters and the custom analysis code.</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Property</th>
          <th style="text-align: left">Rahman (1964)</th>
          <th style="text-align: left">This Work</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">Diffusion Coefficient ($D$)</td>
          <td style="text-align: left">$2.43 \times 10^{-5}$ cm²/s</td>
          <td style="text-align: left">$2.47 \times 10^{-5}$ cm²/s</td>
          <td style="text-align: left">Agreement within 2%</td>
      </tr>
      <tr>
          <td style="text-align: left">RDF First Peak</td>
          <td style="text-align: left">$3.7$ Å</td>
          <td style="text-align: left">$3.82$ Å</td>
          <td style="text-align: left">Slight shift</td>
      </tr>
      <tr>
          <td style="text-align: left">Velocity Dist. Width ($e^{-1/2}$)</td>
          <td style="text-align: left">$1.77$</td>
          <td style="text-align: left">$1.77$</td>
          <td style="text-align: left">Exact match to theoretical Maxwell-Boltzmann</td>
      </tr>
  </tbody>
</table>
<h3 id="visual-replication">Visual Replication</h3>
<p>I used Matplotlib to digitally recreate Rahman&rsquo;s hand-drawn plots, confirming signatures like the <strong>negative region in the Velocity Autocorrelation Function (VACF)</strong>, which provided the first evidence of the &ldquo;cage effect&rdquo; in simple liquids.</p>















<figure class="post-figure center ">
    <img src="/img/rahman-1964-argon-molecular-dynamics/rahman-argon-velocity-autocorrelation.webp"
         alt="Velocity Autocorrelation Function comparison showing the characteristic negative region"
         title="Velocity Autocorrelation Function comparison showing the characteristic negative region"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The VACF&rsquo;s negative region (first evidence of the &lsquo;cage effect&rsquo; in liquids) reproduced 60 years later.</figcaption>
    
</figure>

<h2 id="challenges--learnings">Challenges &amp; Learnings</h2>
<ul>
<li><strong>Unit Hell</strong>: Rahman&rsquo;s paper uses a mix of reduced units and CGS. Mapping these to LAMMPS&rsquo;s <code>real</code> units required a dedicated <code>constants.py</code> module and rigorous unit testing to prevent dimensional errors.</li>
<li><strong>Fourier Transforms</strong>: Calculating the Structure Factor $S(k)$ from $g(r)$ required implementing a manual 3D Fourier transform for spherical symmetry, as standard FFT packages do not account for the radial shell integration implicit in liquid structure analysis.</li>
<li><strong>Code as a Liability</strong>: Early in the project, I realized that re-running analysis scripts was becoming a bottleneck. This drove the decision to build the caching infrastructure, reinforcing the lesson that investing in developer tooling pays off even in small-scale scientific projects.</li>
</ul>
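<p>The radial Fourier transform mentioned above can be sketched as follows. This assumes the standard isotropic relation $S(k) = 1 + 4\pi\rho \int_0^\infty r^2 [g(r) - 1] \frac{\sin(kr)}{kr} dr$, evaluated with a plain trapezoidal rule on the tabulated $g(r)$; the function name and signature are illustrative and may differ from the project&rsquo;s code.</p>

```python
import numpy as np


def structure_factor(r, g_r, density, k_values):
    """S(k) from a tabulated g(r) via the isotropic (radial) Fourier transform.

    r: radii (ascending), g_r: g(r) on that grid,
    density: number density in units consistent with r,
    k_values: wavevector magnitudes at which to evaluate S(k).
    """
    h = g_r - 1.0
    kr = np.outer(k_values, r)
    # np.sinc(x) = sin(pi x) / (pi x), so sinc(kr / pi) = sin(kr) / (kr),
    # and it handles kr = 0 without a division by zero.
    kernel = np.sinc(kr / np.pi)
    integrand = 4.0 * np.pi * density * r**2 * h * kernel
    # Trapezoidal rule along r for every k at once.
    dr = np.diff(r)
    integral = np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]) * dr, axis=1)
    return 1.0 + integral
```

<p>Because $g(r)$ is only known out to a finite cutoff, the integral is truncated there; in practice this shows up as ripples in $S(k)$ at small $k$.</p>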
<h2 id="related-work">Related Work</h2>
<p>The full methodology and physics are documented in the companion blog post:</p>
<ul>
<li><a href="/posts/rahman-1964-lammps-liquid-argon/">Replicating Rahman&rsquo;s 1964 Liquid Argon Simulation</a></li>
</ul>
]]></content:encoded></item><item><title>Kabsch Algorithm: NumPy, PyTorch, TensorFlow, and JAX</title><link>https://hunterheidenreich.com/posts/kabsch-algorithm/</link><pubDate>Tue, 03 Oct 2023 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/posts/kabsch-algorithm/</guid><description>Learn about the Kabsch algorithm for optimal point alignment with implementations in NumPy, PyTorch, TensorFlow, and JAX for ML applications.</description><content:encoded><![CDATA[<h2 id="what-is-the-kabsch-algorithm">What is the Kabsch Algorithm?</h2>
<p>In computer vision and scientific computing, one problem arises again and again: given two sets of paired points, what is the optimal rigid-body transformation that aligns them? The Kabsch algorithm provides an elegant solution.</p>















<figure class="post-figure center ">
    <img src="/img/scientific-computing/kabsch-alignment-before-and-after.webp"
         alt="Visualization of two point sets before and after Kabsch alignment"
         title="Visualization of two point sets before and after Kabsch alignment"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The Kabsch algorithm optimally rotates and translates the blue points to align with the red points.</figcaption>
    
</figure>

<p>What are some concrete situations where this crops up?</p>
<ul>
<li><strong>Molecular Dynamics</strong>: Your points are a set of atoms (with physically relevant types), and you want to compare two molecular conformations. Are they the same structure with minor noise or rotation? Or are they different conformations, like a different folding of a protein? This is especially helpful when applying generative models to chemical structures. For example, if you are building a <a href="/notes/chemistry/molecular-simulation/ml-potentials/denoise-vae/">3D Molecular VAE</a> in PyTorch or working with <a href="/notes/machine-learning/generative-models/flow-matching-for-generative-modeling/">Flow Matching models</a>, Kabsch alignment ensures your generative loss function remains rotationally invariant.</li>
<li><strong>Computer Vision</strong>: You have two point clouds from 3D scans of an object taken from different angles. You want to align them to reconstruct the full shape. Or perhaps you&rsquo;re generating 3D shapes from 2D images and need to compare the generated shape to a ground truth scan. Anytime a 3D system is represented as a point cloud, the Kabsch algorithm can help with alignment.</li>
</ul>
<p>Of course, existing libraries implement this algorithm. However, often I find it beneficial to implement algorithms from scratch to build intuition. Furthermore, modern machine learning applications require automatic differentiation, so we will implement the algorithm in PyTorch, TensorFlow, and JAX.</p>
<p>Below, we&rsquo;ll cover the math behind the Kabsch algorithm (and its scaling variant, the <strong>Kabsch-Umeyama</strong> algorithm) and provide complete, differentiable implementations in <strong>NumPy</strong>, <strong>PyTorch</strong>, <strong>TensorFlow</strong>, and <strong>JAX</strong>, demonstrating both single-pair and batched computations for ML applications.</p>
<h2 id="the-math">The Math</h2>















<figure class="post-figure center ">
    <img src="/img/scientific-computing/kabsch-algorithm-basic-animation.webp"
         alt="Animation showing the sequential steps of centroid alignment and rotation"
         title="Animation showing the sequential steps of centroid alignment and rotation"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Visualizing the alignment process: first centering the datasets, then finding the optimal rotation.</figcaption>
    
</figure>

<p>Let&rsquo;s say we have two sets of paired points,
$P = \{\mathbf{p}_i\} \in \mathbb{R}^{N \times D}$ and $Q = \{\mathbf{q}_i\} \in \mathbb{R}^{N \times D}$, for $i = 1, \dots, N$
(where $D$ is the dimensionality and $N$ is the number of points).
We want to find a translation vector $\mathbf{t}$ and rotation matrix $R$ to transform $P$ to align with $Q$.</p>
<p>The optimization problem is:</p>
<p>$$
\min_{\mathbf{t}, \ R} \mathcal{L}(\mathbf{t}, R) = \frac{1}{2} \sum_{i=1}^N \lVert \mathbf{q}_i - (R\mathbf{p}_i + \mathbf{t}) \rVert^2
$$</p>
<p>The minimizers $\mathbf{t}^\ast \in \mathbb{R}^D$ and $R^\ast \in \mathbb{R}^{D \times D}$ are the optimal translation and rotation, with $R$ constrained to be a proper rotation ($R^T R = I$, $\det R = 1$).</p>
<p>Often we use a weighted version with weights $w_i$ (e.g., atomic masses in molecular dynamics):</p>
<p>$$
\min_{\mathbf{t}, \ R} \mathcal{L}(\mathbf{t}, R) = \frac{1}{2} \sum_{i=1}^N w_i \lVert \mathbf{q}_i - (R\mathbf{p}_i + \mathbf{t}) \rVert^2
$$</p>
<h3 id="the-translation">The Translation</h3>
<p>The translation and rotation are coupled, but they separate cleanly once we work in centroid-centered coordinates. Compute the centroids (averages) of both point sets:</p>
<p>$$
\bar{\mathbf{p}} = \frac{1}{N} \sum_{i=1}^N \mathbf{p}_i \quad \text{and} \quad \bar{\mathbf{q}} = \frac{1}{N} \sum_{i=1}^N \mathbf{q}_i
$$</p>
<p>For any fixed rotation $R$, the translation that minimizes $\mathcal{L}$ is found by setting $\partial \mathcal{L} / \partial \mathbf{t} = 0$. It maps the rotated source centroid onto the target centroid:</p>
<p>$$
\mathbf{t} = \bar{\mathbf{q}} - R\bar{\mathbf{p}}
$$</p>
<p>A tempting shortcut is to write $\mathbf{t} = \bar{\mathbf{q}} - \bar{\mathbf{p}}$, but that is only correct when $R = I$. In general the translation depends on the rotation, so we compute it <em>after</em> solving for $R$. Substituting this optimal $\mathbf{t}$ back into the objective cancels the centroids and leaves a rotation-only problem in the centered coordinates $\mathbf{p}_i^\prime = \mathbf{p}_i - \bar{\mathbf{p}}$ and $\mathbf{q}_i^\prime = \mathbf{q}_i - \bar{\mathbf{q}}$:</p>
<p>$$
\mathcal{L}(R) = \frac{1}{2} \sum_{i=1}^N | \mathbf{q}_i^\prime - R\mathbf{p}_i^\prime |^2
$$</p>
<p>which is what the next section solves.</p>
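<p>A small numerical check can make the two claims above tangible. The sketch below assumes a hand-picked rotation (90 degrees about the z-axis) and a fixed random seed, both chosen only for illustration. It compares the formula for $\mathbf{t}$ against the tempting shortcut, and confirms that the centered objective matches the original one at the optimal translation.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.normal(size=(50, 3))

# Ground-truth rotation by 90 degrees about the z-axis, plus a shift.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
Q = P @ R.T + np.array([3.0, -2.0, 5.0])

p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)


def loss(t):
    """Unweighted objective for the fixed rotation R and translation t."""
    return 0.5 * np.sum((Q - (P @ R.T + t)) ** 2)


t_opt = q_bar - R @ p_bar        # translation from the formula above
t_shortcut = q_bar - p_bar       # only valid when R = I

# Centered coordinates: the translation drops out of the objective.
P_c, Q_c = P - p_bar, Q - q_bar
loss_centered = 0.5 * np.sum((Q_c - P_c @ R.T) ** 2)

print(loss(t_opt), loss_centered, loss(t_shortcut))
```

<p>With these inputs the first two printed values agree (both are zero up to rounding, since $Q$ is an exact rigid copy of $P$), while the shortcut leaves a visibly nonzero residual.</p>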
<h3 id="the-rotation-matrix">The Rotation Matrix</h3>
<p>We now minimize $\mathcal{L}(R)$ over rotations, using the centered points $\mathbf{p}_i^\prime$ and $\mathbf{q}_i^\prime$ from above. Compute the cross-covariance matrix between the centered sets:</p>
<p>$$
C = P^{\prime T} Q^\prime = \sum_{i=1}^N \mathbf{p}_i^{\prime} \mathbf{q}_i^{\prime T} \in \mathbb{R}^{D \times D}
$$</p>
<p>This is a fairly lightweight operation since $D$ is typically small (e.g., 3 for 3D points), even if $N$ is large.</p>
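<p>To make the shape bookkeeping explicit, the sketch below builds $C$ two ways: as a single matrix product and as a sum of $D \times D$ outer products of paired centered points. The random data and variable names are illustrative assumptions. Either way the cost is $O(N D^2)$, which is linear in the number of points.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 1000, 3
P_c = rng.normal(size=(N, D))
P_c -= P_c.mean(axis=0)          # centered source points, one per row
Q_c = rng.normal(size=(N, D))
Q_c -= Q_c.mean(axis=0)          # centered target points

# Matrix form: a single (D, N) @ (N, D) product.
C_matrix = P_c.T @ Q_c

# Sum-of-outer-products form, written out explicitly for comparison.
C_outer = np.zeros((D, D))
for p, q in zip(P_c, Q_c):
    C_outer += np.outer(p, q)

print(C_matrix.shape, np.allclose(C_matrix, C_outer))
```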
<p>With $C$ in hand, we want to compute its Singular Value Decomposition (SVD):</p>
<p>$$
C = U \Sigma V^T
$$</p>
<p>This is the most algorithmically involved step. The SVD of a $D \times D$ matrix scales cubically with $D$ (i.e., $O(D^3)$), but its cost does not depend on $N$.
Since we&rsquo;re often interested in cases where $D$ is small (e.g., 2D or 3D points), this is manageable.</p>
<p>Next, we check for improper rotations (i.e., reflections) and correct for them where necessary:</p>
<p>$$
d = \text{sign}(\det(V U^T))
$$</p>
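<p>If $d = -1$, the unconstrained solution $V U^T$ is a reflection, so we flip the sign of the last column of $V$ (the one paired with the smallest singular value) to obtain a proper rotation.</p>
<p>Let $B = \text{diag}(1, 1, d)$ for $D = 3$ (in general, $B = \text{diag}(1, \dots, 1, d)$).
The optimal rotation matrix is then:</p>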
<p>$$
R^\ast = V B U^T
$$</p>
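<p>The reflection check is easiest to appreciate on an input that triggers it. In the sketch below (illustrative data and names, fixed seed), the target is a mirror image of the source, so the best orthogonal fit $V U^T$ has determinant $-1$ and the $B$ matrix is what restores a proper rotation.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
P_c = rng.normal(size=(20, 3))
P_c -= P_c.mean(axis=0)

# Target is a mirror image of the source (x-coordinate negated), so the
# unconstrained best orthogonal fit is a reflection, not a rotation.
mirror = np.diag([-1.0, 1.0, 1.0])
Q_c = P_c @ mirror.T

C = P_c.T @ Q_c
U, S, Vt = np.linalg.svd(C)      # NumPy returns V^T, so V = Vt.T
V = Vt.T

R_naive = V @ U.T                # reproduces the mirror: determinant -1
d = np.sign(np.linalg.det(R_naive))
B = np.diag([1.0, 1.0, d])
R_star = V @ B @ U.T             # proper rotation: determinant +1

print(np.linalg.det(R_naive), np.linalg.det(R_star))
```

<p>Without the correction, the returned matrix would align the mirrored points perfectly but would not be a rotation, which is usually not what an alignment of physical structures should allow.</p>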
<h3 id="summary">Summary</h3>
<p>In a nutshell, the Kabsch algorithm boils down to:</p>
<ol>
<li>Compute centroids of $P$ and $Q$ ($\bar{\mathbf{p}}$ and $\bar{\mathbf{q}}$)</li>
<li>Center both point sets by subtracting centroids: $P^\prime$ and $Q^\prime$</li>
<li>Compute cross-covariance matrix $C = P^{\prime T} Q^\prime$</li>
<li>Compute SVD: $C = U \Sigma V^T$ (<em>expensive step</em>)</li>
<li>Compute $d = \text{sign}(\det(V U^T))$ and $B = \text{diag}(1, 1, d)$</li>
<li>Optimal rotation: $R^\ast = V B U^T$</li>
<li>Optimal translation (using the rotation from step 6): $\mathbf{t}^\ast = \bar{\mathbf{q}} - R^\ast\bar{\mathbf{p}}$</li>
</ol>
<p>The resulting root-mean-square deviation (RMSD) between aligned point sets is</p>
<p>$$
\text{RMSD} = \sqrt{\frac{1}{N} \sum_{i=1}^N | \mathbf{q}_i - (R^\ast\mathbf{p}_i + \mathbf{t}^\ast) |^2}
$$</p>

<figure class="post-figure center ">
    <img src="/img/scientific-computing/kabsch-algorithm-visualized-rmsd.webp"
         alt="Diagram illustrating Root Mean Square Deviation (RMSD) distances"
         title="Diagram illustrating Root Mean Square Deviation (RMSD) distances"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">RMSD measures the average distance between the aligned points.</figcaption>
    
</figure>

<p>RMSD is frequently used as a measure of similarity between molecular structures and as a loss or evaluation metric in ML applications.</p>
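<p>The minimal RMSD can also be read off the singular values without applying the transform. The centered objective expands to $\sum_i \mathbf{p}_i^{\prime T}\mathbf{p}_i^\prime + \sum_i \mathbf{q}_i^{\prime T}\mathbf{q}_i^\prime - 2\,\text{tr}(R C)$, and at the optimum $\text{tr}(R^\ast C) = \text{tr}(B\Sigma)$. The sketch below checks this identity numerically against the direct formula; the random inputs and variable names are assumptions made for illustration.</p>

```python
import numpy as np

rng = np.random.default_rng(3)
P = rng.normal(size=(30, 3))
Q = rng.normal(size=(30, 3))     # unrelated points, so the RMSD is nonzero
N = P.shape[0]

p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
P_c, Q_c = P - p_bar, Q - q_bar

U, S, Vt = np.linalg.svd(P_c.T @ Q_c)
d = np.sign(np.linalg.det(Vt.T @ U.T))
b = np.array([1.0, 1.0, d])      # diagonal of B

# Direct route: build R* and t*, transform P, measure distances.
R_star = (Vt.T * b) @ U.T        # scaling the columns of V by b equals V @ B
t_star = q_bar - R_star @ p_bar
rmsd_direct = np.sqrt(np.sum((Q - (P @ R_star.T + t_star)) ** 2) / N)

# Shortcut: N * RMSD^2 = sum|p'|^2 + sum|q'|^2 - 2 * sum(b * S).
e0 = np.sum(P_c ** 2) + np.sum(Q_c ** 2)
rmsd_svd = np.sqrt(max(e0 - 2.0 * np.sum(b * S), 0.0) / N)

print(rmsd_direct, rmsd_svd)
```

<p>The <code>max(..., 0.0)</code> guard only protects against tiny negative values from rounding when the two structures are nearly identical.</p>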
<h3 id="the-kabsch-umeyama-algorithm-scaling">The Kabsch-Umeyama Algorithm (Scaling)</h3>
<p>While the standard Kabsch algorithm solves for optimal rotation and translation, the <strong>Kabsch-Umeyama algorithm</strong> extends this by also finding an optimal <strong>scaling factor</strong> $c$. This is essential when aligning structures of different scales, such as a 3D scan versus a ground truth model.</p>
<p><em>(Note: This is sometimes searched for as the &ldquo;Absch-Umeyama algorithm&rdquo; due to typos, but the correct attribution is to Shinji Umeyama based on Wolfgang Kabsch&rsquo;s work.)</em></p>
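<p>The method estimates the transformation $\mathbf{q}_i \approx c R \mathbf{p}_i + \mathbf{t}$. The rotation is the same as in the standard algorithm. The optimal scale is the sum of the reflection-corrected singular values of the cross-covariance, $\text{tr}(B\Sigma)$, divided by the total squared deviation of the source points about their centroid, $\sum_i \mathbf{p}_i^{\prime T}\mathbf{p}_i^\prime$. Dividing both quantities by $N$ gives the equivalent covariance-over-variance form. The translation becomes $\mathbf{t} = \bar{\mathbf{q}} - c R \bar{\mathbf{p}}$. See the <a href="/notes/biology/computational-biology/umeyama-similarity-transformation/">Umeyama paper notes</a> for the full derivation.</p>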
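<p>A minimal NumPy sketch of the scaled variant follows, assuming 3D points stored as rows. The function name <code>kabsch_umeyama</code> and its return order are illustrative, not an API from the post or from any library.</p>

```python
import numpy as np


def kabsch_umeyama(P, Q):
    """Similarity alignment q_i ~ c * R p_i + t for (N, 3) arrays (sketch)."""
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
    P_c, Q_c = P - p_bar, Q - q_bar

    U, S, Vt = np.linalg.svd(P_c.T @ Q_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    b = np.array([1.0, 1.0, d])              # reflection correction

    R = (Vt.T * b) @ U.T
    # Scale: corrected singular values over the summed squared norms of the
    # centered source points (both unnormalized, so the 1/N factors cancel).
    c = np.sum(b * S) / np.sum(P_c ** 2)
    t = q_bar - c * (R @ p_bar)
    return c, R, t


# Usage: recover a known scale, rotation, and shift.
rng = np.random.default_rng(4)
P = rng.normal(size=(40, 3))
R_true = np.array([[0.0, -1.0, 0.0],
                   [1.0,  0.0, 0.0],
                   [0.0,  0.0, 1.0]])
Q = 2.5 * P @ R_true.T + np.array([1.0, 2.0, 3.0])
c, R, t = kabsch_umeyama(P, Q)
print(c, t)
```

<p>On this noise-free input the recovered scale, rotation, and translation match the ones used to generate <code>Q</code> up to rounding.</p>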
<p><strong>A Note on SVD and Automatic Differentiation</strong></p>
<p>While modern frameworks allow us to backpropagate through the Singular Value Decomposition (SVD), doing so comes with a known stability issue. If the cross-covariance matrix has identical (degenerate) singular values, which can occur if the point clouds are perfectly aligned or have certain symmetries, the gradient of the SVD diverges and backpropagation produces <code>NaN</code> values. If you plan to use this algorithm as a loss function for a neural network, it is often necessary to add a tiny epsilon to the matrix before computing the SVD, or to use an SVD gradient patch. The <a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> library provides a SafeSVD primitive that floors the singular-value-gap denominator at machine epsilon in the backward pass, producing finite gradients at degenerate inputs across PyTorch, JAX, TensorFlow, and MLX.</p>
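<p>The failure mode can be seen without any autodiff framework. The standard SVD backward formulas contain terms whose denominators are gaps between (squared) singular values, so a symmetric point cloud drives those denominators to zero. The NumPy sketch below only illustrates this and one possible flooring rule. It is not the SafeSVD implementation, and the flooring shown is an assumption made for illustration.</p>

```python
import numpy as np

# Vertices of a cube: a centered point cloud with high symmetry.
P = np.array([[x, y, z]
              for x in (-1.0, 1.0)
              for y in (-1.0, 1.0)
              for z in (-1.0, 1.0)])

C = P.T @ P                                  # cloud aligned with itself: C = 8 * I
S = np.linalg.svd(C, compute_uv=False)       # all three singular values coincide

# Pairwise gaps s_i^2 - s_j^2; the off-diagonal entries act as denominators.
gaps = S[:, None] ** 2 - S[None, :] ** 2
off_diag = ~np.eye(3, dtype=bool)
largest_gap = np.max(np.abs(gaps[off_diag]))

# Illustrative floor (an assumption, not the library's actual code): keep the
# sign of each gap but never let its magnitude fall below machine epsilon.
eps = np.finfo(C.dtype).eps
safe_gaps = np.copysign(np.maximum(np.abs(gaps), eps), gaps)
inv_safe = 1.0 / safe_gaps[off_diag]         # finite, though very large

print(S, largest_gap, np.all(np.isfinite(inv_safe)))
```

<p>Flooring keeps the backward pass finite, but the resulting gradients near a degeneracy are large and should be treated with care, for example by clipping.</p>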
<h2 id="implementation">Implementation</h2>
<p>Let&rsquo;s implement the algorithm in different frameworks. Note that for simplicity, the following implementations cover the <strong>unweighted</strong> Kabsch algorithm. If your application (like molecular dynamics) requires weights (e.g., atomic masses), the <a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> library provides per-point weighted alignment out of the box.</p>
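<p>For readers who want to see what the weighted objective changes, here is a minimal sketch under stated assumptions: 3D points as rows, strictly positive weights, and a weighted RMSD normalized by the sum of the weights (one common convention). The name <code>kabsch_weighted</code> is hypothetical and this is not the Cookbook library&rsquo;s API. Only the centroids and the cross-covariance pick up weights; the SVD and reflection check are unchanged.</p>

```python
import numpy as np


def kabsch_weighted(P, Q, w):
    """Weighted rigid alignment of (N, 3) arrays P onto Q (illustrative sketch).

    w: (N,) array of positive weights, e.g. atomic masses.
    Returns the rotation R, translation t, and weighted RMSD.
    """
    w = np.asarray(w, dtype=float)
    w_sum = w.sum()

    # Weighted centroids replace the plain means.
    p_bar = (w[:, None] * P).sum(axis=0) / w_sum
    q_bar = (w[:, None] * Q).sum(axis=0) / w_sum
    P_c, Q_c = P - p_bar, Q - q_bar

    # Weighted cross-covariance: sum_i w_i * outer(p_i', q_i').
    C = (w[:, None] * P_c).T @ Q_c

    U, S, Vt = np.linalg.svd(C)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = (Vt.T * np.array([1.0, 1.0, d])) @ U.T
    t = q_bar - R @ p_bar

    sq_dist = np.sum((Q - (P @ R.T + t)) ** 2, axis=1)
    rmsd = np.sqrt(np.sum(w * sq_dist) / w_sum)
    return R, t, rmsd


# Usage: recover a known rigid motion with non-uniform weights.
rng = np.random.default_rng(5)
P = rng.normal(size=(25, 3))
R_true = np.array([[1.0, 0.0,  0.0],
                   [0.0, 0.0, -1.0],
                   [0.0, 1.0,  0.0]])        # 90 degrees about the x-axis
Q = P @ R_true.T + np.array([0.5, -1.0, 2.0])
w = rng.uniform(1.0, 16.0, size=25)          # stand-in for atomic masses
R, t, rmsd = kabsch_weighted(P, Q, w)
print(rmsd)
```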
<h3 id="numpy">NumPy</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> numpy <span style="color:#66d9ef">as</span> np
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_numpy</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>dot(p<span style="color:#f92672">.</span>T, q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Validate right-handed coordinate system</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(np<span style="color:#f92672">.</span>dot(Vt<span style="color:#f92672">.</span>T, U<span style="color:#f92672">.</span>T)) <span style="color:#f92672">&lt;</span> <span style="color:#ae81ff">0.0</span>:
</span></span><span style="display:flex;"><span>        Vt[<span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>, :] <span style="color:#f92672">*=</span> <span style="color:#f92672">-</span><span style="color:#ae81ff">1.0</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal rotation</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>dot(Vt<span style="color:#f92672">.</span>T, U<span style="color:#f92672">.</span>T)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q <span style="color:#f92672">-</span> np<span style="color:#f92672">.</span>dot(R, centroid_P)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>sqrt(np<span style="color:#f92672">.</span>sum(np<span style="color:#f92672">.</span>square(np<span style="color:#f92672">.</span>dot(p, R<span style="color:#f92672">.</span>T) <span style="color:#f92672">-</span> q)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">0</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><p>Here&rsquo;s a quick test to verify correctness:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">test_numpy</span>():
</span></span><span style="display:flex;"><span>    np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>seed(<span style="color:#ae81ff">12345</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    P <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>randn(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">3</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    alpha <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>rand() <span style="color:#f92672">*</span> <span style="color:#ae81ff">2</span> <span style="color:#f92672">*</span> np<span style="color:#f92672">.</span>pi
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>array([[np<span style="color:#f92672">.</span>cos(alpha), <span style="color:#f92672">-</span>np<span style="color:#f92672">.</span>sin(alpha), <span style="color:#ae81ff">0</span>],
</span></span><span style="display:flex;"><span>                    [np<span style="color:#f92672">.</span>sin(alpha), np<span style="color:#f92672">.</span>cos(alpha), <span style="color:#ae81ff">0</span>],
</span></span><span style="display:flex;"><span>                    [<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>]])
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>random<span style="color:#f92672">.</span>randn(<span style="color:#ae81ff">3</span>) <span style="color:#f92672">*</span> <span style="color:#ae81ff">10</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    Q <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>dot(P, R<span style="color:#f92672">.</span>T) <span style="color:#f92672">+</span> t
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    R_opt, t_opt, rmsd <span style="color:#f92672">=</span> kabsch_numpy(P, Q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;RMSD: </span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(rmsd))
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;R:</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(R))
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;R_opt:</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(R_opt))
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;t:</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(t))
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;t_opt:</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(t_opt))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    l2_t <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>norm(t <span style="color:#f92672">-</span> t_opt)
</span></span><span style="display:flex;"><span>    l2_R <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>norm(R <span style="color:#f92672">-</span> R_opt)
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;l2_t: </span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(l2_t))
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#39;l2_R: </span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#39;</span><span style="color:#f92672">.</span>format(l2_R))
</span></span></code></pre></div><p>Running this test shows the algorithm correctly recovers the rotation and translation:</p>
<pre><code>RMSD: 3.2111501877699246e-15
R:
[[-0.8475392 -0.5307328  0.       ]
 [ 0.5307328 -0.8475392  0.       ]
 [ 0.         0.         1.       ]]
R_opt:
[[-8.47539198e-01 -5.30732803e-01 -2.95434260e-16]
 [ 5.30732803e-01 -8.47539198e-01  2.92859649e-16]
 [ 0.00000000e+00 -2.77555756e-16  1.00000000e+00]]
t:
[ 5.99726796  1.50078468 -3.34633977]
t_opt:
[ 5.99726796  1.50078468 -3.34633977]
l2_t: 2.7012892057857038e-15
l2_R: 8.028174304721057e-16
</code></pre>
<p>Both the rotation and the translation are recovered to within floating-point precision (the residuals <code>l2_t</code> and <code>l2_R</code> are on the order of <code>1e-15</code>).</p>
<p>For batch processing:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_numpy_batched</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>matmul(p<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>), q)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Validate right-handed coordinate system</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(np<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>), U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>)))
</span></span><span style="display:flex;"><span>    flip <span style="color:#f92672">=</span> d <span style="color:#f92672">&lt;</span> <span style="color:#ae81ff">0.0</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> flip<span style="color:#f92672">.</span>any():
</span></span><span style="display:flex;"><span>        Vt[flip, <span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>, :] <span style="color:#f92672">*=</span> <span style="color:#f92672">-</span><span style="color:#ae81ff">1.0</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal rotation</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>), U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>))  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>) <span style="color:#f92672">-</span> np<span style="color:#f92672">.</span>matmul(centroid_P, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>))<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>)  <span style="color:#75715e"># Bx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> np<span style="color:#f92672">.</span>sqrt(np<span style="color:#f92672">.</span>sum(np<span style="color:#f92672">.</span>square(np<span style="color:#f92672">.</span>matmul(p, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>)) <span style="color:#f92672">-</span> q), axis<span style="color:#f92672">=</span>(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">1</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><h3 id="pytorch">PyTorch</h3>


<p><details >
  <summary markdown="span">📝 Important Update (February 15, 2026)</summary>
  <strong>Bug Fix Notice:</strong> The PyTorch implementation has been updated to use the &ldquo;B-matrix&rdquo; broadcasting approach. This eliminates in-place tensor modification (which breaks <code>autograd</code>) and data-dependent control flow (which breaks <code>torch.compile</code> and <code>torch.vmap</code>).
</details></p>

<p>The PyTorch implementation now uses broadcasting to ensure differentiability:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> torch
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_torch</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>mean(P, dim<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>mean(Q, dim<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>matmul(p<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>), q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate determinant</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>det(torch<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>), U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>)))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build diagonal B tensor without in-place mutation</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># We use stack to preserve gradients and graph connections</span>
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>stack([torch<span style="color:#f92672">.</span>tensor(<span style="color:#ae81ff">1.0</span>, device<span style="color:#f92672">=</span>d<span style="color:#f92672">.</span>device, dtype<span style="color:#f92672">=</span>d<span style="color:#f92672">.</span>dtype),
</span></span><span style="display:flex;"><span>                          torch<span style="color:#f92672">.</span>tensor(<span style="color:#ae81ff">1.0</span>, device<span style="color:#f92672">=</span>d<span style="color:#f92672">.</span>device, dtype<span style="color:#f92672">=</span>d<span style="color:#f92672">.</span>dtype),
</span></span><span style="display:flex;"><span>                          torch<span style="color:#f92672">.</span>sign(d)])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of Vt.T via broadcasting, then multiply by U^T</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Vt.T: (3, 3). B_diag: (3) -&gt; B_diag[None, :]: (1, 3)</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>) <span style="color:#f92672">*</span> B_diag[<span style="color:#66d9ef">None</span>, :], U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q <span style="color:#f92672">-</span> centroid_P <span style="color:#f92672">@</span> R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>sqrt(torch<span style="color:#f92672">.</span>sum(torch<span style="color:#f92672">.</span>square(torch<span style="color:#f92672">.</span>matmul(p, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">1</span>)) <span style="color:#f92672">-</span> q)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">0</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><p>And our batched version:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_torch_batched</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD, in a batched manner.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>mean(P, dim<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdim<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>mean(Q, dim<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdim<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>matmul(p<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>), q)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate batched determinant</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>det(torch<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>), U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)))  <span style="color:#75715e"># B</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build batched B_diag without in-place mutation or control flow</span>
</span></span><span style="display:flex;"><span>    ones <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>ones_like(d)
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>stack([ones, ones, torch<span style="color:#f92672">.</span>sign(d)], dim<span style="color:#f92672">=-</span><span style="color:#ae81ff">1</span>) <span style="color:#75715e"># Bx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of Vt.T and multiply</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Vt.T: (B, 3, 3). B_diag: (B, 3). B_diag[:, None, :]: (B, 1, 3).</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>) <span style="color:#f92672">*</span> B_diag[:, <span style="color:#66d9ef">None</span>, :], U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>) <span style="color:#f92672">-</span> torch<span style="color:#f92672">.</span>matmul(centroid_P, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>))<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>)  <span style="color:#75715e"># Bx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> torch<span style="color:#f92672">.</span>sqrt(torch<span style="color:#f92672">.</span>sum(torch<span style="color:#f92672">.</span>square(torch<span style="color:#f92672">.</span>matmul(p, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)) <span style="color:#f92672">-</span> q), dim<span style="color:#f92672">=</span>(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">1</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><h3 id="tensorflow">TensorFlow</h3>
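<p>The SVD in TensorFlow (<code>tf.linalg.svd</code>) returns its results in the order <code>S</code>, <code>U</code>, <code>V</code>, and it returns <code>V</code> itself, not its transpose. Because tensors are immutable and the function may be compiled (e.g., via <code>@tf.function</code>), we avoid explicit conditional branching by constructing the diagonal of a correction matrix $B$ and broadcasting it against the columns of <code>V</code>.</p>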
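<p>To see why the broadcast can stand in for the branch, here is a minimal, framework-agnostic sketch written in NumPy rather than TensorFlow (so the SVD returns <code>U, S, Vt</code>, and <code>V</code> is <code>Vt.T</code>). The helper names <code>rotation_branchless</code> and <code>rotation_branching</code> are hypothetical and exist only for this comparison, and the sketch assumes 3D points.</p>

```python
import numpy as np


def rotation_branchless(V, U):
    """Hypothetical helper: R = V diag(1, 1, sign(det(V U^T))) U^T, no if statement."""
    d = np.linalg.det(V @ U.T)
    B_diag = np.array([1.0, 1.0, np.sign(d)])
    # Broadcasting (3, 3) * (1, 3) scales column j of V by B_diag[j],
    # which is the same as the matrix product V @ np.diag(B_diag).
    return (V * B_diag[None, :]) @ U.T


def rotation_branching(V, U):
    """Hypothetical reference version with an explicit conditional."""
    V = V.copy()
    if np.linalg.det(V @ U.T) < 0:
        V[:, -1] *= -1.0  # in-place flip of the last column
    return V @ U.T


rng = np.random.default_rng(0)
H = rng.normal(size=(3, 3))
# In 3D, det(-H) = -det(H), so one of these two inputs needs the flip.
for M in (H, -H):
    U, S, Vt = np.linalg.svd(M)
    R = rotation_branchless(Vt.T, U)
    print(np.linalg.det(R))
```

<p>Both helpers produce the same proper rotation. The branch-free form contains no data-dependent Python control flow and no in-place mutation, which is the property the compiled versions rely on.</p>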
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> tensorflow <span style="color:#66d9ef">as</span> tf
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_tensorflow</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    P <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>convert_to_tensor(P, dtype<span style="color:#f92672">=</span>tf<span style="color:#f92672">.</span>float32)
</span></span><span style="display:flex;"><span>    Q <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>convert_to_tensor(Q, dtype<span style="color:#f92672">=</span>tf<span style="color:#f92672">.</span>float32)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>reduce_mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>reduce_mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>matmul(tf<span style="color:#f92672">.</span>transpose(p), q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    S, U, V <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate determinant</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Note: V in TF SVD is V, not V^T.</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># R = V * U^T. Det(R) = Det(V * U^T)</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(tf<span style="color:#f92672">.</span>matmul(V, tf<span style="color:#f92672">.</span>transpose(U)))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build diagonal B tensor: [1.0, 1.0, sign(d)]</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># The stack is hard-coded for 3D points (D=3); other dimensions need D-1 ones.</span>
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>stack([<span style="color:#ae81ff">1.0</span>, <span style="color:#ae81ff">1.0</span>, tf<span style="color:#f92672">.</span>sign(d)])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of V via broadcasting (V * B_diag), then multiply by U^T</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># V is DxD, B_diag is D. V * B_diag[None, :] multiplies each column j by B_diag[j]</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>matmul(V <span style="color:#f92672">*</span> B_diag[<span style="color:#66d9ef">None</span>, :], tf<span style="color:#f92672">.</span>transpose(U))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q <span style="color:#f92672">-</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>matvec(R, centroid_P)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>sqrt(tf<span style="color:#f92672">.</span>reduce_sum(tf<span style="color:#f92672">.</span>square(tf<span style="color:#f92672">.</span>matmul(p, tf<span style="color:#f92672">.</span>transpose(R)) <span style="color:#f92672">-</span> q)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">0</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><p>And a batched version:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_tensorflow_batched</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD, in a batched manner.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A BxNx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    P <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>convert_to_tensor(P, dtype<span style="color:#f92672">=</span>tf<span style="color:#f92672">.</span>float32)
</span></span><span style="display:flex;"><span>    Q <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>convert_to_tensor(Q, dtype<span style="color:#f92672">=</span>tf<span style="color:#f92672">.</span>float32)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>reduce_mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>reduce_mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>matmul(tf<span style="color:#f92672">.</span>transpose(p, perm<span style="color:#f92672">=</span>[<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>]), q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    S, U, V <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate batched determinant</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(tf<span style="color:#f92672">.</span>matmul(V, tf<span style="color:#f92672">.</span>transpose(U, perm<span style="color:#f92672">=</span>[<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>])))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build batched B_diag: shape (B, 3)</span>
</span></span><span style="display:flex;"><span>    ones <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>ones_like(d)
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>stack([ones, ones, tf<span style="color:#f92672">.</span>sign(d)], axis<span style="color:#f92672">=-</span><span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of V (indexing with None adds the middle dimension for broadcasting)</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># V: (B, 3, 3), B_diag: (B, 3) -&gt; B_diag[:, None, :]: (B, 1, 3)</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>matmul(V <span style="color:#f92672">*</span> B_diag[:, <span style="color:#66d9ef">None</span>, :], tf<span style="color:#f92672">.</span>transpose(U, perm<span style="color:#f92672">=</span>[<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>]))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>squeeze(centroid_Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>) <span style="color:#f92672">-</span> tf<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>matvec(R, tf<span style="color:#f92672">.</span>squeeze(centroid_P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>))  <span style="color:#75715e"># Bx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> tf<span style="color:#f92672">.</span>sqrt(tf<span style="color:#f92672">.</span>reduce_sum(tf<span style="color:#f92672">.</span>square(tf<span style="color:#f92672">.</span>matmul(p, tf<span style="color:#f92672">.</span>transpose(R, perm<span style="color:#f92672">=</span>[<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>])) <span style="color:#f92672">-</span> q), axis<span style="color:#f92672">=</span>(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">1</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><h3 id="jax">JAX</h3>
<p>The JAX implementation closely mirrors NumPy, replacing <code>np</code> with <code>jnp</code>. However, JAX arrays are immutable, and a Python <code>if</code> on a traced value fails under <code>jax.jit</code>, so we again avoid <code>if</code> statements and in-place assignment by using the broadcasting B-matrix approach.</p>
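<p>As a small illustration of those two constraints, the sketch below compiles a branch-free rotation step with <code>jax.jit</code>. It builds the correction matrix with the functional update <code>.at[...].set(...)</code>, which is an alternative to the broadcasting used in this post, shown only for comparison. The helper name <code>proper_rotation</code> and the example matrix <code>H</code> are assumptions made for this sketch.</p>

```python
import jax
import jax.numpy as jnp


@jax.jit
def proper_rotation(H):
    """Hypothetical helper: branch-free rotation from a 3x3 covariance matrix."""
    U, S, Vt = jnp.linalg.svd(H)
    d = jnp.linalg.det(Vt.T @ U.T)
    # Functional update: returns a new array instead of mutating jnp.eye(3).
    B = jnp.eye(3).at[2, 2].set(jnp.sign(d))
    return Vt.T @ B @ U.T


# Example covariance with a negative determinant, so the correction is active.
H = jnp.array([[2.0, 0.0, 1.0],
               [0.0, 3.0, 0.0],
               [1.0, 0.0, -1.0]])
R = proper_rotation(H)
print(jnp.linalg.det(R))
```

<p>Because <code>d</code> is a traced value inside the compiled function, a Python <code>if</code> on it would raise an error at trace time, whereas <code>jnp.sign(d)</code> stays inside the computation graph.</p>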
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> jax.numpy <span style="color:#66d9ef">as</span> jnp
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_jax</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A Nx3 matrix of points
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrix, the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vector, and the RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    P <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>array(P)
</span></span><span style="display:flex;"><span>    Q <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>array(Q)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>dot(p<span style="color:#f92672">.</span>T, q)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate determinant</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(jnp<span style="color:#f92672">.</span>dot(Vt<span style="color:#f92672">.</span>T, U<span style="color:#f92672">.</span>T))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build diagonal B array</span>
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>array([<span style="color:#ae81ff">1.0</span>, <span style="color:#ae81ff">1.0</span>, jnp<span style="color:#f92672">.</span>sign(d)])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of Vt.T and multiply by U.T</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Vt.T is V.</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>dot(Vt<span style="color:#f92672">.</span>T <span style="color:#f92672">*</span> B_diag[<span style="color:#66d9ef">None</span>, :], U<span style="color:#f92672">.</span>T)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q <span style="color:#f92672">-</span> jnp<span style="color:#f92672">.</span>dot(R, centroid_P)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>sqrt(jnp<span style="color:#f92672">.</span>sum(jnp<span style="color:#f92672">.</span>square(jnp<span style="color:#f92672">.</span>dot(p, R<span style="color:#f92672">.</span>T) <span style="color:#f92672">-</span> q)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">0</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div><p>And the batched version:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">kabsch_jax_batched</span>(P, Q):
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Computes the optimal rotation and translation to align two sets of points (P -&gt; Q),
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    and their RMSD.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param P: A BxNx3 array: B point sets of N 3D points each
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :param Q: A BxNx3 array with the same shape as P
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    :return: A tuple containing the optimal rotation matrices (Bx3x3), the optimal
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">             translation vectors (Bx3), and the per-pair RMSDs (B).
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    P <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>array(P)
</span></span><span style="display:flex;"><span>    Q <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>array(Q)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> P<span style="color:#f92672">.</span>shape <span style="color:#f92672">==</span> Q<span style="color:#f92672">.</span>shape, <span style="color:#e6db74">&#34;Matrix dimensions must match&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute centroids</span>
</span></span><span style="display:flex;"><span>    centroid_P <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>mean(P, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>    centroid_Q <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>mean(Q, axis<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>, keepdims<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)  <span style="color:#75715e"># Bx1x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Center the points</span>
</span></span><span style="display:flex;"><span>    p <span style="color:#f92672">=</span> P <span style="color:#f92672">-</span> centroid_P  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>    q <span style="color:#f92672">=</span> Q <span style="color:#f92672">-</span> centroid_Q  <span style="color:#75715e"># BxNx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Compute the covariance matrix</span>
</span></span><span style="display:flex;"><span>    H <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>matmul(p<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>), q)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># SVD</span>
</span></span><span style="display:flex;"><span>    U, S, Vt <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>svd(H)  <span style="color:#75715e"># Bx3x3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 1. Calculate batched determinant</span>
</span></span><span style="display:flex;"><span>    d <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>linalg<span style="color:#f92672">.</span>det(jnp<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>), U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>)))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 2. Build batched B_diag</span>
</span></span><span style="display:flex;"><span>    ones <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>ones_like(d)
</span></span><span style="display:flex;"><span>    B_diag <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>stack([ones, ones, jnp<span style="color:#f92672">.</span>sign(d)], axis<span style="color:#f92672">=-</span><span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># 3. Scale columns of Vt.T and multiply by U.T</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Vt.T: (B, 3, 3). B_diag: (B, 3).</span>
</span></span><span style="display:flex;"><span>    R <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>matmul(Vt<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>) <span style="color:#f92672">*</span> B_diag[:, <span style="color:#66d9ef">None</span>, :], U<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Optimal translation (depends on R, so computed after it)</span>
</span></span><span style="display:flex;"><span>    t <span style="color:#f92672">=</span> centroid_Q<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>) <span style="color:#f92672">-</span> jnp<span style="color:#f92672">.</span>matmul(centroid_P, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>))<span style="color:#f92672">.</span>squeeze(<span style="color:#ae81ff">1</span>)  <span style="color:#75715e"># Bx3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># RMSD</span>
</span></span><span style="display:flex;"><span>    rmsd <span style="color:#f92672">=</span> jnp<span style="color:#f92672">.</span>sqrt(jnp<span style="color:#f92672">.</span>sum(jnp<span style="color:#f92672">.</span>square(jnp<span style="color:#f92672">.</span>matmul(p, R<span style="color:#f92672">.</span>transpose(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span>)) <span style="color:#f92672">-</span> q), axis<span style="color:#f92672">=</span>(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>)) <span style="color:#f92672">/</span> P<span style="color:#f92672">.</span>shape[<span style="color:#ae81ff">1</span>])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> R, t, rmsd
</span></span></code></pre></div>














<figure class="post-figure center ">
    <img src="/img/scientific-computing/kabsch-animated-protein-conformational-alignment-analysis.webp"
         alt="Animation of a protein structure being aligned using the Kabsch algorithm"
         title="Animation of a protein structure being aligned using the Kabsch algorithm"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">Real-world application: Aligning protein conformations to analyze structural changes.</figcaption>
    
</figure>

<h2 id="extensions">Extensions</h2>
<p>The Kabsch algorithm has several important extensions beyond the formulation covered here:</p>
<ul>
<li><strong>Quaternion Form</strong>: The algorithm can be reformulated using quaternions for better numerical stability, particularly useful in applications requiring high precision.</li>
<li><strong>Iterative Versions</strong>: Variants that are more robust to noise and scale better to large point sets. They can also be advantageous on systems with limited computational resources.</li>
<li><strong>Weighted Kabsch</strong>: Extensions that incorporate point weights (e.g., atomic masses in molecular dynamics). While SciPy provides a <a href="https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.transform.Rotation.align_vectors.html#scipy.spatial.transform.Rotation.align_vectors">weighted version</a>, it lacks batch processing capabilities.</li>
<li><strong>The Umeyama Algorithm</strong>: If your point sets are rotated, translated, and scaled differently, the Umeyama algorithm is the direct extension of Kabsch. It solves the same optimization problem but introduces a scaling factor $c$, finding the optimal alignment for $Q \approx c R P + t$.</li>
</ul>
<p>Several of these extensions are implemented in the <a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> library, which provides differentiable Kabsch, Horn, and Umeyama alignment across NumPy, PyTorch, JAX, TensorFlow, and MLX.</p>
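<p>As a rough illustration of the Umeyama bullet above, the sketch below extends the same SVD recipe used in the listings earlier ($H = p^\top q$, $R = V B U^\top$) with the scale factor $c = \operatorname{tr}(S B) / \sum_i \lVert p_i \rVert^2$. It is written in plain NumPy for brevity and handles a single unbatched pair. The function name <code>umeyama</code> and the synthetic data are illustrative assumptions, not the Kabsch-Horn Cookbook API.</p>

```python
import numpy as np


def umeyama(P, Q):
    """Estimate scale c, rotation R, translation t such that Q ~ c * R @ P_i + t.

    P, Q: Nx3 arrays of corresponding points (hypothetical helper, unbatched).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)

    # Center both point sets
    centroid_P = P.mean(axis=0)
    centroid_Q = Q.mean(axis=0)
    p = P - centroid_P
    q = Q - centroid_Q

    # Cross-covariance and its SVD, same convention as the Kabsch listings
    H = p.T @ q
    U, S, Vt = np.linalg.svd(H)

    # Reflection guard: flip the last axis if V U^T would be improper
    B_diag = np.ones(P.shape[1])
    B_diag[-1] = np.sign(np.linalg.det(Vt.T @ U.T))
    R = (Vt.T * B_diag[None, :]) @ U.T

    # Optimal scale, then translation (which depends on both c and R)
    c = np.sum(S * B_diag) / np.sum(p ** 2)
    t = centroid_Q - c * (R @ centroid_P)
    return c, R, t


# Synthetic check: scale, rotate about z, and shift a random cloud,
# then recover the transform from the two point sets alone.
rng = np.random.default_rng(0)
P = rng.normal(size=(20, 3))
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta), np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
c_true = 2.5
t_true = np.array([1.0, -2.0, 0.5])
Q = c_true * P @ R_true.T + t_true

c, R, t = umeyama(P, Q)
```

<p>Setting $c = 1$ instead of estimating it reduces this to the plain Kabsch solution shown earlier.</p>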
<h2 id="further-reading">Further Reading</h2>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Kabsch_algorithm">Wikipedia, Kabsch Algorithm</a></li>
<li><a href="https://zalo.github.io/blog/kabsch/">Zalo on Kabsch</a>: An interactive shape matching demo.</li>
</ul>
<h3 id="original-papers">Original Papers</h3>
<ul>
<li><strong>[Kabsch 1976]</strong> Kabsch, W. (1976). &ldquo;A solution for the best rotation to relate two sets of vectors.&rdquo; <em>Acta Crystallographica Section A</em>, 32(5), 922-923. <a href="https://doi.org/10.1107/S0567739476001873">DOI: 10.1107/S0567739476001873</a>
<em>The original paper: a closed-form, non-iterative optimal-rotation solution derived via Lagrange multipliers and eigendecomposition of $\tilde{R}R$ (the SVD reformulation came later; see Arun et al. 1987).</em> See also: <a href="/notes/biology/computational-biology/kabsch-algorithm/">paper notes</a>.</li>
<li><strong>[Kabsch 1978]</strong> Kabsch, W. (1978). &ldquo;A discussion of the solution for the best rotation to relate two sets of vectors.&rdquo; <em>Acta Crystallographica Section A</em>, 34(5), 827-828. <a href="https://doi.org/10.1107/S0567739478001680">DOI: 10.1107/S0567739478001680</a>
<em>The follow-up paper correcting for improper rotations (reflections).</em></li>
<li><strong>[Arun et al. 1987]</strong> Arun, K. S., Huang, T. S., &amp; Blostein, S. D. (1987). &ldquo;Least-Squares Fitting of Two 3-D Point Sets.&rdquo; <em>IEEE Transactions on Pattern Analysis and Machine Intelligence</em>, PAMI-9(5), 698-700. <a href="https://doi.org/10.1109/TPAMI.1987.4767965">DOI: 10.1109/TPAMI.1987.4767965</a>
<em>The first SVD-based formulation for 3D point set alignment.</em> See also: <a href="/notes/biology/computational-biology/arun-svd-point-fitting/">paper notes</a>.</li>
<li><strong>[Horn et al. 1988]</strong> Horn, B. K. P., Hilden, H. M., &amp; Negahdaripour, S. (1988). &ldquo;Closed-form solution of absolute orientation using orthonormal matrices.&rdquo; <em>Journal of the Optical Society of America A</em>, 5(7), 1127-1135. <a href="https://doi.org/10.1364/JOSAA.5.001127">DOI: 10.1364/JOSAA.5.001127</a>
<em>The matrix square root (polar decomposition) approach to the same problem.</em> See also: <a href="/notes/biology/computational-biology/horn-orthonormal-matrices/">paper notes</a>.</li>
<li><strong>[Horn 1987]</strong> Horn, B. K. P. (1987). &ldquo;Closed-form solution of absolute orientation using unit quaternions.&rdquo; <em>Journal of the Optical Society of America A</em>, 4(4), 629-642. <a href="https://doi.org/10.1364/JOSAA.4.000629">DOI: 10.1364/JOSAA.4.000629</a>
<em>An alternative quaternion-based closed-form solution that also handles scale.</em> See also: <a href="/notes/biology/computational-biology/horn-absolute-orientation/">paper notes</a>.</li>
<li><strong>[Umeyama 1991]</strong> Umeyama, S. (1991). &ldquo;Least-squares estimation of transformation parameters between two point patterns.&rdquo; <em>IEEE Transactions on Pattern Analysis and Machine Intelligence</em>, 13(4), 376-380. <a href="https://doi.org/10.1109/34.88573">DOI: 10.1109/34.88573</a>
<em>The extension of the algorithm to include optimal scaling in addition to rotation and translation.</em> See also: <a href="/notes/biology/computational-biology/umeyama-similarity-transformation/">paper notes</a>.</li>
</ul>
]]></content:encoded></item><item><title>Automated Adatom Diffusion Workflow</title><link>https://hunterheidenreich.com/projects/lammps-adatom-diffusion/</link><pubDate>Thu, 21 Sep 2023 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/lammps-adatom-diffusion/</guid><description>Python-wrapped reference implementation for surface diffusion simulations using LAMMPS and EAM potentials, with automated analysis pipelines.</description><content:encoded><![CDATA[<h2 id="overview">Overview</h2>
<p>This project provides an &ldquo;input-to-analysis&rdquo; workflow for simulating adatom diffusion on FCC metal surfaces. It demonstrates how to set up surface diffusion simulations in LAMMPS, manage EAM potentials, and parse trajectory data into energy and trajectory plots using Python. The LAMMPS input scripts are adapted from Eric N. Hahn&rsquo;s adatom tutorial; the Python analysis layer (<code>plot_energy.py</code>, <code>plot_xy.py</code>) is my own, written while in CSElab (Harvard, 2023).</p>
<p>The workflow covers two material systems, copper (Cu) and platinum (Pt), providing comparative datasets that highlight how atomic mass and bonding strength affect surface dynamics.</p>
<h2 id="features">Features</h2>
<h3 id="simulation-architecture">Simulation Architecture</h3>
<p>The project separates simulation logic from analysis code:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Directory</th>
          <th style="text-align: left">Description</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong><code>/adatom_cu</code></strong></td>
          <td style="text-align: left">Copper adatom diffusion on Cu(100)</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong><code>/adatom_pt</code></strong></td>
          <td style="text-align: left">Platinum adatom diffusion on Pt(100)</td>
      </tr>
  </tbody>
</table>
<p>Each directory contains:</p>
<ul>
<li><strong>LAMMPS input scripts</strong> (<code>.in</code> files) defining the physics</li>
<li><strong>EAM potential files</strong> for metallic bonding (the Cu potential is committed; the Pt potential must be downloaded separately from the NIST Interatomic Potentials Repository, so the Pt system does not run as-checked-out)</li>
<li><strong>Python analysis scripts</strong> for trajectory and energy parsing</li>
</ul>
<h3 id="key-features">Key Features</h3>
<ul>
<li><strong>EAM Potentials</strong>: Uses Embedded Atom Method (EAM) alloy potentials, whose many-body embedding term models metallic bonding and surface energies more accurately than simple pairwise Lennard-Jones potentials</li>
<li><strong>Automated Analysis</strong>: Python pipeline (<code>plot_energy.py</code>, <code>plot_xy.py</code>) that parses raw thermodynamic logs and trajectory dumps to generate &ldquo;health check&rdquo; dashboards</li>
<li><strong>Workflow Orchestration</strong>: Demonstrates the &ldquo;Input → Simulation → Analysis&rdquo; loop, automating the transition from raw <code>.lammpstrj</code> files to publication-ready plots</li>
<li><strong>Kokkos Support</strong>: Includes Kokkos execution commands for GPU/multi-threaded runs</li>
</ul>
<h3 id="simulation-parameters">Simulation Parameters</h3>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Parameter</th>
          <th style="text-align: left">Value</th>
          <th style="text-align: left">Purpose</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Ensemble</strong></td>
          <td style="text-align: left">NVT → NVE</td>
          <td style="text-align: left">Equilibration followed by energy conservation checks</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Potential</strong></td>
          <td style="text-align: left">EAM/alloy</td>
          <td style="text-align: left">Accurate metallic bonding for surface dynamics</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Minimization</strong></td>
          <td style="text-align: left">CG (1.0e-4)</td>
          <td style="text-align: left">Remove steric overlaps before dynamics</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Timestep</strong></td>
          <td style="text-align: left">5 fs (metal units)</td>
          <td style="text-align: left">EAM-appropriate integration step</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Trajectory dump</strong></td>
          <td style="text-align: left">every 5 steps (25 fs)</td>
          <td style="text-align: left">Tracks adatom site-to-site hops</td>
      </tr>
  </tbody>
</table>
<h2 id="usage">Usage</h2>
<p>The repository includes LAMMPS input scripts and Python analysis scripts. Run the LAMMPS scripts to generate trajectory data, then use the Python scripts to visualize the results.</p>
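<p>To give a feel for the analysis step, here is a minimal sketch of parsing thermodynamic output from a LAMMPS log. It assumes the usual log layout, where each thermo block opens with a header row beginning with <code>Step</code> and ends at the <code>Loop time of</code> line. The helper <code>parse_thermo</code>, the column names, and the sample values are hypothetical; the actual columns depend on the <code>thermo_style</code> in the input script, and this is not the repository's <code>plot_energy.py</code>.</p>

```python
def parse_thermo(log_text):
    """Collect thermo rows from LAMMPS log text into {column name: [floats]}."""
    columns, data, in_block = [], {}, False
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Step":        # header row opens a thermo block
            columns = fields
            for name in columns:
                data.setdefault(name, [])
            in_block = True
        elif fields[0] == "Loop":      # "Loop time of ..." closes the block
            in_block = False
        elif in_block and len(fields) == len(columns):
            try:
                values = [float(x) for x in fields]
            except ValueError:
                continue               # skip warnings interleaved with the rows
            for name, value in zip(columns, values):
                data[name].append(value)
    return data


# Made-up sample in the assumed layout (values are illustrative only)
sample_log = """\
Step Temp PotEng KinEng TotEng
0 300.0 -3540.0 38.7 -3501.3
100 295.2 -3539.1 38.1 -3501.0
Loop time of 1.23 on 1 procs for 100 steps with 1000 atoms
"""

thermo = parse_thermo(sample_log)

# A simple NVE "health check": spread of total energy over the run
drift = max(thermo["TotEng"]) - min(thermo["TotEng"])
```

<p>The resulting lists can be handed to any plotting library; a small total-energy drift during the NVE stage is the kind of signal an energy plot is meant to surface.</p>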
<h2 id="results">Results</h2>
<p>This workflow is documented in detail in a companion blog post:</p>
<ul>
<li><a href="/posts/adatom-cu-diffusion/">LAMMPS Tutorial: Copper and Platinum Adatom Diffusion</a> - Complete setup walkthrough with line-by-line script explanation and comparison of how heavier atoms behave differently on surfaces</li>
</ul>
]]></content:encoded></item><item><title>IQCRNN: Certified Stability for Neural Networks</title><link>https://hunterheidenreich.com/projects/iqcrnn-pytorch/</link><pubDate>Wed, 11 May 2022 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/iqcrnn-pytorch/</guid><description>PyTorch IQCRNN enforcing stability guarantees on RNNs via Integral Quadratic Constraints and semidefinite programming.</description><content:encoded><![CDATA[<p>This project is a PyTorch re-implementation of <strong>IQCRNN</strong>, a method that enforces strict stability guarantees on Recurrent Neural Networks used in control systems.</p>
<h2 id="overview">Overview</h2>
<p>Standard Reinforcement Learning agents can behave unpredictably in unseen states. This approach forces the agent&rsquo;s weights to satisfy <strong>Integral Quadratic Constraints (IQC)</strong> via a projection step. Effectively, it solves a convex optimization problem (Semidefinite Program) inside the gradient descent loop to ensure the controller never violates Lyapunov stability criteria.</p>
<p>The method bridges classic <strong>Robust Control Theory</strong> (1990s) with <strong>Deep Reinforcement Learning</strong> (2020s), providing mathematical certificates of safety for neural network controllers.</p>
<h2 id="features">Features</h2>
<ul>
<li><strong>Hybrid Optimization:</strong> Interleaved standard Gradient Descent (PyTorch) with Convex Optimization (<code>cvxpy</code> + <code>MOSEK</code>) to project weights onto the &ldquo;safe&rdquo; manifold after each training step.</li>
<li><strong>Complex Constraints:</strong> Implemented the &ldquo;Tilde&rdquo; parametrization from the original paper to convexify the non-convex stability conditions of the RNN dynamics, transforming an intractable problem into a solvable Linear Matrix Inequality (LMI).</li>
<li><strong>Safety-Critical Domains:</strong> Applied the controller across six control systems (cartpole, inverted pendulum, nonlinear pendulum, pendubot, power grid, and vehicle dynamics), including unstable plants where &ldquo;crashing&rdquo; during training is unacceptable.</li>
</ul>
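<p>The interleaving described under Hybrid Optimization can be sketched with a deliberately tiny stand-in. Instead of an SDP over LMI constraints solved with <code>cvxpy</code> and <code>MOSEK</code>, the toy below treats each weight as the gain of an independent scalar recurrence and projects it onto the interval $[-\rho, \rho]$ with $\rho = 0.95$, which is the exact Euclidean projection onto that box. All names (<code>project_to_stable</code>, <code>train</code>, <code>targets</code>) are hypothetical; only the gradient-step-then-project structure mirrors the method.</p>

```python
def project_to_stable(weights, rho=0.95):
    """Euclidean projection onto the box [-rho, rho]^n.

    Stand-in for the SDP projection: a scalar recurrence x' = w * x is
    stable when abs(w) is below 1, so every gain is clipped to rho.
    """
    return [max(-rho, min(rho, w)) for w in weights]


def train(weights, grad_fn, lr=0.1, steps=100, rho=0.95):
    """Projected gradient descent: unconstrained step, then projection."""
    for _ in range(steps):
        grads = grad_fn(weights)
        weights = [w - lr * g for w, g in zip(weights, grads)]  # gradient step
        weights = project_to_stable(weights, rho)               # projection step
    return weights


# Toy objective: pull each weight toward a target. Two targets lie outside
# the stable set, so the unconstrained optimum would be unstable.
targets = [0.5, 1.5, -2.0]


def grad_fn(weights):
    """Gradient of sum((w - target)^2) with respect to each weight."""
    return [2.0 * (w - t) for w, t in zip(weights, targets)]


final = train([0.0, 0.0, 0.0], grad_fn)
```

<p>The first weight converges to its target, while the other two end up pinned at the boundary of the stable set. In the real method the projection is far more expensive, since each call solves a semidefinite program.</p>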
<h2 id="usage">Usage</h2>
<p>The repository includes training scripts for the inverted pendulum and power grid environments, demonstrating the stability guarantees in practice.</p>
<h2 id="results">Results</h2>
<p>This project was a deep dive into the tension between <strong>Safety</strong> and <strong>Speed</strong>.</p>
<ul>
<li><strong>The Bottleneck:</strong> Solving an SDP at every few steps of training is computationally expensive (interior-point SDP solvers scale steeply, roughly $O(n^6)$ in the matrix dimension). While it provided mathematical certificates of safety, it highlighted why these methods haven&rsquo;t yet overtaken standard PPO/SAC in production: the &ldquo;safety tax&rdquo; on training time is steep.</li>
<li><strong>The Lesson:</strong> It taught me that &ldquo;theoretical guarantees&rdquo; often come with &ldquo;engineering fine print.&rdquo; If I were to redo this today, I would look into <strong>differentiable convex optimization layers</strong> (like <code>cvxpylayers</code>) to make the projection end-to-end differentiable.</li>
<li><strong>The &ldquo;Rough Edges&rdquo;:</strong> The codebase has artifacts of its research origins (e.g., the <code>reqs.txt</code> dependency dump). Reading a dense control theory paper (Gu et al., 2021) and implementing the math correctly were the primary focus.</li>
</ul>
<h2 id="citation">Citation</h2>
<p>Credit to the original authors:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@misc</span>{gu2021recurrentneuralnetworkcontrollers,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Recurrent Neural Network Controllers Synthesis with Stability Guarantees for Partially Observed Systems}</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Fangda Gu and He Yin and Laurent El Ghaoui and Murat Arcak and Peter Seiler and Ming Jin}</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{2021}</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">eprint</span>=<span style="color:#e6db74">{2109.03861}</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">archivePrefix</span>=<span style="color:#e6db74">{arXiv}</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">primaryClass</span>=<span style="color:#e6db74">{eess.SY}</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">url</span>=<span style="color:#e6db74">{https://arxiv.org/abs/2109.03861}</span>,
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h2 id="related-work">Related Work</h2>
<ul>
<li><a href="/research/deconstructing-recurrence-attention-gating/">Deconstructing Recurrence and Attention Gating</a>: research on recurrent network architectures, providing context for why stability guarantees on RNNs matter</li>
</ul>
]]></content:encoded></item><item><title>Cartesian Genetic Programming in Julia</title><link>https://hunterheidenreich.com/projects/cgp-julia/</link><pubDate>Sun, 18 Nov 2018 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/cgp-julia/</guid><description>A fork of Dennis Wilson's CGP.jl applying Cartesian Genetic Programming to Atari RL tasks; my work was the Atari experiments, not the core framework.</description><content:encoded><![CDATA[<p>Written in 2018, this was an exploration into <strong>Evolutionary Algorithms</strong> applied to Reinforcement Learning tasks (specifically Atari games). It is a fork of <a href="https://github.com/d9w/CGP.jl">d9w/CGP.jl</a> (Dennis Wilson, Apache 2.0); my work centered on the Atari reinforcement-learning experiments rather than the core CGP framework.</p>
<h2 id="overview">Overview</h2>
<p>Standard Cartesian Genetic Programming (CGP) relies heavily on mutation. The upstream library hybridizes CGP with <strong>NEAT (NeuroEvolution of Augmenting Topologies)</strong> concepts to protect topological innovation through speciation.</p>
<p>My goal in forking it was to evolve graph-based programs that could learn Atari control policies using gradient-free optimization.</p>
<h2 id="features">Features</h2>
<p>The upstream framework provides the CGP machinery this project builds on:</p>
<ul>
<li><strong>Graph-based Crossover:</strong> Crossover operators such as <code>subgraph_crossover</code> and <code>aligned_node_crossover</code> that handle the destructive nature of mating graph structures.</li>
<li><strong>Speciation:</strong> A NEAT-inspired compatibility-distance metric (<code>cgpneat.jl</code>) to maintain population diversity and prevent premature convergence.</li>
<li><strong>Active Gene Tracking:</strong> Differentiates between &ldquo;active&rdquo; nodes (those contributing to output) and &ldquo;junk DNA,&rdquo; focusing mutation on phenotypic changes.</li>
</ul>
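<p>Active gene tracking is easiest to see on a small example. The sketch below is in Python and uses a simplified integer encoding in which every node is a <code>(function, input_a, input_b)</code> triple and indices below <code>n_inputs</code> refer to program inputs. This encoding and the name <code>active_nodes</code> are assumptions made for illustration, not CGP.jl's genome representation. Active nodes are found by walking backward from the outputs.</p>

```python
def active_nodes(n_inputs, nodes, outputs):
    """Return the sorted indices of nodes that feed at least one output."""
    active, stack = set(), list(outputs)
    while stack:
        idx = stack.pop()
        if idx in active or idx in range(n_inputs):
            continue                      # already visited, or a program input
        active.add(idx)
        _, a, b = nodes[idx - n_inputs]   # genes: (function, input_a, input_b)
        stack.extend((a, b))
    return sorted(active)


n_inputs = 2
nodes = [
    ("add", 0, 1),   # index 2
    ("mul", 2, 0),   # index 3
    ("sub", 1, 1),   # index 4, never reached from an output
    ("add", 4, 2),   # index 5, never reached from an output
]

active = active_nodes(n_inputs, nodes, outputs=[3])

# Everything else is "junk DNA": present in the genome, absent from the phenotype
junk = [i + n_inputs for i in range(len(nodes)) if i + n_inputs not in active]
```

<p>Mutations that only touch the junk indices leave the program's behavior unchanged, which is why restricting or biasing mutation toward the active set changes the phenotype more often.</p>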
<p>My own contribution was the <strong>Atari reinforcement-learning layer</strong> on top of this: experiment variants (<code>action_atari.jl</code>, <code>original_atari.jl</code>, <code>manual_atari.jl</code>, <code>play_atari.jl</code>, <code>param_sweep.jl</code>), custom fitness and scoring functions, early-stopping and completion-percentage logging, multithreading and <code>pmap</code> multiprocessing attempts (reverted to single-thread), and config tuning to match a reference paper&rsquo;s hyperparameters.</p>
<h2 id="usage">Usage</h2>
<p>The library provides a Julia API for defining CGP graphs, configuring evolutionary parameters, and running the evolutionary loop against custom environments.</p>
<h2 id="results">Results</h2>
<p>Looking back, this codebase captures a transitional moment where I was moving from scripting to library design.</p>
<ul>
<li><strong>The Ambition:</strong> Getting CGP graphs to learn Atari policies under the mixed-type regime (RGB-array inputs, scalar action outputs) was an ambitious undertaking for my software engineering skills at the time.</li>
<li><strong>The &ldquo;Legacy&rdquo; Code:</strong> The project relies on the now-deprecated Julia v0.6 and uses <code>eval(parse(...))</code> patterns for configuration (a significant performance anti-pattern in modern Julia).</li>
<li><strong>The Lesson:</strong> It taught me the difficulty of designing genetic operators that respect topological constraints, a lesson that informs my current understanding of optimization in structured spaces.</li>
</ul>
]]></content:encoded></item><item><title>FFTW Compiler in Haskell</title><link>https://hunterheidenreich.com/projects/fftw-compiler-haskell/</link><pubDate>Thu, 15 Mar 2018 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/fftw-compiler-haskell/</guid><description>Reverse-engineering the genfft logic to generate optimized C kernels for Fast Fourier Transforms using Haskell metaprogramming.</description><content:encoded><![CDATA[<p>Written during my sophomore year, this project was an attempt to look inside the &ldquo;black box&rdquo; of one of the fastest Fourier transform libraries: <strong>FFTW</strong>.</p>
<h2 id="overview">Overview</h2>
<p>I sought to replicate the logic of FFTW&rsquo;s <code>genfft</code>: a metaprogram that generates straight-line, highly optimized C code. The goal was to understand how abstract algebra (group theory) could be translated into efficient machine code through symbolic manipulation.</p>
<h2 id="features">Features</h2>
<p>This was my first deep dive into <strong>functional metaprogramming</strong> and <strong>compiler theory</strong>:</p>
<ul>
<li><strong>Symbolic AST:</strong> Modeled mathematical operations as a Directed Acyclic Graph (DAG) in Haskell (<code>data Node</code>), separating the <em>definition</em> of the math from its <em>execution</em>.</li>
<li><strong>Algebraic Simplification:</strong> Implemented a symbolic optimization pass that pruned operations at compile-time (e.g., eliminating multiplications by $1$, $0$, or $-1$) before code generation.</li>
<li><strong>Monadic State Management:</strong> Used Haskell&rsquo;s <code>State</code> Monad to manage the graph construction and memoization, ensuring common subexpressions (like reusable cosine factors) were calculated only once.</li>
<li><strong>Code Generation:</strong> The system emitted unrolled, straight-line C code (e.g., <code>fftw4.c</code>), mimicking the &ldquo;codelets&rdquo; used by the actual FFTW library.</li>
</ul>
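<p>The pieces listed above fit together in a compact way. The sketch below is in Python, to match the other examples on this page, and is not the project's Haskell code: a hash-consing builder interns each expression once (the memoization that the <code>State</code> monad provided), applies the $0$, $1$, and $-1$ simplifications while building, and emits straight-line C for the nodes that are still reachable. The class and method names are hypothetical.</p>

```python
class Builder:
    """Expression DAG with hash-consing and build-time algebraic simplification."""

    def __init__(self):
        self.ids = {}      # structural key -> node id (memo table)
        self.nodes = []    # node id -> structural key

    def _intern(self, key):
        if key not in self.ids:
            self.ids[key] = len(self.nodes)
            self.nodes.append(key)
        return self.ids[key]

    def const(self, value):
        return self._intern(("const", float(value)))

    def var(self, name):
        return self._intern(("var", name))

    def neg(self, a):
        return self._intern(("neg", a))

    def add(self, a, b):
        if self.nodes[a] == ("const", 0.0):
            return b                              # 0 + b = b
        if self.nodes[b] == ("const", 0.0):
            return a                              # a + 0 = a
        lo, hi = sorted((a, b))                   # canonical operand order
        return self._intern(("add", lo, hi))

    def mul(self, a, b):
        for x, y in ((a, b), (b, a)):
            if self.nodes[x] == ("const", 0.0):
                return x                          # 0 * y = 0
            if self.nodes[x] == ("const", 1.0):
                return y                          # 1 * y = y
            if self.nodes[x] == ("const", -1.0):
                return self.neg(y)                # -1 * y = -y
        lo, hi = sorted((a, b))
        return self._intern(("mul", lo, hi))

    def emit_c(self, outputs):
        """Emit one C statement per node reachable from the outputs."""
        live, stack = set(), list(outputs)
        while stack:
            i = stack.pop()
            if i in live:
                continue
            live.add(i)
            if self.nodes[i][0] in ("neg", "add", "mul"):
                stack.extend(self.nodes[i][1:])
        lines = []
        for i in sorted(live):                    # children always have smaller ids
            op, *args = self.nodes[i]
            if op in ("const", "var"):
                rhs = str(args[0])
            elif op == "neg":
                rhs = f"-t{args[0]}"
            else:
                sym = "+" if op == "add" else "*"
                rhs = f"t{args[0]} {sym} t{args[1]}"
            lines.append(f"double t{i} = {rhs};")
        return lines


# Size-2 butterfly with trivial twiddle factors 1 and -1
b = Builder()
x0, x1 = b.var("x0"), b.var("x1")
y0 = b.add(x0, b.mul(b.const(1), x1))
y0_again = b.add(b.mul(x1, b.const(1)), x0)       # same expression, different order
y1 = b.add(x0, b.mul(b.const(-1), x1))
code = b.emit_c([y0, y1])
```

<p>Both multiplications disappear before any code is generated, and the two spellings of <code>y0</code> resolve to the same node, so the sum is computed once. A real generator would add scheduling on top of this, which the sketch leaves out.</p>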
<h2 id="usage">Usage</h2>
<p>The compiler is run via the command line, taking the desired FFT size as input and outputting the optimized C code.</p>
<h2 id="results">Results</h2>
<p>Looking back, this project represents a pivotal moment where I moved from &ldquo;writing programs&rdquo; to &ldquo;writing tools that write programs.&rdquo;</p>
<ul>
<li><strong>The &ldquo;Magic&rdquo;:</strong> It demystified high-performance computing. I learned that speed often comes as much from unrolling recursion and managing register pressure at compile time as from writing fast loops.</li>
<li><strong>The &ldquo;Rough Edges&rdquo;:</strong> The scheduler (coloring nodes Red/Blue for register allocation) was a heuristic approximation of the optimal Aho-Johnson-Ullman algorithm.</li>
<li><strong>Legacy:</strong> The core lesson that domain-specific compilers can outperform hand-tuned generic code remains relevant to my current work in optimizing scientific computing kernels.</li>
</ul>
]]></content:encoded></item><item><title>Term Schedule Optimizer</title><link>https://hunterheidenreich.com/projects/term-schedule-optimizer/</link><pubDate>Wed, 15 Feb 2017 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/projects/term-schedule-optimizer/</guid><description>A constraint satisfaction solver built to generate conflict-free university schedules from web-scraped course data.</description><content:encoded><![CDATA[<p>A Python-based automation tool I wrote as a freshman to solve the &ldquo;Term Master Schedule&rdquo; problem (and used throughout my undergrad from 2016 to 2020).</p>
<h2 id="overview">Overview</h2>
<p>Manually creating a university schedule involves solving a <strong>Constraint Satisfaction Problem (CSP)</strong> with multiple variables:</p>
<ul>
<li><strong>Hard Constraints:</strong> No time overlaps between classes.</li>
<li><strong>Soft Constraints:</strong> Preferences for &ldquo;no 8 AMs,&rdquo; specific lunch breaks, or maximizing free days.</li>
</ul>
<p>The naive approach (manually checking every possible combination) becomes intractable as the number of courses and sections grows.</p>
<h2 id="features">Features</h2>
<p>I built a script that:</p>
<ol>
<li><strong>Scraped Data:</strong> Parsed the Drexel WebTMS (Term Master Schedule) using <code>lxml</code> to build a localized dataset of course availability.</li>
<li><strong>Solved for X:</strong> Implemented a <strong>recursive backtracking algorithm</strong> to generate every valid schedule combination that satisfied user-defined constraints.</li>
</ol>
<h3 id="the-algorithm">The Algorithm</h3>
<p>The core of this project is a <code>recursive_generator</code> function that implements a CSP solver using backtracking. It performs a recursive depth-first search that:</p>
<ol>
<li>Takes a set of variables (courses).</li>
<li>Checks constraints (time overlaps, lunch hours, max classes per day).</li>
<li>Backtracks when a branch fails.</li>
</ol>
<p>It is the same backtracking pattern used in everything from Sudoku solvers to compiler register allocation.</p>
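<p>The three steps above can be sketched in a few lines. This is a minimal reconstruction of the pattern, not the original <code>recursive_generator</code>: the data layout (one list of sections per course, each section a label plus <code>(day, start, end)</code> meetings in minutes since midnight), the course names, and the function names are all assumptions, and only the hard no-overlap constraint is checked.</p>

```python
def clashes(sec_a, sec_b):
    """Two sections clash if any pair of meetings shares a day and overlaps in time."""
    return any(
        da == db and min(ea, eb) > max(sa, sb)
        for da, sa, ea in sec_a[1]
        for db, sb, eb in sec_b[1]
    )


def schedules(courses, sections, chosen=()):
    """Yield every conflict-free choice of one section per course (depth-first)."""
    if len(chosen) == len(courses):
        yield chosen                      # all courses assigned: a valid schedule
        return
    for section in sections[courses[len(chosen)]]:
        if any(clashes(section, prior) for prior in chosen):
            continue                      # constraint violated: abandon this branch
        yield from schedules(courses, sections, chosen + (section,))


# Hypothetical catalog: times are minutes since midnight
courses = ["CS101", "MATH101"]
sections = {
    "CS101": [("A", [("Mon", 540, 650)]), ("B", [("Tue", 540, 650)])],
    "MATH101": [("A", [("Mon", 600, 710)]), ("B", [("Wed", 600, 710)])],
}

found = list(schedules(courses, sections))
labels = [tuple(section[0] for section in schedule) for schedule in found]
```

<p>Of the four possible pairings, only the one that puts both classes on Monday morning is pruned. Soft constraints such as &ldquo;no 8 AMs&rdquo; could be applied as an extra filter on the yielded schedules or as further checks inside the loop.</p>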
<h2 id="usagegameplay">Usage</h2>
<p>The tool is run via the command line, taking a list of desired courses and outputting valid schedule combinations.</p>
<h2 id="results">Results</h2>
<p>This tool saved me (and several friends) hours of planning time each quarter. While the scraping logic was fragile (dependent on 2017 HTML structures), the core logic (a depth-first search through the state space of possible schedules) remains a fundamental algorithmic pattern.</p>
]]></content:encoded></item></channel></rss>