<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Computational Biology on Hunter Heidenreich | Senior AI Research Scientist</title><link>https://hunterheidenreich.com/notes/computational-biology/</link><description>Recent content in Computational Biology on Hunter Heidenreich | Senior AI Research Scientist</description><image><title>Hunter Heidenreich | Senior AI Research Scientist</title><url>https://hunterheidenreich.com/img/avatar.webp</url><link>https://hunterheidenreich.com/img/avatar.webp</link></image><generator>Hugo -- 0.163.3</generator><language>en-US</language><copyright>2026 Hunter Heidenreich</copyright><lastBuildDate>Sun, 28 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://hunterheidenreich.com/notes/computational-biology/index.xml" rel="self" type="application/rss+xml"/><item><title>Umeyama's Method: Corrected SVD for Point Alignment</title><link>https://hunterheidenreich.com/notes/computational-biology/umeyama-similarity-transformation/</link><pubDate>Mon, 16 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/computational-biology/umeyama-similarity-transformation/</guid><description>Umeyama (1991) fixes the SVD-based point set alignment method to always produce proper rotations, jointly solving for rotation, translation, and scale.</description><content:encoded><![CDATA[<h2 id="fixing-the-reflection-problem-in-svd-based-alignment">Fixing the Reflection Problem in SVD-Based Alignment</h2>
<p>This <strong>Method</strong> paper addresses a specific failure mode in prior SVD-based solutions to the point set registration problem. Both <a href="/notes/computational-biology/arun-svd-point-fitting/">Arun et al. (1987)</a> and <a href="/notes/computational-biology/horn-orthonormal-matrices/">Horn, Hilden, and Negahdaripour (1988)</a> presented SVD-based methods for finding the optimal rotation between two point patterns. (Note: this is a different paper from <a href="/notes/computational-biology/horn-absolute-orientation/">Horn&rsquo;s 1987 quaternion method</a>, which does not suffer from this issue.) These SVD-based methods can produce a reflection ($\det(R) = -1$) instead of a proper rotation when the data is severely corrupted. Umeyama provides a corrected formulation that always yields a proper rotation matrix.</p>
<h2 id="the-similarity-transformation-problem">The Similarity Transformation Problem</h2>
<p>Given two point sets $\{\mathbf{x}_i\}$ and $\{\mathbf{y}_i\}$ ($i = 1, \ldots, n$) in $m$-dimensional space, find the similarity transformation parameters (rotation $R$, translation $\mathbf{t}$, and scale $c$) minimizing the mean squared error:</p>
<p>$$
e^2(R, \mathbf{t}, c) = \frac{1}{n} \sum_{i=1}^{n} \lVert \mathbf{y}_i - (cR\mathbf{x}_i + \mathbf{t}) \rVert^2
$$</p>
<p>This generalizes the <a href="/notes/computational-biology/kabsch-algorithm/">Kabsch problem</a> (rotation only) and the <a href="/notes/computational-biology/horn-absolute-orientation/">absolute orientation problem</a> (rotation + translation + scale) to arbitrary dimensions $m$.</p>
<h2 id="the-core-lemma-corrected-svd-rotation">The Core Lemma: Corrected SVD Rotation</h2>
<p>The key contribution is a lemma for finding the rotation $R$ minimizing $\lVert A - RB \rVert^2$. Given the SVD of $AB^T = UDV^T$ (with $d_1 \geq d_2 \geq \cdots \geq d_m \geq 0$), define the correction matrix:</p>
<p>$$
S = \begin{cases} I &amp; \text{if } \det(AB^T) \geq 0 \\ \operatorname{diag}(1, 1, \ldots, 1, -1) &amp; \text{if } \det(AB^T) &lt; 0 \end{cases}
$$</p>
<p>The minimum value is:</p>
<p>$$
\min_{R} \lVert A - RB \rVert^2 = \lVert A \rVert^2 + \lVert B \rVert^2 - 2\operatorname{tr}(DS)
$$</p>
<p>When $\operatorname{rank}(AB^T) \geq m - 1$, the optimal rotation is uniquely determined as:</p>
<p>$$
R = USV^T
$$</p>
<p>The critical insight is that when $\det(AB^T) = 0$ (i.e., $\operatorname{rank}(AB^T) = m - 1$), the matrix $S$ must instead be chosen based on $\det(U)\det(V)$:</p>
<p>$$
S = \begin{cases} I &amp; \text{if } \det(U)\det(V) = 1 \\ \operatorname{diag}(1, 1, \ldots, 1, -1) &amp; \text{if } \det(U)\det(V) = -1 \end{cases}
$$</p>
<p>This handles the degenerate case where the sign of $\det(AB^T)$ is unreliable.</p>
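<p>As a small sanity check of the lemma in the full-rank case (an illustrative example, not taken from the paper), take $m = 2$ and suppose $AB^T = \operatorname{diag}(2, -1)$. One valid SVD has $U = I$, $D = \operatorname{diag}(2, 1)$, and $V = \operatorname{diag}(1, -1)$. Since $\det(AB^T) = -2 &lt; 0$, the lemma sets $S = \operatorname{diag}(1, -1)$:</p>
<p>$$
R = USV^T = \operatorname{diag}(1, -1)\operatorname{diag}(1, -1) = I, \qquad \operatorname{tr}(DS) = 2 - 1 = 1
$$</p>
<p>The uncorrected product $UV^T = \operatorname{diag}(1, -1)$ is a reflection. It attains the larger value $\operatorname{tr}(D) = 3$, but only by leaving the set of proper rotations.</p>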
<h2 id="complete-similarity-transformation-solution">Complete Similarity Transformation Solution</h2>
<p>Umeyama derives the full solution using centered coordinates and the covariance matrix $\Sigma_{xy} = \frac{1}{n} \sum_i (\mathbf{y}_i - \boldsymbol{\mu}_y)(\mathbf{x}_i - \boldsymbol{\mu}_x)^T$.</p>
<p>Given the SVD $\Sigma_{xy} = UDV^T$:</p>
<p><strong>Rotation</strong>:</p>
<p>$$
R = USV^T
$$</p>
<p><strong>Scale</strong>:</p>
<p>$$
c = \frac{1}{\sigma_x^2} \operatorname{tr}(DS)
$$</p>
<p><strong>Translation</strong>:</p>
<p>$$
\mathbf{t} = \boldsymbol{\mu}_y - cR\boldsymbol{\mu}_x
$$</p>
<p><strong>Minimum error</strong>:</p>
<p>$$
\varepsilon^2 = \sigma_y^2 - \frac{\operatorname{tr}(DS)^2}{\sigma_x^2}
$$</p>
<p>where $\sigma_x^2$ and $\sigma_y^2$ are the variances of the respective point sets around their centroids.</p>
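<p>The closed-form solution above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's reference code: the function name <code>umeyama_fit</code> and the one-point-per-row array layout are hypothetical choices, and the sign in $S$ is taken from $\det(U)\det(V)$, which agrees with the sign of $\det(\Sigma_{xy})$ whenever the covariance has full rank.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import numpy as np


def umeyama_fit(x, y):
    """Estimate (c, R, t) minimizing the mean of ||y_i - (c R x_i + t)||^2.

    x, y: arrays of shape (n, m), one point per row (assumed layout).
    """
    n, m = x.shape
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    xc, yc = x - mu_x, y - mu_y
    var_x = (xc ** 2).sum() / n            # sigma_x^2
    cov = yc.T @ xc / n                    # Sigma_xy, shape (m, m)
    U, d, Vt = np.linalg.svd(cov)          # singular values d in descending order
    # Correction matrix S as a vector: last entry is det(U) det(V), i.e. +1 or -1.
    s = np.ones(m)
    s[-1] = np.sign(np.linalg.det(U @ Vt))
    R = U @ np.diag(s) @ Vt                # R = U S V^T
    c = (d * s).sum() / var_x              # c = tr(D S) / sigma_x^2
    t = mu_y - c * (R @ mu_x)              # t = mu_y - c R mu_x
    return c, R, t
</code></pre></div>
<p>For noiseless inputs generated as $\mathbf{y}_i = cR\mathbf{x}_i + \mathbf{t}$ with a proper rotation, this sketch should return the generating parameters up to floating-point error.</p>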
<h2 id="why-prior-methods-fail">Why Prior Methods Fail</h2>
<p>The methods of Arun et al. and Horn et al. use $R = UV^T$ directly from the SVD. This works when $\det(UV^T) = 1$ (proper rotation). When $\det(UV^T) = -1$, these methods either produce a reflection or apply an ad hoc correction (flipping the sign of the last column of $U$). Umeyama shows that the correct fix depends on $\det(\Sigma_{xy})$:</p>
<ul>
<li>If $\det(\Sigma_{xy}) \geq 0$: set $S = I$, so $R = UV^T$</li>
<li>If $\det(\Sigma_{xy}) &lt; 0$: set $S = \operatorname{diag}(1, \ldots, 1, -1)$, flipping the last singular value&rsquo;s contribution</li>
</ul>
<p>This distinction matters because corrupted data can make $\det(UV^T) = -1$ even when the true transformation is a proper rotation. Simply flipping a column of $U$ does not always yield the correct least-squares solution.</p>
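<p>The failure mode is easy to reproduce. The sketch below is an illustrative construction, not an experiment from the paper: it uses a mirror image as an extreme stand-in for corrupted data, so no proper rotation fits exactly. The uncorrected $UV^T$ comes out as a reflection, while $USV^T$ is a proper rotation. The variable names are hypothetical, and the sign is read from $\det(UV^T)$, which matches the sign of $\det(\Sigma_{xy})$ here because the covariance has full rank.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))
y = x * np.array([1.0, 1.0, -1.0])     # mirror x through the xy-plane

xc, yc = x - x.mean(axis=0), y - y.mean(axis=0)
cov = yc.T @ xc / len(x)               # Sigma_xy
U, d, Vt = np.linalg.svd(cov)

R_naive = U @ Vt                       # uncorrected SVD solution
S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
R_fixed = U @ S @ Vt                   # corrected solution R = U S V^T

print(np.linalg.det(R_naive))          # approximately -1: a reflection
print(np.linalg.det(R_fixed))          # approximately +1: a proper rotation
</code></pre></div>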
<h2 id="generality">Generality</h2>
<p>The formulation works for any dimension $m$, covering both 2D and 3D registration problems. The proof uses Lagrange multipliers with explicit enforcement of both orthogonality ($R^T R = I$) and the proper rotation constraint ($\det(R) = 1$), which prior methods enforced only partially.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Umeyama, S. (1991). Least-squares estimation of transformation parameters between two point patterns. <em>IEEE Transactions on Pattern Analysis and Machine Intelligence</em>, 13(4), 376-380. <a href="https://doi.org/10.1109/34.88573">https://doi.org/10.1109/34.88573</a></p>
<p><strong>Publication</strong>: IEEE TPAMI, 1991</p>
<p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/posts/kabsch-algorithm/">Kabsch Algorithm: NumPy, PyTorch, TensorFlow, and JAX</a> (tutorial with implementations including the Kabsch-Umeyama scaling extension)</li>
<li><a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> (a differentiable, gradient-safe implementation of Kabsch, Horn, and Umeyama alignment across NumPy, PyTorch, JAX, TensorFlow, and MLX)</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{umeyama1991least,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Least-squares estimation of transformation parameters between two point patterns}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Umeyama, Shinji}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{IEEE Transactions on Pattern Analysis and Machine Intelligence}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{13}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{4}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{376--380}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1991}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{IEEE}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1109/34.88573}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Horn et al.: Absolute Orientation Using Orthonormal Matrices</title><link>https://hunterheidenreich.com/notes/computational-biology/horn-orthonormal-matrices/</link><pubDate>Mon, 16 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/computational-biology/horn-orthonormal-matrices/</guid><description>Horn, Hilden, and Negahdaripour (1988) solve absolute orientation using matrix square roots, providing an orthonormal matrix alternative to quaternions.</description><content:encoded><![CDATA[<h2 id="a-matrix-based-companion-to-the-quaternion-method">A Matrix-Based Companion to the Quaternion Method</h2>
<p>This <strong>Method</strong> paper presents a closed-form solution to the absolute orientation problem using $3 \times 3$ orthonormal matrices directly, complementing <a href="/notes/computational-biology/horn-absolute-orientation/">Horn&rsquo;s earlier quaternion-based solution</a> (1987). The authors note that while quaternions are more elegant, orthonormal matrices are more widely used in photogrammetry, graphics, and robotics. The solution relies on the polar decomposition of the cross-covariance matrix via its matrix square root.</p>
<p>The paper also compares two approaches: (1) directly finding the best-fit orthonormal matrix (the main result), and (2) finding an unconstrained best-fit linear transformation and then projecting it onto the nearest orthonormal matrix. These give different results, and only the first approach has the desired symmetry property.</p>
<h2 id="the-rotation-via-polar-decomposition">The Rotation via Polar Decomposition</h2>
<p>As in the quaternion paper, the problem reduces to finding the orthonormal matrix $R$ maximizing $\operatorname{Tr}(R^T M)$, where $M = \sum_{i=1}^{n} \mathbf{r}'_{r,i} (\mathbf{r}'_{l,i})^T$ is the cross-covariance matrix of the centered point sets.</p>
<p>The key insight is the polar decomposition: any matrix $M$ can be written as:</p>
<p>$$
M = U S
$$</p>
<p>where $U$ is orthonormal and $S = (M^T M)^{1/2}$ is positive semidefinite. When $M$ is nonsingular:</p>
<p>$$
U = M (M^T M)^{-1/2}
$$</p>
<p>The matrix square root $(M^T M)^{1/2}$ is computed via eigendecomposition. If $M^T M$ has eigenvalues $\lambda_1, \lambda_2, \lambda_3$ and eigenvectors $\hat{\mathbf{u}}_1, \hat{\mathbf{u}}_2, \hat{\mathbf{u}}_3$:</p>
<p>$$
(M^T M)^{1/2} = \sqrt{\lambda_1} \, \hat{\mathbf{u}}_1 \hat{\mathbf{u}}_1^T + \sqrt{\lambda_2} \, \hat{\mathbf{u}}_2 \hat{\mathbf{u}}_2^T + \sqrt{\lambda_3} \, \hat{\mathbf{u}}_3 \hat{\mathbf{u}}_3^T
$$</p>
<p>The sign of $\det(U)$ equals the sign of $\det(M)$, so $U$ is a proper rotation when $\det(M) &gt; 0$ and a reflection when $\det(M) &lt; 0$.</p>
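<p>A minimal NumPy sketch of the nonsingular case, assuming $M$ is a square array with nonzero determinant. The function name <code>polar_rotation</code> is a hypothetical choice for illustration; the inverse square root is assembled from the eigendecomposition of $M^T M$ as described above.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import numpy as np


def polar_rotation(M):
    """Orthonormal factor U = M (M^T M)^(-1/2) for a nonsingular square M."""
    lam, vecs = np.linalg.eigh(M.T @ M)          # eigenvalues, eigenvectors in columns
    # (M^T M)^(-1/2) = sum_k lam_k^(-1/2) u_k u_k^T
    inv_sqrt = (vecs / np.sqrt(lam)) @ vecs.T
    return M @ inv_sqrt
</code></pre></div>
<p>The returned matrix satisfies $U^T U = I$, and $U^T M$ recovers the symmetric factor $S$. Its determinant has the same sign as $\det(M)$, so a reflection is still possible.</p>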
<h2 id="handling-the-coplanar-case">Handling the Coplanar Case</h2>
<p>When one set of measurements is coplanar, $M$ is singular ($\operatorname{rank}(M) = 2$) and one eigenvalue of $M^T M$ is zero. The matrix square root still exists (positive semidefinite rather than positive definite), but $S$ is no longer invertible.</p>
<p>In this case, $U$ is determined only for two of its three columns. The third column (corresponding to the zero eigenvalue) is fixed by the orthonormality constraint, with a sign ambiguity resolved by requiring $\det(U) = +1$ (proper rotation).</p>
<h2 id="the-nearest-orthonormal-matrix-alternative-approach">The Nearest Orthonormal Matrix (Alternative Approach)</h2>
<p>The paper also derives a closed-form solution for finding the orthonormal matrix nearest to an arbitrary matrix $A$ (minimizing $\lVert A - R \rVert^2$). This uses the same polar decomposition machinery: if $A = U_A S_A$, then $U_A$ is the nearest orthonormal matrix.</p>
<p>This approach (find unconstrained best-fit transform, then project to nearest orthonormal matrix) was used by some earlier methods. Horn et al. show it gives a different result from the direct least-squares solution and lacks the symmetry property: the inverse transformation from right-to-left is generally not the exact inverse of the left-to-right solution.</p>
<h2 id="relationship-to-other-methods">Relationship to Other Methods</h2>
<table>
	<thead>
			<tr>
					<th>Method</th>
					<th>Rotation representation</th>
					<th>Core computation</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td><a href="/notes/computational-biology/kabsch-algorithm/">Kabsch (1976)</a></td>
					<td>Orthogonal matrix</td>
					<td>Eigendecomposition of $\tilde{R}R$ ($3 \times 3$)</td>
			</tr>
			<tr>
					<td><a href="/notes/computational-biology/horn-absolute-orientation/">Horn (1987)</a></td>
					<td>Unit quaternion</td>
					<td>Eigenvector of $N$ ($4 \times 4$)</td>
			</tr>
			<tr>
					<td>Horn et al. (1988)</td>
					<td>Orthonormal matrix</td>
					<td>Square root of $M^T M$ ($3 \times 3$)</td>
			</tr>
			<tr>
					<td><a href="/notes/computational-biology/arun-svd-point-fitting/">Arun et al. (1987)</a></td>
					<td>Orthonormal matrix</td>
					<td>SVD of $H$ ($3 \times 3$)</td>
			</tr>
	</tbody>
</table>
<p>The polar decomposition approach (this paper) and the SVD approach (<a href="/notes/computational-biology/arun-svd-point-fitting/">Arun et al.</a>) are closely related: the SVD $M = U \Lambda V^T$ gives the polar decomposition as $M = (UV^T)(V \Lambda V^T)$ where $UV^T$ is the orthonormal factor and $V \Lambda V^T$ is the positive semidefinite factor. Both methods can produce reflections under noisy data, which <a href="/notes/computational-biology/umeyama-similarity-transformation/">Umeyama (1991)</a> later addressed.</p>
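<p>The correspondence can be checked in two lines, using the SVD notation of the paragraph above (where $U$ and $V$ hold the singular vectors, so $U$ here is not the polar factor) and assuming $M$ is nonsingular so that $\Lambda$ is invertible:</p>
<p>$$
M^T M = V \Lambda U^T U \Lambda V^T = V \Lambda^2 V^T \quad \Longrightarrow \quad (M^T M)^{1/2} = V \Lambda V^T
$$</p>
<p>$$
M (M^T M)^{-1/2} = U \Lambda V^T \, V \Lambda^{-1} V^T = U V^T
$$</p>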
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Horn, B. K. P., Hilden, H. M., &amp; Negahdaripour, S. (1988). Closed-form solution of absolute orientation using orthonormal matrices. <em>Journal of the Optical Society of America A</em>, 5(7), 1127-1135. <a href="https://doi.org/10.1364/josaa.5.001127">https://doi.org/10.1364/josaa.5.001127</a></p>
<p><strong>Publication</strong>: Journal of the Optical Society of America A, 1988</p>
<p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/posts/kabsch-algorithm/">Kabsch Algorithm: NumPy, PyTorch, TensorFlow, and JAX</a> (tutorial with differentiable implementations)</li>
<li><a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> (a differentiable, gradient-safe implementation of Kabsch, Horn, and Umeyama alignment across NumPy, PyTorch, JAX, TensorFlow, and MLX)</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{horn1988closed,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Closed-form solution of absolute orientation using orthonormal matrices}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Horn, Berthold K. P. and Hilden, Hugh M. and Negahdaripour, Shahriar}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Journal of the Optical Society of America A}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{5}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{7}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{1127--1135}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1988}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Optica Publishing Group}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1364/josaa.5.001127}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Arun et al.: SVD-Based Least-Squares Fitting of 3D Points</title><link>https://hunterheidenreich.com/notes/computational-biology/arun-svd-point-fitting/</link><pubDate>Mon, 16 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/computational-biology/arun-svd-point-fitting/</guid><description>Arun, Huang, and Blostein (1987) introduce an SVD-based algorithm for least-squares rotation and translation between two 3D point sets.</description><content:encoded><![CDATA[<h2 id="svd-for-3d-point-set-registration">SVD for 3D Point Set Registration</h2>
<p>This <strong>Method</strong> paper presents a concise algorithm for finding the least-squares rotation and translation between two 3D point sets using the singular value decomposition (SVD) of a $3 \times 3$ cross-covariance matrix. The approach is closely related to the earlier <a href="/notes/computational-biology/kabsch-algorithm/">Kabsch algorithm</a> (1976), which used eigendecomposition, and was developed independently of <a href="/notes/computational-biology/horn-absolute-orientation/">Horn&rsquo;s quaternion method</a> (1987). The paper also identifies a reflection degeneracy that <a href="/notes/computational-biology/umeyama-similarity-transformation/">Umeyama</a> later provided a complete fix for.</p>
<h2 id="problem-formulation">Problem Formulation</h2>
<p>Given two 3D point sets $\{p_i\}$ and $\{p'_i\}$ ($i = 1, \ldots, N$) related by:</p>
<p>$$
p'_i = R p_i + T + N_i
$$</p>
<p>where $R$ is a rotation matrix, $T$ is a translation vector, and $N_i$ is noise, find $\hat{R}$ and $\hat{T}$ minimizing:</p>
<p>$$
\Sigma^2 = \sum_{i=1}^{N} \lVert p'_i - (R p_i + T) \rVert^2
$$</p>
<h2 id="decoupling-translation-and-rotation">Decoupling Translation and Rotation</h2>
<p>The translation is eliminated by centering both point sets at their centroids $p$ and $p'$. Defining centered coordinates $q_i = p_i - p$ and $q'_i = p'_i - p'$, the problem reduces to:</p>
<p>$$
\Sigma^2 = \sum_{i=1}^{N} \lVert q'_i - R q_i \rVert^2
$$</p>
<p>Once $\hat{R}$ is found, the translation follows as $\hat{T} = p' - \hat{R} p$.</p>
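<p>The decoupling can be made explicit. Writing $p_i = q_i + p$ and $p'_i = q'_i + p'$, and using $\sum_i q_i = \sum_i q'_i = 0$ so that the cross term vanishes, the objective splits as follows (a sketch in the notation above):</p>
<p>$$
\Sigma^2 = \sum_{i=1}^{N} \lVert (q'_i - R q_i) + (p' - R p - T) \rVert^2 = \sum_{i=1}^{N} \lVert q'_i - R q_i \rVert^2 + N \lVert p' - R p - T \rVert^2
$$</p>
<p>The second term is nonnegative and is driven to zero by $T = p' - Rp$ for any $R$, which leaves the rotation-only problem.</p>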
<h2 id="the-svd-algorithm">The SVD Algorithm</h2>
<p>The algorithm proceeds in five steps:</p>
<ol>
<li>Center both point sets by subtracting centroids</li>
<li>Compute the $3 \times 3$ cross-covariance matrix: $H = \sum_{i=1}^{N} q_i q'^t_i$</li>
<li>Compute the SVD: $H = U \Lambda V^t$</li>
<li>Form the candidate rotation: $X = V U^t$</li>
<li>Check $\det(X)$: if $+1$, then $\hat{R} = X$; if $-1$, the result is a reflection</li>
</ol>
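<p>The five steps map almost line for line onto NumPy. The following is a minimal sketch, assuming points are stored one per row; the function name <code>arun_fit</code> and the choice to return $\det(X)$ alongside the fit are illustrative, not part of the paper. No reflection handling is attempted here, so the caller must inspect the returned determinant.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import numpy as np


def arun_fit(p, p_prime):
    """Least-squares X, T with p'_i ~ X p_i + T; inputs have shape (N, 3)."""
    c, c_prime = p.mean(axis=0), p_prime.mean(axis=0)
    q, q_prime = p - c, p_prime - c_prime      # step 1: center both sets
    H = q.T @ q_prime                          # step 2: H = sum_i q_i q'_i^t
    U, lam, Vt = np.linalg.svd(H)              # step 3: H = U diag(lam) V^t
    X = Vt.T @ U.T                             # step 4: X = V U^t
    det_X = np.linalg.det(X)                   # step 5: +1 rotation, -1 reflection
    T = c_prime - X @ c                        # translation from the centroids
    return X, T, det_X
</code></pre></div>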
<p>The key insight is that minimizing $\Sigma^2$ is equivalent to maximizing $\operatorname{Trace}(RH)$. Using a lemma based on the Cauchy-Schwarz inequality, Arun et al. show that $X = VU^t$ maximizes this trace over all orthonormal matrices.</p>
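<p>The equivalence follows by expanding the squared norm and using $R^t R = I$ (a short derivation in the notation above):</p>
<p>$$
\Sigma^2 = \sum_{i=1}^{N} \left( {q'_i}^t q'_i + q_i^t q_i \right) - 2 \sum_{i=1}^{N} {q'_i}^t R q_i, \qquad \sum_{i=1}^{N} {q'_i}^t R q_i = \operatorname{Trace}\left( R \sum_{i=1}^{N} q_i {q'_i}^t \right) = \operatorname{Trace}(RH)
$$</p>
<p>The first sum does not depend on $R$, so minimizing $\Sigma^2$ means maximizing $\operatorname{Trace}(RH)$.</p>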
<h2 id="the-reflection-problem">The Reflection Problem</h2>
<p>When $\det(VU^t) = -1$, the SVD produces a reflection rather than a proper rotation. Arun et al. analyze three cases:</p>
<p><strong>Noiseless, non-coplanar points</strong>: The SVD always gives a proper rotation ($\det = +1$). No issue arises.</p>
<p><strong>Coplanar points</strong> (including $N = 3$): One singular value of $H$ is zero. Both a rotation and a reflection achieve $\Sigma^2 = 0$. The fix is to flip the sign of the column of $V$ corresponding to the zero singular value:</p>
<p>$$
V' = [v_1, v_2, -v_3], \quad X' = V' U^t
$$</p>
<p><strong>Noisy, non-coplanar points with $\det = -1$</strong>: The paper acknowledges this case cannot be handled by the algorithm. The reflection genuinely minimizes $\Sigma^2$ over all orthonormal matrices, meaning no rotation achieves a lower error. The authors suggest this only occurs with very large noise and recommend RANSAC-like approaches.</p>
<p>This last case is precisely what <a href="/notes/computational-biology/umeyama-similarity-transformation/">Umeyama (1991)</a> later resolved with a corrected formulation using a sign matrix $S$ conditioned on $\det(\Sigma_{xy})$.</p>
<h2 id="computational-comparison">Computational Comparison</h2>
<p>The paper includes VAX 11/780 benchmarks comparing three methods:</p>
<table>
	<thead>
			<tr>
					<th>Points</th>
					<th>SVD (ms)</th>
					<th>Quaternion (ms)</th>
					<th>Iterative (ms)</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td>3</td>
					<td>54.6</td>
					<td>26.6</td>
					<td>126.8</td>
			</tr>
			<tr>
					<td>11</td>
					<td>37.0</td>
					<td>41.0</td>
					<td>105.2</td>
			</tr>
			<tr>
					<td>30</td>
					<td>44.2</td>
					<td>48.3</td>
					<td>111.0</td>
			</tr>
	</tbody>
</table>
<p>The SVD and quaternion methods have comparable speed, and both are significantly faster than the iterative approach. In this table the quaternion method is faster at 3 points, while SVD is slightly faster at 11 and 30 points. The core SVD computation operates on a $3 \times 3$ matrix regardless of $N$.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Arun, K. S., Huang, T. S., &amp; Blostein, S. D. (1987). Least-Squares Fitting of Two 3-D Point Sets. <em>IEEE Transactions on Pattern Analysis and Machine Intelligence</em>, PAMI-9(5), 698-700. <a href="https://doi.org/10.1109/TPAMI.1987.4767965">https://doi.org/10.1109/TPAMI.1987.4767965</a></p>
<p><strong>Publication</strong>: IEEE TPAMI, 1987</p>
<p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/posts/kabsch-algorithm/">Kabsch Algorithm: NumPy, PyTorch, TensorFlow, and JAX</a> (tutorial with differentiable implementations)</li>
<li><a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> (a differentiable, gradient-safe implementation of Kabsch, Horn, and Umeyama alignment across NumPy, PyTorch, JAX, TensorFlow, and MLX)</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{arun1987least,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Least-Squares Fitting of Two 3-D Point Sets}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Arun, K. S. and Huang, T. S. and Blostein, S. D.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{IEEE Transactions on Pattern Analysis and Machine Intelligence}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{PAMI-9}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{5}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{698--700}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1987}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{IEEE}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1109/TPAMI.1987.4767965}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Kabsch Algorithm: Optimal Rotation for Point Set Alignment</title><link>https://hunterheidenreich.com/notes/computational-biology/kabsch-algorithm/</link><pubDate>Sun, 15 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/computational-biology/kabsch-algorithm/</guid><description>Kabsch (1976) derives a closed-form solution for the optimal rotation aligning two weighted vector sets by minimizing squared deviations.</description><content:encoded><![CDATA[<h2 id="a-closed-form-solution-for-optimal-rotation">A Closed-Form Solution for Optimal Rotation</h2>
<p>This short communication is a <strong>Method</strong> paper: it gives a direct, analytical solution to a constrained optimization problem. Given two sets of vectors, Kabsch derives the orthogonal matrix (rotation) that best superimposes one set onto the other by minimizing a weighted sum of squared deviations. Prior approaches either solved an unconstrained problem and factorized the result (Diamond, 1976) or used iterative methods (McLachlan, 1972). Kabsch shows that a direct, non-iterative solution exists despite the non-linear nature of the orthogonality constraint.</p>
<h2 id="the-superposition-problem">The Superposition Problem</h2>
<p>The core problem arises frequently in crystallography and structural biology: given two sets of corresponding points (e.g., atomic coordinates from a known structure and experimentally measured coordinates), find the rigid rotation that best aligns them. Translations can be removed by centering both point sets at the origin, leaving only the rotational component.</p>
<p>Formally, given vector sets $\mathbf{x}_n$ and $\mathbf{y}_n$ ($n = 1, 2, \ldots, N$) with weights $w_n$, find the orthogonal matrix $\mathsf{U}$ minimizing:</p>
<p>$$
E = \frac{1}{2} \sum_{n} w_n (\mathsf{U} \mathbf{x}_n - \mathbf{y}_n)^2
$$</p>
<p>subject to orthogonality: $\tilde{\mathsf{U}} \mathsf{U} = \mathsf{I}$.</p>
<h2 id="derivation-via-lagrange-multipliers">Derivation via Lagrange Multipliers</h2>
<p>Kabsch introduces a symmetric matrix $\mathsf{L}$ of Lagrange multipliers to enforce orthogonality, forming the Lagrangian:</p>
<p>$$
G = E + \frac{1}{2} \sum_{i,j} l_{ij} \left( \sum_{k} u_{ki} u_{kj} - \delta_{ij} \right)
$$</p>
<p>Setting $\partial G / \partial u_{ij} = 0$ and defining two key matrices:</p>
<p>$$
r_{ij} = \sum_{n} w_n \, y_{ni} \, x_{nj} \qquad s_{ij} = \sum_{n} w_n \, x_{ni} \, x_{nj}
$$</p>
<p>where $\mathsf{R} = (r_{ij})$ is the weighted cross-covariance matrix and $\mathsf{S} = (s_{ij})$ is the weighted auto-covariance matrix, the stationarity condition becomes:</p>
<p>$$
\mathsf{U} \cdot (\mathsf{S} + \mathsf{L}) = \mathsf{R}
$$</p>
<h2 id="eigendecomposition-solution">Eigendecomposition Solution</h2>
<p>The key insight is that multiplying both sides by their transposes eliminates the unknown $\mathsf{U}$:</p>
<p>$$
(\mathsf{S} + \mathsf{L})(\mathsf{S} + \mathsf{L}) = \tilde{\mathsf{R}} \mathsf{R}
$$</p>
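<p>To spell out that step (a short derivation, using the symmetry of $\mathsf{S}$ and $\mathsf{L}$): transposing the stationarity condition gives $(\mathsf{S} + \mathsf{L}) \tilde{\mathsf{U}} = \tilde{\mathsf{R}}$, and multiplying this by the original condition yields</p>
<p>$$
\tilde{\mathsf{R}} \mathsf{R} = (\mathsf{S} + \mathsf{L}) \tilde{\mathsf{U}} \mathsf{U} (\mathsf{S} + \mathsf{L}) = (\mathsf{S} + \mathsf{L})^2
$$</p>
<p>where the last equality uses $\tilde{\mathsf{U}} \mathsf{U} = \mathsf{I}$.</p>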
<p>When $\mathsf{R}$ is nonsingular, $\tilde{\mathsf{R}} \mathsf{R}$ is symmetric positive definite, so it has positive eigenvalues $\mu_k$ and orthonormal eigenvectors $\mathbf{a}_k$. The matrix $\mathsf{S} + \mathsf{L}$ shares the same eigenvectors with eigenvalues $\sqrt{\mu_k}$.</p>
<p>From the eigenvectors $\mathbf{a}_k$, a second set of unit vectors $\mathbf{b}_k$ is defined:</p>
<p>$$
\mathbf{b}_k = \frac{1}{\sqrt{\mu_k}} \mathsf{R} \, \mathbf{a}_k
$$</p>
<p>The optimal rotation matrix is then constructed directly:</p>
<p>$$
u_{ij} = \sum_{k} b_{ki} \, a_{kj}
$$</p>
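<p>The construction above can be sketched in NumPy as follows. This is a minimal illustration, assuming already-centered points stored one per row and a nonsingular $\mathsf{R}$ (so every $\mu_k$ is positive). The function name <code>kabsch_eig</code> is a hypothetical choice, and the sketch does not exclude improper rotations.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import numpy as np


def kabsch_eig(x, y, w):
    """Orthogonal U minimizing sum_n w_n |U x_n - y_n|^2 via eigenvectors of R~R.

    x, y: centered points, shape (N, 3); w: positive weights, shape (N,).
    """
    R = (w[:, None] * y).T @ x          # r_ij = sum_n w_n y_ni x_nj
    mu, a = np.linalg.eigh(R.T @ R)     # eigenvalues mu_k, eigenvectors a_k in columns
    b = (R @ a) / np.sqrt(mu)           # b_k = R a_k / sqrt(mu_k), stored in columns
    return b @ a.T                      # u_ij = sum_k b_ki a_kj
</code></pre></div>
<p>When $\mathbf{y}_n = \mathsf{U}_0 \mathbf{x}_n$ exactly for some rotation $\mathsf{U}_0$ and the points span three dimensions, the sketch should return $\mathsf{U}_0$ up to floating-point error.</p>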
<h2 id="handling-degeneracies-and-generalizations">Handling Degeneracies and Generalizations</h2>
<p>Kabsch addresses two extensions:</p>
<ol>
<li>
<p><strong>Planar point sets</strong>: When all vectors lie in a plane, one eigenvalue of $\tilde{\mathsf{R}} \mathsf{R}$ is zero. The missing eigenvectors are recovered via cross products: $\mathbf{a}_3 = \mathbf{a}_1 \times \mathbf{a}_2$ and $\mathbf{b}_3 = \mathbf{b}_1 \times \mathbf{b}_2$.</p>
</li>
<li>
<p><strong>General metric constraints</strong>: The orthogonality constraint $\tilde{\mathsf{U}} \mathsf{U} = \mathsf{I}$ can be replaced by $\tilde{\mathsf{U}} \mathsf{U} = \mathsf{M}$ for any symmetric positive definite $\mathsf{M}$. By finding any specific solution $\mathsf{B}$ and transforming the input vectors as $\mathbf{x}'_n = \mathsf{B} \mathbf{x}_n$, the problem reduces back to the standard orthogonal case.</p>
</li>
</ol>
<p>The method generalizes naturally to vector spaces of arbitrary dimension.</p>
<h2 id="legacy-and-impact">Legacy and Impact</h2>
<p>This two-page communication became one of the most cited papers in structural biology. The &ldquo;Kabsch algorithm&rdquo; (or &ldquo;Kabsch rotation&rdquo;) is the standard method for computing the root-mean-square deviation (RMSD) between two molecular structures after optimal superposition. It underpins structure comparison tools across crystallography, NMR spectroscopy, cryo-EM, and computational chemistry.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Kabsch, W. (1976). A solution for the best rotation to relate two sets of vectors. <em>Acta Crystallographica Section A</em>, 32(5), 922-923. <a href="https://doi.org/10.1107/s0567739476001873">https://doi.org/10.1107/s0567739476001873</a></p>
<p><strong>Publication</strong>: Acta Crystallographica Section A, 1976</p>
<p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/posts/kabsch-algorithm/">Kabsch Algorithm: NumPy, PyTorch, TensorFlow, and JAX</a> (tutorial with differentiable implementations)</li>
<li><a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> (a differentiable, gradient-safe implementation of Kabsch, Horn, and Umeyama alignment across NumPy, PyTorch, JAX, TensorFlow, and MLX)</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{kabsch1976solution,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{A solution for the best rotation to relate two sets of vectors}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Kabsch, Wolfgang}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Acta Crystallographica Section A: Crystal Physics, Diffraction, Theoretical and General Crystallography}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{32}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{5}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{922--923}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1976}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{International Union of Crystallography}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1107/s0567739476001873}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Horn's Method: Absolute Orientation via Unit Quaternions</title><link>https://hunterheidenreich.com/notes/computational-biology/horn-absolute-orientation/</link><pubDate>Sun, 15 Mar 2026 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/computational-biology/horn-absolute-orientation/</guid><description>Horn (1987) presents a closed-form quaternion solution for absolute orientation, finding optimal rotation, translation, and scale between two point sets.</description><content:encoded><![CDATA[<h2 id="a-quaternion-approach-to-point-set-registration">A Quaternion Approach to Point Set Registration</h2>
<p>This <strong>Method</strong> paper presents a closed-form solution to the absolute orientation problem: given corresponding points measured in two different coordinate systems, find the optimal rotation, translation, and scale that maps one set onto the other. While the <a href="/notes/computational-biology/kabsch-algorithm/">Kabsch algorithm</a> (1976) solved the rotation subproblem via eigendecomposition of $\tilde{\mathsf{R}}\mathsf{R}$, Horn&rsquo;s approach uses unit quaternions to represent rotation, reducing the problem to finding the eigenvector of a $4 \times 4$ symmetric matrix associated with its largest eigenvalue.</p>
<h2 id="the-absolute-orientation-problem">The Absolute Orientation Problem</h2>
<p>Given $n$ point pairs $\{\mathbf{r}_{l,i}\}$ and $\{\mathbf{r}_{r,i}\}$ measured in &ldquo;left&rdquo; and &ldquo;right&rdquo; coordinate systems, find the transformation:</p>
<p>$$
\mathbf{r}_r = s \, R(\mathbf{r}_l) + \mathbf{r}_0
$$</p>
<p>where $s$ is a scale factor, $R$ is a rotation, and $\mathbf{r}_0$ is a translation, minimizing the sum of squared residual errors:</p>
<p>$$
\sum_{i=1}^{n} \lVert \mathbf{r}_{r,i} - s \, R(\mathbf{r}_{l,i}) - \mathbf{r}_0 \rVert^2
$$</p>
<p>Prior methods either used iterative numerical procedures or selectively discarded constraints (e.g., Thompson&rsquo;s and Schut&rsquo;s three-point methods). Horn derives a direct solution that uses all available information from all points simultaneously.</p>
<h2 id="decoupling-translation-scale-and-rotation">Decoupling Translation, Scale, and Rotation</h2>
<p>Horn shows that the three components of the transformation can be solved sequentially.</p>
<p><strong>Translation</strong>: After centering both point sets at their centroids ($\bar{\mathbf{r}}_l$ and $\bar{\mathbf{r}}_r$), the optimal translation is:</p>
<p>$$
\mathbf{r}_0 = \bar{\mathbf{r}}_r - s \, R(\bar{\mathbf{r}}_l)
$$</p>
<p><strong>Scale</strong>: Horn derives three formulations (asymmetric left, asymmetric right, and symmetric). The symmetric version, which ensures the inverse transformation yields the reciprocal scale, is:</p>
<p>$$
s = \left( \frac{\sum_{i=1}^{n} \lVert \mathbf{r}'_{r,i} \rVert^2}{\sum_{i=1}^{n} \lVert \mathbf{r}'_{l,i} \rVert^2} \right)^{1/2}
$$</p>
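<p>This is the ratio of the root-mean-square deviations of the two point sets from their respective centroids, where primed vectors denote coordinates measured relative to the centroids.</p>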
<p><strong>Rotation</strong>: After removing translation and scale, the remaining problem is to find the rotation $R$ that maximizes:</p>
<p>$$
\sum_{i=1}^{n} \mathbf{r}'_{r,i} \cdot R(\mathbf{r}'_{l,i})
$$</p>
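<p>The translation and scale steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not Horn&rsquo;s own code: the helper names (<code>center</code>, <code>symmetric_scale</code>, <code>translation</code>) and the synthetic test transform are assumptions made for the example, and the rotation is taken as known here so that the other two components can be checked in isolation.</p>

```python
import numpy as np

def center(points):
    """Return the centroid and the centered coordinates of an (n, 3) array."""
    centroid = points.mean(axis=0)
    return centroid, points - centroid

def symmetric_scale(left_centered, right_centered):
    """Symmetric scale: ratio of RMS deviations from the two centroids."""
    return np.sqrt(np.sum(right_centered**2) / np.sum(left_centered**2))

def translation(left_centroid, right_centroid, scale, rotation):
    """r_0 = centroid_r - s * R(centroid_l), with R given as a 3x3 matrix."""
    return right_centroid - scale * rotation @ left_centroid

# Synthetic check (hypothetical data): apply a known transform to random points.
rng = np.random.default_rng(0)
left = rng.normal(size=(10, 3))
theta = 0.3
true_R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
true_s = 2.5
true_t = np.array([1.0, -2.0, 0.5])
right = true_s * left @ true_R.T + true_t

left_bar, left_c = center(left)
right_bar, right_c = center(right)
s = symmetric_scale(left_c, right_c)          # needs no knowledge of R
t = translation(left_bar, right_bar, s, true_R)
```

<p>Note that the symmetric scale is computed from the centered point sets alone, which is why it can be found before the rotation is known.</p>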
<h2 id="the-quaternion-eigenvector-solution">The Quaternion Eigenvector Solution</h2>
<p>Horn represents rotation using unit quaternions $\dot{q} = q_0 + i q_x + j q_y + k q_z$ with $\lVert \dot{q} \rVert = 1$. A rotation acts on a vector (represented as a purely imaginary quaternion $\dot{r}$) via the composite product:</p>
<p>$$
\dot{r}' = \dot{q} \, \dot{r} \, \dot{q}^*
$$</p>
<p>Using the $4 \times 4$ matrix representations of quaternion products, the objective function becomes a quadratic form:</p>
<p>$$
\dot{q}^T N \dot{q}
$$</p>
<p>where $N$ is a real symmetric $4 \times 4$ matrix whose elements are combinations of the sums of products $S_{xx}, S_{xy}, \ldots, S_{zz}$ from the $3 \times 3$ cross-covariance matrix $M = \sum_i \mathbf{r}'_{l,i} \mathbf{r}'^T_{r,i}$:</p>
<p>$$
N = \begin{bmatrix} (S_{xx} + S_{yy} + S_{zz}) &amp; S_{yz} - S_{zy} &amp; S_{zx} - S_{xz} &amp; S_{xy} - S_{yx} \\ S_{yz} - S_{zy} &amp; (S_{xx} - S_{yy} - S_{zz}) &amp; S_{xy} + S_{yx} &amp; S_{zx} + S_{xz} \\ S_{zx} - S_{xz} &amp; S_{xy} + S_{yx} &amp; (-S_{xx} + S_{yy} - S_{zz}) &amp; S_{yz} + S_{zy} \\ S_{xy} - S_{yx} &amp; S_{zx} + S_{xz} &amp; S_{yz} + S_{zy} &amp; (-S_{xx} - S_{yy} + S_{zz}) \end{bmatrix}
$$</p>
<p>The trace of $N$ is always zero. The unit quaternion maximizing $\dot{q}^T N \dot{q}$ is the eigenvector corresponding to the most positive eigenvalue of $N$.</p>
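<p>As a rough sketch of the eigenvector solution, the snippet below builds $N$ from centered point sets, takes the eigenvector of the largest eigenvalue, and converts that unit quaternion to a rotation matrix. The function names and the synthetic data are illustrative assumptions; <code>np.linalg.eigh</code> is used because $N$ is symmetric and it returns eigenvalues in ascending order.</p>

```python
import numpy as np

def build_N(M):
    """Assemble Horn's symmetric 4x4 matrix N from the 3x3 matrix M."""
    (Sxx, Sxy, Sxz), (Syx, Syy, Syz), (Szx, Szy, Szz) = M
    return np.array([
        [Sxx + Syy + Szz, Syz - Szy, Szx - Sxz, Sxy - Syx],
        [Syz - Szy, Sxx - Syy - Szz, Sxy + Syx, Szx + Sxz],
        [Szx - Sxz, Sxy + Syx, -Sxx + Syy - Szz, Syz + Szy],
        [Sxy - Syx, Szx + Sxz, Syz + Szy, -Sxx - Syy + Szz],
    ])

def quaternion_to_matrix(q):
    """Rotation matrix of a unit quaternion q = (q0, qx, qy, qz)."""
    q0, qx, qy, qz = q
    return np.array([
        [q0*q0 + qx*qx - qy*qy - qz*qz, 2*(qx*qy - q0*qz), 2*(qx*qz + q0*qy)],
        [2*(qy*qx + q0*qz), q0*q0 - qx*qx + qy*qy - qz*qz, 2*(qy*qz - q0*qx)],
        [2*(qz*qx - q0*qy), 2*(qz*qy + q0*qx), q0*q0 - qx*qx - qy*qy + qz*qz],
    ])

def horn_rotation(left_centered, right_centered):
    """Rotation taking centered left points onto centered right points."""
    M = left_centered.T @ right_centered      # M[a, b] = sum_i l_a * r_b
    eigenvalues, eigenvectors = np.linalg.eigh(build_N(M))
    q = eigenvectors[:, -1]                   # most positive eigenvalue
    return quaternion_to_matrix(q), q

# Synthetic check (hypothetical data): recover a random proper rotation.
rng = np.random.default_rng(0)
left = rng.normal(size=(20, 3))
left -= left.mean(axis=0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
true_R = Q * np.sign(np.linalg.det(Q))        # force det = +1
right = left @ true_R.T

R_est, q_est = horn_rotation(left, right)
```

<p>The sign of the returned quaternion is arbitrary, since $\dot{q}$ and $-\dot{q}$ describe the same rotation; the rotation matrix is unaffected.</p>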
<h2 id="the-characteristic-polynomial">The Characteristic Polynomial</h2>
<p>The eigenvalues satisfy a quartic $\lambda^4 + c_3 \lambda^3 + c_2 \lambda^2 + c_1 \lambda + c_0 = 0$ where:</p>
<ul>
<li>$c_3 = 0$ (trace of $N$ is zero, so the four roots sum to zero)</li>
<li>$c_2 = -2 \operatorname{Tr}(M^T M)$ (always negative, guaranteeing both positive and negative roots)</li>
<li>$c_1 = -8 \det(M)$</li>
<li>$c_0 = \det(N)$</li>
</ul>
<p>When points are coplanar (including the common case of exactly three points), $\det(M) = 0$, so $c_1 = 0$ and the quartic reduces to a biquadratic solvable in closed form.</p>
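<p>The coefficient identities can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a random $M$, builds $N$ in a compact block form that is algebraically equivalent to the explicit matrix given earlier, and compares the output of <code>np.poly</code> (the characteristic polynomial coefficients, highest degree first) against the closed-form expressions.</p>

```python
import numpy as np

def build_N(M):
    """Block form of Horn's N, equivalent to the explicit 4x4 matrix."""
    delta = np.array([M[1, 2] - M[2, 1], M[2, 0] - M[0, 2], M[0, 1] - M[1, 0]])
    N = np.empty((4, 4))
    N[0, 0] = np.trace(M)
    N[0, 1:] = delta
    N[1:, 0] = delta
    N[1:, 1:] = M + M.T - np.trace(M) * np.eye(3)
    return N

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3))       # hypothetical cross-covariance matrix
N = build_N(M)

# np.poly returns [1, c3, c2, c1, c0] for the characteristic polynomial of N.
numeric = np.real(np.poly(N))[1:]
closed_form = np.array([
    0.0,                           # c3: trace of N is zero
    -2.0 * np.trace(M.T @ M),      # c2
    -8.0 * np.linalg.det(M),       # c1
    np.linalg.det(N),              # c0
])
```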
<h2 id="coplanar-points-and-the-three-point-case">Coplanar Points and the Three-Point Case</h2>
<p>For coplanar measurements, the quartic simplifies to $\lambda^4 + c_2 \lambda^2 + c_0 = 0$, yielding:</p>
<p>$$
\lambda_m = \left[ \frac{1}{2} \left( (c_2^2 - 4c_0)^{1/2} - c_2 \right) \right]^{1/2}
$$</p>
<p>Horn also provides a geometric interpretation for the coplanar case: first rotate one plane into the other (about their line of intersection), then solve a 2D least-squares rotation within the shared plane.</p>
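<p>A small numerical check of the biquadratic formula, assuming noise-free synthetic data with the left point set confined to the plane $z = 0$ (so $\det(M) = 0$). Variable names are illustrative. For exact data the largest eigenvalue should also equal the sum of squared norms of the centered left points, which gives a second sanity check.</p>

```python
import numpy as np

def build_N(M):
    """Block form of Horn's N, equivalent to the explicit 4x4 matrix."""
    delta = np.array([M[1, 2] - M[2, 1], M[2, 0] - M[0, 2], M[0, 1] - M[1, 0]])
    N = np.empty((4, 4))
    N[0, 0] = np.trace(M)
    N[0, 1:] = delta
    N[1:, 0] = delta
    N[1:, 1:] = M + M.T - np.trace(M) * np.eye(3)
    return N

rng = np.random.default_rng(2)
# Coplanar left points (z = 0), centered at their centroid.
left = np.column_stack([rng.normal(size=(6, 2)), np.zeros(6)])
left -= left.mean(axis=0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
rotation = Q * np.sign(np.linalg.det(Q))      # proper rotation, det = +1
right = left @ rotation.T

M = left.T @ right
N = build_N(M)
c2 = -2.0 * np.trace(M.T @ M)
c0 = np.linalg.det(N)

# Closed-form largest root of lambda^4 + c2 * lambda^2 + c0 = 0.
discriminant = max(c2 * c2 - 4.0 * c0, 0.0)   # guard against round-off
lambda_m = np.sqrt(0.5 * (np.sqrt(discriminant) - c2))
lambda_numeric = np.linalg.eigvalsh(N)[-1]
```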
<h2 id="comparison-with-the-kabsch-algorithm">Comparison with the Kabsch Algorithm</h2>
<p>Both methods solve the same underlying optimization problem but approach it differently:</p>
<table>
	<thead>
			<tr>
					<th>Aspect</th>
					<th>Kabsch (1976)</th>
					<th>Horn (1987)</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td>Rotation representation</td>
					<td>Orthogonal matrix</td>
					<td>Unit quaternion</td>
			</tr>
			<tr>
					<td>Core computation</td>
					<td>SVD or eigendecomposition of $\tilde{R}R$ ($3 \times 3$)</td>
					<td>Eigenvector of $N$ ($4 \times 4$)</td>
			</tr>
			<tr>
					<td>Scale estimation</td>
					<td>Not addressed</td>
					<td>Three formulations (including symmetric)</td>
			</tr>
			<tr>
					<td>Constraint enforcement</td>
					<td>Lagrange multipliers</td>
					<td>Unit quaternion norm</td>
			</tr>
			<tr>
					<td>Symmetry guarantee</td>
					<td>Not addressed</td>
					<td>Proven for symmetric scale</td>
			</tr>
			<tr>
					<td>Degenerate cases</td>
					<td>Cross-product fallback</td>
					<td>Biquadratic closed form</td>
			</tr>
	</tbody>
</table>
<p>Horn emphasizes a symmetry property: the inverse transformation should yield exactly the inverse parameters. This holds automatically for the quaternion rotation but requires a specific (symmetric) choice of scale formula.</p>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Horn, B. K. P. (1987). Closed-form solution of absolute orientation using unit quaternions. <em>Journal of the Optical Society of America A</em>, 4(4), 629-642. <a href="https://doi.org/10.1364/JOSAA.4.000629">https://doi.org/10.1364/JOSAA.4.000629</a></p>
<p><strong>Publication</strong>: Journal of the Optical Society of America A, 1987</p>
<p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="/posts/kabsch-algorithm/">Kabsch Algorithm: NumPy, PyTorch, TensorFlow, and JAX</a> (tutorial with differentiable implementations of the related SVD-based method)</li>
<li><a href="/projects/kabsch-horn-cookbook/">Kabsch-Horn Cookbook</a> (a differentiable, gradient-safe implementation of Kabsch, Horn, and Umeyama alignment across NumPy, PyTorch, JAX, TensorFlow, and MLX)</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@article</span>{horn1987closed,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{Closed-form solution of absolute orientation using unit quaternions}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Horn, Berthold K. P.}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">journal</span>=<span style="color:#e6db74">{Journal of the Optical Society of America A}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span>=<span style="color:#e6db74">{4}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">number</span>=<span style="color:#e6db74">{4}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{629--642}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1987}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{Optica Publishing Group}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span>=<span style="color:#e6db74">{10.1364/josaa.4.000629}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>DynamicFlow: Integrating Protein Dynamics into Drug Design</title><link>https://hunterheidenreich.com/notes/computational-biology/dynamicflow/</link><pubDate>Sat, 20 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/computational-biology/dynamicflow/</guid><description>Flow matching model that co-generates ligands and flexible protein pockets, addressing rigid-receptor limitations in structure-based drug design.</description><content:encoded><![CDATA[<h2 id="what-kind-of-paper-is-this">What kind of paper is this?</h2>
<p>This is primarily a <strong>Methodological Paper</strong> ($\Psi_{\text{Method}}$) with a strong <strong>Resource</strong> ($\Psi_{\text{Resource}}$) component.</p>
<ul>
<li><strong>Method</strong>: It proposes <strong>DynamicFlow</strong>, a novel multiscale architecture that combines atom-level SE(3)-equivariant GNNs and residue-level Transformers within a <a href="/notes/machine-learning/generative-models/flow-matching-for-generative-modeling/">flow matching</a> framework to model the joint distribution of ligand generation and protein conformational change. (SE(3) is the special Euclidean group in 3D, the set of all 3D rotations and translations; equivariance means predictions transform consistently under those symmetries.)</li>
<li><strong>Resource</strong>: It curates a significant dataset derived from MISATO, pairing AlphaFold2-predicted apo structures with multiple MD-simulated holo states, specifically filtered for flow matching tasks.</li>
</ul>
<h2 id="what-is-the-motivation">What is the motivation?</h2>
<p>Traditional Structure-Based Drug Design (SBDD) methods typically assume the protein target is rigid, which limits their applicability because proteins are dynamic and undergo conformational changes (induced fit) upon ligand binding.</p>
<ul>
<li><strong>Biological Reality</strong>: Proteins exist as ensembles of states; binding often involves transitions from &ldquo;apo&rdquo; (unbound) to &ldquo;holo&rdquo; (bound) <a href="/posts/geom-conformer-generation-dataset/">conformational changes</a>, sometimes revealing cryptic pockets.</li>
<li><strong>Computational Bottleneck</strong>: <a href="/notes/chemistry/molecular-simulation/">Molecular Dynamics (MD)</a> can simulate these changes, but it is computationally expensive because crossing the energy barriers between conformational states requires long simulations.</li>
<li><strong>Gap</strong>: <a href="/notes/machine-learning/generative-models/">Existing generative models</a> for SBDD mostly condition on a fixed pocket structure, ignoring the co-adaptation of the protein and ligand.</li>
</ul>
<h2 id="what-is-the-novelty-here">What is the novelty here?</h2>
<p>The core novelty is the <strong>simultaneous modeling of ligand generation and protein conformational dynamics</strong> using a unified flow matching framework.</p>
<ul>
<li><strong>DynamicFlow Architecture</strong>: A multiscale model that treats the protein as both full-atom (for interaction) and residue-level frames (for large-scale dynamics), utilizing separate flow matching objectives for backbone frames, side-chain torsions, and ligand atoms.</li>
<li><strong>Stochastic Flow (SDE)</strong>: Introduction of a <a href="/notes/machine-learning/generative-models/score-based-generative-modeling-sde/">stochastic variant</a> (DynamicFlow-SDE) that improves robustness and diversity compared to the deterministic ODE flow.</li>
<li><strong>Coupled Generation</strong>: The model learns to transport the <em>apo</em> pocket distribution to the <em>holo</em> pocket distribution while simultaneously denoising the ligand, advancing beyond rigid pocket docking methods.</li>
</ul>
<h2 id="what-experiments-were-performed">What experiments were performed?</h2>
<p>The authors validated the method on a curated dataset of 5,692 protein-ligand complexes.</p>
<ul>
<li><strong>Baselines</strong>: Compared against rigid-pocket SBDD methods: Pocket2Mol, TargetDiff, and IPDiff (adapted as TargetDiff* and IPDiff* for fair comparison of atom numbers). Also compared against conformation sampling baselines (Str2Str).</li>
<li><strong>Metrics</strong>:
<ul>
<li><strong>Ligand Quality</strong>: Vina Score (binding affinity), QED (drug-likeness), SA (synthesizability), Lipinski&rsquo;s rule of 5.</li>
<li><strong>Pocket Quality</strong>: RMSD between generated and ground-truth holo pockets, Cover Ratio (percentage of holo states successfully retrieved), and Pocket Volume distributions.</li>
<li><strong>Interaction</strong>: Protein-Ligand Interaction Profiler (PLIP) to measure specific non-covalent interactions.</li>
</ul>
</li>
<li><strong>Ablations</strong>: Tested the impact of the interaction loss, residue-level Transformer, and SDE vs. ODE formulations.</li>
</ul>
<h2 id="what-outcomesconclusions">What outcomes/conclusions?</h2>
<ul>
<li><strong>Improved Affinity</strong>: DynamicFlow-SDE achieved the best (lowest) Vina scores ($-7.65$) compared to baselines like TargetDiff ($-5.09$) and Pocket2Mol ($-5.50$). Note that Vina scores are a computational proxy and do not directly predict experimental binding affinity. Moreover, Vina score optimization is gameable: molecules can achieve strong computed binding energies while remaining synthetically inaccessible. QED and SA scores, which assess drug-likeness and synthesizability respectively, were reported but were not primary optimization targets in the paper, which limits the strength of this affinity claim.</li>
<li><strong>Realistic Dynamics</strong>: The model successfully generated holo-like pocket conformations with volume distributions and interaction profiles closer to ground-truth MD simulations than the initial apo structures.</li>
<li><strong>Enhancing Rigid Methods</strong>: Holo pockets generated by DynamicFlow served as better inputs for rigid-SBDD baselines (e.g., TargetDiff improved from $-5.09$ to $-9.00$ and IPDiff improved from $-7.55$ to $-11.04$ when using &ldquo;Our Pocket&rdquo;), suggesting the method can act as a &ldquo;pocket refiner&rdquo;.</li>
<li><strong>ODE vs. SDE Trade-off</strong>: The deterministic ODE variant achieves better pocket RMSD, while the stochastic SDE variant achieves better Cover Ratio (diversity of holo states captured) and binding affinity. Neither dominates uniformly.</li>
<li><strong>Conformation Sampling Baseline</strong>: Str2Str, a dedicated conformation sampling baseline, performed worse than simply perturbing the apo structure with noise. One interpretation is that this highlights the difficulty of the apo-to-holo prediction task; another is that Str2Str was not designed specifically for apo-to-holo prediction, making it a limited test of its capabilities.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The dataset is derived from <strong>MISATO</strong>, which contains MD trajectories for PDBbind complexes.</p>
<table>
	<thead>
			<tr>
					<th>Purpose</th>
					<th>Dataset</th>
					<th>Size</th>
					<th>Notes</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td><strong>Training/Test</strong></td>
					<td>Curated MISATO</td>
					<td>5,692 complexes</td>
					<td>Filtered for valid MD (<a href="/posts/kabsch-algorithm/">RMSD</a> $&lt; 3\text{\AA}$), clustered to remove redundancy. Contains 46,235 holo-ligand conformations total.</td>
			</tr>
			<tr>
					<td><strong>Apo Structures</strong></td>
					<td>AlphaFold2</td>
					<td>N/A</td>
					<td>Apo structures were obtained by mapping PDB IDs to UniProt and retrieving AlphaFold2 predictions, then aligning to MISATO structures.</td>
			</tr>
			<tr>
					<td><strong>Splits</strong></td>
					<td>Standard</td>
					<td>50 test complexes</td>
					<td>50 complexes with no overlap with the training set selected for testing. Note: 50 is a small held-out set; results should be interpreted cautiously.</td>
			</tr>
	</tbody>
</table>
<p><strong>Preprocessing</strong>:</p>
<ul>
<li><strong>Clustering</strong>: Holo-ligand conformations clustered with RMSD threshold $1.0\text{\AA}$; top 10 clusters kept per complex.</li>
<li><strong>Pocket Definition</strong>: Residues within $7\text{\AA}$ of the ligand.</li>
<li><strong>Alignment</strong>: AlphaFold predicted structures (apo) aligned to MISATO holo structures using sequence alignment (Smith-Waterman) to identify pocket residues.</li>
</ul>
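<p>The pocket definition above can be sketched as a distance filter. This is a minimal illustration, not the authors&rsquo; pipeline: the function name, the array layout, and the choice to count a residue as soon as <em>any</em> of its atoms lies within the cutoff of <em>any</em> ligand atom are assumptions, since the note does not say which atoms the paper measures from.</p>

```python
import numpy as np

def pocket_residues(protein_xyz, residue_index, ligand_xyz, cutoff=7.0):
    """Residue ids with at least one atom within `cutoff` angstroms of the ligand.

    protein_xyz: (n_atoms, 3) coordinates, residue_index: (n_atoms,) integer ids,
    ligand_xyz: (n_ligand_atoms, 3) coordinates.
    """
    diffs = protein_xyz[:, None, :] - ligand_xyz[None, :, :]
    min_dist = np.linalg.norm(diffs, axis=-1).min(axis=1)   # per protein atom
    in_pocket = np.less_equal(min_dist, cutoff)
    return np.unique(residue_index[in_pocket])

# Toy example (hypothetical coordinates): four protein atoms on the x axis.
protein_xyz = np.array([[0.0, 0.0, 0.0],
                        [6.0, 0.0, 0.0],
                        [10.0, 0.0, 0.0],
                        [20.0, 0.0, 0.0]])
residue_index = np.array([0, 1, 1, 2])
ligand_xyz = np.array([[0.0, 0.0, 0.0]])
pocket = pocket_residues(protein_xyz, residue_index, ligand_xyz)
```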
<h3 id="algorithms">Algorithms</h3>
<p><strong>Flow Matching Framework</strong>:</p>
<ul>
<li><strong>Continuous Variables</strong> (Pocket translation/rotation/torsions, Ligand positions): Modeled using <strong>Conditional Flow Matching (CFM)</strong>.
<ul>
<li><em>Prior</em>: Apo state for pocket; Normal distribution for ligand positions.</li>
<li><em>Target</em>: Holo state from MD; Ground truth ligand.</li>
<li><em>Interpolant</em>: Linear interpolation for Euclidean variables; Geodesic for rotations ($SO(3)$, the rotation-only subgroup of SE(3) containing all 3D rotations but not translations); Wrapped linear interpolation for torsions (Torus).</li>
</ul>
</li>
<li><strong>Discrete Variables</strong> (Ligand atom/bond types): Modeled using <strong>Discrete Flow Matching</strong> based on Continuous-Time Markov Chains (CTMC).
<ul>
<li><em>Rate Matrix</em>: Interpolates between mask token and data distribution.</li>
</ul>
</li>
<li><strong>Loss Function</strong>: Weighted sum of 7 losses:
<ol>
<li>Translation CFM (Eq 5)</li>
<li>Rotation CFM (Eq 7)</li>
<li>Torsion CFM (Eq 11)</li>
<li>Ligand Position CFM</li>
<li>Ligand Atom Type CTMC (Eq 14)</li>
<li>Ligand Bond Type CTMC</li>
<li><strong>Interaction Loss</strong> (Eq 18): Explicitly penalizes deviations in pairwise distances between protein and ligand atoms for pairs $\leq 3.5\text{\AA}$.</li>
</ol>
</li>
</ul>
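<p>The three continuous interpolants listed above can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the paper&rsquo;s implementation: the function names are hypothetical, the time variable $t$ runs from 0 (prior) to 1 (target), and the $SO(3)$ log map used here is the simple trace-based formula, which is not reliable for rotation angles close to $\pi$.</p>

```python
import numpy as np

def lerp(x0, x1, t):
    """Linear interpolant for Euclidean variables (translations, positions)."""
    return (1.0 - t) * x0 + t * x1

def wrap_angle(theta):
    """Map angles to the interval [-pi, pi)."""
    return (theta + np.pi) % (2.0 * np.pi) - np.pi

def torsion_interpolant(theta0, theta1, t):
    """Wrapped linear interpolant on the torus: move along the shorter arc."""
    return wrap_angle(theta0 + t * wrap_angle(theta1 - theta0))

def hat(v):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def so3_exp(rotvec):
    """Rodrigues formula: rotation vector to rotation matrix."""
    angle = np.linalg.norm(rotvec)
    if np.isclose(angle, 0.0):
        return np.eye(3)
    K = hat(rotvec / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * K @ K

def so3_log(R):
    """Rotation matrix to rotation vector (not robust near angle = pi)."""
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if np.isclose(angle, 0.0):
        return np.zeros(3)
    axis = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return angle * axis / (2.0 * np.sin(angle))

def so3_geodesic(R0, R1, t):
    """Geodesic interpolant on SO(3): R_t = R0 exp(t log(R0^T R1))."""
    return R0 @ so3_exp(t * so3_log(R0.T @ R1))

# Example: halfway between the identity and a 90 degree rotation about z.
R1 = so3_exp(np.array([0.0, 0.0, np.pi / 2]))
R_half = so3_geodesic(np.eye(3), R1, 0.5)
# Example: torsions at 170 and -170 degrees interpolate through 180, not 0.
theta_half = torsion_interpolant(np.deg2rad(170.0), np.deg2rad(-170.0), 0.5)
```

<p>The wrapped interpolant matters for torsions because a naive linear blend of 170° and −170° would pass through 0°, the long way around the circle.</p>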
<h3 id="models">Models</h3>
<p><strong>Architecture</strong>: <strong>DynamicFlow</strong> is a multiscale model with 15.9M parameters.</p>
<ol>
<li><strong>Atom-Level SE(3)-Equivariant GNN</strong>:
<ul>
<li><em>Input</em>: Complex graph (k-NN) and Ligand graph (fully connected).</li>
<li><em>Layers</em>: 6 EGNN blocks modified to maintain node and edge hidden states.</li>
<li><em>Function</em>: Updates ligand positions and predicts ligand atom/bond types.</li>
</ul>
</li>
<li><strong>Residue-Level Transformer</strong>:
<ul>
<li><em>Input</em>: Aggregated atom features from the GNN + Residue frames/torsions.</li>
<li><em>Layers</em>: 4 Transformer blocks with <strong>Invariant Point Attention (IPA)</strong>.</li>
<li><em>Function</em>: Updates protein residue frames (translation/rotation) and predicts side-chain torsions.</li>
</ul>
</li>
</ol>
<h3 id="evaluation">Evaluation</h3>
<p><strong>Metrics</strong>:</p>
<ul>
<li><strong>Vina Score</strong>: <code>vina_minimize</code> mode used for binding affinity.</li>
<li><strong>RMSD</strong>: Minimum RMSD between generated pocket and ground-truth holo conformations.</li>
<li><strong>Cover Ratio</strong>: % of ground-truth holo conformations covered by at least one generated sample (threshold $1.42\text{\AA}$).</li>
<li><strong>POVME 3</strong>: For pocket volume calculation.</li>
</ul>
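<p>A minimal sketch of how the RMSD and Cover Ratio metrics could be computed for one complex. Several details are assumptions made for illustration: the function names, the use of plain coordinate RMSD on structures that are already superposed, and the reading of &ldquo;minimum RMSD&rdquo; as the distance from each generated pocket to its closest ground-truth conformation. The paper&rsquo;s exact aggregation across complexes is not reproduced here.</p>

```python
import numpy as np

def pairwise_rmsd(generated, reference):
    """RMSD between every generated and every reference pocket.

    generated: (g, n_atoms, 3), reference: (r, n_atoms, 3); structures are
    assumed to be superposed already, so no alignment is performed here.
    """
    diffs = generated[:, None, :, :] - reference[None, :, :, :]
    return np.sqrt((diffs**2).sum(axis=-1).mean(axis=-1))

def min_rmsd(rmsd):
    """For each generated sample, RMSD to its closest reference conformation."""
    return rmsd.min(axis=1)

def cover_ratio(rmsd, threshold=1.42):
    """Fraction of reference conformations matched by some generated sample."""
    covered = np.less_equal(rmsd.min(axis=0), threshold)
    return covered.mean()

# Toy example (hypothetical coordinates): two references, one generated sample.
reference = np.zeros((2, 2, 3))
reference[1, :, 0] = 5.0                 # second holo state shifted 5 A in x
generated = np.zeros((1, 2, 3))
generated[0, :, 0] = 1.0                 # sample shifted 1 A in x

rmsd = pairwise_rmsd(generated, reference)
closest = min_rmsd(rmsd)
coverage = cover_ratio(rmsd)
```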
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Inference Benchmark</strong>: 1x Tesla V100-SXM2-32GB.</li>
<li><strong>Speed</strong>: Generates 10 ligands in ~35-36 seconds (100 NFE), significantly faster than the autoregressive Pocket2Mol (980s) or the diffusion-based TargetDiff (156s).</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Zhou, X., Xiao, Y., Lin, H., He, X., Guan, J., Wang, Y., Liu, Q., Zhou, F., Wang, L., &amp; Ma, J. (2025). Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows. <em>International Conference on Learning Representations (ICLR)</em>. <a href="https://arxiv.org/abs/2503.03989">https://arxiv.org/abs/2503.03989</a></p>
<p><strong>Publication</strong>: ICLR 2025</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{zhouIntegratingProteinDynamics2025,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Zhou, Xiangxin and Xiao, Yi and Lin, Haowei and He, Xinheng and Guan, Jiaqi and Wang, Yang and Liu, Qiang and Zhou, Feng and Wang, Liang and Ma, Jianzhu}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{International Conference on Learning Representations}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{2025}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span> = <span style="color:#e6db74">{https://arxiv.org/abs/2503.03989}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://arxiv.org/abs/2503.03989">arXiv Page</a></li>
<li>Code: no public repository available at time of writing</li>
</ul>
]]></content:encoded></item><item><title>How to Fold Graciously: Levinthal's Paradox (1969)</title><link>https://hunterheidenreich.com/notes/computational-biology/fold-graciously/</link><pubDate>Mon, 08 Sep 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/computational-biology/fold-graciously/</guid><description>A perspective paper defining the Grand Challenge of protein folding: distinguishing kinetic pathways from thermodynamic endpoints.</description><content:encoded><![CDATA[<h2 id="what-kind-of-paper-is-this">What kind of paper is this?</h2>
<p>This is technically a transcription of a conference talk, not a paper Levinthal wrote himself. The proceedings page credits &ldquo;Notes by: A. Rawitch, Retranscribed: B. Krantz&rdquo;, meaning what we have is a third-party record of an oral presentation Levinthal gave at the 1969 Mössbauer Spectroscopy in Biological Systems meeting at Allerton House, Illinois. This explains the informal, conversational register and the attached Q&amp;A discussion.</p>
<p>In terms of contribution type, it functions as a <strong>Position</strong> paper (with Theory and Discovery elements):</p>
<ul>
<li><strong>Position</strong>: Defines a &ldquo;Grand Challenge&rdquo; and argues for a conceptual shift in how we view biomolecular assembly</li>
<li><strong>Theory</strong>: Uses formal combinatorial arguments to establish the bounds of the search space ($10^{300}$ configurations)</li>
<li><strong>Discovery</strong>: Uses experimental data on alkaline phosphatase to validate the kinetic hypothesis</li>
</ul>
<h2 id="what-is-the-motivation">What is the motivation?</h2>
<p><strong>The Central Question</strong>: How does a protein choose one unique structure out of a hyper-astronomical number of possibilities in a biological timeframe (seconds)?</p>
<p>Levinthal provides a &ldquo;back-of-the-envelope&rdquo; derivation to define the problem scope:</p>
<ol>
<li><strong>Degrees of Freedom:</strong> A generic, unrestricted protein with 2,000 atoms would possess ~6,000 degrees of freedom. However, physical constraints (specifically the planar peptide bond) reduce this significantly. For a 150-amino acid protein, these constraints lower the complexity to ~450 degrees of freedom (300 rotations, 150 bond angles).</li>
<li><strong>The Combinatorial Explosion:</strong> Even with conservative estimates, this results in $10^{300}$ possible conformations.</li>
<li><strong>The Time Constraint:</strong> Since proteins fold in seconds, Levinthal argues they can sample at most <strong>$10^8$ conformations</strong> (&ldquo;postulating a minimum time from one conformation to another&rdquo;) before stabilizing. Against $10^{300}$ possibilities, such a search covers only about $10^{-292}$ of the space, which rules out random search.</li>
</ol>
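<p>The arithmetic behind the time constraint is easy to make explicit. The sketch below uses the two figures quoted above; the sampling rate of $10^8$ conformations per second and the age of the universe (about $4.3 \times 10^{17}$ seconds) are assumptions added here purely for scale and are not taken from the talk.</p>

```python
from fractions import Fraction

total_conformations = 10**300      # Levinthal's estimate of the search space
sampled_conformations = 10**8      # upper bound on conformations visited

# Fraction of the space a random search could cover before folding completes.
fraction_covered = Fraction(sampled_conformations, total_conformations)

# Assumption for scale only: the 10^8 samples take roughly one second.
samples_per_second = 10**8
exhaustive_seconds = total_conformations // samples_per_second

# Assumption for scale only: age of the universe in seconds (about 13.8 Gyr).
universe_age_seconds = 43 * 10**16
universe_lifetimes = exhaustive_seconds // universe_age_seconds
```

<p>Under these assumptions an exhaustive search would take $10^{292}$ seconds, which is why the argument does not depend on the precise sampling rate: changing it by many orders of magnitude leaves the conclusion intact.</p>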
<blockquote>
<p><strong>The Insight:</strong> The existence of folded proteins proves the <strong>impossibility of random global search</strong>. The system <em>must</em> be guided.</p>
</blockquote>
<h2 id="what-is-the-novelty-here">What is the novelty here?</h2>
<p><strong>Core Contribution</strong>: Levinthal reframes folding from a thermodynamic problem (seeking the absolute global minimum) to a <strong>Kinetic Control</strong> problem. He argues the native state is a &ldquo;metastable&rdquo; energy well found quickly by a specific pathway, which can differ from the system&rsquo;s lowest possible energy state.</p>
<h3 id="the-pathway-dependence-hypothesis">The Pathway Dependence Hypothesis</h3>
<p>The key insights of kinetic control:</p>
<ul>
<li><strong>Nucleation:</strong> The process is &ldquo;speeded and guided by the rapid formation of local interactions&rdquo;</li>
<li><strong>Pathway Constraints:</strong> Local amino acid sequences form stable interactions and serve as nucleation points in the folding process, restricting the conformational search space</li>
<li><strong>The &ldquo;Metastable&rdquo; State:</strong> The final structure represents a &ldquo;metastable state&rdquo; in a sufficiently deep energy well that is <em>kinetically accessible</em> via the folding pathway, independent of the global energy minimum. Think of a ball that rolls into a valley on the side of a hill and stays there: it is not in the lowest valley on the entire landscape, but it is stable enough that it never escapes.</li>
</ul>
<figure class="post-figure center ">
    <img src="/img/notes/folding-funnel.webp"
         alt="The protein folding energy landscape funnel, showing many unfolded states at high energy converging through multiple pathways to the native folded state at the bottom of the funnel"
         title="The protein folding energy landscape funnel, showing many unfolded states at high energy converging through multiple pathways to the native folded state at the bottom of the funnel"
         
         
         loading="lazy"
         class="post-image">
    
    <figcaption class="post-caption">The Energy Landscape Funnel: The modern resolution to Levinthal&rsquo;s Paradox. While Levinthal envisioned a single guided pathway, the &lsquo;funnel&rsquo; model (Wolynes, Dill) shows that many different pathways can lead to the same native state basin. The roughness of the funnel surface represents local energy minima (kinetic traps) that can slow folding.</figcaption>
    
</figure>

<h2 id="what-experiments-were-performed">What experiments were performed?</h2>
<p>To support the pathway hypothesis, Levinthal cites work on <strong>Alkaline Phosphatase</strong> (MW ~40,000), utilizing its property as a dimer of two identical subunits:</p>
<ul>
<li><strong>Renaturation Window:</strong> The wild-type enzyme refolds optimally at 37°C. However, mutants were isolated that only produce active enzyme (and renature) at temperatures <em>below</em> 37°C.</li>
<li><strong>Stability vs. Formation:</strong> Crucially, once folded, both the wild-type and mutant enzymes are stable up to 90°C.</li>
<li><strong>The Rate-Limiting Step:</strong> Levinthal notes that the rate-limiting step for activity is the <strong>formation of the dimer</strong> from monomers. He takes this as evidence that the <em>order of assembly</em> (kinetic pathway) dictates the final structure, distinct from the final structure&rsquo;s thermodynamic stability.</li>
</ul>
<p>The talk concluded with a short motion picture Levinthal showed live, illustrating polypeptide synthesis and &ldquo;the process of then forming a desired interaction via the most favored energy path as displayed on the computer controlled oscilloscope.&rdquo;</p>
<p>The Q&amp;A discussion following the talk includes one exchange directly relevant to the folding argument: when asked whether a protein is ever truly unfolded (devoid of all secondary and tertiary structure), Levinthal answered that both physical measurements and synthetic polypeptide work suggest yes. The other exchanges concerned the tangent formula for x-ray crystallographic phase refinement and whether computed structures had been tested for thermal perturbations.</p>
<h2 id="what-outcomesconclusions">What outcomes/conclusions?</h2>
<h3 id="key-finding">Key Finding</h3>
<p>The mutant experiments serve as the &ldquo;smoking gun&rdquo;: a protein seeking a global thermodynamic minimum would fold spontaneously at any temperature where the final state is stable (up to 90°C). The fact that mutants require specific lower temperatures for <em>formation</em> (while remaining stable at high temperatures once formed) indicates that the <strong>kinetic pathway</strong>, and not the thermodynamic endpoint alone, determines the outcome.</p>
<h3 id="broader-implications">Broader Implications</h3>
<p>Levinthal explicitly asks: &ldquo;Is a unique folding necessary for any random 150-amino acid sequence?&rdquo; and answers &ldquo;Probably not.&rdquo; He supports this by noting the difficulty many researchers face in attempting to crystallize proteins, suggesting that not all sequences produce stably folded structures.</p>
<p>He concludes by connecting these computational models to <strong>Mössbauer spectroscopy</strong>, suggesting that these computational studies may help in understanding how small perturbations of polypeptide structures affect the Mössbauer nucleus (a reminder of the specific conference context where this perspective was delivered).</p>
<h3 id="connection-to-modern-work">Connection to Modern Work</h3>
<p>Levinthal&rsquo;s arguments remain relevant context for modern computational protein folding:</p>
<ul>
<li><strong>Early computational visualization:</strong> Levinthal used computer-controlled oscilloscopes and vector matrix multiplications to build and display 3D polypeptide structures, and showed a motion picture of forming a desired interaction via the most favored energy path. This was an early instance of computational molecular visualization.</li>
<li><strong>Local interactions and folding pathways:</strong> The hypothesis that &ldquo;local interactions&rdquo; serve as nucleation points that guide folding still resonates with modern structure prediction methods (e.g., AlphaFold), which model residue-residue interactions to predict the final structure rather than the folding pathway itself.</li>
<li><strong>The paradox&rsquo;s lasting influence:</strong> The impossibility of random conformational search that Levinthal articulated continues to motivate approaches that exploit the structure of the energy landscape rather than exhaustive enumeration.</li>
<li><strong>Sequence-structure relationship:</strong> Levinthal&rsquo;s suggestion that not every random amino acid sequence would fold uniquely foreshadows the modern challenge of inverse folding (protein design), where the goal is to find sequences within the subset that does fold to a target structure.</li>
</ul>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Levinthal, C. (1969). How to Fold Graciously. In <em>Mössbauer Spectroscopy in Biological Systems: Proceedings of a meeting held at Allerton House, Monticello, Illinois</em> (pp. 22-24). University of Illinois Press.</p>
<p><strong>Publication</strong>: Mössbauer Spectroscopy in Biological Systems Proceedings, 1969</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{levinthal1969fold,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>=<span style="color:#e6db74">{How to fold graciously}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span>=<span style="color:#e6db74">{Levinthal, Cyrus}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span>=<span style="color:#e6db74">{M{\&#34;o}ssbauer spectroscopy in biological systems}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span>=<span style="color:#e6db74">{22--24}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span>=<span style="color:#e6db74">{1969}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span>=<span style="color:#e6db74">{University of Illinois Press}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span>=<span style="color:#e6db74">{https://faculty.cc.gatech.edu/~turk/bio_sim/articles/proteins_levinthal_1969.pdf}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><strong>Additional Resources</strong>:</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Levinthal%27s_paradox">Levinthal&rsquo;s Paradox (Wikipedia)</a></li>
</ul>
]]></content:encoded></item></channel></rss>