<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Online Recognition on Hunter Heidenreich | ML Research Scientist</title><link>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/</link><description>Recent content in Online Recognition on Hunter Heidenreich | ML Research Scientist</description><image><title>Hunter Heidenreich | ML Research Scientist</title><url>https://hunterheidenreich.com/img/avatar.webp</url><link>https://hunterheidenreich.com/img/avatar.webp</link></image><generator>Hugo -- 0.147.7</generator><language>en-US</language><copyright>2026 Hunter Heidenreich</copyright><lastBuildDate>Mon, 06 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/index.xml" rel="self" type="application/rss+xml"/><item><title>Unified Framework for Handwritten Chemical Expressions</title><link>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/chang-unified-framework-2009/</link><pubDate>Wed, 17 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/chang-unified-framework-2009/</guid><description>A 2009 unified framework for inorganic/organic chemical handwriting recognition using graph search and statistical symbol grouping.</description><content:encoded><![CDATA[<h2 id="addressing-the-complexity-of-handwritten-organic-chemistry">Addressing the Complexity of Handwritten Organic Chemistry</h2>
<p>This is a <strong>Methodological Paper</strong> ($\Psi_{\text{Method}}$) from Microsoft Research Asia that addresses the challenge of recognizing complex 2D organic chemistry structures. By 2009, math expression recognition had seen significant commercial progress, but chemical expression recognition remained less developed.</p>
<p>The specific gap addressed is the geometric complexity of organic formulas. While inorganic formulas typically follow a linear, equation-like structure, organic formulas present complex 2D diagrammatic structures with various bond types and rings. Existing work often relied on strong assumptions (like single-stroke symbols) or failed to handle arbitrary compounds. There was a clear need for a unified solution capable of handling both inorganic and organic domains consistently.</p>
<h2 id="the-chemical-expression-structure-graph-cesg">The Chemical Expression Structure Graph (CESG)</h2>
<p>The core innovation is a unified statistical framework that processes inorganic and organic expressions within the same pipeline. Key technical novelties include:</p>
<ol>
<li><strong>Unified Bond Modeling</strong>: Bonds are treated as special symbols. The framework detects &ldquo;extended bond symbols&rdquo; (multi-stroke bonds) and splits them into single, double, or triple bonds using corner detection for consistent processing.</li>
<li><strong>Chemical Expression Structure Graph (CESG)</strong>: A defined graph representation for generic chemical expressions where nodes represent symbols and edges represent bonds or spatial relations.</li>
<li><strong>Non-Symbol Modeling</strong>: During the symbol grouping phase, the system explicitly models &ldquo;invalid groups&rdquo; to reduce over-grouping errors.</li>
<li><strong>Global Graph Search</strong>: Structure analysis is formulated as finding the optimal CESG by searching over a Weighted Direction Graph ($G_{WD}$).</li>
</ol>
<h2 id="graph-search-and-statistical-validation">Graph Search and Statistical Validation</h2>
<p>The authors validated the framework on a proprietary database of 35,932 handwritten chemical expressions collected from 300 writers.</p>
<ul>
<li><strong>Setup</strong>: The data was split into roughly 26,000 training and 6,400 testing samples.</li>
<li><strong>Metric</strong>: Recognition accuracy was measured strictly by expression (all symbols and the complete structure must be correct).</li>
<li><strong>Ablations</strong>: The team evaluated the performance contribution of symbol grouping, structure analysis, and full semantic verification.</li>
</ul>
<h2 id="recognition-accuracy-and-outcomes">Recognition Accuracy and Outcomes</h2>
<p>The full framework achieved a Top-1 accuracy of 75.4% and a Top-5 accuracy of 83.1%.</p>
<ul>
<li><strong>Component Contribution</strong>: Structure analysis is the primary bottleneck. With perfect grouping assumed, Top-1 accuracy would reach 85.9%; structural errors introduced during structure analysis pull it down to 74.1%.</li>
<li><strong>Semantic Verification</strong>: Checking valence and grammar raised Top-1 accuracy from 74.1% to 75.4%, a relative improvement of about 1.7%.</li>
</ul>
<p>The unified framework effectively handles the variance in 2D space for chemical expressions, demonstrating that delayed decision-making (keeping top-N candidates) improves robustness.</p>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Artifact</th>
          <th style="text-align: left">Type</th>
          <th style="text-align: left">License</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left">N/A</td>
          <td style="text-align: left">N/A</td>
          <td style="text-align: left">N/A</td>
          <td style="text-align: left">No public artifacts (code, data, models) were released by the authors.</td>
      </tr>
  </tbody>
</table>
<h3 id="data">Data</h3>
<p>The study used a private Microsoft Research Asia dataset, making direct reproduction difficult.</p>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Total</td>
          <td>Proprietary MSRA DB</td>
          <td>35,932 expressions</td>
          <td>Written by 300 people</td>
      </tr>
      <tr>
          <td>Training</td>
          <td>Subset</td>
          <td>25,934 expressions</td>
          <td></td>
      </tr>
      <tr>
          <td>Testing</td>
          <td>Subset</td>
          <td>6,398 expressions</td>
          <td></td>
      </tr>
  </tbody>
</table>
<ul>
<li><strong>Content</strong>: 2,000 unique expressions from high school/college textbooks.</li>
<li><strong>Composition</strong>: ~25% of samples are organic expressions.</li>
<li><strong>Vocabulary</strong>: 163 symbol classes (elements, digits, <code>+</code>, <code>↑</code>, <code>%</code>, bonds, etc.).</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<p><strong>1. Symbol Grouping (Dynamic Programming)</strong></p>
<ul>
<li>Objective: Find the optimal symbol sequence $G_{max}$ maximizing the posterior probability given the ink strokes:
$$ G_{max} = \arg\max_{G} P(G | \text{Ink}) $$</li>
<li><strong>Non-symbol modeling</strong>: Iteratively trained models on &ldquo;incorrect grouping results&rdquo; to learn to reject invalid strokes.</li>
<li><strong>Inter-group modeling</strong>: Uses Gaussian Mixture Models (GMM) to model spatial relations ($R_j$) between groups.</li>
</ul>
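<p>The grouping objective admits a simple dynamic-programming sketch. The snippet below is illustrative only: it assumes time-ordered strokes and a hypothetical <code>score</code> callback standing in for the paper&rsquo;s statistical symbol and non-symbol models.</p>

```python
import math

def best_grouping(strokes, score, max_group=4):
    """Find the stroke partition maximizing the product of per-group
    symbol scores -- a simplified stand-in for argmax_G P(G | Ink).

    strokes   -- time-ordered list of strokes
    score(g)  -- hypothetical classifier score in (0, 1] for a candidate
                 group of consecutive strokes
    max_group -- maximum strokes per symbol considered
    """
    n = len(strokes)
    # best[i] = (best log-score over partitions of strokes[:i], backpointer)
    best = [(-math.inf, -1) for _ in range(n + 1)]
    best[0] = (0.0, -1)
    for i in range(1, n + 1):
        for j in range(max(0, i - max_group), i):
            cand = best[j][0] + math.log(score(strokes[j:i]))
            if cand > best[i][0]:
                best[i] = (cand, j)
    groups, i = [], n            # recover the optimal partition
    while i > 0:
        j = best[i][1]
        groups.append(strokes[j:i])
        i = j
    return groups[::-1]
```

<p>A real implementation would also carry the top-N partitions forward rather than only the maximum, matching the paper&rsquo;s delayed-decision strategy.</p>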
<p><strong>2. Bond Processing</strong></p>
<ul>
<li><strong>Extended Bond Symbol</strong>: Recognizes connected strokes (e.g., a messy double bond written in one stroke) as a single &ldquo;extended&rdquo; symbol.</li>
<li><strong>Splitting</strong>: Uses <strong>Curvature Scale Space (CSS)</strong> corner detection to split extended symbols into primitive lines.</li>
<li><strong>Classification</strong>: A Neural Network verifies if the split lines form valid single, double, or triple bonds.</li>
</ul>
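<p>As a rough illustration of the splitting step, the sketch below detects corners with a simple turning-angle test rather than the paper&rsquo;s Curvature Scale Space detector; the threshold is an arbitrary choice, not the authors&rsquo; value.</p>

```python
import math

def split_at_corners(stroke, angle_thresh=math.radians(60)):
    """Split a polyline stroke at high-curvature corner points -- a
    simplified stand-in for CSS corner detection. Returns the list of
    primitive line segments between corners."""
    def heading(a, b):
        return math.atan2(b[1] - a[1], b[0] - a[0])
    corners = []
    for i in range(1, len(stroke) - 1):
        turn = heading(stroke[i], stroke[i + 1]) - heading(stroke[i - 1], stroke[i])
        turn = (turn + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
        if abs(turn) >= angle_thresh:
            corners.append(i)
    pieces, start = [], 0
    for c in corners:
        pieces.append(stroke[start:c + 1])
        start = c
    pieces.append(stroke[start:])
    return pieces
```

<p>In the paper, a neural network would then verify whether the resulting primitive lines form a valid single, double, or triple bond.</p>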
<p><strong>3. Structure Analysis (Graph Search)</strong></p>
<ul>
<li><strong>Graph Construction</strong>: Builds a Weighted Direction Graph ($G_{WD}$) where nodes are symbol candidates and edges are potential relationships ($E_{c}, E_{nc}, E_{peer}, E_{sub}$).</li>
<li><strong>Edge Weights</strong>: Calculated as the product of observation, spatial, and contextual probabilities:
$$ W(S, O, R) = P(O|S) \times P(\text{Spatial}|R) \times P(\text{Context}|S, R) $$
<ul>
<li>Spatial probability uses rectangular control regions and distance functions.</li>
<li>Contextual probability uses statistical co-occurrence (e.g., &lsquo;C&rsquo; often appears with &lsquo;H&rsquo;).</li>
</ul>
</li>
<li><strong>Search</strong>: Breadth-first search with pruning to find the top-N optimal CESGs.</li>
</ul>
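<p>A toy version of the search can be sketched as a level-wise expansion with beam pruning, where each edge weight is the product of the three probabilities above. The graph shape and probability values here are hypothetical, and the sketch assumes an acyclic candidate graph.</p>

```python
def top_n_paths(graph, start, goal, n=5, beam=10):
    """Breadth-first search with beam pruning over a toy weighted
    direction graph. graph[u] lists (v, weight) edges, where weight
    plays the role of P(O|S) * P(Spatial|R) * P(Context|S, R).
    Returns the n highest-scoring start-to-goal paths (acyclic graphs
    only -- cycles would loop forever)."""
    frontier = [(start, 1.0, [start])]     # (node, score, path)
    complete = []
    while frontier:
        nxt = []
        for node, score, path in frontier:
            if node == goal:
                complete.append((score, path))
                continue
            for v, w in graph.get(node, []):
                nxt.append((v, score * w, path + [v]))
        # Prune: keep only the `beam` best partial paths per level.
        nxt.sort(key=lambda t: -t[1])
        frontier = nxt[:beam]
    complete.sort(key=lambda t: -t[0])
    return complete[:n]
```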
<h3 id="models">Models</h3>
<ul>
<li><strong>Symbol Recognition</strong>: Implementation details are not specified; given the era, an HMM- or neural-network-based classifier is likely. Bond verification explicitly uses a <strong>Neural Network</strong>.</li>
<li><strong>Spatial Models</strong>: <strong>Gaussian Mixture Models (GMM)</strong> are used to model the 9 spatial relations (e.g., Left-super, Above, Subscript).</li>
<li><strong>Semantic Model</strong>: A <strong>Context-Free Grammar (CFG)</strong> parser is used for final verification (e.g., ensuring digits aren&rsquo;t reactants).</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<p>Evaluation is performed using &ldquo;Expression-level accuracy&rdquo;.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Value (Top-1)</th>
          <th>Value (Top-5)</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Full Framework</td>
          <td>75.4%</td>
          <td>83.1%</td>
          <td></td>
      </tr>
      <tr>
          <td>Without Semantics</td>
          <td>74.1%</td>
          <td>83.0%</td>
          <td></td>
      </tr>
      <tr>
          <td>Grouping Only</td>
          <td>85.9%</td>
          <td>95.6%</td>
          <td>Theoretical max if structure analysis was perfect</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Chang, M., Han, S., &amp; Zhang, D. (2009). A Unified Framework for Recognizing Handwritten Chemical Expressions. <em>2009 10th International Conference on Document Analysis and Recognition</em>, 1345&ndash;1349. <a href="https://doi.org/10.1109/ICDAR.2009.64">https://doi.org/10.1109/ICDAR.2009.64</a></p>
<p><strong>Publication</strong>: ICDAR 2009</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{changUnifiedFrameworkRecognizing2009,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{A {{Unified Framework}} for {{Recognizing Handwritten Chemical Expressions}}}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{2009 10th {{International Conference}} on {{Document Analysis}} and {{Recognition}}}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Chang, Ming and Han, Shi and Zhang, Dongmei}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">2009</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{3}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{1345--1349}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{IEEE}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">address</span> = <span style="color:#e6db74">{Barcelona, Spain}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1109/ICDAR.2009.64}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>SVM-HMM Online Classifier for Chemical Symbols</title><link>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/zhang-svm-hmm-2010/</link><pubDate>Wed, 17 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/zhang-svm-hmm-2010/</guid><description>A dual-stage classifier combining SVM and HMM to recognize online handwritten chemical symbols, introducing a reordering algorithm for organic rings.</description><content:encoded><![CDATA[<h2 id="contribution-double-stage-classification-method">Contribution: Double-Stage Classification Method</h2>
<p>This is a <strong>Method</strong> paper. It proposes a novel &ldquo;double-stage classifier&rdquo; architecture, fitting the taxonomy by introducing a specific algorithmic pipeline (SVM rough classification followed by HMM fine classification) and a novel pre-processing algorithm (Point Sequence Reordering) to overcome technical limitations in recognizing organic ring structures. The contribution is validated through ablation studies (comparing SVM kernels and HMM state/Gaussian counts) and performance benchmarks.</p>
<h2 id="motivation-recognizing-complex-organic-ring-structures">Motivation: Recognizing Complex Organic Ring Structures</h2>
<p>The primary motivation is the complexity of recognizing handwritten chemical symbols, specifically the distinction between <strong>Organic Ring Structures (ORS)</strong> and <strong>Non-Ring Structures (NRS)</strong>. Existing single-stage classifiers are unreliable for ORS because these symbols have arbitrary writing styles, variable stroke numbers, and inconsistent stroke orders due to their 2D hexagonal structure. A robust system is needed to handle this uncertainty and achieve high accuracy.</p>
<h2 id="core-innovation-point-sequence-reordering-psr">Core Innovation: Point Sequence Reordering (PSR)</h2>
<p>The authors introduce two main novelties:</p>
<ol>
<li><strong>Double-Stage Architecture</strong>: A hybrid system where an SVM (using RBF kernel) first roughly classifies inputs as either ORS or NRS, followed by specialized HMMs for fine-grained recognition.</li>
<li><strong>Point Sequence Reordering (PSR) Algorithm</strong>: A stroke-order independent algorithm designed specifically for ORS. It reorders the point sequence of a symbol based on a counter-clockwise scan from the centroid, effectively eliminating the uncertainty caused by variations in stroke number and writing order.</li>
</ol>
<h2 id="methodology--experimental-design">Methodology &amp; Experimental Design</h2>
<p>The authors collected a custom dataset and performed sequential optimizations:</p>
<ul>
<li><strong>SVM Optimization</strong>: Compared Polynomial, RBF, and Sigmoid kernels to find the best rough classifier.</li>
<li><strong>HMM Optimization</strong>: Tested multiple combinations of states (4, 6, 8) and Gaussians (3, 4, 6, 8, 9, 12) to maximize fine classification accuracy.</li>
<li><strong>PSR Validation</strong>: Conducted an ablation study comparing HMM accuracy on ORS symbols &ldquo;Before PSR&rdquo; vs &ldquo;After PSR&rdquo; to quantify the algorithm&rsquo;s impact.</li>
</ul>
<h2 id="results--final-conclusions">Results &amp; Final Conclusions</h2>
<ul>
<li><strong>Architecture Performance</strong>: The RBF-based SVM achieved 99.88% accuracy in differentiating ORS from NRS.</li>
<li><strong>HMM Configuration</strong>: The optimal HMM topology was found to be 8-states and 12-Gaussians for both symbol types.</li>
<li><strong>PSR Impact</strong>: The PSR algorithm dramatically improved ORS recognition, lifting Top-1 accuracy from <strong>49.84% (before PSR)</strong> to <strong>98.36% (after PSR)</strong>.</li>
<li><strong>Overall Accuracy</strong>: The final integrated system achieved a Top-1 accuracy of <strong>93.10%</strong> and Top-3 accuracy of <strong>98.08%</strong> on the test set.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The study defined 101 chemical symbols split into two categories.</p>
<table>
  <thead>
      <tr>
          <th>Category</th>
          <th>Count</th>
          <th>Content</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>NRS</strong> (Non-Ring)</td>
          <td>63</td>
          <td>Digits 0-9, 44 letters, 9 operators</td>
          <td>Operators include +, -, =, $\rightarrow$, etc.</td>
      </tr>
      <tr>
          <td><strong>ORS</strong> (Organic Ring)</td>
          <td>38</td>
          <td>2D hexagonal structures</td>
          <td>Benzene rings, cyclohexane, etc.</td>
      </tr>
  </tbody>
</table>
<ul>
<li><strong>Collection</strong>: 12,322 total samples (122 per symbol) collected from 20 writers (teachers and students).</li>
<li><strong>Split</strong>: 9,090 training samples and 3,232 test samples.</li>
<li><strong>Constraints</strong>: Samples were collected under three writing specifications: normal, standard, and freestyle.</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<p><strong>1. SVM Feature Extraction (Rough Classification)</strong>
The input strokes are scaled, and a 58-dimensional feature vector is calculated:</p>
<ul>
<li><strong>Mesh ($4 \times 4$)</strong>: Ratio of points in 16 grids (16 features).</li>
<li><strong>Outline</strong>: Normalized scan distance from 4 edges with 5 scan lines each (20 features).</li>
<li><strong>Projection</strong>: Point density in 5 bins per edge (20 features).</li>
<li><strong>Aspect Ratio</strong>: Height/Width ratios (2 features).</li>
</ul>
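<p>The feature counts above (16 + 20 + 20 + 2 = 58) can be sketched as follows. Only the layout of the vector comes from the paper; each feature&rsquo;s exact definition below is a simplified stand-in, not the authors&rsquo; formulation.</p>

```python
import numpy as np

def _band(v, bins):
    """Map normalized values in [0, 1] to integer band indices."""
    return np.minimum((v * bins).astype(int), bins - 1)

def symbol_features(points, grid=4, bins=5):
    """58-D feature vector with the paper's layout (16 mesh + 20 outline
    + 20 projection + 2 aspect ratio). `points` is an (N, 2) array of
    ink coordinates; the individual features are illustrative only."""
    pts = np.asarray(points, dtype=float)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    w, h = np.maximum(hi - lo, 1e-9)
    norm = (pts - lo) / (w, h)                  # scale into [0, 1]^2
    feats = []

    # Mesh (16): share of points in each cell of a 4x4 grid.
    cell = _band(norm, grid)
    mesh = np.zeros((grid, grid))
    np.add.at(mesh, (cell[:, 1], cell[:, 0]), 1)
    feats.extend(mesh.ravel() / len(pts))

    # Outline (20): per band, gap between each box edge and the ink.
    for axis in (1, 0):
        band = _band(norm[:, axis], bins)
        other = norm[:, 1 - axis]
        for b in range(bins):
            vals = other[band == b]
            if len(vals):
                feats += [vals.min(), 1.0 - vals.max()]
            else:
                feats += [1.0, 1.0]             # empty band: full gap

    # Projection (20): 5-bin density histograms in the four outer strips.
    strips = [(norm[:, 1] < 0.25, 0), (norm[:, 1] > 0.75, 0),
              (norm[:, 0] < 0.25, 1), (norm[:, 0] > 0.75, 1)]
    for mask, axis in strips:
        hist, _ = np.histogram(norm[mask, axis], bins=bins, range=(0, 1))
        feats.extend(hist / max(mask.sum(), 1))

    # Aspect ratio (2): height/width and width/height.
    feats += [h / w, w / h]
    return np.array(feats)
```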
<p><strong>2. Point Sequence Reordering (PSR)</strong>
Used strictly for ORS preprocessing:</p>
<ol>
<li>Calculate the centroid $(x_c, y_c)$ of the symbol.</li>
<li>Initialize a scan line at angle $\theta = 0$.</li>
<li>Traverse points; if a point $p_i = (x_i, y_i)$ satisfies the distance threshold to the scan line, add it to the reordered list. Distance $d_i$ is calculated as:
$$ d_i = |(y_i - y_c)\cos(\theta) - (x_i - x_c)\sin(\theta)| $$</li>
<li>Increment $\theta$ by $\Delta\theta$ and repeat until a full circle ($2\pi$) is completed.</li>
</ol>
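<p>The steps above can be sketched directly. The distance threshold and angular step below are illustrative, and restricting matches to the forward half of the scan line (so that the full $2\pi$ sweep is meaningful) is an assumed implementation detail.</p>

```python
import math

def point_sequence_reorder(points, d_thresh=0.05, d_theta=math.radians(5)):
    """Re-order ink points by sweeping a scan line counter-clockwise
    about the symbol centroid, so the output order no longer depends on
    stroke count or writing order. Thresholds are illustrative."""
    xc = sum(x for x, _ in points) / len(points)
    yc = sum(y for _, y in points) / len(points)
    reordered, taken = [], set()
    theta = 0.0
    while theta < 2 * math.pi:
        for i, (x, y) in enumerate(points):
            dx, dy = x - xc, y - yc
            # d_i = |(y_i - y_c) cos(theta) - (x_i - x_c) sin(theta)|
            d = abs(dy * math.cos(theta) - dx * math.sin(theta))
            # Assumption: only points on the forward half of the line.
            forward = dx * math.cos(theta) + dy * math.sin(theta) >= 0.0
            if forward and d <= d_thresh and i not in taken:
                taken.add(i)
                reordered.append((x, y))
        theta += d_theta
    return reordered
```

<p>For example, four points of a ring written in scrambled order come out ordered counter-clockwise starting from the centroid&rsquo;s positive x-axis.</p>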
<h3 id="models">Models</h3>
<ul>
<li><strong>SVM (Stage 1)</strong>: RBF Kernel was selected as optimal with parameters $C=512$ and $\gamma=0.5$.</li>
<li><strong>HMM (Stage 2)</strong>: Left-right continuous HMM trained via Baum-Welch algorithm. The topology is one model per symbol using <strong>8 states and 12 Gaussians</strong>.</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<p>Metrics reported are Top-1, Top-2, and Top-3 accuracy on the held-out test set.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>NRS Accuracy</th>
          <th>ORS Accuracy</th>
          <th>Overall Test Accuracy</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Top-1</strong></td>
          <td>91.91%</td>
          <td>97.53%</td>
          <td>93.10%</td>
      </tr>
      <tr>
          <td><strong>Top-3</strong></td>
          <td>99.12%</td>
          <td>99.34%</td>
          <td>98.08%</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Device</strong>: HP Pavilion tx1000 Tablet PC.</li>
<li><strong>Processor</strong>: 2.00GHz CPU.</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Zhang, Y., Shi, G., &amp; Wang, K. (2010). A SVM-HMM Based Online Classifier for Handwritten Chemical Symbols. <em>2010 International Conference on Pattern Recognition</em>, 1888&ndash;1891. <a href="https://doi.org/10.1109/ICPR.2010.465">https://doi.org/10.1109/ICPR.2010.465</a></p>
<p><strong>Publication</strong>: ICPR 2010</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{zhang2010svm,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{A SVM-HMM Based Online Classifier for Handwritten Chemical Symbols}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{2010 International Conference on Pattern Recognition}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Zhang, Yang and Shi, Guangshun and Wang, Kai}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{2010}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{1888--1891}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{IEEE}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1109/ICPR.2010.465}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Recognition of On-line Handwritten Chemical Expressions</title><link>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/yang-online-handwritten-2008/</link><pubDate>Wed, 17 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/yang-online-handwritten-2008/</guid><description>A two-level recognition algorithm for on-line handwritten chemical expressions using structural and syntactic features.</description><content:encoded><![CDATA[<h2 id="contribution-on-line-chemical-expression-recognition-framework">Contribution: On-line Chemical Expression Recognition Framework</h2>
<p>This is a <strong>Method</strong> paper. It proposes a novel architectural pipeline (&ldquo;Algorithm Model&rdquo;) for recognizing on-line handwritten chemical expressions. The paper focuses on detailing the specific mechanisms of this pipeline (pre-processing, segmentation, two-level recognition, and HCI) and validates its effectiveness through quantitative comparison against a conventional baseline. The rhetorical structure aligns with the &ldquo;Methodological Basis&rdquo; of the taxonomy, prioritizing the &ldquo;how well does this work?&rdquo; question over theoretical derivation or dataset curation.</p>
<h2 id="motivation-the-hci-gap-in-chemical-drawing">Motivation: The HCI Gap in Chemical Drawing</h2>
<p>The authors identify a gap in existing human-computer interaction (HCI) for chemistry. While mathematical formula recognition had seen progress, chemical expression recognition was under-researched. Existing tools relied on keyboard/mouse input, which was time-consuming and inefficient for the complex, variable nature of chemical structures. Previous attempts were either too slow (vectorization-based) or failed to leverage specific chemical knowledge effectively. There was a practical need for a system that could handle the specific syntactic rules of chemistry in an on-line (real-time) handwriting setting.</p>
<h2 id="novelty-two-level-recognition-architecture">Novelty: Two-Level Recognition Architecture</h2>
<p>The core contribution is a <strong>two-level recognition algorithm</strong> that integrates chemical domain knowledge.</p>
<ul>
<li><strong>Level 1 (Substance Level):</strong> Treats connected strokes as a potential &ldquo;substance unit&rdquo; (e.g., &ldquo;H2O&rdquo;) and matches them against a dictionary using a modified edit distance algorithm.</li>
<li><strong>Level 2 (Character Level):</strong> If the substance match fails, it falls back to segmenting the unit into isolated characters and reconstructing them using syntactic rules.</li>
<li><strong>Hybrid Segmentation:</strong> Combines structural analysis (using bounding box geometry for super/subscript detection) with &ldquo;partial recognition&rdquo; (identifying special symbols like <code>+</code>, <code>=</code>, <code>-&gt;</code> early to split the expression).</li>
</ul>
<h2 id="methodology-custom-dataset-and-baseline-comparisons">Methodology: Custom Dataset and Baseline Comparisons</h2>
<p>The authors conducted a validation experiment in a laboratory environment with 20 participants (chemistry students and teachers).</p>
<ul>
<li><strong>Dataset:</strong> 1,197 total samples (983 from a standard set of 341 expressions, 214 arbitrary expressions written by users).</li>
<li><strong>Baselines:</strong> They compared their &ldquo;Two-Level&rdquo; algorithm against a &ldquo;Conventional&rdquo; algorithm that skips the substance-level check and directly recognizes characters (&ldquo;Recognize Character Directly&rdquo;).</li>
<li><strong>Conditions:</strong> They also tested the impact of their Human-Computer Interaction (HCI) module which allows user corrections.</li>
</ul>
<h2 id="results-high-accuracy-and-hci-corrections">Results: High Accuracy and HCI Corrections</h2>
<ul>
<li><strong>Accuracy:</strong> The proposed two-level algorithm achieved significantly higher accuracy (<strong>96.4%</strong> for expression recognition) compared to the conventional baseline (<strong>91.5%</strong>).</li>
<li><strong>Robustness:</strong> The method performed well even on &ldquo;arbitrary&rdquo; expressions not in the standard set (92.5% accuracy vs 88.2% baseline).</li>
<li><strong>HCI Impact:</strong> Allowing users to modify results via the HCI module raised final accuracy to <strong>98.8%</strong>.</li>
<li><strong>Conclusion:</strong> The authors concluded the algorithm is reliable for real applications and flexible enough to be extended to other domains like physics or engineering.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The paper does not use a public benchmark but collected its own data for validation.</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Purpose</th>
          <th style="text-align: left">Dataset</th>
          <th style="text-align: left">Size</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Validation</strong></td>
          <td style="text-align: left">Custom Lab Dataset</td>
          <td style="text-align: left">1,197 samples</td>
          <td style="text-align: left">Collected from 20 chemistry students/teachers using Tablet PCs. Includes 341 standard expressions + arbitrary user inputs.</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>The pipeline consists of four distinct phases with specific algorithmic choices:</p>
<p><strong>1. Pre-processing</strong></p>
<ul>
<li><strong>Smoothing:</strong> Uses a 5-tap Gaussian low-pass filter (Eq. 1) with specific coefficients to smooth stroke data.</li>
<li><strong>Redundancy:</strong> Merges redundant points and removes &ldquo;prickles&rdquo; (isolated noise).</li>
<li><strong>Re-ordering:</strong> Strokes are spatially re-sorted left-to-right, top-to-bottom to correct for arbitrary writing order.</li>
</ul>
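<p>A minimal sketch of the smoothing step is shown below. The paper cites specific filter coefficients (its Eq. 1) that are not reproduced here; the 5-tap binomial kernel <code>[1, 4, 6, 4, 1]/16</code> is a conventional stand-in, not the authors&rsquo; exact values.</p>

```python
def smooth_stroke(points, kernel=(1, 4, 6, 4, 1)):
    """5-tap low-pass smoothing of a stroke's (x, y) points.
    The binomial kernel is an assumed stand-in for the paper's
    Gaussian coefficients."""
    k = [c / sum(kernel) for c in kernel]
    half = len(kernel) // 2
    out = []
    for i in range(len(points)):
        x = y = 0.0
        for j, w in enumerate(k):
            # Clamp indices at the stroke ends (replicate padding).
            p = points[min(max(i + j - half, 0), len(points) - 1)]
            x += w * p[0]
            y += w * p[1]
        out.append((x, y))
    return out
```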
<p><strong>2. Segmentation</strong></p>
<ul>
<li><strong>Structural Analysis:</strong> Distinguishes relationships (Superscript vs. Subscript vs. Horizontal) using a geometric feature vector $(T, B)$ based on bounding box heights ($h$), vertical centers ($C$), and barycenters ($B_{bary}$):
$$
\begin{aligned}
d &amp;= 0.7 \cdot y_{12} - y_{22} + 0.3 \cdot y_{11} \\
T &amp;= 1000 \cdot \frac{d}{h_1} \\
B &amp;= 1000 \cdot \frac{B_{bary1} - B_{bary2}}{h_1}
\end{aligned}
$$</li>
<li><strong>Partial Recognition:</strong> Detects special symbols (<code>+</code>, <code>=</code>, <code>-&gt;</code>) early to break expressions into &ldquo;super-substance units&rdquo; (e.g., separating reactants from products).</li>
</ul>
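<p>The $(T, B)$ equations translate directly to code. Interpreting $y_{11}, y_{12}$ as the top and bottom edges of the first bounding box and $y_{22}$ as the bottom of the second (y growing downward) is inferred from context, not stated in the summary.</p>

```python
def layout_features(box1, box2, bary1, bary2):
    """Geometric (T, B) features for classifying superscript, subscript,
    and horizontal relations. box = (y_top, y_bottom) in image
    coordinates; bary = vertical barycenter of the ink. The y_{ij}
    interpretation is an assumption."""
    y11, y12 = box1                      # top and bottom of symbol 1
    _, y22 = box2                        # bottom of symbol 2
    h1 = y12 - y11
    d = 0.7 * y12 - y22 + 0.3 * y11      # weighted reference line vs y22
    T = 1000.0 * d / h1
    B = 1000.0 * (bary1 - bary2) / h1
    return T, B
```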
<p><strong>3. Recognition (Two-Level)</strong></p>
<ul>
<li><strong>Level 1 (Dictionary Match):</strong>
<ul>
<li>Uses a modified <strong>Edit Distance</strong> (Eq. 6) incorporating a specific distance matrix based on chemical syntax.</li>
<li>Similarity $\lambda_{ij}$ is weighted by stroke credibility $\mu_i$ and normalized by string length.</li>
</ul>
</li>
<li><strong>Level 2 (Character Segmentation):</strong>
<ul>
<li>Falls back to this if Level 1 fails.</li>
<li>Segments characters by analyzing pixel density in horizontal/vertical/diagonal directions to find concave/convex points.</li>
<li>Recombines characters using syntactic rules (e.g., valency checks) to verify validity.</li>
</ul>
</li>
</ul>
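<p>The Level-1 dictionary match can be sketched with a substitution-cost table in place of the paper&rsquo;s chemical distance matrix; its stroke-credibility weights $\mu_i$ are omitted here, and the cost function below is hypothetical.</p>

```python
def weighted_edit_distance(a, b, sub_cost):
    """Edit distance with a symbol-dependent substitution-cost table,
    in the spirit of the paper's modified edit distance.
    sub_cost(x, y) -> cost in [0, 1] of substituting x for y."""
    m, n = len(a), len(b)
    D = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = float(i)
    for j in range(1, n + 1):
        D[0][j] = float(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = min(
                D[i - 1][j] + 1.0,                          # delete a[i-1]
                D[i][j - 1] + 1.0,                          # insert b[j-1]
                D[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),
            )
    return D[m][n]

def dictionary_match(unit, dictionary, sub_cost):
    """Level 1: pick the dictionary entry with the smallest
    length-normalized distance to the recognized substance unit."""
    return min(
        dictionary,
        key=lambda entry: weighted_edit_distance(unit, entry, sub_cost)
        / max(len(unit), len(entry)),
    )
```

<p>With a low cost for visually confusable pairs such as <code>O</code>/<code>0</code>, a misrecognized &ldquo;H20&rdquo; still matches the dictionary entry &ldquo;H2O&rdquo;.</p>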
<h3 id="evaluation">Evaluation</h3>
<p>Evaluation focused on recognition accuracy at both the character and expression level.</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Metric</th>
          <th style="text-align: left">Value (Proposed)</th>
          <th style="text-align: left">Value (Baseline)</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Expression Accuracy (EA)</strong></td>
          <td style="text-align: left"><strong>96.4%</strong></td>
          <td style="text-align: left">91.5%</td>
          <td style="text-align: left">&ldquo;Standard&rdquo; dataset subset.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Expression Accuracy (EA)</strong></td>
          <td style="text-align: left"><strong>92.5%</strong></td>
          <td style="text-align: left">88.2%</td>
          <td style="text-align: left">&ldquo;Other&rdquo; (arbitrary) dataset subset.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>HCI-Assisted Accuracy</strong></td>
          <td style="text-align: left"><strong>98.8%</strong></td>
          <td style="text-align: left">N/A</td>
          <td style="text-align: left">Accuracy after user correction.</td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Input Devices:</strong> Tablet PCs were used for data collection and testing.</li>
<li><strong>Compute:</strong> Specific training hardware is not listed, but the algorithm is designed for real-time interaction on standard 2008-era computing devices.</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Yang, J., Shi, G., Wang, Q., &amp; Zhang, Y. (2008). Recognition of On-line Handwritten Chemical Expressions. <em>2008 IEEE International Joint Conference on Neural Networks</em>, 2360&ndash;2365. <a href="https://doi.org/10.1109/IJCNN.2008.4634125">https://doi.org/10.1109/IJCNN.2008.4634125</a></p>
<p><strong>Publication</strong>: IJCNN 2008</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{jufengyangRecognitionOnlineHandwritten2008,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Recognition of On-Line Handwritten Chemical Expressions}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{2008 {{IEEE International Joint Conference}} on {{Neural Networks}} ({{IEEE World Congress}} on {{Computational Intelligence}})}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{{Jufeng Yang} and {Guangshun Shi} and {Qingren Wang} and {Yong Zhang}}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">2008</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = jun,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{2360--2365}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{IEEE}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">address</span> = <span style="color:#e6db74">{Hong Kong, China}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1109/IJCNN.2008.4634125}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">urldate</span> = <span style="color:#e6db74">{2025-12-17}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">isbn</span> = <span style="color:#e6db74">{978-1-4244-1820-6}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Online Handwritten Chemical Formula Structure Analysis</title><link>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/wang-online-handwritten-2009/</link><pubDate>Wed, 17 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/wang-online-handwritten-2009/</guid><description>A hierarchical grammar-based approach for recognizing and analyzing online handwritten chemical formulas in mobile education contexts.</description><content:encoded><![CDATA[<h2 id="hierarchical-grammatical-framework-contribution">Hierarchical Grammatical Framework Contribution</h2>
<p>This is a <strong>Method</strong> paper. It proposes a novel architectural framework for processing chemical formulas by decomposing them into three hierarchical levels (Formula, Molecule, Text). The contribution is defined by a specific set of formal grammatical rules and parsing algorithms used to construct a &ldquo;grammar spanning tree&rdquo; and &ldquo;molecule spanning graph&rdquo; from online handwritten strokes.</p>
<h2 id="motivation-for-online-formula-recognition">Motivation for Online Formula Recognition</h2>
<p>The primary motivation is the application of mobile computing in chemistry education, where precise comprehension of casual, <em>online</em> handwritten formulas is a significant challenge.</p>
<ul>
<li><strong>2D Complexity</strong>: Unlike 1D text, chemical formulas utilize complex 2D spatial relationships that convey specific chemical meaning (e.g., bonds, rings).</li>
<li><strong>Format Limitations</strong>: Existing storage formats like CML (Chemical Markup Language) or MDL MOLFILE do not natively record the layout or abbreviated information necessary for recognizing handwritten input.</li>
<li><strong>Online Gap</strong>: Previous research focused heavily on <em>offline</em> (image-based) recognition, lacking solutions for <em>online</em> (stroke-based) handwritten chemical formulas (OHCF).</li>
</ul>
<h2 id="core-novelty-in-three-level-grammatical-analysis">Core Novelty in Three-Level Grammatical Analysis</h2>
<p>The core novelty is the <strong>Three-Level Grammatical Analysis</strong> approach:</p>
<ol>
<li><strong>Formula Level (1D)</strong>: Treats the reaction equation as a linear sequence of components (Reactants, Products, Separators), parsed via a context-free grammar to build a spanning tree.</li>
<li><strong>Molecule Level (2D)</strong>: Treats molecules as graphs where &ldquo;text groups&rdquo; are vertices and &ldquo;bonds&rdquo; are edges. It introduces specific handling for &ldquo;hidden Carbon dots&rdquo; (intersections of bonds without text).</li>
<li><strong>Text Level (1D)</strong>: Analyzes the internal structure of text groups (atoms, subscripts).</li>
</ol>
<p>Unique to this approach is the <strong>formal definition of the chemical grammar</strong> as a 5-tuple $G=(T,N,P,M,S)$ and the generation of an <strong>Adjacency Matrix</strong> directly from the handwritten sketch to represent chemical connectivity.</p>
<h2 id="experimental-validation-on-handwritten-strokes">Experimental Validation on Handwritten Strokes</h2>
<p>The authors validated their model using a custom dataset of online handwritten formulas.</p>
<ul>
<li><strong>Data Source</strong>: 25 formulas were randomly selected from a larger pool of 1,250 samples.</li>
<li><strong>Scope</strong>: The test set included 484 total symbols, comprising generators, separators, text symbols, rings, and various bond types.</li>
<li><strong>Granular Validation</strong>: The system was tested at multiple distinct stages:
<ul>
<li>Key Symbol Extraction (Formula Level)</li>
<li>Text Localization (Molecule Level)</li>
<li>Bond End Grouping (Molecule Level)</li>
<li>Text Recognition (Text Level)</li>
</ul>
</li>
</ul>
<h2 id="downstream-impact-and-parsing-accuracy">Downstream Impact and Parsing Accuracy</h2>
<p>The system achieved high accuracy across all sub-tasks, demonstrating that the hierarchical grammar approach is effective for both inorganic and organic formulas.</p>
<ul>
<li><strong>Formula Level</strong>: 98.3% accuracy for Key Symbols; 100% for State-assisted symbols.</li>
<li><strong>Molecule Level</strong>: 98.8% accuracy for Bond End Grouping; 100% for Free End-Text connection detection.</li>
<li><strong>Text Recognition</strong>: 98.7% accuracy (Top-3) using HMMs.</li>
<li><strong>Impact</strong>: The method successfully preserves the writer&rsquo;s &ldquo;online information&rdquo; (habits/intentions) while converting the handwritten input into standard formats suitable for visual editing or data retrieval.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<p>To replicate this work, one would need to implement the specific grammatical production rules and the geometric thresholds defined for bond analysis.</p>
<h3 id="data">Data</h3>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset</th>
          <th>Size</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Training</strong></td>
          <td>Symbol HMMs</td>
          <td>5,670 samples</td>
          <td>Used to train the text recognition module</td>
      </tr>
      <tr>
          <td><strong>Testing</strong></td>
          <td>Text Recognition</td>
          <td>2,016 samples</td>
          <td>Test set for character HMMs</td>
      </tr>
      <tr>
          <td><strong>Testing</strong></td>
          <td>Formula Analysis</td>
          <td>25 formulas</td>
          <td>Random subset of 1,250 collected samples; contains 484 symbols</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p><strong>1. Formula Level Parsing</strong></p>
<ul>
<li><strong>HBL Analysis</strong>: Identify the &ldquo;Horizontal Baseline&rdquo; (HBL) containing the most symbols to locate key operators (e.g., $+$, $\rightarrow$).</li>
<li><strong>Grammar</strong>: Use the productions defined in Figure 4. Example rules include:
<ul>
<li>$Reaction ::= ReactantList \ Generator \ ProductList$</li>
<li>$Reactant ::= BalancingNum \ Molecule \ IonicCharacter$</li>
</ul>
</li>
</ul>
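<p>As a toy rendering of the top-level production, the reaction can be split on the generator symbol and the $+$ separators. This is an illustrative sketch, not the paper&rsquo;s parser; the token strings (<code>"-&gt;"</code>, <code>"+"</code>) are placeholder terminals:</p>

```python
def parse_reaction(tokens):
    """Minimal sketch of the Formula-level production
    Reaction ::= ReactantList Generator ProductList.
    Splits a 1-D token sequence on the generator symbol ('->') and
    drops '+' separators. Token names are illustrative, not the
    paper's terminal set."""
    gen = tokens.index("->")
    reactants = [t for t in tokens[:gen] if t != "+"]
    products = [t for t in tokens[gen + 1:] if t != "+"]
    return {"reactants": reactants, "products": products}
```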
<p><strong>2. Molecule Level Analysis (Bond Grouping)</strong></p>
<ul>
<li><strong>Endpoint Classification</strong>: Points are classified as <em>free ends</em>, <em>junctions</em> (3+ bonds), or <em>connections</em> (2 bonds).</li>
<li><strong>Grouping Equation</strong>: An endpoint $(x_k, y_k)$ belongs to Group A based on distance thresholding:
$$
\begin{aligned}
Include(x_k, y_k) = \begin{cases} 1, &amp; d_k &lt; t \cdot \max_j d_j + \partial \\ 0, &amp; \text{else} \end{cases}
\end{aligned}
$$
Where $d_k$ is the Euclidean distance from $(x_k, y_k)$ to the group center $(x_a, y_a)$, $\max_j d_j$ is the largest such distance among current group members, and $t$, $\partial$ are thresholds.</li>
</ul>
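<p>The grouping rule can be sketched as a small membership predicate. This is a minimal illustration under stated assumptions, not the authors&rsquo; code; the values of <code>t</code> and <code>delta</code> (the paper&rsquo;s $t$ and $\partial$) are placeholders:</p>

```python
import math

def include_in_group(point, group, t=0.5, delta=5.0):
    """Decide whether `point` joins `group` (a list of (x, y) bond
    endpoints): its distance to the group center must stay below
    t * (largest in-group distance) + delta. Threshold values are
    illustrative, not the paper's."""
    # Group center (x_a, y_a) is the mean of current members.
    xa = sum(x for x, _ in group) / len(group)
    ya = sum(y for _, y in group) / len(group)
    dists = [math.hypot(x - xa, y - ya) for x, y in group]
    d_k = math.hypot(point[0] - xa, point[1] - ya)
    return d_k < t * max(dists) + delta
```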
<p><strong>3. Connection Detection</strong></p>
<ul>
<li><strong>Text-Bond Connection</strong>: A text group is connected to a bond if the free end falls within a bounding box expanded by thresholds $t_W$ and $t_H$:
$$
\begin{aligned}
Con(x,y) = \begin{cases} 1, &amp; \min x - t_W &lt; x &lt; \max x + t_W \text{ AND } \min y - t_H &lt; y &lt; \max y + t_H \\ 0, &amp; \text{else} \end{cases}
\end{aligned}
$$</li>
</ul>
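<p>The expanded-bounding-box test maps directly to a few comparisons. A sketch assuming axis-aligned boxes given as <code>(min_x, min_y, max_x, max_y)</code>; the margin defaults are placeholders:</p>

```python
def connects(free_end, text_box, t_w=10.0, t_h=10.0):
    """Test whether a bond's free end (x, y) falls inside a text group's
    bounding box expanded by margins t_w and t_h, per the Con(x, y) rule."""
    x, y = free_end
    min_x, min_y, max_x, max_y = text_box
    return (min_x - t_w < x < max_x + t_w) and (min_y - t_h < y < max_y + t_h)
```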
<h3 id="models">Models</h3>
<ul>
<li><strong>Text Recognition</strong>: Hidden Markov Models (HMM) are used for recognizing individual text symbols.</li>
<li><strong>Grammar</strong>: Context-Free Grammar (CFG) designed with ambiguity elimination to ensure a single valid parse tree for any valid formula.</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<p>Performance is measured by recognition accuracy at specific processing stages:</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Task</th>
          <th>Value</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Accuracy</td>
          <td>F1 (Key Symbol Extraction)</td>
          <td>98.3%</td>
          <td>Formula Level</td>
      </tr>
      <tr>
          <td>Accuracy</td>
          <td>F2 (State-assisted Symbol)</td>
          <td>100%</td>
          <td>Formula Level</td>
      </tr>
      <tr>
          <td>Accuracy</td>
          <td>M2 (Bond End Grouping)</td>
          <td>98.8%</td>
          <td>Molecule Level</td>
      </tr>
      <tr>
          <td>Accuracy</td>
          <td>M3 (Free End-Text Conn)</td>
          <td>100%</td>
          <td>Molecule Level</td>
      </tr>
      <tr>
          <td>Accuracy</td>
          <td>T1 (Text Recognition)</td>
          <td>98.7%</td>
          <td>Top-3 Accuracy</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Wang, X., Shi, G., &amp; Yang, J. (2009). The Understanding and Structure Analyzing for Online Handwritten Chemical Formulas. <em>2009 10th International Conference on Document Analysis and Recognition</em>, 1056&ndash;1060. <a href="https://doi.org/10.1109/ICDAR.2009.70">https://doi.org/10.1109/ICDAR.2009.70</a></p>
<p><strong>Publication</strong>: ICDAR 2009</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{wangUnderstandingStructureAnalyzing2009,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{The {{Understanding}} and {{Structure Analyzing}} for {{Online Handwritten Chemical Formulas}}}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{2009 10th {{International Conference}} on {{Document Analysis}} and {{Recognition}}}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Wang, Xin and Shi, Guangshun and Yang, Jufeng}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{2009}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{1056--1060}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{IEEE}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">address</span> = <span style="color:#e6db74">{Barcelona, Spain}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1109/ICDAR.2009.70}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">isbn</span> = <span style="color:#e6db74">{978-1-4244-4500-4}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">langid</span> = <span style="color:#e6db74">{english}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>On-line Handwritten Chemical Expression Recognition</title><link>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/yang-icpr-2008/</link><pubDate>Wed, 17 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/yang-icpr-2008/</guid><description>Two-level algorithm for recognizing on-line handwritten chemical expressions using structural analysis, ANNs, and string edit distance.</description><content:encoded><![CDATA[<h2 id="a-methodological-approach-to-chemical-recognition">A Methodological Approach to Chemical Recognition</h2>
<p>This is a <strong>Method</strong> paper. It proposes a specific &ldquo;novel two-level algorithm&rdquo; and a &ldquo;System model&rdquo; for recognizing chemical expressions. The paper focuses on the architectural design of the recognition pipeline (segmentation, substance recognition, symbol recognition) and validates it against a &ldquo;conventional algorithm&rdquo; baseline, fitting the standard profile of a methodological contribution.</p>
<h2 id="bridging-the-gap-in-pen-based-chemical-input">Bridging the Gap in Pen-Based Chemical Input</h2>
<p>While pen-based computing has advanced for text and mathematical formulas, inputting chemical expressions remains &ldquo;time-consuming&rdquo;. Existing research often lacks &ldquo;adequate chemical knowledge&rdquo; or relies on algorithms that are too slow (global optimization) or structurally weak (local optimization). The authors aim to bridge this gap by integrating chemical domain knowledge into the recognition process to improve speed and accuracy.</p>
<h2 id="two-level-recognition-strategy-for-formulas">Two-Level Recognition Strategy for Formulas</h2>
<p>The core novelty is a <strong>two-level recognition strategy</strong>:</p>
<ol>
<li><strong>Level 1 (Substance Recognition)</strong>: Uses global structural information to identify entire &ldquo;substance units&rdquo; (e.g., $H_2SO_4$) by matching against a dictionary.</li>
<li><strong>Level 2 (Symbol Recognition)</strong>: If Level 1 fails, the system falls back to segmenting the substance into isolated characters and recognizing them individually.</li>
</ol>
<p>Additionally, the method integrates <strong>syntactic features</strong> (chemical knowledge) such as element conservation to validate and correct results and uses specific geometric features to distinguish superscript/subscript relationships.</p>
<h2 id="dataset-collection-and-baseline-comparisons">Dataset Collection and Baseline Comparisons</h2>
<ul>
<li><strong>Dataset Collection</strong>: The authors collected 1197 handwritten expression samples from 20 chemistry professionals and students. This included 983 &ldquo;standard&rdquo; expressions (from 341 templates) and 214 &ldquo;arbitrary&rdquo; expressions written freely.</li>
<li><strong>Comparison</strong>: They compared their &ldquo;Two-level recognition&rdquo; approach against a &ldquo;conventional algorithm&rdquo; that bypasses the first level and segments the input directly into characters.</li>
<li><strong>Metrics</strong>: They measured Material Accuracy (MA), Correct Expressions Number (AEN), and Expression Accuracy (EA).</li>
</ul>
<h2 id="high-accuracy-in-formula-recognition">High Accuracy in Formula Recognition</h2>
<ul>
<li><strong>High Accuracy</strong>: The proposed algorithm achieved <strong>96.4% Material Accuracy (MA)</strong> and <strong>95.7% Expression Accuracy (EA)</strong> on the total test set.</li>
<li><strong>Robustness</strong>: The method performed well on both standard (96.3% EA) and arbitrary (92.5% EA) expressions.</li>
<li><strong>Validation</strong>: The authors conclude the algorithm is &ldquo;reliable,&rdquo; &ldquo;flexible,&rdquo; and suitable for real-time applications compared to prior work.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The authors constructed two distinct datasets for training and evaluation:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Purpose</th>
          <th style="text-align: left">Dataset</th>
          <th style="text-align: left">Size</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Symbol Training</strong></td>
          <td style="text-align: left">ISF Files</td>
          <td style="text-align: left">12,240 files</td>
          <td style="text-align: left">Used to train the ANN classifier. Covers 102 symbol classes (numerals, letters, operators, organic loops).</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Expression Testing</strong></td>
          <td style="text-align: left">Handwritten Expressions</td>
          <td style="text-align: left">1,197 samples</td>
          <td style="text-align: left">983 standard + 214 arbitrary expressions collected from 20 chemistry teachers/students.</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p><strong>1. Structural Segmentation (Superscript/Subscript)</strong></p>
<p>To distinguish relationships (superscript, subscript, in-line), the authors define geometric parameters based on the bounding boxes of adjacent symbols ($x_{i1}, y_{i1}, x_{i2}, y_{i2}$):</p>
<p>$$d = 0.7 \times y_{12} - y_{22} + 0.3 \times y_{11}$$
$$T = 1000 \times d/h$$
$$B = 1000 \times (B_1 - B_2)/h_1$$</p>
<p>Where $B_1, B_2$ are barycenters and $h$ is height. $(T, B)$ serves as the feature vector for classification.</p>
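<p>Transcribed into code, the $(T, B)$ feature pair looks like the following. The formula for $d$ is copied verbatim from the text; which bounding-box coordinate plays which role is an assumption here, since the paper defines them precisely:</p>

```python
def layout_features(y11, y12, y22, h, h1, b1, b2):
    """(T, B) layout features for superscript/subscript classification.
    y11/y12 are the first symbol's bounding-box y-coordinates, y22 the
    second's; h and h1 are heights, b1/b2 the vertical barycenters.
    The expression for d follows the note verbatim."""
    d = 0.7 * y12 - y22 + 0.3 * y11
    T = 1000 * d / h
    B = 1000 * (b1 - b2) / h1
    return T, B
```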
<p><strong>2. Segmentation Reliability</strong></p>
<p>For segmenting strokes into units, the reliability of a segmentation path is calculated as:</p>
<p>$$Cof(K_{i},n)=\sum_{j=0}^{N}P(k_{j},k_{j+1})+P(S_{K_{i}})+\delta(N)$$</p>
<p>Where $P(k_j, k_{j+1})$ is the reliability of strokes being recognized as symbol $S_{k_j}$.</p>
<p><strong>3. Substance Matching (Level 1)</strong></p>
<p>A modified string edit distance is used to match handwritten input against a dictionary:</p>
<p>$$\lambda_{\overline{u}}=\mu_{i} \times f(Dis(i,j,r)/\sqrt{Max(Len_{i},Len_{j})})$$</p>
<p>Where $\mu_i$ is the recognizer credibility and $Dis(i,j,r)$ is the edit distance.</p>
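<p>The dictionary-matching score can be approximated with a plain Levenshtein distance. A hedged sketch: the paper does not specify the function $f$, so an exponential decay is assumed here, and the credibility $\mu_i$ is passed in as a constant:</p>

```python
import math

def edit_distance(a, b):
    """Plain Levenshtein distance as a stand-in for the paper's Dis(i, j, r)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[-1]

def match_score(candidate, entry, mu=1.0):
    """Match score in the spirit of the paper's lambda: recognizer
    credibility mu scaled by a decreasing function of the normalized
    edit distance. The choice of exp(-x) for f is an assumption."""
    dis = edit_distance(candidate, entry)
    return mu * math.exp(-dis / math.sqrt(max(len(candidate), len(entry))))
```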
<h3 id="models">Models</h3>
<ul>
<li><strong>Classifier</strong>: An ANN-based classifier is used for isolated symbol recognition.</li>
<li><strong>Input Features</strong>: A set of ~30 features is extracted from strokes, including writing time, interval time, elastic mesh, and stroke outline.</li>
<li><strong>Performance</strong>: The classifier achieved 92.1% accuracy on a test set of 2,702 isolated symbols.</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<p>The system was evaluated on the 1,197 expression samples.</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Metric</th>
          <th style="text-align: left">Value (Total)</th>
          <th style="text-align: left">Value (Standard)</th>
          <th style="text-align: left">Value (Other)</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Material Accuracy (MA)</strong></td>
          <td style="text-align: left">96.4%</td>
          <td style="text-align: left">97.7%</td>
          <td style="text-align: left">94%</td>
          <td style="text-align: left">Accuracy of substance recognition.</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Expression Accuracy (EA)</strong></td>
          <td style="text-align: left">95.7%</td>
          <td style="text-align: left">96.3%</td>
          <td style="text-align: left">92.5%</td>
          <td style="text-align: left">Accuracy of full expression recognition.</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Yang, J., Shi, G., Wang, K., Geng, Q., &amp; Wang, Q. (2008). A Study of On-Line Handwritten Chemical Expressions Recognition. <em>2008 19th International Conference on Pattern Recognition</em>, 1&ndash;4. <a href="https://doi.org/10.1109/ICPR.2008.4761824">https://doi.org/10.1109/ICPR.2008.4761824</a></p>
<p><strong>Publication</strong>: ICPR 2008</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{yangStudyOnlineHandwritten2008,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{A Study of On-Line Handwritten Chemical Expressions Recognition}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{2008 19th {{International Conference}} on {{Pattern Recognition}}}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Yang, Jufeng and Shi, Guangshun and Wang, Kai and Geng, Qian and Wang, Qingren}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">2008</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = dec,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{1--4}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{IEEE}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">address</span> = <span style="color:#e6db74">{Tampa, FL, USA}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1109/ICPR.2008.4761824}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>HMM-based Online Recognition of Chemical Symbols</title><link>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/zhang-hmm-handwriting-2009/</link><pubDate>Wed, 17 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/zhang-hmm-handwriting-2009/</guid><description>Online recognition of handwritten chemical symbols using Hidden Markov Models with 11-dimensional local features, achieving 89.5% top-1 accuracy.</description><content:encoded><![CDATA[<h2 id="what-kind-of-paper-is-this">What kind of paper is this?</h2>
<p>This is a <strong>Method</strong> paper that proposes a specific algorithmic pipeline for the online recognition of handwritten chemical symbols. The core contribution is the engineering of an 11-dimensional feature vector combined with a Hidden Markov Model (HMM) architecture. The paper validates this method through quantitative experiments on a custom dataset, focusing on recognition accuracy as the primary metric.</p>
<h2 id="what-is-the-motivation">What is the motivation?</h2>
<p>Recognizing chemical symbols is uniquely challenging due to the complex structure of chemical expressions and the nature of pen-based input, which often produces broken or conglutinated (run-together) strokes. Additionally, variations in writing style and random noise make the task difficult. While online recognition for Western characters and CJK scripts is well-developed, works specifically targeting online chemical symbol recognition are scarce, with most prior research focusing on offline recognition or global optimization.</p>
<h2 id="what-is-the-novelty-here">What is the novelty here?</h2>
<p>The primary novelty is the application of continuous HMMs specifically to the domain of <strong>online</strong> chemical symbol recognition, utilizing a specialized set of <strong>11-dimensional local features</strong>. While HMMs have been used for other scripts, this paper tailors the feature extraction (including curliness, linearity, and writing direction) to capture the specific geometric properties of chemical symbols.</p>
<h2 id="what-experiments-were-performed">What experiments were performed?</h2>
<p>The authors constructed a specific dataset for this task involving 20 participants (college teachers and students).</p>
<ul>
<li><strong>Dataset</strong>: 64 distinct symbols (digits, English letters, Greek letters, operators)</li>
<li><strong>Volume</strong>: 7,808 total samples (122 per symbol), split into 5,670 training samples and 2,016 testing samples</li>
<li><strong>Model Sweeps</strong>: They evaluated the HMM performance by varying the number of states (4, 6, 8) and the number of Gaussians per state (3, 4, 6, 9, 12)</li>
</ul>
<h2 id="what-were-the-outcomes-and-conclusions-drawn">What were the outcomes and conclusions drawn?</h2>
<ul>
<li><strong>Performance</strong>: The best configuration (6 states, 9 Gaussians) achieved a <strong>top-1 accuracy of 89.5%</strong> and a <strong>top-3 accuracy of 98.7%</strong></li>
<li><strong>Scaling</strong>: Results showed that generally, increasing the number of states and Gaussians improved accuracy, though at the cost of computational efficiency</li>
<li><strong>Error Analysis</strong>: The primary sources of error were shape similarities between specific characters (e.g., &lsquo;0&rsquo; vs &lsquo;O&rsquo; vs &lsquo;o&rsquo;, and &lsquo;C&rsquo; vs &lsquo;c&rsquo; vs &lsquo;(&rsquo;)</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<p><strong>Status:</strong> Closed / Very Low Reproducibility. This 2009 study relies on a private, custom-collected dataset and does not provide source code, model weights, or an open-access preprint.</p>
<h3 id="artifacts">Artifacts</h3>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Artifact</th>
          <th style="text-align: left">Type</th>
          <th style="text-align: left">License</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><em>None publicly available</em></td>
          <td style="text-align: left">N/A</td>
          <td style="text-align: left">N/A</td>
          <td style="text-align: left">No open source code, open datasets, or open-access preprints were released with this publication.</td>
      </tr>
  </tbody>
</table>
<h3 id="data">Data</h3>
<p>The study utilized a custom dataset collected in a laboratory environment.</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Purpose</th>
          <th style="text-align: left">Dataset</th>
          <th style="text-align: left">Size</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Training</strong></td>
          <td style="text-align: left">Custom Chemical Symbol Set</td>
          <td style="text-align: left">5,670 samples</td>
          <td style="text-align: left">90 samples per symbol</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Testing</strong></td>
          <td style="text-align: left">Custom Chemical Symbol Set</td>
          <td style="text-align: left">2,016 samples</td>
          <td style="text-align: left">32 samples per symbol</td>
      </tr>
  </tbody>
</table>
<p><strong>Dataset Composition</strong>: The set includes <strong>64 symbols</strong>: Digits (0-9), Uppercase (A-Z, missing Q), Lowercase (a-z, selected), Greek letters ($\alpha$, $\beta$, $\gamma$, $\pi$), and operators ($+$, $=$, $\rightarrow$, $\uparrow$, $\downarrow$, $($ , $)$).</p>
<h3 id="algorithms">Algorithms</h3>
<p><strong>1. Preprocessing</strong></p>
<p>The raw tablet data undergoes a 6-step pipeline:</p>
<ol>
<li><strong>Duplicate Point Elimination</strong>: Removing sequential points with identical coordinates</li>
<li><strong>Broken Stroke Connection</strong>: Using Bezier curves to interpolate missing points/connect broken strokes</li>
<li><strong>Hook Elimination</strong>: Removing artifacts at the start/end of strokes characterized by short length and sharp angle changes</li>
<li><strong>Smoothing</strong>: Reducing noise from erratic pen movement</li>
<li><strong>Re-sampling</strong>: Spacing points equidistantly to remove temporal variation</li>
<li><strong>Size Normalization</strong>: Removing variation in writing scale</li>
</ol>
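<p>Step 5 (re-sampling) is the pipeline stage most amenable to a short sketch. A minimal linear-interpolation version, assuming strokes as lists of $(x, y)$ points; the paper does not give its exact scheme:</p>

```python
import math

def resample(points, spacing):
    """Re-space a stroke's points equidistantly along its arc length,
    removing speed-of-writing variation (step 5 of the pipeline).
    Linear interpolation between consecutive raw points."""
    out = [points[0]]
    dist_rem = spacing  # distance left until the next output point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        while seg >= dist_rem:
            frac = dist_rem / seg
            x0, y0 = x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)
            out.append((x0, y0))
            seg -= dist_rem
            dist_rem = spacing
        dist_rem -= seg
    return out
```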
<p><strong>2. Feature Extraction (11 Dimensions)</strong></p>
<p>Features are extracted from a 5-point window centered on $t$ ($t-2$ to $t+2$). The 11 dimensions are:</p>
<ol>
<li><strong>Normalized Vertical Position</strong>: $y(t)$ mapped to $[0,1]$</li>
<li><strong>Normalized First Derivative ($x'$)</strong>: Calculated via weighted sum of neighbors</li>
<li><strong>Normalized First Derivative ($y'$)</strong>: Calculated via weighted sum of neighbors</li>
<li><strong>Normalized Second Derivative ($x''$)</strong>: Computed using $x'$ values</li>
<li><strong>Normalized Second Derivative ($y''$)</strong>: Computed using $y'$ values</li>
<li><strong>Curvature</strong>: $\frac{x'y'' - x''y'}{(x'^2 + y'^2)^{3/2}}$</li>
<li><strong>Writing Direction (Cos)</strong>: $\cos \alpha(t)$ based on vector from $t-1$ to $t+1$</li>
<li><strong>Writing Direction (Sin)</strong>: $\sin \alpha(t)$</li>
<li><strong>Aspect Ratio</strong>: Ratio of height to width in the 5-point window</li>
<li><strong>Curliness</strong>: Deviation from the straight line connecting the first and last point of the window</li>
<li><strong>Linearity</strong>: Average squared distance of points in the window to the straight line connecting start/end points</li>
</ol>
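<p>Three of the window-based features (9&ndash;11) can be sketched directly. The precise normalizations for curliness and linearity are not given in this note, so the definitions below (path/chord deviation; mean squared point-to-chord distance) are assumptions:</p>

```python
import math

def window_features(pts, t):
    """Aspect ratio, curliness, and linearity over the 5-point window
    centred on index t (t-2 .. t+2). Definitions of curliness and
    linearity are assumed, not the paper's exact formulas."""
    w = pts[t - 2 : t + 3]
    xs = [p[0] for p in w]
    ys = [p[1] for p in w]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    aspect = height / width if width else float("inf")
    # Chord from first to last window point vs. traced path length.
    (x0, y0), (x1, y1) = w[0], w[-1]
    chord = math.hypot(x1 - x0, y1 - y0)
    path = sum(math.hypot(b[0] - a[0], b[1] - a[1]) for a, b in zip(w, w[1:]))
    curliness = path / chord - 1.0 if chord else 0.0  # 0 for a straight stroke
    if chord:
        # Perpendicular distance of each point to the chord, squared and averaged.
        linearity = sum(
            (abs((x1 - x0) * (y0 - py) - (x0 - px) * (y1 - y0)) / chord) ** 2
            for px, py in w
        ) / len(w)
    else:
        linearity = 0.0
    return aspect, curliness, linearity
```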
<p><strong>3. Feature Normalization</strong></p>
<p>The final feature matrix $V$ is normalized to zero mean and unit standard deviation using the covariance matrix: $o_t = \Sigma^{-1/2}(v_t - \mu)$.</p>
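<p>The whitening transform can be implemented with an eigendecomposition of the sample covariance matrix. A sketch, assuming one 11-dimensional feature vector per resampled point:</p>

```python
import numpy as np

def whiten(V):
    """Whitening per o_t = Sigma^(-1/2) (v_t - mu): the transformed
    features have zero mean and identity covariance. V has shape
    (T, d), one d-dim feature vector per sample point."""
    mu = V.mean(axis=0)
    sigma = np.cov(V, rowvar=False)
    # Inverse square root via eigendecomposition (sigma is symmetric PSD).
    vals, vecs = np.linalg.eigh(sigma)
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return (V - mu) @ inv_sqrt.T
```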
<h3 id="models">Models</h3>
<ul>
<li><strong>Architecture</strong>: Continuous Hidden Markov Models (HMM)</li>
<li><strong>Topology</strong>: Left-to-right (Bakis model)</li>
<li><strong>Initialization</strong>: Initial distribution $\pi = (1, 0, \ldots, 0)$; uniform transition matrix $A$; segmental k-means for observation matrix $B$</li>
<li><strong>Training</strong>: Baum-Welch re-estimation</li>
<li><strong>Decision</strong>: Maximum likelihood classification ($\hat{\lambda} = \arg \max P(O|\lambda)$)</li>
</ul>
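<p>Classification evaluates $P(O|\lambda)$ under each symbol&rsquo;s HMM with the forward algorithm and takes the argmax. A log-domain sketch (not the authors&rsquo; implementation); <code>log_B[t, j]</code> is assumed to hold the log Gaussian-mixture emission density of observation $t$ in state $j$:</p>

```python
import numpy as np

def log_forward(log_pi, log_A, log_B):
    """Log-domain forward algorithm: returns log P(O | lambda) for one
    observation sequence. log_B has shape (T, N); classification picks
    the symbol model with the highest returned value."""
    log_alpha = log_pi + log_B[0]
    for t in range(1, log_B.shape[0]):
        # log-sum-exp over previous states, for each next state j
        m = log_alpha.max()
        log_alpha = m + np.log(np.exp(log_alpha - m) @ np.exp(log_A)) + log_B[t]
    m = log_alpha.max()
    return m + np.log(np.exp(log_alpha - m).sum())
```

<p>With uniform (log-zero) emissions the state distribution simply propagates through $A$ and the total probability stays 1, which gives a quick sanity check.</p>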
<h3 id="evaluation">Evaluation</h3>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Metric</th>
          <th style="text-align: left">Best Value</th>
          <th style="text-align: left">Configuration</th>
          <th style="text-align: left">Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><strong>Top-1 Accuracy</strong></td>
          <td style="text-align: left"><strong>89.5%</strong></td>
          <td style="text-align: left">6 States, 9 Gaussians</td>
          <td style="text-align: left">Highest reported accuracy</td>
      </tr>
      <tr>
          <td style="text-align: left"><strong>Top-3 Accuracy</strong></td>
          <td style="text-align: left"><strong>98.7%</strong></td>
          <td style="text-align: left">6 States, 9 Gaussians</td>
          <td style="text-align: left">Top-3 candidate accuracy</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Zhang, Y., Shi, G., &amp; Yang, J. (2009). HMM-Based Online Recognition of Handwritten Chemical Symbols. <em>2009 10th International Conference on Document Analysis and Recognition</em>, 1255&ndash;1259. <a href="https://doi.org/10.1109/ICDAR.2009.99">https://doi.org/10.1109/ICDAR.2009.99</a></p>
<p><strong>Publication</strong>: ICDAR 2009</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{zhang2009hmm,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{HMM-Based Online Recognition of Handwritten Chemical Symbols}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{2009 10th International Conference on Document Analysis and Recognition}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Zhang, Yang and Shi, Guangshun and Yang, Jufeng}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{2009}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{75}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{1255--1259}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{IEEE}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1109/ICDAR.2009.99}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>Handwritten Chemical Symbol Recognition Using SVMs</title><link>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/tang-online-symbol-2013/</link><pubDate>Wed, 17 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/tang-online-symbol-2013/</guid><description>A hybrid SVM and elastic matching approach for recognizing handwritten chemical symbols drawn on touch devices, achieving 89.7% top-1 accuracy.</description><content:encoded><![CDATA[<h2 id="paper-contribution-and-taxonomy">Paper Contribution and Taxonomy</h2>
<p>This is a <strong>Method</strong> paper according to the <a href="/notes/interdisciplinary/research-methods/ai-physical-sciences-paper-taxonomy/">AI for Physical Sciences taxonomy</a>.</p>
<ul>
<li><strong>Dominant Basis</strong>: The authors propose a novel hybrid architecture (SVM-EM) that combines two existing techniques to solve a specific recognition problem.</li>
<li><strong>Rhetorical Indicators</strong>: The paper explicitly defines algorithms (Algorithm 1 &amp; 2), presents a system architecture, and validates the method via ablation studies comparing the hybrid approach against its individual components.</li>
</ul>
<h2 id="motivation-for-pen-based-input">Motivation for Pen-Based Input</h2>
<p>Entering chemical expressions on digital devices is difficult due to their complex 2D spatial structure.</p>
<ul>
<li><strong>The Problem</strong>: While handwriting recognition for text and math is mature, chemical structures involve unique symbols and spatial arrangements that existing tools struggle to process efficiently.</li>
<li><strong>Existing Solutions</strong>: Standard tools (like ChemDraw) rely on point-and-click interactions, which are described as complicated and non-intuitive compared to direct handwriting.</li>
<li><strong>Goal</strong>: To enable fluid handwriting input on pen/touch-based devices (like iPads) by accurately recognizing individual chemical symbols in real-time.</li>
</ul>
<h2 id="novelty-hybrid-svm-and-elastic-matching">Novelty: Hybrid SVM and Elastic Matching</h2>
<p>The core contribution is the <strong>Hybrid SVM-EM</strong> approach, which splits recognition into a coarse classification stage and a fine-grained verification stage.</p>
<ul>
<li><strong>Two-Stage Pipeline</strong>:
<ol>
<li><strong>SVM Recognition</strong>: Uses statistical features (stroke count, turning angles) to generate a short-list of candidate symbols.</li>
<li><strong>Elastic Matching (EM)</strong>: Uses a geometric point-to-point distance metric to re-rank these candidates against a library of stored symbol prototypes.</li>
</ol>
</li>
<li><strong>Online Stroke Partitioning</strong>: A heuristic-based method to group strokes into symbols in real-time based on time adjacency (grouping the last $n$ strokes) and spatial intersection checks, without waiting for the user to finish the entire drawing.</li>
</ul>
<h2 id="experimental-design-and-data-collection">Experimental Design and Data Collection</h2>
<p>The authors conducted a user study to collect data and evaluate the system:</p>
<ul>
<li><strong>Participants</strong>: 10 users were recruited to write chemical symbols on an iPad.</li>
<li><strong>Task</strong>: Each user wrote 78 distinct chemical symbols (digits, letters, bonds) 3 times each.</li>
<li><strong>Baselines</strong>: The hybrid method was compared against two baselines:
<ol>
<li>SVM only</li>
<li>Elastic Matching only.</li>
</ol>
</li>
<li><strong>Metrics</strong>: Evaluation focused on <strong>Precision@k</strong> (where $k=1, 3, 5$), measuring how often the correct symbol appeared in the top-$k$ suggestions.</li>
</ul>
<h2 id="recognition-performance-and-outcomes">Recognition Performance and Outcomes</h2>
<p>The hybrid approach demonstrated improved performance compared to using either technique in isolation.</p>
<ul>
<li><strong>Key Results</strong>:
<ul>
<li><strong>Hybrid SVM-EM</strong>: 89.7% Precision@1 (Top-1 accuracy).</li>
<li><strong>SVM Only</strong>: 85.1% Precision@1.</li>
<li><strong>EM Only</strong>: 76.7% Precision@1.</li>
</ul>
</li>
<li><strong>Category Performance</strong>: The system performed best on Operators (91.9%) and Digits (91.3%), with slightly lower performance on Alphabetic characters (88.6%).</li>
<li><strong>Impact</strong>: The system was successfully implemented as a real-time iOS application, allowing users to draw complex structures that are then converted to SMILES strings (e.g., $C\#CC(O)$).</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<p>The study generated a custom dataset for training and evaluation.</p>
<table>
  <thead>
      <tr>
          <th>Purpose</th>
          <th>Dataset Stats</th>
          <th>Details</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Evaluation</strong></td>
          <td>2,340 samples</td>
          <td>Collected from 10 users. Consists of <strong>78 unique symbols</strong>: 10 digits (0-9), 52 letters (A-Z, a-z), and 16 bonds/operators (e.g., $=$, $+$, hash bonds).</td>
      </tr>
      <tr>
          <td><strong>Training</strong></td>
          <td>Unspecified size</td>
          <td>A &ldquo;Chemical Elastic Symbol Library&rdquo; was created containing samples of all supported symbols to serve as prototypes for the Elastic Matching step.</td>
      </tr>
  </tbody>
</table>
<h3 id="algorithms">Algorithms</h3>
<p>The pipeline consists of four distinct algorithmic steps:</p>
<p><strong>1. Stroke Partitioning</strong></p>
<ul>
<li><strong>Logic</strong>: Groups the most recently written stroke with up to the last 4 previous strokes.</li>
<li><strong>Filtering</strong>: Invalid groups are removed using &ldquo;Spatial Distance Checking&rdquo; (strokes too far apart) and &ldquo;Stroke Intersection Checking&rdquo; (strokes that don&rsquo;t intersect where expected).</li>
</ul>
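<p>The partitioning heuristic can be sketched as follows. The bounding-box gap test and its threshold are stand-ins for the paper's &ldquo;Spatial Distance Checking&rdquo; (the stroke-intersection check is omitted); values and names are illustrative:</p>

```python
def bbox(stroke):
    """Axis-aligned bounding box of a stroke (list of (x, y) points)."""
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    return min(xs), min(ys), max(xs), max(ys)

def bbox_gap(a, b):
    """Separation between two bounding boxes (0 if they overlap)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    dx = max(bx0 - ax1, ax0 - bx1, 0)
    dy = max(by0 - ay1, ay0 - by1, 0)
    return (dx ** 2 + dy ** 2) ** 0.5

def candidate_groups(strokes, max_prev=4, max_gap=20.0):
    """Group the newest stroke with up to the last `max_prev` strokes,
    dropping groups whose strokes lie too far from the newest one."""
    newest = strokes[-1]
    groups = [[newest]]
    for n in range(1, min(max_prev, len(strokes) - 1) + 1):
        group = strokes[-(n + 1):]
        # Spatial Distance Checking: every stroke must lie near the newest
        if all(bbox_gap(bbox(s), bbox(newest)) <= max_gap for s in group):
            groups.append(group)
    return groups
```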
<p><strong>2. Preprocessing</strong></p>
<ul>
<li><strong>Size Normalization</strong>: Scales symbol to a standard size based on its bounding box.</li>
<li><strong>Smoothing</strong>: Uses average smoothing (replacing mid-points with the average of neighbors) to remove jitter.</li>
<li><strong>Sampling</strong>: Resamples valid strokes to a fixed number of <strong>50 points</strong>.</li>
</ul>
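<p>A sketch of these three preprocessing steps in NumPy (the 50-point resampling count is from the paper; the 100-unit target size is an assumption):</p>

```python
import numpy as np

def normalize_size(stroke, size=100.0):
    """Scale a stroke into a size x size box anchored at the origin."""
    pts = np.asarray(stroke, dtype=float)
    lo = pts.min(axis=0)
    span = (pts.max(axis=0) - lo).max()
    return (pts - lo) * (size / span if span > 0 else 1.0)

def smooth(stroke):
    """Average smoothing: each interior point becomes the mean of itself
    and its two neighbours."""
    pts = np.asarray(stroke, dtype=float)
    out = pts.copy()
    out[1:-1] = (pts[:-2] + pts[1:-1] + pts[2:]) / 3.0
    return out

def resample(stroke, n=50):
    """Resample to n points equally spaced along the stroke's arc length."""
    pts = np.asarray(stroke, dtype=float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])
    t = np.linspace(0.0, cum[-1], n)
    return np.stack([np.interp(t, cum, pts[:, 0]),
                     np.interp(t, cum, pts[:, 1])], axis=1)
```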
<p><strong>3. SVM Feature Extraction</strong></p>
<ul>
<li><strong>Horizontal Angle</strong>: Calculated between two consecutive points ($P_1, P_2$). Values are binned into 12 groups ($30^{\circ}$ each).</li>
<li><strong>Turning Angle</strong>: The difference between two consecutive horizontal angles. Values are binned into 18 groups ($10^{\circ}$ each).</li>
<li><strong>Features</strong>: Input vector consists of stroke count, normalized coordinates, and the percentage of angles falling into the histograms described above.</li>
</ul>
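<p>A sketch of the two angle histograms. Sign handling for the turning angle is not fully specified in the summary, so this version bins the turning-angle magnitude over $[0^{\circ}, 180^{\circ}]$; names are illustrative:</p>

```python
import numpy as np

def angle_histograms(points):
    """Angle features over a resampled stroke (n x 2 array).

    Returns (h12, t18): horizontal angles binned into 12 x 30-degree bins,
    and turning-angle magnitudes binned into 18 x 10-degree bins, each
    normalized to fractions ('percentage of angles' per bin).
    """
    pts = np.asarray(points, dtype=float)
    d = np.diff(pts, axis=0)
    horiz = np.degrees(np.arctan2(d[:, 1], d[:, 0])) % 360.0   # [0, 360)
    h12 = np.bincount((horiz // 30).astype(int), minlength=12)
    turn = np.diff(horiz)
    turn = np.abs((turn + 180.0) % 360.0 - 180.0)              # magnitude in [0, 180]
    t18 = np.bincount(np.clip(turn // 10, 0, 17).astype(int), minlength=18)
    return h12 / max(h12.sum(), 1), t18 / max(t18.sum(), 1)
```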
<p><strong>4. Elastic Matching (Verification)</strong></p>
<ul>
<li><strong>Distance Function</strong>: Euclidean distance summation between the points of the candidate symbol ($s$) and the partitioned input ($s_p$).
$$
\begin{aligned}
D(s, s_p) = \sum_{j=1}^{n} \sqrt{(x_{s,j} - x_{p,j})^2 + (y_{s,j} - y_{p,j})^2}
\end{aligned}
$$
<em>Note: The paper formula sums the distances; $n$ is the number of points (50).</em></li>
<li><strong>Ranking</strong>: Candidates are re-ranked in ascending order of this elastic distance.</li>
</ul>
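<p>The verification step reduces to a few lines. Here <code>library</code> maps each symbol to a stored prototype resampled to the same 50 points as the input; names are illustrative:</p>

```python
import numpy as np

def elastic_distance(s, s_p):
    """D(s, s_p): sum of point-to-point Euclidean distances between two
    symbols resampled to the same number of points."""
    s, s_p = np.asarray(s, float), np.asarray(s_p, float)
    return float(np.linalg.norm(s - s_p, axis=1).sum())

def rerank(candidates, library, s_p):
    """Re-rank SVM candidates by ascending elastic distance between the
    input s_p and each candidate's stored prototype."""
    return sorted(candidates, key=lambda c: elastic_distance(library[c], s_p))
```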
<h3 id="models">Models</h3>
<ul>
<li><strong>Classifier</strong>: Linear Support Vector Machine (SVM) implemented using <strong>LibSVM</strong>.</li>
<li><strong>Symbol Library</strong>: A &ldquo;Chemical Elastic Symbol Library&rdquo; stores the raw stroke point sequences for all 78 supported symbols to enable the elastic matching comparison.</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<p>Performance was measured using precision at different ranks (Top-N accuracy).</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Value</th>
          <th>Baseline</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Precision@1</strong></td>
          <td><strong>89.7%</strong></td>
          <td>85.1% (SVM)</td>
          <td>Hybrid model reduces error rate significantly over baselines.</td>
      </tr>
      <tr>
          <td><strong>Precision@3</strong></td>
          <td><strong>94.1%</strong></td>
          <td>N/A</td>
          <td>High recall in top 3 allows users to quickly correct errors via UI selection.</td>
      </tr>
      <tr>
          <td><strong>Precision@5</strong></td>
          <td><strong>94.6%</strong></td>
          <td>N/A</td>
          <td></td>
      </tr>
  </tbody>
</table>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Device</strong>: Apple iPad (iOS platform).</li>
<li><strong>Input</strong>: Touch/Pen-based input recording digital ink (x, y coordinates and pen-up/down events).</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Tang, P., Hui, S. C., &amp; Fu, C. W. (2013). Online Chemical Symbol Recognition for Handwritten Chemical Expression Recognition. <em>2013 IEEE/ACIS 12th International Conference on Computer and Information Science (ICIS)</em>, 535&ndash;540. <a href="https://doi.org/10.1109/ICIS.2013.6607894">https://doi.org/10.1109/ICIS.2013.6607894</a></p>
<p><strong>Publication</strong>: IEEE ICIS 2013</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{tangOnlineChemicalSymbol2013,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{Online Chemical Symbol Recognition for Handwritten Chemical Expression Recognition}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{2013 IEEE/ACIS 12th International Conference on Computer and Information Science (ICIS)}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Tang, Peng and Hui, Siu Cheung and Fu, Chi-Wing}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#ae81ff">2013</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">volume</span> = <span style="color:#e6db74">{22}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{535--540}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{IEEE}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1109/ICIS.2013.6607894}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item><item><title>ChemInk: Real-Time Recognition for Chemical Drawings</title><link>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/chemink-2011/</link><pubDate>Mon, 15 Dec 2025 00:00:00 +0000</pubDate><guid>https://hunterheidenreich.com/notes/chemistry/optical-structure-recognition/online-recognition/chemink-2011/</guid><description>A sketch recognition framework for chemical diagrams using a joint CRF model to combine multi-level visual features for real-time interpretation.</description><content:encoded><![CDATA[<h2 id="contribution-real-time-sketch-recognition-method">Contribution: Real-Time Sketch Recognition Method</h2>
<p>This is a <strong>Method</strong> paper. It proposes a novel architectural framework for sketch recognition that integrates visual features at three distinct levels (inkpoints, segments, symbols) into a single probabilistic model. The rhetorical structure centers on the proposal of this new architecture, the introduction of a specific &ldquo;trainable corner detector&rdquo; algorithm, and the validation of these methods against existing benchmarks and alternative toolsets (ChemDraw).</p>
<h2 id="motivation-bridging-the-gap-between-sketching-and-cad">Motivation: Bridging the Gap Between Sketching and CAD</h2>
<p>The primary motivation is to bridge the gap between the natural, efficient process of drawing chemical diagrams by hand and the cumbersome &ldquo;point-click-and-drag&rdquo; interactions required by CAD tools like ChemDraw. While chemists prefer sketching for communication, existing digital tools do not offer the same speed or ease of use. The goal is to build an intelligent system that understands freehand sketches in real-time, converting them into structured data suitable for analysis or search.</p>
<h2 id="core-innovation-hierarchical-joint-crf-model">Core Innovation: Hierarchical Joint CRF Model</h2>
<p>The core novelty lies in the <strong>hierarchical joint model</strong>. Unlike previous approaches that might treat stroke segmentation and symbol recognition as separate, isolated steps, ChemInk uses a <strong>Conditional Random Field (CRF)</strong> to jointly model dependencies across three levels:</p>
<ol>
<li><strong>Inkpoints</strong>: Local visual appearance.</li>
<li><strong>Segments</strong>: Stroke fragments separated by corners.</li>
<li><strong>Candidates</strong>: Potential symbol groupings.</li>
</ol>
<p>Additionally, the paper introduces a <strong>trainable corner detector</strong> that learns domain-specific corner definitions from data.</p>
<h2 id="experimental-design-and-baselines">Experimental Design and Baselines</h2>
<p>The authors conducted two primary evaluations:</p>
<ol>
<li><strong>Off-line Accuracy Evaluation</strong>:
<ul>
<li><strong>Dataset</strong>: 12 real-world organic compounds drawn by 10 participants.</li>
<li><strong>Metric</strong>: Recognition accuracy (Recall and Precision).</li>
<li><strong>Baseline</strong>: Comparison against their own previous work (O&amp;D 2009) and ablations (with/without context).</li>
</ul>
</li>
<li><strong>On-line User Study</strong>:
<ul>
<li><strong>Task</strong>: 9 participants (chemistry students) drew 5 diagrams using both ChemInk (Tablet PC) and ChemDraw (Mouse/Keyboard).</li>
<li><strong>Metric</strong>: Time to completion and subjective user ratings (speed/ease of use).</li>
</ul>
</li>
</ol>
<h2 id="results-accuracy-and-user-study-outcomes">Results: Accuracy and User Study Outcomes</h2>
<ul>
<li><strong>Accuracy</strong>: The system achieved <strong>97.4% symbol recognition accuracy</strong>, slightly outperforming the best prior result (97.1%). The trainable corner detector achieved <strong>99.91% recall</strong>.</li>
<li><strong>Speed</strong>: Users were <strong>twice as fast</strong> using ChemInk (avg. 36s) compared to ChemDraw (avg. 79s).</li>
<li><strong>Usability</strong>: Participants rated ChemInk significantly higher for speed (6.3 vs 4.5) and ease of use (6.3 vs 4.7) on a 7-point scale.</li>
<li><strong>Conclusion</strong>: Sketch recognition is a viable, superior alternative to standard CAD tools for authoring chemical diagrams.</li>
</ul>
<hr>
<h2 id="reproducibility-details">Reproducibility Details</h2>
<h3 id="data">Data</h3>
<ul>
<li><strong>Training/Test Data</strong>: 12 real-world organic compounds (e.g., Aspirin, Penicillin) drawn by 10 participants (organic chemistry familiar).</li>
<li><strong>Evaluation Split</strong>: User-independent cross-validation (training on 9 users, testing on 1).</li>
<li><strong>Input</strong>: Raw digital ink (strokes) collected on a Tablet PC.</li>
</ul>
<h3 id="algorithms">Algorithms</h3>
<p><strong>1. Corner Detection (Trainable)</strong></p>
<ul>
<li><strong>Method</strong>: Iterative vertex elimination.</li>
<li><strong>Cost Function</strong>: $\mathrm{cost}(p_{i}) = \sqrt{\mathrm{mse}(s_{i}; p_{i-1}, p_{i+1})} \cdot \mathrm{dist}(p_{i}; p_{i-1}, p_{i+1})$</li>
<li><strong>Procedure</strong>: Repeatedly remove the vertex with the lowest cost until the classifier (trained on features like cost, diagonal length, ink density) predicts the remaining vertices are corners.</li>
</ul>
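<p>A runnable sketch of the elimination loop, with two simplifications: the mse term is computed over the three vertices $p_{i-1}, p_i, p_{i+1}$ rather than the full ink segment, and a fixed cost threshold stands in for the trained classifier:</p>

```python
import numpy as np

def _dist_to_line(p, a, b):
    """Distance from point p to the line through a and b."""
    ab, ap = b - a, p - a
    denom = np.linalg.norm(ab)
    if denom == 0:
        return float(np.linalg.norm(ap))
    return float(abs(ab[0] * ap[1] - ab[1] * ap[0]) / denom)

def vertex_cost(pts, i):
    """cost(p_i) = sqrt(mse(s_i; p_{i-1}, p_{i+1})) * dist(p_i; p_{i-1}, p_{i+1})."""
    a, b = pts[i - 1], pts[i + 1]
    mse = np.mean([_dist_to_line(q, a, b) ** 2 for q in pts[i - 1:i + 2]])
    return np.sqrt(mse) * _dist_to_line(pts[i], a, b)

def detect_corners(points, threshold=1.0):
    """Iteratively remove the cheapest interior vertex until every remaining
    vertex's cost exceeds the threshold; survivors are the corners."""
    pts = [np.asarray(p, dtype=float) for p in points]
    while len(pts) > 2:
        costs = [vertex_cost(pts, i) for i in range(1, len(pts) - 1)]
        i_min = int(np.argmin(costs))
        if costs[i_min] > threshold:
            break
        del pts[i_min + 1]                # costs[i] belongs to vertex i + 1
    return pts
```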
<p><strong>2. Feature Extraction</strong></p>
<ul>
<li><strong>Inkpoints</strong>: Sampled at regular intervals. Features = $10 \times 10$ pixel orientation filters (0, 45, 90, 135 degrees) at two scales ($L/2$, $L$), smoothed and downsampled to $5 \times 5$. Total 400 features.</li>
<li><strong>Segments</strong>: Similar image features centered at segment midpoint, plus geometric features (length, ink density).</li>
<li><strong>Candidates</strong>: 5 feature images ($20 \times 20$) including an &ldquo;endpoint&rdquo; image, stretched to normalize aspect ratio.</li>
<li><strong>Dimensionality Reduction</strong>: PCA used to compress feature images to 256 components.</li>
</ul>
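<p>The PCA compression step can be sketched with an SVD, fitting the statistics on the training feature images themselves (names are illustrative):</p>

```python
import numpy as np

def fit_pca(X, k=256):
    """Fit PCA on rows of X; return the (k, d) projection matrix and mean.

    Rows of W are the top-k principal axes of the centered data.
    """
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return Vt[:k], mu

def pca_compress(X, W, mu):
    """Project feature images onto the fitted components."""
    return (X - mu) @ W.T
```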
<p><strong>3. Structure Generation</strong></p>
<ul>
<li><strong>Clustering</strong>: Agglomerative clustering with a complete-link metric to connect symbols.</li>
<li><strong>Threshold</strong>: Stop clustering at distance $0.4L$.</li>
</ul>
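<p>With 2-D symbol centroids as a stand-in for the paper's symbol-to-symbol distance, the clustering step maps directly onto SciPy's hierarchy tools ($L$ here is the length scale the threshold is expressed in):</p>

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def cluster_symbols(centers, L):
    """Complete-link agglomerative clustering, cut at distance 0.4 * L.

    Returns an integer cluster label per symbol (labels start at 1).
    """
    Z = linkage(pdist(np.asarray(centers, dtype=float)), method="complete")
    return fcluster(Z, t=0.4 * L, criterion="distance")
```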
<h3 id="models">Models</h3>
<p><strong>Conditional Random Field (CRF)</strong></p>
<ul>
<li><strong>Structure</strong>: 3-level hierarchy (Inkpoints $V_p$, Segments $V_s$, Candidates $V_c$).</li>
<li><strong>Nodes</strong>:
<ul>
<li>$V_p, V_s$ labels: &ldquo;bond&rdquo;, &ldquo;hash&rdquo;, &ldquo;wedge&rdquo;, &ldquo;text&rdquo;.</li>
<li>$V_c$ labels: specific candidate interpretations.</li>
</ul>
</li>
<li><strong>Edges/Potentials</strong>:
<ul>
<li><strong>Entity-Feature</strong>: $\phi(y, x)$ (Linear classifier).</li>
<li><strong>Consistency</strong>: $\psi(y_i, y_j)$ (Hard constraint: child must match parent label).</li>
<li><strong>Spatial Context</strong>: $\psi_{ss}(y_i, y_j)$ (Pairwise geometric relationships between segments: angle, distance).</li>
<li><strong>Overlap</strong>: Prevents conflicting candidates from sharing segments.</li>
</ul>
</li>
<li><strong>Inference</strong>: Loopy Belief Propagation (up to 100 iterations).</li>
<li><strong>Training</strong>: Maximum Likelihood via gradient ascent (L-BFGS).</li>
</ul>
<h3 id="evaluation">Evaluation</h3>
<ul>
<li><strong>Primary Metric</strong>: Accuracy (Recall/Precision) of symbol detection.</li>
<li><strong>Comparison</strong>: Compared against Ouyang &amp; Davis 2009 (previous SOTA).</li>
<li><strong>Speed Metric</strong>: Wall-clock time for diagram creation (ChemInk vs. ChemDraw).</li>
</ul>
<h3 id="hardware">Hardware</h3>
<ul>
<li><strong>Processor</strong>: 3.7 GHz processor (single thread) for base benchmarking (approx. 1 sec/sketch).</li>
<li><strong>Deployment</strong>: Validated on 1.8 GHz Tablet PCs using multi-core parallelization for real-time feedback.</li>
</ul>
<hr>
<h2 id="paper-information">Paper Information</h2>
<p><strong>Citation</strong>: Ouyang, T. Y., &amp; Davis, R. (2011). ChemInk: A Natural Real-Time Recognition System for Chemical Drawings. <em>Proceedings of the 16th International Conference on Intelligent User Interfaces</em>, 267&ndash;276. <a href="https://doi.org/10.1145/1943403.1943444">https://doi.org/10.1145/1943403.1943444</a></p>
<p><strong>Publication</strong>: IUI &rsquo;11</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bibtex" data-lang="bibtex"><span style="display:flex;"><span><span style="color:#a6e22e">@inproceedings</span>{ouyangChemInkNaturalRealtime2011,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span> = <span style="color:#e6db74">{ChemInk: A Natural Real-Time Recognition System for Chemical Drawings}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">shorttitle</span> = <span style="color:#e6db74">{ChemInk}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">booktitle</span> = <span style="color:#e6db74">{Proceedings of the 16th International Conference on Intelligent User Interfaces}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">author</span> = <span style="color:#e6db74">{Ouyang, Tom Y. and Davis, Randall}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">year</span> = <span style="color:#e6db74">{2011}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">month</span> = feb,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">pages</span> = <span style="color:#e6db74">{267--276}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">publisher</span> = <span style="color:#e6db74">{ACM}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">address</span> = <span style="color:#e6db74">{Palo Alto, CA, USA}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">doi</span> = <span style="color:#e6db74">{10.1145/1943403.1943444}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">isbn</span> = <span style="color:#e6db74">{978-1-4503-0419-1}</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">url</span> = <span style="color:#e6db74">{http://hdl.handle.net/1721.1/78898}</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content:encoded></item></channel></rss>