HMM-based Online Recognition of Chemical Symbols

Paper Information

Title: HMM-Based Online Recognition of Handwritten Chemical Symbols

Authors: Yang Zhang, Guangshun Shi, Jufeng Yang

Publication: ICDAR 2009

DOI: 10.1109/ICDAR.2009.99

What kind of paper is this?

This is a Method paper that proposes a specific algorithmic pipeline for the online recognition of handwritten chemical symbols. The core contribution is the engineering of an 11-dimensional feature vector combined with a Hidden Markov Model (HMM) architecture. The paper validates this method through quantitative experiments on a custom dataset, focusing on recognition accuracy as the primary metric.

What is the motivation?

Recognizing chemical symbols is challenging because chemical expressions have a complex structure and pen-based input often produces broken or conglutinate (merged) strokes. Variations in writing style and random noise add further difficulty. While online recognition of Western characters and CJK scripts is well developed, work specifically targeting online chemical symbol recognition is scarce; most prior research focuses on offline recognition or global optimization.

What is the novelty here?

The primary novelty is the application of continuous HMMs specifically to the domain of online chemical symbol recognition, utilizing a specialized set of 11-dimensional local features. While HMMs have been used for other scripts, this paper tailors the feature extraction (including curliness, linearity, and writing direction) to capture the specific geometric properties of chemical symbols.

What experiments were performed?

The authors collected a dedicated dataset for this task from 20 participants (college teachers and students).

Dataset: 64 distinct symbols (digits, English letters, Greek letters, operators)
Volume: 7,808 total samples (122 per symbol), split into 5,670 training samples and 2,016 testing samples
Model Sweeps: They evaluated the HMM performance by varying the number of states (4, 6, 8) and the number of Gaussians per state (3, 4, 6, 9, 12)

What were the outcomes and conclusions drawn?

Performance: The best configuration (6 states, 9 Gaussians) achieved a top-1 accuracy of 89.5% and a top-3 accuracy of 98.7%
Scaling: Accuracy generally improved as the number of states and Gaussians increased, at the cost of computational efficiency
Error Analysis: The primary sources of error were shape similarities between specific characters (e.g., ‘0’ vs ‘O’ vs ‘o’, and ‘C’ vs ‘c’ vs ‘(’)

Reproducibility Details

Status: Closed / Very Low Reproducibility. This 2009 study relies on a private, custom-collected dataset and does not provide source code, model weights, or an open-access preprint.

Artifacts

| Artifact | Type | License | Notes |
|---|---|---|---|
| None publicly available | N/A | N/A | No open-source code, open datasets, or open-access preprints were released with this publication. |

Data

The study utilized a custom dataset collected in a laboratory environment.

| Purpose | Dataset | Size | Notes |
|---|---|---|---|
| Training | Custom Chemical Symbol Set | 5,670 samples | 90 samples per symbol |
| Testing | Custom Chemical Symbol Set | 2,016 samples | 32 samples per symbol |

Dataset Composition: The set includes 64 symbols: digits (0-9), uppercase letters (A-Z except Q), a selection of lowercase letters (a-z), Greek letters ($\alpha$, $\beta$, $\gamma$, $\pi$), and operators ($+$, $=$, $\rightarrow$, $\uparrow$, $\downarrow$, $($, $)$).
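The sample counts quoted in this summary can be cross-checked with simple arithmetic. The sketch below is not from the paper; it only multiplies the figures given above (variable names are illustrative). The per-symbol counts reproduce the 7,808 total, whereas the quoted split sizes sum to 7,686, which corresponds to 63 symbols at 122 samples each. Which set of figures matches the paper is not resolved here.

```python
# Figures as quoted in this summary (not independently verified).
n_symbols = 64
per_symbol_train = 90
per_symbol_test = 32
quoted_train_total = 5670
quoted_test_total = 2016

# Totals implied by the per-symbol counts.
implied_total = n_symbols * (per_symbol_train + per_symbol_test)
implied_train = n_symbols * per_symbol_train
implied_test = n_symbols * per_symbol_test

# Total implied by the quoted split sizes, and the symbol count it would match.
quoted_total = quoted_train_total + quoted_test_total
symbols_matching_quoted = quoted_total / (per_symbol_train + per_symbol_test)

print(implied_total, implied_train, implied_test)
print(quoted_total, symbols_matching_quoted)
```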

Algorithms

1. Preprocessing

The raw tablet data undergoes a 6-step pipeline:

Duplicate Point Elimination: Removing sequential points with identical coordinates
Broken Stroke Connection: Using Bezier curves to interpolate missing points/connect broken strokes
Hook Elimination: Removing artifacts at the start/end of strokes characterized by short length and sharp angle changes
Smoothing: Reducing noise from erratic pen movement
Re-sampling: Spacing points equidistantly to remove temporal variation
Size Normalization: Removing variation in writing scale
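Three of the steps above (duplicate elimination, re-sampling, and size normalization) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the linear interpolation used for re-sampling, and the choice to scale the larger bounding-box side to 1 while preserving aspect ratio are all assumptions made here.

```python
import math

def remove_duplicates(points):
    """Drop consecutive points with identical coordinates."""
    out = []
    for p in points:
        if not out or p != out[-1]:
            out.append(p)
    return out

def resample(points, spacing):
    """Re-sample a stroke so samples lie `spacing` apart along the pen path."""
    if len(points) < 2:
        return list(points)
    out = [points[0]]
    dist_to_next = spacing  # path length left before the next sample is emitted
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        pos = 0.0  # distance already consumed on this segment
        while seg - pos >= dist_to_next:
            pos += dist_to_next
            t = pos / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            dist_to_next = spacing
        dist_to_next -= seg - pos
    return out

def normalize_size(points):
    """Translate to the origin and scale the larger bounding-box side to 1 (assumed convention)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    scale = max(max(xs) - x_min, max(ys) - y_min) or 1.0  # avoid dividing by zero for a single dot
    return [((x - x_min) / scale, (y - y_min) / scale) for x, y in points]

stroke = [(0.0, 0.0), (0.0, 0.0), (10.0, 0.0)]
clean = remove_duplicates(stroke)
resampled = resample(clean, 2.5)
normalized = normalize_size(resampled)
```

Re-sampling by path length rather than by time is what removes the dependence on writing speed: after this step, the number of points in a stroke reflects its length only.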

2. Feature Extraction (11 Dimensions)

Features are extracted from a 5-point window centered on $t$ ($t-2$ to $t+2$). The 11 dimensions are:

Normalized Vertical Position: $y(t)$ mapped to $[0,1]$
Normalized First Derivative ($x'$): Calculated via weighted sum of neighbors
Normalized First Derivative ($y'$): Calculated via weighted sum of neighbors
Normalized Second Derivative ($x''$): Computed using $x'$ values
Normalized Second Derivative ($y''$): Computed using $y'$ values
Curvature: $\frac{x' y'' - x'' y'}{(x'^2 + y'^2)^{3/2}}$
Writing Direction (Cos): $\cos \alpha(t)$ based on vector from $t-1$ to $t+1$
Writing Direction (Sin): $\sin \alpha(t)$
Aspect Ratio: Ratio of height to width in the 5-point window
Curliness: Deviation from the straight line connecting the first and last point of the window
Linearity: Average squared distance of points in the window to the straight line connecting start/end points
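A few of these local features can be illustrated on a single 5-point window. The sketch below is an assumption-laden illustration, not the paper's code: the function names are hypothetical, the derivative weights are not reproduced (the curvature function simply takes precomputed derivatives), and curliness is quantified here as path length over chord length minus 1, which is one simple way to measure deviation from a straight line and may differ from the paper's exact formula.

```python
import math

def writing_direction(window):
    """cos and sin of the direction from point t-1 to point t+1 (t is index 2)."""
    (xa, ya), (xb, yb) = window[1], window[3]
    dx, dy = xb - xa, yb - ya
    ds = math.hypot(dx, dy)
    if ds == 0.0:
        return 0.0, 0.0  # no movement; fallback value chosen arbitrarily
    return dx / ds, dy / ds

def curvature(dx, dy, ddx, ddy):
    """Signed curvature (x'y'' - x''y') / (x'^2 + y'^2)^(3/2) from given derivatives."""
    denom = (dx * dx + dy * dy) ** 1.5
    return (dx * ddy - ddx * dy) / denom if denom else 0.0

def _chord(window):
    """Start point, chord vector, and chord length between first and last point."""
    (x0, y0), (x1, y1) = window[0], window[-1]
    return x0, y0, x1 - x0, y1 - y0, math.hypot(x1 - x0, y1 - y0)

def linearity(window):
    """Average squared distance of the window's points to the first-to-last line."""
    x0, y0, cx, cy, chord = _chord(window)
    if chord == 0.0:
        return 0.0
    # Perpendicular distance via the 2D cross product with the chord vector.
    squared = [((cx * (y - y0) - cy * (x - x0)) / chord) ** 2 for x, y in window]
    return sum(squared) / len(window)

def curliness(window):
    """Assumed definition: path length / chord length - 1 (0 for a straight line)."""
    chord = _chord(window)[-1]
    path = sum(math.hypot(bx - ax, by - ay)
               for (ax, ay), (bx, by) in zip(window, window[1:]))
    return path / chord - 1.0 if chord else 0.0

straight = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
bent = [(0, 0), (1, 1), (2, 2), (3, 1), (4, 0)]
```

On the straight window both linearity and curliness are zero, while the bent window scores positive on both, which is the distinction these two features are meant to capture.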

3. Feature Normalization

Each feature vector $v_t$ in the final feature matrix $V$ is normalized to zero mean and unit standard deviation using the mean vector $\mu$ and the covariance matrix $\Sigma$: $o_t = \Sigma^{-1/2}(v_t - \mu)$.
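To see why this transform yields zero mean and unit variance, it helps to write out the statistics. The derivation below assumes that $\mu$ and $\Sigma$ are the sample mean and sample covariance over $T$ feature vectors and that $\Sigma$ is invertible; $T$ is notation introduced here, and the summary does not state over which samples the paper estimates these quantities.

$$
\mu = \frac{1}{T}\sum_{t=1}^{T} v_t, \qquad
\Sigma = \frac{1}{T}\sum_{t=1}^{T} (v_t - \mu)(v_t - \mu)^{\top}
$$

$$
\frac{1}{T}\sum_{t=1}^{T} o_t = \Sigma^{-1/2}\left(\frac{1}{T}\sum_{t=1}^{T} v_t - \mu\right) = 0, \qquad
\frac{1}{T}\sum_{t=1}^{T} o_t o_t^{\top} = \Sigma^{-1/2}\,\Sigma\,\Sigma^{-1/2} = I
$$

Under these assumptions the transformed features are not only scaled to unit variance but also decorrelated, since their covariance is the identity matrix.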

Models

Architecture: Continuous Hidden Markov Models (HMM)
Topology: Left-to-right (Bakis model)
Initialization: Initial distribution $\pi = \{1, 0, \ldots, 0\}$; uniform transition matrix $A$; segmental k-means for observation matrix $B$
Training: Baum-Welch re-estimation
Decision: Maximum likelihood classification ($\hat{\lambda} = \arg\max_{\lambda} P(O \mid \lambda)$)
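The decision rule can be sketched with the forward algorithm in the log domain. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, `max_jump=2` and "uniform over the allowed left-to-right transitions" are assumed readings of the Bakis topology with uniform initialization, and the emission model is a toy one-dimensional unit-variance Gaussian per state standing in for the paper's Gaussian mixtures over 11-dimensional features. Baum-Welch training is omitted.

```python
import math

def bakis_transitions(n_states, max_jump=2):
    """Left-to-right matrix: from state i, uniform over states i..i+max_jump (assumed)."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        targets = range(i, min(i + max_jump, n_states - 1) + 1)
        for j in targets:
            A[i][j] = 1.0 / len(targets)
    return A

def _log(p):
    return math.log(p) if p > 0.0 else -math.inf

def _logsumexp(values):
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_likelihood(obs, pi, A, log_emission):
    """Forward algorithm in the log domain; returns log P(O | lambda)."""
    n = len(pi)
    alpha = [_log(pi[j]) + log_emission(j, obs[0]) for j in range(n)]
    for o in obs[1:]:
        alpha = [
            _logsumexp([alpha[i] + _log(A[i][j]) for i in range(n)]) + log_emission(j, o)
            for j in range(n)
        ]
    return _logsumexp(alpha)

def gaussian_emission(means):
    """Toy emission model: one unit-variance 1-D Gaussian per state."""
    def log_emission(state, o):
        return -0.5 * (o - means[state]) ** 2 - 0.5 * math.log(2.0 * math.pi)
    return log_emission

def classify(obs, models):
    """models maps label -> (pi, A, log_emission); returns the maximum-likelihood label."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))

pi = [1.0, 0.0]  # always start in the first state, as in the initialization above
A2 = bakis_transitions(2)
models = {
    "low": (pi, A2, gaussian_emission([0.0, 1.0])),
    "high": (pi, A2, gaussian_emission([5.0, 6.0])),
}
print(classify([0.1, 0.9, 1.1], models))
```

Working in the log domain avoids numerical underflow on long observation sequences, which matters once each symbol is represented by dozens of re-sampled points.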

Evaluation

| Metric | Best Value | Configuration | Notes |
|---|---|---|---|
| Top-1 Accuracy | 89.5% | 6 States, 9 Gaussians | Highest reported accuracy |
| Top-3 Accuracy | 98.7% | 6 States, 9 Gaussians | Top-3 candidate accuracy |
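Top-1 and top-3 accuracy differ only in how many ranked candidates are checked against the true label. The sketch below shows the computation on made-up rankings; the function name and the toy data are illustrative and do not come from the paper.

```python
def top_k_accuracy(ranked_predictions, labels, k):
    """Fraction of samples whose true label is among the k highest-ranked candidates."""
    hits = sum(1 for ranked, truth in zip(ranked_predictions, labels) if truth in ranked[:k])
    return hits / len(labels)

# Toy rankings (best candidate first), echoing the confusable shapes noted earlier.
ranked = [["O", "0", "o"], ["C", "c", "("], ["a", "b", "c"]]
truth = ["0", "C", "x"]

top1 = top_k_accuracy(ranked, truth, 1)
top3 = top_k_accuracy(ranked, truth, 3)
```

A large gap between the two metrics, as reported here, is consistent with errors concentrated among visually similar symbols: the correct label is usually ranked near the top even when it is not ranked first.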

Citation

@inproceedings{zhang2009hmm,
  title={HMM-Based Online Recognition of Handwritten Chemical Symbols},
  author={Zhang, Yang and Shi, Guangshun and Yang, Jufeng},
  booktitle={2009 10th International Conference on Document Analysis and Recognition},
  pages={1255--1259},
  year={2009},
  organization={IEEE},
  doi={10.1109/ICDAR.2009.99}
}
