Paper Information

Citation: Ouyang, T. Y., & Davis, R. (2007). Recognition of Hand Drawn Chemical Diagrams. Proceedings of the 22nd National Conference on Artificial Intelligence (AAAI-07), 846-851.

Publication: AAAI 2007

What kind of paper is this?

This is a Method paper. It proposes a novel architecture for interpreting hand-drawn diagrams that integrates a trainable symbol recognizer with a domain-specific verification step. The authors validate the method through an ablation study comparing the full system against a baseline lacking domain knowledge.

What is the motivation?

Software for specifying chemical structures at the time of publication (e.g., ChemDraw, IsisDraw) relied on mouse and keyboard interfaces, which lack the speed, ease of use, and naturalness of drawing on paper. The goal is to bridge the gap between natural expression and computer interpretation by building a system that understands freehand chemical sketches.

What is the novelty here?

The primary novelty is the integration of domain knowledge (specifically chemical valence rules) directly into the interpretation loop to resolve ambiguities and correct errors.

Specific technical contributions include:

  • Hybrid Recognizer: Combines feature-based SVMs, image-based template matching (modified Tanimoto), and off-the-shelf handwriting recognition to handle the mix of geometry and text.
  • Domain Verification Loop: A post-processing step that checks the chemical validity of the structure (e.g., nitrogen must have 3 bonds). If an inconsistency is found, the system searches the space of alternative hypotheses generated during the initial parsing phase to find a valid interpretation.
  • Contextual Parsing: Uses a sliding window (up to 7 strokes) and spatial context to parse interspersed symbols rather than requiring users to finish one symbol before starting the next.

What experiments were performed?

The authors conducted a user study to evaluate the system’s robustness on unconstrained sketches.

  • Participants: 6 users familiar with organic chemistry.
  • Task: Each user drew 12 pre-specified molecular compounds on a Tablet PC.
  • Conditions: The system was evaluated in two modes:
    1. Domain: The full system with chemical valence checks.
    2. Baseline: A simplified version with no knowledge of chemical valence/verification.
  • Data Split: Evaluated on collected sketches using a leave-one-out style approach (training on 11 examples from the same users).

What were the outcomes and conclusions drawn?

  • Performance: The full system achieved an overall F-measure of 0.87 (Precision 0.86, Recall 0.89).
  • Impact of Domain Knowledge: Using domain knowledge reduced the overall error rate by 27% relative to the baseline, where error is measured as missed symbols ($1 - \text{recall}$): recall rose from 0.85 to 0.89. The improvement was statistically significant ($p < .05$).
  • Error Recovery: The system successfully recovered from interpretations that were geometrically plausible but chemically impossible (e.g., misinterpreting “N” as bonds), as illustrated in their qualitative analysis.
  • Limitations: The system struggled with “messy” sketches where users drew single bonds with multiple strokes or over-traced lines, as the current bond recognizer assumes single-stroke straight bonds.
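
The headline numbers can be cross-checked against the rounded precision and recall values listed in the Evaluation section. The sketch below is only a sanity check on those rounded figures, not the authors' evaluation code; the function names are illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def relative_error_reduction(baseline_recall: float, system_recall: float) -> float:
    """Relative drop in the miss rate (1 - recall) from baseline to full system."""
    baseline_error = 1.0 - baseline_recall
    system_error = 1.0 - system_recall
    return (baseline_error - system_error) / baseline_error


# Rounded values as reported in the summary's Evaluation table.
full_f = f_measure(0.86, 0.89)
baseline_f = f_measure(0.81, 0.85)
reduction = relative_error_reduction(0.85, 0.89)

print(f"full F = {full_f:.2f}, baseline F = {baseline_f:.2f}, reduction = {reduction:.0%}")
```

With the rounded inputs, the miss rate falls from 0.15 to 0.11, which is where the roughly 27% figure comes from.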

Reproducibility Details

Data

The study collected a custom dataset of hand-drawn diagrams.

  • Volume: 6 participants $\times$ 12 molecules = 72 total sketches (implied).
  • Preprocessing:
    • Scale Normalization: The system estimates scale based on the average length of straight bonds (chosen because they are easy to identify). This normalizes geometric features for the classifier.
    • Stroke Segmentation: Poly-line approximation by recursive splitting (minimizing least-squares error), which breaks multi-segment strokes (e.g., connected bonds drawn in one stroke) into straight-line primitives.
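
As a rough illustration of recursive splitting, the sketch below divides a stroke at the point farthest from the current chord whenever the summed squared deviation exceeds a threshold. The split rule, the `max_sq_error` threshold, and all function names are assumptions made for this example; the summary does not give the paper's exact criterion.

```python
import math

Point = tuple[float, float]


def chord_distance(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the line through a and b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    chord = math.hypot(bx - ax, by - ay)
    if chord == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / chord


def split_stroke(points: list[Point], max_sq_error: float = 4.0) -> list[int]:
    """Return indices of the poly-line vertices approximating the stroke."""
    if len(points) < 2:
        return list(range(len(points)))

    def recurse(lo: int, hi: int) -> list[int]:
        if hi - lo < 2:
            return [lo]
        dists = [chord_distance(points[i], points[lo], points[hi])
                 for i in range(lo + 1, hi)]
        # Accept the chord if its summed squared error is small enough.
        if sum(d * d for d in dists) <= max_sq_error:
            return [lo]
        # Otherwise split at the point that deviates most from the chord.
        split = lo + 1 + max(range(len(dists)), key=dists.__getitem__)
        return recurse(lo, split) + recurse(split, hi)

    return recurse(0, len(points) - 1) + [len(points) - 1]


# Two connected bonds drawn as one L-shaped stroke with the corner at index 10.
l_shape = [(float(x), 0.0) for x in range(11)] + [(10.0, float(y)) for y in range(1, 11)]
print(split_stroke(l_shape))  # [0, 10, 20]
```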

Algorithms

1. Ink Parsing (Sliding Window)

  • Examines all combinations of up to $n=7$ sequential strokes.
  • Classifies each group as a valid symbol or invalid garbage.
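
One way to read the sliding window is as an enumeration of every run of temporally consecutive strokes up to the window size. The sketch below assumes that "sequential" means contiguous in drawing order; `candidate_groups` is a hypothetical helper, and the classifier that labels each group as a symbol or as invalid is not shown.

```python
def candidate_groups(num_strokes: int, max_group: int = 7) -> list[tuple[int, ...]]:
    """Enumerate every run of 1..max_group consecutive stroke indices."""
    groups = []
    for start in range(num_strokes):
        for size in range(1, max_group + 1):
            end = start + size
            if end > num_strokes:
                break
            groups.append(tuple(range(start, end)))
    return groups


print(candidate_groups(3))
# [(0,), (0, 1), (0, 1, 2), (1,), (1, 2), (2,)]
```

Under this reading the number of candidates grows linearly with the number of strokes (at most 7 per starting stroke), which keeps exhaustive classification of every group cheap.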

2. Template Matching (Image-based)

  • Used for resolving ambiguities in text/symbols (e.g., ‘H’ vs ‘N’).
  • Metric: Modified Tanimoto coefficient. Unlike standard Tanimoto (point overlap), this version accounts for relative angle and curvature at each point.
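
For reference, the standard Tanimoto coefficient on two sets of inked pixels is the size of their intersection divided by the size of their union. The sketch below implements only this standard form; the angle and curvature terms of the modified version are not reproduced, since the summary does not specify them. The function and type names are illustrative.

```python
Pixel = tuple[int, int]


def tanimoto(a: set[Pixel], b: set[Pixel]) -> float:
    """Standard Tanimoto coefficient: |A ∩ B| / |A ∪ B| over inked pixels."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty images
    return len(a & b) / union


template = {(0, 0), (0, 1), (1, 1)}
candidate = {(0, 1), (1, 1), (2, 1)}
print(tanimoto(template, candidate))  # 0.5
```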

3. Domain Verification

  • Trigger: An element with incorrect valence (e.g., Hydrogen with >1 bond).
  • Resolution: Searches stored alternative hypotheses for the affected strokes. It accepts a new hypothesis if it resolves the valence error without introducing new ones.
  • Constraint: The system keeps an inconsistent structure if the original hypothesis scores significantly higher than every alternative, on the assumption that the user is still drawing or has intentionally left the structure incomplete.
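
The trigger, resolution, and constraint steps can be sketched as a small search over stored alternatives. This is a simplified illustration, not the authors' implementation: it treats each hypothesis as a whole-sketch interpretation, flags an atom only when its bond count exceeds a typical valence, and uses a hypothetical `margin` parameter to stand in for "significantly higher" confidence.

```python
from dataclasses import dataclass

# Typical valences from general chemistry (an assumption, not the paper's table).
VALENCE = {"H": 1, "O": 2, "N": 3, "C": 4}


@dataclass(frozen=True)
class Hypothesis:
    """One interpretation of the sketch."""
    atoms: tuple[tuple[str, int], ...]  # (element, number of attached bonds)
    score: float                        # recognizer confidence, higher is better


def valence_errors(h: Hypothesis) -> int:
    """Count atoms whose bond count exceeds the element's typical valence."""
    return sum(1 for element, bonds in h.atoms
               if bonds > VALENCE.get(element, bonds))


def verify(best: Hypothesis, alternatives: list[Hypothesis],
           margin: float = 0.5) -> Hypothesis:
    """Swap in a chemically valid alternative unless it is far less confident."""
    if valence_errors(best) == 0:
        return best
    for alt in sorted(alternatives, key=lambda h: h.score, reverse=True):
        if best.score - alt.score > margin:
            break  # original is much more confident: keep it as drawn
        if valence_errors(alt) == 0:
            return alt
    return best


# An "N" misread as "H" leaves a hydrogen with three bonds.
misread = Hypothesis(atoms=(("H", 3), ("C", 4)), score=0.9)
corrected = Hypothesis(atoms=(("N", 3), ("C", 4)), score=0.8)
unlikely = Hypothesis(atoms=(("N", 3), ("C", 4)), score=0.1)

print(verify(misread, [corrected]).atoms[0][0])  # N
print(verify(misread, [unlikely]).atoms[0][0])   # H
```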

Models

Symbol Recognizer (Discriminative Classifier)

  • Type: Support Vector Machine (SVM).
  • Classes: Element letters, straight bonds, hash bonds, wedge bonds, invalid groups.
  • Input Features:
    1. Number of strokes
    2. Bounding-box dimensions (width, height, diagonal)
    3. Ink density (ink length / diagonal length)
    4. Inter-stroke distance (max distance between strokes in group)
    5. Inter-stroke orientation (vector of relative orientations)
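
A minimal sketch of how the first four features might be computed for a stroke group is shown below. The exact definitions are assumptions: the distance between two strokes is taken as their closest point-to-point distance, lengths are divided by a `scale` estimate such as the average bond length, and the orientation vector is omitted because its construction is not described here.

```python
import math
from itertools import combinations

Point = tuple[float, float]
Stroke = list[Point]


def ink_length(stroke: Stroke) -> float:
    """Total path length of one stroke."""
    return sum(math.dist(p, q) for p, q in zip(stroke, stroke[1:]))


def stroke_gap(a: Stroke, b: Stroke) -> float:
    """Closest point-to-point distance between two strokes (assumed definition)."""
    return min(math.dist(p, q) for p in a for q in b)


def group_features(strokes: list[Stroke], scale: float = 1.0) -> dict[str, float]:
    """Geometric features for one candidate group of strokes."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    width = (max(xs) - min(xs)) / scale
    height = (max(ys) - min(ys)) / scale
    diagonal = math.hypot(width, height)
    ink = sum(ink_length(s) for s in strokes) / scale
    gaps = [stroke_gap(a, b) / scale for a, b in combinations(strokes, 2)]
    return {
        "num_strokes": float(len(strokes)),
        "width": width,
        "height": height,
        "diagonal": diagonal,
        "ink_density": ink / diagonal if diagonal > 0 else 0.0,
        "max_stroke_gap": max(gaps, default=0.0),
    }


# Two parallel horizontal strokes, as in a double bond.
double_bond = [[(0.0, 0.0), (3.0, 0.0)], [(0.0, 4.0), (3.0, 4.0)]]
features = group_features(double_bond)
print(features)
```

A dictionary like this would be flattened into a fixed-order vector before being passed to an SVM.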

Text Recognition

  • Microsoft Tablet PC SDK: Used for recognizing alphanumeric characters (elements and subscripts).
  • Integrated with the SVM and Template Matcher via a combined scoring mechanism.

Evaluation

| Metric | Value (Overall) | Baseline Comparison | Notes |
|---|---|---|---|
| Precision | 0.86 | 0.81 (Baseline) | Full system vs. no domain knowledge |
| Recall | 0.89 | 0.85 (Baseline) | 27% error reduction |
| F-Measure | 0.87 | 0.83 (Baseline) | Statistically significant ($p < .05$) |

  • True Positive Definition: Match in both location (stroke grouping) and classification (label).
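
Under this definition, a predicted symbol counts only if both its stroke grouping and its label match a ground-truth symbol exactly. The sketch below shows one way to score a sketch that way; the `Symbol` representation and the example labels are assumptions for illustration.

```python
Symbol = tuple[frozenset[int], str]  # (stroke ids in the group, class label)


def precision_recall(predicted: set[Symbol], truth: set[Symbol]) -> tuple[float, float]:
    """Score predictions where a hit needs the same strokes and the same label."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Ground truth: a three-stroke "N" followed by two bonds.
truth = {
    (frozenset({0, 1, 2}), "N"),
    (frozenset({3}), "bond"),
    (frozenset({4}), "bond"),
}
# Prediction: the "N" was misread as three separate bonds.
predicted = {(frozenset({i}), "bond") for i in range(5)}

precision, recall = precision_recall(predicted, truth)
print(precision, recall)
```

In this example a single grouping mistake costs three false positives and one miss, which illustrates why segmentation errors weigh heavily on both metrics.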

Hardware

  • Device: 1.5GHz Tablet PC.
  • Performance: Real-time feedback.

Citation

@inproceedings{ouyang2007recognition,
  title={Recognition of Hand Drawn Chemical Diagrams},
  author={Ouyang, Tom Y and Davis, Randall},
  booktitle={Proceedings of the 22nd National Conference on Artificial Intelligence},
  volume={1},
  pages={846--851},
  year={2007}
}