Paper Information

Citation: Ramel, J.-Y., Boissier, G., & Emptoz, H. (1999). Automatic Reading of Handwritten Chemical Formulas from a Structural Representation of the Image. Proceedings of the Fifth International Conference on Document Analysis and Recognition (ICDAR ’99), 83-86. https://doi.org/10.1109/ICDAR.1999.791730

Publication: ICDAR 1999

What kind of paper is this?

Method. This paper proposes a novel system architecture for document analysis. It introduces a specific pipeline (Global Perception followed by Incremental Extraction) and validates this “new strategy” with recognition rates on specific tasks. The core contribution is the shift from bitmap-based processing to a structural graph representation of graphical primitives.

What is the motivation?

  • Complexity of Freehand: Freehand drawings contain fluctuating lines and noise that make standard vectorization techniques difficult to apply directly.
  • Limitation of Bitmap Analysis: Most existing systems at the time attempted to interpret the document by working directly on the static bitmap image throughout the process.
  • Need for Context: Interpretation requires a dynamic resource that can evolve as “knowledge” is extracted (e.g., recognizing a polygon changes the context for its neighbors).

What is the novelty here?

The authors propose a Structural Representation as the “unique resource” for interpretation, rather than the original image.

  • Quadrilateral Primitives: Instead of simple vectors, the system builds “Quadrilaterals” (pairs of vectors) to represent thin shapes, which are robust to handwriting fluctuations.
  • Structural Graph: These primitives are organized into a graph where arcs represent geometric relationships (T-junctions, L-junctions, parallels).
  • Specialist Agents: Interpretation is driven by independent modules (“specialists”) that browse this graph recursively to identify high-level chemical entities like rings (polygons) or chains.
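To make the idea of a graph of quadrilateral primitives concrete, here is a minimal sketch of one possible in-memory layout. The class and method names (`Quadrilateral`, `StructuralGraph`, `relate`, `neighbours`) are hypothetical and do not come from the paper; only the notion of quadrilateral nodes linked by typed geometric arcs is taken from the summary above.

```python
from dataclasses import dataclass, field
from enum import Enum

Point = tuple[float, float]
Vector = tuple[Point, Point]


class Relation(Enum):
    """Arc labels; the five relation types named in the paper summary."""
    T = "T-junction"
    X = "intersection"
    PARALLEL = "parallel"
    L = "L-junction"
    S = "successive"


@dataclass(frozen=True)
class Quadrilateral:
    """A thin shape described by a pair of vectors (one per side of the stroke)."""
    side_a: Vector
    side_b: Vector


@dataclass
class StructuralGraph:
    """Nodes are quadrilaterals; arcs are undirected and carry a Relation label."""
    nodes: list[Quadrilateral] = field(default_factory=list)
    arcs: dict[int, list[tuple[int, Relation]]] = field(default_factory=dict)

    def add_node(self, quad: Quadrilateral) -> int:
        """Store a quadrilateral and return its node index."""
        self.nodes.append(quad)
        index = len(self.nodes) - 1
        self.arcs[index] = []
        return index

    def relate(self, i: int, j: int, relation: Relation) -> None:
        """Record a geometric relation between nodes i and j (both directions)."""
        self.arcs[i].append((j, relation))
        self.arcs[j].append((i, relation))

    def neighbours(self, i: int, relation: Relation) -> list[int]:
        """Indices of nodes linked to i by the given relation type."""
        return [j for j, r in self.arcs[i] if r is relation]


# Two strokes meeting at a corner, stored as an L-junction.
graph = StructuralGraph()
horizontal = graph.add_node(Quadrilateral(((0.0, 0.0), (10.0, 0.0)), ((0.0, 1.0), (10.0, 1.0))))
vertical = graph.add_node(Quadrilateral(((10.0, 0.0), (10.0, 8.0)), ((11.0, 0.0), (11.0, 8.0))))
graph.relate(horizontal, vertical, Relation.L)
```

A specialist would then query the graph, for example `graph.neighbours(horizontal, Relation.L)`, instead of going back to the pixels.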

What experiments were performed?

  • Validation Set: The system was tested on 20 handwritten documents containing chemical formulas.
  • Text Database: A separate base of 328 models was used for the text recognition component.
  • Metric: Recognition rates were calculated for both text components and graphical elements (chemical structures).

What outcomes/conclusions?

  • High Graphical Accuracy: The system achieved a 97% recognition rate for graphical parts (chemical structures such as rings and bonds).
  • Text Recognition: The text recognition module achieved a 93% success rate.
  • Robustness: The structural graph approach successfully handled “multiple liaisons, polygons, chains” and allowed for the progressive construction of a solution consistent with the context.

Reproducibility Details

Data

| Purpose | Dataset | Size | Notes |
|---|---|---|---|
| Evaluation | Handwritten Documents | 20 docs | Off-line documents at 300 dpi |
| Training | Character Models | 328 models | Used for the Pattern Matching text recognition base |

Algorithms

The interpretation process is divided into two distinct phases:

1. Global Perception (Graph Construction)

  • Vectorization: Contour tracking produces a chain of vectors, which are simplified via iterative polygonal approximation until fusion stabilizes (2-5 iterations).
  • Quadrilateral Formation: Vectors are paired to form quadrilaterals based on Euclidean distance and “empirical” alignment criteria.
  • Graph Generation: Quadrilaterals become nodes. Arcs are created based on “zones of influence” and classified into 5 types: T-junction, Intersection (X), Parallel (//), L-junction, and Successive (S).
  • Redraw Heuristic: A pre-processing step transforms T, X, and S junctions into L or // relations, as chemical drawings primarily consist of L-junctions and parallels.
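The vectorization step above (iterative polygonal approximation repeated until fusion stabilizes) can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the fusion criterion (perpendicular distance to the chord), the tolerance `tol`, and the function names are all hypothetical choices made for the example.

```python
import math

Point = tuple[float, float]


def point_to_line_distance(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def fuse_once(points: list[Point], tol: float) -> list[Point]:
    """One pass: fuse two consecutive vectors when their shared point lies
    within tol of the chord joining their outer ends (assumed criterion)."""
    n = len(points)
    if n < 3:
        return list(points)
    keep = [0]
    i = 1
    while i < n - 1:
        if point_to_line_distance(points[i], points[keep[-1]], points[i + 1]) <= tol:
            keep.append(i + 1)  # drop points[i]; its two vectors become one
            i += 2              # the fused end point is kept for this pass
        else:
            keep.append(i)
            i += 1
    if keep[-1] != n - 1:
        keep.append(n - 1)      # always keep the last point of the chain
    return [points[k] for k in keep]


def approximate(points: list[Point], tol: float = 0.3, max_iter: int = 10):
    """Repeat fusion passes until a pass changes nothing.
    Returns the simplified chain and the number of passes run."""
    current = list(points)
    for iteration in range(1, max_iter + 1):
        simplified = fuse_once(current, tol)
        if len(simplified) == len(current):
            return current, iteration
        current = simplified
    return current, max_iter


# A wobbly freehand "L": a shaky horizontal stroke followed by a shaky vertical one.
contour = [(0.0, 0.0), (1.0, 0.1), (2.0, -0.1), (3.0, 0.05),
           (4.0, 0.0), (4.0, 1.0), (4.05, 2.0), (4.0, 3.0)]
chain, passes = approximate(contour, tol=0.3)
```

Because each pass fuses only non-overlapping pairs of vectors, several passes are needed before the chain stops changing, which mirrors the small number of iterations reported above.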

2. Specialists (Interpretation)

  • Liaison Specialist: Scans the graph for // arcs or quadrilaterals with free extremities to identify bonds.
  • Polygon/Chain Specialist: Uses recursive look-left and look-right procedures. If a search returns to the start node after $n$ steps, a polygon is detected.
  • Text Localization: Clusters “short” quadrilaterals by physical proximity into “focus zones”. Zones are classified as text/non-text based on connected components.
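The polygon test described above (a search that comes back to its start node after $n$ steps) amounts to finding a closed loop in the graph. The sketch below is a simplified stand-in: it replaces the paper's look-left and look-right procedures, whose details are not given here, with a plain recursive walk over an adjacency map. The function name `find_polygon`, the `max_sides` bound, and the adjacency-dictionary input are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def find_polygon(adjacency: Dict[int, List[int]], start: int,
                 max_sides: int = 8) -> Optional[List[int]]:
    """Return a closed path of nodes through `start`, or None if there is none.

    adjacency maps a node id to the ids of the nodes it is joined to
    (for example by L-junctions). A loop needs at least three nodes.
    """
    def walk(node: int, path: List[int]) -> Optional[List[int]]:
        if len(path) > max_sides:
            return None  # give up on loops longer than max_sides
        for nxt in adjacency.get(node, ()):
            if nxt == start and len(path) >= 3:
                return path  # back at the start node: polygon with len(path) sides
            if nxt not in path:
                found = walk(nxt, path + [nxt])
                if found is not None:
                    return found
        return None

    return walk(start, [start])


# A six-sided ring (nodes 0..5) with one substituent bond (node 6) attached to node 0.
ring = {0: [1, 5, 6], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0], 6: [0]}
hexagon = find_polygon(ring, start=0)
dangling = find_polygon(ring, start=6)

# An open chain never returns to its start node.
chain = {0: [1], 1: [0, 2], 2: [1]}
open_result = find_polygon(chain, start=0)
```

Note that this walk returns the first loop it finds, which for fused rings may not be the smallest one; a real specialist would need an additional rule to pick the intended ring.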

Models

Text Recognition Hybrid:

  1. Normalization & Pattern Matching: A classic method using the database of 328 models.
  2. Structural Rule Base: Uses “significant” quadrilaterals (length $\ge 1/3$ of zone dimension) to verify characters. A rule base defines the expected count of horizontal, vertical, and diagonal lines for each character.
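As an illustration of the structural rule base, the sketch below counts the significant strokes of a text zone by orientation and compares the counts with a per-character rule. Several details are assumptions rather than facts from the paper: the "zone dimension" is taken to be the larger of the zone's width and height, the 20-degree angular tolerance is arbitrary, and the two entries in `RULES` are hypothetical examples, not the authors' rule base.

```python
import math

Segment = tuple[tuple[float, float], tuple[float, float]]

# Hypothetical rules: expected number of significant strokes per orientation.
RULES = {
    "H": {"horizontal": 1, "vertical": 2, "diagonal": 0},
    "N": {"horizontal": 0, "vertical": 2, "diagonal": 1},
}


def length(seg: Segment) -> float:
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def orientation(seg: Segment, tol_deg: float = 20.0) -> str:
    """Classify a stroke by its angle to the horizontal axis (0 to 90 degrees)."""
    (x1, y1), (x2, y2) = seg
    angle = math.degrees(math.atan2(abs(y2 - y1), abs(x2 - x1)))
    if angle <= tol_deg:
        return "horizontal"
    if angle >= 90.0 - tol_deg:
        return "vertical"
    return "diagonal"


def stroke_profile(segments: list[Segment], zone_w: float, zone_h: float) -> dict[str, int]:
    """Count significant strokes: those at least 1/3 of the zone dimension long."""
    threshold = max(zone_w, zone_h) / 3.0  # assumed reading of "zone dimension"
    counts = {"horizontal": 0, "vertical": 0, "diagonal": 0}
    for seg in segments:
        if length(seg) >= threshold:
            counts[orientation(seg)] += 1
    return counts


def verify(candidate: str, segments: list[Segment], zone_w: float, zone_h: float) -> bool:
    """True when the stroke counts match the rule for the candidate character."""
    return RULES.get(candidate) == stroke_profile(segments, zone_w, zone_h)


# Strokes of a handwritten "H" in a 6 x 10 zone, plus one short noise tick.
strokes = [((0.0, 0.0), (0.0, 10.0)), ((6.0, 0.0), (6.0, 10.0)),
           ((0.0, 5.0), (6.0, 5.0)), ((1.0, 1.0), (2.0, 1.0))]
is_h = verify("H", strokes, zone_w=6.0, zone_h=10.0)
is_n = verify("N", strokes, zone_w=6.0, zone_h=10.0)
```

In a hybrid setup like the one described, such a check would serve to confirm or reject the character proposed by pattern matching.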

Evaluation

| Metric | Value | Baseline | Notes |
|---|---|---|---|
| Graphical Element Recognition | ~97% | N/A | Evaluated on 20 documents (Fig. 7 examples) |
| Text Recognition | ~93% | N/A | Evaluated on 20 documents |

Citation

@inproceedings{ramelAutomaticReadingHandwritten1999,
  title = {Automatic Reading of Handwritten Chemical Formulas from a Structural Representation of the Image},
  booktitle = {Proceedings of the {{Fifth International Conference}} on {{Document Analysis}} and {{Recognition}}. {{ICDAR}} '99 ({{Cat}}. {{No}}.{{PR00318}})},
  author = {Ramel, J.-Y. and Boissier, G. and Emptoz, H.},
  year = 1999,
  pages = {83--86},
  publisher = {IEEE},
  address = {Bangalore, India},
  doi = {10.1109/ICDAR.1999.791730},
  isbn = {978-0-7695-0318-9}
}