OCSR Methods: A Taxonomy of Approaches

Overview

Optical Chemical Structure Recognition (OCSR) aims to automatically extract machine-readable molecular representations (e.g., SMILES, InChI, MOL files) from images of chemical structures. Methods have evolved from early rule-based systems to modern deep learning approaches.

This note groups OCSR methods by their fundamental approach: deep learning (subdivided by output paradigm), traditional machine learning, and rule-based systems. Within each group, papers are listed chronologically.

Review & Survey Papers

Surveys and systematization-of-knowledge papers that organize and synthesize the OCSR literature.

Year	Paper	Notes	Focus
2020	A review of optical chemical structure recognition tools	Rajan et al. 2020	Survey of 30 years of OCSR development (1990-2019); benchmark of three open-source tools (OSRA, Imago, MolVec) on four datasets
2022	Review of techniques and models used in optical chemical structure recognition	Musazade et al. 2022	Systematization of OCSR evolution from rule-based systems to modern deep learning; identifies paradigm shift to image captioning and critiques evaluation metrics
2024	Comparing software tools for optical chemical structure recognition	Krasnov et al. 2024	Benchmark of 8 open-access tools on 2,702 manually curated patent images; proposes ChemIC classifier for hybrid routing approach

Deep Learning Methods

End-to-end neural network architectures that learn to map images directly to molecular representations.

Note on Paper Types: Papers listed below are primarily Method ($\Psi_{\text{Method}}$) papers focused on novel architectures and performance improvements. Some also have secondary Resource ($\Psi_{\text{Resource}}$) contributions through released tools or datasets. See the AI and Physical Sciences paper taxonomy for classification details.

Image-to-Sequence Paradigm

Treating chemical structure recognition as an image captioning task, these methods use encoder-decoder architectures (often with attention mechanisms) to generate sequential molecular representations like SMILES directly from pixels.

Year	Paper	Notes	Architecture
2019	Molecular Structure Extraction From Documents Using Deep Learning	Staker et al. Notes	U-Net segmentation + CNN-GridLSTM encoder-decoder with attention
2020	DECIMER: towards deep learning for chemical image recognition	DECIMER Notes	Inception V3 encoder + GRU decoder with attention
2021	ChemPix: automated recognition of hand-drawn hydrocarbon structures	ChemPix Notes	CNN encoder + LSTM decoder with attention
2021	DECIMER 1.0: deep learning for chemical image recognition using transformers	DECIMER 1.0 Notes	EfficientNet-B3 encoder + Transformer decoder with SELFIES output
2021	End-to-End Attention-based Image Captioning	ViT-InChI Transformer Notes	Vision Transformer encoder + Transformer decoder with InChI output
2021	Img2Mol - accurate SMILES recognition from molecular graphical depictions	Img2Mol Notes	CNN encoder + pre-trained CDDD decoder for continuous embedding
2021	IMG2SMI: Translating Molecular Structure Images to SMILES	IMG2SMI Notes	ResNet-101 encoder + Transformer decoder with SELFIES output
2022	Automated Recognition of Chemical Molecule Images Based on an Improved TNT Model	ICMDT Notes	Deep TNT encoder + Transformer decoder with InChI output
2022	Image2SMILES: Transformer-Based Molecular Optical Recognition Engine	Image2SMILES Notes	ResNet-50 encoder + Transformer decoder with FG-SMILES output
2022	MICER: a pre-trained encoder-decoder architecture for molecular image captioning	MICER Notes	Fine-tuned ResNet101 encoder + LSTM decoder with attention
2022	Performance of chemical structure string representations for chemical image recognition using transformers	Rajan String Representations	Comparative ablation: SMILES vs DeepSMILES vs SELFIES vs InChI
2022	SwinOCSR: end-to-end optical chemical structure recognition using a Swin Transformer	SwinOCSR Notes	Swin Transformer encoder + Transformer decoder with DeepSMILES output
2023	Handwritten Chemical Structure Image to Structure-Specific Markup Using Random Conditional Guided Decoder	Hu et al. RCGD Notes	DenseNet encoder + GRU decoder with attention and SSML output
2023	DECIMER.ai: an open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications	DECIMER.ai Notes	EfficientNet-V2-M encoder + Transformer decoder with SMILES output
2024	ChemReco: automated recognition of hand-drawn carbon-hydrogen-oxygen structures using deep learning	ChemReco Notes	EfficientNet encoder + Transformer decoder with SMILES output
2024	Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture	Enhanced DECIMER Notes	EfficientNet-V2-M encoder + Transformer decoder with SMILES output
2024	Image2InChI: Automated Molecular Optical Image Recognition	Image2InChI Notes	Improved SwinTransformer encoder + Transformer decoder with InChI output
2024	MMSSC-Net: multi-stage sequence cognitive networks for drug molecule recognition	MMSSC-Net Notes	SwinV2 encoder + GPT-2 decoder with MLP for multi-stage cognition
2024	RFL: Simplifying Chemical Structure Recognition with Ring-Free Language	RFL Notes	DenseNet encoder + GRU decoder with hierarchical ring decomposition
2025	Dual-Path Global Awareness Transformer for Optical Chemical Structure Recognition	DGAT Notes	ResNet-101 encoder + Transformer with CGFE/SDGLA modules and SMILES output
2025	GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition	GTR-CoT Notes	Qwen-VL 2.5 3B encoder-decoder with graph traversal chain-of-thought and SMILES output
2025	MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild	MolParser Notes	Swin Transformer encoder + BART decoder with Extended SMILES (E-SMILES) output
2025	MolSight: OCSR with SMILES Pretraining, Multi-Granularity Learning and Reinforcement Learning	MolSight Notes	EfficientViT-L1 encoder + Transformer decoder with RL (GRPO) and SMILES output
2025	OCSU: Optical Chemical Structure Understanding for Molecule-centric Scientific Discovery	OCSU Notes	Mol-VL: Qwen2-VL encoder-decoder with multi-task learning for multi-level understanding
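
The sequence decoders in the table above emit a molecule one token at a time, so each method needs a vocabulary over its chosen string format. The following is a minimal sketch of atom-level SMILES tokenization, assuming a regular expression of the kind commonly used for SMILES language modeling. The helper names (`tokenize`, `build_vocab`, `encode`, `decode`) and the special tokens are illustrative assumptions, not taken from any of the listed papers, and methods that output SELFIES, DeepSMILES, or InChI would need a different tokenizer.

```python
import re

# Atom-level SMILES pattern (an assumption for illustration; each listed
# method defines its own vocabulary and may tokenize differently).
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]"            # bracket atoms such as [C@H] or [NH4+]
    r"|Br|Cl"                 # two-letter organic-subset atoms
    r"|[BCNOPSFI]|[bcnops]"   # one-letter aliphatic and aromatic atoms
    r"|%\d{2}|\d"             # ring-closure labels
    r"|[=#\-+\\/().:~@?*$])"  # bonds, branches, dots, wildcards
)

SPECIALS = ["<pad>", "<sos>", "<eos>"]  # hypothetical special tokens


def tokenize(smiles):
    """Split a SMILES string into tokens, rejecting unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"untokenizable characters in {smiles!r}")
    return tokens


def build_vocab(corpus):
    """Map every token seen in the corpus (plus specials) to an integer id."""
    tokens = sorted({tok for s in corpus for tok in tokenize(s)})
    return {tok: idx for idx, tok in enumerate(SPECIALS + tokens)}


def encode(smiles, vocab):
    """Turn a SMILES string into the id sequence a decoder would be trained on."""
    return [vocab["<sos>"]] + [vocab[t] for t in tokenize(smiles)] + [vocab["<eos>"]]


def decode(ids, vocab):
    """Turn predicted ids back into a SMILES string, stopping at <eos>."""
    inverse = {idx: tok for tok, idx in vocab.items()}
    out = []
    for idx in ids:
        tok = inverse[idx]
        if tok == "<eos>":
            break
        if tok not in ("<pad>", "<sos>"):
            out.append(tok)
    return "".join(out)
```

Treating `Cl`, `Br`, and bracket atoms as single tokens keeps the decoder from having to learn that `C` followed by `l` is one atom, which is one reason atom-level vocabularies are usually preferred over character-level ones.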

Image-to-Graph Paradigm

Methods that explicitly construct molecular graphs as intermediate representations, identifying atoms as vertices and bonds as edges before converting to standard molecular formats.

Year	Paper	Notes	Architecture
2020	ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep Learning	ChemGrapher Notes	U-Net-based semantic segmentation + graph building algorithm + classification CNNs
2022	ABC-Net: A divide-and-conquer based deep learning architecture for SMILES recognition	ABC-Net Notes	U-Net-style FCN with keypoint detection heatmaps + multi-task property prediction
2022	Image-to-Graph Transformers for Chemical Structure Recognition	Image-to-Graph Transformers Notes	ResNet-34 encoder + Transformer encoder + Graph-Aware Transformer (GRAT) decoder
2022	MolMiner: You Only Look Once for Chemical Structure Recognition	MolMiner Notes	MobileNetV2 segmentation + YOLOv5 object detection + EasyOCR + graph construction
2023	MolGrapher: Graph-based Visual Recognition of Chemical Structures	MolGrapher Notes	ResNet-18 keypoint detector + supergraph construction + GNN classifier
2023	MolScribe: Robust Molecular Structure Recognition with Image-To-Graph Generation	MolScribe Notes	Swin Transformer encoder + Transformer decoder with explicit atom coordinates and bond prediction
2024	Atom-Level Optical Chemical Structure Recognition with Limited Supervision	AtomLenz Notes	Faster R-CNN object detection + graph constructor with weakly supervised training (ProbKT*)
2024	MolNexTR: a generalized deep learning model for molecular image recognition	MolNexTR Notes	Dual-stream (ConvNext + ViT) encoder + Transformer decoder with graph generation
2025	MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures	MarkushGrapher Notes	UDOP VTL encoder + MolScribe OCSR encoder + T5 decoder with CXSMILES + substituent table
2025	MolMole: Molecule Mining from Scientific Literature	MolMole Notes	ViDetect (DINO) + ViReact (RxnScribe) + ViMore (detection-based) unified page-level pipeline
2025	OCSU: Optical Chemical Structure Understanding for Molecule-centric Scientific Discovery	OCSU Notes	DoubleCheck: MolScribe + attentive feature enhancement with local ambiguous atom refinement
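
Once atoms and bonds have been predicted, the graph still has to be serialized into a standard format. The sketch below shows one way this could be done for SMILES with a depth-first traversal: tree edges become chains and branches, and the remaining edges become ring-closure digits. It is a toy illustration under several assumptions (Kekulé bond orders 1 to 3, no charges, no stereochemistry, no aromatic lowercase atoms, fewer than 100 ring closures); the function name `graph_to_smiles` and the input layout are hypothetical, and in practice this step would typically be delegated to a cheminformatics toolkit.

```python
from collections import defaultdict
from itertools import count

BOND_SYMBOL = {1: "", 2: "=", 3: "#"}
ORGANIC_SUBSET = {"B", "C", "N", "O", "P", "S", "F", "Cl", "Br", "I"}


def graph_to_smiles(atoms, bonds):
    """atoms: list of element symbols; bonds: list of (i, j, order) tuples."""
    adj = defaultdict(list)
    for i, j, order in bonds:
        adj[i].append((j, order))
        adj[j].append((i, order))

    children = defaultdict(list)  # DFS spanning tree: atom -> [(child, order)]
    closures = defaultdict(list)  # atom -> [(ring number, bond symbol)]
    visited, seen_edges = set(), set()
    ring_numbers = count(1)

    def explore(u):
        # Pass 1: split edges into tree edges and ring-closure edges.
        visited.add(u)
        for v, order in adj[u]:
            edge = frozenset((u, v))
            if edge in seen_edges:
                continue
            seen_edges.add(edge)
            if v in visited:
                # Edge back to an ancestor: both ends get the same ring number.
                n = next(ring_numbers)
                closures[u].append((n, BOND_SYMBOL[order]))
                closures[v].append((n, ""))
            else:
                children[u].append((v, order))
                explore(v)

    def write(u):
        # Pass 2: emit the atom, its ring labels, then its branches.
        out = atoms[u] if atoms[u] in ORGANIC_SUBSET else f"[{atoms[u]}]"
        for n, symbol in closures[u]:
            out += symbol + (str(n) if n < 10 else f"%{n}")
        kids = children[u]
        for k, (v, order) in enumerate(kids):
            part = BOND_SYMBOL[order] + write(v)
            # Every child except the last is written as a parenthesized branch.
            out += part if k == len(kids) - 1 else f"({part})"
        return out

    roots = []
    for a in range(len(atoms)):
        if a not in visited:
            explore(a)
            roots.append(a)
    # Disconnected fragments are joined with a dot.
    return ".".join(write(r) for r in roots)


# Kekulé benzene: six carbons with alternating double bonds.
benzene = graph_to_smiles(
    ["C"] * 6,
    [(0, 1, 2), (1, 2, 1), (2, 3, 2), (3, 4, 1), (4, 5, 2), (5, 0, 1)],
)
print(benzene)  # C1=CC=CC=C1
```

The output is a valid but non-canonical SMILES string, so comparing predictions against ground truth would still require canonicalization by a toolkit. Note also that atoms outside the organic subset are wrapped in brackets, which in SMILES means they carry no implicit hydrogens.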

Image-to-Fingerprint Paradigm

Methods that bypass molecular graph reconstruction entirely, generating molecular fingerprints directly from images through functional group recognition and spatial analysis. These approaches prioritize retrieval and similarity search over exact structure reconstruction.

Year	Paper	Notes	Architecture
2025	SubGrapher: visual fingerprinting of chemical structures	SubGrapher Notes	Dual Mask-RCNN instance segmentation (1,534 groups + 27 backbones) + substructure-graph + SVMF fingerprint
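
To make the retrieval use case concrete, the sketch below ranks a small library against a query using Tanimoto similarity over sets of detected substructures. This is a generic illustration, not SubGrapher's SVMF fingerprint: the feature names, the library contents, and the helper names `tanimoto` and `rank` are all hypothetical, and a real fingerprint would also encode counts and spatial relations rather than simple set membership.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of substructure labels."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints are scored 0.0 here; this is a convention choice.
    return len(a & b) / union if union else 0.0


def rank(query, library):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query, feats)) for name, feats in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical substructure detections for three reference molecules.
library = {
    "aspirin": {"benzene_ring", "carboxylic_acid", "ester"},
    "benzoic_acid": {"benzene_ring", "carboxylic_acid"},
    "ethanol": {"hydroxyl"},
}

# Hypothetical detections from a query image.
query = {"benzene_ring", "carboxylic_acid"}

for name, score in rank(query, library):
    print(f"{name}: {score:.2f}")
```

Because the comparison never requires a complete molecular graph, a partially recognized image can still retrieve close matches, which is the trade-off this paradigm makes against exact structure reconstruction.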

Image Classification and Filtering

Methods that classify chemical structure images for preprocessing purposes, such as detecting Markush structures or other problematic inputs that should be filtered before full OCSR processing.

Year	Paper	Notes	Architecture
2023	One Strike, You’re Out: Detecting Markush Structures in Low Signal-to-Noise Ratio Images	Jurriaans et al. Notes	Patch-based pipeline with Inception V3 or ResNet18 for binary classification

Traditional Machine Learning Methods

Hybrid approaches combining classical machine learning algorithms (neural networks, SVMs, CRFs) with domain-specific heuristics and image processing. These methods (primarily from 1992-2014) used ML for specific subtasks like character recognition or symbol classification while relying on rule-based systems for chemical structure interpretation.

Year	Paper	Notes	Key ML Component
1992	Kekulé: OCR-Optical Chemical (Structure) Recognition	Kekulé Notes	Multilayer perceptron for OCR
1996	Automatic Interpretation of Chemical Structure Diagrams	Kekulé-1 Notes	Neural network with shared weights (proto-CNN)
2007	Recognition of Hand Drawn Chemical Diagrams	Ouyang-Davis Notes	SVM for symbol classification
2008	Chemical Ring Handwritten Recognition Based on Neural Networks	Hewahi et al. Notes	Two-phase classifier-recognizer with feed-forward NNs
2008	Recognition of On-line Handwritten Chemical Expressions	Yang et al. Notes	Two-level algorithm with edit distance matching
2008	A Study of On-Line Handwritten Chemical Expressions Recognition	Yang et al. Notes	ANN with two-level substance recognition
2009	A Unified Framework for Recognizing Handwritten Chemical Expressions	Chang et al. Notes	GMM for spatial relations, NN for bond verification
2009	HMM-Based Online Recognition of Handwritten Chemical Symbols	Zhang et al. Notes	Hidden Markov Model for online handwriting
2009	The Understanding and Structure Analyzing for Online Handwritten Chemical Formulas	Wang et al. Notes	HMM for text recognition + CFG for structure parsing
2010	A SVM-HMM Based Online Classifier for Handwritten Chemical Symbols	Zhang et al. Notes	Dual-stage SVM-HMM with PSR algorithm
2011	ChemInk: A Natural Real-Time Recognition System for Chemical Drawings	ChemInk Notes	Conditional Random Field (CRF) joint model
2013	Online Chemical Symbol Recognition for Handwritten Chemical Expression Recognition	Tang et al. Notes	SVM with elastic matching for handwriting
2014	Markov Logic Networks for Optical Chemical Structure Recognition	MLOCSR Notes	Markov Logic Network for probabilistic inference

Rule-Based Methods

Classic approaches using heuristics, image processing, and domain-specific rules. While some systems use traditional OCR engines (which may contain ML components), the chemical structure recognition itself is purely algorithmic.

Note: The chemoCR systems use SVM-based OCR but employ rule-based topology-preserving vectorization for core structure reconstruction, placing them primarily in this category.

Core Methods

Year	Paper	Notes
1990	Computational Perception and Recognition of Digitized Molecular Structures	Contreras et al. Notes
1993	Chemical Literature Data Extraction: The CLiDE Project	CLiDE Notes
1993	Optical Recognition of Chemical Graphics	Casey et al. Notes
1999	Automatic Reading of Handwritten Chemical Formulas from a Structural Representation of the Image	Ramel et al. Notes
2007	Automatic Recognition of Chemical Images	chemoCR Notes
2007	Reconstruction of Chemical Molecules from Images	chemoCR Notes
2009	Automated extraction of chemical structure information from digital raster images	ChemReader Notes
2009	CLiDE Pro: The Latest Generation of CLiDE, a Tool for Optical Chemical Structure Recognition	CLiDE Pro Notes
2009	Optical Structure Recognition Software To Recover Chemical Information: OSRA, An Open Source Solution	OSRA Notes
2012	Chemical Structure Recognition: A Rule Based Approach	MolRec Notes
2015	Research on Chemical Expression Images Recognition	Hong et al. Notes

TREC 2011 Chemistry Track

The TREC 2011 Chemistry Track provided a standardized benchmark for comparing OCSR systems, introducing the novel Image-to-Structure task alongside Prior Art and Technology Survey tasks. Papers from this evaluation are grouped here.

System	Paper	Notes
chemoCR	Chemical Structure Reconstruction with chemoCR	chemoCR Notes
ChemReader	Image-to-Structure Task by ChemReader	ChemReader at TREC 2011 Notes
Imago	Imago: open-source toolkit for 2D chemical structure image recognition	Imago Notes
OSRA	Optical Structure Recognition Application entry in Image2Structure task	OSRA at TREC 2011 Notes
MolRec	Performance of MolRec at TREC 2011 Overview and Analysis of Results	MolRec at TREC Notes
ChemInfty	Robust Method of Segmentation and Recognition of Chemical Structure Images in ChemInfty	ChemInfty Notes

CLEF 2012 Chemistry Track

The CLEF-IP 2012 benchmarking lab introduced three specific IR tasks in the intellectual property domain: claims-based passage retrieval, flowchart recognition, and chemical structure recognition. The chemical structure recognition task included both segmentation (identifying bounding boxes) and recognition (converting to MOL format) subtasks, with a particular focus on challenging Markush structures common in patents.

System	Paper	Notes
MolRec	MolRec at CLEF 2012 - Overview and Analysis of Results	MolRec at CLEF 2012 Notes
OSRA	Optical Structure Recognition Application entry to CLEF-IP 2012	OSRA at CLEF-IP 2012 Notes

Content Details
Category	Computational Chemistry
Date	December 2025
