Overview

Optical Chemical Structure Recognition (OCSR) aims to automatically extract machine-readable molecular representations (e.g., SMILES, InChI, MOL files) from images of chemical structures. Methods have evolved from early rule-based systems to modern deep learning approaches.

This note organizes OCSR methods by their fundamental approach, providing a framework for understanding the landscape of techniques.

Review & Survey Papers

Comprehensive surveys and systematization-of-knowledge papers that organize and synthesize the OCSR literature.

| Year | Paper | Notes | Focus |
| --- | --- | --- | --- |
| 2020 | A review of optical chemical structure recognition tools | Rajan et al. 2020 | Survey of 30 years of OCSR development (1990-2019); benchmark of three open-source tools (OSRA, Imago, MolVec) on four datasets |
| 2022 | Review of techniques and models used in optical chemical structure recognition | Musazade et al. 2022 | Systematization of OCSR evolution from rule-based systems to modern deep learning; identifies paradigm shift to image captioning and critiques evaluation metrics |
| 2024 | Comparing software tools for optical chemical structure recognition | Krasnov et al. 2024 | Benchmark of 8 open-access tools on 2,702 manually curated patent images; proposes ChemIC classifier for hybrid routing approach |

Deep Learning Methods

End-to-end neural network architectures that learn to map images directly to molecular representations.

Note on Paper Types: Papers listed below are primarily Method ($\Psi_{\text{Method}}$) papers focused on novel architectures and performance improvements. Some also have secondary Resource ($\Psi_{\text{Resource}}$) contributions through released tools or datasets. See the AI and Physical Sciences paper taxonomy for classification details.

Image-to-Sequence Paradigm

Treating chemical structure recognition as an image captioning task, these methods use encoder-decoder architectures (often with attention mechanisms) to generate sequential molecular representations like SMILES directly from pixels.

→ See Image-to-Sequence OCSR: A Comparative Analysis for detailed comparison of architectures, output formats, training data, and hardware requirements across these methods.
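
To make the captioning framing concrete, the sketch below shows how a SMILES string might be turned into the token sequence a decoder is trained to emit. It is a minimal illustration, not the tokenizer of any method listed here: the regular expression, the special tokens (`<pad>`, `<sos>`, `<eos>`), and the helper names `tokenize`, `build_vocab`, `encode`, and `decode` are all assumptions made for this example.

```python
import re

# Assumed token pattern (not taken from any listed paper): bracket atoms,
# two-letter halogens, %NN ring closures, single-letter atoms, ring digits,
# then bond, branch, and stereo symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%[0-9]{2}|[A-Za-z]|[0-9]|[()=#+\-\\/.:@~*$]"
)

# Hypothetical special tokens for padding, sequence start, and sequence end.
SPECIAL = ["<pad>", "<sos>", "<eos>"]


def tokenize(smiles):
    """Split a SMILES string into tokens, rejecting unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in {smiles!r}")
    return tokens


def build_vocab(corpus):
    """Map every token seen in the corpus (plus specials) to an integer id."""
    symbols = sorted({tok for s in corpus for tok in tokenize(s)})
    return {tok: i for i, tok in enumerate(SPECIAL + symbols)}


def encode(smiles, vocab):
    """Produce the id sequence a decoder would be trained to generate."""
    body = [vocab[tok] for tok in tokenize(smiles)]
    return [vocab["<sos>"]] + body + [vocab["<eos>"]]


def decode(ids, vocab):
    """Turn generated ids back into a string, stopping at <eos>."""
    inverse = {i: tok for tok, i in vocab.items()}
    out = []
    for i in ids:
        tok = inverse[i]
        if tok == "<eos>":
            break
        if tok not in SPECIAL:
            out.append(tok)
    return "".join(out)


vocab = build_vocab(["CC(=O)O", "ClCCBr", "c1ccccc1"])
ids = encode("CC(=O)O", vocab)
print(decode(ids, vocab))  # CC(=O)O
```

Treating `Cl` and `Br` as single tokens keeps the decoder from having to learn that `C` followed by `l` is one atom; formats such as SELFIES or DeepSMILES would need their own tokenization rules.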

| Year | Paper | Notes | Architecture |
| --- | --- | --- | --- |
| 2019 | Molecular Structure Extraction From Documents Using Deep Learning | Staker et al. Notes | U-Net segmentation + CNN-GridLSTM encoder-decoder with attention |
| 2020 | DECIMER: towards deep learning for chemical image recognition | DECIMER Notes | Inception V3 encoder + GRU decoder with attention |
| 2021 | ChemPix: automated recognition of hand-drawn hydrocarbon structures | ChemPix Notes | CNN encoder + LSTM decoder with attention |
| 2021 | DECIMER 1.0: deep learning for chemical image recognition using transformers | DECIMER 1.0 Notes | EfficientNet-B3 encoder + Transformer decoder with SELFIES output |
| 2021 | End-to-End Attention-based Image Captioning | ViT-InChI Transformer Notes | Vision Transformer encoder + Transformer decoder with InChI output |
| 2021 | Img2Mol - accurate SMILES recognition from molecular graphical depictions | Img2Mol Notes | CNN encoder + pre-trained CDDD decoder for continuous embedding |
| 2021 | IMG2SMI: Translating Molecular Structure Images to SMILES | IMG2SMI Notes | ResNet-101 encoder + Transformer decoder with SELFIES output |
| 2022 | Automated Recognition of Chemical Molecule Images Based on an Improved TNT Model | ICMDT Notes | Deep TNT encoder + Transformer decoder with InChI output |
| 2022 | Image2SMILES: Transformer-Based Molecular Optical Recognition Engine | Image2SMILES Notes | ResNet-50 encoder + Transformer decoder with FG-SMILES output |
| 2022 | MICER: a pre-trained encoder-decoder architecture for molecular image captioning | MICER Notes | Fine-tuned ResNet101 encoder + LSTM decoder with attention |
| 2022 | Performance of chemical structure string representations for chemical image recognition using transformers | Rajan String Representations | Comparative ablation: SMILES vs DeepSMILES vs SELFIES vs InChI |
| 2022 | SwinOCSR: end-to-end optical chemical structure recognition using a Swin Transformer | SwinOCSR Notes | Swin Transformer encoder + Transformer decoder with DeepSMILES output |
| 2023 | Handwritten Chemical Structure Image to Structure-Specific Markup Using Random Conditional Guided Decoder | Hu et al. RCGD Notes | DenseNet encoder + GRU decoder with attention and SSML output |
| 2023 | DECIMER.ai: an open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications | DECIMER.ai Notes | EfficientNet-V2-M encoder + Transformer decoder with SMILES output |
| 2024 | ChemReco: automated recognition of hand-drawn carbon-hydrogen-oxygen structures using deep learning | ChemReco Notes | EfficientNet encoder + Transformer decoder with SMILES output |
| 2024 | Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture | Enhanced DECIMER Notes | EfficientNet-V2-M encoder + Transformer decoder with SMILES output |
| 2024 | Image2InChI: Automated Molecular Optical Image Recognition | Image2InChI Notes | Improved SwinTransformer encoder + Transformer decoder with InChI output |
| 2024 | MMSSC-Net: multi-stage sequence cognitive networks for drug molecule recognition | MMSSC-Net Notes | SwinV2 encoder + GPT-2 decoder with MLP for multi-stage cognition |
| 2024 | RFL: Simplifying Chemical Structure Recognition with Ring-Free Language | RFL Notes | DenseNet encoder + GRU decoder with hierarchical ring decomposition |
| 2025 | Dual-Path Global Awareness Transformer for Optical Chemical Structure Recognition | DGAT Notes | ResNet-101 encoder + Transformer with CGFE/SDGLA modules and SMILES output |
| 2025 | GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition | GTR-CoT Notes | Qwen-VL 2.5 3B encoder-decoder with graph traversal chain-of-thought and SMILES output |
| 2025 | MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild | MolParser Notes | Swin Transformer encoder + BART decoder with Extended SMILES (E-SMILES) output |
| 2025 | MolSight: OCSR with SMILES Pretraining, Multi-Granularity Learning and Reinforcement Learning | MolSight Notes | EfficientViT-L1 encoder + Transformer decoder with RL (GRPO) and SMILES output |
| 2025 | OCSU: Optical Chemical Structure Understanding for Molecule-centric Scientific Discovery | OCSU Notes | Mol-VL: Qwen2-VL encoder-decoder with multi-task learning for multi-level understanding |

Image-to-Graph Paradigm

Methods that explicitly construct molecular graphs as intermediate representations, identifying atoms as vertices and bonds as edges before converting to standard molecular formats.

→ See Image-to-Graph OCSR: A Comparative Analysis for detailed comparison of graph construction paradigms, architectures, coordinate prediction strategies, and benchmark performance across these methods.
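
The final step of this paradigm, converting a predicted graph into a standard format, can be sketched as follows. This is a toy illustration under stated assumptions, not the graph-construction code of any listed method: the `MolGraph` class and `to_smiles` function are hypothetical names, hydrogens are left implicit, and only acyclic, connected graphs are handled (ring closures are deliberately out of scope).

```python
from dataclasses import dataclass, field

BOND_SYMBOL = {1: "", 2: "=", 3: "#"}
# Atoms that SMILES allows to be written without brackets.
ORGANIC_SUBSET = {"B", "C", "N", "O", "P", "S", "F", "Cl", "Br", "I"}


@dataclass
class MolGraph:
    """Hypothetical container: atoms are vertices, bonds are edges."""
    atoms: list = field(default_factory=list)   # element symbol per vertex
    bonds: dict = field(default_factory=dict)   # frozenset({i, j}) -> order

    def add_atom(self, symbol):
        self.atoms.append(symbol)
        return len(self.atoms) - 1

    def add_bond(self, i, j, order=1):
        if i == j:
            raise ValueError("an atom cannot be bonded to itself")
        self.bonds[frozenset((i, j))] = order

    def neighbors(self, i):
        found = []
        for pair, order in self.bonds.items():
            if i in pair:
                (j,) = pair - {i}
                found.append((j, order))
        return sorted(found)


def to_smiles(graph, start=0):
    """Depth-first SMILES writer for acyclic, connected graphs only."""
    visited = set()

    def walk(i, parent):
        visited.add(i)
        parts = []
        for j, order in graph.neighbors(i):
            if j == parent:
                continue
            if j in visited:
                raise ValueError("rings are not handled in this sketch")
            parts.append(BOND_SYMBOL[order] + walk(j, i))
        symbol = graph.atoms[i]
        if symbol not in ORGANIC_SUBSET:
            symbol = f"[{symbol}]"
        # Every child except the last is written as a parenthesised branch.
        branches = "".join(f"({p})" for p in parts[:-1])
        tail = parts[-1] if parts else ""
        return symbol + branches + tail

    smiles = walk(start, None)
    if len(visited) != len(graph.atoms):
        raise ValueError("graph is disconnected")
    return smiles


# Acetic acid: a methyl carbon, a carbonyl carbon, and two oxygens.
acid = MolGraph()
c1, c2, o1, o2 = (acid.add_atom(s) for s in ["C", "C", "O", "O"])
acid.add_bond(c1, c2)
acid.add_bond(c2, o1, order=2)
acid.add_bond(c2, o2)
print(to_smiles(acid))  # CC(=O)O
```

In practice a cheminformatics toolkit would handle rings, aromaticity, stereochemistry, and valence checks; the point here is only that once atoms and bonds are explicit, serialization is a deterministic graph traversal.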

| Year | Paper | Notes | Architecture |
| --- | --- | --- | --- |
| 2020 | ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep Learning | ChemGrapher Notes | U-Net-based semantic segmentation + graph building algorithm + classification CNNs |
| 2022 | ABC-Net: A divide-and-conquer based deep learning architecture for SMILES recognition | ABC-Net Notes | U-Net-style FCN with keypoint detection heatmaps + multi-task property prediction |
| 2022 | Image-to-Graph Transformers for Chemical Structure Recognition | Image-to-Graph Transformers Notes | ResNet-34 encoder + Transformer encoder + Graph-Aware Transformer (GRAT) decoder |
| 2022 | MolMiner: You Only Look Once for Chemical Structure Recognition | MolMiner Notes | MobileNetV2 segmentation + YOLOv5 object detection + EasyOCR + graph construction |
| 2023 | MolGrapher: Graph-based Visual Recognition of Chemical Structures | MolGrapher Notes | ResNet-18 keypoint detector + supergraph construction + GNN classifier |
| 2023 | MolScribe: Robust Molecular Structure Recognition with Image-To-Graph Generation | MolScribe Notes | Swin Transformer encoder + Transformer decoder with explicit atom coordinates and bond prediction |
| 2024 | Atom-Level Optical Chemical Structure Recognition with Limited Supervision | AtomLenz Notes | Faster R-CNN object detection + graph constructor with weakly supervised training (ProbKT*) |
| 2024 | MolNexTR: a generalized deep learning model for molecular image recognition | MolNexTR Notes | Dual-stream (ConvNext + ViT) encoder + Transformer decoder with graph generation |
| 2025 | MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures | MarkushGrapher Notes | UDOP VTL encoder + MolScribe OCSR encoder + T5 decoder with CXSMILES + substituent table |
| 2025 | MolMole: Molecule Mining from Scientific Literature | MolMole Notes | ViDetect (DINO) + ViReact (RxnScribe) + ViMore (detection-based) unified page-level pipeline |
| 2025 | OCSU: Optical Chemical Structure Understanding for Molecule-centric Scientific Discovery | OCSU Notes | DoubleCheck: MolScribe + attentive feature enhancement with local ambiguous atom refinement |

Image-to-Fingerprint Paradigm

Methods that bypass molecular graph reconstruction entirely, generating molecular fingerprints directly from images through functional group recognition and spatial analysis. These approaches prioritize retrieval and similarity search over exact structure reconstruction.
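
As a rough illustration of how a fingerprint supports retrieval without an exact structure, the sketch below ranks a small library by Tanimoto similarity over sets of detected substructure labels. The set-based Tanimoto coefficient is a generic stand-in chosen for this example and is not the SVMF fingerprint used by SubGrapher; the label names, library contents, and the functions `tanimoto` and `rank_by_similarity` are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_by_similarity(query, library, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(tanimoto(query, fp), name) for name, fp in library.items()]
    # Sort by descending score, breaking ties alphabetically by name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored[:top_k]]


# Hypothetical substructure labels, as if detected in an input image.
query = {"benzene", "carboxylic_acid", "ester"}

# Hypothetical reference library keyed by compound name.
library = {
    "aspirin": {"benzene", "carboxylic_acid", "ester"},
    "benzoic_acid": {"benzene", "carboxylic_acid"},
    "ethyl_acetate": {"ester"},
}

for name, score in rank_by_similarity(query, library):
    print(f"{name}: {score:.2f}")
```

Because the comparison never requires a valid molecular graph, a partially wrong detection still yields a usable ranking, which is the trade-off this paradigm makes against exact reconstruction.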

| Year | Paper | Notes | Architecture |
| --- | --- | --- | --- |
| 2025 | SubGrapher: visual fingerprinting of chemical structures | SubGrapher Notes | Dual Mask-RCNN instance segmentation (1,534 groups + 27 backbones) + substructure-graph + SVMF fingerprint |

Image Classification and Filtering

Methods that classify chemical structure images for preprocessing purposes, such as detecting Markush structures or other problematic inputs that should be filtered before full OCSR processing.

| Year | Paper | Notes | Architecture |
| --- | --- | --- | --- |
| 2023 | One Strike, You’re Out: Detecting Markush Structures in Low Signal-to-Noise Ratio Images | Jurriaans et al. Notes | Patch-based pipeline with Inception V3 or ResNet18 for binary classification |
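
One way a patch-based filter could sit in front of an OCSR system is sketched below. The aggregation rule (flag the image as soon as any patch scores above a threshold) is an assumption made for illustration and is not confirmed by the entry above; `iter_patches`, `flag_image`, and the toy scorer `ink_fraction` are hypothetical names, and in a real pipeline the scorer would be a trained CNN rather than a pixel count.

```python
def iter_patches(image, size, stride):
    """Yield square patches from an image given as a 2D list of pixel values."""
    height, width = len(image), len(image[0])
    for top in range(0, max(height - size, 0) + 1, stride):
        for left in range(0, max(width - size, 0) + 1, stride):
            yield [row[left:left + size] for row in image[top:top + size]]


def flag_image(image, patch_score, size=2, stride=2, threshold=0.5):
    """Flag the whole image if any single patch scores at or above threshold.

    patch_score is any callable mapping a patch to a score in [0, 1];
    the any-patch rule is an assumed aggregation, not a documented one.
    """
    return any(
        patch_score(patch) >= threshold
        for patch in iter_patches(image, size, stride)
    )


def ink_fraction(patch):
    """Toy stand-in for a trained classifier: fraction of non-zero pixels."""
    pixels = [value for row in patch for value in row]
    return sum(1 for value in pixels if value) / len(pixels)


clean = [[0, 0, 0, 0] for _ in range(4)]
suspect = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Images that are not flagged would be passed on to full OCSR processing.
for name, image in [("clean", clean), ("suspect", suspect)]:
    action = "filter out" if flag_image(image, ink_fraction) else "send to OCSR"
    print(name, "->", action)
```

Scoring patches rather than the whole image lets a small, localized cue decide the outcome even when most of the image is uninformative.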

Traditional Machine Learning Methods

Hybrid approaches combining classical machine learning algorithms (neural networks, SVMs, CRFs) with domain-specific heuristics and image processing. These methods (primarily from 1992-2014) used ML for specific subtasks like character recognition or symbol classification while relying on rule-based systems for chemical structure interpretation.

| Year | Paper | Notes | Key ML Component |
| --- | --- | --- | --- |
| 1992 | Kekulé: OCR-Optical Chemical (Structure) Recognition | Kekulé Notes | Multilayer perceptron for OCR |
| 1996 | Automatic Interpretation of Chemical Structure Diagrams | Kekulé-1 Notes | Neural network with shared weights (proto-CNN) |
| 2007 | Recognition of Hand Drawn Chemical Diagrams | Ouyang-Davis Notes | SVM for symbol classification |
| 2008 | Chemical Ring Handwritten Recognition Based on Neural Networks | Hewahi et al. Notes | Two-phase classifier-recognizer with feed-forward NNs |
| 2008 | Recognition of On-line Handwritten Chemical Expressions | Yang et al. Notes | Two-level algorithm with edit distance matching |
| 2008 | A Study of On-Line Handwritten Chemical Expressions Recognition | Yang et al. Notes | ANN with two-level substance recognition |
| 2009 | A Unified Framework for Recognizing Handwritten Chemical Expressions | Chang et al. Notes | GMM for spatial relations, NN for bond verification |
| 2009 | HMM-Based Online Recognition of Handwritten Chemical Symbols | Zhang et al. Notes | Hidden Markov Model for online handwriting |
| 2009 | The Understanding and Structure Analyzing for Online Handwritten Chemical Formulas | Wang et al. Notes | HMM for text recognition + CFG for structure parsing |
| 2010 | A SVM-HMM Based Online Classifier for Handwritten Chemical Symbols | Zhang et al. Notes | Dual-stage SVM-HMM with PSR algorithm |
| 2011 | ChemInk: A Natural Real-Time Recognition System for Chemical Drawings | ChemInk Notes | Conditional Random Field (CRF) joint model |
| 2013 | Online Chemical Symbol Recognition for Handwritten Chemical Expression Recognition | Tang et al. Notes | SVM with elastic matching for handwriting |
| 2014 | Markov Logic Networks for Optical Chemical Structure Recognition | MLOCSR Notes | Markov Logic Network for probabilistic inference |

Rule-Based Methods

Classic approaches using heuristics, image processing, and domain-specific rules. While some systems use traditional OCR engines (which may contain ML components), the chemical structure recognition itself is purely algorithmic.

Note: The chemoCR systems use SVM-based OCR but employ rule-based topology-preserving vectorization for core structure reconstruction, placing them primarily in this category.

Core Methods

| Year | Paper | Notes |
| --- | --- | --- |
| 1990 | Computational Perception and Recognition of Digitized Molecular Structures | Contreras et al. Notes |
| 1993 | Chemical Literature Data Extraction: The CLiDE Project | CLiDE Notes |
| 1993 | Optical Recognition of Chemical Graphics | Casey et al. Notes |
| 1999 | Automatic Reading of Handwritten Chemical Formulas from a Structural Representation of the Image | Ramel et al. Notes |
| 2007 | Automatic Recognition of Chemical Images | chemoCR Notes |
| 2007 | Reconstruction of Chemical Molecules from Images | chemoCR Notes |
| 2009 | Automated extraction of chemical structure information from digital raster images | ChemReader Notes |
| 2009 | CLiDE Pro: The Latest Generation of CLiDE, a Tool for Optical Chemical Structure Recognition | CLiDE Pro Notes |
| 2009 | Optical Structure Recognition Software To Recover Chemical Information: OSRA, An Open Source Solution | OSRA Notes |
| 2012 | Chemical Structure Recognition: A Rule Based Approach | MolRec Notes |
| 2015 | Research on Chemical Expression Images Recognition | Hong et al. Notes |

TREC 2011 Chemistry Track

The TREC 2011 Chemistry Track provided a standardized benchmark for comparing OCSR systems, introducing the novel Image-to-Structure task alongside Prior Art and Technology Survey tasks. Papers from this evaluation are grouped here.

| System | Paper | Notes |
| --- | --- | --- |
| chemoCR | Chemical Structure Reconstruction with chemoCR | chemoCR Notes |
| ChemReader | Image-to-Structure Task by ChemReader | ChemReader at TREC 2011 Notes |
| Imago | Imago: open-source toolkit for 2D chemical structure image recognition | Imago Notes |
| OSRA | Optical Structure Recognition Application entry in Image2Structure task | OSRA at TREC 2011 Notes |
| MolRec | Performance of MolRec at TREC 2011 Overview and Analysis of Results | MolRec at TREC Notes |
| ChemInfty | Robust Method of Segmentation and Recognition of Chemical Structure Images in ChemInfty | ChemInfty Notes |

CLEF 2012 Chemistry Track

The CLEF-IP 2012 benchmarking lab introduced three information retrieval (IR) tasks in the intellectual property domain: claims-based passage retrieval, flowchart recognition, and chemical structure recognition. The chemical structure recognition task comprised two subtasks, segmentation (identifying bounding boxes) and recognition (converting to MOL format), with a particular focus on the challenging Markush structures common in patents.

| System | Paper | Notes |
| --- | --- | --- |
| MolRec | MolRec at CLEF 2012 - Overview and Analysis of Results | MolRec at CLEF 2012 Notes |
| OSRA | Optical Structure Recognition Application entry to CLEF-IP 2012 | OSRA at CLEF-IP 2012 Notes |