Computational Chemistry
Pipeline diagram showing keypoint detection, supergraph construction, and GNN classification for molecular structure recognition

MolGrapher: Graph-based Chemical Structure Recognition

MolGrapher introduces a three-stage pipeline (keypoint detection, supergraph construction, GNN classification) for recognizing chemical structures from images. By treating molecules as graphs, it achieves 91.5% accuracy on USPTO. The paper also contributes the USPTO-30K benchmark.
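
To make the supergraph idea concrete, here is a minimal sketch of how candidate bonds might be proposed between detected keypoints. The distance-threshold rule, the function name `build_supergraph`, and the parameter `max_bond_length` are illustrative assumptions, not the construction used in the paper; in a full pipeline a classifier would then keep or reject each candidate.

```python
from itertools import combinations
from math import dist


def build_supergraph(keypoints, max_bond_length):
    """Propose candidate bonds between keypoints that lie close together.

    Assumption for illustration: any pair of keypoints within
    max_bond_length of each other becomes a candidate bond.
    """
    edges = []
    for (i, p), (j, q) in combinations(enumerate(keypoints), 2):
        if dist(p, q) <= max_bond_length:
            edges.append((i, j))
    return edges


# Three hypothetical atom keypoints on a line, one unit apart.
keypoints = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidate_bonds = build_supergraph(keypoints, max_bond_length=1.5)
print(candidate_bonds)  # [(0, 1), (1, 2)]
```

The supergraph deliberately over-proposes: it is cheaper to generate too many candidates and prune them later than to miss a true bond at this stage.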

Computational Chemistry
Overview of the MolMole pipeline showing ViDetect, ViReact, and ViMore processing document pages to extract molecules and reactions.

MolMole: Unified Vision Pipeline for Molecule Mining

MolMole unifies molecule detection, reaction parsing, and structure recognition into a single vision-based pipeline, achieving top performance on a newly introduced 550-page benchmark by processing full documents without external layout parsers.

Computational Chemistry
Overview of the MolScribe encoder-decoder architecture predicting atoms with coordinates and bonds from a molecular image.

MolScribe: Robust Image-to-Graph Molecular Recognition

MolScribe reformulates molecular recognition as an image-to-graph generation task, explicitly predicting atom coordinates and bonds to better handle stereochemistry and abbreviated structures compared to image-to-SMILES baselines.

Computational Chemistry
Three-stage training pipeline for MolSight showing pretraining, multi-granularity fine-tuning, and RL post-training stages

MolSight: OCSR with RL and Multi-Granularity Learning

MolSight introduces a three-stage training paradigm for Optical Chemical Structure Recognition (OCSR): large-scale pretraining, multi-granularity fine-tuning with auxiliary bond and coordinate prediction tasks, and reinforcement learning post-training with GRPO. It achieves 85.1% stereochemical accuracy on USPTO and recognizes complex stereochemical features such as chiral centers and cis-trans isomers.
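
The core of GRPO is that each sampled output is scored relative to the other outputs sampled for the same input. The sketch below shows only that normalization step. The binary reward (1.0 when a hypothetical predicted structure matches the reference, 0.0 otherwise), the use of the population standard deviation, and the `eps` term are assumptions for illustration, not details taken from MolSight.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples for the same input.

    Each advantage is (reward - group mean) / (group std + eps), so
    samples better than their siblings get positive advantages.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled predictions of one image.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Because the baseline is the group mean, no separate value network is needed; when every sample in a group earns the same reward, all advantages are zero and that group contributes no gradient signal.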

Computational Chemistry
ABC-Net detects atom and bond keypoints to reconstruct molecular graphs from images

ABC-Net: Keypoint-Based Molecular Image Recognition

ABC-Net reformulates molecular image recognition as a keypoint detection problem. By predicting atom/bond centers and properties via a single Fully Convolutional Network, it achieves >94% accuracy with high data efficiency.
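
Keypoint-based recognition usually ends with a step that turns a predicted heatmap into discrete center locations. The following is a minimal sketch of that step under stated assumptions: the heatmap values, the `threshold` of 0.5, and the 3x3 local-maximum rule are illustrative choices, not ABC-Net's published post-processing.

```python
def find_keypoints(heatmap, threshold=0.5):
    """Return (row, col) positions that are strict local maxima above threshold."""
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = heatmap[r][c]
            if value < threshold:
                continue
            # Collect the up-to-8 neighbors inside the grid.
            neighbors = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbors):
                peaks.append((r, c))
    return peaks


# Hypothetical 4x4 atom-center heatmap with two clear peaks.
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.2, 0.8],
    [0.0, 0.0, 0.1, 0.3],
]
atom_centers = find_keypoints(heatmap)
print(atom_centers)  # [(1, 1), (2, 3)]
```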

Computational Chemistry
Overview of the ChemPix CNN-LSTM pipeline converting a hand-drawn hydrocarbon sketch to a SMILES string

ChemPix: Hand-Drawn Hydrocarbon Structure Recognition

ChemPix proposes a CNN-LSTM architecture that treats chemical structure recognition as an image captioning task. It introduces a synthetic data generation pipeline with augmentation, degradation, and background addition, so that trained models generalize to hand-drawn inputs without seeing real data during training.

Computational Chemistry
Architecture diagram showing the DECIMER 1.0 transformer pipeline from chemical image input to SELFIES output

DECIMER 1.0: Transformers for Chemical Image Recognition

DECIMER 1.0 introduces a Transformer-based architecture coupled with EfficientNet-B3 to solve Optical Chemical Structure Recognition. By using the SELFIES representation (which guarantees 100% valid output strings) and scaling training to over 35 million molecules, it achieves 96.47% exact match accuracy on synthetic benchmarks, offering an open-source solution for mining chemical data from legacy literature.

Computational Chemistry
Architecture diagram showing Vision Transformer encoder processing image patches and Transformer decoder generating InChI strings

End-to-End Transformer for Molecular Image Captioning

This paper introduces a convolution-free, end-to-end transformer model for molecular image translation. By replacing CNN encoders with Vision Transformers, it achieves a Levenshtein distance of 6.95 on noisy datasets (lower is better), compared to 7.49 for ResNet50-LSTM baselines.
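
Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another, so lower scores mean predictions closer to the reference. A minimal sketch of the standard dynamic-programming computation follows; the two InChI strings are illustrative inputs chosen here, not examples from the paper.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b using two rolling rows."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3

# Illustrative comparison of a predicted and a reference InChI string.
predicted = "InChI=1S/CH4O/c1-2/h2H,1H3"
reference = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
print(levenshtein(predicted, reference))
```

Unlike exact-match accuracy, this metric gives partial credit: a prediction that is wrong by one character scores far better than one that is wrong throughout.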

Computational Chemistry
Handwritten chemical structure recognition with RCGD and SSML

Handwritten Chemical Structure Recognition with RCGD

This paper proposes a Random Conditional Guided Decoder (RCGD) and a Structure-Specific Markup Language (SSML) to handle the ambiguity and complexity of handwritten chemical structure recognition. Both are validated on a new benchmark dataset, EDU-CHEMC, with 50,000 handwritten images.

Computational Chemistry
Chemical structure diagram representing the ICMDT molecular translation system

ICMDT: Automated Chemical Structure Image Recognition

This paper introduces ICMDT, a Transformer-based architecture for molecular translation (image-to-InChI). By enhancing the TNT block to fuse pixel, small patch, and large patch embeddings, the model achieves superior accuracy on the Bristol-Myers Squibb dataset compared to CNN-RNN and standard Transformer baselines.

Computational Chemistry
Diagram showing a pixelated chemical image passing through a multi-layer encoder to produce a molecular graph with nodes and edges.

Image-to-Graph Transformers for Chemical Structures

This paper proposes an end-to-end deep learning architecture that translates chemical images directly into molecular graphs using a ResNet-Transformer encoder and a graph-aware decoder. It addresses the limitations of SMILES-based approaches by effectively handling non-atomic symbols (abbreviations) and varying drawing styles found in scientific literature.

Computational Chemistry
4-tert-butylphenol molecular structure diagram for Image2SMILES OCSR

Image2SMILES: Transformer OCSR with Synthetic Data Pipeline

Image2SMILES is a Transformer-based system for optical chemical structure recognition. It introduces a comprehensive data generation pipeline (FG-SMILES, Markush structures, visual contamination) and achieves 79% accuracy on real-world images, outperforming rule-based systems like OSRA.