Computational Chemistry
Dual-Path Global Awareness Transformer (DGAT)

Proposes a new architecture (DGAT) that addresses the loss of global context in chemical structure recognition. Introduces Cascaded Global Feature Enhancement and Sparse Differential Global-Local Attention, achieves state-of-the-art results (84.0% BLEU-4), and handles complex chiral structures implicitly.

Computational Chemistry
Enhanced DECIMER for Hand-Drawn Structure Recognition

This paper presents an enhanced deep learning architecture for Optical Chemical Structure Recognition (OCSR) specifically optimized for hand-drawn inputs. By pairing an EfficientNetV2 encoder with a Transformer decoder and training on over 150 million synthetic images, the model achieves state-of-the-art accuracy on real-world hand-drawn benchmarks.

Computational Chemistry
MMSSC-Net: Multi-Stage Sequence Cognitive Networks

MMSSC-Net introduces a multi-stage cognitive approach for OCSR, utilizing a SwinV2 encoder and GPT-2 decoder to recognize atomic and bond sequences. It achieves high accuracy (94%+) on benchmark datasets by effectively handling varying image resolutions and noise.

Computational Chemistry
MolGrapher: Graph-based Visual Recognition of Chemical Structures

MolGrapher: Graph-based Chemical Recognition

MolGrapher introduces a novel three-stage pipeline (keypoint detection, supergraph construction, GNN classification) for recognizing chemical structures from images. It achieves state-of-the-art results by treating molecules as graphs, and introduces the USPTO-30K benchmark.
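
To make the middle stage of that pipeline concrete, here is a minimal sketch of supergraph construction. It assumes candidate bonds are proposed between any two detected keypoints that lie within a distance cutoff; the summary above does not say how MolGrapher builds its supergraph, so `build_supergraph` and `max_bond_length` are hypothetical names chosen for illustration. A later classifier (a GNN in the paper) would then decide which candidates are real bonds.

```python
from itertools import combinations
from math import dist


def build_supergraph(keypoints, max_bond_length):
    """Return candidate bonds as index pairs (i, j) with i < j.

    keypoints: list of (x, y) atom positions, e.g. from a keypoint detector.
    max_bond_length: hypothetical distance cutoff for proposing a bond
    (an assumption of this sketch, not a parameter taken from the paper).
    """
    edges = []
    for (i, p), (j, q) in combinations(enumerate(keypoints), 2):
        # Propose an edge whenever two keypoints are close enough.
        if dist(p, q) <= max_bond_length:
            edges.append((i, j))
    return edges


# Three keypoints in a chain, spaced 10 px apart.
keypoints = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
print(build_supergraph(keypoints, max_bond_length=12.0))  # [(0, 1), (1, 2)]
```

The cutoff deliberately over-generates edges: a supergraph is meant to contain the true molecular graph as a subgraph, with spurious candidates filtered out in the classification stage.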

Computational Chemistry
MolMole: Unified Vision Pipeline for Molecule Mining

MolMole unifies molecule detection, reaction parsing, and structure recognition into a single vision-based pipeline, achieving SOTA performance on a newly introduced 550-page benchmark by processing full documents without external layout parsers.

Computational Chemistry
MolScribe: Image-to-Graph Molecular Recognition

MolScribe reformulates molecular recognition as an image-to-graph generation task, explicitly predicting atom coordinates and bonds so that it handles stereochemistry and abbreviated structures better than image-to-SMILES baselines.

Computational Chemistry
OCSU: Optical Chemical Structure Understanding

Proposes the ‘Optical Chemical Structure Understanding’ (OCSU) task to translate molecular images into multi-level descriptions (motifs, IUPAC, SMILES). Introduces the Vis-CheBI20 dataset and two paradigms: DoubleCheck (OCSR-based) and Mol-VL (OCSR-free).

Computational Chemistry
ABC-Net detects atom and bond keypoints to reconstruct molecular graphs from images

ABC-Net: Divide-and-Conquer SMILES Recognition

ABC-Net reformulates molecular image recognition as a keypoint detection problem. By predicting atom/bond centers and properties via a single Fully Convolutional Network, it achieves >94% accuracy with high data efficiency.
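
As a rough illustration of how detected keypoints can be turned back into a molecular graph, the sketch below links each predicted bond center to its two nearest atom centers. This nearest-neighbor rule is an assumption made for the example; the summary above does not describe how ABC-Net associates bonds with atoms, and `bonds_from_centers` is a hypothetical helper.

```python
from math import dist


def bonds_from_centers(atom_centers, bond_centers):
    """Link each bond center to its two nearest atom centers.

    Returns one (i, j) atom-index pair per bond center, with i < j.
    Nearest-neighbor matching is a simplification assumed for this sketch.
    """
    bonds = []
    for b in bond_centers:
        # Rank atom indices by distance from this bond center.
        ranked = sorted(range(len(atom_centers)),
                        key=lambda i: dist(atom_centers[i], b))
        i, j = sorted(ranked[:2])
        bonds.append((i, j))
    return bonds


atoms = [(0.0, 0.0), (10.0, 0.0), (15.0, 8.0)]
bond_mids = [(5.0, 0.0), (12.5, 4.0)]
print(bonds_from_centers(atoms, bond_mids))  # [(0, 1), (1, 2)]
```

In crowded drawings the two nearest atoms are not always the bonded ones, which is one reason a real system would likely rely on additional predicted properties rather than distance alone.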

Computational Chemistry
Handwritten chemical structure recognition with RCGD and SSML

Handwritten Chemical Structure Recognition with RCGD

Proposes a Random Conditional Guided Decoder (RCGD) and a Structure-Specific Markup Language (SSML) to handle the ambiguity and complexity of handwritten chemical structure recognition, validated on a new benchmark dataset (EDU-CHEMC) with 50,000 handwritten images.

Computational Chemistry

MolMiner: Deep Learning OCSR with YOLOv5 Detection

MolMiner replaces traditional rule-based vectorization with a deep learning object detection pipeline (YOLOv5) to extract chemical structures from PDFs. It achieves state-of-the-art performance on benchmarks and introduces a new real-world dataset of 3,040 images.

Computational Chemistry

SwinOCSR: Vision Transformers for Chemical OCR

Proposes an end-to-end architecture replacing standard CNN backbones with Swin Transformer to capture global image context. Introduces Multi-label Focal Loss to handle severe token imbalance in chemical datasets.
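
For readers unfamiliar with focal loss, the sketch below implements the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), averaged over independent labels. The exact form of SwinOCSR's Multi-label Focal Loss is not given in the summary above, so this is the generic formulation with assumed default values for `gamma` and `alpha`, not the paper's definition.

```python
from math import log


def focal_loss(probs, targets, gamma=2.0, alpha=0.25):
    """Mean binary focal loss over independent labels.

    probs: predicted probabilities in (0, 1), one per label.
    targets: 0/1 ground-truth values, one per label.
    gamma, alpha: assumed defaults for this sketch.
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p_t = p if y == 1 else 1.0 - p          # probability of the true class
        a_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
        # (1 - p_t) ** gamma shrinks the loss of well-classified labels.
        total += -a_t * (1.0 - p_t) ** gamma * log(p_t)
    return total / len(probs)


easy = focal_loss([0.9], [1])  # confident and correct
hard = focal_loss([0.1], [1])  # confident and wrong
print(easy < hard)  # True
```

Because the modulating factor down-weights tokens the model already predicts well, frequent tokens contribute less to the gradient and rare tokens contribute relatively more, which is the motivation for using it on imbalanced token distributions.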

Computational Chemistry

Review of OCSR Tools (2020)

This paper reviews three decades of OCSR development, tracing the transition from rule-based heuristics to early deep learning approaches. It includes a benchmark study comparing the performance of three open-source tools (OSRA, Imago, MolVec) on four diverse datasets.