Computational Chemistry
MMSSC-Net: Multi-Stage Sequence Cognitive Networks

MMSSC-Net introduces a multi-stage cognitive approach to optical chemical structure recognition (OCSR), using a SwinV2 encoder and a GPT-2 decoder to recognize atom and bond sequences. It achieves over 94% accuracy on benchmark datasets by handling varying image resolutions and noise.

MolGrapher: Graph-based Visual Recognition of Chemical Structures

MolGrapher: Graph-based Chemical Recognition

MolGrapher introduces a novel three-stage pipeline (keypoint detection, supergraph construction, GNN classification) for recognizing chemical structures from images. It achieves state-of-the-art results by treating molecules as graphs, and introduces the USPTO-30K benchmark.
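
The three stages can be illustrated with a toy sketch. This is not MolGrapher's implementation: the distance-threshold rule for proposing candidate bonds, the `max_bond_len` parameter, and the `keep_edge` callback standing in for the GNN classifier are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def build_supergraph(atoms, max_bond_len):
    """Propose a candidate bond for every pair of keypoints within max_bond_len.

    Assumption: a plain distance threshold; the paper's construction may differ.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(atoms)), 2)
        if dist(atoms[i], atoms[j]) <= max_bond_len
    ]


def classify_edges(atoms, candidates, keep_edge):
    """Keep only the candidates accepted by keep_edge (a stand-in for the GNN)."""
    return [(i, j) for i, j in candidates if keep_edge(atoms[i], atoms[j])]


# Stage 1 output (hypothetical): detected atom keypoints as (x, y) coordinates.
atoms = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 1.0)]

# Stage 2: over-complete supergraph of candidate bonds.
candidates = build_supergraph(atoms, max_bond_len=1.5)

# Stage 3: toy classifier that accepts only unit-length pairs.
bonds = classify_edges(atoms, candidates, lambda a, b: abs(dist(a, b) - 1.0) < 1e-9)
```

The supergraph deliberately contains more edges than the molecule has bonds; the classification stage is what prunes it down to the final graph.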

MolMole: Unified Vision Pipeline for Molecule Mining

MolMole unifies molecule detection, reaction parsing, and structure recognition into a single vision-based pipeline. It processes full documents without external layout parsers and achieves state-of-the-art performance on a newly introduced 550-page benchmark.

MolScribe: Image-to-Graph Molecular Recognition

MolScribe reformulates molecular recognition as an image-to-graph generation task, explicitly predicting atom coordinates and bonds so that it handles stereochemistry and abbreviated structures better than image-to-SMILES baselines.
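
To make the image-to-graph output concrete, here is a minimal sketch of assembling a molecular graph from per-atom and pairwise predictions. The `Atom` class, the `assemble_graph` helper, the string bond labels, and the example predictions are hypothetical; MolScribe's actual output format is not described in this summary.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    symbol: str
    x: float  # predicted coordinates, assumed normalised to [0, 1]
    y: float


def assemble_graph(atoms, bond_matrix):
    """Convert a symmetric pairwise bond-type matrix into an edge list.

    Each edge is (i, j, bond_type) with i < j; "none" means no bond.
    """
    edges = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            if bond_matrix[i][j] != "none":
                edges.append((i, j, bond_matrix[i][j]))
    return edges


# Hypothetical predictions for ethanol (C-C-O), hydrogens implicit.
atoms = [Atom("C", 0.1, 0.5), Atom("C", 0.5, 0.3), Atom("O", 0.9, 0.5)]
bond_matrix = [
    ["none", "single", "none"],
    ["single", "none", "single"],
    ["none", "single", "none"],
]
edges = assemble_graph(atoms, bond_matrix)
```

Because atoms carry explicit coordinates, stereo information such as wedge bonds can be attached to specific edges, which a flat SMILES string predicted token by token does not expose directly.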

MolSight: OCSR with RL and Multi-Granularity Learning

MolSight introduces a three-stage training paradigm for Optical Chemical Structure Recognition (OCSR): large-scale pretraining, multi-granularity fine-tuning with auxiliary bond and coordinate prediction tasks, and reinforcement learning (GRPO). It achieves state-of-the-art performance in recognizing complex stereochemical structures such as chiral centers and cis-trans isomers.
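
The core of GRPO is a group-relative advantage: several outputs are sampled for the same input, and each reward is normalised against the group's mean and standard deviation. The sketch below shows only that step. The 0/1 exact-match reward, the use of the population standard deviation, and the `eps` constant are assumptions; the summary does not specify MolSight's reward design.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalise each reward against the mean and std of its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of four sampled predictions for one image,
# rewarded 1.0 if the predicted structure matches the reference, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

When every sample in a group receives the same reward, all advantages are zero and that group contributes no learning signal.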

OCSU: Optical Chemical Structure Understanding

Proposes the ‘Optical Chemical Structure Understanding’ (OCSU) task to translate molecular images into multi-level descriptions (motifs, IUPAC, SMILES). Introduces the Vis-CheBI20 dataset and two paradigms: DoubleCheck (OCSR-based) and Mol-VL (OCSR-free).

RFL: Simplifying Chemical Structure Recognition

Proposes Ring-Free Language (RFL) to hierarchically decouple molecular graphs into skeletons, rings, and branches, solving issues with 1D serialization of complex 2D structures. Introduces the Molecular Skeleton Decoder (MSD) to progressively predict these components, achieving state-of-the-art results on handwritten and printed chemical structures.

ABC-Net detects atom and bond keypoints to reconstruct molecular graphs from images

ABC-Net: Divide-and-Conquer SMILES Recognition

ABC-Net reformulates molecular image recognition as a keypoint detection problem. By predicting atom/bond centers and properties via a single Fully Convolutional Network, it achieves >94% accuracy with high data efficiency.
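
Keypoint detection of this kind typically ends with extracting peaks from a predicted heatmap. The following is a minimal sketch of that post-processing step, assuming a single-channel heatmap stored as nested lists and a 3x3 local-maximum rule; the `find_keypoints` helper, the threshold value, and the toy heatmap are illustrative, not ABC-Net's actual procedure.

```python
def find_keypoints(heatmap, threshold=0.5):
    """Return (row, col) cells that reach the threshold and are strictly
    greater than every neighbour in their 3x3 neighbourhood."""
    h, w = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            v = heatmap[r][c]
            if v < threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Toy atom-centre heatmap with two confident peaks.
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.6],
    [0.0, 0.0, 0.2, 0.3],
]
keypoints = find_keypoints(heatmap)
```

The same routine could be run on a separate bond-centre heatmap, with atoms and bonds then linked by proximity to rebuild the graph.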

ChemPix: Hand-Drawn Hydrocarbon Recognition

Proposes a CNN-LSTM architecture that treats chemical structure recognition as an image captioning task. Introduces a robust synthetic data generation pipeline with augmentation, degradation, and background addition to train models that generalize to hand-drawn inputs without seeing real data during pre-training.

DECIMER 1.0: Transformers for Chemical Image Recognition

DECIMER 1.0 introduces a Transformer-based architecture coupled with EfficientNet-B3 to solve Optical Chemical Structure Recognition. By leveraging the robust SELFIES representation and scaling training to over 35 million molecules, it achieves state-of-the-art accuracy on synthetic benchmarks, offering an open-source solution for mining chemical data from legacy literature.

End-to-End Transformer for Molecular Image Captioning

This paper introduces a convolution-free, end-to-end transformer model for molecular image translation. By replacing CNN encoders with Vision Transformers, it achieves superior performance on noisy datasets compared to ResNet-LSTM baselines.
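
A Vision Transformer encoder replaces convolutional feature extraction with a sequence of flattened image patches. The sketch below shows only that tokenisation step on a tiny grayscale image; the `patchify` helper and the patch size are assumptions, and the learned linear projection and position embeddings that follow in a real model are omitted.

```python
def patchify(image, patch):
    """Split a 2D image into non-overlapping patch x patch tiles (row-major)
    and flatten each tile into one token vector."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image dimensions must be divisible by the patch size")
    tokens = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            tokens.append(
                [image[r + dr][c + dc] for dr in range(patch) for dc in range(patch)]
            )
    return tokens


# 4x4 toy image with pixel values 0..15, split into four 2x2 patches.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = patchify(image, 2)
```

Each token would then be projected to the model dimension and passed through self-attention layers, so every patch can attend to every other patch from the first layer onward.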

Handwritten chemical structure recognition with RCGD and SSML

Handwritten Chemical Structure Recognition with RCGD

Proposes a Random Conditional Guided Decoder (RCGD) and a Structure-Specific Markup Language (SSML) to handle the ambiguity and complexity of handwritten chemical structure recognition, validated on a new benchmark dataset (EDU-CHEMC) with 50,000 handwritten images.