Computational Chemistry
Comparing OCSR Tools (Krasnov et al. 2024)

A comprehensive evaluation of eight optical chemical structure recognition (OCSR) tools on a newly curated dataset of 2,702 patent images. The paper proposes ChemIC, a ResNet-50 classifier that routes each image to a specialized tool based on its content type, and demonstrates that no single tool excels at all tasks.
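The routing idea can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the class labels, the `toy_classify` stand-in (a real system would call a trained image classifier), and the placeholder tools are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict

# A recognizer takes an image path and returns some result string.
Recognizer = Callable[[str], str]


def route_image(image_path: str,
                classify: Callable[[str], str],
                tools: Dict[str, Recognizer],
                fallback: Recognizer) -> str:
    """Send an image to the tool registered for its predicted content class."""
    label = classify(image_path)
    tool = tools.get(label, fallback)
    return tool(image_path)


def toy_classify(image_path: str) -> str:
    """Hypothetical stand-in for a trained classifier; inspects the file name only."""
    if "reaction" in image_path:
        return "reaction"
    if "multi" in image_path:
        return "multiple_structures"
    return "single_structure"


# Hypothetical tool registry: one placeholder recognizer per content class.
tools: Dict[str, Recognizer] = {
    "single_structure": lambda p: f"single-structure tool handled {p}",
    "multiple_structures": lambda p: f"multi-structure tool handled {p}",
    "reaction": lambda p: f"reaction tool handled {p}",
}


def fallback(image_path: str) -> str:
    """Used when the classifier returns a label with no registered tool."""
    return f"no tool for {image_path}"


result = route_image("fig_reaction_3.png", toy_classify, tools, fallback)
```

Keeping the classifier and the tool registry separate means a new tool can be registered without retraining or touching the routing logic.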

Computational Chemistry
DECIMER.ai: Optical Chemical Structure Recognition

DECIMER.ai addresses the lack of open tools for Optical Chemical Structure Recognition (OCSR) by providing a comprehensive, deep-learning-based workflow. It features a novel data generation pipeline (RanDepict), a web application, and models for segmentation and recognition that rival or exceed proprietary solutions.

Computational Chemistry
Dual-Path Global Awareness Transformer (DGAT)

Proposes a new architecture (DGAT) to address the loss of global context in chemical structure recognition. Introduces Cascaded Global Feature Enhancement and Sparse Differential Global-Local Attention, achieving state-of-the-art results (84.0% BLEU-4) and handling complex chiral structures implicitly.

Computational Chemistry
Enhanced DECIMER for Hand-Drawn Structure Recognition

This paper presents an enhanced deep learning architecture for Optical Chemical Structure Recognition (OCSR) specifically optimized for hand-drawn inputs. By pairing an EfficientNetV2 encoder with a Transformer decoder and training on over 150 million synthetic images, the model achieves state-of-the-art accuracy on real-world hand-drawn benchmarks.

Computational Chemistry
Image2InChI: SwinTransformer for Molecular Recognition

Proposes Image2InChI, an OCSR model that combines an improved SwinTransformer encoder with a novel attention-based feature fusion network and achieves 99.8% InChI accuracy on the BMS dataset.

Computational Chemistry
MarkushGrapher: Multi-modal Markush Structure Recognition

This paper introduces a novel multi-modal approach for extracting chemical Markush structures from patents, combining a Vision-Text-Layout encoder with a specialized chemical vision encoder. It addresses the lack of training data with a robust synthetic generation pipeline and introduces M2S, a new real-world benchmark.

Computational Chemistry
MMSSC-Net: Multi-Stage Sequence Cognitive Networks

MMSSC-Net introduces a multi-stage cognitive approach for OCSR, using a SwinV2 encoder and a GPT-2 decoder to recognize atomic and bond sequences. It achieves high accuracy (94%+) on benchmark datasets and handles varying image resolutions and noise.

Computational Chemistry
MolGrapher: Graph-based Visual Recognition of Chemical Structures

MolGrapher introduces a novel three-stage pipeline (keypoint detection, supergraph construction, GNN classification) for recognizing chemical structures from images. It achieves state-of-the-art results by treating molecules as graphs, and introduces the USPTO-30K benchmark.
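The three stages can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: the distance-threshold rule for proposing candidate edges, the `max_dist` parameter, and the `classify_edges` predicate that stands in for the GNN are all hypothetical.

```python
import math
from itertools import combinations
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Edge = Tuple[int, int]


def build_supergraph(keypoints: List[Point], max_dist: float) -> List[Edge]:
    """Propose a candidate edge for every pair of keypoints within max_dist.

    Assumed rule for illustration: the supergraph over-connects on purpose,
    and a later stage decides which candidate edges are real bonds.
    """
    edges = []
    for i, j in combinations(range(len(keypoints)), 2):
        if math.dist(keypoints[i], keypoints[j]) <= max_dist:
            edges.append((i, j))
    return edges


def classify_edges(edges: List[Edge], keep: Callable[[Edge], bool]) -> List[Edge]:
    """Stand-in for the GNN stage: keep only the edges the predicate accepts."""
    return [e for e in edges if keep(e)]


# Stage 1 output (assumed): detected atom keypoints in image coordinates.
keypoints = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 1.0)]

# Stage 2: over-complete candidate graph.
candidates = build_supergraph(keypoints, max_dist=1.5)

# Stage 3 (toy predicate): accept only candidates of unit length or shorter.
bonds = classify_edges(
    candidates,
    keep=lambda e: math.dist(keypoints[e[0]], keypoints[e[1]]) <= 1.0,
)
```

With these toy inputs the supergraph contains five candidate edges, and the stand-in classifier keeps the three of unit length.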

Computational Chemistry
MolMole: Unified Vision Pipeline for Molecule Mining

MolMole unifies molecule detection, reaction parsing, and structure recognition into a single vision-based pipeline that processes full documents without external layout parsers. It achieves state-of-the-art performance on a newly introduced 550-page benchmark.

Computational Chemistry
MolScribe: Image-to-Graph Molecular Recognition

MolScribe reformulates molecular recognition as an image-to-graph generation task, explicitly predicting atom coordinates and bonds to better handle stereochemistry and abbreviated structures compared to image-to-SMILES baselines.
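To make the image-to-graph framing concrete, here is a minimal sketch of what such an output might look like as a data structure. The `Atom` and `Bond` classes, their field names, and the bond labels are hypothetical and do not reflect MolScribe's actual output format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Atom:
    symbol: str  # element symbol or abbreviation label
    x: float     # predicted position, normalized image coordinates (assumed)
    y: float


@dataclass
class Bond:
    begin: int   # index of the first atom
    end: int     # index of the second atom
    kind: str    # hypothetical label such as "single", "double", "wedge"


def adjacency(atoms: List[Atom], bonds: List[Bond]) -> Dict[int, List[int]]:
    """Build a neighbor list from the predicted atoms and bonds."""
    neighbors: Dict[int, List[int]] = {i: [] for i in range(len(atoms))}
    for b in bonds:
        neighbors[b.begin].append(b.end)
        neighbors[b.end].append(b.begin)
    return neighbors


# Heavy atoms of ethanol with made-up coordinates.
atoms = [Atom("C", 0.1, 0.5), Atom("C", 0.5, 0.3), Atom("O", 0.9, 0.5)]
bonds = [Bond(0, 1, "single"), Bond(1, 2, "single")]
graph = adjacency(atoms, bonds)
```

Because coordinates and bond labels are explicit, information such as wedge bonds stays attached to specific atoms instead of being inferred from a generated string.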

Computational Chemistry
MolSight: OCSR with RL and Multi-Granularity Learning

MolSight introduces a three-stage training paradigm for Optical Chemical Structure Recognition (OCSR): large-scale pretraining, multi-granularity fine-tuning with auxiliary bond and coordinate prediction tasks, and reinforcement learning with GRPO. It achieves state-of-the-art performance on complex stereochemical features such as chiral centers and cis-trans isomers.
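The core of GRPO is a group-relative advantage: several outputs are sampled for the same input, and each reward is normalized against the group's mean and standard deviation. The sketch below shows only that normalization step. The 0/1 exact-match rewards, the `eps` constant, and the use of the population standard deviation are assumptions for illustration and are not details from the paper.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against its sampling group.

    advantage_i = (r_i - mean(r)) / (std(r) + eps)
    eps avoids division by zero when every sample gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed reward: 1.0 if a sampled prediction matches the target, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# A group where every sample scores the same carries no learning signal.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Correct samples receive positive advantages and incorrect ones negative, so no separate value network is needed to provide a baseline.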

Computational Chemistry
OCSU: Optical Chemical Structure Understanding

Proposes the ‘Optical Chemical Structure Understanding’ (OCSU) task to translate molecular images into multi-level descriptions (motifs, IUPAC, SMILES). Introduces the Vis-CheBI20 dataset and two paradigms: DoubleCheck (OCSR-based) and Mol-VL (OCSR-free).