Computational Chemistry
Five-stage pipeline for reconstructing chemical molecules from raster images

Reconstruction of Chemical Molecules from Images

A 5-module system converting raster images of chemical structures into machine-readable SDF files with custom …

GTR-CoT: Graph Traversal Chain-of-Thought for Molecules

GTR-CoT uses graph traversal chain-of-thought reasoning to improve optical chemical structure recognition accuracy.

MolRec: Chemical Structure Recognition at CLEF 2012

MolRec achieves 95%+ accuracy on simple structures but struggles with complex diagrams, revealing rule-based OCSR limits …

MolRec: Rule-Based OCSR System

Rule-based system for optical chemical structure recognition using vectorization and geometric analysis, achieving 95% …

SubGrapher: Visual Fingerprinting of Chemical Structures

Novel OCSR method creating molecular fingerprints from images through functional group segmentation for database …

αExtractor extracts structured chemical information from biomedical literature

αExtractor: Chemical Info from Biomedical Literature

αExtractor uses a ResNet-Transformer model to extract chemical structures from literature images, including noisy and hand-drawn …

ChemInfty: Chemical Structure Recognition in Patent Images

Fujiyoshi et al.'s segment-based approach for recognizing chemical structures in challenging Japanese patent images with …

MolNexTR: Dual-Stream Molecular Image Recognition

Dual-stream encoder combining ConvNeXt and ViT for robust optical chemical structure recognition across diverse …

A colored molecule with annotations, representing the diverse drawing styles found in scientific papers that OCSR models must handle.

MolParser-7M & WildMol: Large-Scale OCSR Datasets

MolParser-7M is the largest OCSR dataset with 7.7M image-text pairs of molecules and E-SMILES, including 400k real-world …

MolParser: End-to-End Molecular Structure Recognition

MolParser converts molecular images from scientific documents to machine-readable formats using end-to-end learning with …