Computational Chemistry

ICMDT: Automated Chemical Image Recognition

This paper introduces ICMDT, a Transformer-based architecture for molecular translation (image-to-InChI). By enhancing the TNT block to fuse pixel, small-patch, and large-patch embeddings, the model achieves higher accuracy on the Bristol-Myers Squibb dataset than CNN-RNN and standard Transformer baselines.

Computational Chemistry

Image-to-Graph Transformers

This paper proposes an end-to-end deep learning architecture that translates chemical images directly into molecular graphs using a ResNet-Transformer encoder and a graph-aware decoder. It addresses the limitations of SMILES-based approaches by effectively handling non-atomic symbols (abbreviations) and varying drawing styles found in scientific literature.

Computational Chemistry

Image2SMILES: Transformer OCSR with Synthetic Data Pipeline

Image2SMILES is a Transformer-based system for optical chemical structure recognition (OCSR). It introduces a comprehensive data generation pipeline (FG-SMILES, Markush structures, visual contamination) and achieves 79% accuracy on real-world images, outperforming rule-based systems such as OSRA.

Computational Chemistry

MICER: Molecular Image Captioning with Transfer Learning

MICER treats optical chemical structure recognition as an image captioning task. It applies transfer learning with a fine-tuned ResNet encoder and an attention-based LSTM decoder to convert molecular images into SMILES strings, significantly outperforming rule-based and previous deep learning methods.

Computational Chemistry

MolMiner: Deep Learning OCSR with YOLOv5 Detection

MolMiner replaces traditional rule-based vectorization with a deep learning object detection pipeline (YOLOv5) to extract chemical structures from PDFs. It achieves state-of-the-art performance on benchmarks and introduces a new real-world dataset of 3,040 images.

Computational Chemistry

One Strike, You're Out: Detecting Markush Structures

This paper proposes a patch-based image processing pipeline that uses Inception V3 to filter Markush structures out of chemical documents, significantly outperforming traditional fixed-feature (ORB) methods on low-SNR images.
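
A patch-based filter of this kind can be sketched as follows. This is only an illustration under stated assumptions: the `classify_patch` callable is a hypothetical stand-in for the Inception V3 classifier, the patch size and threshold are arbitrary, and the "flag the image if any single patch is positive" rule is an assumption suggested by the title rather than something the summary confirms.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def split_into_patches(image: Image, patch_size: int) -> List[Image]:
    """Cut the image into non-overlapping square patches (edge patches may be smaller)."""
    height = len(image)
    width = len(image[0]) if height else 0
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


def contains_markush(
    image: Image,
    classify_patch: Callable[[Image], float],  # hypothetical: returns a score in [0, 1]
    patch_size: int = 2,
    threshold: float = 0.5,
) -> bool:
    """Assumed aggregation rule: one positive patch is enough to flag the whole image."""
    return any(
        classify_patch(patch) >= threshold
        for patch in split_into_patches(image, patch_size)
    )


def mean_intensity(patch: Image) -> float:
    """Toy scorer used only for this demo; a real pipeline would call a trained CNN."""
    pixel_count = sum(len(row) for row in patch)
    return sum(sum(row) for row in patch) / pixel_count


demo_image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
flagged = contains_markush(demo_image, mean_intensity)
```

Scoring patches rather than the whole page lets a small, localized feature dominate the decision, which is one plausible reason such a design would help on noisy images.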

Computational Chemistry

Review of OCSR Techniques (2022)

This systematization paper traces the history of OCSR, comparing early rule-based systems like OSRA with modern deep learning approaches like DECIMER. It highlights the shift from image classification to image captioning and identifies critical gaps in dataset standardization and evaluation metrics.

Computational Chemistry

String Representations for Chemical Image Recognition

This methodological study isolates the impact of chemical string representations on image-to-text translation models. It finds that SMILES yields the highest overall accuracy, while SELFIES guarantees structurally valid output, so the choice of representation is a trade-off between accuracy and validity for OCSR tasks.

Computational Chemistry

SwinOCSR: Vision Transformers for Chemical OCR

SwinOCSR is an end-to-end architecture that replaces standard CNN backbones with a Swin Transformer to capture global image context. It introduces a Multi-label Focal Loss to handle severe token imbalance in chemical datasets.
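
To show why a focal loss helps with imbalance, here is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), applied independently to each label and averaged. It is not necessarily the exact formulation used in SwinOCSR; the function name and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def multilabel_focal_loss(
    probs: Sequence[float],    # predicted probability for each label, in (0, 1)
    targets: Sequence[int],    # ground-truth 0/1 for each label
    gamma: float = 2.0,        # focusing parameter (assumed default)
    alpha: float = 0.25,       # weight for positive labels (assumed default)
    eps: float = 1e-7,
) -> float:
    """Mean focal loss over independent binary labels."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)          # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p           # probability given to the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the contribution of well-classified labels
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


easy = multilabel_focal_loss([0.95], [1])   # confident, correct prediction
hard = multilabel_focal_loss([0.30], [1])   # poorly predicted positive label
```

With `gamma=0` the modulating factor disappears and the loss reduces to an alpha-weighted binary cross-entropy, so `gamma` directly controls how strongly frequent, easily predicted tokens are down-weighted.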

Computational Chemistry

ChemGrapher: Deep Learning for Chemical OCR

ChemGrapher replaces rule-based chemical OCR with a deep learning pipeline using semantic segmentation to identify atom and bond candidates, followed by specialized classification networks to resolve stereochemistry and bond multiplicity, significantly outperforming OSRA.

Computational Chemistry

DECIMER: Deep Learning for Chemical Image Recognition

DECIMER adapts the “Show, Attend and Tell” image captioning architecture to translate chemical structure images into SMILES strings. By leveraging massive synthetic datasets generated from PubChem, it demonstrates that deep learning can perform optical chemical recognition without complex, hand-engineered rule systems.

Computational Chemistry

Deep Learning for Molecular Structure Extraction

This paper presents a two-stage deep learning pipeline to extract chemical structures from documents and convert them to SMILES strings. By training on large-scale synthetic data, the method overcomes the brittleness of rule-based systems and demonstrates high accuracy even on low-resolution and noisy input images.