Computational Chemistry
Image: transformation from a 2D chemical structure image to a SMILES representation

What is Optical Chemical Structure Recognition (OCSR)?

A micro-review of Optical Chemical Structure Recognition (OCSR), from rule-based systems to modern AI models.

αExtractor: Automatic Chemical Information Extraction from Biomedical Literature

αExtractor uses ResNet-Transformer to extract chemical structures from literature images, including noisy and hand-drawn …

ChemInfty: Robust Segmentation and Recognition of Chemical Structures in Low-Quality Patent Images

Fujiyoshi et al.'s segment-based approach for recognizing chemical structures in challenging Japanese patent images with …

Img2Mol: Accurate SMILES Recognition from Molecular Graphical Depictions

Clevert et al.'s two-stage CNN approach for converting molecular images to SMILES using CDDD embeddings and extensive …

MolNexTR: A Generalized Deep Learning Model for Molecular Image Recognition

Chen et al.'s dual-stream encoder approach for robust molecular structure recognition from diverse real-world images …

OSRA: Optical Structure Recognition for Chemical Information Extraction

Filippov & Nicklaus's open-source rule-based system for converting molecular structure images into machine-readable …

MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild

MolParser converts molecular images from scientific documents to machine-readable formats using E-SMILES.

Image: SELFIES strings guarantee 100% valid molecules, even when generated randomly

Converting SELFIES Strings to 2D Molecular Images

Visualize SELFIES molecular representations and test their 100% robustness through random sampling experiments.

Image: aspirin molecular structure generated from a SMILES string

Converting SMILES Strings to 2D Molecular Images

Learn how to create 2D molecular images from SMILES strings using RDKit and PIL, with proper formatting and legends.

Image: SELFIES representation of the 2-Fluoroethenimine molecule

SELFIES (Self-Referencing Embedded Strings)

SELFIES is a 100% robust molecular string representation for ML, implemented in the open-source selfies Python library.

Image: Methoxybenzonitrile

SMILES (Simplified Molecular Input Line Entry System)

SMILES is a specification for describing the structure of chemical molecules using short ASCII strings.

Image: GEOM dataset example molecule, N-(4-pyrimidin-2-yloxyphenyl)acetamide

GEOM: Energy-Annotated Molecular Conformations

A dataset card for the GEOM dataset, a collection of energy-annotated molecular conformations for property prediction …