Computational Chemistry

InChI and Tautomerism: Toward a Comprehensive Treatment

Dhaked et al.'s comprehensive analysis of tautomerism in chemoinformatics, introducing 86 new tautomeric rules and their …

Computational Chemistry

InChI: The Worldwide Chemical Structure Identifier Standard

Heller et al. (2013) explain how IUPAC's InChI became the global standard for representing chemical structures, its …

Computational Chemistry

Making InChI FAIR and Sustainable for Inorganic Chemistry

The InChI v1.07 release modernizes chemical identifiers for FAIR data principles, fixes thousands of bugs, and proposes …

Computational Chemistry

Mixfile and MInChI: Machine-Readable Chemical Mixture Formats

Clark et al.'s Mixfile format and MInChI specification provide the first standardized, machine-readable way to represent …

Computational Chemistry

NInChI: Toward a Chemical Identifier for Nanomaterials

Lynch et al. propose NInChI (Nanomaterials InChI) - a standardized notation system for representing complex …

Computational Chemistry

Recent Advances in the SELFIES Library (2023)

An overview of the major updates to the SELFIES Python library, including improved performance, expanded chemical …

Computational Chemistry

RInChI: Reaction International Chemical Identifier

RInChI extends the InChI standard to create unique, machine-readable identifiers for chemical reactions, enabling …

Computational Chemistry

SELFIES: The Original Paper (Krenn et al. 2020)

A summary of the foundational 2020 paper that introduced SELFIES - the 100% robust molecular string representation …

Computational Chemistry

SMILES: The Original Paper (Weininger 1988)

A summary of David Weininger's foundational 1988 paper that introduced SMILES notation - the string-based molecular …

Computational Chemistry
[Image: The transformation from a 2D chemical structure image to a SMILES representation]

What is Optical Chemical Structure Recognition (OCSR)?

A micro-review of Optical Chemical Structure Recognition (OCSR), tracing its evolution from rule-based systems to …

Document Processing
[Image: A colored molecule with annotations, representing the diverse drawing styles found in scientific papers that OCSR models must handle]

MolParser-7M and WildMol Datasets for Robust Chemical Structure Recognition

MolParser-7M is a 7.7M-pair dataset for molecule-to-text conversion, featuring real-world images and complex structures …

Computational Chemistry
[Image: ZINC-22 Tranche Browser showing molecular count distribution]

ZINC-22: Multi-Billion Molecule Database

A dataset card for ZINC-22, the largest freely available database of commercially available compounds for virtual …