Optical Chemical Structure Recognition
GraphReco system architecture showing component extraction, atom and bond ambiguity resolution, and graph reconstruction stages

GraphReco: Probabilistic Structure Recognition (2026)

GraphReco is a rule-based OCSR system with two key innovations: a Fragment Merging line detection algorithm for precise bond identification, and a Markov network for probabilistic resolution of atom/bond ambiguity during graph assembly. It achieves 94.2% accuracy on USPTO-10K, outperforming traditional rule-based methods and some ML-based methods.

Optical Chemical Structure Recognition
GraSP feed-forward architecture showing GNN, FiLM-conditioned CNN, and MLP classification head

GraSP: Graph Recognition via Subgraph Prediction (2026)

GraSP introduces a general framework for recognizing graphs in images by framing it as sequential subgraph prediction with a binary classifier. A GNN conditions a CNN via FiLM layers to predict whether a candidate graph is a subgraph of the target. Applied to OCSR on QM9, GraSP achieves 67.5% accuracy with no domain-specific modifications.
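As a rough illustration of the FiLM conditioning mentioned above, the sketch below scales and shifts each channel of a CNN feature map using parameters derived from a graph embedding. The embedding, the linear projections `W_gamma` and `W_beta`, and all shapes are hypothetical stand-ins, not GraSP's actual layers.

```python
import numpy as np

def film(features, gamma, beta):
    """Apply feature-wise linear modulation.

    features: array of shape (C, H, W), e.g. a CNN feature map.
    gamma, beta: arrays of shape (C,), one scale and one shift per channel.
    """
    return gamma[:, None, None] * features + beta[:, None, None]

rng = np.random.default_rng(0)

# Hypothetical stand-in for the GNN output describing the candidate graph.
graph_embedding = rng.normal(size=8)

# Hypothetical linear projections from the embedding to FiLM parameters.
W_gamma = rng.normal(size=(4, 8))
W_beta = rng.normal(size=(4, 8))
gamma = W_gamma @ graph_embedding
beta = W_beta @ graph_embedding

# Hypothetical CNN feature map with 4 channels on a 5x5 grid.
feature_map = rng.normal(size=(4, 5, 5))
modulated = film(feature_map, gamma, beta)
```

In this toy setup the image features stay spatially intact; the candidate graph only decides how strongly each channel contributes.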

Computational Biology
3D scatter plot showing left and right point sets with rotation axis and quaternion rotation arc

Horn's Method: Absolute Orientation via Unit Quaternions

Derives the optimal rotation between two 3D point sets as the eigenvector of a 4x4 symmetric matrix built from cross-covariance sums, using unit quaternions to enforce the orthogonality constraint.
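A minimal NumPy sketch of this construction follows, assuming noise-free correspondences and unit weights. The function name `horn_rotation` and the test data are illustrative choices; the sketch recovers only the rotation, not scale or translation.

```python
import numpy as np

def horn_rotation(left, right):
    """Estimate the rotation R such that right ~ R @ left (after centering).

    left, right: arrays of shape (n, 3) with corresponding rows.
    """
    left_c = left - left.mean(axis=0)
    right_c = right - right.mean(axis=0)

    # Cross-covariance sums: M[a, b] = sum_i left_c[i, a] * right_c[i, b].
    M = left_c.T @ right_c
    Sxx, Sxy, Sxz = M[0]
    Syx, Syy, Syz = M[1]
    Szx, Szy, Szz = M[2]

    # Symmetric 4x4 matrix whose top eigenvector is the optimal unit quaternion.
    N = np.array([
        [Sxx + Syy + Szz, Syz - Szy,        Szx - Sxz,        Sxy - Syx],
        [Syz - Szy,       Sxx - Syy - Szz,  Sxy + Syx,        Szx + Sxz],
        [Szx - Sxz,       Sxy + Syx,       -Sxx + Syy - Szz,  Syz + Szy],
        [Sxy - Syx,       Szx + Sxz,        Syz + Szy,       -Sxx - Syy + Szz],
    ])

    # eigh returns eigenvalues in ascending order; take the last eigenvector.
    _, eigvecs = np.linalg.eigh(N)
    w, x, y, z = eigvecs[:, -1]

    # Convert the unit quaternion (w, x, y, z) to a rotation matrix.
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

# Illustrative data: rotate random points by 0.7 rad about z, then translate.
theta = 0.7
R_true = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
rng = np.random.default_rng(0)
left = rng.normal(size=(10, 3))
right = left @ R_true.T + np.array([1.0, -2.0, 0.5])
R_horn = horn_rotation(left, right)
```

Because the quaternion is normalized, the resulting matrix is a proper rotation by construction, so no separate reflection check is needed.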

Computational Biology
3D scatter plot showing source points, target points, and Kabsch-aligned points overlapping the targets

Kabsch Algorithm: Optimal Rotation for Point Set Alignment

A foundational 1976 short communication presenting a direct, non-iterative method for finding the best rotation matrix between two point sets via eigendecomposition of a symmetric matrix built from their cross-covariance matrix.
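The sketch below uses the SVD formulation that is now commonly taught under the Kabsch name, rather than the eigendecomposition route of the original communication. The function name `kabsch_rotation` and the test data are illustrative assumptions.

```python
import numpy as np

def kabsch_rotation(source, target):
    """Estimate the proper rotation R minimizing sum ||R @ p_i - q_i||^2.

    source, target: arrays of shape (n, 3) with corresponding rows.
    Both sets are centered on their centroids before fitting.
    """
    P = source - source.mean(axis=0)
    Q = target - target.mean(axis=0)

    # Cross-covariance matrix H = sum_i p_i q_i^T.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T

def rot_z(a):
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

def rot_x(a):
    return np.array([[1.0, 0.0,        0.0],
                     [0.0, np.cos(a), -np.sin(a)],
                     [0.0, np.sin(a),  np.cos(a)]])

# Illustrative data: a known rotation plus a translation.
R_known = rot_z(0.4) @ rot_x(-1.1)
rng = np.random.default_rng(1)
source = rng.normal(size=(12, 3))
target = source @ R_known.T + np.array([0.3, 0.0, -2.0])

R_kabsch = kabsch_rotation(source, target)
aligned = (source - source.mean(axis=0)) @ R_kabsch.T + target.mean(axis=0)
```

The determinant correction matters mainly for noisy or near-planar data, where the unconstrained optimum can be a reflection.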

Generative Modeling
LDM architecture diagram showing conditioning via concatenation and cross-attention

Latent Diffusion Models for High-Res Image Synthesis

This paper introduces Latent Diffusion Models (LDMs), which apply denoising diffusion in the latent space of pretrained autoencoders. By separating perceptual compression from generative learning and adding cross-attention conditioning, LDMs achieve FID 1.50 on Places inpainting and FID 3.60 on ImageNet class-conditional synthesis, with competitive text-to-image generation, at a fraction of the compute cost of pixel-space diffusion.

Optical Chemical Structure Recognition
Uni-Parser pipeline diagram showing document pre-processing, layout detection, semantic parsing, content gathering, and format conversion stages

Uni-Parser: Industrial-Grade Multi-Modal PDF Parsing (2025)

A technical report on Uni-Parser, an industrial-grade document parsing engine that uses a modular multi-expert architecture to parse scientific PDFs into structured representations. It integrates MolParser 1.5 for OCSR, achieving 88.6% accuracy on chemical structures while processing up to 20 pages per second.

Optical Chemical Structure Recognition
Diagram showing graph traversal chain-of-thought parsing of a molecular structure image into atom and bond predictions

GTR-CoT: Graph Traversal Chain-of-Thought for Molecules

A 2025 Vision-Language Model for OCSR that uses graph traversal chain-of-thought reasoning and a two-stage SFT plus GRPO training scheme to handle both printed molecules (including chemical abbreviations like Ph and Et) and hand-drawn structures, achieving strong performance on the new MolRec-Bench benchmark.

Optical Chemical Structure Recognition
OCSU: Optical Chemical Structure Understanding

OCSU: Optical Chemical Structure Understanding (2025)

This paper proposes the ‘Optical Chemical Structure Understanding’ (OCSU) task, which translates molecular images into multi-level descriptions (motifs, IUPAC names, SMILES). It introduces the Vis-CheBI20 dataset and two paradigms: DoubleCheck (OCSR-based) and Mol-VL (OCSR-free).

Machine Learning
SE(3)-Transformer architecture showing invariant attention weights modulating equivariant value messages on a 3D point cloud

SE(3)-Transformers: Equivariant Attention for 3D Data

Fuchs et al. introduce the SE(3)-Transformer, which combines self-attention with SE(3)-equivariance for 3D point clouds and graphs. Invariant attention weights modulate equivariant value messages from tensor field networks, resolving angular filter constraints while enabling data-adaptive, anisotropic processing.
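The idea of invariant weights modulating equivariant messages can be checked numerically with a toy layer. The sketch below is not the SE(3)-Transformer: it assumes attention logits based only on pairwise distances and uses relative position vectors as stand-ins for the equivariant value messages, in place of tensor field network outputs.

```python
import numpy as np

def toy_equivariant_attention(x):
    """Toy attention over a point cloud x of shape (n, 3), n >= 2.

    Returns (weights, out): weights[i, j] depends only on distances
    (invariant), and out[i] is a weighted sum of relative vectors
    x[j] - x[i] (rotates together with the input).
    """
    diff = x[None, :, :] - x[:, None, :]      # diff[i, j] = x[j] - x[i]
    logits = -(diff ** 2).sum(axis=-1)        # invariant: squared distances
    np.fill_diagonal(logits, -np.inf)         # no attention to self
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    out = (weights[:, :, None] * diff).sum(axis=1)
    return weights, out

# Illustrative check: rotate and translate the cloud, then compare outputs.
a, b = 0.9, -0.5
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0,        0.0,       1.0]])
Rx = np.array([[1.0, 0.0,        0.0],
               [0.0, np.cos(b), -np.sin(b)],
               [0.0, np.sin(b),  np.cos(b)]])
R = Rz @ Rx

rng = np.random.default_rng(2)
points = rng.normal(size=(6, 3))
moved = points @ R.T + np.array([2.0, -1.0, 0.5])

w_orig, out_orig = toy_equivariant_attention(points)
w_moved, out_moved = toy_equivariant_attention(moved)
```

Here `w_moved` matches `w_orig`, while `out_moved` equals `out_orig` rotated by `R`, which is the invariance and equivariance pattern the summary describes.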

Machine Learning
Comparison of planar CNN (translation only) versus spherical CNN (SO(3)-equivariant) showing how filters rotate on the sphere

Spherical CNNs: Rotation-Equivariant Networks on the Sphere

Cohen et al. introduce Spherical CNNs that achieve SO(3)-equivariance by defining cross-correlation on the sphere and rotation group, computed efficiently via generalized FFT algorithms from non-commutative harmonic analysis.

Computational Chemistry
Chemical structures and molecular representations feeding into a neural network model that processes atomized chemical knowledge

ChemDFM-R: Chemical Reasoning LLM with Atomized Knowledge

ChemDFM-R is a 14B-parameter chemical reasoning model that integrates a 101B-token dataset of atomized chemical knowledge. Using a mix-sourced distillation strategy and domain-specific reinforcement learning, it outperforms similarly sized models and DeepSeek-R1 on ChemEval.

Molecular Representations
ChemBERTa-2 visualization showing flowing SMILES strings in blue tones representing molecular data streams

ChemBERTa-2: Scaling Molecular Transformers to 77M

This work investigates the scaling hypothesis for molecular transformers, training RoBERTa models on 77M SMILES from PubChem. It compares Masked Language Modeling (MLM) against Multi-Task Regression (MTR) pretraining, finding that MTR yields better downstream performance but is computationally heavier.