Kabsch-Horn Cookbook: Differentiable Alignment
MolGen: Molecular Generation with Chemical Feedback
Molecular Transformer: Calibrated Reaction Prediction
Arun et al.: SVD-Based Least-Squares Fitting of 3D Points
Exposing Limitations of Molecular ML with Activity Cliffs
Horn et al.: Absolute Orientation Using Orthonormal Matrices
MoLFormer: Large-Scale Chemical Language Representations
SELFormer: A SELFIES-Based Molecular Language Model
Umeyama’s Method: Corrected SVD for Point Alignment
AdaptMol: Domain Adaptation for Molecular OCSR (2026)
Consistency Models: Fast One-Step Diffusion Generation
D3PM: Discrete Denoising Diffusion Probabilistic Models
GraphReco: Probabilistic Structure Recognition (2026)
GraSP: Graph Recognition via Subgraph Prediction (2026)
Horn’s Method: Absolute Orientation via Unit Quaternions
Kabsch Algorithm: Optimal Rotation for Point Set Alignment
Latent Diffusion Models for High-Res Image Synthesis
Uni-Parser: Industrial-Grade Multi-Modal PDF Parsing (2025)
Can Recurrent Neural Networks Warp Time? (ICLR 2018)
Relational Inductive Biases in Deep Learning (2018)
Scaling Laws vs Model Architectures: Inductive Bias
SE(3)-Transformers: Equivariant Attention for 3D Data
Spherical CNNs: Rotation-Equivariant Networks on the Sphere
The Quarks of Attention: Building Blocks of Attention
Molecular Sets (MOSES): A Generative Modeling Benchmark
The Reliability Trap: The Limits of 99% Accuracy
The Evolution of Page Stream Segmentation: Rules to LLMs
GutenOCR: A Grounded Vision-Language Front-End for Documents
PubMed-OCR: PMC Open Access OCR Annotations
ChemBERTa-3: Open Source Chemical Foundation Models
ChemDFM-R: Chemical Reasoning LLM with Atomized Knowledge
ChemBERTa-2: Scaling Molecular Transformers to 77M
GP-MoLFormer: Molecular Generation via Transformers
ChemBERTa: Molecular Property Prediction via Transformers
Chemformer: A Pre-trained Transformer for Comp Chem
A Convexity Principle for Interacting Gases (McCann 1997)
Building Normalizing Flows with Stochastic Interpolants
Flow Matching for Generative Modeling: Scalable CNFs
Neural ODEs: Continuous-Depth Deep Learning Models
Rectified Flow: Learning to Generate and Transfer Data
Score Matching and Denoising Autoencoders: A Connection
Score-Based Generative Modeling with SDEs (Song 2021)
ChemDFM-X: Multimodal Foundation Model for Chemistry
DynamicFlow: Integrating Protein Dynamics into Drug Design
Image-to-Sequence OCSR: A Comparative Analysis
InstructMol: Multi-Modal Molecular LLM for Drug Discovery
InvMSAFold: Generative Inverse Folding with Potts Models
MERMaid: Multimodal Chemical Reaction Mining from PDFs
MOFFlow: Flow Matching for MOF Structure Prediction
Multimodal Search in Chemical Documents and Reactions
OCSAug: Diffusion-Based Augmentation for Hand-Drawn OCSR
STOUT V2.0: Transformer-Based SMILES to IUPAC Translation
STOUT: SMILES to IUPAC Names via Neural Machine Translation
Struct2IUPAC: Translating SMILES to IUPAC via Transformers
Translating InChI to IUPAC Names with Transformers
AtomLenz: Atom-Level OCSR with Limited Supervision
Benchmarking Eight OCSR Tools on Patent Images (2024)
ChemReco: Hand-Drawn Chemical Structure Recognition
ChemVLM: A Multimodal Large Language Model for Chemistry
DECIMER.ai: Optical Chemical Structure Recognition
Dual-Path Global Awareness Transformer (DGAT) for OCSR
Enhanced DECIMER for Hand-Drawn Structure Recognition
Image2InChI: SwinTransformer for Molecular Recognition
MarkushGrapher: Multi-modal Markush Structure Recognition
MMSSC-Net: Multi-Stage Sequence Cognitive Networks
MolGrapher: Graph-based Chemical Structure Recognition
MolMole: Unified Vision Pipeline for Molecule Mining
MolScribe: Robust Image-to-Graph Molecular Recognition
MolSight: OCSR with RL and Multi-Granularity Learning
ABC-Net: Keypoint-Based Molecular Image Recognition
ChemPix: Hand-Drawn Hydrocarbon Structure Recognition
DECIMER 1.0: Transformers for Chemical Image Recognition
End-to-End Transformer for Molecular Image Captioning
Handwritten Chemical Structure Recognition with RCGD
ICMDT: Automated Chemical Structure Image Recognition
Image-to-Graph Transformers for Chemical Structures
Image2SMILES: Transformer OCSR with Synthetic Data Pipeline
MICER: Molecular Image Captioning with Transfer Learning
MolMiner: Deep Learning OCSR with YOLOv5 Detection
One Strike, You’re Out: Detecting Markush Structures
Review of OCSR Techniques and Models (Musazade 2022)
String Representations for Chemical Image Recognition
SwinOCSR: End-to-End Chemical OCR with Swin Transformers
A Review of Optical Chemical Structure Recognition Tools
ChemGrapher: Deep Learning for Chemical Graph OCSR
DECIMER: Deep Learning for Chemical Image Recognition
Deep Learning for Molecular Structure Extraction (2019)
Handwritten Chemical Ring Recognition with Neural Networks
Handwritten Chemical Symbol Recognition Using SVMs
HMM-based Online Recognition of Chemical Symbols
Img2Mol: Accurate SMILES Recognition from Depictions
On-line Handwritten Chemical Expression Recognition
Online Handwritten Chemical Formula Structure Analysis
Recognition of On-line Handwritten Chemical Expressions
SVM-HMM Online Classifier for Chemical Symbols
Unified Framework for Handwritten Chemical Expressions
Chemical Structure Reconstruction with chemoCR (2011)
ChemReader Image-to-Structure OCR at TREC 2011 Chemical IR
CLEF-IP 2012: Patent and Chemical Structure Benchmark
MolRec at CLEF 2012: Rule-Based Structure Recognition
OSRA at CLEF-IP 2012: Native TIFF Processing for Patents
Overview of the TREC 2011 Chemical IR Track Benchmark
Probabilistic OCSR with Markov Logic Networks
Research on Chemical Expression Images Recognition
Chemical Structure Recognition (Rule-Based)
ChemInk: Real-Time Recognition for Chemical Drawings
CLiDE Pro: Optical Chemical Structure Recognition Tool
Imago: Open-Source Chemical Structure Recognition (2011)
Kekulé-1 System for Chemical Structure Recognition
OSRA at TREC-CHEM 2011: Optical Structure Recognition
Structural Analysis of Handwritten Chemical Formulas
A Spatial Model for Legislative Roll Call Analysis
Automatic Recognition of Chemical Images
Chaotic Evolution of the Solar System (Sussman 1992)
Chemical Literature Data Extraction: The CLiDE Project
ChemReader: Automated Structure Extraction
Distributed Representations: A Foundational Theory
Drive to Life on Wet and Icy Worlds: Alkaline Vent Theory
Dynamical Corrections to TST for Surface Diffusion
Embedded-Atom Method User Guide: Voter’s 1994 Chapter
Embedded-Atom Method: Theory and Applications Review
Evans 1986: Thermal Conductivity of Lennard-Jones Fluid
Funnels, Pathways, and Energy Landscapes of Protein Folding
Graph Perception for Chemical Structure OCR
Hand-Drawn Chemical Diagram Recognition (AAAI 2007)
IMG2SMI: Translating Molecular Structure Images to SMILES
In Situ XRD of Oxidation-Reduction Oscillations on Pt/SiO2
Kekulé: OCR-Optical Chemical Recognition
Kinetic Oscillations in CO Oxidation on Pt(100): Theory
MD Simulation of Self-Diffusion on Metal Surfaces (1994)
Mixture Density Networks: Modeling Multimodal Distributions
OCSR Methods: A Taxonomy of Approaches
Optical Recognition of Chemical Graphics
Oscillatory CO Oxidation on Pt(110): Temporal Modeling
OSRA: Open Source Optical Structure Recognition
Party Matters: Enhancing Legislative Vote Embeddings
Reconstruction of Chemical Molecules from Images
Second-Order Langevin Equation for Field Simulations
Stillinger-Weber Potential for Silicon Simulation
Tea Party in the House: Legislative Ideology via HIPTM
Three Domains of Life: Woese’s Phylogenetic Revolution
Adatom Dimer Diffusion on fcc(111) Crystal Surfaces
AI & Physical Sciences Taxonomy: A Seven-Vector Framework
Correlations in the Motion of Atoms in Liquid Argon
Terraforming Venus With the Cloud Continent Proposal
Venus Evolution Through Time: Key Questions and Missions
Life on Venus? Astrobiology and the Habitability Limits
Molecular String Renderer: Robust Visualization Tool
Auto-Encoding Variational Bayes: VAE Paper Summary
Importance Weighted Autoencoders (IWAE) for Tighter Bounds
Importance Weighted Autoencoders: Beyond the Standard VAE
InChI and Tautomerism: Toward Comprehensive Treatment
InChI: The Worldwide Chemical Structure Identifier Standard
Making InChI FAIR and Sustainable for Inorganic Chemistry
Mixfile & MInChI: Machine-Readable Mixture Formats
NInChI: Toward a Chemical Identifier for Nanomaterials
Recent Advances in the SELFIES Library: 2023 Update
RInChI: The Reaction International Chemical Identifier
SELFIES: The Original Paper on Robust Molecular Strings
SMILES Notation: The Original Paper by Weininger (1988)
MolRec: Chemical Structure Recognition at CLEF 2012
MolRec: Rule-Based OCSR System at TREC 2011 Benchmark
What is Optical Chemical Structure Recognition (OCSR)?
αExtractor: Chemical Info from Biomedical Literature
ChemInfty: Chemical Structure Recognition in Patent Images
MolNexTR: A Dual-Stream Molecular Image Recognition
MolParser-7M & WildMol: Large-Scale OCSR Datasets
MolParser: End-to-End Molecular Structure Recognition
ZINC-22: A Multi-Billion Scale Database for Ligand Discovery
Converting SMILES and SELFIES to 2D Molecular Images
SELFIES: The 100% Robust Molecular String Representation
Communication in the Presence of Noise: Shannon’s 1949 Paper
How to Fold Graciously: Levinthal’s Paradox (1969)
MARCEL: Molecular Conformer Ensemble Learning Benchmark
SMILES: A Compact Notation for Chemical Structures
The Müller-Brown Potential: A 2D Benchmark Surface
The Number of Isomeric Hydrocarbons of the Methane Series
The Surface of Venus: Stratigraphy and Resurfacing History
GEOM: Energy-Annotated Molecular Conformations Dataset
Exponential Random Numbers: Two Classic Algorithms
GDB-11: Chemical Universe Database (26.4M Molecules)
Implementing the Müller-Brown Potential in PyTorch
Müller-Brown Potential: A PyTorch ML Testbed
DenoiseVAE: Adaptive Noise for Molecular Pre-training
Beyond Atoms: 3D Space Modeling for Molecular Pretraining
Dark Side of Forces: Non-Conservative ML Force Models
Efficient DFT Hamiltonian Prediction via Adaptive Sparsity
Learning Smooth Interatomic Potentials with eSEN (ICML)
Modernizing Rahman’s 1964 Argon Simulation
Embedded-Atom Method: Impurities and Defects in Metals
Umbrella Sampling: Monte Carlo Free-Energy Estimation
Contrastive Learning for Variational Autoencoder Priors
Lennard-Jones on Adsorption and Diffusion on Surfaces
GDB-13: Chemical Universe Database (970M Molecules)
GDB-17: Chemical Universe Database (166.4B Molecules)
High-Performance Word2Vec in Pure PyTorch
GEOM Dataset: 3D Molecular Conformer Generation
GTR-CoT: Graph Traversal Chain-of-Thought for Molecules
SubGrapher: Visual Fingerprinting of Chemical Structures
OCSU: Optical Chemical Structure Understanding (2025)
3D Steerable CNNs: Rotationally Equivariant Features
LLMs for Insurance Document Automation
RFL: Simplifying Chemical Structure Recognition (AAAI 2025)
Optimizing Sequence Models for Dynamical Systems
LLMs for Page Stream Segmentation
The Nature of LUCA and Its Impact on the Early Earth System
Invalid SMILES Benefit Chemical Language Models: A Study
Synthetic Isomer Data Generation Pipeline
Modern PyTorch VAEs: A Detailed Implementation Guide
Sarcasm Detection with Transformers: A Cautionary Tale
Hearing Molecular Shape via Coulomb Matrix Eigenvalues
Classifying Congressional Bills with Machine Learning
Coulomb Matrices for Molecular Machine Learning
How Does Congress Actually Work? Data from 15K Bills
Kabsch Algorithm: NumPy, PyTorch, TensorFlow, and JAX
LAMMPS Tutorial: Copper and Platinum Adatom Diffusion
Automated Adatom Diffusion Workflow
Generating Mini-Protein Trajectories with GROMACS
Mini-Protein Trajectory Generation
Congressional Knowledge Graph & Policy Classification
SELFIES and the Future of Molecular String Representations
IQCRNN: Certified Stability for Neural Networks
Analytical Solution to Word2Vec Softmax & Bias Probing
EigenNoise: Data-Free Word Vector Initialization
Look, Don’t Tweet: Unified Data Models for Social NLP
PyConversations: Social Media Conversational Analysis
GPT-2 Susceptibility to Universal Adversarial Triggers
5 Axes of Multi-Arm Bandit Problems: A Practical Guide
NewsTweet Dataset: Social Media in Digital Journalism
Coordinated Social Targeting on Twitter
Data-Driven WordNet Construction from Wiktionary
A Guide to Neuroevolution: NEAT and HyperNEAT
Breaking Down Machine Learning for the Average Person
Foundations of AI: Knowledge-Based Agents and Logic
Cartesian Genetic Programming in Julia
QuAC: Question Answering in Context Dataset
CoQA Dataset: Advancing Conversational Question Answering
Understanding GANs: From Fundamentals to Objective Functions
Word Embeddings in NLP: An Introduction
Rubik’s Cube Sonification