2026  5

February  3

Molecular Sets (MOSES): A Generative Modeling Benchmark

2026-02-16 · Hunter Heidenreich

The Reliability Trap: The Limits of 99% Accuracy

2026-02-15 · 16 min · 3222 words · Hunter Heidenreich

The Evolution of Page Stream Segmentation: Rules to LLMs

2026-02-14 · 14 min · 2911 words · Hunter Heidenreich

January  2

GutenOCR: A Grounded Vision-Language Front-End for Documents

2026-01-20 · Hunter Heidenreich

PubMed-OCR: PMC Open Access OCR Annotations

2026-01-16 · Hunter Heidenreich

2025  176

December  121

ChemBERTa-3: Open Source Training Framework

2025-12-26 · Hunter Heidenreich

ChemDFM-R: Chemical Reasoner LLM

2025-12-26 · Hunter Heidenreich

ChemBERTa-2: Scaling Molecular Transformers to 77M

2025-12-25 · Hunter Heidenreich

GP-MoLFormer: Molecular Generation via Transformers

2025-12-25 · Hunter Heidenreich

ChemBERTa: Molecular Property Prediction via Transformers

2025-12-23 · Hunter Heidenreich

Chemformer: Pre-trained Transformer for Comp Chem

2025-12-23 · Hunter Heidenreich

A Convexity Principle for Interacting Gases: Theory

2025-12-21 · Hunter Heidenreich

Building Normalizing Flows with Stochastic Interpolants

2025-12-21 · Hunter Heidenreich

Flow Matching for Generative Modeling: Scalable CNFs

2025-12-21 · Hunter Heidenreich

Neural ODEs: Continuous-Depth Deep Learning

2025-12-21 · Hunter Heidenreich

Rectified Flow: Learning to Generate and Transfer Data

2025-12-21 · Hunter Heidenreich

Score Matching and Denoising Autoencoders

2025-12-21 · Hunter Heidenreich

Score-Based Generative Modeling with SDEs

2025-12-21 · Hunter Heidenreich

ChemDFM-X: Large Multimodal Model for Chemistry

2025-12-20 · Hunter Heidenreich

DynamicFlow: Integrating Protein Dynamics into Drug Design

2025-12-20 · Hunter Heidenreich

Image-to-Sequence OCSR: A Comparative Analysis

InstructMol: Multi-Modal Molecular Assistant

2025-12-20 · Hunter Heidenreich

InvMSAFold: Generative Inverse Folding with Potts Models

2025-12-20 · Hunter Heidenreich

MERMaid: Multimodal Reaction Mining

2025-12-20 · Hunter Heidenreich

MOFFlow: Flow Matching for MOF Structure Prediction

2025-12-20 · Hunter Heidenreich

Multimodal Search in Chemical Documents

2025-12-20 · Hunter Heidenreich

OCSAug: Diffusion-Based Augmentation for Hand-Drawn OCSR

2025-12-20 · Hunter Heidenreich

STOUT V2.0: SMILES to IUPAC Name Conversion

2025-12-20 · Hunter Heidenreich

STOUT: SMILES to IUPAC names using NMT

2025-12-20 · Hunter Heidenreich

Struct2IUPAC: Transformers for SMILES to IUPAC

2025-12-20 · Hunter Heidenreich

Translating InChI to IUPAC Names with Transformers

2025-12-20 · Hunter Heidenreich

AtomLenz: Atom-Level OCSR with Limited Supervision

ChemReco: Hand-Drawn Chemical Structure Recognition

ChemVLM: Multimodal LLM for Chemistry

2025-12-19 · Hunter Heidenreich

Comparing OCSR Tools (Krasnov et al. 2024)

DECIMER.ai: Optical Chemical Structure Recognition

Dual-Path Global Awareness Transformer (DGAT)

2025-12-19 · Hunter Heidenreich

Enhanced DECIMER for Hand-Drawn Structure Recognition

2025-12-19 · Hunter Heidenreich

Image2InChI: SwinTransformer for Molecular Recognition

2025-12-19 · Hunter Heidenreich

MarkushGrapher: Multi-modal Markush Structure Recognition

2025-12-19 · Hunter Heidenreich

MMSSC-Net: Multi-Stage Sequence Cognitive Networks

2025-12-19 · Hunter Heidenreich

MolGrapher: Graph-based Chemical Recognition

MolMole: Unified Vision Pipeline for Molecule Mining

MolScribe: Image-to-Graph Molecular Recognition

2025-12-19 · Hunter Heidenreich

MolSight: OCSR with RL and Multi-Granularity Learning

OCSU: Optical Chemical Structure Understanding

RFL: Simplifying Chemical Structure Recognition

ABC-Net: Divide-and-Conquer SMILES Recognition

ChemPix: Hand-Drawn Hydrocarbon Recognition

DECIMER 1.0: Transformers for Chemical Image Recognition

2025-12-18 · Hunter Heidenreich

End-to-End Transformer for Molecular Image Captioning

Handwritten Chemical Structure Recognition with RCGD

ICMDT: Automated Chemical Image Recognition

Image-to-Graph Transformers

Image2SMILES: Transformer OCSR with Synthetic Data Pipeline

2025-12-18 · Hunter Heidenreich

MICER: Molecular Image Captioning with Transfer Learning

MolMiner: Deep Learning OCSR with YOLOv5 Detection

One Strike, You’re Out: Detecting Markush Structures

Review of OCSR Techniques (2022)

String Representations for Chemical Image Recognition

SwinOCSR: Vision Transformers for Chemical OCR

2025-12-18 · Hunter Heidenreich

ChemGrapher: Deep Learning for Chemical OCR

DECIMER: Deep Learning for Chemical Image Recognition

Deep Learning for Molecular Structure Extraction

Handwritten Chemical Ring Recognition with NNs

Handwritten Chemical Symbol Recognition Using SVMs

HMM-based Online Recognition of Chemical Symbols

Img2Mol: Accurate SMILES from Molecular Depictions

On-line Handwritten Chemical Expression Recognition

Online Handwritten Chemical Formula Structure Analysis

Recognition of On-line Handwritten Chemical Expressions

Review of OCSR Tools (2020)

SVM-HMM Online Classifier for Chemical Symbols

Unified Framework for Handwritten Chemical Expressions

Chemical Structure Reconstruction with chemoCR

ChemReader at TREC 2011 Chemical IR Track

CLEF-IP 2012 Benchmark Overview

2025-12-16 · Hunter Heidenreich

MolRec at CLEF 2012

OSRA at CLEF-IP 2012

Overview of TREC 2011 Chemical IR Track

2025-12-16 · Hunter Heidenreich

Probabilistic OCSR with Markov Logic Networks

Research on Chemical Expression Images Recognition

Chemical Structure Recognition (Rule-Based)

ChemInk: Real-Time Recognition for Chemical Drawings

CLiDE Pro: Optical Chemical Structure Recognition Tool

Imago: Structure Recognition at TREC-CHEM 2011

Kekulé-1 System for Chemical Structure Recognition

OSRA: Optical Structure Recognition Application

Structural Analysis of Handwritten Chemical Formulas

A Spatial Model for Legislative Roll Call Analysis

2025-12-14 · Hunter Heidenreich

Automatic Recognition of Chemical Images

2025-12-14 · Hunter Heidenreich

Chaotic Evolution of the Solar System (1992)

2025-12-14 · Hunter Heidenreich

Chemical Literature Data Extraction: The CLiDE Project

Chemical Machine Vision

ChemReader: Automated Structure Extraction

Distributed Representations: A Foundational Theory

2025-12-14 · Hunter Heidenreich

Dynamical Corrections to TST for Surface Diffusion

2025-12-14 · Hunter Heidenreich

EAM User Guide: Voter’s Handbook Chapter

2025-12-14 · Hunter Heidenreich

Embedded-Atom Method: Theory and Applications Review

2025-12-14 · Hunter Heidenreich

Funnels, Pathways, and Energy Landscapes of Protein Folding

2025-12-14 · Hunter Heidenreich

Graph Perception for Chemical Structure OCR

Hand Drawn Chemical Diagram Recognition

IMG2SMI: Translating Molecular Structure Images to SMILES

Kekulé: OCR-Optical Chemical Recognition

Kinetic Oscillations on Pt(100): Theory

2025-12-14 · Hunter Heidenreich

MD Study of Self-Diffusion on Metal Surfaces

2025-12-14 · Hunter Heidenreich

Mixture Density Networks: Modeling Multimodal Distributions

2025-12-14 · Hunter Heidenreich

OCSR Methods: A Taxonomy of Approaches

Optical Recognition of Chemical Graphics

2025-12-14 · Hunter Heidenreich

Oscillatory CO Oxidation on Pt(110)

2025-12-14 · Hunter Heidenreich

OSRA: Open Source Optical Structure Recognition

2025-12-14 · Hunter Heidenreich

Oxidation/Reduction Oscillations on Pt/SiO2

2025-12-14 · Hunter Heidenreich

Party Matters: Enhancing Legislative Embeddings

2025-12-14 · Hunter Heidenreich

Reconstruction of Chemical Molecules from Images

2025-12-14 · Hunter Heidenreich

Second-Order Langevin Equation for Field Simulations

2025-12-14 · Hunter Heidenreich

Stillinger-Weber Potential for Silicon

2025-12-14 · Hunter Heidenreich

Tea Party in the House

2025-12-14 · Hunter Heidenreich

The Drive to Life on Wet and Icy Worlds

2025-12-14 · Hunter Heidenreich

Thermal Conductivity of the Lennard-Jones Fluid

2025-12-14 · Hunter Heidenreich

Three Domains of Life: Woese’s Phylogenetic Revolution

2025-12-14 · Hunter Heidenreich

AI & Physical Sciences Taxonomy: A Six-Vector Framework

2025-12-13 · Hunter Heidenreich

Correlations in Motion of Atoms in Liquid Argon

2025-12-13 · Hunter Heidenreich

Diffusion of Adatom Dimers on (111) Surfaces

2025-12-13 · Hunter Heidenreich

Terraforming Venus: The Cloud Continent Proposal

2025-12-07 · Hunter Heidenreich

Venus Evolution Through Time

2025-12-07 · Hunter Heidenreich

Life on Venus? Astrobiology and Habitability Limits

2025-12-05 · Hunter Heidenreich

November  4

Molecular String Renderer: Robust Visualization Tool

2025-11-30 · Hunter Heidenreich

Auto-Encoding Variational Bayes: VAE Paper Summary

2025-11-05 · Hunter Heidenreich

Importance Weighted Autoencoders: Beyond the Standard VAE

2025-11-05 · 7 min · 1355 words · Hunter Heidenreich

IWAE: Importance Weighted Autoencoders

2025-11-05 · Hunter Heidenreich

October  19

InChI and Tautomerism: Toward Comprehensive Treatment

2025-10-12 · Hunter Heidenreich

InChI: The Worldwide Chemical Structure Identifier Standard

2025-10-12 · Hunter Heidenreich

Making InChI FAIR and Sustainable for Inorganic Chemistry

2025-10-12 · Hunter Heidenreich

Mixfile & MInChI: Machine-Readable Mixture Formats

2025-10-12 · Hunter Heidenreich

NInChI: Toward a Chemical Identifier for Nanomaterials

2025-10-12 · Hunter Heidenreich

Recent Advances in the SELFIES Library (2023)

2025-10-12 · Hunter Heidenreich

RInChI: Reaction International Chemical Identifier

2025-10-12 · Hunter Heidenreich

SELFIES: The Original Paper (Krenn et al. 2020)

2025-10-12 · Hunter Heidenreich

SMILES: The Original Paper (Weininger 1988)

GTR-CoT: Graph Traversal Chain-of-Thought for Molecules

MolRec: Chemical Structure Recognition at CLEF 2012

MolRec: Rule-Based OCSR System

SubGrapher: Visual Fingerprinting of Chemical Structures

What is Optical Chemical Structure Recognition (OCSR)?

2025-10-11 · 8 min · 1519 words · Hunter Heidenreich

αExtractor: Chemical Info from Biomedical Literature

ChemInfty: Chemical Structure Recognition in Patent Images

MolNexTR: Dual-Stream Molecular Image Recognition

MolParser-7M & WildMol: Large-Scale OCSR Datasets

2025-10-03 · Hunter Heidenreich

MolParser: End-to-End Molecular Structure Recognition

September  11

ZINC-22: A Multi-Billion Scale Database for Ligand Discovery

2025-09-27 · Hunter Heidenreich

Converting SMILES and SELFIES to 2D Molecular Images

2025-09-12 · 7 min · 1447 words · Hunter Heidenreich

SELFIES (Self-Referencing Embedded Strings)

2025-09-12 · 6 min · 1138 words · Hunter Heidenreich

Communication in the Presence of Noise: Shannon’s 1949 Paper

2025-09-08 · Hunter Heidenreich

How to Fold Graciously: The Levinthal Paradox

2025-09-08 · Hunter Heidenreich

MARCEL: Molecular Representation & Conformers

2025-09-08 · Hunter Heidenreich

Müller-Brown Potential

2025-09-08 · 6 min · 1259 words · Hunter Heidenreich

SMILES: Compact Notation for Chemical Structures

2025-09-08 · Hunter Heidenreich

The Number of Isomeric Hydrocarbons of the Methane Series

2025-09-08 · Hunter Heidenreich

The Surface of Venus: Stratigraphy and Resurfacing

GEOM: Energy-Annotated Molecular Conformations

2025-09-04 · Hunter Heidenreich

August  19

Exponential Random Numbers: Two Classic Algorithms

2025-08-31 · 7 min · 1326 words · Hunter Heidenreich

GDB-11: Chemical Universe Database (26.4M Molecules)

2025-08-29 · Hunter Heidenreich

Implementing the Müller-Brown Potential in PyTorch

2025-08-27 · 17 min · 3470 words · Hunter Heidenreich

Müller-Brown Potential: A PyTorch ML Testbed

2025-08-27 · Hunter Heidenreich

DenoiseVAE: Adaptive Noise for Molecular Pre-training

2025-08-24 · Hunter Heidenreich

Beyond Atoms: 3D Space Modeling for Molecular Pretraining

2025-08-23 · Hunter Heidenreich

Dark Side of Forces: Non-Conservative ML Force Models

2025-08-23 · Hunter Heidenreich

Efficient DFT Hamiltonian Prediction via Adaptive Sparsity

2025-08-23 · Hunter Heidenreich

Learning Smooth Interatomic Potentials with eSEN

2025-08-23 · Hunter Heidenreich

Modernizing Rahman’s 1964 Argon Simulation

2025-08-23 · Hunter Heidenreich

Modernizing Rahman’s 1964 Argon Simulation

2025-08-23 · 6 min · 1244 words · Hunter Heidenreich

Embedded-Atom Method: Impurities and Defects in Metals

2025-08-22 · Hunter Heidenreich

Umbrella Sampling: Monte Carlo Free-Energy Estimation

2025-08-21 · Hunter Heidenreich

Adsorption and Diffusion on Surfaces

2025-08-17 · Hunter Heidenreich

Contrastive Learning for Variational Autoencoder Priors

2025-08-17 · Hunter Heidenreich

GDB-13: Chemical Universe Database (970M Molecules)

2025-08-16 · Hunter Heidenreich

GDB-17: Chemical Universe Database (166.4B Molecules)

2025-08-16 · Hunter Heidenreich

High-Performance Word2Vec in Pure PyTorch

2025-08-16 · Hunter Heidenreich

GEOM Dataset: 3D Molecular Conformer Generation

2025-08-15 · 7 min · 1381 words · Hunter Heidenreich

January  2

3D Steerable CNNs: Rotationally Equivariant Features

2025-01-16 · Hunter Heidenreich

LLMs for Insurance Document Automation

2025-01-01 · Hunter Heidenreich

2024  10

October  1

Optimizing Sequence Models for Dynamical Systems

2024-10-01 · Hunter Heidenreich

August  1

LLMs for Page Stream Segmentation

2024-08-21 · Hunter Heidenreich

July  1

The Nature of LUCA and Early Earth System

2024-07-12 · Hunter Heidenreich

April  1

Invalid SMILES Benefit Chemical Language Models: A Study

2024-04-15 · Hunter Heidenreich

March  2

Synthetic Isomer Data Generation Pipeline

2024-03-09 · Hunter Heidenreich

Modern PyTorch VAEs: A Detailed Implementation Guide

2024-03-03 · 31 min · 6586 words · Hunter Heidenreich

February  4

Sarcasm Detection with Transformers: A Cautionary Tale

2024-02-25 · 5 min · 1004 words · Hunter Heidenreich

Hearing Molecular Shape via Coulomb Matrix Eigenvalues

2024-02-24 · 19 min · 3912 words · Hunter Heidenreich

Classifying Congressional Bills with Machine Learning

2024-02-21 · 13 min · 2710 words · Hunter Heidenreich

Coulomb Matrices for Molecular Machine Learning

2024-02-10 · 7 min · 1384 words · Hunter Heidenreich

2023  7

October  2

How Does Congress Actually Work? Data from 15K Bills

2023-10-05 · 6 min · 1164 words · Hunter Heidenreich

Kabsch Algorithm: NumPy, PyTorch, TensorFlow, and JAX

2023-10-03 · 15 min · 3175 words · Hunter Heidenreich

September  3

LAMMPS Tutorial: Copper and Platinum Adatom Diffusion

2023-09-27 · 11 min · 2288 words · Hunter Heidenreich

Automated Adatom Diffusion Workflow

2023-09-21 · Hunter Heidenreich

Generating Mini-Protein Trajectories with GROMACS

2023-09-21 · 6 min · 1138 words · Hunter Heidenreich

August  1

Mini-Protein Trajectory Generation

2023-08-01 · Hunter Heidenreich

March  1

Congressional Knowledge Graph & Policy Classification

2023-03-01 · Hunter Heidenreich

2022  4

October  1

SELFIES and the Future of Molecular String Representations

2022-10-14 · Hunter Heidenreich

May  3

IQCRNN: Certified Stability for Neural Networks

2022-05-11 · Hunter Heidenreich

Analytical Solution to Word2Vec Softmax & Bias Probing

2022-05-01 · Hunter Heidenreich

EigenNoise: Data-Free Word Vector Initialization

2022-05-01 · Hunter Heidenreich

2021  3

June  2

Look, Don’t Tweet: Unified Data Models for Social NLP

2021-06-30 · Hunter Heidenreich

PyConversations: Social Media Conversational Analysis

2021-06-01 · Hunter Heidenreich

May  1

GPT-2 Susceptibility to Universal Adversarial Triggers

2020  3

November  1

5 Axes of Multi-Arm Bandit Problems: A Practical Guide

2020-11-10 · 8 min · 1628 words · Hunter Heidenreich

August  1

NewsTweet Dataset: Social Media in Digital Journalism

2020-08-01 · Hunter Heidenreich

July  1

Coordinated Social Targeting on Twitter

2020-07-01 · Hunter Heidenreich

2019  2

November  1

Data-Driven WordNet Construction from Wiktionary

2019-11-01 · Hunter Heidenreich

January  1

A Guide to Neuroevolution: NEAT and HyperNEAT

2019-01-02 · 8 min · 1579 words · Hunter Heidenreich

2018  8

December  2

Breaking Down Machine Learning for the Average Person

2018-12-04 · 3 min · 559 words · Hunter Heidenreich

Foundations of AI: Knowledge-Based Agents and Logic

2018-12-01 · 8 min · 1674 words · Hunter Heidenreich

November  1

Cartesian Genetic Programming in Julia

2018-11-18 · Hunter Heidenreich

October  1

QuAC: Question Answering in Context Dataset

2018-10-31 · 5 min · 949 words · Hunter Heidenreich

August  3

CoQA Dataset: Advancing Conversational Question Answering

2018-08-23 · 5 min · 953 words · Hunter Heidenreich

Understanding GANs: From Fundamentals to Objective Functions

2018-08-18 · 13 min · 2585 words · Hunter Heidenreich

Word Embeddings in NLP: An Introduction

2018-08-05 · 9 min · 1822 words · Hunter Heidenreich

March  1

FFTW Compiler in Haskell

2018-03-15 · Hunter Heidenreich

2017  2

February  1

Term Schedule Optimizer

2017-02-15 · Hunter Heidenreich

January  1

Rubik’s Cube Sonification

2017-01-29 · Hunter Heidenreich

2014  1

October  1

Elemental Brawl

2014-10-24 · Hunter Heidenreich