Generative Modeling
Forward and Reverse SDE trajectories showing the diffusion process from data to noise and back

Score-Based Generative Modeling with SDEs (Song 2021)

This paper unifies previous score-based methods (SMLD and DDPM) under a continuous-time SDE framework. It introduces Predictor-Corrector samplers for improved generation and Probability Flow ODEs for near-exact likelihood computation, setting new records on CIFAR-10.
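As a rough illustration of the forward (data-to-noise) half of that framework, the sketch below integrates a variance-preserving SDE, dx = -0.5 β(t) x dt + sqrt(β(t)) dW, with the Euler-Maruyama method. The function name, the linear β schedule, and its endpoints (0.1 and 20) are assumptions chosen for the example, not values taken from this summary.

```python
import numpy as np

def simulate_forward_vp_sde(x0, n_steps=1000, beta_min=0.1, beta_max=20.0, seed=0):
    """Euler-Maruyama simulation of a variance-preserving SDE on t in [0, 1].

    The schedule beta(t) = beta_min + t * (beta_max - beta_min) is an
    illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        beta_t = beta_min + t * (beta_max - beta_min)
        drift = -0.5 * beta_t * x          # pulls samples toward zero
        diffusion = np.sqrt(beta_t)        # scales the injected noise
        x = x + drift * dt + diffusion * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

# "Data" concentrated at 3.0 is pushed toward a standard normal by t = 1.
x0 = np.full(10_000, 3.0)
x1 = simulate_forward_vp_sde(x0)
```

Under these assumptions the final samples should have mean close to 0 and standard deviation close to 1, regardless of where the data started. Sampling then amounts to running a learned reverse-time process from that noise back to data.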

Computational Biology
DynamicFlow illustration showing the transformation from apo pocket to holo pocket with ligand molecule generation

DynamicFlow: Integrating Protein Dynamics into Drug Design

This paper introduces DynamicFlow, a full-atom stochastic flow matching model that simultaneously generates ligand molecules and transforms protein pockets from apo to holo states. It also contributes a new dataset of MD-simulated apo-holo pairs derived from MISATO.

Computational Biology
InvMSAFold generates diverse protein sequences from structure using a Potts model

InvMSAFold: Generative Inverse Folding with Potts Models

InvMSAFold replaces autoregressive decoding with a Potts model parameter generator, enabling diverse protein sequence sampling orders of magnitude faster than ESM-IF1.

Computational Chemistry
MOFFlow assembles metal nodes and organic linkers into Metal-Organic Framework structures

MOFFlow: Flow Matching for MOF Structure Prediction

MOFFlow is the first deep generative model tailored for Metal-Organic Framework (MOF) structure prediction. It utilizes Riemannian flow matching on SE(3) to assemble rigid building blocks (metal nodes and organic linkers), achieving higher accuracy and scalability than atom-based methods on large systems.

Scientific Computing
Grid of complex molecular structures rendered from SELFIES and SMILES strings

Molecular String Renderer: Robust Visualization Tool

A fault-tolerant RDKit wrapper that treats molecular visualization as a software engineering problem. It implements the strategy pattern for SVG generation with automatic raster fallback, supports SELFIES natively for generative AI workflows, and enforces strict type safety for reliable batch processing of millions of molecules in training pipelines.

Generative Modeling
Diagram comparing standard stochastic sampling (gradient blocked) vs the reparameterization trick (gradient flows)

Auto-Encoding Variational Bayes: VAE Paper Summary

Kingma and Welling’s 2013 paper introducing Variational Autoencoders and the reparameterization trick. The trick enables end-to-end gradient-based training of generative models with continuous latent variables: each latent sample is written as a deterministic function of the distribution parameters and an independent noise variable, so gradients can flow through a deterministic path.
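A minimal NumPy sketch of the reparameterization idea, assuming a one-dimensional Gaussian latent and a toy objective f(z) = z² picked only because its expectation has a closed form (E[z²] = μ² + σ²). The function name and the numbers are hypothetical and do not come from the paper.

```python
import numpy as np

def reparam_gradients(mu, sigma, n_samples=200_000, seed=0):
    """Monte Carlo pathwise gradients of E[z**2] for z ~ N(mu, sigma**2)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)   # noise drawn independently of mu and sigma
    z = mu + sigma * eps                   # deterministic function of (mu, sigma, eps)
    df_dz = 2.0 * z                        # derivative of the toy objective f(z) = z**2
    grad_mu = float(np.mean(df_dz))        # chain rule with dz/dmu = 1
    grad_sigma = float(np.mean(df_dz * eps))  # chain rule with dz/dsigma = eps
    return grad_mu, grad_sigma

grad_mu, grad_sigma = reparam_gradients(mu=1.5, sigma=0.7)
# Closed-form gradients for comparison: 2 * mu = 3.0 and 2 * sigma = 1.4
```

Because the randomness lives entirely in `eps`, the sample `z` is differentiable with respect to `mu` and `sigma`, which is what lets an autodiff framework backpropagate through the sampling step.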

Generative Modeling
Flowchart comparing VAE and IWAE computation showing the key difference in where averaging occurs relative to the log operation

Importance Weighted Autoencoders (IWAE) for Tighter Bounds

Burda et al.’s ICLR 2016 paper introducing Importance Weighted Autoencoders, which use importance sampling over multiple latent samples to derive a log-likelihood lower bound that is at least as tight as the standard VAE bound and tightens as the number of samples grows, addressing posterior collapse and improving generative quality. The model architecture is the same as a VAE; only the training objective changes.
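To make the bound concrete, here is a small sketch that estimates the k-sample importance-weighted bound, E[log (1/k) Σ wᵢ] with wᵢ = p(x, zᵢ) / q(zᵢ | x), on a toy model where log p(x) is known exactly. The model (p(z) = N(0, 1), p(x | z) = N(z, 1), so p(x) = N(0, 2)), the deliberately poor proposal q(z | x) = p(z), and all function names are assumptions made for illustration.

```python
import numpy as np

def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def iwae_bound(x, k, n_outer=20_000, seed=0):
    """Monte Carlo estimate of the k-sample importance-weighted bound."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_outer, k))   # z ~ q(z|x), here equal to the prior
    # log w = log p(x, z) - log q(z|x) reduces to log p(x|z) because q is the prior
    log_w = log_normal_pdf(x, z, 1.0)
    # Numerically stable log of the mean weight within each group of k samples
    m = log_w.max(axis=1, keepdims=True)
    log_mean_w = m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1))
    return float(log_mean_w.mean())

x_obs = 1.0
exact_log_px = float(log_normal_pdf(x_obs, 0.0, 2.0))
bounds = {k: iwae_bound(x_obs, k) for k in (1, 5, 50)}
```

With k = 1 the estimate is the ordinary VAE bound (ELBO) for this toy model. Larger k moves the estimate toward `exact_log_px` from below, even though the model and proposal are unchanged.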

Generative Modeling
MNIST digit samples generated from a Variational Autoencoder latent space

Importance Weighted Autoencoders: Beyond the Standard VAE

Discover how Importance Weighted Autoencoders (IWAEs) use the same architecture as VAEs with a fundamentally more powerful objective to leverage multiple samples effectively.

Computational Chemistry
Benzene in SELFIES notation

Recent Advances in the SELFIES Library: 2023 Update

A 2023 software update paper documenting improvements to the SELFIES Python library (v2.1.1), including a streamlined context-free grammar, expanded support for aromatic systems and stereochemistry, customizable semantic constraints, ML utility functions, and performance benchmarks on 300K+ molecules.

Computational Chemistry
SELFIES molecular representation overview

SELFIES: The Original Paper on Robust Molecular Strings

The 2020 paper that introduced SELFIES: Mario Krenn and colleagues created a molecular representation that solves SMILES validity problems. It guarantees every generated string corresponds to a valid chemical structure.

Computational Chemistry
SELFIES representation of 2-Fluoroethenimine molecule

SELFIES: The 100% Robust Molecular String Representation

An in-depth overview of SELFIES, the 100% robust molecular string representation designed to overcome SMILES limitations in machine learning. Every possible string, even a random one, decodes to a valid molecule. The overview covers the local operations, customizable valence rules, and graph-based internal representations that make this possible.

Computational Chemistry
Potential energy surface showing molecular conformation space with equilibrium and low energy conformations

DenoiseVAE: Adaptive Noise for Molecular Pre-training

ICLR 2025 paper introducing DenoiseVAE, which learns adaptive, atom-specific noise distributions through a VAE framework to improve denoising-based pre-training for molecular force field prediction, outperforming fixed Gaussian noise approaches on quantum chemistry benchmarks.