Algorithms

Log-scale plot showing exponential growth of alkane isomer counts from C1 to C40

The Number of Isomeric Hydrocarbons of the Methane Series

A foundational 1931 paper that derives exact recursive formulas for counting alkane structural isomers, correcting historical errors and establishing the first systematic enumeration up to C₄₀.

Scientific Computing

Comparison of exponential sampling methods showing histograms from both inverse transform and von Neumann methods overlaid with the theoretical exponential distribution

Exponential Random Numbers: Two Classic Algorithms

Explore two fundamental approaches to generating exponentially distributed random numbers: the modern inverse transform method using logarithms and von Neumann’s ingenious 1951 comparison-based algorithm that avoids transcendental functions entirely.

Computational Chemistry

Efficient DFT Hamiltonian Prediction via Adaptive Sparsity

ICML 2025 methodological paper introducing SPHNet, which uses adaptive network sparsification to overcome the computational bottleneck of tensor products in SE(3)-equivariant networks, achieving 2-5x speedup while maintaining accuracy on DFT Hamiltonian prediction tasks.

Computational Chemistry

Protein folding funnel diagram illustrating energy landscape

Umbrella Sampling: Monte Carlo Free-Energy Estimation

Torrie and Valleau’s 1977 methodological breakthrough introducing importance sampling with non-physical distributions to overcome the sampling gap problem in Monte Carlo free-energy calculations, particularly for rare events and phase transitions.

Natural Language Processing

Huffman Tree visualization for the input 'beep boop beer!' showing internal nodes with frequency counts and leaf nodes with characters

High-Performance Word2Vec in Pure PyTorch

A ground-up PyTorch implementation of Word2Vec treating it as a systems engineering challenge, with “tensorized tree” architecture converting pointer-chasing Hierarchical Softmax into dense GPU operations, infinite streaming datasets with Zipfian subsampling, and torch.compile compatibility for production-grade efficiency.

Machine Learning Fundamentals

Comparison of standard 3D CNN versus 3D Steerable CNN for handling rotational symmetry

3D Steerable CNNs: Rotationally Equivariant Features

Weiler et al.’s NeurIPS 2018 paper introducing 3D Steerable CNNs that achieve SE(3) equivariance through group representation theory and spherical harmonic convolution kernels, eliminating the need for rotational data augmentation and improving data efficiency for scientific applications with rotational symmetry like molecular and protein structures.

Scientific Computing

Molecular structure alignment showing protein conformations and RMSD calculation

Kabsch Algorithm: NumPy, PyTorch, TensorFlow, and JAX

Learn to align molecular structures and point clouds using the Kabsch algorithm, with differentiable implementations for modern ML frameworks.

Scientific Computing

Comparison of IQCRNN (Our Method) vs standard Policy Gradient showing training curves, phase portraits, and state trajectories for control tasks

IQCRNN: Certified Stability for Neural Networks

A PyTorch implementation enforcing strict Lyapunov stability guarantees on recurrent neural network controllers through Integral Quadratic Constraints, bridging 1990s robust control theory with modern deep reinforcement learning by solving semidefinite programs inside the gradient descent loop to provide mathematical certificates of safety.

Natural Language Processing

Information Quality Ratio plot showing statistical dependencies decay as window size increases

Analytical Solution to Word2Vec Softmax & Bias Probing

We provide the first analytical solution to Word2Vec’s softmax skip-gram objective, introducing the Independent Frequencies Model and deriving a low-cost, training-free method for measuring semantic bias directly from corpus statistics.

Machine Learning Fundamentals

Vintage slot machine with multiple arms representing the multi-arm bandit problem in machine learning

5 Axes of Multi-Arm Bandit Problems: A Practical Guide

Key dimensions that have helped me understand multi-arm bandit problems: action space, problem structure, external information, reward mechanism, and learner feedback.

Machine Learning Fundamentals

NEAT genome encoding diagram showing node genes and connection genes with innovation numbers

A Guide to Neuroevolution: NEAT and HyperNEAT

Discover how NEAT and HyperNEAT changed neuroevolution by automatically designing neural network architectures and scaling them through geometric patterns.

Machine Learning Fundamentals

Diagram showing the three main types of machine learning: supervised, unsupervised, and reinforcement learning

Breaking Down Machine Learning for the Average Person

Understand the pattern recognition behind Netflix recommendations, email spam filters, and game-playing AI through three core machine learning approaches.