Time Series Forecasting
Comparison of forecasts from different neural architectures on the multiscale Lorenz-96 system

Optimizing Sequence Models for Dynamical Systems

Ablation study deconstructing sequence models. Attention-augmented Recurrent Highway Networks outperform Transformers on …

Machine Learning Fundamentals
Diagram showing distributed representations with three pools of units (AGENT, RELATIONSHIP, PATIENT) connected via role/identity bindings

Distributed Representations

Hinton's 1984 technical report establishing the theoretical efficiency of distributed representations over local …

Machine Learning Fundamentals
Visualization of inverse problem showing one input mapping to multiple valid outputs

Mixture Density Networks

Seminal 1994 paper introducing MDNs to model arbitrary conditional probability distributions using neural networks.

Computational Social Science
Visualization of party-based legislative embeddings

Party Matters: Enhancing Legislative Embeddings

A method for improving legislative vote prediction across sessions by augmenting bill text embeddings with sponsor …

Computational Social Science
Hierarchical Ideal Point Topic Model visualization showing political polarization

Tea Party in the House

A hierarchical probabilistic model combining roll call votes, bill text, and legislative speeches to analyze political …

Generative Modeling
Diagram comparing standard stochastic sampling (gradient blocked) vs the reparameterization trick (gradient flows)

Auto-Encoding Variational Bayes (VAE Paper Summary)

Summary of Kingma & Welling's foundational VAE paper introducing the reparameterization trick and variational …

Generative Modeling
MNIST digit samples generated from a Variational Autoencoder latent space

Importance Weighted Autoencoders: Beyond the Standard VAE

The key difference between multi-sample VAEs and IWAEs: taking the log of an average, not the average of logs, gives a tighter bound on the log-likelihood.

Generative Modeling
Flowchart comparing VAE and IWAE computation showing the key difference in where averaging occurs relative to the log operation

IWAE: Importance Weighted Autoencoders

Summary of Burda, Grosse & Salakhutdinov's ICLR 2016 paper introducing Importance Weighted Autoencoders for tighter …

Computational Chemistry
Chemical structure from journal publication

GTR-CoT: Graph Traversal Chain-of-Thought for Molecules

GTR-CoT uses graph traversal chain-of-thought reasoning to improve optical chemical structure recognition accuracy.

Computational Chemistry
Markush structure diagram

SubGrapher: Visual Fingerprinting of Chemical Structures

Novel OCSR method creating molecular fingerprints from images through functional group segmentation for database …

Computational Chemistry
Chemical structure diagram for optical recognition

αExtractor: Chemical Info from Biomedical Literature

αExtractor uses a ResNet-Transformer to extract chemical structures from literature images, including noisy and hand-drawn …

Computational Chemistry
Optical chemical structure recognition example

Img2Mol: Accurate SMILES from Molecular Depictions

Two-stage CNN approach for converting molecular images to SMILES using CDDD embeddings and extensive data augmentation.