Time Series Forecasting
Forecasting comparison of different neural architectures on the Multiscale Lorenz-96 system

Optimizing Sequence Models for Dynamical Systems

Ablation study deconstructing sequence models. Attention-augmented Recurrent Highway Networks outperform Transformers on …

Computational Chemistry

String Representations for Chemical Image Recognition

Ablation study comparing SMILES, DeepSMILES, SELFIES, and InChI for OCSR. SMILES achieves highest accuracy; SELFIES …

Computational Chemistry

Imago: Structure Recognition at TREC-CHEM 2011

Open-source C++ toolkit for extracting 2D chemical structures from scientific literature using heuristic image …

Computational Chemistry
Optical chemical structure recognition example

MolRec: Chemical Structure Recognition at CLEF 2012

MolRec achieves 95%+ accuracy on simple structures but struggles with complex diagrams, revealing rule-based OCSR limits …

Computational Chemistry
Optical chemical structure recognition example

MolRec: Rule-Based OCSR System

Rule-based system for optical chemical structure recognition using vectorization and geometric analysis, achieving 95% …

Scientific Computing
Velocity Autocorrelation Function showing the signature negative region characteristic of liquid dynamics and the cage effect discovered by Rahman

Modernizing Rahman''s 1964 Argon Simulation

A high-fidelity replication of foundational molecular dynamics using modern software engineering practices: caching, …

Computational Chemistry
Comparison chart showing k-NN significantly outperforming logistic regression for molecular classification across different alkane sizes

Can You Hear the Shape of a Molecule? (Part Three)

Supervised learning reveals hidden eigenvalue patterns that clustering missed, testing k-NN and logistic regression on …

Computational Chemistry
Charts showing Dunn Index, distance metrics, and computation time analysis revealing clustering performance degradation with molecular size

Can You Hear the Shape of a Molecule? (Part Two)

Clustering analysis reveals why Coulomb matrix eigenvalues struggle with larger alkanes, using Dunn Index and silhouette …

Natural Language Processing
Word vector illustration showing text classification and NLP concepts

Sarcasm Detection with Transformers: A Cautionary Tale

Learn how dataset bias can lead to misleading results in NLP: a sarcasm detection model that actually learned to …

Computational Social Science
Top features for Armed Forces and National Security policy classification showing veterans, defense, military keywords

Classifying Congressional Bills with Machine Learning

Testing ML classification of congressional bills by policy area. Comparing Naive Bayes, Logistic Regression, and XGBoost …

Computational Social Science
Top features for Economics and Public Finance policy classification across Congresses

How Does Congress Actually Work? Data from 15K Bills

What happens to bills in Congress? Analyzing 15K+ bills from the 117th Congress to understand legislative patterns, …

Natural Language Processing
Information Quality Ratio plot showing statistical dependencies decay as window size increases

Analytical Solution to Word2Vec Softmax & Bias Probing

Analytical derivation of Word2Vec's softmax objective factorization and a new framework for detecting semantic bias in …