
GutenOCR: A Grounded Vision-Language Front-End for Documents
GutenOCR is a family of vision-language models designed to serve as a ‘grounded OCR front-end’, providing high-quality text transcription and explicit geometric grounding.

PubMed-OCR: PMC Open Access OCR Annotations
PubMed-OCR provides 1.5M pages of scientific articles with comprehensive OCR annotations and bounding boxes to support layout-aware modeling and document analysis.

Müller-Brown Potential: A PyTorch ML Testbed
A high-performance, GPU-accelerated PyTorch testbed for ML-MD algorithms. It features JIT-compiled analytical Jacobian force kernels that achieve a 3-10x speedup over autograd, robust Langevin dynamics with Velocity-Verlet integration, and a modular architecture designed to provide ground-truth validation for novel machine learning approaches in molecular dynamics.

Modernizing Rahman's 1964 Argon Simulation
A digital restoration of Rahman’s seminal 1964 molecular dynamics paper using LAMMPS and a production-grade Python analysis pipeline. The pipeline features decorator-based caching, fully vectorized NumPy computations for O(N^2) operations, and modern tooling (uv, type hints, Makefile automation), turning academic scripts into a reproducible research toolkit.

LLMs for Insurance Document Automation
We explore LLM applications for page stream segmentation in insurance document processing, demonstrating that parameter-efficient fine-tuning achieves strong accuracy but revealing significant calibration challenges that limit deployment confidence.

Optimizing Sequence Models for Dynamical Systems
We systematically ablate core mechanisms of Transformers and RNNs, finding that attention-augmented Recurrent Highway Networks outperform standard Transformers on forecasting high-dimensional chaotic systems.

LLMs for Page Stream Segmentation
We create TabMe++, an enhanced page stream segmentation benchmark with commercial-grade OCR, and show that parameter-efficiently fine-tuned decoder-based LLMs like Mistral-7B achieve 80% straight-through processing rates, dramatically outperforming encoder-based models.

Mini-Protein Trajectory Generation
An automated GROMACS pipeline for generating high-fidelity molecular dynamics datasets suitable for machine learning, simulating capped dipeptides across nine residue types with 0.1 ps resolution and atomic force extraction optimized for training Neural Network Potentials.

PyConversations: Social Media Conversational Analysis
A research project investigating how different NLP models perform on social media data, which found that domain-specific approaches often outperform large pre-trained models. It includes PyConversations, a Python module for analyzing conversations across social media platforms.

GPT-2 Susceptibility to Universal Adversarial Triggers
We demonstrate that universal adversarial triggers can control both the topic and stance of GPT-2’s generated text, revealing security vulnerabilities in deployed language models and proposing constructive applications for bias auditing.

Molecular Sets (MOSES): A Generative Modeling Benchmark
MOSES introduces a comprehensive benchmarking platform for molecular generative models, offering standardized datasets, evaluation metrics, and baselines.

The Reliability Trap: The Limits of 99% Accuracy
We explore the ‘Silent Failure’ mode of LLMs in production: the limits of 99% accuracy for reliability, how confidence decays in long documents, and why standard calibration techniques struggle to fix it.