Document Processing
Diagram showing page stream segmentation workflow: an input stream of pages is processed through binary classification of page pairs to predict document breaks, producing segmented output documents

LLMs for Page Stream Segmentation

We create TabMe++, an enhanced page stream segmentation benchmark with commercial-grade OCR, and show that decoder-based LLMs such as Mistral-7B, fine-tuned with parameter-efficient methods, achieve 80% straight-through processing rates, substantially outperforming encoder-based models.
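
The segmentation step can be sketched as follows. This is a minimal illustration, not the code behind the benchmark: it assumes a classifier has already produced one break / no-break prediction per adjacent page pair, and the function name `segment_stream` and the page labels are hypothetical.

```python
from typing import List, Sequence


def segment_stream(pages: Sequence[str], breaks: Sequence[bool]) -> List[List[str]]:
    """Group a page stream into documents.

    breaks[i] is True when a document break is predicted between
    pages[i] and pages[i + 1], so len(breaks) == len(pages) - 1.
    (Hypothetical helper for illustration only.)
    """
    if not pages:
        return []
    if len(breaks) != len(pages) - 1:
        raise ValueError("expected one prediction per adjacent page pair")

    documents = [[pages[0]]]
    for page, is_break in zip(pages[1:], breaks):
        if is_break:
            documents.append([page])    # predicted break: start a new document
        else:
            documents[-1].append(page)  # no break: extend the current document
    return documents


if __name__ == "__main__":
    stream = ["p1", "p2", "p3", "p4", "p5"]
    predicted = [False, True, False, True]
    print(segment_stream(stream, predicted))
```

With these example predictions, the five-page stream is split into three documents: two pages, two pages, and one page.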

Creative Work
Rubik's cube solver interface

Rubik's Cube Player - Drexel Music Hackathon 2017

A project I built with Emmanuel Espino and Jason Zogheb at the 2017 Drexel Music Hackathon. It uses computer vision to read a Rubik’s cube and generates music based on how solved each face is.
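
One plausible way to quantify "how solved" a face is, offered only as a sketch: the fraction of stickers that match the face's center color. The function `face_solvedness` and the single-letter color codes are assumptions for illustration; the scoring and the mapping from score to sound used in the actual project are not described here.

```python
from typing import Sequence


def face_solvedness(face: Sequence[Sequence[str]]) -> float:
    """Return the fraction of stickers matching the center sticker.

    `face` is a 3x3 grid of color labels (hypothetical representation).
    A fully solved face scores 1.0.
    """
    center = face[1][1]
    stickers = [color for row in face for color in row]
    return sum(color == center for color in stickers) / len(stickers)


if __name__ == "__main__":
    mixed_face = [
        ["R", "R", "G"],
        ["R", "R", "B"],
        ["W", "R", "R"],
    ]
    print(face_solvedness(mixed_face))  # 6 of 9 stickers match the center
```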