Optical Chemical Structure Recognition
Diagram showing how Ring-Free Language decouples a molecular graph into skeleton, ring structures, and branch information

RFL: Simplifying Chemical Structure Recognition (AAAI 2025)

Proposes Ring-Free Language (RFL), which hierarchically decouples a molecular graph into its skeleton, rings, and branches to address the difficulty of serializing complex 2D structures into 1D strings. Introduces the Molecular Skeleton Decoder (MSD), which predicts these components progressively and achieves strong results on handwritten and printed chemical structure recognition benchmarks.

Document Processing
Diagram showing page stream segmentation workflow: an input stream of pages is processed through binary classification of page pairs to predict document breaks, producing segmented output documents

LLMs for Page Stream Segmentation

We create TabMe++, an enhanced page stream segmentation benchmark with commercial-grade OCR, and show that parameter-efficiently fine-tuned decoder-based LLMs like Mistral-7B achieve 80% straight-through processing rates, dramatically outperforming encoder-based models.
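As a rough illustration of the pairwise formulation, the sketch below turns per-pair break predictions into segmented documents. It assumes the classifier's output is already available as a list of booleans, one per adjacent page pair; the function name `segment_stream` and this input format are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def segment_stream(pages: Sequence[T], breaks: Sequence[bool]) -> List[List[T]]:
    """Group a page stream into documents.

    breaks[i] is True when a new document is predicted to start
    between pages[i] and pages[i + 1] (assumed input format).
    """
    if len(breaks) != max(len(pages) - 1, 0):
        raise ValueError("expected exactly one break prediction per adjacent page pair")
    if not pages:
        return []

    documents: List[List[T]] = [[pages[0]]]
    for page, is_break in zip(pages[1:], breaks):
        if is_break:
            documents.append([page])    # predicted break: start a new document
        else:
            documents[-1].append(page)  # no break: page continues the current document
    return documents


if __name__ == "__main__":
    stream = ["p1", "p2", "p3", "p4"]
    print(segment_stream(stream, [False, True, False]))
```

Because each decision only looks at one page pair, a single wrong prediction splits or merges documents, which is why stream-level measures such as straight-through processing are stricter than per-pair accuracy.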

Creative Work
Rubik's cube solver interface

Rubik's Cube Player - Drexel Music Hackathon 2017

A project I built with Emmanuel Espino and Jason Zogheb at the 2017 Drexel Music Hackathon. It uses computer vision to read a Rubik’s cube and generates music based on how solved each face is.
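One plausible way to score "how solved" a face is, sketched here under assumptions: the vision step is assumed to yield a 3x3 grid of color labels per face, and the score is the fraction of stickers matching the center sticker. The function name `face_solvedness` and this scoring rule are illustrative, not a record of what the hackathon project did, and the mapping from score to sound is left out because it is not described here.

```python
from typing import List


def face_solvedness(face: List[List[str]]) -> float:
    """Return the fraction of stickers on a 3x3 face that match its center color."""
    if len(face) != 3 or any(len(row) != 3 for row in face):
        raise ValueError("expected a 3x3 grid of color labels")
    center = face[1][1]  # on a 3x3 cube the center sticker defines the face color
    matches = sum(1 for row in face for sticker in row if sticker == center)
    return matches / 9


if __name__ == "__main__":
    mixed_face = [["R", "R", "G"],
                  ["R", "R", "R"],
                  ["B", "R", "R"]]
    print(face_solvedness(mixed_face))
```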