LLMs for Page Stream Segmentation
Enhanced TABME benchmark for page stream segmentation, creating TABME++, showing fine-tuned decoder-based LLMs …
LLM applications for insurance document automation using parameter-efficient fine-tuning and analysis of calibration …

Data science project scraping 47,000+ congressional bills, analyzing legislative patterns, and building ML models …
Analytical model of Word2Vec and GloVe statistics. First analytical solution to Word2Vec's softmax skip-gram with bias …
Investigation into EigenNoise, a data-free initialization scheme for word vectors that approaches pre-trained model …

Undergraduate thesis exploring representation learning for social media text and developing tools for cross-platform …
Investigation into whether universal adversarial triggers can control both topic and stance of GPT-2's generated text …
Explores a data-driven approach to construct a WordNet-like semantic network using the entirety of the noisy, …
Learn about knowledge-based agents: how AI systems use knowledge bases, reasoning, and inference to build intelligent …

Analysis of QuAC's conversational QA through student-teacher interactions, featuring 100K+ context-dependent questions …
Analysis of CoQA, a conversational QA dataset with multi-turn dialogue, coreference resolution, and natural answers for …

Learn about word embeddings in NLP: from basic one-hot encoding to contextual models like ELMo. Guide with examples.