Natural-Language-Processing

LLMs for Page Stream Segmentation

Enhanced TABME benchmark for page stream segmentation, creating TABME++, showing fine-tuned decoder-based LLMs …

LLM applications for insurance document automation using parameter-efficient fine-tuning and analysis of calibration …

Data science project scraping 47,000+ congressional bills, analyzing legislative patterns, and building ML models …

Analytical model of Word2Vec and GloVe statistics. First analytical solution to Word2Vec's softmax skip-gram with bias …

Investigation into EigenNoise, a data-free initialization scheme for word vectors that approaches pre-trained model …

Undergraduate thesis exploring representation learning for social media text and developing tools for cross-platform …

Investigation into whether universal adversarial triggers can control both topic and stance of GPT-2's generated text …

Explores a data-driven approach to construct a WordNet-like semantic network using the entirety of the noisy, …

Learn about knowledge-based agents: how AI systems use knowledge bases, reasoning, and inference to build intelligent …

Analysis of QuAC's conversational QA through student-teacher interactions, featuring 100K+ context-dependent questions …

Analysis of CoQA, a conversational QA dataset with multi-turn dialogue, coreference resolution, and natural answers for …

Learn about word embeddings in NLP: from basic one-hot encoding to contextual models like ELMo. Guide with examples.