
LLMs for Page Stream Segmentation
Enhanced TabMe benchmark for page stream segmentation, creating TabMe++, showing fine-tuned decoder-based LLMs …

MolParser-7M is a 7.7M-pair dataset for molecule-to-text with realistic images paired with SMILES, InChI, and IUPAC.

LLM applications for insurance document automation using parameter-efficient fine-tuning and analysis of calibration …