
GutenOCR: A Grounded Vision-Language Front-End for Documents
GutenOCR introduces vision-language models for grounded OCR, offering precise text transcription and geometric grounding …

A large-scale dataset of 209K+ articles with OCR and layout bounding boxes, enabling layout-aware modeling and document …

Enhanced TabMe benchmark for page stream segmentation, creating TabMe++, showing fine-tuned decoder-based LLMs …

LLM applications for insurance document automation using parameter-efficient fine-tuning and analysis of calibration …