Abstract
Page Stream Segmentation (PSS) is critical for automating document processing in industries like insurance, where unstructured document collections are common. This paper explores the use of large language models (LLMs) for PSS, applying parameter-efficient fine-tuning to real-world insurance data. Our experiments show that fine-tuned LLMs outperform baseline models in segmentation accuracy. However, stream-level calibration remains a significant challenge. We evaluate post-hoc calibration and Monte Carlo dropout and find that both offer only limited improvement, which highlights the need for further work on calibration in high-stakes applications.
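To make the task concrete, PSS is often framed as a per-page binary decision: does this page start a new document? The sketch below turns such decisions into document boundaries. This framing is an assumption for illustration (the abstract does not spell out the formulation), and the function name `breaks_to_segments` is hypothetical.

```python
from typing import List, Tuple


def breaks_to_segments(starts_new_doc: List[bool]) -> List[Tuple[int, int]]:
    """Convert per-page 'starts a new document' flags into page ranges.

    Returns a list of (first_page, last_page) index pairs, inclusive.
    The first page always opens a document, whatever its flag says.
    """
    segments: List[Tuple[int, int]] = []
    if not starts_new_doc:
        return segments

    start = 0
    for i in range(1, len(starts_new_doc)):
        if starts_new_doc[i]:
            # Page i opens a new document, so the previous one ends at i - 1.
            segments.append((start, i - 1))
            start = i
    # Close the final document at the last page of the stream.
    segments.append((start, len(starts_new_doc) - 1))
    return segments


# A five-page stream holding two documents: pages 0-1 and pages 2-4.
example = breaks_to_segments([True, False, True, False, False])
```

Under this framing, a single wrong page-level decision changes the boundaries of at least one document, which is why stream-level reliability matters in addition to page-level accuracy.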
Key Contributions
- Real-World Evaluation: Applied and evaluated LLMs on a complex, proprietary insurance dataset, demonstrating their superior performance over traditional baselines in a practical setting.
- Parameter-Efficient Fine-Tuning: Used parameter-efficient fine-tuning (PEFT) to adapt LLMs to the specialized task of page stream segmentation.
- Calibration Assessment: Investigated model calibration and found that LLMs, despite their accuracy, exhibit overconfidence that is not easily corrected by standard techniques.
- Stream-Level Confidence Metric: Introduced and analyzed a stream-level confidence measure to help distinguish between streams that can be automated and those requiring human review.
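The calibration and stream-level confidence ideas above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's definition: it assumes the model emits a per-page probability that the page starts a new document, treats page decisions as independent, and takes stream confidence to be the product of per-page confidences. The function names and the expected calibration error (ECE) binning scheme are likewise illustrative choices.

```python
from typing import List, Sequence


def stream_confidence(break_probs: Sequence[float]) -> float:
    """Confidence that every page-level decision in a stream is correct.

    Assumption: page decisions are independent, so the stream confidence
    is the product of per-page confidences max(p, 1 - p).
    """
    conf = 1.0
    for p in break_probs:
        conf *= max(p, 1.0 - p)
    return conf


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE: weighted mean of |confidence - accuracy| per bin."""
    n = len(confidences)
    if n == 0:
        return 0.0

    bins: List[List[tuple]] = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        bins[idx].append((c, ok))

    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / n) * abs(avg_conf - accuracy)
    return ece


# Three pages with break probabilities 0.9, 0.1, 0.8:
# per-page confidences are 0.9, 0.9, 0.8.
conf = stream_confidence([0.9, 0.1, 0.8])
```

With a score like this, streams above a chosen threshold could be routed to automation and the rest to human review. That routing is only trustworthy if the score is well calibrated at the stream level, which is the gap the contributions above describe.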
Impact
This work demonstrates both the promise and the current limitations of using LLMs in high-stakes industrial applications. While LLMs can significantly improve segmentation accuracy over traditional methods, our findings show that accuracy metrics alone are not sufficient. For sectors like insurance, addressing model overconfidence and developing robust calibration methods are essential for moving from research to responsible, reliable automation.
Citation
@inproceedings{heidenreich2025page,
  title={Page Stream Segmentation with LLMs: Challenges and Applications in Insurance Document Automation},
  author={Heidenreich, Hunter and Dalvi, Ratish and Verma, Nikhil and Getachew, Yosheb},
  booktitle={Proceedings of the 31st International Conference on Computational Linguistics: Industry Track},
  pages={305--317},
  year={2025}
}