Time Series Forecasting

Deconstructing Neural Networks for Time Series Forecasting

Ablation study examining key architectural components of neural networks for spatiotemporal forecasting, finding that …

Document Processing

LLMs for Page Stream Segmentation

We enhance the TABME benchmark for page stream segmentation, creating TABME++, and show that fine-tuned, decoder-based …

Natural Language Processing

Analytical Model of Word2Vec and GloVe Statistics

Presents an analytical model of statistics learned by Word2Vec and GloVe. This work derives the first known analytical …

Natural Language Processing

EigenNoise: Data-Free Word Vector Initialization

A preliminary investigation into EigenNoise, a simple, data-free initialization scheme for word vectors that can …

Computational Social Science

NewsTweet Dataset: Social Media in Digital Journalism

Describes the creation of NewsTweet, a large-scale dataset and pipeline for studying embedded tweets in online news. …