Time Series Forecasting

Deconstructing Neural Networks for Time Series Forecasting

Ablation study of neural network components for forecasting, finding that gating and attention improve RNNs while recurrence …

Document Processing

LLMs for Page Stream Segmentation

TABME++, an enhanced version of the TABME benchmark for page stream segmentation, showing fine-tuned decoder-based LLMs …

Natural Language Processing

Analytical Model of Word2Vec and GloVe Statistics

First analytical solution to Word2Vec's softmax skip-gram with bias …

Natural Language Processing

EigenNoise: Data-Free Word Vector Initialization

Investigation into EigenNoise, a data-free initialization scheme for word vectors that approaches pre-trained model …

Computational Social Science

NewsTweet Dataset: Social Media in Digital Journalism

NewsTweet dataset and pipeline for studying embedded tweets in online news. Analysis shows 13% of stories contain …

Computational Social Science

Coordinated Social Targeting on Twitter

Investigation into follower dynamics on high-profile Twitter accounts, documenting sub-second spikes, saw-tooth …