Abstract

We conduct a comprehensive ablation study examining the key architectural components of neural networks for spatiotemporal forecasting of dynamical systems. Our analysis reveals that neural gating and attention mechanisms improve RNN performance, while adding recurrence is detrimental to transformer performance in this setting.
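
As a rough illustration of what such an ablation looks like in code, here is a minimal sketch, assuming PyTorch; the class and all names in it are illustrative assumptions, not taken from the paper's released implementation. It shows a recurrent cell whose gate can be switched off, so the gate's contribution can be measured in isolation:

import torch
import torch.nn as nn

class AblatableRNNCell(nn.Module):
    """Hypothetical recurrent cell whose gate can be switched off.

    With use_gate=True the update is GRU-style; with use_gate=False it
    reduces to a vanilla (ungated) RNN, so the gate's contribution can
    be measured in isolation on the same forecasting task.
    """

    def __init__(self, input_dim, hidden_dim, use_gate=True):
        super().__init__()
        self.use_gate = use_gate
        # Candidate next state, shared by both variants.
        self.candidate = nn.Linear(input_dim + hidden_dim, hidden_dim)
        if use_gate:
            # Update gate controlling how much of the old state survives.
            self.gate = nn.Linear(input_dim + hidden_dim, hidden_dim)

    def forward(self, x, h):
        xh = torch.cat([x, h], dim=-1)
        h_tilde = torch.tanh(self.candidate(xh))   # candidate state
        if not self.use_gate:
            return h_tilde                          # vanilla RNN update
        z = torch.sigmoid(self.gate(xh))            # gate values in (0, 1)
        return (1.0 - z) * h + z * h_tilde          # gated blend of old and new

Training both variants on the same trajectories and comparing long-horizon forecast errors is the basic pattern behind each ablation.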

Key Contributions

  • Comprehensive Ablation Study: Systematic deconstruction of key neural network components
  • Architecture Insights: Novel findings about recurrence, attention, and gating in forecasting
  • Dynamical Systems Focus: Specialized analysis for spatiotemporal prediction tasks
  • Transferability Analysis: Investigation of component effectiveness across architectures

Key Findings

  • Gating mechanisms significantly improve RNN performance on dynamical systems
  • Attention provides benefits for both RNNs and transformers (a minimal sketch follows this list)
  • Recurrence is surprisingly detrimental to transformer performance
  • Component transferability varies significantly between architectures
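
To make the attention finding concrete, here is a minimal sketch, again assuming PyTorch; the function name and tensor shapes are illustrative assumptions, not the paper's implementation. It computes scaled dot-product attention of the current hidden state over a stored window of past states:

import math
import torch

def attend_over_history(query, history):
    """Scaled dot-product attention of the current state over past states.

    query:   (batch, hidden_dim)        current hidden state
    history: (batch, steps, hidden_dim) stored past hidden states
    returns: (batch, hidden_dim)        context vector summarizing history
    """
    d = query.shape[-1]
    # Similarity of the current state to each past state.
    scores = torch.einsum("bd,btd->bt", query, history) / math.sqrt(d)
    weights = torch.softmax(scores, dim=-1)  # attention weights over time
    return torch.einsum("bt,btd->bd", weights, history)

In an RNN, such a context vector is typically concatenated with the hidden state before the output projection; in a transformer, the same operation is the core of every layer rather than an add-on.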

Significance

This work provides crucial insights for designing neural architectures specifically for dynamical systems forecasting, challenging conventional wisdom about component effectiveness and transferability.

Citation

@article{heidenreich2024deconstructing,
  title={Deconstructing recurrence, attention, and gating: Investigating the transferability of transformers and gated recurrent neural networks in forecasting of dynamical systems},
  author={Heidenreich, Hunter S and Vlachas, Pantelis R and Koumoutsakos, Petros},
  journal={arXiv preprint arXiv:2410.02654},
  year={2024}
}