Abstract
We present NewsTweet, a dataset and analysis pipeline for studying how social media content is embedded in digital journalism. Our analysis finds that 13% of news stories include embedded tweets, a measurement that offers new insight into how social media content influences newsworthiness and journalistic practice.
Key Contributions
- Large-Scale Dataset: Comprehensive collection of news articles with embedded social media content
- Quantitative Analysis: First systematic measurement of social media embedding in online journalism
- Newsworthiness Patterns: Identification of factors that make social media content newsworthy
- Methodological Framework: Reproducible pipeline for analyzing social media in journalism
Dataset Characteristics
Scale and Coverage
- News Sources: Multiple major news outlets across different domains
- Time Period: Extended temporal coverage for trend analysis
- Content Types: Various forms of embedded social media content
- Metadata: Rich annotation including publication timing, article categories, and engagement metrics
Technical Implementation
- Automated collection pipeline with content validation
- Multi-modal analysis combining text and social media metadata
- Temporal tracking of embedding patterns
- Cross-platform content analysis
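As a rough illustration of the collection step, the sketch below detects embedded tweets in an article's HTML. It is not the NewsTweet pipeline itself. It assumes the article uses Twitter's standard embed markup, a `blockquote` element with class `twitter-tweet` that contains a permalink to the tweet. The names `TweetEmbedParser` and `find_embedded_tweets` are hypothetical and chosen only for this example.

```python
import re
from html.parser import HTMLParser

# Tweet permalinks look like https://twitter.com/<user>/status/<id>
STATUS_URL = re.compile(
    r"https?://(?:www\.|mobile\.)?twitter\.com/(\w+)/status(?:es)?/(\d+)"
)


class TweetEmbedParser(HTMLParser):
    """Collects (screen_name, tweet_id) pairs from twitter-tweet blockquotes."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside a twitter-tweet blockquote
        self.embeds = []  # permalinks found inside embed markup

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "blockquote":
            classes = (attrs.get("class") or "").split()
            # Enter an embed, or track a blockquote nested inside one
            if self.depth or "twitter-tweet" in classes:
                self.depth += 1
        elif tag == "a" and self.depth:
            match = STATUS_URL.match(attrs.get("href") or "")
            if match:
                self.embeds.append((match.group(1), match.group(2)))

    def handle_endtag(self, tag):
        if tag == "blockquote" and self.depth:
            self.depth -= 1


def find_embedded_tweets(html):
    """Return unique (screen_name, tweet_id) pairs in document order."""
    parser = TweetEmbedParser()
    parser.feed(html)
    parser.close()
    return list(dict.fromkeys(parser.embeds))
```

Links to tweets that appear outside embed markup are deliberately ignored, so an ordinary hyperlink to a tweet does not count as an embed. Articles that load tweets through scripts or iframes without the `blockquote` fallback would need a different detection rule.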
Key Findings
Embedding Prevalence
- 13% of news stories contain embedded tweets
- Significant variation across news categories and outlets
- Temporal patterns in embedding frequency
- Correlation with breaking news events
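A prevalence figure of this kind is a simple ratio: the number of articles with at least one embed divided by the total number of articles, optionally broken down by category. The sketch below shows one way to compute it. The record fields `category` and `n_embeds` and the function name `embedding_prevalence` are assumptions made for illustration and do not describe the dataset's actual schema.

```python
from collections import defaultdict


def embedding_prevalence(articles):
    """Return (overall_share, share_by_category) of articles with >= 1 embed."""
    totals = defaultdict(int)      # articles seen per category
    with_embed = defaultdict(int)  # articles with at least one embed
    for article in articles:
        category = article["category"]
        totals[category] += 1
        if article["n_embeds"] > 0:
            with_embed[category] += 1

    by_category = {c: with_embed[c] / totals[c] for c in totals}
    n_articles = sum(totals.values())
    overall = sum(with_embed.values()) / n_articles if n_articles else 0.0
    return overall, by_category


# Toy records for illustration only; these are not values from the dataset.
toy_articles = [
    {"category": "politics", "n_embeds": 2},
    {"category": "politics", "n_embeds": 0},
    {"category": "sports", "n_embeds": 0},
    {"category": "sports", "n_embeds": 0},
]
overall, by_category = embedding_prevalence(toy_articles)
```

Counting articles rather than individual embeds keeps the measure comparable across outlets that differ in how many tweets they place in a single story.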
Newsworthiness Factors
- Analysis of what makes social media content newsworthy
- Relationship between social engagement and news inclusion
- Role of verified accounts vs. general public voices
- Impact of trending topics on embedding patterns
Methodological Innovation
Our approach combines computational journalism techniques with social media analysis, providing a framework for understanding the intersection of traditional and digital media. The methodology is designed to be reproducible and extensible to other media contexts.
Societal Impact
This research contributes to understanding how social media shapes news narratives and public discourse, with implications for media literacy, journalism ethics, and democratic participation in the digital age.
Applications
- Journalism Research: Understanding evolving news practices in digital environments
- Media Studies: Analyzing the relationship between traditional and social media
- Computational Social Science: Studying information flow and influence patterns
- Platform Studies: Examining how social media platforms influence news content
Citation
@article{mujib2020newstweet,
title={NewsTweet: a dataset of social media embedding in online journalism},
author={Mujib, Munif Ishad and Heidenreich, Hunter Scott and Murphy, Colin J and Santia, Giovanni C and Zelenkauskaite, Asta and Williams, Jake Ryland},
journal={arXiv preprint arXiv:2008.02870},
year={2020}
}