Introduction

Understanding how Congress works requires looking at data, not just following news coverage. I scraped data from Congress.gov to analyze what actually happens to the thousands of bills introduced each session.

This analysis focuses on the 117th Congress (2021-2023), examining 15,000+ bills to understand basic patterns: Which bills get introduced? How many receive votes? What factors influence success?

The longer-term goals include geographic analysis and predictive modeling. This post covers the foundation: the data collection process and an initial exploratory analysis.

Data Collection

My primary source is Congress.gov, maintained by the Library of Congress. I focused on the 117th Congress (2021-2023), collecting data on resolutions and joint resolutions while omitting amendments and concurrent resolutions.

Data collected:

| Bill Type | Introduced |
|---|---:|
| House Resolution | 9,698 |
| House Joint Resolution | 106 |
| Senate Resolution | 5,357 |
| Senate Joint Resolution | 70 |
| Total | 15,231 |

Technical Implementation

The web crawler used standard Python libraries to handle Congress.gov’s structure. The site loads content dynamically, requiring both static and dynamic scraping approaches.

Implementation details:

I added 5-second delays between requests to avoid overloading the server, resulting in a 3-day collection period. The crawler and processed data are available on GitHub.

For each bill, I queried two pages:

  • All info page: https://www.congress.gov/bill/117th-congress/{bill_type}/{bill_id}/all-info
  • Text page: https://www.congress.gov/bill/117th-congress/{bill_type}/{bill_id}/text?format=txt

The parsing process involved targeting specific HTML elements and implementing basic caching to avoid redundant requests.
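
The request pattern described above (two pages per bill, a fixed delay, and a cache) can be sketched roughly as follows. This is a minimal illustration, not the crawler published on GitHub: the function names, the on-disk cache layout, and the `house-bill` slug used in the usage note are assumptions, and it covers only the static fetch, not the dynamically loaded content.

```python
import hashlib
import time
from pathlib import Path
from urllib.request import Request, urlopen

BASE = "https://www.congress.gov/bill/117th-congress"


def build_urls(bill_type: str, bill_id: int) -> dict[str, str]:
    """Return the two page URLs queried for each bill."""
    root = f"{BASE}/{bill_type}/{bill_id}"
    return {"all_info": f"{root}/all-info", "text": f"{root}/text?format=txt"}


def http_get(url: str) -> str:
    """Plain static fetch (hypothetical helper, standard library only)."""
    req = Request(url, headers={"User-Agent": "bill-crawler-sketch"})
    with urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_cached(url: str, cache_dir, fetch=http_get, delay: float = 5.0) -> str:
    """Return the page body, reading from disk if it was fetched before."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Assumed cache layout: one file per URL, named by the URL's hash.
    path = cache_dir / (hashlib.sha256(url.encode()).hexdigest() + ".html")
    if path.exists():
        return path.read_text(encoding="utf-8")
    body = fetch(url)
    path.write_text(body, encoding="utf-8")
    # Pause only after a real request, so cache hits stay fast.
    time.sleep(delay)
    return body
```

A call such as `fetch_cached(build_urls("house-bill", 1)["all_info"], "cache")` would fetch the page once and reuse the saved copy afterward. At two requests per bill, 15,231 bills, and a 5-second pause, the delays alone come to roughly 42 hours, which is consistent with a multi-day crawl.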

Key Findings

The analysis reveals clear patterns in congressional activity. Most bills never receive a vote, and success rates differ by party and policy area.

Legislative Outcomes

The fundamental question: what happens to bills after introduction?

Each bill has a tracker status indicating its position in the legislative process. The eight possible statuses can be grouped into three meaningful categories:

  • Introduced: Bills introduced but never voted on
  • Stalled: Bills that saw votes but didn’t become law (since the 117th Congress ended, these effectively died)
  • Law: Bills signed by the President
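
| Bill Type | Introduced | Stalled | Law |
|---|---:|---:|---:|
| House Resolution | 8,977 | 523 | 198 |
| House Joint Resolution | 102 | 1 | 3 |
| Senate Resolution | 5,083 | 114 | 160 |
| Senate Joint Resolution | 57 | 9 | 4 |
| Total | 14,219 | 647 | 365 |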
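
The grouping of tracker statuses into three categories can be sketched as a small lookup. The status strings below ("Introduced", "Passed House", "Became Law", and so on) are assumed labels for illustration, since the eight statuses are not listed here. The rule that every status other than the first and last counts as Stalled is likewise an assumption drawn from the category definitions above.

```python
from collections import Counter


def categorize(status: str) -> str:
    """Map a tracker status to Introduced, Stalled, or Law."""
    label = status.strip().lower()
    if label == "introduced":
        return "Introduced"
    if label == "became law":
        return "Law"
    # Any other status means the bill moved but did not become law.
    return "Stalled"


# Hypothetical sample of tracker statuses, one per bill.
statuses = ["Introduced", "Passed House", "Became Law", "Introduced", "Passed Senate"]
counts = Counter(categorize(s) for s in statuses)
print(dict(counts))
```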

Key insights:

  • Only 7% of introduced bills ever receive a vote
  • Of bills that receive votes, 36% become law
  • Overall, just 2% of introduced bills become law
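
These percentages follow from the totals in the table above, treating Stalled plus Law as the bills that received a vote:

$$ \text{share receiving a vote} = \frac{647 + 365}{15{,}231} = \frac{1{,}012}{15{,}231} \approx 6.6\% $$

$$ \text{share of voted bills enacted} = \frac{365}{1{,}012} \approx 36.1\% $$

$$ \text{share of all bills enacted} = \frac{365}{15{,}231} \approx 2.4\% $$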

Each bill has a sponsor, the primary member who introduces it. Grouping bills by sponsor reveals party and geographic patterns.

Party Breakdown

| Party | Introduced | Stalled | Law |
|---|---:|---:|---:|
| Democrat | 8,271 | 437 | 235 |
| Republican | 5,883 | 210 | 130 |
| Independent | 65 | 0 | 0 |

Party comparison:

  • Democrats: 7.5% of bills moved beyond introduction; 2.6% became law
  • Republicans: 5.5% of bills moved beyond introduction; 2.1% became law
  • When bills do advance, Republicans have a slightly higher success rate (38% vs 35%)
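
The party percentages can be reproduced from the breakdown table with a few lines of Python. This is a sketch for checking the arithmetic. The dictionary layout and function name are illustrative choices, and Independents are left out because none of their bills advanced, which would make the last ratio undefined.

```python
counts = {
    "Democrat": {"introduced": 8271, "stalled": 437, "law": 235},
    "Republican": {"introduced": 5883, "stalled": 210, "law": 130},
}


def party_rates(c: dict[str, int]) -> dict[str, float]:
    """Compute advancement and enactment rates for one party."""
    total = c["introduced"] + c["stalled"] + c["law"]
    advanced = c["stalled"] + c["law"]  # bills that moved beyond introduction
    return {
        "advanced": advanced / total,
        "law": c["law"] / total,
        "law_given_advanced": c["law"] / advanced,
    }


for party, c in counts.items():
    r = party_rates(c)
    print(
        f"{party}: {r['advanced']:.1%} advanced, {r['law']:.1%} became law, "
        f"{r['law_given_advanced']:.0%} of advanced bills enacted"
    )
# Democrat: 7.5% advanced, 2.6% became law, 35% of advanced bills enacted
# Republican: 5.5% advanced, 2.1% became law, 38% of advanced bills enacted
```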

Geographic Distribution

Top 10 states by number of bills in each outcome category:

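| Ranking | State: Introduced | State: Stalled | State: Law |
|---:|---|---|---|
| 1 | CA: 1,350 | CA: 93 | CA: 34 |
| 2 | TX: 879 | NY: 44 | MI: 30 |
| 3 | NY: 784 | TX: 43 | TX: 25 |
| 4 | FL: 766 | MI: 28 | NY: 24 |
| 5 | IL: 660 | NJ: 28 | MN: 17 |
| 6 | PA: 521 | IL: 27 | IL: 16 |
| 7 | NJ: 478 | VA: 26 | OH: 11 |
| 8 | MI: 380 | FL: 24 | VA: 11 |
| 9 | OH: 377 | PA: 22 | FL: 11 |
| 10 | MA: 361 | OH: 19 | GA: 9 |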

Normalizing these counts per representative changes the picture, with small delegations rising to the top:

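| Ranking | State: Introduced | State: Stalled | State: Law |
|---:|---|---|---|
| 1 | DC: 101.0 | DC: 7.0 | AK: 2.2 |
| 2 | NH: 47.5 | AK: 2.8 | NH: 2.0 |
| 3 | MT: 44.0 | IA: 2.3 | MT: 2.0 |
| 4 | OR: 41.0 | SD: 2.3 | MI: 1.9 |
| 5 | NV: 40.0 | NH: 2.2 | MN: 1.5 |
| 6 | DE: 38.7 | VA: 2.0 | HI: 1.5 |
| 7 | SD: 38.3 | NJ: 2.0 | CT: 1.3 |
| 8 | IA: 37.7 | PR: 2.0 | IA: 1.2 |
| 9 | RI: 36.5 | NV: 1.8 | OR: 1.1 |
| 10 | UT: 36.0 | MO: 1.8 | SD: 1.0 |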
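
The normalization itself is a simple division, sketched below. The exact divisor used for the table is not stated, so the delegation sizes here (House plus Senate seats, with DC counted as one delegate) are an assumption, and the function name is illustrative.

```python
def per_member(state_counts: dict[str, int], delegation_size: dict[str, int]) -> dict[str, float]:
    """Divide each state's bill count by the size of its delegation."""
    return {
        state: round(n / delegation_size[state], 1)
        for state, n in state_counts.items()
    }


# Raw "Introduced" counts: CA from the state table, DC from its single delegate.
introduced = {"CA": 1350, "DC": 101}
# Assumed delegation sizes for the 117th Congress (House + Senate seats).
seats = {"CA": 55, "DC": 1}

normalized = per_member(introduced, seats)
ranked = sorted(normalized.items(), key=lambda kv: kv[1], reverse=True)
print(ranked)
```

Under these assumptions California drops from first place to well below the top 10, while DC moves to the top.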

Top Individual Sponsors

Top 10 sponsors by number of bills in each outcome category:

| Ranking | Individual: Introduced | Individual: Stalled | Individual: Law |
|---:|---|---|---|
| 1 | Sen. Rubio (R-FL): 186 | Sen. Peters (D-MI): 11 | Sen. Peters (D-MI): 19 |
| 2 | Sen. Klobuchar (D-MN): 143 | Sen. Cornyn (R-TX): 8 | Sen. Cornyn (R-TX): 15 |
| 3 | Sen. Lee (R-UT): 125 | Rep. Connolly (D-VA-11): 8 | Sen. Klobuchar (D-MN): 7 |
| 4 | Sen. Markey (D-MA): 118 | Rep. Takano (D-CA-41): 8 | Sen. Tester (D-MT): 6 |
| 5 | Sen. Casey (D-PA): 116 | Sen. Grassley (R-IA): 7 | Sen. Rubio (R-FL): 6 |
| 6 | Sen. Cortez Masto (D-NV): 109 | Del. Norton (D-DC): 7 | Rep. DeLauro (D-CT-3): 6 |
| 7 | Sen. Booker (D-NJ): 106 | Rep. Johnson (D-TX-30): 7 | Sen. Grassley (R-IA): 5 |
| 8 | Sen. Durbin (D-IL): 102 | Rep. Katko (R-NY-24): 7 | Sen. Ossoff (D-GA): 4 |
| 9 | Del. Norton (D-DC): 101 | Rep. Dean (D-PA-4): 6 | Sen. Murkowski (R-AK): 4 |
| 10 | Sen. Menendez (D-NJ): 99 | Rep. Wagner (R-MO-2): 6 | Sen. Padilla (D-CA): 4 |

Effectiveness score (laws enacted / total bills):

$$ \text{effectiveness} = \frac{\text{bills that became law}}{\text{total bills introduced}} $$

| Ranking | Individual | Effectiveness Score |
|---:|---|---:|
| 1 | Rep. Pelosi (D-CA-12) | 0.500 |
| 2 | Rep. Mrvan (D-IN-1) | 0.444 |
| 3 | Rep. Yarmuth (D-KY-3) | 0.333 |
| 4 | Rep. Stivers (R-OH-15) | 0.250 |
| 5 | Rep. Graves (R-MO-6) | 0.222 |
| 6 | Rep. Jeffries (D-NY-8) | 0.200 |
| 7 | Rep. Neal (D-MA-1) | 0.200 |
| 8 | Rep. Palazzo (R-MS-4) | 0.200 |
| 9 | Sen. Peters (D-MI) | 0.186 |
| 10 | Rep. Fischbach (R-MN-7) | 0.176 |
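
As a sanity check on the formula, the score for Sen. Peters can be recomputed from his outcome counts. The stalled (11) and law (19) figures come from the sponsor table. The introduced-only figure of 72 is an inferred value chosen so that the result matches the published 0.186, not a number taken from the data. The function name is illustrative.

```python
def effectiveness(introduced: int, stalled: int, law: int) -> float:
    """Share of a sponsor's bills that became law."""
    total = introduced + stalled + law
    return law / total if total else 0.0


# Sen. Peters: 72 is an assumed introduced-only count (see lead-in).
score = effectiveness(introduced=72, stalled=11, law=19)
print(round(score, 3))  # 0.186
```

Because the denominator is the sponsor's own bill count, a hypothetical member with only two bills and one law would score 0.500, so high scores can reflect a small number of bills rather than a large number of laws.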

Policy Focus Areas

Each bill is assigned a primary policy area. Here are the most active areas by legislative outcome:

| Ranking | Policy Area: Introduced | Policy Area: Stalled | Policy Area: Law |
|---:|---|---|---|
| 1 | Health: 1,885 | Government Operations: 79 | Government Operations: 94 |
| 2 | Armed Forces: 1,114 | Armed Forces: 60 | Armed Forces: 69 |
| 3 | Taxation: 1,066 | International Affairs: 60 | Crime & Law Enforcement: 31 |
| 4 | Government Operations: 982 | Health: 56 | Health: 19 |
| 5 | International Affairs: 866 | Crime & Law Enforcement: 44 | Native Americans: 17 |
| 6 | Crime & Law Enforcement: 842 | Public Lands: 44 | International Affairs: 14 |
| 7 | Education: 663 | Science & Technology: 44 | Economics & Finance: 13 |
| 8 | Transportation: 663 | Commerce: 43 | Public Lands: 13 |
| 9 | Public Lands: 548 | Finance: 34 | Commerce: 13 |
| 10 | Finance: 547 | Emergency Management: 27 | Emergency Management: 11 |

Notable patterns: Health leads in introductions, but only about 1% of Health bills became law (19 of 1,960). Government Operations and Armed Forces bills fared much better, at about 8% (94 of 1,155) and about 6% (69 of 1,243) respectively.

Next Steps

This analysis establishes baseline patterns: most bills never receive a vote, success rates differ modestly by party, and certain policy areas perform better than others.

Future work could explore:

  • Committee dynamics and voting patterns
  • Geographic analysis of state-level interests
  • Bill text analysis using NLP techniques
  • Predictive modeling for bill outcomes

Update: I’ve since applied machine learning to this type of data in Congressional Bill Policy Area Classification, using 48K+ bills from three Congresses to automatically categorize bills by policy area.

The complete dataset and code are publicly available to support further research into legislative transparency.