In-Congress Performance

Rank | Model | Weighted F1 | Date | Link
1 | Logistic Regression | 0.877 | 2024-02-21 | url

Out-of-Congress Performance

Rank | Model | Weighted F1 | Date | Link
1 | Logistic Regression | 0.871 | 2024-02-21 | url

What is a Policy Area?

A policy area is a broad category of public policy that includes a wide range of related policies. For example, the policy area of “Health” includes policies related to healthcare, public health, and health insurance. The policy area of “Education” includes policies related to K-12 education, higher education, and vocational training. The policy area of “Economic Development” includes policies related to job creation, workforce development, and business incentives.

[congress.gov] maintains an official list of policy areas used to classify legislation. We additionally include Private Legislation as a policy area; it is not on the official list but does appear as a designation in the data. This brings the total to 33 policy areas.

If the dataset is later extended to earlier Congresses, we may also need to include Commemorations, which was formerly a policy area but is no longer in use.

The Challenge

Using data collected from [congress.gov], we have created a dataset of legislation that has been introduced in the United States Congress. This data includes:

  • A unique identifier for each piece of legislation
  • The congress in which the legislation was introduced
  • The title of the legislation (display title)
  • The summary of the legislation (earliest version available)
  • The full text of the legislation (earliest version available)
  • The policy area of the legislation

Currently, the data includes the 115th Congress (2017-2018) through the 117th Congress (2021-2022).

In general, the challenge is to build a model that can accurately predict the policy area of a piece of legislation based on its title, summary, and/or full text: $$ \text{Policy Area} = f(\text{Title}, \text{Summary}, \text{Full Text}) $$

The goal of this challenge is to train on a single Congress and either:

  • Predict on a held-out set within the same Congress
  • Predict on a separate Congress (either extrapolating to a future Congress or “interpolating” to/recalling a past Congress)
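As a rough illustration of the two settings above, the sketch below enumerates every (training Congress, evaluation Congress) pair for the three Congresses currently in the data. The function name `evaluation_pairs` and the labels `"in-congress"`, `"extrapolate"`, and `"interpolate"` are hypothetical names chosen for this example; they are not part of the dataset or any official tooling.

```python
def evaluation_pairs(congresses):
    """List (train, test, kind) triples for models trained on a single Congress."""
    pairs = []
    for train in congresses:
        for test in congresses:
            if test == train:
                kind = "in-congress"   # held-out set within the same Congress
            elif test > train:
                kind = "extrapolate"   # predicting on a future Congress
            else:
                kind = "interpolate"   # recalling a past Congress
            pairs.append((train, test, kind))
    return pairs


pairs = evaluation_pairs([115, 116, 117])
for train, test, kind in pairs:
    print(f"train={train} test={test} setting={kind}")
```

With three Congresses this gives three in-Congress settings and six out-of-Congress settings.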


You can download the data directly from Hugging Face Datasets: hhieden/us-congress-bill-policy-115_117.


In-Congress performance will be evaluated using K-fold cross-validation with K=3. For each congress, perform a 3-fold cross-validation in which the folds are stratified by policy area. Performance on each fold is measured with a weighted F1 score. Final in-Congress performance is the weighted F1 score averaged across the 3 folds, and then averaged across the congresses included. In math:

$$ \text{In-Congress Performance} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{3} \sum_{k=1}^{3} \text{Weighted F1 Score}_{i,k} $$

where $N$ is the number of congresses included in the evaluation.
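The fold construction and the double average can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the official evaluation code: `stratified_fold_ids` and `in_congress_performance` are hypothetical helper names, the round-robin assignment is just one simple way to stratify, and the per-fold scores in the usage example are made-up numbers for illustration only.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_fold_ids(labels, k=3, seed=0):
    """Assign each example a fold id in [0, k), spreading every label evenly across folds."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    fold_ids = [0] * len(labels)
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Deal the shuffled examples of this policy area round-robin into the k folds.
        for position, idx in enumerate(indices):
            fold_ids[idx] = position % k
    return fold_ids


def in_congress_performance(fold_scores_by_congress):
    """Average the per-fold scores within each congress, then average across congresses."""
    per_congress = [mean(scores) for scores in fold_scores_by_congress.values()]
    return mean(per_congress)


# Toy labels: 6 "Health" bills and 3 "Education" bills.
labels = ["Health"] * 6 + ["Education"] * 3
fold_ids = stratified_fold_ids(labels, k=3)

# Illustrative (made-up) weighted F1 scores for 3 folds in each of 2 congresses.
example_scores = {115: [0.8, 0.9, 1.0], 116: [0.6, 0.7, 0.8]}
overall = in_congress_performance(example_scores)
print(fold_ids, overall)
```

In practice a library routine such as scikit-learn's `StratifiedKFold` would likely be used instead of the hand-rolled split; the version here only makes the stratification idea explicit.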

Out-of-Congress performance will be evaluated on each held-out congress in full, with no cross-validation. Final out-of-Congress performance is the weighted F1 score averaged across the included congresses, excluding those used for training.
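For reference, the weighted F1 score and the out-of-Congress average can be sketched as below. The weighted F1 here follows the usual definition (per-class F1 weighted by the number of true examples of each class); the helper names `weighted_f1` and `out_of_congress_performance`, and the toy labels and predictions, are assumptions made for this example.

```python
from collections import Counter
from statistics import mean


def weighted_f1(y_true, y_pred):
    """Per-class F1, weighted by each class's share of the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    support = Counter(y_true)      # TP + FN per class
    predicted = Counter(y_pred)    # TP + FP per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = 0.0
    for label, n_true in support.items():
        # F1 = 2TP / (2TP + FP + FN) = 2TP / (support + predicted count)
        f1 = 2 * true_pos[label] / (n_true + predicted[label])
        total += n_true * f1
    return total / len(y_true)


def out_of_congress_performance(predictions_by_congress, train_congresses):
    """Average weighted F1 over every congress that was not used for training."""
    scores = [
        weighted_f1(y_true, y_pred)
        for congress, (y_true, y_pred) in predictions_by_congress.items()
        if congress not in train_congresses
    ]
    return mean(scores)


# Toy example: a model trained on the 115th Congress, scored on the 116th and 117th.
toy = {
    115: (["Health", "Taxation"], ["Health", "Taxation"]),
    116: (["Health", "Health", "Education", "Taxation"],
          ["Health", "Education", "Education", "Taxation"]),
    117: (["Health", "Education"], ["Health", "Education"]),
}
score_116 = weighted_f1(*toy[116])
out_score = out_of_congress_performance(toy, train_congresses={115})
print(score_116, out_score)
```

The training congress (115 in the toy example) is skipped, so the result is the mean of the scores on the two held-out congresses.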

This gives us two scalar scores: in-Congress performance and out-of-Congress performance.