Bulk Downloads of Congressional Data Now Available

Using the ProPublica Congress API, developers can access details on each of the thousands of bills introduced in every two-year session. But until now they had to download those details one bill at a time, and they had to know how to write API calls in code. Now you can download information on all of the bills introduced in each session in a single file, thanks to the bulk bill data set we’re announcing today.

You can get this data for free starting right now from the ProPublica Data Store. A data dictionary and an example file are available here.

Twice a day, we generate a single zip file containing metadata for every bill introduced in the current Congress, including who sponsors and cosponsors the bill, actions taken by committees, votes on the floor and a summary of what the bill would do. So every time you download the bulk bill data for the 115th Congress, you’ll have the complete, up-to-date data set.

You can also download archives of bill data for past congresses, going back to 1973 — when current House Speaker Paul Ryan was only 3 years old and one of the bills debated was the now-familiar ERISA, a law governing employer-sponsored retirement plans.

To produce the files, we’re using the same codebase that powers our Congress API, along with the work of sites like the @unitedstates project. That open-source effort, which relies on volunteers, is a significant part of our congressional data. The bulk bill downloads replace the files previously available on that site and on the Sunlight Foundation’s site.

Bulk downloads are useful for developers and journalists who need the entire set of legislation but want to avoid gathering it one bill at a time. Bill files are available in JSON and XML formats.
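
For developers who want a starting point, here is a minimal Python sketch of reading the JSON version of a bulk file. It assumes the zip archive holds one .json file per bill and that each file parses to a single JSON object; neither detail is confirmed here, so check the data dictionary and example file first. The function names are our own illustration, not part of any ProPublica library.

```python
import json
import zipfile
from collections import Counter


def iter_bills(zip_source):
    """Yield one parsed object per .json member of the archive.

    zip_source may be a file path or a binary file-like object.
    Assumption: each .json member holds exactly one bill.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith(".json"):
                with archive.open(name) as member:
                    yield json.load(member)


def count_by_field(zip_source, field):
    """Tally the string values of one top-level field across all bills.

    Bills that lack the field, or whose value is not a plain string,
    are skipped rather than guessed at.
    """
    counts = Counter()
    for bill in iter_bills(zip_source):
        if not isinstance(bill, dict):
            continue
        value = bill.get(field)
        if isinstance(value, str):
            counts[value] += 1
    return counts
```

Calling count_by_field with the path to a downloaded archive and a field name taken from the data dictionary returns a Counter mapping each value of that field to the number of bills that carry it. Because the archive is read in place, nothing needs to be unzipped to disk first.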

What can you do with this data? Anything you want, but we hope it’ll be useful to researchers, journalists and any other citizen trying to better understand our country’s legislature. You might explore how Congress’ focus on various issues has shifted, the roles of committees in passing — or delaying — legislation or, on the lighter side, the rise of sometimes-implausible acronyms in bill names.

A note for users of our Congress API interested in using the bulk download data: The bulk files don’t match the API bill endpoint exactly; they contain a subset of the fields that endpoint returns. A data dictionary of the fields and an example are available here.

Finally, a reminder for users of the Sunlight Congress API: We will shut down that service Sept. 30, 2017. After that date, the API will no longer respond to requests. We encourage users to migrate to the ProPublica Congress API.
