This dataset tracks how members of the United States Congress have talked about immigrants and immigration across 152 years of legislative debate. The dataset covers every congressional session from the 43rd (1873–75) through the 119th (2025–27), drawing on the full text of the Congressional Record. Each paragraph mentioning immigration or immigrant groups was extracted, classified for sentiment, rhetorical framing, and target group, and aggregated into the visualizations you see here.
The majority of speeches come from the Stanford Congressional Record Database, which digitized and structured the Congressional Record from 1873 through 2017 (114th Congress). The raw corpus contains millions of paragraphs spanning floor speeches, debate excerpts, and committee proceedings from both the House and Senate. The remaining floor speeches, for the 115th through 119th Congresses, were downloaded from the GovInfo Congressional Record.
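For readers who want to convert between Congress numbers and calendar years, the mapping is simple arithmetic: the 1st Congress convened in 1789 and each Congress spans two years. The sketch below is illustrative only; the function name `congress_years` is a hypothetical helper, not part of the dataset's tooling, and it ignores the exact convening month (March in earlier eras, January today).

```python
def congress_years(n: int) -> tuple[int, int]:
    """Return the (start, end) calendar years of the n-th Congress.

    Hypothetical helper: the 1st Congress began in 1789 and each
    Congress lasts two years, so the start year is 1789 + 2 * (n - 1).
    """
    if n < 1:
        raise ValueError("Congress numbers start at 1")
    start = 1789 + 2 * (n - 1)
    return start, start + 2


# Congress numbers mentioned in the text
for n in (43, 114, 118, 119):
    start, end = congress_years(n)
    print(f"Congress {n}: {start}-{end}")
```

This reproduces the ranges quoted in the text, such as 1873–75 for the 43rd Congress and 2025–27 for the 119th.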
Each paragraph was passed through Gemini 2.0 Flash (via Google Vertex AI) and classified along four dimensions:
The classifier also flags paragraphs that contain stereotyping: broad, group-level generalizations about behavior, character, or capabilities.
Sentiment over time traces how the emotional tone of immigration discourse shifts across sessions, broken out by party. Across all 152 years, a majority of immigration mentions carry negative sentiment. The fraction has risen in recent decades, with Republican negative rates exceeding 70% in the 118th Congress (2023–25).
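A negative-sentiment rate of this kind can be computed as the share of classified paragraphs labelled negative within each session and party. The following is a minimal sketch under stated assumptions: the field names `congress`, `party`, and `sentiment`, the label value `"negative"`, and the toy rows are all hypothetical and do not reflect the dataset's actual schema or figures.

```python
from collections import defaultdict


def negative_rate(rows):
    """Fraction of paragraphs labelled 'negative' per (congress, party).

    `rows` is an iterable of dicts with hypothetical keys
    'congress', 'party', and 'sentiment'.
    """
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for row in rows:
        key = (row["congress"], row["party"])
        totals[key] += 1
        if row["sentiment"] == "negative":
            negatives[key] += 1
    return {key: negatives[key] / totals[key] for key in totals}


# Made-up toy rows for illustration only, not real data
toy_rows = [
    {"congress": 118, "party": "R", "sentiment": "negative"},
    {"congress": 118, "party": "R", "sentiment": "negative"},
    {"congress": 118, "party": "R", "sentiment": "negative"},
    {"congress": 118, "party": "R", "sentiment": "positive"},
    {"congress": 118, "party": "D", "sentiment": "negative"},
    {"congress": 118, "party": "D", "sentiment": "positive"},
]

rates = negative_rate(toy_rows)
for (congress, party), rate in sorted(rates.items()):
    print(f"Congress {congress}, party {party}: {rate:.0%} negative")
```

Grouping by session rather than by calendar year keeps each rate tied to a single Congress, which matches how the chart is described.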
Stereotyping rate shows the fraction of paragraphs flagged as containing stereotyping language. Stereotyping among Republican speakers has risen steadily since the late 1980s.
Target groups over time shows which immigrant communities attracted congressional attention in each era: Chinese immigrants in the 19th century, refugees in the 20th century, Jewish and Soviet immigrants in the 1980s, and undocumented immigrants since the 1980s.
Framing categories capture the rhetorical strategies used: economic threat framings dominate in periods of nativist backlash; humanitarian and legal-procedural frames rise during refugee crises and reform debates.
Partisan sentiment gap charts the difference between Republican and Democratic negative-sentiment rates for each target group across time. Positive values mean Republicans were more negative; negative values mean Democrats were. The gap has widened sharply since the mid-1990s for almost every group.
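The sign convention can be made concrete with a short sketch: the gap is the Republican negative-sentiment rate minus the Democratic one for the same session and target group. This is an illustration under assumptions, not the site's actual code; the function name `partisan_gap`, the `(congress, group, party)` key layout, the party codes `"R"` and `"D"`, and the toy rates are all hypothetical.

```python
def partisan_gap(rates):
    """Republican minus Democratic negative-sentiment rate.

    `rates` maps hypothetical (congress, group, party) keys to a rate
    between 0 and 1. Pairs missing either party are skipped.
    """
    gaps = {}
    for (congress, group, party), rate in rates.items():
        if party != "R":
            continue
        dem_rate = rates.get((congress, group, "D"))
        if dem_rate is not None:
            gaps[(congress, group)] = rate - dem_rate
    return gaps


# Made-up toy rates for illustration only, not real data
toy_rates = {
    (104, "group_a", "R"): 0.75,
    (104, "group_a", "D"): 0.5,
    (104, "group_b", "R"): 0.25,
    (104, "group_b", "D"): 0.5,
    (104, "group_c", "R"): 0.5,  # no Democratic rate, so no gap
}

gaps = partisan_gap(toy_rates)
for (congress, group), gap in sorted(gaps.items()):
    leaning = "Republicans" if gap > 0 else "Democrats"
    print(f"Congress {congress}, {group}: gap {gap:+.2f} ({leaning} more negative)")
```

Skipping pairs that lack a rate for either party avoids reporting a gap where only one side spoke about a group in a given session.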
Geographic distribution maps the state origin of each speech.
Narratives presents selected quotes about individual immigrant groups. The quotes show a common pattern: new immigrant groups are initially viewed with suspicion, then recognized for their economic and cultural contributions, and finally held up as a model to denigrate more recent immigrant waves by comparison.
Please cite the Stanford Congressional Record Database, which powered most of this analysis. For a peer-reviewed analysis of immigration sentiment in this data (through 2020), see Card et al. (2022), "Computational analysis of 140 years of US political speeches reveals more positive but increasingly polarized framing of immigration," PNAS.