Jon Rothbaum
@jlrothbaum.bsky.social
Economist, U.S. Census Bureau, Returned Peace Corps Volunteer, Ecuador (All opinions are mine).
New research from the NEWS project on income and poverty from 2016-2021
Using survey, census, admin and commercial data, we show that survey estimates understate income and overstate poverty.

Release at www.census.gov/data/experim..., interactive plots at jrothbaum.github.io/news.html
Disclaimers: All opinions are my own. The results shown above were approved for release under Disclosure Review Board (DRB) approval number CBDRB‑FY25‑0280.
Thanks so much to my coauthors Adam Bee, John Creamer, Josh Mitchell, Nikolas Mittag, Elizabeth Pelletier, Carl Sanders, Lawrence Schmidt, and Matt Unrath. See how NEWS affects estimates for different groups at jrothbaum.github.io/news.html and the official release at www.census.gov/data/experim...
National Experimental Well-Being Statistics Project (NEWS)
jrothbaum.github.io
As noted above, the bias can vary a lot across groups and over time. Underreporting of UI benefits can cause bias in child poverty (their parents are likely to work and therefore collect UI in a downturn) but won’t impact elderly poverty much.
Likewise, we see more missing income when income shifts from well-reported sources (like wage and salary earnings) to ones with greater underreporting (like unemployment insurance, or UI), as it did in 2020 and 2021.
We said pandemic nonresponse bias started affecting the data in 2020, but how do we know this? We can look at the results after each step. Our weighting adjustment only starts affecting our estimates for surveys conducted in 2020, which is why it affects income estimates from 2019 forward (each survey asks about the prior calendar year).
We do several things, including 1) weighting to adjust for nonresponse bias (income may be correlated with survey response), 2) imputation (not everyone answers income questions on surveys), and 3) combining survey and administrative record (adrec) data (what’s the right number when they disagree?).
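To make step 1 concrete, here is a minimal sketch of a textbook weighting-class nonresponse adjustment. It is not the NEWS method, which the thread does not spell out; the function name `nonresponse_weights`, the "low"/"high" cells, and all numbers are hypothetical. The idea is that if response rates differ across groups that can be identified for everyone sampled (for example, from linked records), respondents in low-response groups get scaled up so each group keeps its full weight.

```python
from collections import defaultdict

def nonresponse_weights(sample):
    """Scale respondents' base weights so each cell keeps its full sampled weight.

    `sample` is a list of dicts with keys 'cell', 'base_weight', 'responded'.
    Returns a new list containing respondents only, with an added 'weight' key.
    """
    total = defaultdict(float)      # base weight of everyone sampled, by cell
    responding = defaultdict(float) # base weight of respondents, by cell
    for unit in sample:
        total[unit["cell"]] += unit["base_weight"]
        if unit["responded"]:
            responding[unit["cell"]] += unit["base_weight"]

    adjusted = []
    for unit in sample:
        if unit["responded"]:
            factor = total[unit["cell"]] / responding[unit["cell"]]
            adjusted.append({**unit, "weight": unit["base_weight"] * factor})
    return adjusted

# Hypothetical toy sample: half of the "low" cell responds, all of the "high" cell does.
sample = (
    [{"cell": "low", "base_weight": 100.0, "responded": r}
     for r in (True, True, False, False)]
    + [{"cell": "high", "base_weight": 100.0, "responded": True} for _ in range(4)]
)

adjusted = nonresponse_weights(sample)
total_weight = sum(u["weight"] for u in adjusted)
low_share = sum(u["weight"] for u in adjusted if u["cell"] == "low") / total_weight
```

In this toy sample the "low" cell makes up one third of the unadjusted respondent weight. The adjustment restores it to its sampled share of one half.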
Our post-tax income+in-kind transfer measure mirrors the resource measure used to calculate the Supplemental Poverty Measure (SPM).

The NEWS SPM rate is 1.7 to 3.5pp lower than the survey, depending on the year, with as many as 11.5 million fewer people in poverty.
Split the effect by age, and we see the biggest change is among seniors, who tend to underreport other sources of retirement income (from 2018).
We estimate three income measures: money income, post-tax income, and post-tax income+in-kind transfers (excluding health insurance).

Relative to the survey, our estimates of all three measures increase across the income distribution (shown from 2018)
In the prior release, we expanded the resource measures we estimate to include taxes, credits, and in-kind benefits. We use linked adrecs to address survey underreporting of multiple safety net programs and linked tax returns to improve estimates of taxes and filing behavior.
Beyond the tl;dr! We use CPS ASEC (source of official income and poverty), 1040s, W2s, info tax returns, LEHD, ACS, census, OASDI and SSI payments, federal and state safety net data (housing assistance, SNAP, TANF, and WIC), firm data, and commercial data on home values.
This is the third release of the National Experimental Well-Being Statistics (NEWS) Project at Census. In this release, we use the same methods as the prior one, but cover additional years.

Latest release here: www.census.gov/data/experim...
Prior release here: bsky.app/profile/jlro...
This varies by group. Parents have mostly wage and salary earnings, which are well reported: not much normal bias, but lots of underreporting of UI in 2020. Those 65+ have lots of retirement income: lots of normal bias, but not much change in 2020 or 2021.
But the bias varies by year: 1) in each year there’s “normal” bias from underreporting of income, like pensions; 2) from 2020 on, high-income households respond at higher rates; and 3) some income is reported better than others, and UI is not well reported, so UI ↑ in 2020 => bias ↑
The difference between NEWS and official survey estimates can be large. In 2020, the NEWS estimate of official poverty is 2.4pp lower than the survey, with 8 million fewer people in poverty.
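As a quick consistency check on those two figures (a back-of-the-envelope sketch, not a calculation from the release), a gap in percentage points and a gap in headcount together imply a population base. Both inputs are rounded in the post, so the result is only approximate.

```python
# Figures quoted in the post for 2020 official poverty (both rounded).
pp_gap = 2.4              # NEWS rate minus survey rate, in percentage points
fewer_people = 8_000_000  # difference in the number of people in poverty

# A percentage-point gap times the population equals the headcount gap,
# so the implied population is headcount / (pp / 100).
implied_population = fewer_people / (pp_gap / 100)
implied_millions = round(implied_population / 1e6)
```

The implied base of roughly 333 million is broadly in line with the size of the U.S. population, so the two numbers are consistent with each other.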
In my testing it's 2-5x faster than my best attempt to do this using Stata's Python integration (and my Python-based solution, which uses batched file handling and multiprocessing, is way faster than the standard code for large files).
I've done benchmarks (see github.com/jrothbaum/st...). Stata is efficient at loading dta files and I can't match that, but parquet files are faster to load if the file is large and you only need a subset of columns - that's where parquet shines
GitHub - jrothbaum/stata_parquet_io: Read and write parquet files to stata using polars rust
github.com
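Why does reading a subset of columns help so much? A toy sketch of the idea behind columnar formats, using only the Python standard library: each column is stored as one contiguous block, with a small index at the end saying where each block starts. This is not the Parquet format and not how stata_parquet_io works internally; the layout, the function names `write_columnar` and `read_columns`, and the data are all hypothetical.

```python
import io
import json
import struct

def write_columnar(columns):
    """Serialize each column as one contiguous block of doubles, then a footer index."""
    buf = io.BytesIO()
    index = {}
    for name, values in columns.items():
        index[name] = (buf.tell(), len(values))  # (byte offset, row count)
        buf.write(struct.pack(f"<{len(values)}d", *values))
    footer = json.dumps(index).encode()
    buf.write(footer)
    buf.write(struct.pack("<I", len(footer)))  # footer length is the last 4 bytes
    return buf.getvalue()

def read_columns(blob, wanted):
    """Read only the requested columns; also return how many data bytes were touched."""
    (footer_len,) = struct.unpack("<I", blob[-4:])
    index = json.loads(blob[-4 - footer_len:-4])
    out, bytes_read = {}, 0
    for name in wanted:
        start, n = index[name]
        chunk = blob[start:start + 8 * n]  # 8 bytes per double
        out[name] = list(struct.unpack(f"<{n}d", chunk))
        bytes_read += len(chunk)
    return out, bytes_read

# Hypothetical three-column, three-row table.
table = {
    "wages": [52000.0, 0.0, 31000.0],
    "ui_benefits": [0.0, 4800.0, 1200.0],
    "pension": [0.0, 0.0, 9000.0],
}
blob = write_columnar(table)
subset, touched = read_columns(blob, ["ui_benefits"])
```

Here only 24 of the 72 data bytes are read to recover one column. A row-oriented file would have to scan every record to pull out the same values, which is why the gap grows with file width.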
Parquet is a standard file format with some big advantages: standard format for use in R and python (for multilanguage projects), super compressed relative to dta files (5-10% the size on disk). www.databricks.com/glossary/wha...
What is Apache Parquet?
Learn more about the open source file format Apache Parquet, its applications in data science, and its advantages over CSV and TSV formats.
www.databricks.com
Reposting the image in a different format...
2) estimating income and poverty when many or all of the adrecs are not yet available, both for timely estimates (some adrecs arrive with months or years of lag) and going back in time, and 3) estimating income and poverty at lower geographic levels (state, county, tract). (n/n)
We plan to release more years of data later in 2025. And we’re on to the next set of goals: 1) addressing underreporting (in the survey and adrecs) of other items such as self-employment earnings and rental income. (19/n)