Emily Oster is making strong claims about Omicron and schools based on weak data
The economist says schools should remain open, but her data aren't convincing.
The United States reported one million cases on January 3rd, and it was the first day of school in 2022 for many of our 70 million children. These two facts have collided in a fierce debate about whether schools should remain open or go virtual again.
“The negative externalities of in-person education are not as great as those associated with remote schooling, which in many districts means no schooling at all. The cost to children’s education and to their broader families’ ability to function is simply too large. Mask mandates for public events make more sense; the potential negative externality of a single person super-spreading to a large group outweighs the minor inconvenience of wearing a face covering.”
Oster’s work is influential. Florida Governor Ron DeSantis cited her work this summer in an executive order banning mask mandates at schools:
“WHEREAS, despite recent Centers for Disease Control and Prevention (CDC) ‘guidance,’ forcing students to wear masks lacks a well-grounded scientific justification; indeed, a Brown University study analyzed COVID-19 data for schools in Florida and found no correlation with mask mandates.”
I will leave it to public health experts to dispute Oster’s argument of low risks to children, teachers, parents, and grandparents from schooling in-person and without masks. For example, see Loretta Torrago, Tracy Lam-Hine, et al., and Julia Raifman.
I will focus on the defense that Oster is basing her policy advice on data. Here’s one of many economists this morning:
I agree that a data-driven approach is important, and I agree that we should not call her a monster. My concern is that the data driving Oster’s work, the Covid-19 School Data Hub she created, aren’t good enough to inform school policies. Here’s why.
Data Hub does not fully reflect the experiences in our K-12 schools during the pandemic.
It’s clear from the map that the data aren’t representative of instruction at schools in the United States during the pandemic. Decisions about schooling are made at the local level, and localities approach the pandemic differently. One state cannot stand in for another, let alone one school district for another.
The instruction data do not have complete information on Covid-19 cases in schools or mask policies. Moreover, even if one series is available in a state, that does not guarantee the other two. That makes Data Hub even less representative and thus less useful for analysis.
The holes in the data are not that surprising. Data Hub contacted states during a pandemic and asked them to voluntarily report information, some of which they had never collected before Covid. Hunting down, formatting, and checking these data for accuracy was unlikely to have been a priority for anyone in these states.
Rapid construction of the data set and the lack of external, independent review are problematic.
Collecting data poses formidable challenges. I know. I was on a team of economists, research assistants, and computer programmers at the Fed who constructed a new data set with daily, geographic retail sales using data from a large payment processor. We took years to understand and filter the underlying data, construct statistics, and compare our estimates to the official, representative statistics. And that was before we used it for our policy work at the Fed. See our chapter in the NBER volume, Big Data for Twenty-First-Century Economic Statistics, for our methods and applications.
I understand the urgency but note that Oster began producing high-profile work soon after the data collection began. Reading her data summary is eye-opening. She faced formidable challenges, arguably more complex than we did at the Fed.
Specifically, they collected three types of school-level data: instruction format (in-person, virtual, or hybrid), masking policy, and Covid cases. For every kind of data, they faced numerous hurdles. Here’s their description of the Covid-19 case data:
Similar to the learning model data, raw COVID-19 case data were available in various formats, including single or multiple spreadsheets, PDFs, and publicly-available datasets. The data reflected several inconsistencies across states, with some states reporting new cases per week, others reporting active cases within the past two weeks or one month, and others reporting cumulative cases. Moreover, some states separated staff and student counts, while others aggregated this information or masked case counts. We created a consistent data structure across states, including time intervals at which the state reported the data, NCES identifiers such as IDs and school/district type, and all available COVID-19 case data.
Now multiply such data issues by three. Except it’s more complicated than that. While the information on the instruction format and Covid-19 cases came from states, much of their data on mask policies came from 1,500 superintendents and principals:
The third set of data available from CSDH includes masking policy data by school districts across the U.S. throughout the 2020-21 school year. These data document whether and when school districts had masking policies in place for staff and students (separately). In all, 31 states and the District of Columbia had state-level mask mandates in place through at least May 14, 2021; for the purposes of this project, we considered these to be full-year mask mandates for all districts in these states.
The remaining 19 states had either a state-level mask mandate that ended at some point during the 2020-21 school year (prior to May 1), or no mandate in place at any time. For these states, the CSDH administered a brief survey to district leaders (commonly, superintendents or principals), asking them to indicate if the district had a mask policy in place at any time for staff and students, and if so, to indicate when. Given the demands on education leaders’ time, the survey did not delve into the nuance of the varied policies in place in school districts across the U.S., such as policies on school buses, in the hallways, outside of the school building, and policies within classrooms that distinguished between, for example, instruction time and group work. Surveys were administered to nearly 6,000 education leaders (n=5,962) between June 23-July 31, 2021, with an average response rate of 28% by state (n=1,586).
So a key variable in their study is not even representative of 19 states. THESE are the data that were pivotal in the DeSantis order. Crushing.
Again, I get it. Covid is a crisis, and in a crisis, we do not have time to dot every i and cross every t. But we, as experts, have a responsibility to policymakers and everyday people to match the strength of our recommendations to the strength of our data. When I read Oster, I see a tone and conviction that far exceed what her limited data can support.
Funding sources raise a red flag about a conflict of interest in schools versus businesses.
Creating a new data set like hers requires substantial resources. Data Hub acknowledges funding from Emergent Ventures at the Mercatus Center (launched with a grant from the Thiel Foundation), the Chan Zuckerberg Initiative, and Arnold Ventures. I applaud wealthy individuals who have poured money into research during the Covid crisis. And, it’s essential to think through potential conflicts of interest.
During the pandemic, business interests and public health have wrongly been portrayed in opposition: whether it’s ending extra jobless benefits to address labor shortages or bringing back indoor dining at bars and restaurants to increase sales. Neither of those measures from the business lobby helped solve the economic problem, but they did expose more workers and customers to Covid.
The question about keeping schools open is also wrapped up in the economy, because millions of parents left work when schools went virtual in 2020. Bringing these workers back might alleviate some of the hiring difficulties among businesses, and doing so likely depends on school and child care being reliable again.
In no way am I suggesting that Oster is shaping her data or her analysis to satisfy wealthy funders with ties to big business. But I doubt the funders randomly chose her efforts to support. In fact, Data Hub was started in June 2021 after Oster had already argued publicly against mask mandates in schools. Again, data require funding. Not everyone has the luxury of working at the Fed, which prints money. I get it. We need data, and we must be transparent about the sources. Lives are on the line.
Data-driven analysis is necessary, not sufficient for policy.
Data-driven policymaking is good, but it’s not good enough. It’s the quality of the data and the analysis that’s decisive. Anyone can bring data to a debate, but few can get high-quality data and draw correct inferences.
No data are perfect. As every economist and policymaker knows, we never have exactly what we need. The solution is not to hang our hat on one data set. We must find several data sets and have independent experts help interpret them.
I know how to sift through macroeconomic and household data and analysis from academics and other researchers and find the gems. That’s how I was trained at the Fed. I learned to be wary of “surprising” results from one data set or research team. Sometimes they are right. Mostly they are incomplete, and occasionally they are flat-out wrong. Now is not the time to be wrong.
People-driven is the most important policy approach.
Experts must evaluate data quality, especially when it’s the basis for policy advice. When doing so, we must never forget the millions of children and their families whose lives those data represent.
These children are not alone. They can catch up on missed school, but they can never get their mother back. She can never watch them grow old. Put that in your model.
When Cindy Dawkins, a restaurant employee in Florida, died of COVID-19 in August, the single mom left behind four children.
Now her kids, Jenny, 24, Tre, 20, Zoey, 15, and Sierra, 12, are among the estimated tens of thousands of children marking the holiday season after suffering the loss of a parent or caregiver due to COVID-19.
"You could feel an emptiness," said Tre of celebrating the holidays without his beloved mom. "I don't know how we can go about fixing that, but it has been a weird experience."
Dawkins and her children, of Boynton Beach, were celebrating her 50th birthday in August when she began to feel sick and declined rapidly, according to Jenny and Tre. They say the last images they have of their mom are her in the back of an ambulance.
Less than 48 hours after being taken to the hospital, Dawkins, who was not vaccinated, died.
“I got to the hospital and walked into the entrance and saw my cousin crying [and] my aunt told me, ‘Your mom is gone,’” recalled Jenny. “That was the worst day of my life.”
These four children in Florida are not alone. They were in person in a school that did not require masks (see Oster’s data above), and they do not have a mother anymore.
Data-driven does not mean error-free. Emily Oster’s arguments about keeping schools open now are data-driven, but there are massive holes in her data. Downplaying the weakness and forging ahead with life-and-death advice is not the path to good policy.
Emily may be right that the risks to children in school are lower than out of school. She may be right that the big educational losses from virtual learning during the pandemic will never be made up. But how low is a low risk, and how big is a big loss?
That’s a values question. It’s not one that economists like her and me typically have the data and tools to answer. And it’s the most critical question that elected officials need to grapple with as they make decisions.
It’s the question no one seems to want to answer. We must.