Recent results from national science tests confirm a disturbing trend: scores are declining among the lowest-performing students. And those responsible for overseeing the tests are sending muddled messages about why that’s happening.
Every few years, as part of a program called the National Assessment of Educational Progress (NAEP), a representative sample of American students is tested to see how much they know about science. The most recent results, from 2019, were released this week. Only about a third of fourth- and eighth-graders managed to score at a level deemed “proficient” or above; at twelfth grade, fewer than a quarter cleared that bar. Even more disturbing, about 30% of students at the two lower grade levels and 40% of twelfth-graders didn’t even reach the “basic” level.
Overall, the dismal scores are either the same as or lower than those from the last round of science tests, given in 2015. And the declines are driven by falling scores among the lowest-achieving students. Peggy Carr, the official at the National Center for Education Statistics (NCES) who regularly announces NAEP results, called it “a concerning pattern we’ve seen before in other subjects.”
Indeed. That’s been the story with NAEP tests in U.S. history and civics as well as the biennial tests that get the most attention—those in reading and math. International assessments show the same trend. The urgent question is why our education system seems to be failing its most vulnerable students. And the officials charged with administering the tests and interpreting the results have trouble providing a consistent answer.
At the virtual event announcing the results on May 25th, a former member of the NAEP Governing Board, science education professor Cary Sneider, had a clear explanation: too much focus on what he called “the basics”—reading and math. Studies show, he noted, that science and social studies are being crowded out of the school schedule. He urged principals and school leaders to “mandate a minimum amount of time on science each day.”
But in an interview with the education news website The 74, the federal official who oversees the NAEP gave a seemingly contradictory explanation. NCES Commissioner James “Lynn” Woodworth suggested that the low scores reflect poor reading comprehension ability. “Reading is a critical skill that’s needed to improve on all subjects across the board,” Woodworth said.
In other words, Sneider says too much time on reading is to blame for low science scores. But Woodworth thinks the problem is that schools are spending too little time on reading. What are conscientious educators and policymakers supposed to do?
To figure that out, they would first need to understand that the term “reading” is being used to cover two very different processes: decoding, or sounding out, words; and understanding what you read. If students can’t fluently decode words—as many at even the high school level can’t, because of deficiencies in teacher training and curriculum—then obviously they won’t do well on any test that requires reading. Those students should be identified and given systematic instruction in foundational reading skills, including phonics. Unfortunately, our testing system doesn’t identify which students need that kind of help.
So Woodworth might be right that part of the problem is poor reading skills—except that he was specifically talking about the other aspect of reading, comprehension. Schools are already spending many hours every week having students practice reading comprehension “skills” like “finding the main idea” and “making inferences,” using easy-to-read texts on a random variety of topics. Doing more of that, as Woodworth seems to be advocating, is likely to keep test scores low rather than improve them.
Studies have shown that reading comprehension depends far more on academic knowledge than on generally applicable skills. So if schools want to boost reading comprehension, they should build students’ academic knowledge—and the vocabulary that goes with it—through a coherent, content-focused curriculum, beginning in kindergarten. The curriculum should cover social studies, science, and the arts, spending at least two or three weeks on a topic rather than being organized around a “skill of the week.” That approach would likely boost scores not just in reading but in other subjects as well, especially for students who are less likely to pick up academic knowledge at home because their families have less education.
In fact, the NAEP Governing Board itself convened a panel of reading experts in 2018 that identified a focus on comprehension skills as the cause of stagnant reading scores. But NAEP officials apparently didn’t pay much attention, because Woodworth isn’t the only one who has made statements that seem to ignore or contradict the panel’s findings.
In 2019, when asked why reading scores were declining more than math scores, Peggy Carr basically just shrugged: “We don’t seem to know how to move the scores for reading but we do for math,” she said. In 2020, discussing dismal results on twelfth-grade reading and math tests, she said “we can’t be sure” why the gap between higher- and lower-performing students was growing. And trying to analyze abysmal results in history, civics, and geography that same year, she—like Woodworth—suggested that the problem was reading skills: “If students struggle with reading, they will struggle to understand these tests too. You need those fundamental skills.” At the same time, she identified the problem as being that “students don’t know the content.”
All this confusion stems from a fundamental lack of clarity not just about the definition of “reading” but also about what reading comprehension tests actually measure. The premise of reading tests like the NAEP is that comprehension can be measured as a discrete skill, despite evidence from cognitive science that it’s highly dependent on background knowledge. That assumption has contributed to a recent flap over proposed revisions to the NAEP reading test. Some members of the Governing Board want to add “support features” to explain terms that students might not understand, like “talent show.” Others on the board appear to agree with cognitive scientists that providing that kind of “support” undermines the whole exercise, because a key part of what the test should assess is whether students have acquired the knowledge and vocabulary needed for comprehension.
Another muddle—which is not of NAEP officials’ own making—stems from the difficulty of measuring score gaps between students from different income groups. The data highlighted at the event on May 25th was framed entirely in terms of racial and ethnic groups, showing that white and Asian students generally far outperformed Black and Hispanic students. When asked about gaps based on income, Woodworth explained that the usual proxy for income—eligibility for free or reduced-price lunch (FRPL)—has become unreliable.
Arguably, it never was all that reliable, since it lumped together different levels of poverty that could have significantly different implications for student outcomes. But more recently, a well-intentioned change in the regulations called the Community Eligibility Option has made it even less useful. So although the NAEP website continues to offer data on gaps between students who are FRPL-eligible and those who are not, officials don’t think it’s worth mentioning when they announce test results. That’s understandable, but omitting any reference to income gives the impression that score gaps are all about race, when in fact they may well have more to do with poverty—or with parental level of education.
What can NAEP officials do about this situation? Not that they’re listening to me, but here’s what I would recommend:
· When announcing results, make it clear that low scores largely reflect lack of content knowledge, not reading comprehension skills—even on reading tests.
· Test students’ decoding and reading fluency skills separately from comprehension. One such NAEP test was given to fourth-graders in 2018, and it showed that students who scored below basic on the NAEP reading comprehension test also scored the lowest on decoding and fluency. No doubt the same is true for many students at upper grade levels—but we’ll never know if we don’t test for it.
· Figure out how to measure students’ socioeconomic status. Woodworth said they’re working on it, but the Community Eligibility Option has been around for over ten years now. It’s past time to come up with some measure other than FRPL. One possibility: announce data that’s disaggregated by parental education level, information that NAEP already collects for eighth- and twelfth-graders. (For example: in 2019, only 15% of eighth-graders whose parents lacked a high school diploma tested proficient in science, compared to 45% of those whose parents had graduated from college.)
· Rename the “reading test” a “test of a diverse array of content knowledge,” as NAEP board member Martin West has suggested—or perhaps better yet, stop giving the test altogether and shift attention to tests in the content areas.
While we’re all holding our breath waiting for any of that to happen, educators and policymakers would be wise to listen to Sneider, not Woodworth—and even go beyond his advice. Spend less time on reading comprehension and more time not just on science but on history, geography, and the arts.