Saturday, April 25, 2009

Which NAEP Achievement Level Is an Appropriate Target for No Child Left Behind?

A great deal has been written lately about whether the National Assessment of Educational Progress (NAEP) achievement level of "Basic" is a more appropriate point of comparison for state definitions of NCLB "Proficiency" than NAEP's own "Proficient" level. In fact, one of our anonymous readers has been asserting that “Basic” is the better target for states to aim at with their No Child Left Behind testing.

This graph examines the percentage of eighth grade students in Kentucky who scored at or above NAEP “Basic,” at or above NAEP “Proficient” and at or above the Benchmark scores from the ACT, Inc.’s EXPLORE readiness test, now given to all eighth graders in Kentucky.

ACT designed the EXPLORE Benchmark scores to indicate that students are on track to have at least a 75 percent chance of earning a "C" and a 50 percent chance of earning a "B" in the first related college courses at a typical US university. The Benchmarks are based on an empirical study conducted by ACT several years ago (More on Benchmarks Here).

The data examined in the graph above is primarily from Kentucky’s NAEP and EXPLORE results from the 2006-07 school year. The same cohort of eighth grade students took both the math and reading portions of both assessments in that school year. NAEP last tested science in 2005, so that data is cross-cohort, but it is still of interest.


If a primary goal of our education system is to prepare students for follow-on education (and with more than half of each graduating high school class in Kentucky now going on for more education, this is a suitable goal), then the NAEP "Proficient" score is clearly the more appropriate gauge of such readiness. In every case, the NAEP "At or Above Basic" score suggests far greater accomplishment than the EXPLORE shows was actually present in Kentucky's 2006-07 eighth grade cohort.

In fact, even the NAEP science score of “Proficient” notably over-represents real levels of adequate preparation in the subject for follow-on study.

A spreadsheet with more data, including some caveats, is available here.


Anonymous said...

ACT has identified EXPLORER benchmark scores for reading that indicate students’ probable readiness for college-level work by the time they graduate from high school. The descriptions of the EXPLORER reading knowledge and skills for the grade 8 benchmark are as follows. [From pages 6 and 9 of the EXPLORER Interpretive Guide for Student and School Reports: 2008/2009. Retrieved April 27, 2009, from ]

Main Ideas and Author’s Approach:
• Recognize a clear intent of an author or narrator in uncomplicated literary narratives.

Supporting Details:
• Locate basic facts (e.g., names, dates, events) clearly stated in a passage.

Sequential, Comparative, and Cause-Effect Relationships:
• Determine when (e.g., first, last, before, after) or if an event occurred in uncomplicated passages.
• Recognize clear cause-effect relationships described within a single sentence in a passage.

Meaning of Words:
• Understand the implication of a familiar word or phrase and of simple descriptive language.

Generalizations and Conclusions:
• Draw simple generalizations and conclusions about the main characters in uncomplicated literary narratives.

Uncomplicated Literary Narratives refers to excerpts from essays, short stories, and novels that tend to use simple language and structure, have a clear purpose and a familiar style, present straightforward interactions between characters, and employ only a limited number of literary devices such as metaphor, simile, or hyperbole.

Let's compare the EXPLORER grade 8 reading benchmark knowledge and skills with those identified for NAEP Basic and NAEP Proficient. [See pages 25 and 26 of the Reading framework for the 2007 National Assessment of Educational Progress. Retrieved April 27, 2009, from ]

NAEP Knowledge and Skills for Basic, Grade 8, are as follows:

“Eighth-grade students performing at the Basic level should demonstrate a literal understanding of what they read and be able to make some interpretations. When reading text appropriate to eighth grade, they should be able to identify specific aspects of the text that reflect the overall meaning, extend the ideas in the text by making simple inferences, recognize and relate interpretations and connections among ideas in the text to personal experience, and draw conclusions based on the text.

“For example, when reading literary text, Basic-level eighth graders should be able to identify themes and make inferences and logical predictions about aspects such as plot and characters. When reading informational text, they should be able to identify the main idea and the author’s purpose. They should make inferences and draw conclusions supported by information in the text. They should recognize the relationships among the facts, ideas, events, and concepts of the text (e.g., cause and effect, order).”

NAEP Knowledge and Skills for Proficient, Grade 8, are as follows:

“Eighth-grade students performing at the Proficient level should be able to show an overall understanding of the text, including inferential as well as literal information. When reading text appropriate to eighth grade, they should be able to extend the ideas in the text by making clear inferences from it, by drawing conclusions, and by making connections to their own experiences—including other reading experiences. Proficient eighth graders should be able to identify some of the devices authors use in composing text.

“For example, when reading literary text, students at the Proficient level should be able to give details and examples to support themes that they identify. They should be able to use implied as well as explicit information in articulating themes; to interpret the actions, behaviors, and motives of characters; and to identify the use of literary devices such as personification and foreshadowing. When reading informational text, they should be able to summarize the text using explicit and implied information and support conclusions with inferences based on the text.”

It is rather obvious from a consideration of these descriptions of reading achievement that the EXPLORE reading benchmark is very much closer to NAEP Basic than to NAEP Proficient. If a state had designated EXPLORER as its NCLB reading test and EXPLORER had passed the peer-review process, then the most appropriate NAEP statistic for confirming the percentage of students meeting or exceeding the EXPLORER benchmark (i.e., the state’s reading AYP statistic) would be NAEP Basic.

Following this simple comparison, a reasonable person would likely conclude that, for education systems preparing students with reading knowledge and skills for post-secondary education, the NAEP Basic reading score, rather than the NAEP Proficient reading score, is indeed the appropriate gauge of such readiness (as ACT has defined “readiness” for the EXPLORER test).

Richard Innes said...

Anonymous 2:15 PM wastes an awful lot of effort in a flawed attempt to claim NAEP “Basic” is somehow more comparable to the EXPLORE benchmark performance than NAEP “Proficient” is. We have heard the same argument a number of times before. In fact, this clearly incorrect assertion is one of the reasons why I developed the graph in the newer Blog item titled .

Go back and look at the graph of the actual test results in the main blog. In no case is an EXPLORE (it’s not “EXPLORER,” by the way) benchmark performance in Kentucky anywhere close to what the NAEP grades as “Basic” performance.

This easy-to-understand graph shows that the NAEP “Proficient” results from actual testing in Kentucky are much closer to the EXPLORE Benchmark results in every case.

By the way, Anonymous 2:15 PM’s convoluted explanation tries to compare descriptions from the “EXPLORE Reading Test College Readiness Standards by Strand and Score Range” to NAEP Frameworks. Because of the way the information is presented, you cannot do that.

The EXPLORE descriptions don’t include just the Benchmark score of 15. They cover a score range from 13 to 15, therefore including performance considerably below what is required to reach the Benchmark score of 15. Because the Benchmark performance isn’t separately and explicitly defined, the attempted comparison to the NAEP Frameworks is fatally flawed.

In any event, Anonymous 2:15 PM’s logic flies in the face of the graph in the main Blog. That graph is the real killer to Anonymous’ incorrect assertions.

Anonymous needs to stop defending low performance targets for our schools. That isn’t going to get us where we need to go.

Anonymous said...


I don't buy your claim that "The EXPLORE descriptions don’t include just the Benchmark score of 15. They cover a score range from 13 to 15, therefore including performance considerably below what is required to reach the Benchmark score of 15. Because the Benchmark performance isn’t separately and explicitly defined, the attempted comparison to the NAEP Frameworks is fatally flawed."

Just look at the reading standards in the score range (16-19) above the Benchmark range. In this range, EXPLORE still uses only "uncomplicated" reading passages. Look at the standards in the second score range (20-23) above the Benchmark range. In this range, EXPLORE still uses only "uncomplicated" reading passages in four of the five reading strands. In the fifth strand, EXPLORE asks the eighth graders only for "simple" generalizations and conclusions.

I'm sorry, but the EXPLORE content even two levels above the Benchmark level simply does NOT provide opportunity for students to perform at the NAEP Proficient level, which "represents solid academic performance for each grade assessed. Students reaching this level have demonstrated competency over challenging subject matter, including subject-matter knowledge, application of such knowledge to real-world situations, and analytical skills appropriate to the subject matter."


As a rule of thumb, researchers pose a question and then identify test(s) with score interpretations (as stipulated by the test publisher) that will answer the question. Using the test(s), they then gather and analyze the data, and report the results (often using tables and graphs).

It just isn't research to find some scores from different tests that look alike, to plot them on a chart, and to provide a personal interpretation of the scores on the chart.

Publishers create tests for a purpose so the publisher's interpretations of the data should be acknowledged. ACT publishes EXPLORE and the National Assessment Governing Board publishes NAEP. If you want to compare EXPLORE and NAEP data, you must first establish some relationship between what the two publishers say their scores mean.

When I look at the publishers' interpretations of the data, I find that they differ considerably from your personal interpretation of their data. The graph doesn't warrant consideration; it's not research.

Richard Innes said...


The facts are what they are. You cannot dismiss them. You cannot argue them out of existence.

Given the final test score results I show in the main Blog, I doubt anyone other than yourself would be so unwise as to even attempt to assert that the EXPLORE benchmark is somehow closer to NAEP "Basic" than NAEP "Proficient." That nonsense is absolutely discredited by the actual score results from these tests.

Anyway, we welcome your comments, as we welcome all comments, because even when our commentators are misinformed, that still helps us amplify some of our points for our generally very astute and intelligent readers.

Anonymous said...

I can't post my graph with my comment. It looks very different from yours. Let me at least tell you about it.

After deciding that NAEP Basic and the EXPLORE Benchmark described about the same achievement level, I went to the NAEP Data Explorer to get Kentucky's percent Basic and Above and percent Proficient and Above for the reading and math assessments in Winter 2007 and the science assessment in Winter 2005. I then found the "Kentucky EXPLORE Profile Summary Report" for the Fall 2007 assessment. Here's what I found:

69% Basic and Above
27% Proficient and Above
61% EXPLORE Benchmark

73% Basic and Above
28% Proficient and Above
76% EXPLORE Benchmark

63% Basic and Above
31% Proficient and Above
44% EXPLORE Benchmark

On my graph it is painfully obvious that for reading and math, at least, the appropriate comparisons are NAEP Percentage at Basic and Above versus the Percentage at the EXPLORE Benchmark and Above.

The science comparison is not so clear. Maybe this is related to the fact that NAEP and the EXPLORE test were administered in different years to different groups of students. But maybe not. Further investigation into science achievement is warranted (if anybody is interested).

You say, "The facts are what they are. You cannot dismiss them. You cannot argue them out of existence." Well, you got one out of three correct. Your "facts" are wrong. I dismiss them without hesitation. You have it right when you say that I can't argue your "facts" out of existence. Your "facts" will always exist (if no place else, then on the internet), but they will always be wrong. Sorry.

Anonymous said...

I'd retract the last post if I could, but it will live forever on the internet. The data that I used for EXPLORE are incorrect. I will have to look at this again. Sorry for wasting your time.

Richard Innes said...


I salute your honesty. I think you will find that the figures in my post are accurate.

Also, if you want a true apples-to-apples cohort, you need to look at the fall 2006, not the fall 2007 EXPLORE results.

The EXPLORE is given a couple of months after each school term starts, while NAEP, as you mention, is given around the end of winter. The administrations are only a few months apart, but they occur in different calendar years.

Thus, the proper comparison school term is the 2006-07 academic year, and you will need to compare Fall 2006 data to late winter 2007 NAEP data if you want to look at the same cohort of students.

Anonymous said...

My apology to the universe for a lapse in judgment!

Last night after I came up with corrected NAEP and EXPLORE numbers (which agreed with the numbers on the Innes chart), I realized that I would need to better understand the EXPLORE test before I could make a reasoned analysis. As I browsed the internet for information about the test, it occurred to me in a flash of inspiration that cars can’t fly.

Once upon a time, in order to add movie excitement via a dramatic car crash and explosion, a movie director laced a car with ordnance and sent it speeding off a high cliff. The camera recorded the entire event from the moment the car began to move until it crashed at the foot of the cliff. The scene was hailed critically as a cinematic milestone. Sometime later, a “researcher” came across a still-frame photo of the car as it sailed through the air. For some unexplained reason, he put the photo of the airborne car next to a photo of a flying airplane. The “researcher” concluded from a point-by-point comparison of the facts he chose to observe in the side-by-side pictures that cars can fly. Automakers, of course, countered that they did not build cars to fly; they strongly recommended that drivers always conduct themselves in a safe and sane manner so their wheels stay on solid ground. Nonetheless, there were those who were taken by the “researcher’s” conclusion, drove their cars off cliffs, and died. In their ignorance, the “researcher” had convinced them that “Pictures don’t lie; cars can fly.”

While the numbers on the Innes chart are correct, the use and interpretation of those numbers are not. The analysis fails to conform to multiple principles (published in 2002 by the National Assessment Governing Board) for using NAEP to confirm state test results. Indeed, the chart illustrates a point-by-point analysis (a process rejected by the test publisher), unaccompanied by appropriate cautions and explanations regarding the interpretation of achievement level data and the differences between NAEP and EXPLORE.

Consider this text from the 2002 NAGB document:

“‘Informed judgment’ and a ‘reasonable person’ standard should be applied in using National Assessment data as confirmatory evidence for state results. Confirmation should not be conducted on a ‘point by point’ basis or construed as a strict ‘validation’ of the state’s test results.

“Limitations in using NAEP to confirm the general trend of state test results should be acknowledged explicitly. Potential differences between NAEP and state testing programs include: content coverage in the subjects, definitions of subgroups, changes in the demography within a state over time, sampling procedures, standard-setting approaches, reporting metrics, student motivation in taking the state test versus taking NAEP, mix of item formats, test difficulty, etc. […] The greater the differences between the respective state tests and NAEP, the greater the complexity in using NAEP as confirmatory evidence for state test results and the greater the cautions in interpretation that should accompany the weighing of the confirmatory evidence.

“[It] also may be confusing to others, who would wonder how ‘basic’ on one test can be equivalent to ‘proficient’ on another. Individuals likely to be involved in reviewing state achievement data and NAEP confirmatory evidence undoubtedly will have the experience and knowledge to handle such dissonance. However, reports to the general public should be cautious about displaying such direct comparisons and should do so only if accompanied by clear explanations.”

My personal interest here is not the achievement status of Kentucky’s students. That is an issue for the citizens of Kentucky. My interest is limited to the appropriate use of NAEP data to confirm state test results. What we see above fails the “informed judgment” and “reasonable person” tests. Nonetheless, be taken in if you will.