Specificity could clear the fog away from educational testing

Educational testing sails on troubled seas these days. Tests of ability and capability are everywhere and seemingly inescapable, but as the volume of testing increases, so does the volume of protest against it.

Some of the protests are merely self-serving. Claiming that a test was unfair is the easiest way for a student or a parent to explain a poor score or failing grade.

It is at the aggregate, political level, though, that protests of unfairness present the most significant problems for educational achievement tests. While these protests are often effective in getting a test discounted or discarded, they are just as often misdirected.

The protests of bias against the entrance examinations used by New York City’s very best high schools are an excellent example of off-target protests. It is undeniable that fewer black and Hispanic students from the city’s elementary schools score high enough on the test to be admitted to the elite high schools. But the problem isn’t with the test; it is with the quality of the education those students are receiving.

Significantly, eliminating the test would not solve the problem. If New York City does not improve the quality of its minority schools, abandoning the test would just spread the educational decay to the elite schools.

That said, there is undoubtedly a problem with the test, too, but in a very real sense it is biased against everybody, not just minority groups. Educational tests do not measure achievement on an absolute scale. Instead they measure how one student’s achievements compare with those of other students. And because the “Bell Curve” is used to validate new tests, we lose our ability to tell whether students are improving or losing ground. In fact, we lose our ability to judge whether we are testing the right things.

Education author and lecturer Alfie Kohn recently wrote an essay entitled “Why Can’t Everyone Get As?” Its subtitle explains his view that “Excellence is not a zero-sum game.”

Kohn enjoys a reputation as an iconoclast because he poses discomforting questions about things that most of us simply accept as unquestionably correct.

The question he poses about grades is very pertinent to today’s educational and regulatory institutions, which often find themselves engaged in battles over testing and grades. Frequently these battles are fought on political and social grounds that have little to do with the educational process but a lot to do with educational outcomes.

The short answer to his question is that at the classroom level there is no good reason why everyone cannot get As, if they earned them. For that matter, they could all get Fs. Yet the teacher who filed those grades for a class in any subject, at virtually any level, would promptly find himself or herself called to account by the school administration.

Independent of outside complaints, the validity of grading on a curve rests on a questionable application of Bernoulli’s Law of Large Numbers. In short, the existence of a bell-shaped curve of performance at the large, aggregate level of grades doesn’t mean that the grades for each class have to take that same shape. But education administrators, from school principals to college deans and government regulators, have come to view any grade reports that aren’t bell-curve shaped as wrong, and the prime suspect, usually, is the test standard.

What it does mean, though, as Mr. Kohn writes, is that “The inescapable, and deeply disturbing, implication is that ‘high standards’ really means ‘standards that all students will never be able to meet’.”

We don’t have to agree with his conclusion to recognize that it is technically correct and that he has described a real problem.

The issue will not be resolved until we address the question of what it is we wish to test and why. Educational testing has a recurrent problem with fogginess. Within living memory, for example, mathematics through high school was a matter of solving problems that had real-world applicability.

Today’s tests, though, are dotted with questions on concepts, which, as test items, have yet to prove themselves useful in skill development and are likely to give many K-12 students simply another way to fail, and parents another reason to feel frustrated.

The educational testing system and our K-12 issues are closely entangled. We might try testing more specific skills, much along the lines of the tests the Federal Aviation Administration uses to check essential pilot skills and the similarly structured tests used for doctors and other health care providers. Of course, that would force us to define what we want and expect of high school graduates, leaving many of the concepts to colleges, where they are a more comfortable fit. It couldn’t hurt.