SAT Recentering 'Defines Down' Success
When SAT scores were sent home last week, one student in Laguna Hills, California, had reason to be happy -- he got a 1600. A student in Midwood, New York, also got a 1600 on the test, as did 73 other students across the country, triple last year's figure. The reason? The Educational Testing Service, which administers the test, "recentered" scores in order to boost the mean on each section to 500.
The mean for the SAT was last set in 1941, when the average on both the math and the verbal sections was 500. By last year, the average verbal score had dropped to 420 and the average math score to 478. ETS maintains that inflating scores was necessary because of a more diverse test-taking pool and that the new mean will simplify interpretation of the test. In practice, the scoring change will erase the disproportionate drop in verbal skills, make temporal comparisons next to impossible and reduce the test's ability to distinguish among high-scorers.
One argument in favor of the adjustment is that students in 1941 could tell their relative proficiency in each subject simply by comparing the two scores. A student scoring 550 verbal and 570 math would be above the median in both categories but would be relatively stronger in math. Despite the presence of percentile reports for each section, ETS contends that this comparison is now too difficult because the average student today scores roughly 60 points higher on math than on verbal.
Rather than making the contrast impossible, the aggregate disparity between the sections makes an important point about American students' abilities: Compared with students of 50 years ago, the math skills of today's teenagers are only mildly deficient, whereas their language skills are abominable. Hence, the old scores sent the right message. A student achieving a 480 math and a 420 verbal should not assume that he is well-balanced because he reached the median in both. He, along with most of his peers, should notice the disparity and recognize the disproportionate need for improvement in the verbal arena. Recentering artificially erases this gap.
The second problem is that recentering will complicate comparisons between generations, especially if the ETS makes a habit of changing the scores whenever they slip off the ideal average. Though the immediate function of the SAT is to compare students to their contemporaries, SAT scores are one of the few instruments available to measure the progress or decline of academic achievement over the decades. Simply observing that standards needed to be lowered for the current generation should tell us something, but even with conversion charts, easy comparisons are confounded.
Finally, the recentering reduces the number of scores available to distinguish among above-average students. To understand this effect, imagine a test with only one possible outcome. Regardless of the raw total, the reported score is the same for everyone. Hence, it provides no information. A test with two outcomes would be somewhat better, splitting its subjects into two groups. As more outcomes are added, the test becomes more useful.
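The thought experiment can be put in numbers. The sketch below is only an illustration, not anything ETS publishes: it uses the standard information-theory upper bound that a test with k distinct reported scores conveys at most log2(k) bits about a test-taker, a bound reached only under the assumption that every score is equally likely. The function name is hypothetical.

```python
import math


def max_information_bits(num_outcomes: int) -> float:
    """Upper bound, in bits, on what a test with this many distinct
    reported scores can reveal about one test-taker.

    Assumption: the bound is reached only if all scores are equally
    likely; a real score distribution conveys less.
    """
    if num_outcomes < 1:
        raise ValueError("a test needs at least one possible outcome")
    return math.log2(num_outcomes)


# One outcome tells us nothing; two outcomes split the pool in half.
print(max_information_bits(1))  # 0.0
print(max_information_bits(2))  # 1.0
```

Each doubling of the number of usable scores adds one more bit, which is why shrinking the set of scores available to a group reduces what the test can say about that group.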
With a higher median, the SAT has fewer scores available to distinguish among the top 50 percent. In fact, it is now possible to reach 1600 while missing up to four questions. Thus, students with perfect performances receive scores identical to those of students who make several errors. This effect is repeated all along the top half of the scale. Though the difference between 1600 and 1590 may seem small, the purpose of the SAT is to point out minor differences among the 2 million students taking the test.
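The arithmetic behind this claim can be sketched as follows. The example assumes that each section is reported on a 200-to-800 scale in 10-point steps, a detail not stated above, and uses the 420 and 478 averages cited earlier. The function name is hypothetical.

```python
def score_points_above(average: int, top: int = 800, step: int = 10) -> int:
    """Count the distinct reported scores strictly above `average`.

    Assumption: section scores run up to `top` in increments of `step`
    (200-800 by tens for an SAT section).
    """
    first_above = (average // step + 1) * step  # lowest reportable score above the average
    if first_above > top:
        return 0
    return (top - first_above) // step + 1


old_verbal = score_points_above(420)  # verbal average before recentering
old_math = score_points_above(478)    # math average before recentering
recentered = score_points_above(500)  # either section after recentering

print(old_verbal, old_math, recentered)  # 38 33 30
```

Under these assumptions, the verbal section loses eight of the 38 score points it previously had for separating above-average students, and the math section loses three of 33.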
Of course, the result is reversed at the bottom. Lower-achieving students will now get a more accurate picture of their performance. However, the lower half of SAT-takers is less likely to apply to or be accepted by colleges or to take a job where verbal and math skills are crucial. Consequently, making very fine distinctions among those students is unimportant.
Discerning among the thousands of well-qualified applicants to top-level colleges, on the other hand, is a difficult and important task. Since, for better or worse, the SAT plays such a crucial role in determining a student's future, it should be calibrated to perform especially well among those who, it predicts, are the nation's future scholars, engineers and leaders. If fewer students today can meet yesterday's standards, the drop in scores reflects real change, not a statistical anomaly. The top 600 points should continue to be reserved for students who perform well against absolute standards.
Above all, the shift is emblematic of the move to define down success. It may allow students who do not understand percentiles to interpret their scores more easily, but it does so at the cost of functionality. By forcing scores into a statistical target, ETS erases an academic decline that should be confronted, not accommodated.