New OECD Study Reveals Many Problems in Using Test Results for School and Teacher Accountability

A study just published by the Organisation for Economic Co-operation and Development (OECD) has some lessons for Australia’s school reporting system. It says that while accountability based on student test results can change teacher and school behaviour, it often involves unintended strategic behaviour such as teaching to the test and manipulation of results.

The new report provides an overview of how student test results are reported in OECD countries and reviews the literature relating to using student test results for accountability and improvement purposes.

The report says that only a minority of OECD countries publish test results for individual schools. These are Australia, Canada, France, Iceland, Mexico, the Netherlands, Portugal, Sweden, England and the United States.

Several countries mandate that national tests cannot be used to rank schools. These are Austria, Belgium (the French Community), Denmark, France and Ireland. Following recent public debate in Finland, the national consensus was against publicising school test results.

Publishing student test results in performance tables
The report concludes that the available evidence regarding the effect of publishing student test results in school performance tables is mixed. It says there is little evidence of a positive relationship between performance tables and increased student performance. There is, however, evidence of performance tables influencing the behaviour of schools, teachers and parents – although not always as originally intended by the authorities.

It also notes that several international studies conclude that parents in general pay little attention to public performance indicators. For example, following the introduction of the No Child Left Behind Act (NCLB) in the United States, only a tiny percentage (1-5%) of students left a “failing” school for a supposedly better school – as measured by student test results.

The report also found that there is wide consensus in the literature that reporting student test results in performance tables has several methodological problems and challenges. A common objection is that performance tables based solely on “raw” student test results essentially measure the quality of the school intake rather than the teaching in the school. Some scholars argue that value-added reporting is a better approach, but it too has methodological problems.

Studies of sampling variation show that the amount of variation due to the idiosyncrasies of the particular sample of students being assessed is often large relative to the total amount of variation in student performance observed. This imprecision can be acknowledged by publishing student test results with a margin of error. But this margin could be larger than the true differences in average scores that would distinguish effective schools from ineffective ones. Furthermore, variation in student cohorts from year to year necessitates a further margin of error in reporting assessment scores.
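The scale of this sampling noise can be illustrated with a small simulation. The numbers below are invented for illustration and are not taken from the report: two schools whose “true” mean scores differ by a couple of points, each tested on a single annual cohort of 30 students, can swap rank order in a substantial share of years purely through sampling luck.

```python
import random

random.seed(0)

# Hypothetical parameters (not from the report): each school's students
# score around the school's true mean with a standard deviation of 15.
def cohort_mean(true_mean, n=30, sd=15):
    """Average score of one randomly drawn cohort of n students."""
    return sum(random.gauss(true_mean, sd) for _ in range(n)) / n

# Simulate 1000 annual cohorts for two schools whose true means differ
# by 2 points, and count how often the weaker school out-scores the
# stronger one in a given year.
trials = 1000
reversals = sum(cohort_mean(500) > cohort_mean(502) for _ in range(trials))
print(f"Weaker school ranked higher in {reversals / trials:.0%} of cohorts")
```

Under these assumptions the ranking reverses in roughly a third of cohorts, even though one school is genuinely better – which is the report’s point about year-to-year margins of error.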

A further limitation of performance tables is that differences between schools are generally not statistically significant and therefore are not suitable for ranking the majority of schools.

Another problem is that different assessments produce different results. For example, state and national tests in the United States yield different results. So, school rankings will differ according to the assessment used.

Using student test results to reward and penalise schools
The report also says that the evidence relating to the effect of using student results to reward and/or penalise schools is mixed. On the one hand, rewards and sanctions for schools can have a positive effect on student performance. On the other hand, rewards and sanctions for schools seem to create unintended strategic behaviour among teachers and school leaders.

It notes several unintended consequences of publishing performance data in the public sector. These include emphasis on what is being quantified at the expense of unquantified performance aspects; pursuit of narrow objectives at the expense of broader objectives; and manipulation of data so that reported behaviour differs from actual behaviour.

In relation to school accountability, “teaching to the test” is probably the most well-known example of strategic behaviour. The report says that there is substantial evidence of teachers and school leaders responding to student assessment schemes through teaching students the specific skills that are assessed, narrowing the curriculum and allocating more resources to subjects that are tested. There is also evidence of manipulation of school results and outright cheating.

Using student test results to evaluate teachers
The report says that the evidence shows that student test results – in combination with other measures – may serve as a basis for distinguishing between high and low performing teachers, but they are inadequate as a basis for high-stakes decisions such as teacher pay and promotion.

The problem in using student test results to determine teacher quality is that education is jointly produced by teachers, schools, families, and communities. Thus, it is hard to single out the effect of a single teacher on the outcome of a single student.

The problem may be overcome by using value-added assessment, which is essentially a method for isolating the contribution of individual teachers to growth in student achievement by controlling for other potential influences on student learning, such as prior student achievement and student and family characteristics.
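The basic idea can be sketched in a few lines. This is an illustrative toy version only – the data, the teachers “A” and “B”, and the single prior-score control are all invented, and real value-added models are far more elaborate: predict each student’s current score from prior achievement, then credit each teacher with the average residual of their students.

```python
from collections import defaultdict

# Hypothetical student records: (teacher, prior_score, current_score).
students = [
    ("A", 60, 68), ("A", 70, 77), ("A", 80, 86),
    ("B", 55, 58), ("B", 65, 66), ("B", 75, 76),
]

# Fit a simple least-squares line: current = a + b * prior.
n = len(students)
xs = [s[1] for s in students]
ys = [s[2] for s in students]
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx

# A teacher's "value added" is the mean residual over their students:
# how much better (or worse) their students did than the line predicts.
residuals = defaultdict(list)
for teacher, prior, current in students:
    residuals[teacher].append(current - (a + b * prior))
value_added = {t: sum(r) / len(r) for t, r in residuals.items()}
print(value_added)  # teacher A comes out positive, teacher B negative
```

Here both teachers’ students improve, but after controlling for prior achievement only teacher A is credited with above-expected growth – which is what separates value-added measures from raw score comparisons.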

In theory, value-added assessment offers an opportunity to hold teachers accountable for student results that they can influence. However, the report says, in reality value-added assessment relies on several problematic assumptions and has methodological limitations. One limitation is that most value-added models assume students are randomly assigned to teachers, which is rarely the case. Moreover, a measured increase in student outcomes could be due to the hard work of a prior teacher rather than the current year’s teacher.

The report says that using student test results as a basis for decisions relating to teacher pay is very controversial. The theory behind performance incentive pay is that it will motivate teachers to adapt their professional practice to address performance criteria. However, the evidence on the overall impact of such schemes is mixed, and the schemes themselves can be contentious and potentially divisive. It cites another OECD report which found several international examples of performance incentive pay schemes leading to unintended strategic behaviour among teachers.

The report concludes that there is wide consensus in the literature that student test results should not be used as the sole measure of teacher performance. This holds especially true when student test results are used to make high-stakes decisions such as teacher pay and promotion. It says that a valid and reliable scheme for assessing individual teacher performance requires multiple, independent sources of evidence and multiple, independent trained assessors of that evidence.

Getting assessment right
The report concludes that all accountability systems essentially produce both positive and negative effects, and the challenge is to enhance the positive effects while limiting the negative ones. It suggests that it is important to find a workable balance between accountability and improvement – a balance where the two mutually support and reinforce each other.

Regardless of how an assessment system is organised, the report emphasises that student test results must be reliable, valid and fair. No single assessment can be a perfect indicator of student performance, so several assessments should be used to measure student outcomes – especially when the stakes attached to the results are high for teachers and schools.

Further, teachers and schools should only be accountable for factors within their control. This increases the need for sophisticated assessments, as well as high quality gathering, analysis and use of the test results.

Rosenkvist, M. A. (2010). Using Student Test Results for Accountability and Improvement: A Literature Review. OECD Education Working Papers, No. 54. OECD, Paris.
