Julia Gillard and other Australian education ministers would be well advised to have a close look at a report published last month by a leading group of academics in England.
The report recommends that school test results should be published with a “health” warning. It says this is necessary because raw test scores measure only part of what schools do and are influenced by factors beyond schools’ control. It warns that publishing school test results may damage the very outcomes it is designed to improve. The report also raises questions about the validity of like-school comparisons.
The report says that school performance tables clearly invite judgments on the effectiveness of particular schools, and that policy-makers need to warn the public about how well the data support these judgments.
Given the weight now being placed on school-by-school information, and the inferences being made on the basis of test and examination data, policy-makers have a responsibility to the public to be clear about what the results can tell them with confidence, and what they may suggest with less certainty. Those compiling and presenting information on each institution should provide a disclaimer, stating what the data are designed to measure, and, most importantly, what they cannot. Conventional school examination data, then, should be published with this “health warning”: that they measure only part of a school’s business and that these results are influenced by factors beyond the school’s control, such as the prior attainment and background characteristics of its pupil population.
The report says that the system is not transparent about what can legitimately be inferred from school results and what cannot. It notes that the public routinely draws inferences from school test results that go well beyond the inferences the tests are actually designed to support. For example, in England, parents use school performance tables to determine which of a few primary schools in their area is likely to educate their child best. However, this is a very long way removed from what the national tests were designed to do, which was to compare the progress of individual students.
The report notes that the stakes placed on school test results are becoming higher in England. For example, in 2008, the government placed a closure threat over any secondary school that failed to ensure that at least 30 per cent of its pupils achieved five good grades in Year 11 assessments within three years. Similar warnings have also been delivered to primary schools below a performance threshold.
It says that if test results are to be used in “high stakes” situations, important questions must be asked. If, for example, the fate of a school may hang on a single set of test results, are those data reliable enough to serve as a measure of the overall quality of that institution? Are they valid in measuring all that is felt to be important in education? Do they ultimately provide information that will help pupils improve? Finally, does the act of placing weight on the results of these tests affect the teaching that takes place in preparation for these assessments and, if there are negative impacts, are we prepared to tolerate them?
The report says that special care should be exercised when using unadjusted or “raw” pupil data to make judgments about the quality of particular local education authorities, schools, or teachers.
“[Student] performance is affected by some factors that the school can control, such as the quality of teaching; and some, which usually it cannot, such as pupils’ prior attainment levels and their home backgrounds. A school with results below this benchmark might be educating better, given the factors it can control, than one with results above it. More sophisticated measures, which seek to isolate the responsibility of the school for its pupils’ examination performances, are needed to support such inferences.”
The report notes that information is made available in England and Scotland to allow comparisons of groups of schools with similar characteristics. It says this approach may appear more just than using raw school results. However, it says that like-school comparisons also contain inbuilt assumptions which need to be questioned. For example, the report questions whether similarities across a limited number of measures are sufficient to suggest that the schools being compared are actually similar. It also notes that detailed school comparisons may not be very useful because within-school differences are much more significant than between-school differences.
The report warns that educators, parents and the general public should be alert to the intended and unintended consequences of using test results to judge the effectiveness of schools. The accountability system may come to damage the very outcomes that it was designed to improve.
It says that accountability pressures have led schools to narrow the range of their aims, to teach to the test, or to choose subjects for which qualifications appear easier to secure. It says that there is now a great volume of material cataloguing the educational side-effects of too much focus on performance indicators.
These include the repetition involved in months of focusing on what is tested and on test practice, which also serves to narrow the curriculum. They also include the often excessive and inequitable focus of many schools on pupils whose results may determine whether schools hit particular achievement targets.
The report notes that Scotland adopts a different perspective on the use of test data to judge schools. For instance, it has maintained a survey approach to national monitoring, arguing that it offers the potential to provide information on pupil achievement on more of what matters across the curriculum. In addition, teachers are not tempted to narrow the curriculum or teach to the test because individual teachers and schools are not identified in reporting the results.
The decision in Scotland to stop collecting national test results for all students in all schools was taken because policy-makers recognised that it was encouraging teachers to teach to the tests. They acknowledged that, although it appeared that results were improving, it was more likely to be that teachers were getting better at rehearsing children for the tests.
Australian governments should learn from this before it is too late.