League Tables Damned by Major UK Report

Friday October 23, 2009

Just as Australia is introducing reporting of school test results and the inevitable league tables that will follow, a major review of the primary curriculum in England has issued damning conclusions on the impact of standardized tests and league tables.

The Cambridge Primary Review, released last week, says that the testing and reporting of school results in English and maths has distorted children’s learning and eroded their entitlement to a broad education. It says that 10 and 11-year-olds spend around half their time in the classroom studying English and maths and that this has “squeezed out” other subjects from the curriculum.

The Review recommends that the English and maths tests be replaced and that league tables that report school performance on these tests be axed as well.

The Review’s 608-page final report is the most comprehensive review of primary education in England in 40 years. It is based on 4,000 published reports and 1,000 submissions from around the world. It makes 78 recommendations for reforming the English system of primary education.

The Review says that the current focus on passing exams and hitting targets at a young age is “even narrower than that of the Victorian elementary schools”. It claims that the existing system has caused significant “collateral damage” as children are drilled to pass exams, marginalising other subjects such as history, geography, art and science, which have been “squeezed out” of the curriculum. The study said:

The prospect of testing, especially high-stakes testing undertaken in the public arena, forces teachers, pupils and parents to concentrate their attention on those areas of learning to be tested, too often to the exclusion of other activities of considerable educational importance.

As children move through the primary phase, their statutory entitlement to a broad and balanced education is increasingly but needlessly compromised by a ‘standards’ agenda which combines high-stakes testing and the national strategies’ exclusive focus on literacy and numeracy.

The head of the Review, Professor Robin Alexander, wrote in the Daily Telegraph that primary education should amount to much more than basic literacy and numeracy, supremely important though these are. He said claims that tests in those areas can serve as a proxy for the rest of a child’s education are both wrong and misleading to parents.

The report proposes that the tests be replaced by a system of less formal teacher assessment throughout primary school which could be externally moderated. A random sample of children could then be tested at age 11 to gauge national performance in all subjects.

Information on the Cambridge Primary Review is available on the Review’s website.


School Results Fail to Measure Up

Sunday October 11, 2009

A testing expert has made some devastating criticisms of the reliability of school test results to be published later this year or early next year.

Professor Margaret Wu from the University of Melbourne says that linking school performance to student achievement on these tests is “pure conjecture”.

In a keynote paper delivered in Hong Kong in July, Professor Wu said that the NAPLAN tests have a high level of inaccuracy. She said that there are large measurement errors at the individual student and class levels.

She said that these errors mean that high-stakes decisions, such as judging school and teacher performance on student scores, should not be made on the basis of these tests.

Professor Wu also said that the tests are not suitable for measuring achievement growth between two points in time for individual students or classes. She also made some technical criticisms which call into question the validity of the tests and the method used to equate the scores of students across different year levels on the same scoring scale.

The extent of the errors is quite large, even for individual students, and they remain substantial at the class and school levels. Professor Wu found that measurement errors in annual 40-item tests, such as those being used in NAPLAN, would lead to about 16 per cent of students appearing to go backward when they had actually made a year’s progress. She said this is a conservative estimate as it does not take account of other sources of error such as the assumption that two tests are assessing the same content. The errors could well be larger.

While the measurement errors are smaller for classes and schools, they are still quite large. For example, Professor Wu found that the statistical uncertainty around the average results on these tests for classes of 30 students is equivalent to more than six months’ learning. Many schools around Australia have only this many students or fewer participating in the NAPLAN tests. For schools with two classes of 30 students tested, the error could amount to about four months of learning.
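To get a feel for the scale of these effects, here is a minimal simulation sketch in Python. The standard error of measurement, the size of a year’s growth and the spread of student ability are illustrative assumptions chosen to roughly match the figures cited above; they are not taken from Professor Wu’s paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_students = 100_000
sem = 1.0            # assumed standard error of measurement for one 40-item test
yearly_growth = 1.4  # assumed true gain over one year, in the same score units
ability_sd = 3.0     # assumed spread of true ability across students

# Every student genuinely improves by a full year's worth, but each observed
# score carries its own independent measurement error.
true_start = rng.normal(0.0, ability_sd, n_students)
observed_start = true_start + rng.normal(0.0, sem, n_students)
observed_end = true_start + yearly_growth + rng.normal(0.0, sem, n_students)

appear_backward = np.mean(observed_end < observed_start)
print(f"Students appearing to go backward: {appear_backward:.1%}")  # roughly 16% here

# The error on a class-average gain shrinks with the square root of the class
# size, but it does not disappear.
for n in (30, 60):
    se_mean_gain = sem * np.sqrt(2) / np.sqrt(n)
    print(f"Standard error of the average gain for {n} students: {se_mean_gain:.2f} units")
```

With these illustrative numbers, about one in six students appears to go backward even though every one of them has made a full year’s progress, and the uncertainty in a class or small-school average remains a sizeable fraction of a year’s growth.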

These results relate only to measurement error in the tests. There are also other sources of error, most notably sampling and equating errors, which add to the uncertainty and inaccuracy of the results.

Measurement error reflects inconsistency in test results: the same students may achieve different results on the same test on different days because of differences in their own well-being, such as lack of sleep or food, or because of variations in external factors such as how cold or hot conditions are in the room in which the tests are conducted. It also arises from differences in the items selected for testing and the way answers are scored.

Sampling error arises from differences in the selection of students to participate in tests. A group of students selected for a test are likely to achieve different results from another group simply because of differences in their composition. The group selected for testing may not reflect the average level of ability of all students. The smaller the sample, the more likely there will be a significant difference between the average results of the sample tested and the results if all students were tested.

Sampling error occurs even when all students in a year cohort are tested. This is because inferences are made about school performance by testing selected cohorts, such as Years 3, 5, 7 and 9 in the national literacy and numeracy assessments. Each cohort of students tested is a sample of the students in the school for the purpose of measuring school performance.

Equating errors arise in comparing tests over time and in creating a common scale of scores for students across different Year levels. For example, building a common score scale across several year levels involves sophisticated statistical methodology to ensure that the results are reliable and valid. Different methodologies produce different results.

Professor Wu says that equating error is a major source of inaccuracy. This is because test items often work differently for different groups of students across states, there are curriculum differences across states and some content areas are not fully covered.

Professor Wu has followed up her criticisms in a recent letter to The Age, saying that if student performance is not measured well by NAPLAN then the results cannot be used to assess school and teacher performance. She said that it could mean that schools and teachers are accused of not doing their job when in fact they are.

Professor Wu says that the criticisms also apply to so-called like school comparisons. The large error margins make these comparisons practically meaningless.

When schools are grouped into ‘like’ groups, we need even more precision in the measures to detect differences between schools. It will be easy to demonstrate the difference between a high-profile private school and a low socio-economic government school, but it will be more difficult to determine significant differences between two high-profile private schools.

These are devastating criticisms. Julia Gillard has given assurances that the new national school performance reporting system will give accurate data on individual school performance. However, it appears that the national tests are not up to the mark.

The large statistical errors will wreak havoc when comparing school results.

It will not be possible to make reliable comparisons or rankings of schools because they may reflect chance differences in school performance rather than real differences. Such comparisons will mostly identify lucky and unlucky schools, not good and bad schools. It also means that current school performance is highly misleading as a guide to future school performance.

These statistical errors in school results also mean that school performance and school rankings are highly unstable from year-to-year. It is highly misleading to compare changes in school performance from one year to the next, especially in the case of smaller schools. It leads to unwarranted conclusions about changes and often unfairness in the inferences drawn about schools.

Professor Wu’s criticisms show that Julia Gillard’s faith in the ability of NAPLAN to identify successful schools is misplaced. Rather than accurately measuring school performance as Gillard asserts, the new school performance reporting system is likely to mislead parents and policy makers.

Parents may be misled in choosing a school. Some schools may be recognised as outstanding while others are identified as unsuccessful simply as the result of chance and not because of actual programs and teaching practice.

The large error margins may also mislead policy makers because it will be difficult to identify effective school practices. It may mislead decision-makers and schools in recommending and adopting particular educational programs. Action taken to assist less successful schools may appear more effective than it is in practice.

Trevor Cobbold


New York City’s Bogus School Results

Thursday September 10, 2009

Diane Ravitch, Professor of Education at New York University and former US Assistant Secretary of Education, says that the latest school results in New York City are bogus.

Writing in the New York Daily News, Ravitch says the City’s school reporting system, so admired by Federal Education Minister Julia Gillard, makes a mockery of accountability. When nearly every school gets an A or B there is no accountability.

Ravitch attributes the massive increase in schools being graded A or B to a collapse in standards in the New York State tests in recent years.

Earlier this year, the Daily News [7 June] revealed that test questions have been getting easier. It reported an investigation by Columbia University’s Jennifer Jennings which found that the state asks nearly identical questions year after year. For example, at least 14 of the 30 multiple choice questions on the seventh-grade exam in 2009 had appeared in similar form in previous years. Only 55% of the specific math skills the state requires seventh-graders to learn were ever tested in the four years the exam has been given.

This predictability in test questions allows for intensive preparation of students which has corrupted the results. Test experts said that students are essentially tipped off as to which specific items are going to be on the test and this undermines the validity of the test.

With teachers administering daily practice tests containing questions very nearly the same as those that would appear on the state tests, it became easier for students to become “proficient.”

As a result, test scores are increasing massively. The number of students at the lowest level – those who are at risk of being held back in their grade – has dropped dramatically. In sixth-grade reading, 10.1% (7,019) were at Level 1 in 2006, but only 0.2% (146) were by 2009. In fifth-grade reading, the proportion of Level 1 students fell from 8.9% in 2006 (6,120) to 1.0% (654) in 2009. In seventh-grade math, the proportion of Level 1 students plummeted from 18.8% (14,231 students) in 2006 to 2.1% (1,457) in 2009.

In almost every grade, the state has lowered the bar, making it easier for students to get a higher score. In 2006, students had to earn around 60% of the points on the state math tests to reach Level 3, which the state defines as proficiency, but by 2009, they needed to earn only 50%.

Ravitch says that New York City’s school report cards should be revised or scrapped. “We are reaching a perilous stage where the test scores go up while real education – the kind that is available in the best schools – disappears.”


Gillard Renews Threat of Sanctions Against Lowly Ranked Schools

Saturday August 22, 2009

Julia Gillard has yet again raised the spectre of using school results to punish low performing schools. She said on the SBS Insight program that principals deserve to be sacked if they repeatedly fail to lift their school’s performance.

The Minister failed to produce any evidence that punishing schools succeeds in lifting their results. Her problem is that she has none to produce. There is no substantial evidence that applying sanctions against schools succeeds in improving school results. But having no evidence to sustain her case does not seem to faze the Deputy Prime Minister. It has become a feature of her administration of education.

Gillard has threatened sanctions against schools with low student achievement previously, as has the Prime Minister. On her recent trip to the United States, she told a roundtable discussion on education reform at the Brookings Institution that schools which persistently fail “might eventually be closed”. In his address to the National Press Club on education last year, the Prime Minister threatened:

…where despite best efforts, these schools are not lifting their performance, the Commonwealth expects education authorities to take serious action – such as replacing the school principal, replacing senior staff, reorganising the school or even merging that school with more effective schools.

In all likelihood, these threats will apply only to government schools. Although the Federal Government has no constitutional power to sack staff or close schools, it will presumably implement its sanctions by holding state and territory governments to ransom over funding grants. However, there is no chance it will threaten any private school with forced closure or require any private school to sack its principal or staff.

Gillard and Rudd are taking their cues from England and New York City where blaming teachers and principals has become established procedure. It has allowed politicians and education officials to dodge their own responsibility for the quality of educational services serving highly disadvantaged communities.

In June 2008, the UK Schools Secretary threatened to close any English secondary school that failed to ensure that at least 30 per cent of its pupils achieved five good General Certificates of Secondary Education, including English and maths, within three years. This put some 638 secondary schools, or 20 per cent of all secondary schools in England, under threat of closure. In October, the government also threatened primary schools whose results were below a performance threshold with closure.

In the large majority of cases, the schools targeted for closures or other sanctions are schools serving highly disadvantaged communities. Of the 638 schools threatened with closure, 542 have a student intake that has an above average proportion of students who qualify for free school meals, an indicator of disadvantage used in England.

Julia Gillard’s hero, New York City’s Schools Chancellor Joel Klein, has also been sacking principals and staff and closing schools for several years because of persistently low performance on New York’s school grading system. The New York Times reported that 14 schools were marked for closure this year because they were deemed to be ‘failing’ schools. Since Klein took over the city education system, 92 low performing schools have been closed. Many have been turned over to charter schools.

The evidence is that none of this works. For example, a review of the use of sanctions and rewards across a wide range of programs, including education, published by the UK National Audit Office last September, found “no quantified evidence of the effect of sanctions and rewards on levels of performance for the programmes in the survey”. The sanctions covered in the review included closing schools and the harm to reputation from a low ranking on league tables.

A study of the impact of sanctions against low performing schools recently published by the American Educational Research Association refers to their “predictable failure”. It found a lack of evidence that the sanctions have been successful as an effective and universal treatment for raising achievement levels at low performing schools, and concluded that the sanctions applied under the No Child Left Behind legislation are more likely to result in “unproductive turbulence than in sustained school improvement”.

A report published last April by the Education and Public Interest Center at the University of Colorado concluded that there is little to no evidence that sanctions against low performing schools increased student achievement. It recommended that policy makers refrain from adopting restructuring sanctions such as sacking principals and staff or closing schools and allowing them to be taken over by private operators or charter schools. It said that these sanctions have produced negative by-products without yielding systemic positive effects on student achievement.

Joel Klein’s sanctions against New York City schools have not worked either. National tests show that average student achievement in New York City schools has stagnated since Klein took over and there has been no reduction in achievement gaps.

Using school results to sanction low achieving schools and staff is likely to be highly arbitrary and unfair. A report published by a group of English education academics last week said that using test results to judge school performance can be very misleading. It cited extensive evidence in the UK that a large proportion of students have been marked incorrectly in tests in the past.

The report questioned making the fate of schools hang on a single set of test results, saying that raw test scores only measured part of what a school does and were influenced by factors beyond the control of schools. Sacking principals and school staff on the basis of these results is similarly unfair and arbitrary.

The use of league table results to target schools for sanctions is also often contradicted by other assessments of performance. For example, an analysis of reports of the UK Office for Standards in Education (Ofsted) showed that a quarter of the English secondary schools threatened with closure were graded “good” by Ofsted school inspectors, and 16 were judged to be “outstanding”. About a third of them were in the top 40 per cent on the government’s “value-added” league tables. Only one in ten needed special intervention according to the Ofsted inspectors.

The use of unreliable test data to apply sanctions against schools and teachers also encourages school responses which further corrupt the results. These include poaching high achieving students from other schools; denying entry to, or expelling, low achieving students; suspending low achieving students on test days; increasing use of special dispensations for tests; encouraging students to take courses whose results are not used to compare schools; and cheating.

The underlying assumption behind Gillard’s threat is that if schools are failing to deliver quality education it is the fault of the school’s leadership and teachers and that they should be replaced. It assumes that a ‘culture of success’ in so-called failing schools is just a matter of strong leadership.

A review of the Fresh Start initiative for ‘failing schools’ in England published this month in the British Educational Research Journal calls this assumption into question. It says that it ignores the ongoing impact of severe social inequalities and the context in which schools operate. The study concluded that “…managerial solutions are not sufficient to deal with problems that are both educational and social”.

By raising the spectre of sanctions against low performing schools, Julia Gillard has once again resorted to discredited schemes used in England and the United States. She continues to ignore the reality of the impact of poverty on education. Closing schools in poor communities will only disadvantage them further.

The threat itself is enough to set off a spiral of decline. The curse of failure will encourage parents and teachers to seek transfers to other schools. Few will make such a school their first preference for children starting school. Wholesale sacking of staff in schools serving poor communities will only make it more difficult to attract quality teachers. Few principals will elect to take on a challenging school if they face a higher risk of being sacked and branded a failure on the basis of dodgy statistics.

A different approach is needed as recommended by the review of the Fresh Start program in England:

If we are to improve achievement in inner-city schools, education policy needs to address fundamental matters concerning attainment, such as those related to resources, curricular innovation and pedagogy, and to design measures to raise, in particular, the attainments of pupils who are traditionally disadvantaged. [613]

Similarly, the AERA study of sanctions in the United States concluded:

…after about 15 years of state and federal sanctions-driven accountability that has yielded relatively little, it is time to try a new approach, one that centres on the idea of sharing responsibility among government, the teaching profession and low income parents. The hard cultural work of broader-based movements, nourished by government and civic action, will have to replace legal-administrative enforcement and mandates as the centrepiece of such an equity agenda. [361]

Julia Gillard would do well to heed this advice instead of playing to the grandstand of populist rhetoric and discredited policy.

Trevor Cobbold


‘Hot Housing’ Students to Improve League Table Rankings

Sunday January 18, 2009

The publication of new league tables of secondary school results in England last week brought to light another way of manipulating school results.

Secondary schools are encouraging students to take exams early in order to give them a chance to repeat the exam if their results are not good enough. The practice is called ‘hot housing’ students to get better results.

The Guardian newspaper (15 January) reported that many schools are allowing students to take the same General Certificate of Secondary Education (GCSE) exam up to three times in the space of a year in order to improve their results. Examination boards have reported substantial increases in the number of students who took their GCSE exams early in order to allow time for a re-sit of the exam in the case of failure.

According to the report, some students have taken the same GCSE exam up to three times in the one year in order to improve their results. Many are taking their exams up to six months early in order to allow time for a re-sit. The exam board, Edexcel, said it had seen a 67% rise in the number of students taking a whole GCSE early, while the number re-taking modules to improve their scores had nearly doubled.

The Guardian report stated that this tactic is being encouraged increasingly across England by school principals desperate to improve their school’s ranking in the league tables. Teachers’ leaders blamed Government pressure on principals to improve results and climb the league tables for an increased focus on exams that puts children under stress and detracts from the depth of their learning.

Alan Smithers, Professor of Education at Buckingham University, told the Guardian that: “League tables have got all out of proportion and schools will now do all they can to improve their place. Early entry is one way they are doing it. Other ways include focusing on the pupils on the C-D border. We’re in danger of producing a set of statistics that no longer accurately reflect pupils’ progression but the work the schools can do to improve their scores.”

‘Hot housing’ students for literacy and numeracy tests used to judge school performance is now a feature of most United States school systems. Classrooms are being turned into test preparation factories. Weeks and months can be devoted to test preparation at the expense of other parts of the curriculum and other learning areas. Even recess time has been cut or eliminated in many schools so that more time can be given to test preparation.

Reporting individual school results inevitably leads to league tables and immense pressure on schools and teachers to improve school rankings. The evidence is that schools look for ‘quick fixes’ to improve their ranking. ‘Hot housing’ students by repeated test preparation and re-taking of exams is just one among many ways schools try to manipulate their results.

Other well-established practices in England and the United States include removal of low-achieving students from tests by placing them in special education streams, suspending them or encouraging them to be absent on test days and retaining them in grades not tested. The incidence of teachers helping students during tests or changing answers has been shown to increase following the introduction of reporting school results.

Another way to artificially boost school results is to make greater use of special dispensations to help students during tests. They include having extra time for tests, use of a dictionary, small group or one-on-one testing, oral reading of directions and use of an aide for transcribing responses. They also include special allowances in the case of illness such as alternative assessments and sitting tests at different times from other students.

The New York City school reporting system, so admired by Julia Gillard, has encouraged schools to use special dispensations for students during tests to improve their results. There has been a massive increase in so-called “accommodations” for students taking tests since the new reporting system was introduced.

This practice is already being used in Australia as shown by last year’s revelations in the Sydney Morning Herald about excessive use of dispensations for the HSC exam by some NSW elite private schools. Up to 30% of students at some private schools were given special provisions in the 2008 HSC, compared with an average 7% of government high school students. Masada College claimed special dispensations for 30% of its students and Scots College claimed them for 25% of its students. In 2006, Reddam House received special consideration for 36% of its HSC students.

All this is a harbinger of what can be expected across Australia with the decision of Australian Governments last year to go ahead with reporting of individual school results. Manipulation of school results will become a priority for schools instead of real learning in the classroom.

Trevor Cobbold


Models of Like School Comparisons

Sunday November 9, 2008

The Prime Minister and the Federal Minister for Education, Julia Gillard, have stated that a system of individual school performance reporting will be included in a new national education agreement due to start at the beginning of 2009.

The Minister for Education has said that she doesn’t want “silly” or “simplistic” league tables. She has rejected comparisons of the raw scores of all schools as unfair and misleading. Instead, she supports comparisons of ‘like’ schools as a way to improve school performance.

To date, the Minister has not provided any details or explanation of how like school comparisons will be made and what information will be published. All we know is that she is impressed by the model of school reporting used in New York City. So, clearly, this is one model being considered by the Government.

There are also several models of comparing student achievement in like schools in place at present in Australia. New South Wales, Victoria and Western Australia have been using like school comparisons for some time. Each state uses a different methodology for identifying like schools and there are differences in the comparative data provided to schools. New South Wales and Western Australia allow schools to publish some of the data.

This note is the first in a series about like school comparisons. It provides a description of the systems in place in New York and in Australia. Further notes in the series will analyse various features of these models and discuss the implications of introducing like school comparisons across Australia.

New York City

The New York City Department of Education publishes annual school Progress Reports that provide teachers, principals, and parents with detailed information about students and schools. The first Progress Reports were published in November 2007 for the 2006-07 school year. A citywide Progress Report summarising the results of all schools is also published.

Progress Reports give each school a score on its performance in three domains – School Environment, Student Performance and Student Progress (and on several measures in each category). These scores are combined to give an overall score out of 100 for each school. Schools are also given a grade of A, B, C, D or F based on their domain scores and how they compare with other schools.

The various scores in each domain and the school grades are published in a Citywide Progress Report and individual school reports, all of which are available on the Department of Education website.

The Progress Reports evaluate schools in three areas:

  • School Environment (15% of score), including attendance and the results of Learning Environment Surveys;
  • Student Performance (30% of score), as measured by elementary and middle school students’ scores each year on the New York State tests in English Language Arts and Mathematics. For high schools, student performance is measured by diplomas and graduation rates; and
  • Student Progress (55% of score), as measured by how much schools help students progress during the school year in subjects such as reading, writing, math, science, and history. Schools’ progress scores also rise when they help English Language learners, special education students and students who are not performing well at the beginning of the school year.

Schools also receive additional recognition for exemplary progress by students most in need of attention and improvement. This can increase a school’s overall grade.
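A minimal sketch of how the three domain scores might be combined into the overall score out of 100, assuming the overall score is simply the weighted sum of the domain scores using the percentages listed above (the extra-credit figure and the cap at 100 are illustrative assumptions, not the Department’s exact formula):

```python
def overall_score(environment: float, performance: float, progress: float,
                  extra_credit: float = 0.0) -> float:
    """Combine domain scores (each on a 0-100 scale) using the published weights.

    The simple weighted-sum rule, the cap at 100 and the size of any extra
    credit are assumptions for illustration only.
    """
    score = 0.15 * environment + 0.30 * performance + 0.55 * progress
    return min(score + extra_credit, 100.0)

# Hypothetical school: strong on student progress, weaker on raw performance.
print(overall_score(environment=70, performance=45, progress=80, extra_credit=2.5))  # 70.5
```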

A school’s results in each domain are compared to the results of all schools of the same type (elementary, middle and high schools) across the City. Results are also compared to a peer group of 40 similar schools. These comparisons are reported as the percentage of the distance between the lowest and highest scores achieved by schools in the comparison group.
The citywide and peer school ranges are determined on the basis of the past two years’ results for elementary and middle schools and the past four years’ results for high schools (the reference period).

A score of 50% on a particular measure means that the school’s performance on that measure in the current year was exactly halfway between the bottom and top scores in the citywide or its peer group range during the previous two or four years. Similarly, 75% signifies that the school’s score was three-quarters of the distance between the bottom and top of that range. Scores above 100% are possible if in the year of the Progress Report the school exceeds the top score in the reference period range.
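In other words, each measure is reported as the school’s position within a reference range. A minimal sketch of that calculation, using hypothetical scores and range limits:

```python
def range_position(school_score: float, range_low: float, range_high: float) -> float:
    """Report a score as a percentage of the distance between the lowest and
    highest scores in the comparison group over the reference period."""
    return 100.0 * (school_score - range_low) / (range_high - range_low)

# Hypothetical peer-group range of 40 to 80 points on some measure.
print(range_position(60, 40, 80))  # 50.0  -> exactly halfway up the range
print(range_position(70, 40, 80))  # 75.0  -> three-quarters of the way up
print(range_position(84, 40, 80))  # 110.0 -> above 100% because the school beat
                                   #          the top score of the reference period
```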

Peer schools are schools that serve similar populations in terms of grade span, demographic composition, and/or average incoming State exam scores. A school’s peer group consists of the twenty schools above and twenty schools below it in the same grade span category when ranked by a “peer index”. The peer index of each school is reported in the individual school Progress Reports and the overall citywide Progress Report.

Different types of school are ranked by different peer indexes. Peer groups for K-5 and K-8 schools are determined by their student profile, while peer groups for 6-8 schools and high schools are determined by student performance on state-wide English and mathematics exams.

The peer index used for schools in the K–5 or K–8 grade span is the weighted average of the percentage of students at the school eligible for free lunch (the Title I poverty rate) (40%), percentage of Black and Hispanic students (40%), percentage of the student population enrolled in Special Education (10%), and percentage of the student population made up of English Language Learners (10%). The index value is from 0-100 and a high value reflects a high need student population with a high percentage of students from low income, Black or Hispanic families.

The index for schools in the 6–8 grade span group is the average of the Proficiency Ratings its actively enrolled students had earned on their fourth grade State ELA and mathematics exams. The index for high schools is the average of the Proficiency Levels its actively enrolled students had earned on their State ELA and mathematics exams as 8th graders. The index value is from 1-4.5 and a high value in this case indicates a student population with low need.
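To make the peer-group construction concrete, here is a sketch of how the K–8 peer index and a peer group could be assembled from the definitions above. The school records are hypothetical, and only a handful of schools are shown (the real calculation ranks every school in the grade span):

```python
import pandas as pd

# Hypothetical K-8 school profiles (percentages of enrolment).
schools = pd.DataFrame({
    "school": ["A", "B", "C", "D", "E"],
    "pct_free_lunch":     [85, 40, 92, 60, 25],
    "pct_black_hispanic": [90, 35, 95, 70, 20],
    "pct_special_ed":     [12,  8, 15, 10,  6],
    "pct_ell":            [20,  5, 30, 12,  3],
})

# Published weights for the K-5 / K-8 peer index: 40/40/10/10.
schools["peer_index"] = (0.4 * schools["pct_free_lunch"]
                         + 0.4 * schools["pct_black_hispanic"]
                         + 0.1 * schools["pct_special_ed"]
                         + 0.1 * schools["pct_ell"])

def peer_group(df: pd.DataFrame, school: str, k: int = 20) -> pd.DataFrame:
    """Return the k schools above and k below the named school when ranked by peer index."""
    ranked = df.sort_values("peer_index").reset_index(drop=True)
    pos = ranked.index[ranked["school"] == school][0]
    neighbours = ranked.iloc[max(pos - k, 0): pos + k + 1]
    return neighbours[neighbours["school"] != school]

print(schools[["school", "peer_index"]])
print(peer_group(schools, "D", k=2))
```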

New South Wales

Like school comparisons for government schools were introduced in NSW in 2005 as part of a revised system of annual reporting by schools. The system was initially introduced on a trial basis.

Schools can report academic achievement against the average for their ‘Like School Group’ (LSG), as well as the State, in their annual report. The decision to report against LSG and/or the State is optional for the school. The Department of Education does not publish the data for each school.

Schools can report the following against the LSG and/or State:

  • percentage of students in each skill band (Year 3 & 5 Literacy and Numeracy) and/or each achievement level (Year 7 Literacy and Numeracy) and/or each performance band (School Certificate);
  • relative performance of the school compared to the school average over time;
  • the average progress in literacy and numeracy for matched students (value-added) for Years 3-5 and/or Years 5-10 and/or Years 10-12;
  • the school average score for School Certificate subjects compared to the school average over time;
  • the school mean score for all HSC subjects compared to the school mean over time.

Many schools publish the average score of students for Year 3, 5 and 7 literacy and numeracy tests, the average score for School Certificate subjects and the average score for all HSC subjects. They also publish corresponding averages for the like school group and the state average. Schools also publish the proportion of students performing at different proficiency levels for their school, the like school group and for the state as a whole.

All NSW government schools are allocated to one of nine LSGs based on the average socio-economic status (SES) of the school community and the school’s geographic isolation. There are four metropolitan LSGs differing in SES and five rural LSGs differing in SES and remoteness. Selective schools are placed in a separate LSG.

The school SES is determined by geo-coding the addresses of all of its students and allocating them to ABS Census Collection Districts (CDs). The Socio-Economic Indexes for Areas (SEIFA) Index of Disadvantage value associated with the CD is assigned to each address and the average value of all the student addresses is the school SES score. The school geographic isolation measure is based on the Accessibility/Remoteness Index of Australia (ARIA).

Victoria

The Victorian Education Department has been using data to compare the performance of ‘like schools’ since 1996. A new system of comparisons called Percentile/SFO Comparison Charts was piloted in 2007 and the results distributed to schools in 2008. These comparisons are not publicly reported.

Under the previous system, like school groups were identified according to the proportion of students who receive the Education Maintenance Allowance or Commonwealth Youth Allowance and the proportion of students from a Language Background other than English. All government schools were divided into nine ‘like school groups’ according to the background characteristics of their students.

The new system uses SES as a benchmark for reporting and assessing school performance. Schools compare their actual performance with that predicted by their SES and that of similar schools. This is done by comparing the percentile rank of a school’s average test results for each Year level with the SES percentile range of other schools most like the school.

The SES of each school is measured by Student Family Occupation (SFO) density which is derived from information about parent occupations provided on student enrolment forms. The data is based on five occupational groups and these are each given a different weighting. The higher a school’s SFO density, the lower its SES; that is, a high density indicates a high proportion of students from families in less skilled occupations.

This system of comparisons is based on the assumption that if SES, as measured by the SFO, was the sole determinant of student achievement, a school’s achievement percentile would be expected to be similar to its SFO percentile. In effect, the SFO percentile range is used as a predictor of a school’s test results.

The exact average test scores and the actual SFO density are not directly reported to schools, each being converted to a percentile rank. Nor is the SFO percentile of an individual school presented. The Department of Education provides access to the data for individual schools but it is not publicly available. Schools are not required to report student achievement against other like schools.

Individual schools can compare their performance to that predicted by their SFO by plotting the percentile of each absolute average test score against the SFO density of schools whose SFO percentile range is +/-10% of that of their own school [see Attachment 1]. For example, a school’s Year 3 average reading score may be at the 20th percentile, meaning that 20% of schools have a lower average score; that is, the school has a higher average test score in Year 3 reading than 20% of all schools. If the SFO percentile range of like schools is 12-32%, the school could be said to be performing at the level predicted by the SFO percentile range of like schools. However, if the SFO percentile range of like schools is 25-45%, the school could be said to be performing below the level predicted by the SFO percentile range of like schools.
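A minimal sketch of that comparison logic, using the percentile figures from the example above (the classification wording is an interpretation of the description, not the Department’s own):

```python
def compare_to_like_schools(achievement_percentile: float,
                            sfo_low: float, sfo_high: float) -> str:
    """Compare a school's achievement percentile with the SFO percentile range
    of its 'like' schools (its own SFO percentile plus or minus 10 points)."""
    if achievement_percentile < sfo_low:
        return "below the level predicted by like schools"
    if achievement_percentile > sfo_high:
        return "above the level predicted by like schools"
    return "at the level predicted by like schools"

# Year 3 reading at the 20th percentile, as in the example above.
print(compare_to_like_schools(20, 12, 32))  # at the predicted level
print(compare_to_like_schools(20, 25, 45))  # below the predicted level
```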

Western Australia

Like school comparisons of student achievement are provided to Western Australian government schools as part of what is called the ‘Data Club’. The Data Club was developed as a program to enable school principals and teachers to analyse their students’ achievement scores on the Western Australian Literacy and Numeracy Assessment (WALNA). It allows schools to compare their performance with that of like schools and to identify the value-added component of their school. The information is only provided to schools, but many of them include broad comparative results in school annual reports.

Schools are grouped into ‘like-school’ bands based on their socio-economic index score. There are nine bands. The most disadvantaged schools are in Band 0 and the most advantaged schools are in Band 8. Band 4 schools are ‘average’.

Schools are provided with their mean and median scores, as well as the distribution of scores, for the WALNA tests in literacy and numeracy for Years 3, 5 and 7 together with those for their like school group, schools in their District and with the State as a whole. They are also provided with their average scores over time. A set of charts shows a school’s average scores and the distribution and range of scores compared with their like school group, District and State. This enables schools to compare the difference between their highest and lowest scores with that of other schools as well as the extent to which student scores are clustered around the school average. Many schools currently report this information in their school annual reports.

Like schools are determined on the basis of their score on a socio-economic index constructed by the Department of Education. The index is based on ABS Census data at the CD level and on each school’s actual percentage of Aboriginal students. It comprises the double-weighted dimensions of education, occupation and Aboriginality, and the single-weighted dimensions of family income and single-parent families.

The SES score for each school is derived by geo-coding the addresses of students at each school to CDs. The index was initially calculated using a sample of addresses in each school but now all student addresses are used. The number of students for each CD is determined and the ABS Census data is used to calculate an SES score for each CD by combining index scores for each dimension. The SES score for each school is obtained by weighting each CD SES score by the number of students resident in each CD and taking account of the percentage of Aboriginal students in each school.
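A rough sketch of how such a score could be assembled from this description. The collection-district values, student counts and the exact scaling are hypothetical assumptions, and the school-level Aboriginality adjustment is noted but not modelled:

```python
# Census-based dimension weights described above: education and occupation are
# double weighted; family income and single-parent families are single weighted.
# (Aboriginality, also double weighted, is applied using the school's own
# percentage of Aboriginal students and is not shown in this sketch.)
CENSUS_WEIGHTS = {"education": 2, "occupation": 2, "family_income": 1, "single_parent": 1}

def cd_ses_score(dims: dict[str, float]) -> float:
    """Weighted average of the census dimensions for one collection district (CD)."""
    return sum(CENSUS_WEIGHTS[d] * v for d, v in dims.items()) / sum(CENSUS_WEIGHTS.values())

def school_ses_score(cd_scores: dict[str, float], cd_student_counts: dict[str, int]) -> float:
    """Weight each CD's SES score by the number of the school's students living in that CD."""
    total_students = sum(cd_student_counts.values())
    return sum(cd_scores[cd] * n for cd, n in cd_student_counts.items()) / total_students

# Hypothetical school drawing its students from two collection districts.
cd_scores = {
    "CD-101": cd_ses_score({"education": 40, "occupation": 45,
                            "family_income": 55, "single_parent": 60}),
    "CD-102": cd_ses_score({"education": 70, "occupation": 65,
                            "family_income": 75, "single_parent": 70}),
}
print(school_ses_score(cd_scores, {"CD-101": 120, "CD-102": 80}))
```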

Trevor Cobbold


Gillard’s School Reporting Model is a Triumph of Ideology over Evidence

Sunday August 31, 2008

The Rudd Government’s “education revolution” is looking more and more like an extension of the Howard Government’s school policies. All the same elements are there – choice and competition, reliance on markets, and now public reporting of school results.

The model for the new school reporting scheme comes direct from New York. Julia Gillard has been enthusing about the New York system ever since her audience with the New York Schools Chancellor, Joel Klein. She says she is “inspired” and “impressed” by Klein’s model.

It is a pity that Gillard did not look more closely. She would have seen major flaws.

The New York system produces unreliable and misleading comparisons of school performance and student progress. It is incoherent. It can be used to produce league tables. It fails to compare like with like and it is statistically flawed.

Diane Ravitch, Professor of Education at New York University, a former US Assistant Secretary of Education and an advocate of school reporting, now says that New York’s school reporting system is “inherently unreliable”, “dubious” and produces “bizarre results”.

Jennifer Jennings from Columbia University describes it as “statistical malpractice”, “a mess”, and based on “highly questionable methods”. The New York Sun columnist, Andrew Wolf, says that it is “an overblown grading system that already seems to be sinking from its own weight”.

New York uses an incredibly complicated scoring system, requiring two 30-page technical guides to explain. It combines a wide range of information on student achievement, student progress, student composition and school features to obtain a school grade of ‘A’ to ‘F’ and an overall performance score out of 100.

The process by which all this information is combined, weighted and assessed involves many highly arbitrary and subjective judgements. According to Andrew Wolf, it involves “a bagful of subjective adjustments, bonus points and bureaucratic discretions”. It is riddled with inconsistencies.

Amongst the most bizarre results of the system is that high performing schools can be assessed as failing and be closed down. For example, the elementary school PS 35 on Staten Island was graded last year as failing even though 85% of its students passed the reading test and 98% passed the mathematics test.

It was failed because its students had shown insufficient improvement from 2006. As a result, it is a candidate to be closed if it fails to improve further.

Julia Gillard says that she doesn’t want “simplistic and silly” league tables, and will only compare schools with a similar student population. This is disingenuous. It is not possible to use the New York model to report student results in like schools without providing the scores for all schools. The New York Times and the New York Post list each school’s grade and overall performance scores. It is a simple matter to rank all schools on their grade and scores in league tables.

Despite what Gillard says about the New York system, it fails to consistently compare like with like. Jennifer Jennings has pointed out that school peer groups include schools with very dissimilar demographic profiles. For example, the percentage of high-achieving Asian students in the schools of one peer group ranges from 1% to 69%. In another, the percentage of low income students ranges from 12% to 94%.

Another major problem is that New York school progress reports do not report measurement errors for school scores and grades. As a result, its comparisons of school performance are likely to be inaccurate and misleading.

Many studies of school performance reporting in England, the US and Australia have shown that a large proportion of school results are statistically indistinguishable when measurement error is taken into account. The problem is magnified for measures of student progress, or ‘value added’ comparisons, where measurement error is inevitably larger.

Astoundingly, Gillard’s preferred model assesses student progress on only one year’s data. Yet, a study published by the US National Bureau of Economic Research shows that 50 to 80 per cent of the year-to-year fluctuations in average school test scores are random and have nothing to do with school quality. School comparisons of progress over one year are therefore highly unreliable.

These and many other criticisms mean that the New York reporting system is deeply flawed. Any system based on it will severely mislead the public and parents.

Its adoption will subject school principals and staff to substantial risks of being punished or rewarded on the basis of dubious and unreliable data and for factors beyond their control. It will not accurately identify best practice in schools as Gillard wants.

State and Territory Governments would be well advised to reject the New York model. It will only do harm to a largely successful education system.

Australia and Finland are two of the highest achieving countries in the world in school outcomes according to the PISA surveys conducted by the OECD. Neither country got there by reporting school results.

Why the Rudd Government is choosing to emulate the reporting policies of much lower performing countries like the United States and England can only be explained as a triumph of ideology over evidence.

Trevor Cobbold
Convenor, Save Our Schools
Save Our Schools is a Canberra-based public education advocacy group.


Schools Cheating to Raise League Table Ratings

Monday July 30, 2007

A BBC News investigation has exposed widespread cheating by teachers in exams in England in order to raise school ratings on league tables. The revelations follow a stream of cheating incidents across US schools this year.

These revelations are a harbinger of what can be expected in Australia with both the Government and the Labor Opposition committed to forcing schools to publish their results for national literacy and numeracy tests. Publishing school results inevitably leads to league tables and immense pressure to improve school rankings.

Cheating is an easy way to deliver better school results. The BBC report says that teachers blame the cheating on constant testing and the importance placed on league tables, which create pressure to improve results.

One unnamed teacher from Leeds said ready-made answers were kept in a filing cabinet. These were used by teachers to fill gaps in pupils’ coursework without the students’ knowledge. Another teacher told the BBC how he pointed over pupils’ shoulders when they made mistakes in an exam.

A teacher from Dorset told the BBC the pressure came directly from senior teaching staff.

“I was told they had to get a C grade no matter what, which I did, which was cheating.”

He said he told his pupils exactly what to write, but to change a few words to make it look like their own work.

A survey by the Teacher Support Network, a teachers’ welfare charity, found the majority of respondents thought cheating was commonplace. About two-thirds of teachers in a small survey said they personally help students “more than is appropriate” in order to improve exam results.

The former head of the Office for Standards in Education (Ofsted), Chris Woodhead, told the BBC that cheating by teachers is so extensive that the league tables used by parents to differentiate between schools have become unreliable.

“It makes a mockery of the league tables if one school is behaving professionally and another school is offering the kind of support where the teacher actually does the work for the child.”

The full BBC News report is available at: http://news.bbc.co.uk/2/hi/uk_news/education/6918805.stm
