Florida’s public school system is often held up as an exemplar of how market-based school reforms such as vouchers, charter schools, high-stakes standardized tests, and the publication of school results can improve student achievement. It has a unique accountability scheme which uses vouchers as a sanction against schools with persistently low results, funding students at those schools to move to higher performing schools.
There is continuing controversy over the success of Florida’s reforms [Strauss 2010, 2011; Martin 2011]. It appears that the gains from the reforms have been over-estimated and that other factors have contributed to improvements in student achievement.
A new study published by the Federal Reserve Bank of New York last month suggests that part of the gains attributed to the reforms was due to manipulation of school results [Chakrabarti 2011]. The study shows that, when the state’s accountability program was introduced, schools in Florida attempted to remove certain students from the test-taking pool by classifying them into excluded categories.
Under the initial Florida Opportunity Scholarship Program, students attending failing public schools were offered the opportunity to choose a higher performing public school in the same or a different school district, or to choose an eligible private school. Students who moved to another public school were funded at the same level as before. Students choosing a private school were funded at the public school level or at the level of the private school fees, whichever was lower. The private school option was declared unconstitutional by the Florida Supreme Court in 2005.
The voucher program made all students of a school eligible for vouchers if the school received two “F” grades in a period of four years. Thus, it can be looked upon as a “threat of voucher” program. Schools getting an “F” grade for the first time were directly threatened by vouchers, but vouchers were implemented only if they got another “F” grade in the next three years. Vouchers were associated with a loss in revenue (equivalent to state aid per pupil for each student) and also negative media publicity and visibility. Moreover, the “F” grade, being the lowest performing grade, was likely associated with shame and stigma. Therefore, the threatened schools had a strong incentive to try to avoid the second “F”.
Under Florida rules, scores of students in several special-education (ESE) and limited-English-proficient (LEP) categories are not included in the computation of school grades. The study investigated whether the threat of vouchers and the stigma associated with the Florida program induced schools to strategically manipulate their results by excluding these students from the tests so as to artificially boost school test scores.
The study examined the impact of the scheme from its introduction in 1999 to 2002. Prior to its introduction, there was no evidence that would-be threatened schools behaved any differently than would-be non-threatened schools in terms of categorization of students in excluded or included LEP categories in any of the high-stakes or low-stakes grades or in terms of classification in excluded or included ESE categories.
In contrast, the program led to increased classification of students into the excluded LEP category in the high-stakes grade 4 and the entry grade 3 in the first year after the program was introduced. However, it found no evidence that the threatened schools resorted to increased classification into excluded ESE categories in any of the three years after the program. This was likely because classification into ESE categories was associated with substantial costs which might have acted as a disincentive to this form of reclassification: it had to be approved by the parents and a group of experts (such as physicians, psychologists, etc.) and it meant increased provision of services.
The study adds to a considerable number of others which demonstrate strategic behaviour by schools in the face of publication of school results and other accountability measures. Several studies have found evidence of classification of low-performing students into excluded disabled categories [Cullen & Reback 2006; Figlio & Getzler 2006; Jacob 2005]. Others found evidence of teaching to the test and pre-emptive retention of students and substitution away from low-stakes subjects [Jacob 2005; Figlio & Rouse 2006] and cheating by teachers [Jacob & Levitt 2003].
Several studies demonstrate that accountability programs lead to differential focus on marginal students [Reback 2008; Chakrabarti 2010; Ladd & Lauen 2010; Neal & Schanzenbach 2010] and differential focus on subject areas [Goldhaber & Hannaway 2004; Chakrabarti 2010]. Another study found that low-performing students were given longer suspensions during the testing period than higher performing students for similar offences [Figlio 2006].
In Australia, there is already extensive evidence of schools rorting their results. There were widespread allegations of cheating in last year’s national literacy and numeracy (NAPLAN) tests in New South Wales, Queensland, South Australia, Western Australia and Victoria. Many of the allegations resulted in dismissals and disciplinary action, and some are still being investigated.
The allegations concerned giving students clues to answers during the tests, helping students with answers, using posters on classroom walls to help students, changing answers after the tests, revealing test questions to teachers before the tests were taken, and excluding low achieving students from the tests.
These incidents came in the first year of My School. Before My School there were no reported incidents of cheating on the national literacy and numeracy tests because school results were not published. Now, there is such pressure from education departments on schools to improve their results that some teachers and principals inevitably succumb and resort to cheating and rorting.
The next round of NAPLAN tests is due in a couple of weeks. Federal and state/territory education ministers have vowed to crack down on schools that discourage struggling students from sitting NAPLAN tests or cheat to improve scores on the My School website.
The new study by the Federal Reserve Bank of New York and the extensive previous research indicate this is a lost cause. Manipulation and rorting of school results is inevitable in a high-stakes environment where school reputations and the careers of principals and teachers are on the line. The NAPLAN tests and the school results on My School will increasingly be seen as unreliable and will become discredited.
Chakrabarti, Rajashri 2010. Vouchers, Public School Response and the Role of Incentives: Evidence from Florida. Staff Report No. 306, Federal Reserve Bank of New York, November.
Chakrabarti, Rajashri 2011. Vouchers, Responses, and the Test-Taking Population: Regression Discontinuity Evidence from Florida. Staff Report No. 486, Federal Reserve Bank of New York, March.
Cullen, J. & Reback, R. 2006. Tinkering Towards Accolades: School Gaming Under A Performance Accountability System. In T. Gronberg and D. Jansen (eds.), Improving School Accountability: Check-Ups or Choice, Advances in Applied Microeconomics, 14: 1-34.
Figlio, David 2006. Testing, Crime and Punishment, Journal of Public Economics, 90: 837-851.
Figlio, D. & Getzler, L. 2006. Accountability, Ability and Disability: Gaming the System. In T. Gronberg & D. Jansen (eds.), Improving School Accountability: Check-Ups or Choice, Advances in Applied Microeconomics, 14: 35-49.
Figlio, David & Rouse, Cecilia 2006. Do Accountability and Voucher Threats Improve Low-Performing Schools? Journal of Public Economics, 90 (1-2): 239-255.
Goldhaber, Dan & Hannaway, Jane 2004. Accountability with a Kicker: Observations on the Florida A+ Accountability Plan, Phi Delta Kappan, 85 (8): 598-605.
Jacob, B. A. 2005. Accountability, Incentives and Behaviour: The Impact of High-Stakes Testing in the Chicago Public Schools. Journal of Public Economics, 89 (5-6): 761-796.
Jacob, B. A. & Levitt, S. 2003. Rotten Apples: An Investigation of the Prevalence and Predictions of Teacher Cheating. Quarterly Journal of Economics, 118 (3): 843-877.
Ladd, Helen F. & Lauen, Douglas L. 2010. Status Versus Growth: The Distributional Effects of School Accountability Policies, Journal of Policy Analysis and Management, 29 (3): 426-450.
Martin, Michael 2011. What Really Helped Florida’s Test Scores. Answer Sheet blog, Washington Post, 7 January.
Neal, Derek & Schanzenbach, Diane W. 2010. Left Behind By Design: Proficiency Counts and Test-Based Accountability, The Review of Economics and Statistics, 92 (2): 263-283.
Reback, R. 2008. Teaching to the Rating: School Accountability and the Distribution of Student Achievement. Journal of Public Economics, 92 (5-6): 1394-1415.
Strauss, Valerie 2010. A Harder Look at Education Research. Answer Sheet blog, Washington Post, 28 December.
Strauss, Valerie 2011. What Jeb Bush Doesn’t Like to Admit. Answer Sheet blog, Washington Post, 1 April.