Department of Economics

Subjective Performance Evaluation in the Public Sector: Evidence from School Inspections

How can we ensure that public sector institutions and their employees act in the best interests of users and perform effectively and efficiently?

For many governments the answer is to use hard performance targets, such as student test scores for public schools and patient waiting times for health-care systems.

An alternative model relies on subjective assessment by evaluators. The advantage of such an approach is that subjective evaluations can incorporate ‘soft’ information not captured by hard performance targets. A potential downside is that evaluators may be biased, subject to capture or impose a ‘one size fits all’ view of the world. How well subjective evaluation works in practice remains an open question.

In England, Ofsted inspections of schools have been a central feature of state education for over 20 years. This study investigates the validity of the school ratings that inspectors produce and asks what impact a fail rating has on subsequent pupil performance. In addition, it tests the extent to which teachers are able to ‘game’ the system.


Summary

This paper, by Dr Iftikhar Hussain, investigates the effects of Ofsted inspections on schools’ subsequent performance. The questions addressed are:

  • Do student test scores improve in response to a fail inspection rating?
  • To what extent do those being evaluated respond in a strategic or dysfunctional manner? How can we test for such dysfunctional responses by teachers?
  • Does the act of inspection yield any short-term test score gains for non-failing schools? One hypothesis is that inspectors may provide valuable feedback that helps raise school productivity.

The empirical strategy employed in this study allows for tests of ‘gaming behaviour’, such as whether low-ability students are removed from tests to artificially inflate performance.  

Methodology

This study analyses Ofsted inspection data for primary schools that failed between 2005/06 and 2008/09. Taking 2005/06 as an example, it examines the effect of a fail inspection in 2005 on Key Stage 2 scores the following May.

Key findings

This paper employs a ‘natural experiment’ to demonstrate that a fail inspection rating leads to test score gains for primary school students.

There is no evidence to suggest that schools rated as failing are able to inflate test score performance by gaming the system. Relative to purely test-based accountability systems, this finding is striking and suggests that oversight by evaluators who are charged with investigating what goes on inside the classroom may play an important role in mitigating such strategic behaviour.

For schools receiving the top ratings, there are no significant effects following an inspection. This suggests that any effects from the process of evaluation and feedback are negligible for non-failing schools, at least in the short term.

The findings also reveal that gains are especially large for students who score low on the prior test, taken at age seven. These results are consistent with the view that children of low-income parents benefit the most from inspections. In such cases, inspectors may fulfil an especially vital role by substituting for parents in holding teachers to account.


Access paper

Subjective Performance Evaluation in the Public Sector: Evidence from School Inspections (2014), forthcoming in the Journal of Human Resources

The latest version of this paper is available from the author’s web page.