Imagine having someone follow you around, observing you for just a fraction of a day, to assess your capability on the job. Sounds nerve-racking. This is how many teachers are evaluated, and new research suggests that these observations are not altogether reliable.
Although observations may be favoured over other methods of teacher assessment, such as gains in pupils' standardised test scores, we should be wary of relying too heavily on observations as they currently stand. A new paper from the Brookings Institution reports that teacher assessment via observation is biased by the existing ability level of the pupils in the class. That is, if the same teacher were dropped into a better-performing class, he would be rated more favourably than if he had been dropped into a group of lower-performing pupils.
According to the paper by Russ Whitehurst, Matthew Chingos, and Katharine Lindquist:
“Our data confirm that such a bias does exist: teachers with students with higher incoming achievement levels receive classroom observation scores that are higher on average than those received by teachers whose incoming students are at lower achievement levels. We should not tolerate a system that makes it hard for a teacher who doesn’t have top students to get a top rating.”
The Brookings research illustrates that human judgments and evaluations can be unintentionally biased by subtle or seemingly irrelevant contextual factors, a concept explored in great depth in the RSA’s recent paper Everyone Starts with an A, and one which more generally underpins much of the Social Brain Centre’s work.
Although the research was carried out in the USA, and a key lesson from behavioural science is to be careful about generalisability, it is not unwarranted to wonder whether the same bias may be prevalent in school systems in the UK and elsewhere too.
Fortunately, in the case of the biased teacher assessments, the authors suggest that there is an easy fix: statistically adjust teacher observation scores based on student demographics. They explain:
“Our analysis demonstrates that a statistical adjustment of classroom observation scores for student demographics is successful in producing a pattern of teacher ratings that approaches independence between observation scores and the incoming achievement level of students. Such an adjustment for the makeup of the class is already factored into teachers’ value-added scores; it should be factored into classroom observation scores as well.”
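To make the idea of a statistical adjustment concrete, here is a minimal sketch in Python. It is not the Brookings authors' actual model, which is not described here. It assumes a simple linear regression of observation scores on a single class-level demographic measure, and it uses invented data. The function name adjust_scores, the variable share_low_income and all of the numbers are hypothetical, chosen only for illustration.

```python
import numpy as np


def adjust_scores(raw_scores, class_covariates):
    """Remove the part of each observation score that is linearly
    predicted by class-level covariates (ordinary least squares).

    Hypothetical helper for illustration only, not the method
    used in the Brookings paper.
    """
    y = np.asarray(raw_scores, dtype=float)
    X = np.asarray(class_covariates, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    # Design matrix with an intercept column.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    # Re-centre on the original mean so scores stay on a familiar scale.
    return residuals + y.mean()


# Invented data: 500 teachers, each observed with one class.
rng = np.random.default_rng(0)
n = 500
share_low_income = rng.uniform(0, 1, n)   # assumed demographic measure
teacher_quality = rng.normal(0, 1, n)     # unobservable in real life
raw = (3.0 + 0.5 * teacher_quality
       - 0.8 * share_low_income           # the bias built into this toy data
       + rng.normal(0, 0.2, n))

adjusted = adjust_scores(raw, share_low_income)

print("raw vs class makeup:     ", np.corrcoef(raw, share_low_income)[0, 1])
print("adjusted vs class makeup:", np.corrcoef(adjusted, share_low_income)[0, 1])
```

In this toy example the raw scores are clearly related to class makeup, while the adjusted scores are, by construction, uncorrelated with it. Real data would be messier, and the choice of which characteristics to adjust for is a judgement call that the sketch does not address.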
It is interesting to note that this fix does not attempt to prevent the bias, but instead corrects the consequences of it. This may be a reasonable solution given that these types of biases can be incredibly difficult to overcome, even once someone is made aware that a bias may exist. Just last night Dan Ariely, in conversation with Matthew Taylor, admitted that even as an expert in the field he still finds it difficult to avoid many common decision-making pitfalls.
Providing data to show us patterns in our judgments and decision-making (for example, that the assessment of teachers by classroom observation is unfairly related to the ex-ante performance of the students) is certainly eye-opening. But data alone is often not sufficient to change behaviour. What it does provide is a starting point to think about how to design our environments better, keeping in mind what we are learning about human behaviour. In this case, that means redesigning how classroom observations are scored so that they correct for the unintentional bias occurring within them.
As we learn more about human nature and can better recognise the forces that influence judgement and behaviour, fixes such as the one suggested above, which help to correct the consequences of our cognitive biases, should become more prevalent. Although this is certainly better than nothing at all, does this type of solution go far enough? It seems in some sense to be a technical solution to what is potentially a very complex and deep-seated issue. This fix is not perfect, but until we understand more about how to mitigate bias at its root, it is at least a welcome start.