A-levels, AI and a really big mess – lessons from the Ofqual algorithm

As a student, it’s bad enough having your final year of school cut short by a virus composed of little more than a single strand of RNA, a few proteins and a thin lipid envelope roughly 120 nm across. Now an untested algorithm has knocked a few letters off your predicted grades. Oh 2020, haven’t you done enough? Yet that was the reality for nearly 40% of students who opened that dreaded envelope last Thursday, shocked and disappointed by A-level grades that did not reflect their academic potential. After two days of furious protests, the government backed down and allowed students to keep their predicted grades.

How did it go so wrong? With summer exams cancelled, teachers were asked to provide two pieces of information: a rank ordering of their students by ability, and how well their school had performed in recent years. Ofqual asserted that this would be a more accurate way of awarding A-level grades than relying purely on teachers’ assessments. The algorithm was seen as a particularly ‘favourable’ option because of a concern that teachers alone would provide overgenerous estimated marks. There was some initial sense to this, and no doubt the algorithm was intended as the fairest way forward. But using the previous exam records of a student’s school, along with postcode data, meant that an excelling student at an underperforming school could only do as well as the school’s past cohorts, while a student from a well-funded private school (with a proven track record of success) was immediately protected. Fair? I don’t think so.

Chaotic as it was, the fiasco has highlighted the need to re-evaluate the ethics and reliability of algorithms in general. Humans can certainly be error-prone and biased, but that doesn’t mean algorithms are necessarily better – often they simply reflect those same biases.
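To see why ranking students against their school’s history caps individual achievement, here is a deliberately simplified sketch of that kind of standardisation – not Ofqual’s actual model; the function name and the idea of slotting this year’s rank order into last year’s grade distribution are assumptions made for illustration:

```python
# Simplified illustration of rank-based standardisation: this year's
# students are fitted to the grade distribution their school achieved
# in previous years. NOT Ofqual's real model.

def grades_from_history(ranked_students, historical_distribution):
    """ranked_students: names, best first.
    historical_distribution: (grade, fraction) pairs, best grade
    first, fractions summing to 1."""
    n = len(ranked_students)
    grades = {}
    cursor = 0
    for grade, fraction in historical_distribution:
        quota = round(fraction * n)  # seats available at this grade
        for student in ranked_students[cursor:cursor + quota]:
            grades[student] = grade
        cursor += quota
    # any rounding leftovers fall to the lowest historical grade
    for student in ranked_students[cursor:]:
        grades[student] = historical_distribution[-1][0]
    return grades

# A school that historically awarded 20% As, 30% Bs, 50% Cs:
history = [("A", 0.2), ("B", 0.3), ("C", 0.5)]
ranked = ["Asha", "Ben", "Chen", "Dee", "Eli"]
print(grades_from_history(ranked, history))
# {'Asha': 'A', 'Ben': 'B', 'Chen': 'B', 'Dee': 'C', 'Eli': 'C'}
```

Notice the trap: if this school had never awarded an A, no student in `ranked` could receive one, however brilliant – which is precisely the unfairness students protested against.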
While AI is advancing at an impressive rate, relying on it as a ‘silver bullet’ for tricky social and ethical issues is rarely the way to go. This isn’t the first time AI has shown signs of prejudice or poked at issues of social and regional inequality. A study by the US National Institute of Standards and Technology (NIST), covering 189 facial recognition algorithms from 99 different companies, found that the algorithms falsely matched African-American and Asian faces 10 to 100 times more often than Caucasian faces. The tests drew on databases of photos used by law enforcement agencies in the United States. The technology had the most difficulty identifying Native Americans, was worse at recognising women than men, and falsely identified older adults up to 10 times more often than middle-aged adults. It is a stark example of how AI can cross the line on racism, sexism and even ageism without anyone intending it to.

Another ethical challenge erupted when the tech giant Amazon had to scrap its AI recruitment tool because the historical hiring data it was trained on favoured men, so the tool learned to discriminate against women. Of course, we’re talking about software here, and I’m sure these algorithms didn’t ‘mean’ to overstep any boundaries. Yet we can’t ignore what this means for the balance between human and technological intelligence. Who do we trust more? At present, the best approach may be to combine the two. For A-level students, the government made a complete ‘U-turn’ and decided that human judgement was the best way forward. For now, at least.

As a society, we deserve more insight into the algorithms being applied to decision-making. To ensure unbiased outcomes, we need to spend more time testing algorithms so that they do not unfairly marginalise groups of people. That may mean improving the diversity of the data the software is given, and of the people who develop it.
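What does “testing an algorithm for bias” actually involve? One basic check is to compare error rates across demographic groups before deployment, much as NIST compared false-match rates. A minimal sketch, with invented group labels and toy data purely for illustration:

```python
# Minimal sketch of one bias audit: compare false-positive rates
# across groups for a classifier's predictions. Group names and
# data here are invented for illustration.
from collections import defaultdict

def false_positive_rates(records):
    """records: (group, predicted_positive, actually_positive) tuples.
    Returns each group's false-positive rate among actual negatives."""
    fp = defaultdict(int)   # false positives per group
    neg = defaultdict(int)  # actual negatives per group
    for group, predicted, actual in records:
        if not actual:
            neg[group] += 1
            if predicted:
                fp[group] += 1
    return {g: fp[g] / neg[g] for g in neg if neg[g]}

# Toy predictions: group_b is falsely flagged far more often.
records = [
    ("group_a", True, False), ("group_a", False, False),
    ("group_a", False, False), ("group_a", False, False),
    ("group_b", True, False), ("group_b", True, False),
    ("group_b", True, False), ("group_b", False, False),
]
print(false_positive_rates(records))
# {'group_a': 0.25, 'group_b': 0.75}
```

A large gap between the groups’ rates is exactly the kind of red flag that should halt deployment until the training data, or the model, is fixed.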
At its core, this is a question of transparency – and that is what most urgently needs re-examining.