Spiegelhalter is letting rip at his field while sitting in the central court of the Centre for Mathematical Sciences, where he is Chair of the Winton Centre for Risk and Evidence Communication. The Centre’s buildings are beautiful, with sweeping curved walls and low, grass-covered roofs. “We hope the layout of the buildings and the way they work will actively encourage fresh dialogue and discovery,” the award-winning architects explained to The Guardian in 2003. Though perhaps architecture can only go so far: just inside the entrance, there is a sign urging staff and students to smile at one another, and to try saying hello.
Spiegelhalter doesn’t seem to need a reminder to smile and talk. Over the course of his career he has shifted from working on serious maths to improving how evidence is communicated to the public. In his latest book, The Art of Statistics, he applies what he calls ‘simple numeracy’ to messy, real-world problems. The questions he tackles include ‘Do speed cameras reduce accidents?’, ‘What’s the cancer risk from eating bacon sandwiches?’, and ‘Does going to university increase the risk of getting a brain tumour?’ (The answer to the final question is a resounding no.)
He hopes the book provides a grounding in statistical principles for new graduates being lured into careers in data science — a multidisciplinary field that combines statistics with machine learning, design, and communication. He welcomes the advent of the subject and the facelift it has given statistics. “I think the ability to deal with data critically and to realise its strengths and limitations is the most important skill in the future world and it’s an extremely marketable skill as well,” he explains.
But in the shift from statistics to data science, has one bag of tools simply been replaced by another? Are machine learning algorithms being applied with little care for their inner workings, much like the statistical tests of Spiegelhalter’s early years?
“Machine learning is even worse! Because you can get a set of data, and then you bang in a logistic regression, and classification trees, random forests, neural networks, support vector machines and off you go, bam, bam, bam. You apply them all and you’ve got no idea what they are doing and out they come with some error rate and you think ‘Oh, that one’s best.’ And this is appalling, absolutely dreadful, because you’ve got no idea,” he says.
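The workflow he is criticising can be sketched in a few lines. The snippet below (a minimal, pure-Python illustration with invented toy data and stand-in “models”, not any real analysis) shows the pattern: score every model on the same data, keep whichever has the lowest error rate, and never ask why the winner wins or where it would break.

```python
# A sketch of the "bam, bam, bam" workflow: fit several models,
# keep the one with the lowest error rate, scrutinise nothing.
# The data and the decision rules here are purely hypothetical.
import random

random.seed(0)

# Toy dataset: one feature x in [0, 1); the true label is 1 when
# x > 0.5, with labels flipped 10% of the time as noise.
X = [random.random() for _ in range(200)]
y = [1 if (x > 0.5) != (random.random() < 0.1) else 0 for x in X]

# Three stand-in "models" -- each is just a fixed decision rule.
models = {
    "always_predict_1": lambda x: 1,
    "threshold_at_0.5": lambda x: 1 if x > 0.5 else 0,
    "threshold_at_0.3": lambda x: 1 if x > 0.3 else 0,
}

# The criticised step: compute an error rate for everything and
# declare the lowest number "best", with no idea what it learned.
errors = {
    name: sum(rule(x) != label for x, label in zip(X, y)) / len(y)
    for name, rule in models.items()
}
best = min(errors, key=errors.get)
print(best, round(errors[best], 3))
```

Even in this toy case the winning score says nothing about why the rule works or whether it would survive a shift in the data, which is exactly the gap Spiegelhalter points to next.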
“The robustness and explainability and fairness and transparency are absolutely vital when it comes to this work. Frankly, unless you’ve got a pretty good idea of what it’s doing you’ve got no guarantee whatsoever that if you move into a slightly different domain that it won’t just all fall apart. You slightly change the inputs, move into a different context, and you’ve got no idea why, suddenly, it doesn’t work.”
Spiegelhalter is not just sceptical of how businesses apply algorithms, but also of the ambitious and broad claims made about the technology. “You have to recognise there are certain problems and challenges that machine learning and AI have been absolutely fantastic at: well-defined problems in terms of optimisation. I use Google Maps — it’s a fantastic optimisation algorithm, there’s fantastic image recognition, there’s an amazing ability to play games. But these are deeply restricted classes of problems: you need no background knowledge, you just need a huge amount of information on past cases. The idea that ‘Oh, because a program can learn Go [in 2016, Google DeepMind’s AI program AlphaGo beat world master Lee Sedol at the board game], therefore you can let it free on medical records and it can tell you what’s going to happen to a patient,’ is complete nonsense, just nonsense! And there’s so much overhype and over-claiming being made.”
If anyone is qualified to raise an eyebrow at the current enthusiasm for machine learning and artificial intelligence, it’s Spiegelhalter. Some 35 years ago, he and a colleague applied probability theory to artificial intelligence with significant consequences: the work has been cited thousands of times by other researchers and underpins many machine learning algorithms used today. He brightens when recalling the period. “When you look back on your career, you think there are certain times which have been a bit of a slog, and other times it was really exciting and energising. The eighties were really exciting.”
Spiegelhalter is keenly interested in the challenge educators face: conveying enthusiasm for machine learning and artificial intelligence to students while instilling a healthy dose of scepticism. Having previously advised England’s Department for Education on the syllabus for GCSE Mathematics, he is now a member of a Royal Society panel that provides recommendations on data science in schools. In 2018, he co-authored a report that argued for a ‘more detailed, updated, and coherent computing curriculum,’ with recommendations for introducing content on machine learning and computational modelling and strengthening content on data representation.
“Data science certainly does not belong in maths. I don’t even really believe statistics belongs in maths. Data science in all its manifestations — sorting data, getting conclusions from data visualisation, communication — is an essential skill in the world and it’s one that should be part of schools’ education, but I don’t know where it should be. It has a broad spectrum of literacy that can be applied across all areas — geography, the humanities, computer science — nobody owns it, and it’s wonderful. I wish I could say ‘You must do it this way,’ but I really don’t know. But, I suppose it’s quite right you should go into something not knowing the answers.”
The Art of Statistics is published by Pelican and is available to purchase in paperback now.