Imagine someone tells you that their ML/AI model is right 97.2% of the time.
Most people would say: "Ninety-seven percent? Wow! That model is amazing."
A data practitioner would raise an eyebrow: "That number seems rather high. Can you tell me more about how you trained that model?"
An experienced, leadership-level data professional would ask: "OK, but … what about the 2.8% of the time it's wrong? How are those cases similar? And what's the impact?"
The harsh reality of any predictive model is that it will be wrong now and then. (It's kind of in the name: a model is a substitute for reality, and therefore it's never going to be a perfect match.)
So when your model is wrong, don't just hide behind the metrics. Understand what or who is on the receiving end of that model error, and then figure out how to address it. Oh, and then go back to those errors and see what they have in common. Maybe you can develop a workaround for the issue.
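To make that concrete, here's a minimal sketch of that kind of error review in Python with pandas. Everything in it is hypothetical and invented for illustration: the dataframe, the column names (predicted_grade, human_grade, length_words, topic), and the toy values.

```python
import pandas as pd

# Hypothetical example: a grading model's predictions vs. human scores.
# All columns and values are made up for illustration.
df = pd.DataFrame({
    "essay_id":        [1, 2, 3, 4, 5, 6],
    "topic":           ["history", "science", "history", "art", "science", "art"],
    "length_words":    [250, 900, 180, 600, 1100, 220],
    "predicted_grade": [3, 4, 2, 4, 4, 2],
    "human_grade":     [3, 4, 3, 4, 4, 3],
})

# The headline number everyone quotes.
df["is_error"] = df["predicted_grade"] != df["human_grade"]
print(f"accuracy: {1 - df['is_error'].mean():.1%}")

# The leadership-level questions: which cases are wrong,
# and what do they have in common?
errors = df[df["is_error"]]
print(errors.groupby("topic").size())       # do errors cluster by topic?
print(errors["length_words"].describe())    # or by essay length?
```

In this toy example the mis-graded essays happen to be the two shortest ones, which is exactly the kind of pattern a workaround (say, routing short essays to a human grader) could exploit.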
I have no idea why this comes to mind right now. No idea! But, completely out of the blue, let me tell you about an AI-driven essay-grading tool that exhibited a 2.8% error rate – that is, 2,000 mis-graded essays out of 71,000. A spokesperson tries to claim that this is a "minor" issue. I suspect the 2,000 impacted students have a different view.
Earlier this year, Dallas school officials complained after some questions on state tests were graded by the software, and scores were lower than district leaders expected. When the district submitted about 4,600 student writing samples for regrading, about 2,000 received a higher score.
Jake Kobersky, a spokesman for the Texas Education Agency, said the adjustments were minor in the context of Dallas's 71,000 writing samples. He said the state remained confident in the technology.
(Source: "Teachers Worry About Students Using A.I. But They Love It for Themselves." New York Times, 2025/04/14)
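And for what it's worth, the quoted figures are easy to sanity-check:

```python
# Sanity-checking the figures quoted above.
print(f"{2_000 / 71_000:.1%}")  # ~2.8% of all 71,000 writing samples
print(f"{2_000 / 4_600:.0%}")   # ~43% of the 4,600 samples Dallas resubmitted
```

Same 2,000 essays either way; the denominator you pick determines whether the problem sounds minor.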