Unreliable chatbots
2025-02-11

We know that genAI/LLM chatbots aren't the most reliable sources of information. The BBC has released a study that adds some much-needed context to this idea:

(Source: "Groundbreaking BBC research shows issues with over half the answers from Artificial Intelligence (AI) assistants" 2025/02/11)

- 51% of all AI answers to questions about the news were judged to have significant issues of some form
- 19% of AI answers which cited BBC content introduced factual errors – incorrect factual statements, numbers and dates
- 13% of the quotes sourced from BBC articles were either altered or didn't actually exist in the cited article

Or, as I so eloquently put it a few months ago, in Complex Machinery:

A chatbot's entire job description is Just Make Some Shit Up. (It's in the name: everything that comes out of a genAI bot is, well, generated. We only apply the label "hallucinations" to the generated artifacts that we don't like.)
