BBC Study Finds AI Chatbots Still Get News Wrong 45% of the Time

Introduction

Artificial intelligence has become a daily companion for millions of people around the world. From helping us write essays to summarizing breaking news, AI assistants like ChatGPT, Microsoft Copilot, Gemini, and Perplexity are transforming how we consume information. But a new study by the BBC and the European Broadcasting Union (EBU) has revealed a major issue: these tools misrepresent the news in almost half of their answers.


Overview of the BBC and EBU Study

The BBC, in collaboration with the EBU, conducted a large-scale investigation across 18 countries and 14 languages to analyze how accurately popular AI assistants deliver news information. More than 3,000 AI-generated responses were reviewed by professional journalists to see how often they contained factual or contextual errors.

The findings were alarming: 45% of responses had at least one significant issue, 31% had serious sourcing problems such as missing or misleading citations, and 20% contained major accuracy problems, including fabricated details and outdated facts.


Key Findings: 45% Error Rate in AI Responses

The study showed that nearly half of all AI-generated news answers were wrong, incomplete, or misleading. These weren’t just small typos — some of them could actually misinform readers about real-world events or laws.

For example, when asked about surrogacy laws in Czechia, Perplexity wrongly stated that surrogacy was prohibited, when in fact it’s not regulated but also not illegal. Gemini, Google’s AI, falsely claimed that buying disposable vapes would become illegal — when in truth, the law only restricted their sale and supply.


Which AI Chatbots Were Tested

The research tested four of the most popular AI systems currently used for public information:

  • OpenAI’s ChatGPT

  • Microsoft’s Copilot

  • Google’s Gemini

  • Perplexity AI

Each was asked to answer hundreds of real-world questions about recent news stories, political updates, and legal developments.


Gemini’s Troubling Performance

Among the four tested systems, Google’s Gemini performed the worst. Reviewers found significant issues in 76% of its responses, the highest rate by far. Most of Gemini’s problems involved missing or unreliable sources, misquotations, and incorrect context in its summaries.

Reviewers noted that Gemini often relied heavily on secondary sources like Wikipedia instead of credible news outlets. It also sometimes quoted outdated or irrelevant material, giving users the impression that it was current.


ChatGPT, Copilot, and Perplexity Results

ChatGPT, Copilot, and Perplexity performed better overall, but they still made serious errors. Copilot, for example, claimed that “a vaccine trial for bird flu is underway in Oxford” — a statement based on a BBC article from 2006.

ChatGPT sometimes provided overly confident yet incorrect summaries. Perplexity occasionally fabricated quotes or misrepresented laws. While these tools often sounded authoritative, their answers could be dangerously misleading.


Real-Life Examples of AI Errors

Some examples from the study highlight just how risky AI misinformation can be:

  • Chatbots misidentified current world leaders like the Pope or the German Chancellor.

  • Perplexity misquoted laws, claiming false legal restrictions.

  • Gemini provided outdated health information from decades-old articles.

These mistakes are more than just technical errors — they can cause real-world confusion and damage public trust.


Why AI Systems Make These Mistakes

At the core of the issue lies the way large language models (LLMs) are trained. These systems learn from massive datasets that include everything from books and websites to social media posts. While this allows them to generate fluent, human-like text, it also means they absorb errors, biases, and outdated facts from those sources.

When these flawed inputs are processed, the system builds associations that might not be accurate — leading to what experts call the “poisoned corpus” problem.


Understanding the “Poisoned Corpus” Problem

The “poisoned corpus” refers to bad or unreliable data that gets mixed into the AI’s training material. When such flawed data becomes part of the model’s learning base, it can distort how the AI interprets questions and constructs answers.

Since AI models don’t have real-world understanding — only mathematical relationships between words — even a small amount of bad data can lead to major factual errors.


The Role of Embeddings in AI

LLMs like ChatGPT and Gemini rely on “embeddings,” numerical vectors that represent words so that related words sit close together. The model doesn’t “know” facts the way humans do; it calculates the most statistically likely response to a question based on its training data.

So, when an AI answers a complex question, it often mixes old, incorrect, or contradictory information together — and presents it confidently, as if it were fact.
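
To make the idea of embeddings more concrete, here is a minimal sketch in Python. The three-dimensional vectors are hand-picked for illustration and are not taken from any real model, which would learn vectors with hundreds or thousands of dimensions. The names `TOY_EMBEDDINGS`, `cosine_similarity`, and `most_similar` are hypothetical.

```python
import math

# Hypothetical toy embeddings, chosen by hand for illustration only.
TOY_EMBEDDINGS = {
    "vaccine": [0.9, 0.1, 0.0],
    "trial": [0.8, 0.3, 0.1],
    "chancellor": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, embeddings):
    """Return the other word whose vector points in the most similar direction."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    return max(scored, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    print(most_similar("vaccine", TOY_EMBEDDINGS))
```

In this toy setup, “vaccine” lands closest to “trial” because their vectors point in a similar direction. Nothing in the calculation records whether a statement linking the two is true, or whether it dates from 2006 or from last week, which is one way to see why closeness between words is not the same as factual accuracy.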


The Danger of “Confidently Wrong” Answers

One of the biggest risks of AI chatbots is their confidence. They often deliver false or misleading information in a calm, authoritative tone. This “dangerously confident” style makes users trust them even when they’re wrong.

As the BBC report states, these systems sometimes fabricate details rather than admit they don’t know the answer. The result is that users can be deceived without realizing it.


Can This Problem Be Fixed?

While improvements are being made, fixing this problem isn’t easy. AI models are continuously retrained and updated, but as long as they rely on vast public datasets, they’ll always face the risk of errors.

The BBC noted that between its two study phases, AI accuracy improved from 49% correct to around 73%. That’s progress — but still far from perfect, especially for something as important as news reporting.


BBC and EBU’s News Integrity Toolkit

In response to their findings, the BBC and EBU created the “News Integrity in AI Assistants Toolkit.” This resource defines what a good AI-generated news answer should look like and provides a framework for improving accuracy and transparency.

The toolkit focuses on four main principles:

  1. Accuracy — ensuring facts are correct and up-to-date.

  2. Context — providing the background that gives meaning to stories.

  3. Distinction — separating fact from opinion.

  4. Neutrality — avoiding bias, emotion, or editorial judgment.


Importance of Trusted AI Corpora

The report stresses that organizations building AI tools should develop trusted data sets, also known as “corpora.” These must be regularly audited and updated to remove outdated or false information.

Internal AI systems — like employee chatbots or customer support bots — should have designated owners for each data source to ensure accuracy. Without such maintenance, even corporate AI tools can spread outdated or misleading advice.
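
As a rough illustration of what that kind of maintenance could look like, here is a minimal Python sketch. It assumes each data source feeding an internal chatbot is tracked with an owner and a last-audit date. The `DataSource` record, the `sources_needing_review` helper, the 180-day threshold, and the sample entries are all hypothetical and not taken from the BBC and EBU report.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DataSource:
    """One document collection feeding an internal AI tool (hypothetical schema)."""
    name: str
    owner: Optional[str]  # None means nobody is responsible for this source
    last_audited: date


def sources_needing_review(
    sources: List[DataSource], today: date, max_age_days: int = 180
) -> List[str]:
    """Return names of sources that have no owner or an audit older than max_age_days."""
    flagged = []
    for source in sources:
        age_days = (today - source.last_audited).days
        if source.owner is None or age_days > max_age_days:
            flagged.append(source.name)
    return flagged


# Illustrative sample data, invented for this sketch.
SAMPLE_SOURCES = [
    DataSource("hr-policy-handbook", "hr-team", date(2025, 9, 1)),
    DataSource("support-faq", None, date(2025, 10, 1)),
    DataSource("pricing-sheet", "sales-ops", date(2024, 1, 15)),
]

if __name__ == "__main__":
    print(sources_needing_review(SAMPLE_SOURCES, today=date(2025, 11, 1)))
```

With the sample data, the FAQ is flagged because it has no owner and the pricing sheet because its last audit is too old. A real process would also need someone to act on the flagged list; the check only surfaces candidates for review.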


The Role of AI in Newsrooms

Many journalists now use AI tools to summarize long reports, generate ideas, or translate news quickly. However, the BBC study shows that AI should support journalism, not replace it.

Human editors must remain the gatekeepers of truth, verifying AI outputs before publication. Otherwise, newsrooms risk spreading misinformation under the guise of “AI efficiency.”


What AI Companies Should Do Next

AI developers like OpenAI, Microsoft, and Google must prioritize factual accuracy and transparent sourcing over speed or marketing goals. This means investing in better retrieval systems, clear citations, and ongoing quality audits.

As the BBC and EBU warned, the future of public trust in AI depends on whether these companies are willing to hold their systems accountable.


How Users Can Stay Safe

For everyday users, the best defense is critical thinking. Never assume an AI response is 100% accurate — especially when it comes to news or legal matters. Always double-check important facts using trusted sources like official news outlets or government websites.

AI can be an amazing tool, but it should be treated as a starting point for research, not the final word.


Impact on Journalism and Public Trust

When nearly half of AI-generated news answers are wrong, the consequences are serious. Public trust in information — already fragile in the digital age — can crumble further.

As Jean Philip De Tender of the EBU put it, “When people don’t know what to trust, they end up trusting nothing at all.” This erosion of trust threatens democracy and informed public debate.


The Future of AI and Information Integrity

Despite the problems, AI isn’t going away. The goal now is to improve it. With proper oversight, trusted data, and collaboration between news organizations and AI companies, we can build systems that inform — not mislead — the public.

The next generation of AI must prioritize truth, transparency, and accountability. Only then will these systems earn the trust of the people they serve.


Conclusion

The BBC and EBU study is a wake-up call for everyone — from AI developers to journalists to everyday users. AI chatbots are impressive, but they’re not infallible. The solution lies in human oversight, better data, and responsible use.

AI may be powerful, but human judgment remains the key to truth.


FAQs

1. What was the main finding of the BBC study?
The study found that around 45% of AI-generated news answers contained errors or misleading information.

2. Which AI performed the worst in the study?
Google’s Gemini had the highest error rate, with 76% of its responses showing significant issues.

3. Why do AI chatbots make mistakes?
They rely on vast public datasets that can include outdated, biased, or incorrect information, leading to flawed results.

4. Can AI systems be trusted for news?
Not entirely. While they can summarize news quickly, their accuracy still lags behind human journalism.

5. How can users verify AI-generated information?
Always cross-check with trusted news outlets, official sources, or fact-checking organizations before relying on AI responses.

Zeeshan Ali Shah is a professional blog writer at AliTech Solutions and Realancer, known for crafting engaging and informative content. He holds a degree from the University of Sindh, where he honed his expertise in technology. With a keen eye for detail and a passion for staying up to date on the latest tech trends, Zeeshan provides valuable insights to his readers. His expertise in the tech industry makes him a sought-after writer, and his work at AliTech Solutions has earned him a reputation as a trusted and knowledgeable voice in the field.
