Key Takeaways
Artificial intelligence chatbots are dispensing inaccurate and potentially harmful financial advice to British consumers, prompting urgent warnings from the UK's leading consumer organization.
A comprehensive investigation by Which? has revealed that popular AI tools, including ChatGPT, Microsoft Copilot, and Google Gemini, are making basic factual errors on topics ranging from tax allowances to investment regulations.
The investigation tested six AI platforms with 40 common consumer questions spanning finance, legal matters, health, and travel.
Which? researchers conducted the tests under controlled laboratory conditions in September 2025, with responses evaluated by subject matter experts across five key criteria: accuracy, relevance, clarity, usefulness, and ethical responsibility.
Andrew Laughlin, Which? tech expert, said: "Our research uncovered far too many inaccuracies and misleading statements for comfort, especially when leaning on AI for important issues. For particularly complex issues, always seek professional advice."
The results showed considerable variation in quality across different platforms. Perplexity emerged as the top performer with a score of 71%, while Meta AI came last with just 55%. ChatGPT, used by nearly half of UK AI users, scored a disappointing 64%, placing it second-to-last among the tools tested.
Google's offerings showed mixed results, with its AI Overview feature scoring 70% and the standalone Gemini chatbot receiving 69%. Microsoft's Copilot achieved 68%.
Among the most concerning findings, both ChatGPT and Copilot failed to identify a deliberate error when asked about investing a £25,000 annual ISA allowance.
The actual limit is £20,000, yet both chatbots gave advice on investing the incorrect amount without flagging the discrepancy, potentially putting users at risk of breaching HMRC rules.
Meanwhile, Gemini, Meta AI, and Perplexity correctly identified and addressed the error.
On consumer rights questions, Copilot incorrectly suggested that users are always entitled to full refunds for cancelled flights.
Multiple platforms misunderstood the voluntary nature of Ofcom's broadband speed guarantee code, leading to incorrect advice about contract cancellation rights.
The investigation also found that some tools were directing users toward premium tax refund services when discussing free HMRC tools.
These third-party services are known for charging high fees and, in some cases, engaging in questionable practices.
High trust levels despite accuracy concerns
The findings are particularly significant given the scale of AI adoption in the UK.
According to a Which? survey of 4,189 UK adults, more than half now use AI tools to search for information online. Approximately one-third consider AI more important than traditional web searching.
Trust in AI outputs remains high despite the identified shortcomings.
Nearly half of the estimated 25 million UK AI users trust the information they receive to a great or reasonable extent, with that figure rising to two-thirds among frequent users. A third of users incorrectly believe that AI exclusively draws on authoritative sources.
The reliance on AI for high-stakes decisions is also notable. One in five survey respondents said they always or often rely on AI for medical advice, while one in six uses it for financial guidance and one in ten for legal matters.
Technology companies respond to findings
The companies behind the AI tools have acknowledged the limitations of their products while defending their approaches to accuracy and user safety.
A Google spokesperson stated: "We've always been transparent about the limitations of Generative AI, and we build reminders directly into the Gemini app to prompt users to double-check information.
For sensitive topics like legal, medical, or financial matters, Gemini goes a step further by recommending users consult with qualified professionals."
Microsoft commented: "Copilot answers questions by distilling information from multiple web sources into a single response.
Answers include linked citations so users can further explore and research as they would with a traditional search.
With any AI system, we encourage people to verify the accuracy of content, and we remain committed to listening to feedback to improve our AI technologies."
An OpenAI spokesperson said, "If you're using ChatGPT to research consumer products, we recommend selecting the built-in search tool. It shows where the information comes from and gives you links so you can check for yourself. Improving accuracy is something the whole industry's working on."
Meta and Perplexity did not provide comments to Which? for the investigation.
The investigation adds to growing evidence about the limitations of AI tools for financial and legal guidance.