
The British public is turning to chatbots for high-stakes decisions. Lloyds Bank estimates that more than 28 million adults used them for financial guidance in the last year.
But as adoption grows, so does scrutiny of the technology’s reliability. A study by PsyFi Money, a financial comparison and research site, found that many chatbots produced inaccurate financial advice. The research tested ChatGPT, Claude, Perplexity, Gemini and Grok, finding major discrepancies in relevance, price accuracy and adherence to regulation.
Grok, the chatbot from xAI, ranked lowest for beginner friendliness and for following local laws. ChatGPT performed best for relevance and regulation, while Claude was deemed the most beginner-friendly model.
Despite these rankings, every model showed flaws. While ChatGPT achieved the highest overall score (82/100) for clarity, relevance, and accuracy, followed by Claude and Perplexity, even the top models exhibited concerning errors, particularly regarding legal and tax nuances.
Regulatory misinformation risk
One of the most serious problems uncovered was flawed regulatory guidance. When asked to recommend the safest way for a beginner to invest in cryptocurrency in 2026, Claude incorrectly stated that Binance was registered with the Financial Conduct Authority (FCA). In fact, Binance was ordered to cease regulated activities in the UK in 2021.
This specific error underscores a broader risk. Michele Tieghi, founder of PsyFi Money, commented on the gravity of the mistake: “Recommending a non-FCA-registered exchange as the safest option for a beginner is about as concerning as it gets. Investors could be left without asset protection and exposed to market manipulation.”
Although other models were more cautious about Binance, the incident highlights how AI systems can confidently present outdated or incorrect compliance information.
A lack of real-time data also led to errors in stock and cryptocurrency prices. Gemini performed worst in this category, though all models had inaccuracies. “AI is advancing rapidly, but it has a long way to go with cryptocurrency advice,” says Tieghi. “Limitations, from inaccurate pricing to incorrectly labelled exchanges, make it unsuitable as a standalone guide.”
Weak data confidently presented
A separate study by Which? found that chatbots often appear confident despite using weak or incomplete data. This often stems from how models summarise sources; advice from a blog or forum may not be relevant to a specific user query. Research by the BBC also found that over half (51%) of news summaries generated by chatbots contained factual errors.
Enterprise leaders are already pushing for greater oversight. Coders are now being advised to double-check AI-generated work, and HR teams are being warned not to let automated screening filter out top talent.
Investors, however, seem less cautious. Many remain confident in these models, even as evidence shows they are prone to making mistakes.