
The surge in AI adoption over the past five years has many governments and market observers worried about the security risks of these new systems. New evaluations by the UK’s AI Security Institute (AISI) suggest that even the most advanced systems can be misused, challenging assumptions about vendor trust and model safety.
AISI (formerly the AI Safety Institute) was established by the UK government in 2024 to investigate the capabilities of frontier AI models as well as the risks they pose. To this end, the organisation has tested several models, evaluating their performance in technical tasks, such as biological research and software development, and assessing their potential for misuse. It has thus far published performance tests on two popular models: OpenAI o1 and Claude 3.5 Sonnet.
OpenAI’s first reasoning model, o1, performed comparably overall to the startup’s internal reference model, GPT-4o, according to AISI. The institute observed similar inherent cybersecurity risks in both models, although o1 suffers from several unique reliability and tooling problems. For general reasoning and coding, o1 underperformed against GPT-4o, but the two models were near equals in technical areas such as biological research.
Claude 3.5 Sonnet far outperformed other models in biological research, as well as in engineering and reasoning tasks. However, its guardrails are less robust, and AISI identified several ways to ‘jailbreak’ the system to elicit dangerous responses.
Although it has published only two detailed evaluations, AISI has evaluated 22 anonymised models, with 1.8 million total attempts to break safeguards and perform illicit tasks. Every model tested by AISI was vulnerable to jailbreaks, with the organisation identifying more than 62,000 harmful behaviours.
Buyers beware
For firms in regulated industries, such as finance, healthcare, legal services and the public sector, AISI’s findings have raised the stakes for AI governance and security. These organisations can no longer delegate security to their ‘trusted’ vendors. They must assess these systems as they would any other high-risk infrastructure, with capability assessments, stress tests and red-teaming.
Even before the AISI tests, some regulators and other public sector organisations, such as the Financial Conduct Authority and the NHS, had published guidance on AI deployment in their respective industries, but their guidelines will soon be updated in light of the AISI findings.
Businesses in all industries, however, would do well to consider the findings when planning an AI strategy, selecting a vendor or integrating the technology. Enterprise-targeted fraud is more widespread than ever, and attackers are quickly learning how AI systems work and how to exploit them.
Regulation and standardisation inbound
Unlike the EU, which enacted the EU AI Act in 2024, the UK has no single piece of legislation to guide or restrain the use of AI. Although the AISI findings are supported by the government, the insights and any guidance they include are non-binding.
Moreover, the methods used by AISI to evaluate the models are not standardised; regulators and safety institutes in other jurisdictions use different assessments. This has led some to argue that such tests cannot serve as a basis for declaring an AI model, or the industry as a whole, safe or unsafe. OpenAI and Anthropic submitted their models for AISI testing, but repeated their objections to the lack of standardisation between the UK’s AI institute and, for instance, its US counterpart, the Center for AI Standards and Innovation.
Pressure is mounting on governments to align their evaluation frameworks. Until then, firms seeking to adopt AI must understand that safety is not guaranteed, even when using the most trusted suppliers.