How poets and playwrights could strengthen AI security

With nearly everyone pushing generative AI into production, AI security has become a critical and complex challenge.

The emergence of GenAI has created a new front in the never-ending parade of attack surfaces, which I call inference defence.

Adversaries are deploying their own AI to launch devious attacks, designed to compromise the integrity of AI models – a recent example can be found in this Crescendo research paper. The game has fundamentally changed, meaning defenders must once again adjust their strategies and broaden their scope.

The solution lies not just in implementing new technology but also in embracing diverse human perspectives to help anticipate novel attacks.

Attackers are using clever narrative techniques that manipulate models to bypass guardrails and revealing secrets. To defend against these attacks, the cybersecurity industry would do well to recruit some unconventional talent: creative, human storytellers.

New human-centric attacks

As attackers refine their techniques, the cybersecurity community should not underestimate the value of outside-the-box thinking to play defence.

I experienced first-hand the power of creativity in attack strategies when I worked as a penetration tester. I wasn’t suited for stealthily infiltrating locations, but a colleague from El Salvador could blend in anywhere with simple work boots and coveralls. Another had a knack for social engineering, using her emotional intelligence and southern American drawl to charm her way past security. Thanks to this diversity of skills and backgrounds – what I call the constructive use of differences – we had a 96% success rate in our first year.

The new crop of AI-powered threats are about manipulation more than brute force. Attackers aim to exploit gaps in how the LLMs are trained to achieve their results. It’s the manipulation of training data that can cause models to spew undesirable or exploitable output, so AI security starts with building strong protections for that data.

More subtly, an AI might be swayed to promote a certain narrative or deliver inaccurate financial advice, for instance. It still functions, but its output is compromised. This isn’t a perimeter breach; it’s a corruption of the AI’s core knowledge, exploiting our natural tendency to place too much trust in AI.

In such cases, we don’t know that we’re engaging with a compromised model. But nation states or companies that want to steer user behaviour have billions of green-backed reasons to manipulate outputs in their favour.

These concerns however go beyond training data. New jailbreak techniques bypass safeguards by exploiting the conversational context of AI models. Attackers here are playing a longer con, influencing AI systems, turn by turn, to bypass safeguards without triggering keyword filters. Interactions then subtly guide the AI towards harmful output, without obviously malicious input.

Bards, poets and playwrights in cybersecurity

Security teams will have to think differently to combat these attacks. So-called red teams, which hunt for exploits or gaps in security whether physical or digital, must think more creatively and seek out these more nuanced, human-centric vulnerabilities of the AI era.

If AI is going to put screenwriters out of work, maybe we can lure them into cybersecurity. A mastery of narrative and human psychology could really help the security community to anticipate the subtle ways an AI might be manipulated.

Creatives could help to construct scenarios that exploit not only logical flaws in AI, but also the learned biases of these systems and the ‘personalities’ that they adopt.

Engineering logic and deep technical knowledge will always play a critical role in cybersecurity. But as attacks increasingly morph into the manipulation of natural language, storytellers may better be able to spot how AI output could be twisted for fraud or other nefarious purposes.

If we’re collectively about to put the machines in charge of nearly everything, our safety may depend on human diversity and creativity.

Chuck Herrin is field CISO at F5, a security company.

Securing AI systems step by step

Expand Close

TechnologyArtificial IntelligenceCybersecurityOpinionRiskTechnology

How poets and playwrights could strengthen AI security

New human-centric attacks

Bards, poets and playwrights in cybersecurity

Securing AI systems step by step

Read this next

Want to read on?

Subscribe to our Daily Newsletter