In recent months, warnings have come thick and fast that artificial intelligence (AI) could represent an existential threat to the human race.
In March, for instance, tech leaders including Elon Musk and Apple co-founder Steve Wozniak issued an open letter calling for a pause in developing AI, citing “profound risks to society and humanity”. In late May, a group of expert engineers and NGOs argued that “mitigating the risk of extinction from AI should be a global priority”.
None of this has stopped organisations from rushing to adopt tools such as ChatGPT, encouraged by reports of greater efficiency. Yet as some businesses are beginning to discover, a hasty embrace of AI tools can bring problems of its own.
Within two months of its launch, OpenAI’s ChatGPT application drew in more than 100 million users. A wave of similar tools followed in its wake, from chatbots and website text generators to scheduling tools, presentation designers and even coding assistants.
But ChatGPT and other large language models have developed a reputation for some serious problems with accuracy.
And while GPT-4’s creators claim that it is 40% more likely to produce “factual responses” than its earlier iterations, problems remain. The system has no knowledge of events after September 2021, it can make errors of reasoning and it is often confidently wrong.
In one recent example, a New York lawyer representing a man suing Colombian airline Avianca submitted a set of cases as precedent. Unfortunately, he had used ChatGPT for his research, and not one of the cases was genuine. He may now be sanctioned for “fraudulent notarization”.
Similarly, many developers have experimented with using ChatGPT to generate code, and have found that this, too, is subject to errors.
“Bad code simply wastes developer time, takes up resources and, ultimately, reduces business profitability,” says Dr Leslie Kanthan, co-founder and CEO of AI code optimisation firm TurinTech. “And those in the data-science pipeline already want to spend less time refining code.”
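The errors in AI-generated code are often not syntax failures but subtle logic flaws that pass a casual review. As a hypothetical illustration (not taken from any specific ChatGPT output), consider Python’s well-known mutable-default-argument trap — exactly the kind of plausible-looking defect that can slip into generated code and quietly corrupt state:

```python
# Hypothetical example of a subtle bug of the kind that can slip
# through review in AI-generated code. The buggy version looks
# correct, but the default list is created once, at function
# definition, and is then shared by every call that omits `tags`.

def append_tag_buggy(tag, tags=[]):
    tags.append(tag)
    return tags

def append_tag_fixed(tag, tags=None):
    # Idiomatic fix: create a fresh list on each call.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags

first = append_tag_buggy("a")
second = append_tag_buggy("b")   # silently reuses the same list object
print(first)                     # ['a', 'b'] -- surprising shared state
print(second)                    # ['a', 'b']
print(append_tag_fixed("a"))     # ['a']
print(append_tag_fixed("b"))     # ['b']
```

Bugs like this compile, run and often pass simple tests, which is why generated code still needs the same review and testing discipline as code written by hand.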
Transparency and accountability
One of the principles of the EU’s new AI Act is that organisations should disclose when content has been generated by AI.
Unfortunately, this doesn’t always happen. Tech title CNET, for example, was recently discovered to have been publishing AI-generated stories and was forced to apologise for misleading readers.
The OECD recommends that AI use should be transparent and generally understood, and that users should be aware of their interactions with it and able to understand and challenge the outcomes. Communication with customers and other stakeholders therefore needs to be clear. And in order to have accurate information to share with customers, businesses will need to carry out due diligence on their AI suppliers in terms of data lineage, labelling practices and model development.
To achieve all of this, Jay Limburn, vice-president of AI and data at IBM, advises involving governance, risk and compliance staff and giving them real teeth to help ensure AI accountability. “If a company building AI tools doesn’t follow clear principles to promote ethical, responsible use of AI, or if they don’t have practices in place to live up to those principles, their technology has no place on the market,” he says.
“Ultimately this is about trust. If the AI models organisations are using aren’t trusted, the organisations themselves will not be trusted and society will not fully realise the benefits of AI.” After all, a lack of accountability in AI, Limburn adds, can result in regulatory fines, brand damage and lost customers.
Data quality and algorithmic bias
Although the risk of unintended bias in AI models has long been recognised, three-quarters of businesses using AI have still done nothing to address it, according to a survey by IBM.
Such biases can have extreme consequences too. Amazon, for example, was forced five years ago to scrap an AI recruitment tool that was found to discriminate against women. Trained on data that came almost entirely from male applicants, the system was silently downgrading CVs belonging to female candidates.
And according to Simon Bain, CEO of encrypted data analysis specialist OmniIndex, “the biggest chatbots like ChatGPT and Bard still rely on the same generative AI concepts that made Microsoft’s Tay the incredibly misguided, racist and all-round bigoted AI that was replying to teens and journalists in 2016.”
Ensuring that this doesn’t happen can be tricky. Data for training purposes needs to be representative of all groups and users should have the opportunity to challenge the output.
Meanwhile, data needs to be thoroughly labelled so that, if problems with the results are identified, it is possible to trace where the issue might lie.
The Brookings Institution think-tank, for instance, recommends the use of regulatory sandboxes to foster anti-bias experimentation, the development of a bias impact statement, inclusive design principles and cross-functional work teams.
Data protection
Earlier this year, video platform Vimeo agreed to pay $2.25m (£1.7m) to some of its users for collecting and storing their facial biometrics without their knowledge. The company had been using the data to train an AI to classify images for storage and insisted that “determining whether an area represents a human face or a volleyball does not equate to ‘facial recognition’”.
But any personal information is subject to standard data protection rules, no matter how it’s used. This includes data collected for the purposes of training an AI, which can easily become extensive.
The Information Commissioner’s Office advises organisations to carry out a data protection impact assessment; to gain the consent of data subjects; to be prepared to explain their use of personal data; and to collect no more than is necessary.
And importantly, procuring an AI system from a third party does not absolve a business from responsibility for complying with data protection laws.
Intellectual property
Businesses are increasingly falling foul of intellectual property rules in their use of AI. Most recently, for instance, image supplier Getty Images sued Stability AI for reportedly using its images to train its art-generating AI, Stable Diffusion.
Similarly, a class-action lawsuit is in progress against Microsoft, GitHub and OpenAI, alleging that they broke copyright law by using source code lifted from GitHub to train the Copilot code-generating AI system.
Theodoros Evgeniou is professor of decision sciences and technology management at business school INSEAD, and a World Economic Forum academic partner on AI. He notes that there is a range of potential IP infringements in using AI. “One extreme is, for example, if one fine-tunes a so-called foundation model, such as Dall-E or ChatGPT, on the work of someone else and then creates something like a ‘digital twin’ of that person or company.
“Then there’s also the question of what to do about the prompts given by the users – so, not the data used to train the model. A user can fine-tune the AI’s output using their own prompts, which can be – for example – the works of another individual.”