
The race to build out global AI infrastructure has produced a near-weekly stream of announcements of large-scale data-centre investments across the US, Europe and Asia.
The spending spree has sparked both confidence and concern among market observers. AI infrastructure remains a lucrative target for investors, and its build-out could drive a digital transformation that benefits businesses across industries. But compute resources are becoming costlier, enterprise demand for capacity is outstripping supply and Nvidia holds a near-monopoly in key segments of the AI-hardware market.
With its GPUs dominating large-scale model training and its CUDA (Compute Unified Device Architecture) platform the default for enterprise-AI development, the company has the biggest cloud providers competing for the same limited supply of its high-end chips.
Nvidia’s own financial reports underline the firm’s clout in the global AI-compute market. In Q3, the company generated $57bn (£43.5bn) in total revenue and nearly $51.2bn (£39bn) from data-centre operations alone, a year-on-year jump of more than 60%.
Independent market trackers put Nvidia’s share of the AI-chip and data-centre market at well over half. TechInsights estimates that Nvidia holds about 65% of the data-centre AI-chip market by revenue, while IoT Analytics puts its share of data-centre GPUs at more than 90%.
Taken together, the numbers point to a hardware pipeline controlled by one supplier that is capturing most of the spend while leaving everyone else, from hyperscalers to UK mid-market vendors, fighting over capacity that cannot keep up with the pace of AI adoption globally.
The cost of doing business in the age of AI
The next immediate pain point for UK companies is cost. Over the past two years, the price of high-end GPUs has risen sharply. Nvidia’s H100 chips have climbed to well over £30,000 per unit, and the server racks that house multiple GPUs now run into the hundreds of thousands of pounds.
Hyperscale cloud pricing has followed a similar trajectory. Renting an H100 from AWS, Azure or Google Cloud typically costs between £75 and £95 per hour, and many customers can access that capacity only by committing to long-term reservations.
These budgetary challenges have slowed, if not outright stymied, the ability of UK firms to go all in on AI adoption. Training or fine-tuning even a modest model becomes a significant line item, and ongoing inference workloads can cause cloud bills to spike.
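To put that in concrete terms, a back-of-the-envelope calculation shows how quickly rental charges accumulate. The figures below are illustrative assumptions only, built around the hourly rates quoted above rather than any specific provider’s quote.

```python
# Rough monthly cost of renting GPU capacity for fine-tuning experiments.
# All numbers are assumptions for illustration, not vendor pricing.
hourly_rate_gbp = 85      # midpoint of the £75-£95 per hour range cited above
hours_per_run = 72        # assumed length of one fine-tuning run
runs_per_month = 4        # assumed experimentation cadence

monthly_training_cost = hourly_rate_gbp * hours_per_run * runs_per_month
print(f"Estimated monthly training spend: £{monthly_training_cost:,.0f}")
# Estimated monthly training spend: £24,480
```

Even at a modest cadence, the training line alone lands in the tens of thousands of pounds per month, before any inference traffic is served.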
“The hype of 2023 has ignored several obstacles that will slow progress in the short term,” noted Ben Wood, chief analyst at CCS Insight, at the launch of the firm’s Predictions for 2024 and Beyond report. “The cost of deployment is a prohibitive factor for many organisations and developers. Additionally, future regulation and the social and commercial risks of deploying generative AI in certain scenarios result in a period of evaluation prior to rollout.”
Consequently, small firms are being priced out of experimentation altogether, while large enterprises are being forced to rethink their computing stack, identifying which workloads genuinely require frontier GPUs and which can get by with cheaper or smaller models.
The economic pressure is slowing implementation timelines and pushing organisations to reconsider how deeply they can integrate AI into their operations.
Bottleneck exit strategies
The good news is that enterprises are not locked into a single implementation path. To loosen their dependence on the chipmaker, businesses must be strategic about where they use its GPUs and where more moderately priced compute alternatives will do.
The trend is toward smaller, domain-specific models, which require far less compute power to train and operate. Open-source options such as Llama 3 8B, Mistral 7B and Microsoft’s Phi-3 can handle summarisation, search and customer support, for instance, without high-end GPU clusters. For many UK organisations, these models offer a practical route to building useful AI tools while keeping training and inference costs under control.
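As a rough illustration of how lightweight such a deployment can be, the sketch below loads a small open-weight model for summarisation using the Hugging Face transformers library. The model identifier, prompt and hardware settings are assumptions rather than recommendations; Llama or Mistral checkpoints work the same way.

```python
# A minimal sketch of running a small open-weight model for summarisation.
# Assumes the transformers and accelerate packages are installed; the model
# ID below is one plausible choice, not an endorsement.
from transformers import pipeline

summariser = pipeline(
    "text-generation",
    model="microsoft/Phi-3-mini-4k-instruct",  # assumed model identifier
    device_map="auto",  # uses a local GPU if present, otherwise the CPU
)

document = "...your source text here..."
prompt = f"Summarise the following for a customer-support agent:\n\n{document}"

output = summariser(prompt, max_new_tokens=200, do_sample=False)
print(output[0]["generated_text"])
```

Nothing in the sketch assumes a frontier GPU cluster; a single workstation-class accelerator, or patience on a CPU, is enough for evaluation work.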
On-device and edge-based AI are growing in popularity, as they take workloads off the cloud entirely. Many common AI-enhanced business tasks can be run directly on laptops and phones, using hardware from Apple, Qualcomm and Google. This effectively turns everyday devices into low-cost compute engines.
UK developers and global research groups are also pushing to build open-source model ecosystems that reduce dependence on Nvidia’s software stack. Frameworks that target multiple hardware backends are gaining traction, and several UK consultancies now offer deployment strategies that mix CPUs, alternative accelerators and smaller models to reduce reliance on a single chip family.
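One simple expression of that hardware-agnostic approach is device-selection code that falls back gracefully when an Nvidia GPU is not available. The PyTorch sketch below is illustrative only, showing the general pattern rather than any particular vendor’s tooling.

```python
# A minimal sketch of hardware-agnostic device selection in PyTorch,
# so the same code runs on Nvidia GPUs, Apple silicon or plain CPUs.
import torch

def pick_device() -> torch.device:
    if torch.cuda.is_available():            # Nvidia (or ROCm-built) GPUs
        return torch.device("cuda")
    if torch.backends.mps.is_available():    # Apple silicon
        return torch.device("mps")
    return torch.device("cpu")               # portable fallback

device = pick_device()
model = torch.nn.Linear(512, 512).to(device)
x = torch.randn(8, 512, device=device)
print(device, model(x).shape)
```

Writing workloads this way keeps the door open to whichever accelerators are actually available and affordable at deployment time.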
While these measures do not eliminate the need for high-end GPUs, they give UK businesses room to scale up AI adoption without tying their entire strategy to Nvidia’s supply constraints, and space to experiment, ship products and manage costs as the market slowly diversifies.