Why enterprises need a flexible, scalable foundation for AI

Interest and investment in AI are increasing as the technology becomes more accessible for a wide range of use cases. According to a survey by McKinsey & Company, most companies (92%) plan to accelerate their AI expenditure in the next three years. However, only 1% of leaders consider their organisations to be ‘mature’ in their deployment of the technology.

As businesses move beyond the experimentation phase in their digital transformation, they will increasingly look to ‘enterprise AI’ to develop business-specific workflows that can enhance employee efficiency and streamline processes.

The systems underpinning enterprise AI are fundamentally different from the non-proprietary, consumer solutions that many businesses have adopted following the boom of products such as ChatGPT, explains Vik Malyala. He is managing director and president of EMEA and senior vice-president of technology and AI at Supermicro, a server-technology provider.

“Consumer AI serves generalised tasks at an individual level,” Malyala says. “Enterprise AI is designed to address specific knowledge and workloads within an organisation. That can support everything from operations and customer service to supply chain management and real-time decision-making.”

Enterprise AI infrastructures enable businesses to build agentic solutions in their workflows. Systems across every function of the business can act with significant autonomy, reducing the need for human intervention. While firms’ AI strategies may differ considerably – some may train their own solutions, while others rely on fine-tuning existing models – the benefits of translating raw data into outcomes faster and more effectively can help organisations across all sectors.

“Organisations that advance from foundational models to building towards real-world deployment can reshape their day-to-day operations. Agentic AI can autonomously complete tasks, interact with users and adapt based on context. That promises to streamline operations, personalise customer experience or support predictive decision-making,” says Ray Pang, senior vice-president of technology and business enablement at Supermicro.

“AI is becoming a core driver of enterprise transformation. It will be embedded into every facet of the enterprise. Almost every enterprise-level organisation is turning into an AI company. That transformation will be key in achieving long-term strategic goals,” adds Malyala.

These outcomes are achievable with non-proprietary models, but building agentic AI capabilities in this way carries risk. Safeguards for data security and privacy are often inadequate and customisation limited because the technology has been built for broad uses. Governance is also unpredictable, as open models require frequent updates.

As such, the demand for localised, on-premise infrastructure accelerated by graphics processing units (GPUs) is growing, particularly in industries where privacy, governance and compliance are critical. Organisations are looking for hardware solutions that can make integration scalable and accessible, reducing complexity and barriers to deployment while keeping costs as low as possible.

Scaling out enterprise AI infrastructure

As they iterate their solutions, enterprises also need adaptable hardware. Core to this technology are GPUs that use the peripheral component interconnect express (PCIe) interface, a high-speed standard in servers that enables the high-bandwidth, low-latency transfer of data between central processing units (CPUs), GPUs, memory and storage. The standardised nature of PCIe ensures compatibility and flexibility, allowing IT teams to optimise performance across a range of workloads. For scaling out beyond a single server, technologies such as InfiniBand and high-speed ethernet provide the external connectivity needed to link servers and data centres efficiently.

“Many businesses are exploring a modular approach,” says Pang. “PCIe GPUs support enterprise workloads without the need for complex, highly specialised components that can be integrated into existing infrastructure. That enables organisations to adapt and expand their computing needs as they evolve. Future-proof AI infrastructure means enterprises can experiment today and expand tomorrow – without a disruptive overhaul.”

The development of enterprise AI in existing IT infrastructure requires careful planning. Unlike foundational AI, which often demands purpose-built facilities owing to its power, space, thermal and networking requirements, enterprise AI can be integrated into established systems. It is not a standalone environment but an added layer that must align with current resources to ensure scalability and efficiency without disrupting operations.

Pang says building proprietary enterprise AI therefore requires a holistic approach. Beyond modular servers and storage systems that can grow as AI workloads mature, businesses must also consider factors such as latency, data-pipeline readiness, and the sustainability and efficiency of power and cooling.

What might an ideal product mix look like? For a mid-sized company, which Pang defines as one with roughly 1,000 employees and an annual recurring revenue of $250m, a good starting point would be an Nvidia-certified GPU server. These systems are validated to work seamlessly with Nvidia AI enterprise software and Nvidia networking, ensuring a reliable, full-stack solution for accelerated AI adoption. For example, Supermicro’s servers containing either the Nvidia RTX Pro 6000 Blackwell Server Edition or Nvidia B200 GPUs enable both model fine-tuning and inference – the application of pre-trained models to unseen data to generate predictions – at scale.

In addition to GPU servers, companies need AI-optimised storage solutions to enable retrieval-augmented generation (RAG), which powers real-time, context-aware responses. To meet this need, Supermicro works with leading storage providers such as Vast, Weka and Nutanix, ensuring high-performance data access. High-speed ethernet switches provide low-latency connections between nodes and inference endpoints, allowing for fast transmission of data between GPU servers and edge devices. Finally, this hardware should be paired with a software stack that includes AI orchestrators – systems that manage and coordinate multiple tools, models and data pipelines – along with advanced RAG capabilities.
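To make the RAG idea concrete, here is a minimal sketch of the retrieval step: find the stored passage most relevant to a question and place it in the prompt sent to a model. It is an illustration only. The function names, the sample documents and the simple word-overlap scoring are all assumptions made for this example; a production system would typically use learned embeddings and a vector store on the storage layer described above.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 1) -> str:
    """Prepend the retrieved context to the question for a language model."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical internal knowledge-base passages, invented for this sketch.
DOCS = [
    "Refunds are processed within 14 days of receiving the returned item.",
    "Standard delivery takes three to five working days.",
    "Support is available by phone on weekdays.",
]

if __name__ == "__main__":
    print(build_prompt("How long do refunds take?", DOCS))
```

The point of the pattern is that the model answers from the organisation's own data at query time, which is why fast storage access matters for response latency.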

“That provides a systematic, automated solution to redeploy and refine models that maintain accuracy targets while lowering resource demands,” says Malyala. “It also helps ease the risk in maintenance, customisation and governance. RAG and agent-advanced features will provide additional security and accuracy.”

By combining the right infrastructure with an optimised software stack, enterprises can expand their capabilities, move closer to true agentic AI and rely less on generic, non-proprietary solutions.

Malyala adds: “It balances compute intensity, storage performance and energy efficiency. It can handle millions of customer interactions per month with reliability and compliance. The enterprise has the ability to own a proprietary customer service AI model, reducing dependence on generic, third-party SaaS providers.”

There are some challenges in implementing these systems, however. For example, many on-premise data centres are designed to deliver a maximum of only 20kW of power per rack, whereas racks in AI data centres can require up to 200kW each. Because GPU servers draw far more power than conventional servers, fully populated systems can quickly absorb that capacity.
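The rack-power constraint comes down to simple arithmetic, sketched below. The 20kW and 200kW budgets are the figures quoted above; the 6kW draw per GPU server and the 80% usable-headroom factor are hypothetical values chosen only to illustrate the calculation, not vendor specifications.

```python
import math


def servers_per_rack(
    rack_budget_kw: float,
    server_draw_kw: float,
    headroom: float = 0.8,  # assumed share of the budget that is safely usable
) -> int:
    """Number of whole servers that fit within a rack's usable power budget."""
    if server_draw_kw <= 0:
        raise ValueError("server_draw_kw must be positive")
    usable_kw = rack_budget_kw * headroom
    return math.floor(usable_kw / server_draw_kw)


if __name__ == "__main__":
    ASSUMED_SERVER_DRAW_KW = 6.0  # hypothetical GPU server, for illustration
    for budget_kw in (20, 200):
        count = servers_per_rack(budget_kw, ASSUMED_SERVER_DRAW_KW)
        print(f"{budget_kw} kW rack: {count} servers")
```

Under these assumptions a 20kW rack holds two such servers while a 200kW rack holds 26, which shows how quickly a conventional rack's capacity is used up.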

Cooling can also be a problem. Most enterprise data centres are air-cooled, which limits the types and density of GPUs that can be deployed. Space constraints also often rule out major modifications. Enterprises must therefore adopt AI solutions that integrate into existing infrastructures. Supermicro offers GPU-accelerated systems in a range of air-cooled form factors – including 5U, 4U and 2U – that fit into standard data centre racks. For organisations planning larger, purpose-built infrastructures, liquid cooling is an emerging option that can reduce operational costs and increase efficiency, though it typically requires new rack designs and dedicated facilities to support higher-power CPUs and GPUs.

As enterprises look to mature in their deployment of AI, they will increasingly need to rely on the counsel of hardware providers that can deliver infrastructure capable of accelerating time-to-result and time-to-revenue for their proprietary agentic AI. Supermicro works closely with partners such as Nvidia to bring new platforms to market in weeks rather than months, adapting quickly to emerging hardware and unique customer needs.

“Through our global manufacturing presence and first-in-market support and system validation for new Nvidia GPUs, Supermicro helps organisations deploy enterprise AI infrastructure faster, speeding up our clients’ adoption of AI and bringing faster return on investment,” says Malyala.

Enterprise AI in action

From finance to telecoms, the uses for enterprise AI are rapidly expanding. Enterprise AI is moving from theory to practice, delivering measurable outcomes across industries. By embedding agentic AI into core operations, organisations can automate complex workflows, improve efficiency and unlock new opportunities for growth. Below are some of the most valuable uses for enterprise AI across key sectors.

Financial services

Banks and financial institutions are required to carry out extensive know-your-customer (KYC) and anti-money laundering activities. Yet these functions achieve staggeringly poor returns on their investments. By automating client onboarding, which triggers frequent KYC checks, and tracking irregular activities through agentic AI, firms can significantly reduce fraudulent activity.

Agentic AI can also streamline credit assessments for new customers. Previously, such checks would rely on static data. But by using contemporaneous transaction data and economic indicators, lenders can continuously assess the risk on their balance sheets, creating dynamic lending models that can adjust in real time.

Manufacturing

According to research by IDS-INDATA, UK and European manufacturers are projected to lose more than £80bn to downtime in 2025. By collecting data from equipment sensors and production lines to identify patterns, agentic AI tools can detect early signs of wear and tear, predicting impending failures or disruption to production lines.
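As a rough illustration of how sensor data can reveal early warning signs, the sketch below flags readings that deviate sharply from their recent baseline. This is a simple rolling z-score check, chosen for brevity; it is not a method described by the companies quoted here, and the function name, window size, threshold and vibration readings are all hypothetical.

```python
from statistics import mean, stdev


def flag_drift(
    readings: list[float], window: int = 5, threshold: float = 3.0
) -> list[int]:
    """Return indices of readings that sit more than `threshold` standard
    deviations away from the mean of the preceding `window` readings."""
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical vibration readings from one machine, invented for this sketch.
vibration = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0, 0.9, 1.0, 2.5, 1.0]

if __name__ == "__main__":
    print(flag_drift(vibration))  # prints [8], the index of the 2.5 spike
```

An agentic system would go a step further than flagging: it might schedule an inspection or reroute production once such a signal appears.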

Disruption to supply chains also presents a significant risk. Agentic AI can predict supplier delays by tracking geopolitical or climate disruption, negotiate contracts with alternate vendors if there is a shortage of goods and balance production across factories to fulfil regional demand.

Intelligent models can also help manufacturers create more optimised designs of their final products, reducing operational and manufacturing costs, which in turn improves margins.

Retail

Retailers have long strived for a true omnichannel experience. For customers, that is underpinned by the synchronisation of data across in-store and digital platforms. Agentic AI can support that aim by giving retailers real-time inventory visibility across stores, unifying pricing strategy and delivering consistent messaging on promotions on multiple channels. The ability to harness customer data can also drive marketing personalisation and higher sales, while reducing costs.

In-store automation is also helping retailers reduce labour costs, while improving customer experiences and increasing operational accuracy. Autonomous checkout systems, such as those used in Amazon Go stores, are powered by agentic AI. Using computer vision, these systems track customer movements, identify products and automatically charge customers as they leave the store. Walmart has deployed shelf-scanning robots in its US stores to identify out-of-stock items, price discrepancies and misplaced products, providing accurate data on in-store behaviour and increasing efficiency.

Telecommunications

As global data traffic increases – Infosys estimates 300 exabytes of data will be transmitted per month by 2027 – the need for reliable and low-latency networks will become even more important. Agentic AI can help telcos reduce downtime and improve performance by enabling self-healing and autonomous network management.

Customer-service interactions can also be significantly automated, improving customer experience and reducing labour costs. Multiple agents, such as billing and communication, can collaborate, delivering personalised responses to customer queries.

Agentic enterprise AI

The ability to draw from previously disparate data centres at speed will be crucial to improving customer experience, as businesses look to reduce cost and deliver highly personalised support and campaigns.

Agentic AI can also help to streamline internal processes in universal business operations. For example, in human resources, AI-powered technology can help to automate and personalise onboarding processes, deliver learning and development programmes, and improve collaboration across teams.

For more information about how to scale the adoption of enterprise AI, visit nvidia.com
