
AI is evolving at breakneck speed. In just a few years, it has transformed how businesses make decisions, personalise customer experiences and design new services. However, while some firms have successfully integrated AI across their workforce and workflows, others are struggling to transition from proof-of-concept experiments to production deployments.
Three main issues hold them back. Firstly, there’s the high cost of inference – the process of using a trained model to generate outputs – when running large, proprietary GenAI models and model services in production, with costs escalating as organisations scale and bring more use cases online. Secondly, there’s the complexity involved in selecting and aligning these models with organisational data, either through retrieval augmented generation (RAG), which involves combining them with external knowledge sources to provide more accurate results, or expensive fine-tuning. And lastly, there’s the challenge of deploying and managing AI across hybrid cloud environments.
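To make the RAG idea concrete, the minimal sketch below shows the retrieve-then-generate pattern in Python: the snippets most relevant to a question are pulled from a small knowledge base and prepended to the prompt before it reaches a model. The embedding function and knowledge base here are toy stand-ins for illustration, not any particular vendor's implementation.

```python
# Minimal sketch of retrieval augmented generation (RAG): retrieve the most
# relevant documents for a query, then prepend them to the model prompt.
# The "embedding" below is a toy bag-of-words stand-in for illustration only.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a real system would use a trained embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

knowledge_base = [
    "Refunds are processed within 14 days of the return being received.",
    "Premium customers are entitled to free express shipping.",
]

def rag_prompt(question: str, top_k: int = 1) -> str:
    q = embed(question)
    ranked = sorted(knowledge_base, key=lambda d: cosine(q, embed(d)), reverse=True)
    context = "\n".join(ranked[:top_k])
    # The augmented prompt grounds the model's answer in the retrieved facts.
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(rag_prompt("How long do refunds take?"))
```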
“It can be difficult to deploy AI where you need to – but at the same time businesses need to embrace that kind of flexibility,” says Martin Isaksson, go-to-market lead in the AI business unit at Red Hat, an enterprise open-source software provider.
The speed of innovation today and complexity involved in deploying AI at scale are challenges that are best addressed collaboratively. Businesses must harness the unique expertise of a broad ecosystem of partners, from hardware and software vendors to cloud hyperscalers and system integrators, to deploy AI effectively across all the environments where it’s needed. And this requires a consistent, open platform for co-innovation.
“That sort of foundation is incredibly important, because no single vendor can provide every tool that’s needed,” says Isaksson. “A key benefit of working with open source is that you’re always operating at the same pace as innovation, right where it happens.”
This approach underpins Red Hat’s AI portfolio. It includes Red Hat Enterprise Linux AI (RHEL AI), a foundation model platform for developing, testing and deploying large language models (LLMs), and Red Hat OpenShift AI, a platform for managing the entire lifecycle of AI and machine learning models, from development and training to deployment and monitoring, across hybrid cloud environments.
Red Hat’s partner ecosystem also includes global system integrators such as Accenture and Wipro, hardware providers including AMD and Nvidia, and cloud specialists such as IBM and Google. By orchestrating value across them, Red Hat enables businesses to accelerate AI initiatives without becoming locked into a single vendor.
Any model, any accelerator, any cloud
With so many LLMs, inference server settings and accelerator options available today, businesses need an easy way to navigate them and ensure that tradeoffs between performance, accuracy and cost meet their needs. “The ability to easily switch between different models is of great importance,” says Isaksson. “We have a pre-selected model catalogue with optimised models, so they’re faster and cheaper to run.”
These curated and validated models are available on Hugging Face, an open-source community for co-innovating models, datasets and applications. Deploying them helps businesses to reduce their dependence on proprietary LLM providers, whose solutions often include ‘black box’ algorithms and training data. “We believe you should be in control of the infrastructure, the data, the model, how it works – and in the end the whole application,” says Isaksson.
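As a rough illustration of how low the switching cost can be, the sketch below loads a text-generation model from Hugging Face through the transformers pipeline API and swaps models simply by changing one identifier. The model id is a placeholder assumption, not a reference to a specific entry in Red Hat’s catalogue.

```python
# Sketch of switching between Hugging Face models by changing a single identifier,
# using the transformers pipeline API. The model id is an illustrative placeholder.
from transformers import pipeline

def build_generator(model_id: str):
    # Downloads the model on first use and returns a ready-to-call
    # text-generation pipeline.
    return pipeline("text-generation", model=model_id)

generator = build_generator("ibm-granite/granite-3.1-8b-instruct")  # swap the id to change models
result = generator("Summarise our refund policy in one sentence:", max_new_tokens=60)
print(result[0]["generated_text"])
```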
Small language models (SLMs), a subset of LLMs that can quickly and easily be customised with enterprise data for specific tasks, are another key area of focus. Together with IBM, Red Hat co-created InstructLab, an open-source project designed to lower the barriers to customisation by enabling domain experts, not just data scientists, to fine-tune SLMs using their own knowledge and data. In fact, when paired with RAG, the right SLM could even outperform a proprietary LLM.
Faster, more efficient inference has also emerged as an important element of successful AI strategies. Red Hat AI Inference Server optimises model inference across the hybrid cloud to drive down costs. Built on the open-source vLLM project, it can support any GenAI model, on any AI accelerator, in any cloud environment.
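Because Red Hat AI Inference Server builds on vLLM, the serving pattern looks roughly like the sketch below, which uses the upstream vLLM Python API to load a Hugging Face model and generate a completion. The model id and sampling settings are illustrative assumptions, not product defaults.

```python
# Sketch of the serving pattern vLLM exposes: load a model once, then generate
# completions for batches of prompts. Model id and parameters are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="ibm-granite/granite-3.1-8b-instruct")  # any Hugging Face model id
params = SamplingParams(temperature=0.2, max_tokens=64)

outputs = llm.generate(["Explain hybrid cloud in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```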
Scale and trust
At DenizBank, a private bank based in Turkey, for instance, Red Hat OpenShift AI has helped to reduce model development time, freeing the bank’s data-science teams to create new business value instead of managing infrastructure. Integration with hardware-accelerator dashboards has helped to optimise GPU use, with the platform automatically scaling the GPU slices a model has access to as needed. This allows more workloads to run simultaneously without additional GPU hardware, maximising the return on the bank’s existing hardware investments and improving efficiency.
Such AI solutions can help businesses to optimise the performance of AI deployments across a range of hardware configurations. “If your accelerator infrastructure is fragmented, with resources scattered in different places – some in the cloud, some on-prem – with one platform you can virtually pool all these resources and optimise them,” Isaksson explains.
Open-source co-innovation also enables businesses to stay on the right side of rapidly evolving AI governance and security requirements. Use of proprietary LLMs often raises concerns about security, privacy and safety, while uncertainty around the training data used and the accuracy of responses can increase legal risks for businesses. Open-source projects, on the other hand, provide the transparency needed to identify bias and privacy issues before they become problematic. This level of transparency is critical for managing the significant legal and reputational risks associated with enterprise AI.
“That’s really important,” says Isaksson. “This is why we’re supporting TrustyAI, which is an open-source responsible-AI toolkit that aims to solve AI’s well-documented problems with bias.”
The TrustyAI community maintains several responsible-AI projects, covering model explainability, model monitoring and responsible model serving. Red Hat engineers in the community recently developed safeguards that help ensure LLMs behave ethically, safely and within organisational or regulatory boundaries, making them more viable for high-stakes deployments. The company also plays an active role in Llama Stack, Meta’s open-source framework for building GenAI applications, and supports Anthropic’s Model Context Protocol (MCP), which standardises how AI agents interact with applications and data sources.
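As a rough illustration of what MCP standardises, the sketch below defines a tiny MCP server exposing one tool, assuming the official MCP Python SDK’s FastMCP helper. The tool itself is hypothetical and is not part of any Red Hat or Anthropic product.

```python
# Sketch of a Model Context Protocol (MCP) server, assuming the MCP Python SDK's
# FastMCP helper. The tool below is a hypothetical example for illustration.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def account_balance(customer_id: str) -> str:
    """Return a stubbed account balance for the given customer."""
    return f"Customer {customer_id} has a balance of 100.00 EUR"

if __name__ == "__main__":
    # Runs the server over stdio so an MCP-capable agent can discover and
    # call the tool in a standardised way.
    mcp.run()
```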
Ultimately, businesses need innovative solutions that help them overcome AI’s cost and complexity barriers. With an open, hybrid-cloud platform that unifies and orchestrates this rich ecosystem of partners, they can turn experiments into scaled solutions, deploy AI successfully across hybrid cloud environments and retain control over their data and costs. In other words, they can ensure that their AI systems work on their terms.
For more information, please visit redhat.com
