Organisations often struggle to balance the development of new products and features with maintaining the reliability of their systems. Heightened demand for digital services during the coronavirus pandemic has emphasised the need for system observability and new operational cultures to enable reliable growth.
Businesses are brimming with ideas, but implementing them effectively often proves a challenge. In a digital context, creating new experiences, services or products requires rewriting supporting software code at scale or establishing entirely new systems, and both options carry a significant risk of introducing operational problems.
The same consequences occur when organisations grow quickly. Systems are put under major strain, given the large data loads and new pressures involved. This leads technology teams to make extensive changes to essential systems, which can cause additional system latencies and dropouts.
“Changes such as these can lead to really problematic repercussions, introducing the danger of new bugs, bottlenecks and outages,” explains Steve Hurn, executive vice president and general manager, Europe, Middle East and Africa, at software company New Relic.
Internal dynamics within organisations often serve to worsen the difficulties, with different teams inadvertently pulling in opposing directions. “In many companies, developers change code as needed, test it and push it out to operations teams. By contrast, the operations personnel are entirely focused on validation and continuity, and may not even understand all the thinking behind the new code. Driving in different directions like this can rapidly create complications and stop innovation or growth in their tracks,” says Hurn.
The coronavirus pandemic has greatly exacerbated many of these problems because businesses have shifted so much more of their operations onto digital channels. Highly advanced operators saw a huge surge in demand for online services, while bricks-and-mortar businesses were left with little choice but to promote and rely on web and mobile. As a result, changes to software and server setups, as well as entire technology architectures, have shot up strategic agendas.
With this increase in pressure, any system weaknesses have been quickly exposed. “It’s been a real revelation to many companies about how well or badly their systems actually run and how hard it can be to find the root causes,” notes Hurn. “Companies have suddenly been faced with a need to create better or entirely new digital customer experiences, or to handle a tripling in web traffic.”
Finding a solution
As a result, businesses are under pressure to understand where their operational and scalability weaknesses lie. Many are turning to observability, a new means of proactively understanding the full picture of system operations, including what will happen when code is changed or updated, or when new apps go live. Observability typically picks up a range of problems that would otherwise be missed, including those in complex systems that work almost seamlessly but have marginal faults; such faults tend to accumulate and lead to major outages.
Observability consists of three key elements: telemetry, contextualisation and understanding. First, telemetry agents collect data from every component of a business’s digital service. Next comes contextualisation, in which that data is enriched and correlated to build a complete picture of how the system behaves. Finally, the technology visualises the results so that organisations can understand the data, query it and act on improvements quickly and proactively.
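For readers who want to see the three elements in concrete terms, the following is a minimal Python sketch of telemetry, context and understanding working together. It is an illustration only and does not represent New Relic’s software or any vendor’s API: the `Event` record, the `OWNERS` lookup, the service names and the sample figures are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    """One telemetry data point reported by a (hypothetical) agent."""
    service: str
    latency_ms: float
    ok: bool


# Hypothetical context: which team owns which service.
OWNERS = {"checkout": "payments", "search": "discovery"}


def enrich(event):
    """Contextualisation: attach ownership metadata to a raw event."""
    return {
        "service": event.service,
        "latency_ms": event.latency_ms,
        "ok": event.ok,
        "team": OWNERS.get(event.service, "unknown"),
    }


def summarise(records):
    """Understanding: reduce enriched events to per-service figures."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["service"]].append(record)

    summary = {}
    for service, rows in grouped.items():
        errors = sum(1 for row in rows if not row["ok"])
        summary[service] = {
            "team": rows[0]["team"],
            "requests": len(rows),
            "error_rate": errors / len(rows),
            "max_latency_ms": max(row["latency_ms"] for row in rows),
        }
    return summary


# Telemetry: made-up sample events standing in for collected data.
events = [
    Event("checkout", 120.0, True),
    Event("checkout", 900.0, False),
    Event("search", 45.0, True),
]

summary = summarise(enrich(event) for event in events)
for service, stats in summary.items():
    print(service, stats)
```

In a real deployment each stage would be handled by dedicated tooling at far greater scale; the point of the sketch is only that raw measurements become useful once they are tied to context and condensed into figures a team can query.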
The more complex the setup, the more essential it is to understand what is happening in detail and to have a reliable answer, says Hurn at New Relic, whose observability software provides detailed insights. “It’s a bit like if you’re driving a Formula 1 car as opposed to an ordinary vehicle,” he says. “To get the best out of a high-performance machine, you need to understand everything about what it’s really doing. That’s what observability allows you to do as an organisation, giving you a comprehensive view of the most relevant data, so you can solve problems and innovate at pace without sacrificing reliability.”
As businesses across industries look to make these changes and succeed in a fast-changing environment, they are also seeing an increasing need for a cultural shift to support system innovation and robustness. Such a transformation requires a focus on services and user experiences, driven by high-quality data.
Companies typically follow a maturity curve as they improve their understanding of how their systems work and where changes are most needed. Observability is usually first used by development, operations, IT and infrastructure teams, but the clear impact of the insights often leads other teams to adopt it.
“We increasingly see production, marketing and finance teams applying observability to correlate system functionality with broader business objectives, such as service or product usage and costs,” explains Hurn. “In the most advanced companies, even the chief executive accesses the dashboards for quick daily performance insights.”
Some 94 per cent of the most mature organisations already see observability as a key aspect of development, and all of them integrate end-user performance data with system performance data to understand the effects of changes, according to research by New Relic. This work is enabled by machine learning and artificial intelligence, which augment human processes by reducing noise and drawing out important conclusions that might otherwise easily be missed.
Among the organisations working with New Relic’s observability software is the Royal Society of Chemistry. The 179-year-old institution is undergoing a complex digital transformation that is influencing how chemists around the world use its vast published resources and databases.
The Royal Society of Chemistry brought in New Relic to help move its legacy applications to microservices for a more effective online presence. Using New Relic’s observability software, the institution now has end-to-end visibility of the effects of new developments and updates. Its systems also deliver real-time monitoring of the experience of its 54,000 scientists and research members as they access a million specialist articles and records. A single source of truth enables teams to collaborate on changes and to prevent problems and latency, even with hundreds of thousands of page views every hour.
“Observability is helping us build better products and services for our global audience of scientists, researchers and chemical science organisations,” says Chris Callaghan, the Royal Society of Chemistry’s development and site reliability engineering manager.
For all organisations, observability provides a sharp view into their own platforms, creating clarity on where improvements can be made and how problems need to be resolved. Any organisation seeking to innovate and grow quickly, while maintaining reliability and enhancing user experiences, can benefit significantly from the high visibility and actionable insights.
To find out how to harness observability for better software, reliable continuity and rapid innovation, please visit newrelic.com