What to do when the servers go down

Servers are central to business operations and a crash can be catastrophic. Organisations should plan for the worst to enable a speedy recovery.

Nationwide customers faced repeated disruption to their banking services around Christmas, unable to access their funds or pay their bills. While the building society described the outage as a “technical issue”, some experts identified a server failure. 

The disruption, which left millions of users unable to receive or make payments, demonstrated just how much damage a tech crash can cause and how important servers are to smooth business operations.

Whatever the causes of a server crash – ranging from simple hardware failure and power outages to software glitches, cyberattacks and natural disasters – the consequences can be catastrophic. Businesses, big and small, rely on connectivity; in a digital age, it’s the lifeblood of commerce. Consequently, organisations have become increasingly reliant on servers.

As screens go blank, digital and human connections are cut. The afflicted organisation loses productivity, orders and profits, while customers are affected, causing reputational damage and possible loss of future business. In addition, if private data is lost, regulatory fines and penalties can result, as well as class-action lawsuits.

Servers support essential connections, which facilitate business operations including interaction with staff and customers. Their importance means more and more companies use a network of cloud servers.

These servers now play a vital role in business technology. They provide a central repository, receiving, storing, retrieving and sending data, ensuring all team members have timely access to the information they need. 

Web, email and file servers, to name just a few, are essential for employees, teams and systems to perform the tasks that make up their jobs. The pandemic and resulting shift to remote and hybrid working have necessarily accelerated data-centric, cloud-based digitalisation, so businesses have become increasingly dependent on the uninterrupted operation of their servers. 

Achilles’ heel

But have servers – computers with advanced hardware running server programs – become an Achilles’ heel? Given that they are essential to business operations, what happens when servers crash? How can an organisation recover quickly and get back to work?

Azeem Javed is a consultant at Creative Networks, managed IT and telecoms specialists. He says that including backup as part of a business continuity and disaster recovery (BCDR) strategy “is critical for all businesses, ensuring continuity of their operations and suitable recovery. This should extend to all aspects of business.”

Contingency planning and a system backup strategy – installing locally based or remote backup servers or backup to an external hard drive and disaster recovery software – can certainly help the chief technology officer sleep more soundly at night. If a business has alternate backups for its files, it can quickly bounce back and resume operations.

A full backup is a complete copy of an organisation’s data assets. This process requires all files to be backed up into a single version. The dataset should be copied in its entirety and stored in a separate location, away from the server.

Such an offsite backup, which can be accessed, restored or administered from a different location, offers added security and peace of mind because a copy of the data survives even if the primary site is lost.
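As a rough illustration of the full-backup idea described above, the following Python sketch archives every file under one directory into a single timestamped file, copies it to a second location and checks that the copy matches. The function names and the "offsite" directory are assumptions made for the example; in practice the second location would be a mounted remote volume or a cloud storage service, and most organisations would use dedicated backup software rather than a script like this.

```python
import hashlib
import shutil
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def full_backup(source_dir: Path, offsite_dir: Path) -> Path:
    """Archive everything under source_dir into one tarball, copy it to
    offsite_dir and verify the copy. offsite_dir is a hypothetical stand-in
    for storage that lives away from the server."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    offsite_dir.mkdir(parents=True, exist_ok=True)

    # A single archive holding the complete dataset (the "single version").
    local_archive = source_dir.parent / f"full-backup-{stamp}.tar.gz"
    with tarfile.open(local_archive, "w:gz") as archive:
        archive.add(source_dir, arcname=source_dir.name)

    # Copy to the separate location and confirm the bytes are identical.
    offsite_copy = Path(shutil.copy2(local_archive, offsite_dir))
    if sha256_of(local_archive) != sha256_of(offsite_copy):
        raise IOError("offsite copy does not match the local archive")
    return offsite_copy
```

Comparing checksums is a design choice for the sketch: a backup that cannot be verified offers little reassurance when it is finally needed.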

“If your data is mission-critical to your business, backup servers are absolutely vital to ensure seamless business continuity and to avoid data loss,” says Jake Madders, a director at Hyve, a managed cloud hosting provider. 

“We now live in an ‘always-on’ world, where just one hour of downtime can cost anything from thousands to hundreds of thousands of pounds, depending on the size of a company. Time is money.”
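To see how an hourly figure of that order can arise, a back-of-the-envelope estimate might add lost revenue to the cost of staff who cannot work. The formula and every number below are illustrative assumptions, not figures from Hyve or any other source quoted here.

```python
def downtime_cost(hours: float,
                  revenue_per_hour: float,
                  staff_affected: int,
                  staff_cost_per_hour: float,
                  recovery_cost: float = 0.0) -> float:
    """Rough cost of an outage in pounds: lost revenue plus idle staff time
    plus any one-off recovery spend. Reputational damage and regulatory
    fines are deliberately left out because they are hard to quantify."""
    lost_revenue = hours * revenue_per_hour
    idle_staff = hours * staff_affected * staff_cost_per_hour
    return lost_revenue + idle_staff + recovery_cost


# Hypothetical firm: 5,000 pounds of orders an hour, 40 staff at 25 pounds an hour.
print(downtime_cost(hours=1, revenue_per_hour=5000.0,
                    staff_affected=40, staff_cost_per_hour=25.0))  # prints 6000.0
```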

Worst-case scenario

Irrespective of the location of the server, a BCDR plan is essential for the worst-case scenario.

“The pandemic has forced companies to realise that being prepared for even the most unlikely situation can no longer be treated as an optional part of business planning,” says Madders. “While it might seem difficult to measure the return on investment of a disaster recovery solution, because it’s a precautionary feature that ideally would never need to be used, it shouldn’t be seen as a ‘luxury’ add-on service solely for larger companies. It should be a fundamental part of every business’s IT strategy.”

A disaster recovery plan is a documented, structured approach that describes how an organisation can quickly resume work after an unplanned incident. It is an essential part of a business continuity plan and should be applied to the aspects of the operation that depend on a functioning IT infrastructure. 

The step-by-step plan consists of precautions to minimise the effects of a disaster so the organisation can continue to operate or quickly resume mission-critical functions. Typically, disaster recovery planning involves an analysis of business processes and continuity needs. 

Before generating a detailed plan, an organisation should perform a business impact analysis and risk analysis, and establish recovery objectives.
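Recovery objectives are commonly expressed as a recovery point objective (RPO, how much recent data the business can afford to lose) and a recovery time objective (RTO, how long it can afford to be offline). A minimal sketch of checking current practice against such targets follows; the class, function and thresholds are hypothetical and would come out of the organisation's own impact analysis.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    rpo: timedelta  # maximum tolerable age of the most recent good backup
    rto: timedelta  # maximum tolerable time to restore service


def check_objectives(objectives: RecoveryObjectives,
                     last_backup: datetime,
                     now: datetime,
                     last_restore_duration: timedelta) -> list[str]:
    """Return a list of findings; an empty list means both targets are met."""
    findings = []
    backup_age = now - last_backup
    if backup_age > objectives.rpo:
        findings.append(f"RPO missed: last backup is {backup_age} old")
    if last_restore_duration > objectives.rto:
        findings.append(f"RTO missed: last test restore took {last_restore_duration}")
    return findings


# Illustrative targets: lose at most 4 hours of data, be back within 1 hour.
targets = RecoveryObjectives(rpo=timedelta(hours=4), rto=timedelta(hours=1))
```

Feeding in the duration of the most recent test restore, rather than an estimate, keeps the check honest about how long recovery really takes.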

All strategies should align with the organisation’s goals. Once a business continuity strategy has been developed and approved, it can be translated into a disaster recovery plan, with an incident response team and list of important contacts.

The plan should be reviewed by management, audited and regularly updated. It should also be validated through testing, which identifies deficiencies and provides opportunities to fix problems before a crash occurs. Additionally, it’s important for businesses to monitor and protect their servers with software that can flag up potential problems.
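Monitoring software of this kind typically watches many signals at once. As a very small example of the principle, the sketch below flags any disk that is nearly full, one common precursor to a crash. The function name and the 90% threshold are assumptions chosen for illustration; commercial monitoring tools cover far more than this.

```python
import shutil


def disk_warnings(paths: list[str], threshold: float = 0.9) -> list[str]:
    """Return a warning for each path whose filesystem is at least
    `threshold` full (0.9 means 90%)."""
    warnings = []
    for path in paths:
        usage = shutil.disk_usage(path)
        if usage.total == 0:
            continue  # skip filesystems that report no capacity
        used_fraction = usage.used / usage.total
        if used_fraction >= threshold:
            warnings.append(f"{path}: {used_fraction:.0%} full")
    return warnings
```

Run on a schedule, a check like `disk_warnings(["/"])` could feed an email or chat alert so that staff hear about a filling disk before users notice an outage.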

And before calling in the tech experts, there are a few basic housekeeping tips that can lower the possibility of servers crashing in the first place. Prevention measures include keeping the server room isolated and cool with air conditioning. It should also be clean because dust can cause overheating.

In-house tech staff may be able to troubleshoot a server failure, but more complex issues could require outside help. This means that adequate training of tech staffers in how to deal with a failed server in the first instance is a good investment, as is maintaining a working relationship with an external IT specialist. 

Of course, if the server is in a remote data centre, the organisation is at the mercy of the good practice of an outside agency and reliant on its speedy action to get systems back up and running – so choose your provider carefully.