But the need for protection against disruption is universal, because system failures, power or network outages, cyber attacks, accidents and natural disasters can all wipe out an organisation’s operations. Smaller problems, from a botched software update to a hardware component failure, can also wreak havoc.
Researchers at the Uptime Institute estimate that 44% of organisations suffered a recent major outage that “tangibly impacted” the business. Uptime’s 2020 Annual global data center survey also suggests outages are becoming more damaging.
Uptime’s research pointed out that most outages could be prevented, with power failures the most common cause. But no organisation can remove all risk. Instead, they need an effective strategy to recover data and restore their computer systems.
Cloud computing has the potential to make disaster recovery more affordable and, in some cases, simpler. Some cloud technologies, especially software-as-a-service (SaaS), have disaster recovery and backup options built in.
For technology that runs in an enterprise’s own datacentre, the cloud can provide a pay-as-you-go alternative to secondary sites and redundant hardware.
The market for cloud-based DR, or DRaaS, is expected to be worth US$4.9bn in 2021, with a 16.7% growth rate through to 2024, according to analyst IDC.
The case for cloud DR
“The pendulum has swung towards disaster recovery in the cloud, and DRaaS,” says Phil Goodwin, a research director at IDC.
“Cloud has changed the economics of disaster recovery. Some years back, only the biggest organisations could fully implement DR because it was so expensive to duplicate infrastructure and systems, even if it was through a third-party provider.”
The most appealing feature of cloud-based data storage is its on-demand model. There is no need to buy, and maintain, hardware and other assets in case of emergency.
But using the cloud for disaster recovery comes with its own costs and drawbacks. Specialist DR suppliers exist to fill the gap between a fully replicated IT environment and the cloud, with the bulk of fees only payable if the business has to invoke the DR plan. But for IT managers, the decision is rarely as simple as copying all data to the cloud.
Cloud vs in-house DR
The market for dedicated, cloud-based DR is already large and maturing.
According to IDC’s Goodwin, there are now thousands of firms that offer at least some cloud-based disaster recovery. These range from full-service, “white glove” approaches to those where the customer does most of the design and provisioning.
Most CIOs, however, aim for a middle ground and do some work themselves. “It’s due to cost, as well as a desire for an element of control over the data,” he says.
These are the key considerations when choosing between cloud and on-site DR.
Cloud vs on-site DR: Costs
The cloud removes two significant costs for disaster recovery: hardware and premises. However, these costs are replaced by service and usage fees. Businesses might prefer ongoing costs to up front expenditure, but savings are not guaranteed.
“However, one of cloud’s major advantages over in-house self-provision for DR is pay-for-use,” he adds.
DRaaS providers will typically charge a monthly or annual service fee, based on the number of servers or virtual machines. Some price according to applications, but cost can vary depending on performance and potentially on the DR site’s location.
Most DR services will charge an invocation fee, and may also charge for testing.
DIY solutions can appear cheaper, but there can be hidden costs. Cloud storage providers’ monthly fees look low, but data egress charges – to copy data to an operational cloud or to on-premises systems – can be high. This is especially the case for “cold” storage operators.
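The fee structures described above can be compared with some simple arithmetic. The sketch below is illustrative only: the function names and every price in it are hypothetical assumptions, not figures from any provider, and real tariffs will have more components.

```python
def draas_annual_cost(vms, fee_per_vm_month, invocation_fee,
                      invocations=0, test_fee=0, tests=0):
    """Yearly DRaaS cost: per-VM service fee plus invocation and test charges."""
    service = vms * fee_per_vm_month * 12
    return service + invocation_fee * invocations + test_fee * tests


def diy_annual_cost(stored_tb, storage_per_tb_month, egress_per_tb,
                    restored_tb=0):
    """Yearly DIY cloud storage cost: storage fee plus egress for any restore."""
    storage = stored_tb * storage_per_tb_month * 12
    return storage + egress_per_tb * restored_tb


# Hypothetical prices, chosen only to show how the totals are built up.
quiet_year = diy_annual_cost(30, 10, 90)                    # no restore
bad_year = diy_annual_cost(30, 10, 90, restored_tb=30)      # one full restore
managed = draas_annual_cost(20, 50, 5000, invocations=1,
                            test_fee=1000, tests=2)
print(quiet_year, bad_year, managed)
```

With these made-up prices, a single full restore adds most of a year's storage fees again to the DIY total. That is the hidden egress cost the paragraph above warns about.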
Cloud vs on-site DR: Moving data to the cloud
Copying data to a backup facility is a challenge for DR service providers and for the cloud. Unless a business is new and has little data, or it can afford real-time mirroring and high availability systems, the initial data copy is likely to be to disk or tape.
“Whether you are moving data to a managed service facility, the cloud or a third party, it’s about comms,” says Tony Lock, analyst at Freeform Dynamics. “It’s about how much bandwidth you have, and how much it changes.”
Managed DR providers use disk to move initial batches of data, he says. CIOs may have to follow that route for the cloud: there is little to choose between the two platforms here.
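A rough calculation shows why the initial copy often travels on disk rather than over the wire. The sketch below estimates how long a first full copy would take over a given link. The function name, the decimal-terabyte convention and the 70% usable-bandwidth figure are assumptions for illustration.

```python
def transfer_days(data_tb: float, link_mbps: float,
                  utilisation: float = 0.7) -> float:
    """Days needed to copy data_tb (decimal terabytes) over a link_mbps link.

    utilisation is the assumed share of the link the copy can actually use.
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < utilisation <= 1:
        raise ValueError("data_tb must be >= 0, link_mbps > 0, 0 < utilisation <= 1")
    bits = data_tb * 1e12 * 8                      # terabytes to bits
    seconds = bits / (link_mbps * 1e6 * utilisation)
    return seconds / 86_400                        # seconds per day


# 50 TB over a 100 Mbit/s link at 70% utilisation: a little over two months.
print(round(transfer_days(50, 100), 1))
# The same data over 1 Gbit/s: under a week.
print(round(transfer_days(50, 1000), 1))
```

The same function can be applied to the daily volume of changed data to check whether the link can keep up once the initial copy is in place.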
Cloud DR: RPO and RTO
Only a minority of organisations can afford near-real-time replication or high-availability systems. Most will opt for point-in-time backups, copied over to the cloud directly or more likely via a staging server, NAS or dedicated appliance.
The speed of copying data depends on the public internet. Companies replicating to their own datacentres, or to a conventional DR provider, might have more control over bandwidth, but the limitation is still the speed of light.
“Having your data in the cloud or with a DR specialist doesn’t make any difference,” says Lock.
Cloud DR providers can, however, charge different rates for a quicker data restore, and a smaller RPO will likely incur more storage and data transfer costs, through more frequent copies. At Fordway, Blandford points out that a sub-two-hour RPO/RTO is likely to be unaffordable in the cloud, because the cloud environment needs to run constantly.
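The link between a smaller RPO and more frequent copies can be made concrete. With point-in-time backups, the worst-case data loss is roughly the interval between copies, so that interval must not exceed the RPO. The sketch below is a simplification with hypothetical function names: it counts copies only and ignores how large each incremental copy is.

```python
import math


def copies_per_day(rpo_hours: float) -> int:
    """Minimum point-in-time copies per day so the gap never exceeds the RPO."""
    if rpo_hours <= 0:
        raise ValueError("rpo_hours must be positive")
    return math.ceil(24 / rpo_hours)


def retained_copies(rpo_hours: float, retention_days: int) -> int:
    """Recovery points held if every copy is kept for retention_days."""
    return copies_per_day(rpo_hours) * retention_days


# Tightening the RPO from 24 hours to 4 hours to 1 hour.
for rpo in (24, 4, 1):
    print(rpo, copies_per_day(rpo), retained_copies(rpo, 30))
```

Each step down in RPO multiplies the number of copies to take, transfer and retain. This is one reason providers can charge more for tighter targets.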
Data and business recovery
Full-service DR provides more than just data restoration. It includes providing new servers, storage and networks and even new PCs and desk space.
Firms can recover data from the cloud to new hardware, and some will. But the cloud can provide new VMs and even bare metal servers. One of the key attractions of cloud DR is that a business can quickly spin up new capacity when needed.
Using the same cloud service provider to host backups and production systems is the most seamless option and removes delays from moving data between cloud hosts or to a new site.
The downside is application latency. If this is not acceptable, DR providers with dedicated hardware or a failover datacentre might be more suitable.
The business will also need to provide end points for staff. If it already has a remote or mobile workforce, this will be less of a barrier. But CIOs will need to consider end-user equipment as part of the long-term recovery plan, even if enterprise apps stay in the cloud.
Geographical, regulatory and contractual risks
Some businesses are unable or unwilling to use the public cloud to run applications or store data, in the medium term or at all. In those cases, cloud-to-cloud recovery might not be appropriate.
More cloud providers now support classified data, and there is more choice when it comes to data location. Nonetheless, Freeform Dynamics’ Lock warns that businesses have few protections if a cloud provider goes out of business, is sold or is legally prevented from operating in a territory.
Careful sourcing reduces these risks, but CIOs need to consider where their data is, and whether they have full control of all business information. Ultimately, this might mean keeping physical copies on disk or tape even if the primary backup is the cloud.
Speak to the business to protect the business
Cloud disaster recovery is a powerful tool for CIOs. It gives them options and can reduce costs – especially up-front Capex.
But it is not the solution for every organisation. IDC’s Phil Goodwin advises talking to the business first. “Input from the business is critical,” he says.
Ultimately, only the business knows which data to protect – and how much it will pay to do so – whether the technical solution is the cloud, a secondary site or a managed DR service.
Acceptable time to recover most critical service | Potential technology to employ
Three to four days | Tape recovery to standby hardware
One to two days | Backup-based replication to second site or cloud backup service
Eight to 24 hours | SAN or NAS data replication to warm standby site with suitable hardware to run service and good, tested recovery procedure
Two to eight hours | Near real-time data replication to a live or hot standby site
Under two hours | Services running active/active across two datacentres with automated failover
Source: Fordway
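Fordway's table can be read as a lookup from an acceptable recovery time to a class of technology. The sketch below encodes it that way. Two choices in it are assumptions, not part of the table: a boundary value such as exactly eight hours is assigned to the slower tier, and the unlisted gap between two and three days is treated as part of the one-to-two-day tier.

```python
# (minimum acceptable hours, technology), ordered from slowest to fastest tier.
TIERS = [
    (72, "Tape recovery to standby hardware"),
    (24, "Backup-based replication to second site or cloud backup service"),
    (8, "SAN or NAS data replication to warm standby site"),
    (2, "Near real-time data replication to a live or hot standby site"),
    (0, "Services running active/active across two datacentres "
        "with automated failover"),
]


def technology_for(acceptable_hours: float) -> str:
    """Return the slowest tier from the table that still meets the target."""
    if acceptable_hours <= 0:
        raise ValueError("acceptable_hours must be positive")
    for min_hours, technology in TIERS:
        if acceptable_hours >= min_hours:
            return technology
    raise AssertionError("unreachable: the last tier has a minimum of 0")


print(technology_for(1))    # active/active across two datacentres
print(technology_for(12))   # SAN or NAS replication to warm standby
print(technology_for(96))   # tape recovery
```

Picking the slowest tier that still meets the target reflects the cost logic in the article: faster recovery needs more infrastructure running constantly, so the business should not pay for more speed than it needs.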