Leveraging Cloud for Disaster Recovery
Sunday, March 24, 2019
As public cloud infrastructures mature and storage costs decrease, more and more enterprises are looking to the cloud to implement their disaster recovery (DR) plans. Virtually every recent survey of IT trends shows that secondary backup in general and DR in particular are highly compelling cloud use cases, and are often the first forays of an organization into the cloud.
In this blog post we discuss the benefits and challenges of cloud disaster recovery and examine the different DR options available in the public cloud.
Disaster recovery in cloud computing comprises the IT policies, tools, and procedures that ensure critical business infrastructure and systems will function despite disruptive events such as natural disasters, cyber security attacks, or even planned upgrades/maintenance that require shutting down production infrastructures. DR supports business continuity objectives by quickly restoring infrastructure, applications, and data. The key to any DR strategy is data replication.
In today’s always-connected, always-on global economy, users expect apps and services to be available 24/7/365. With an average cost of ~$8,850 per minute, an average of 95 minutes of downtime per outage, and 31% of organizations having experienced an IT downtime incident over the previous 12 months, the direct costs of downtime are clearly significant. Indirect costs in the form of lost customers, business opportunities, and productivity, as well as damage to reputation, can be just as damaging or even more so. No matter what the cause, when disaster strikes, rapid recovery is essential.
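To make these figures concrete, the direct cost of a single average outage can be estimated by multiplying the per-minute cost by the outage length. The short Python sketch below uses only the two averages quoted in this post; the function name is illustrative, and real costs vary widely between organizations.

```python
def outage_cost(cost_per_minute: float, minutes: float) -> float:
    """Direct cost of one outage: per-minute cost times duration."""
    return cost_per_minute * minutes


# Averages quoted in this post: ~$8,850 per minute, 95 minutes per outage.
average_outage = outage_cost(8850, 95)
print(f"${average_outage:,.0f}")  # $840,750
```

This counts direct costs only; the indirect costs described above are not captured by the calculation.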
Disaster recovery, in cloud deployments or on-prem, is a business-critical issue. In many verticals, it is also important to maintain redundant data sets in order to comply with long-term data retention regulations. However, the costs of setting up, testing and maintaining a DR site in a failover data center are very high—especially in light of the fact that the organization hopes never to have to use that replicated site. Other DR challenges include:
In order to meet these challenges, enterprises of all sizes often seek to implement their DR strategies by leveraging on-demand public cloud compute, network, and storage resources. In this way they can reduce their data center footprints and costs, shift CAPEX to OPEX, enhance data safety, and benefit from limitless scalability.
The traditional approach to DR requires significant investment of time and resources. At minimum, users must consider how they would replicate their primary infrastructure to a secondary site. That secondary site needs to be procured, installed, and maintained. During normal operations, the secondary site will typically be under-utilized or over-provisioned.
The cost of such an investment is beyond the means of many companies. Even for companies with the means, DR is seen as a sunk cost that delivers little return quarter over quarter. However, not having an adequate DR strategy is also something no company can afford.
The public cloud offers a way for companies of all sizes to build DR sites with little upfront cost through a pay-as-you-go model.
Every major public cloud vendor offers multiple options for building a DR site using their cloud. AWS, for example, offers four options or scenarios that they highlight in a white paper published in 2014. Each scenario, which can also be created with the other public cloud vendors, comes in at a different price point and delivers a different Recovery Time Objective (RTO) and a different Recovery Point Objective (RPO).
Companies can choose the option that best meets their RTO and RPO requirements and budget. In general, the public cloud enables customers to build solutions with better RTO and RPO at a lower cost than a secondary DR site.
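The selection process can be sketched in a few lines of Python: filter the options by the required RTO, RPO, and budget, then take the cheapest one that qualifies. Everything in this sketch is an assumption made for illustration. The `DrOption` class, the `choose_option` function, and all of the hour and cost figures are hypothetical placeholders, not vendor numbers; only the four option names and their relative ordering come from this post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DrOption:
    name: str
    rto_hours: float     # hypothetical time needed to restore service
    rpo_hours: float     # hypothetical maximum data-loss window
    monthly_cost: float  # hypothetical running cost in USD


# Placeholder figures for illustration only. They preserve the ordering
# described in this post: each step buys better RTO/RPO at a higher cost.
OPTIONS = [
    DrOption("Backup and Restore", 24.0, 24.0, 500.0),
    DrOption("Pilot Light", 4.0, 1.0, 2000.0),
    DrOption("Warm Standby", 1.0, 0.25, 6000.0),
    DrOption("Hot Site", 0.1, 0.0, 15000.0),
]


def choose_option(max_rto: float, max_rpo: float, budget: float,
                  options: List[DrOption] = OPTIONS) -> Optional[DrOption]:
    """Return the cheapest option meeting the RTO, RPO, and budget limits."""
    candidates = [
        o for o in options
        if o.rto_hours <= max_rto
        and o.rpo_hours <= max_rpo
        and o.monthly_cost <= budget
    ]
    return min(candidates, key=lambda o: o.monthly_cost, default=None)


choice = choose_option(max_rto=4, max_rpo=1, budget=5000)
print(choice.name if choice else "No option fits")  # Pilot Light
```

When no option satisfies all three limits, the function returns `None`, which signals that either the requirements or the budget must be revisited.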
The first option, Backup and Restore, builds on a long-standing practice. Traditionally, companies have used off-site backup tapes as their primary means of restoring data in the event of a disaster. This typically involves retrieving tapes from cold storage and recovering the data either once the primary facility has been restored or after the tapes have been sent to a cold secondary site that is powered on only when a disaster occurs.
Companies have started to leverage public cloud storage services such as Amazon S3 and Azure Blob Storage as alternatives to archiving tape to an off-site facility. Not only is this a more cost-effective solution, it delivers better RTO and RPO since the data is already in the cloud where it can be used to launch a DR site on-demand.
There are various approaches for transferring data from the user’s on-premises infrastructure to the public cloud. These include migration tools specific to a particular cloud vendor, as well as vendor-neutral data management platforms such as Rubrik.
In a disaster, users create cloud resources, restore their data to them, and launch new server instances/VMs to run production workloads in the cloud.
The Pilot Light option is named after the constantly-on gas heater pilot light that is used to quickly light the furnace. With this approach, a minimal copy of the production environment is maintained in the cloud. Core components whose state must be maintained and updated, such as a production database, run continuously in the cloud and are synced regularly with production. Servers in the cloud can either be provisioned but kept turned off until a disaster, or be kept as server images from which instances/VMs are launched when needed.
Compared to the Backup and Restore option, the Pilot Light scenario offers a better RTO since the core components are already running in the cloud and servers are already provisioned or ready to be provisioned. It also offers better RPO since core services are regularly updated and synced with production. However, the cost is typically higher.
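The RPO difference comes down to how often data is synced: in the worst case, a disaster strikes just before the next sync, so the data-loss window is roughly one sync interval. The Python sketch below illustrates that reasoning. The function names and the 24-hour and 15-minute intervals are assumptions chosen for illustration, and the sketch ignores the time a sync itself takes to complete.

```python
from datetime import datetime, timedelta


def data_loss_window(last_sync: datetime, disaster_at: datetime) -> timedelta:
    """Changes made after the last successful sync are lost on failover."""
    if disaster_at < last_sync:
        raise ValueError("disaster time precedes the last sync")
    return disaster_at - last_sync


def meets_rpo(sync_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case: the disaster hits just before the next scheduled sync."""
    return sync_interval <= rpo


# Hypothetical intervals: a nightly backup versus a frequently synced database.
nightly_backup = timedelta(hours=24)
pilot_light_sync = timedelta(minutes=15)
rpo_target = timedelta(hours=1)

print(meets_rpo(nightly_backup, rpo_target))    # False
print(meets_rpo(pilot_light_sync, rpo_target))  # True
```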
The Warm Standby option requires a scaled-down copy of production to be provisioned and run continuously in the cloud. Stateful core components are also updated and synced regularly with production. A subset of the production servers runs continuously as instances/VMs in the cloud and can be scaled up as needed.
Compared to the previous two options, the Warm Standby scenario offers a better RTO since the core components are already running in the cloud and critical servers are already provisioned and running. In a disaster, production traffic for critical workloads can be redirected to the cloud while additional instances/VMs are launched to take on additional workloads. The Warm Standby option also offers better RPO since core services are being regularly updated and synced with production. The cost is higher than the earlier two options since more resources are provisioned and continuously running.
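The scale-up step during failover can be sketched as a simple capacity calculation: the standby already runs some instances, so only the shortfall needs to be launched. This is a minimal Python sketch under stated assumptions. The function name and the `load_fraction` parameter are hypothetical, and it assumes each cloud instance handles the same load as one production server.

```python
import math


def instances_to_launch(production_count: int, standby_count: int,
                        load_fraction: float = 1.0) -> int:
    """Extra instances needed so the standby can carry the given share of load.

    load_fraction is the share of production load to carry (1.0 = all of it).
    """
    if not 0.0 <= load_fraction <= 1.0:
        raise ValueError("load_fraction must be between 0 and 1")
    required = math.ceil(production_count * load_fraction)
    return max(0, required - standby_count)


# Hypothetical sizes: 20 production servers, 5 already running in the cloud.
print(instances_to_launch(20, 5))                      # 15
print(instances_to_launch(20, 5, load_fraction=0.25))  # 0
```

In the second call the running standby already covers the critical quarter of the workload, which matches the idea of redirecting critical traffic immediately and launching the remaining capacity afterwards.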
As with the Warm Standby option, the hot site option keeps a copy of the production environment running continuously in the cloud. In the hot site scenario, however, that copy is the full production environment rather than a scaled-down one. This allows for immediate failover during a disaster, with the cloud provisioned to run the same amount of workload as production. In addition, if core components are being updated synchronously, then the cloud can be used for production, along with the user’s on-premises infrastructure, in an active-active setup.
This option has the best RTO and RPO since the user is running an exact replica of the on-premises infrastructure in the cloud. As expected, it also has the highest cost, particularly if core components for both the on-premises and cloud environments are being completely synced.
With disaster recovery a business imperative, enterprises are focused on optimizing their DR strategies to provide robust protection at minimal cost. The cloud has come to play an important role in DR, offering services that leverage global data centers and flexible storage tiering for cost-effective yet robust DR replication targets.