
DR Criteria in the Multicloud World (Part II)

  • 3 days ago
  • 2 min read


Part II: The Human and Physical Reality


In Part I we addressed the idea that many DR plans and strategies are too narrowly focused. Some only look at RPO; some are only focused on having a copy of data with no concept of total datacenter recovery.


The two criteria I’d like to address first (DR site location/network infrastructure and your personnel plan) are often the least considered. They’re not the quantitative metrics teams tend to fixate on; they’re about the physical and human reality of a disaster. And precisely because they’re easy to overlook, failing to plan for them can cause the complete failure of an otherwise well-engineered Disaster Recovery effort.


DR Site Location and Network Infrastructure


DR site location is crucial to surviving and recovering from a DR event at the Production site. There are many examples of a DR site being unusable or compromised in the event of a disaster. The DR site needs to be far enough from Production that it is unaffected by the event causing the outage there. Evaluate which events are most likely and the geographic span each would affect. Beyond the disaster itself, the DR site should be far enough away that both its infrastructure and the DR personnel are unaffected: the site requires electricity and other utilities, and personnel will need transportation to the datacenter and/or stable, reliable communication with it.


Determining optimal geographic distance requires risk awareness. Knowing what you’re protecting against determines where and how far away your DR site should be. For example:


  • Earthquakes (West Coast U.S.) – Place your DR site at least 150 km from your production environment.

  • Hurricanes (Southeastern U.S.) – Distance should be at least 300 km to avoid correlated weather impact.


These distances aren’t arbitrary. They’re based on observed patterns of regional disruption and infrastructure interdependence. A well-positioned DR site balances risk mitigation with power availability, network performance, and the availability of personnel to enact the failover.
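
As a rough illustration, the distance guidance above can be turned into a quick sanity check on a candidate DR site. This is a minimal sketch, not a siting tool: the hazard labels, function names, and the use of great-circle (haversine) distance are assumptions made for the example, and only the 150 km and 300 km thresholds come from the list above. A real evaluation would also weigh fault lines, storm tracks, shared utilities, and network paths.

```python
import math

# Minimum separations quoted in this article, in km.
# The hazard labels used as keys are hypothetical names for this sketch.
MIN_SEPARATION_KM = {
    "earthquake": 150.0,  # West Coast U.S.
    "hurricane": 300.0,   # Southeastern U.S.
}

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def required_separation_km(hazards):
    """The strictest (largest) minimum distance among the listed hazards."""
    return max(MIN_SEPARATION_KM[h] for h in hazards)


def site_is_far_enough(prod, dr, hazards):
    """True if the DR site meets the minimum separation for every hazard.

    prod and dr are (latitude, longitude) tuples in degrees.
    """
    return haversine_km(*prod, *dr) >= required_separation_km(hazards)


if __name__ == "__main__":
    production = (34.0522, -118.2437)  # Los Angeles (example coordinates)
    candidate = (37.7749, -122.4194)   # San Francisco (example coordinates)
    print(site_is_far_enough(production, candidate, ["earthquake"]))
```

Passing the distance check is necessary but not sufficient: two sites can be far apart and still depend on the same power grid or network carrier, which is the infrastructure interdependence mentioned above.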


Personnel Plan


A DR plan needs to be clear about who is responsible for executing the recovery. It’s highly recommended that the personnel executing the recovery are not located at the Production site. If there is a DR event, it’s highly likely that personnel at the Production site will be unavailable.


The reality is that if something happens that is bad enough to shut down operations at the Production site, IT personnel will likely be dealing with their own personal issues. Their homes may be affected, and their families will take priority. Even if personnel are available to work on the recovery, transportation may be impossible or severely restricted. Power may be down, leaving communications equipment with limited battery life. It’s also likely that if Production is down, network connectivity from there to the DR site will be down too, making it impossible to run IT operations at the DR site remotely. A continuity plan for personnel is as important as your continuity plan for servers and infrastructure.


In Part III we’ll start to talk about application evaluation, prioritization, and categorization.


Todd Matters, Co-Founder & CTO

 
 
 
