Part 1 – The real reasons you don’t have an effective DR strategy in place, and what you are missing
“Hey, I take daily backups”, reason #3 below, seems to be the go-to safety mantra when discussing disaster recovery (DR). The purpose of having a DR plan in place is obvious – to recover from disastrous events that can impact business continuity. In practice, most organizations have neither thought about nor planned for disasters.
A DR strategy should be laid out at the time of planning the infrastructure layout. Would you take delivery of your car before it is covered by insurance?
The good news is – even if you didn’t do it then, you can still plan now. More on that coming up. For now, let’s jump into the reasons you give yourself for not having an effective DR strategy.
Reason 1: “Disaster won’t strike me, why should I spend on additional infrastructure?”
This is probably the most popular reason, and it comes in the following variants:
- Budgetary constraints
- Negotiations that sacrifice DR to reduce TCO
- History as proof for the future (it’s worked fine for the past 3 years!)
There are many more variants of this – basically, all of these reasons assume that failures only happen to older hardware or to systems that aren’t hosted properly. None of them account for the fickle nature of electronics & components.
Reason 2: “I trust my data center provider”
Many years ago, I read in a book – ‘Trust in God, but lock your car’. That applies to your IT infrastructure too. Just because your IT infrastructure is hosted or co-located at a data center, that doesn’t automatically mean that your data is secure. Data centers offer redundancies for utilities – electricity, internet, etc. – but the data is solely your responsibility.
Hardware manufacturers build redundancies into their products – multiple CPUs, RAM & other components. After all, electronics isn’t perfect & hardware does fail.
Unfortunately, when competing for business, data centers do not make this particular distinction obvious. We routinely hear stories from clients with data center-hosted servers where data corruption caused by hardware failures has led to days, if not weeks, of data loss & downtime.
Reason 3: “We take daily backups”
You’re taking regular backups. Great! That is definitely the first step toward disaster recovery. On its own, however, it is not an effective disaster recovery plan. Should disaster actually strike, this plan expects the entire IT team to scurry about & reload all data from a backup system while business users wait for their systems to become available. And what if the backup drives themselves are inaccessible because of a flood or an earthquake? Worse still, what if the data center isn’t accessible, so there’s nowhere to load the backups into?
The reasons this cannot be your go-to strategy:
- Backup drives are prone to corruption
- It assumes that the people responsible for backups will actually take them daily (making the process human dependent)
- It assumes backup drives are alive, well & accessible
- It assumes that your hardware infrastructure has not failed, and is available for the data to be restored onto
- It assumes that the location of the servers is still accessible (which it won’t be in case of floods, earthquakes & even electricity blackouts)
This isn’t an assault on backup drives – they do serve their purpose. Instead, this is a list of practical lessons that we have come across over time with clients. Probably one of the worst instances we encountered was when an employee responsible for taking daily backups decided to stop the task without telling anyone, and the company lost 6 weeks of data when their servers suffered data corruption.
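Some of the assumptions listed above (that a backup was actually taken recently, and that the backup file is intact) can be checked automatically instead of being taken on trust. The following is a minimal Python sketch, not a complete monitoring tool. The function names, the 24-hour threshold and the idea of recording a SHA-256 checksum at backup time are illustrative assumptions, not something this article prescribes.

```python
import hashlib
import os
import time

# Hypothetical threshold: a daily backup should never be older than this.
MAX_AGE_SECONDS = 24 * 60 * 60


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_backup(path, expected_sha256, now=None, max_age=MAX_AGE_SECONDS):
    """Return a list of problems found with one backup file (empty if none)."""
    if not os.path.isfile(path):
        return ["backup file is missing or inaccessible"]
    problems = []
    now = time.time() if now is None else now
    age = now - os.path.getmtime(path)
    if age > max_age:
        problems.append("backup is stale: last written %.1f hours ago" % (age / 3600))
    # expected_sha256 is assumed to have been recorded when the backup was made.
    if sha256_of(path) != expected_sha256:
        problems.append("checksum mismatch: backup may be corrupt")
    return problems
```

A scheduled check along these lines would likely have flagged the silently stopped backups within a day rather than after six weeks. It still says nothing about whether the site, or the hardware to restore onto, is available.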
Reason 4: “We have high availability infrastructure”
We have seen data center vendors mislead clients into believing that setting up (and paying for) a high-availability (‘HA’) infrastructure means they will not require a DR site.
First, having a highly available infrastructure is not a disaster recovery plan. It is an optional, parallel infrastructure that can be enabled when the primary infrastructure succumbs to failure for any reason.
Second, HA infrastructure options are typically provided within the same data center, and what’s worse these days – within the same server!
We spend a lot of time explaining the difference to clients – who have been sold such ‘HA’ infrastructure by consulting companies. Just because a consulting company recommends a solution, that doesn’t automatically become the smartest option – clients need to apply their own common sense.
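One way to apply that common sense is to ask which failure domains the ‘HA’ nodes have in common. The sketch below is a hedged illustration in Python. The `Node` record and its field names are hypothetical, standing in for whatever inventory of servers you actually keep.

```python
from collections import namedtuple

# Hypothetical inventory record: the field names are illustrative only.
Node = namedtuple("Node", ["name", "physical_host", "data_center"])


def shared_failure_domains(nodes):
    """Report which failure domains all of the given nodes have in common."""
    shared = []
    if len({n.data_center for n in nodes}) == 1:
        shared.append("data_center")
    # Hosts are compared together with their data center, so two hosts that
    # happen to share a name in different data centers are not confused.
    if len({(n.data_center, n.physical_host) for n in nodes}) == 1:
        shared.append("physical_host")
    return shared


pair = [
    Node("app-primary", "host-a", "dc-1"),
    Node("app-standby", "host-a", "dc-1"),
]
print(shared_failure_domains(pair))  # ['data_center', 'physical_host']
```

If the result is not empty, a single event (a failed server, or a flooded or blacked-out data center) can take out the primary and the standby together, which is exactly the gap a separate DR site is meant to cover.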
In reality, Disaster Recovery Planning (DR in popular terminology) is, more often than not, under-addressed and under-planned, if not totally overlooked. Disaster recovery planning is all about planning for redundancies. There is nothing worse than trying to restore data from a corrupt drive or trying to locate compatible hardware when the dispatch team keeps calling for access to the systems.
In the next part, we will examine a case study of a client who did have a DR plan, but whose needs were under-addressed. And in the last part of this blog series, we will review the best practices for setting up & maintaining a DR site.