IT disasters: Think avoidance, not recovery

 

Data and business operations are irrevocably entwined. Modern organizations leverage hosts of information technology assets to execute revenue-driving activities. While this tech-heavy approach certainly bolsters productivity, it also comes with risks. What would happen should essential systems suddenly go offline? In most cases, these instances of downtime halt all operations and drain cash coffers. Small enterprises fork over an average of $8,000 per hour of downtime, according to research from Infrascale. Large companies pay even more, with most setting aside as much as $700,000 per hour in the event of total system shutdown.

James Eling, an industry expert at Extreme Networks with over 18 years of experience, was happy to share (and vent his frustration about) a nightmare scenario involving one of his clients: “We had one customer who had a massive data loss due to poor maintenance procedures and called in at 4am. We ended up finding $80,000 worth of invoices in rubbish bins, which was almost all of the invoices that they had lost.”

Noah Morrison, owner of Check the Internet, also shares his experiences of a workplace fire. He says: “A few years ago, we hired an intern for the first time. He was a friend of a friend, so we were keen to give him a shot!

“Around Christmas, my boss was a very busy new husband, so he allowed our intern to lock up, which was a big mistake. We had recently moved, and he had left a lot of the cardboard boxes in our server room that were to be taken away and put into the office before locking up. He had also turned the AC unit off because he was too cold and had forgotten to turn it back on before he left. It’s impossible to say exactly what happened, but we were told the power supply had overloaded (almost certainly because of the heat), and the boxes were enough to set the whole place alight.

“Obviously, the place was destroyed; anything that hadn’t been burnt to a crisp was completely soaked through with water. I had to find another job too, so it was a really brutal experience. Unfortunately, we didn’t have a recovery plan, as we were young and (very) stupid!”

 

James Eling, who has owned the IT company Extreme Networks for 18 years, has a few stories of his own about workplace disasters. He says: “As an example, we had one customer who decided to fix a couple of hard drives to get more space before they fixed their backup. We told them that if there was an issue, there was no backup, and they assured us that it was a nice and simple job.

“I get a call at 4am the next morning. They had been working on it all night and couldn’t fix an initial issue with it. Part of the efforts to fix the issue had made it worse, and the data was completely destroyed.

“As a result, they lost five days’ worth of invoicing. As it was for a trucking company, that equated to about $120,000. I found some of the data on desktops, and found $80,000 worth of invoices from paper running sheets that the drivers had thrown out of their trucks in the dumpster, and what they still had in their cabins.

“My advice? Firstly, always fix the backup first. That is the most important job, always. Secondly, once they realised they had a problem, they should have got expert assistance. After the first issue, it was recoverable. After they tried to fix it, the data was lost forever. Backups are something that you shouldn’t play around with!”

With these costs in mind, many businesses develop detailed disaster recovery plans that allow administrators to regain access to mission-critical applications after external forces, such as adverse weather events, have knocked them offline. While these strategies can prove useful, there are more effective means for maintaining connectivity. Disaster avoidance is the ideal solution. What’s the difference between avoidance and recovery?

Back in action versus always on
Disaster recovery plans come into play after the fact, mitigating the impact of unexpected system interruptions. These programs normally center on two variables: recovery point objective (RPO) and recovery time objective (RTO). RPO refers to the age of the files that must be recovered to regain full system functionality, which in practice sets how much recent data a business can afford to lose, while RTO is the maximum length of time any given platform can remain out of commission, according to TechTarget. Organizations normally set out these requirements in service-level agreements with disaster recovery providers, which maintain offsite backups and offer system restoration services when on-premises servers fail.
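To make the two objectives concrete, here is a minimal Python sketch of how a team might check an incident against its RPO and RTO. The class name, function names, timestamps and the four-hour and two-hour targets are all hypothetical values chosen for illustration, not figures from any provider's service-level agreement.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    """Hypothetical container for the two targets described above."""
    rpo: timedelta  # maximum tolerable age of the newest recoverable data
    rto: timedelta  # maximum tolerable time a system may stay offline


def meets_rpo(last_backup: datetime, failure: datetime,
              objectives: RecoveryObjectives) -> bool:
    # Data written after the last backup is lost, so the gap between the
    # last backup and the failure must not exceed the RPO.
    return failure - last_backup <= objectives.rpo


def meets_rto(failure: datetime, restored: datetime,
              objectives: RecoveryObjectives) -> bool:
    # The outage runs from the failure until service is restored, and
    # that duration must not exceed the RTO.
    return restored - failure <= objectives.rto


# Illustrative targets and timestamps (assumptions, not real incident data).
objectives = RecoveryObjectives(rpo=timedelta(hours=4), rto=timedelta(hours=2))
failure = datetime(2024, 1, 10, 4, 0)
last_backup = datetime(2024, 1, 9, 22, 0)   # six hours before the failure
restored = datetime(2024, 1, 10, 5, 30)     # ninety minutes after the failure

rpo_ok = meets_rpo(last_backup, failure, objectives)
rto_ok = meets_rto(failure, restored, objectives)
print(f"RPO met: {rpo_ok}, RTO met: {rto_ok}")
```

In this made-up incident the system came back quickly enough to satisfy the RTO, but the last backup was too old to satisfy the RPO, which shows that the two targets have to be tracked separately.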

Unfortunately, many businesses do not have workable disaster recovery plans. In fact, an estimated 75 percent are not adequately prepared to handle major system outage events, according to research from the Disaster Recovery Preparedness Council. On top of that, many of the firms that do attest to having disaster recovery measures in place rely on antiquated methods. More than a quarter of the DRPC survey respondents who fell into this category said their organizations maintained hardware-based backups. This methodology is notoriously problematic, as information technology staff must keep track of and update physical hard drives or tape drives. Even those with ostensibly flawless disaster recovery practices can run into trouble, as the sheer amount of data that passes through internal networks makes consistent and accurate replication difficult, according to DataStax.

Enterprises with disaster avoidance programs in place do not have to worry about post-event mitigation, as they have the tools required to keep systems online no matter what. How? System resilience. For example, an organization with virtual machine-based workloads might move critical files to offsite servers prior to an adverse weather event. Here, of course, the key is having easy-to-migrate, scalable systems that administrators can shuffle between internal and external data stores.

Implementing a disaster avoidance program
Firms interested in embracing this proactive approach must first highlight mission-critical assets and, more importantly, the external forces that have the power to disconnect them, according to Quality Technology Solutions. The latter may prove difficult for most, as it’s hard to foresee disasters before they unfold. Nevertheless, developing a list of potential emergencies is a strong place to start. Most planners take into account everything from adverse weather events to terrorist attacks. This opening step mirrors the initial phase of disaster recovery planning. However, the similarities end here.
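That opening inventory can be kept as a simple data structure. The following Python sketch is one hypothetical way to record mission-critical assets against the events that could disconnect them; the `Asset` class, the `assets_by_threat` helper and the sample entries are illustrative assumptions rather than a prescribed method.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A mission-critical system and the events that could take it offline."""
    name: str
    threats: list[str] = field(default_factory=list)


def assets_by_threat(assets: list[Asset]) -> dict[str, list[str]]:
    # Invert the inventory so planners can see, for each potential
    # emergency, which systems it would disconnect.
    exposure: dict[str, list[str]] = defaultdict(list)
    for asset in assets:
        for threat in asset.threats:
            exposure[threat].append(asset.name)
    return dict(exposure)


# Sample entries are made up for illustration.
inventory = [
    Asset("invoicing database", ["adverse weather", "hardware failure"]),
    Asset("email server", ["adverse weather"]),
]

exposure = assets_by_threat(inventory)
for threat, names in exposure.items():
    print(f"{threat}: {', '.join(names)}")
```

Grouping by threat rather than by asset makes it easier to spot the single event that would take several systems down at once.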

Those looking to avoid downtime altogether must then look for advanced connectivity solutions to disaster-proof their backend systems. The cloud is the optimal solution here, as it facilitates the timely migration practices mentioned above. In this environment, mission-critical applications run on virtual machines that administrators can move between servers with ease. Container-centered cloud solutions are even better. Businesses of all kinds have moved toward this approach in recent years, TechTarget reported. Containers are self-contained processing environments that share a single host's operating system, making them lighter and more agile than virtual machines. IT teams can quickly create these environments and move them with ease. Plus, containerization is perfect for users who need to maintain multiple iterations of a single application, according to the International Data Group. Once solutions such as those above are in place, systems administrators should draft protocols for maintaining these fixtures to ensure business continuity.

While disaster recovery plans offer companies some protection in the event of unforeseen situations, avoidance strategies are the better option, supporting around-the-clock backend functionality. When mere minutes of downtime cost thousands, firms must do everything they can to keep mission-critical applications up and running. Disaster avoidance is the only way to achieve this kind of always-on operational approach.

Are you ready to swap your disaster recovery plan for an avoidance strategy? Connect with TelcoSolutions. Our partners offer cutting-edge data center and cloud services solutions that can facilitate such a transition. Contact us today to set up a meeting with one of our consultants and design the networking strategy that fits your operation.