While most businesses do a fairly good job of identifying the global threats to business operations, they often don't realize the smallest details can undermine seemingly robust business continuity or disaster recovery plans.

How can an organization make an accurate assessment of risk without knowing all of the obstacles that may stand in the way of a complete return to normal business operations? The best way to uncover the gremlins that may sabotage an organization's speedy and seamless post-incident recovery is to periodically conduct real-world tests of each element of the plan and update it accordingly.

Take the case of two companies that had what appeared to be solid disaster recovery plans but failed to see the gremlins lurking.

o Company 1: A Failure To Test!

This company was a large professional services firm operating from a home office hub in a metropolitan high-rise that connected to its multiple satellite offices via a Web-based network. Because of this company's total dependence on its IT systems for all of its core business operations, it determined that protecting its IT network would be a mission-critical element of any disaster recovery plan.

Extensive measures were taken to install redundancy throughout its vast IT system. For several years, in fact, the organization had been making backups of its entire system each week. The backups were tested weekly for data integrity, ensuring that the backup was successful and the data intact.

The company's disaster recovery plan anticipated the need for an operational hot spot – a secondary location equipped with necessary components, including working utilities, desks, phones, networked computers and broadband Web access – and there was a standby contract in place for such a facility. From a senior management perspective, the company's vital IT system seemed well-protected and disaster-ready.

When the largest hail storm on record rendered the high-rise home office building uninhabitable, the company promptly moved into its hot spot location and attempted to resume operations, only to find out there was a gremlin in its recovery plan.

When the organization selected its hot spot, there had been a detailed analysis of exactly how much capacity it would require at the alternative site. The contract specified the number of desks, computers and phones that were to be operational.

Unfortunately, during the 18-month period after the hot spot was contracted and prior to the devastating hail storm, no one from the IT department had ever taken the data backups to the hot spot and attempted to install the company's data and boot the system.

It was presumed, based on detailed specifications in the standby contract, that the company's backup data would be compatible with the IT system at the hot spot. That turned out to be the gremlin.

The company's IT system had a sophisticated firewall system that prevented access from unauthorized users and locations. All secure access to the network and system administration had been done exclusively through the home office hub, which was damaged or destroyed in the massive hail storm.

When the backup data was installed at the hot spot, the firewall system interpreted the entire hot spot as an unauthorized system and locked up, preventing any access to the company's files. Had someone tried to install the backup data at the hot spot before the catastrophe, this problem could have been identified and easily fixed while the network hub was operational.

o Company 2: Too Much Testing!

The second organization was a diversified financial services company with many of its larger operations located in hurricane- or flood-prone areas of coastal states. The company had almost doubled in size via organic growth and acquisitions over the previous 10-year period.

There was a full-time risk manager and risk management staff operating from the main corporate office. The company had done a good job of bringing all of its new operations into the existing disaster recovery plan, despite rapid expansion and growth.

Communication between each regional hub and the corporate office was maintained with land lines and backed up with independent direct satellite uplinks in the event of any primary interruption.

Based on its experience from previous flooding incidents and coastal storms, the company decided to install diesel-powered generators with power outputs averaging 500 kW at all of its regional processing hubs to ensure basic power for operations.

These generators were tested monthly and run for approximately 30 minutes as part of the company's ongoing testing procedures. From a top-down perspective, this company apparently had a very robust and integrated plan for business recovery.

When Hurricane Katrina hit, however, the company realized there had been a gremlin undermining its plan. During the selection and installation of the backup power generators, the company determined that 500 gallons of diesel would supply fuel to run each power plant for approximately four eight-hour business days.

All of the 500-gallon storage tanks had been filled during the initial installation, and there was a stand-by contract in place to refuel each location in the event a storm interrupted main power at any location.

The gremlin turned out to be the monthly testing of the backup generators – they consumed about eight to 10 gallons of diesel fuel each month. During the two years of monthly testing pre-Katrina, the coastal locations had used up almost half of the available fuel in the storage tanks. Because no one thought about this type of fuel depletion, the refueling contract did not call for periodic refueling in the absence of a specific weather event.
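
The fuel arithmetic is easy to check with a few lines of Python. The sketch below uses only the figures given above (a 500-gallon tank sized for four eight-hour days, eight to 10 gallons burned per monthly test, 24 months of testing). The function names and the assumption of a constant burn rate under load are illustrative and not part of the company's plan. On those assumptions, testing alone would have drawn down 192 to 240 gallons, leaving roughly 260 to 308 gallons, or about 17 to 20 hours of running time.

```python
# Figures taken from the article; the constant-burn-rate model is an assumption.
TANK_GALLONS = 500.0
DESIGN_RUNTIME_HOURS = 4 * 8              # four eight-hour business days
TEST_BURN_GALLONS_PER_MONTH = (8.0, 10.0)  # low and high estimates
MONTHS_OF_TESTING = 24                     # two years of monthly tests


def burn_rate_gph(tank_gallons, design_hours):
    """Gallons per hour implied by the tank size and planned running time."""
    return tank_gallons / design_hours


def fuel_remaining(tank_gallons, monthly_test_burn, months):
    """Fuel left after routine testing with no refill in between."""
    return max(tank_gallons - monthly_test_burn * months, 0.0)


def hours_of_runtime(gallons, rate_gph):
    """Running time the remaining fuel supports at a constant burn rate."""
    return gallons / rate_gph


rate = burn_rate_gph(TANK_GALLONS, DESIGN_RUNTIME_HOURS)
for burn in TEST_BURN_GALLONS_PER_MONTH:
    left = fuel_remaining(TANK_GALLONS, burn, MONTHS_OF_TESTING)
    hours = hours_of_runtime(left, rate)
    print(f"{burn:.0f} gal/month of testing: {left:.0f} gal left, "
          f"about {hours:.1f} hours under load")
```

A calculation like this, rerun whenever the test schedule or tank size changes, would show when a routine top-off belongs in the refueling contract.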

When Katrina hit, the generators worked just as anticipated, but they ran out of fuel after a day and a half and left the regional hubs dark for almost two days, until the refueling contractor made its scheduled stop.

Had anyone taken the time to physically check the fuel gauge on the generators before the storm, it would have been obvious that the tanks had been substantially depleted. The plan did not call for anyone to look at the fuel gauges, however.

How can an organization defend against these sabotaging gremlins? When it comes to testing a recovery plan, presume nothing, test everything and reevaluate the plan based on the tests. Every element of the plan that is critical to recovery needs a trial implementation – as though there were a real emergency or disaster. The plan also needs to include the necessary funding for field testing and updating.

Lastly, have someone outside of the planning process review and evaluate the plan. Fresh eyes bring fresh perspective and may be what's needed to spot the gremlins that can hijack contingency plans.

Don H. Donaldson, RPA, CIC, CRM, CHS is president of LA Group in Montgomery, Texas. He may be reached at [email protected].

