I remember pulling into the parking lot at work and wondering why so many vehicles were in the public parking area. We had an employee parking lot that was controlled by an automatic gate–and the gate apparently was broken or the power was out. My mind started running down the problems we would have if the power was, in fact, out. We hosted our own data center, and without power, we probably were in a very bad place.
I should have received an e-mail notifying me if any of our Web servers were down, but if everything was down, that meant e-mail wasn't working either. The building was located in a semi-rural area, outside of redundant power grids common in urban environments. We had battery-powered backup power supplies, but their effective life was only 60 minutes or so.
My first fears were quickly confirmed. The power was out, and all network and communications gear was down. The senior executive present was trying to get the network guys to run out and purchase portable generators. We had paying customers consuming services we provided by way of the Internet. The idea, of course, was the quick purchase of a thousand dollars' worth of portable generators would get those services back online ASAP.
It took only a few minutes for the folly of that plan to sink in. Powering up a data center actually is a little more complex than running a 500-foot extension cord from a couple of 1000-watt portable generators. What really started to sink in, though, was the realization we had no business continuation plan. There had been a lot of talk about real fail-over generators or locating critical servers in a real data center, but that is all it had been–talk. We had tacitly accepted the risks that go with not having a plan, and now we were paying the price.
Governance
A few months ago, we discussed governance in this column. An essential part of IT governance is disaster recovery (DR) and business continuance plans. What will you do when a major system fails for one reason or another? How critical is that system to your organization?
All too often when we build or buy IT systems we justify them using ROI models that are not based on reality. Let's suppose you are interested in creating a new ratings system. That system can be specified and priced and then justified based on increased revenue or decreased expenses attributed to the new system. The total cost of ownership, including licensing, development, hardware, and maintenance costs, is projected over a period of time, and if the numbers display a positive ROI, the system may be green-lighted. It sounds good, but have we examined all the factors?
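To see how much the answer depends on which costs go into the model, here is a minimal sketch. Every dollar figure in it is hypothetical, the cost categories beyond those named above (a fail-over site and DR testing) are assumptions added for illustration, and the ROI formula, net gain divided by total cost, is a common simplification rather than any particular organization's model.

```python
def simple_roi(total_gain: float, total_cost: float) -> float:
    """Return ROI as a fraction: (gain - cost) / cost."""
    return (total_gain - total_cost) / total_cost


# Hypothetical five-year projections, in dollars (illustrative only).
projected_gain = 1_500_000
usual_costs = {
    "licensing": 200_000,
    "development": 400_000,
    "hardware": 150_000,
    "maintenance": 250_000,
}
# Assumed continuity costs that are often left out of the model.
continuity_costs = {
    "failover_site": 300_000,
    "dr_testing": 100_000,
}

cost_without_continuity = sum(usual_costs.values())
cost_with_continuity = cost_without_continuity + sum(continuity_costs.values())

print(f"ROI, usual costs only:      {simple_roi(projected_gain, cost_without_continuity):.1%}")
print(f"ROI, with continuity costs: {simple_roi(projected_gain, cost_with_continuity):.1%}")
```

With these made-up numbers the project still shows a positive return once continuity is included, but a far smaller one. The point is not the figures themselves; it is that the decision can look very different once all the factors are on the table.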
How Dependent?
We need to quantify just how critical any particular system is to the lifeblood of an organization before we determine what effect that system will have on a business continuation plan. The lifeblood of the organization is, of course, money. What is the net effect that the discontinuation of any particular system will have on top- and bottom-line revenue? How dependent are we on that system? Most systems have some degree of criticality within the organization, some more than others. A ratings engine obviously lies in the critical zone if you are in the business of underwriting insurance policies.
Given that, we need to decide how long we can afford to operate without that system. Is it a few hours? A few days? Uptime for critical systems is referred to in terms of 9s. For example, 99.999 percent, or five nines, equates to a total allowed downtime of a little more than five minutes a year. Four nines (99.99 percent) is about 52 minutes a year. Three nines (99.9 percent) availability allows you about eight hours and 45 minutes of downtime a year. That isn't a lot of time. It also means "standard" disaster recovery methods aren't going to cut it. Restoring a physically damaged system with new equipment and tape backups stored off site is not going to meet any of those criteria.
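The arithmetic behind those figures is easy to check. The sketch below assumes a 365-day year; the function name is made up for illustration.

```python
# Allowed downtime per year for a given availability percentage.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted at this availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    levels = [("five nines", 99.999), ("four nines", 99.99), ("three nines", 99.9)]
    for label, percent in levels:
        minutes = allowed_downtime_minutes(percent)
        print(f"{label} ({percent}%): {minutes:.1f} minutes a year")
```

Three nines works out to roughly 525 minutes, which is where the figure of about eight hours and 45 minutes comes from.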
Rethink
The primary concerns of most customers who build software systems to allow them to solve particular business needs are:
o How much is it going to cost?
o When can it be delivered?
And those are fair and legitimate questions. The problem is the limited scope of the questions. When most organizations ask how much it is going to cost, they mean how much they are going to spend on hardware, software development and configuration, and licensing before they can go live. We will call these first-level costs. And often even those costs are based upon preconceived direction from the client. Software vendors obviously have a vested interest in making very high-level hardware recommendations to support their products. Purchasers of their software often make the mistake of assuming everything is overspecified and decide they safely can cut back the specifications in order to save money. Bad decision.
I Want to Be Liked
There is a tendency among technology managers to try to cut the costs of IT projects to make them more palatable to the business owners. This practice is foolish and makes no sense. We should not be afraid to be honest and forthright about the real cost of IT projects. Time and time again, I see projects that already have budgets, and IT then is forced to shoehorn a business-critical project into that budget somehow. If any part of a business cries out for zero-base budgeting, it is IT projects.
Do we feel such a need to be liked that we always are willing to say we will do the project for less than the real cost? We should never shrink back from providing realistic and inclusive cost and resource estimates. And I haven't even started talking about the real cost of ownership.
Real World
Let's get real for a minute. I recently had a conversation with a very successful independent insurance agent. He places policies with a number of regional, national, and international insurance carriers. We were discussing why he would place a policy with one carrier or another. In some circumstances, it is a matter of having a longstanding relationship with the underwriter or knowledge that for a particular type of policy a certain carrier would provide more competitive rates. The agent also said, in some cases, it simply is a matter of some insurance companies providing better, easier-to-use tools than others.
Think about that. You have a Web-based system agents can use to get quotes, write policies, modify policies, make payments, etc. A substantial part of your business is directly related to this system. So, maybe we are talking about five 9s–you cannot afford to lose business by having this application unavailable. Once agents switch to another carrier because you were not available, they may not come back.
Now we are talking about business continuity, and basic disaster recovery plans do not provide 99.999 percent availability.
Business Continuance
So, you finally have made the decision to create a continuity plan for this system. What are the elements of that plan? It really is a twofold process–first, build a system that by itself will provide very high availability. Use clustered or mirrored database servers. Use industrial-strength SANs with sophisticated backup and restore devices attached. Use server hardware with RAID technologies. Use servers with redundant hot swappable drives.
Consider virtualization with virtual machines running hot ready to swap into the system. Consider blade servers with redundant blades ready to be rolled into the mix. If you are using regular rack-mounted stand-alone servers, have fully configured spares on standby. Consider creating a staging/fail-over environment that is a mirror of production. Insist all mission-critical equipment is operating in a secure data center with redundant power supplies and redundant network connectivity.
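Why does all this redundancy matter? A rough sketch of the arithmetic follows. It assumes components fail independently of one another, which real hardware sharing a rack, a power feed, or a building does not fully satisfy, and the availability figures used are hypothetical.

```python
from math import prod


def series_availability(availabilities):
    """Every component must be up, so multiply the availabilities together."""
    return prod(availabilities)


def redundant_availability(availabilities):
    """Any one component is enough, so the system is down only if all are down."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical figures: each server alone is available 99 percent of the time.
single_server = 0.99
mirrored_pair = redundant_availability([single_server, single_server])

# The mirrored pair still depends on a single power feed (assumed 99.9 percent).
power_feed = 0.999
pair_on_one_feed = series_availability([mirrored_pair, power_feed])

print(f"single server:      {single_server:.4%}")
print(f"mirrored pair:      {mirrored_pair:.4%}")
print(f"pair, single feed:  {pair_on_one_feed:.4%}")
```

Under these assumptions, mirroring two 99 percent servers gets you to roughly 99.99 percent, but hanging that pair off one non-redundant power feed drags the total back below three nines. The chain is only as good as its weakest single point of failure, which is why the power and network connectivity need to be redundant too.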
Phase Two
If you really are serious, you will consider a second mirrored data center with hot servers ready to assume the load in an emergency. Your secondary data center does not need to be as robust as your primary. It truly is an emergency fail-over and likely will be used only during regular testing. The fail-over data center does not need to encompass every application in the organization–but it better have all your mission-critical applications ready to roll.
Real World, Part Two
This is not pie-in-the-sky stuff. Nor are these the ramblings of a mad hardware geek with an unlimited budget. There are organizations that take their data and data systems seriously enough to build killer systems and killer fail-over data centers. And those organizations are successful. Their bottom-line revenue is solid. It simply is a matter of priorities. If a particular system truly is necessary to run your business successfully, then it is worth the investment to make certain it is available.
The biggest problem is in organizations that have an established pattern of beating down IT spending. In some corporate cultures, IT systems are considered a necessary evil that must be tolerated but need to be kept in check. Those organizations inevitably will experience a non-recoverable IT disaster. Information systems all fail at some point in time. Playing the odds it won't happen on your watch is shortsighted and self-serving.
So, What Does It Cost?
Let's get back to the organization that really wants to know what a system is going to cost. How do we convince that organization to look beyond the first-level costs? As a practical matter, it is very difficult and isn't going to happen most of the time. Businesses use a rationale similar to this: "We will use the increased revenues we gain from the new system to build it out the way it really should be. But we can't afford to do that until we have the system running."
Isn't there a really bad joke about the three biggest lies? Well, this is one I have heard over and over again. Once an IT system is running, it is assumed to be a commoditized cow that can be milked and milked with only the minimum of care and feeding. The money never returns to IT so that "we can do it right this time." If there isn't enough funding available to fund a project properly from the beginning, then the organization is not ready for the project.
It reminds me of the young man who purchased a very expensive German automobile. All services were included in the purchase price for the first 50,000 miles. When he scheduled his first service after the "free" ones, he found he couldn't afford it. All his ready cash was earmarked for another three years of installment payments on the vehicle. It seems to me he hadn't considered the true cost of ownership. Let's not make the same mistake.
If the information technology group in an organization truly cares about that organization, it will present management with realistic cost estimates that include more than first-level costs. Business continuation is all about the long-term viability of the organization, and if certain IT systems are a critical part of that, then the business must be prepared to absorb the cost of that commitment. Don't worry about being liked. Try getting real instead.
Please address comments, complaints, and suggestions to the author at prolich@yahoo.com.
© Touchpoint Markets, All Rights Reserved.