{"id":221,"date":"2011-08-02T12:18:27","date_gmt":"2011-08-02T17:18:27","guid":{"rendered":"http:\/\/www.smbitjournal.com\/?p=221"},"modified":"2017-02-18T11:27:04","modified_gmt":"2017-02-18T16:27:04","slug":"do-you-really-need-redundancy-the-real-cost-of-downtime","status":"publish","type":"post","link":"https:\/\/smbitjournal.com\/2011\/08\/do-you-really-need-redundancy-the-real-cost-of-downtime\/","title":{"rendered":"Do You Really Need Redundancy: The Real Cost of Downtime"},"content":{"rendered":"
Downtime – now that is a word that no one wants to hear.\u00a0 It strikes fear into the heart of businesses, executives and especially IT staff.\u00a0 Downtime costs money and it causes frustration.<\/p>\n
Because downtime triggers an emotional reaction, businesses often respond to it differently than they do to traditional business factors.\u00a0 This emotional approach causes businesses, especially smaller businesses that often lack rational financial controls, to treat downtime as being far worse than it is.\u00a0 It is not uncommon to find that smaller businesses have actually done more financial damage to themselves reacting to a fear of potential downtime than the feared downtime would have inflicted had it actually occurred.\u00a0 This is a dangerous overreaction.<\/p>\n
The first step is to determine the cost of downtime.\u00a0 In IT we are often dealing with rather complex systems, and downtime comes in a variety of flavors such as loss of access, loss of performance or a complete loss of a system or systems.\u00a0 Determining every type of downtime and its associated costs can be rather complex, but a high level view is often enough for producing rational budgets or is, at the very least, a good starting point on a path towards understanding the business risks involved with downtime.\u00a0 Keep in mind that, just as spending too much to avoid downtime is bad, spending too much to calculate the costs of downtime is bad as well.\u00a0 Don’t spend so much time and resources determining if you will lose money that you would have been better off just losing it.\u00a0 Beware of the high cost of decision making.<\/p>\n
We can start by considering only complete system loss.\u00a0 What is the cost of organizational downtime for you – that is, if you had to cease all business for an hour or a day, how much money would be lost?\u00a0 In some cases the losses could be dramatic, as in the case of a hospital, where a day of downtime would result in a loss of faith and future customer base and potentially result in lawsuits.\u00a0 But in many cases a day of downtime might have nominal financial impact – many businesses could simply call the day a holiday, let their staff rest for the day and have people work a little harder over the next few days to make up the backlog from the lost day.\u00a0 It all comes down to how your business does and can operate and how well suited you are for mitigating lost time.\u00a0 Many businesses will only look at daily revenue figures to determine lost revenue, but this can be wildly misleading.<\/p>\n
Once we have a rough figure for downtime cost we can then consider downtime risk.\u00a0 This is very difficult to assess, as good figures on IT system reliability are nearly non-existent and every organization’s systems are so unique that industry data is very nearly useless.\u00a0 Here we are forced to rely on IT staff to provide an overview of risks and, hopefully, a reliable assessment of the likelihood of each individual risk.\u00a0 For example, in big round numbers, if we had a line of business application that ran on a server with only one hard drive then we would expect that sometime in the next five to ten years there will be downtime associated with the loss of that drive.\u00a0 If we have that same server with hot swap drives in a mirrored array then the likelihood of downtime associated with that storage system, even over ten years, is quite small.\u00a0 This doesn’t mean that a drive is not likely to fail (it is), but that the system is likely to remain unaffected until redundancy is restored, without end users noticing that anything has happened.<\/p>\n
Our last rough estimation tool is to apply applicable business hours.\u00a0 Some businesses run 24×7, of course, but most do not.\u00a0 Is the loss of a line of business application at six in the evening equivalent to the loss of that application at ten in the morning?\u00a0 What about on the weekend?\u00a0 Are people productively using it at three on a Friday afternoon, or would losing it barely cost a thing and make for happy employees getting an extra hour or two on their weekends?\u00a0 Can schedules be shifted in case of a loss near lunch time?\u00a0 These factors, while seemingly trivial, can be significant.\u00a0 If downtime is limited to only two to four hours then many businesses can mitigate nearly all of the financial impact simply by asking employees to have a little flexibility in their schedules to accommodate the outage by taking lunch early or leaving work early one day and working an extra hour the next.<\/p>\n
Now that we have these factors – the cost of downtime, the ability to mitigate downtime impact based on duration and the risks of outage events – we can begin to draw a picture of what a downtime event is likely to look like.\u00a0 From this we can begin to derive how much money it would be worth spending to reduce the risk of such an event.\u00a0 For some businesses this number will be extremely high and for others it will be surprisingly low.\u00a0 This exercise can expose a great deal about how a business operates that may not normally be all that visible.<\/p>\n
It is important to note at this point that what we are looking at here is a loss of availability of systems, not a loss of data.\u00a0 We are assuming that good backups are being taken and that those backups are not compromised.\u00a0 Redundancy and downtime are not topics related to data loss, just availability loss.\u00a0 Data loss scenarios should be treated with equal or greater diligence but are a separate topic.\u00a0 It is a rare business that can survive catastrophic data loss, but it is common for a business to experience, and easily survive, even substantial downtime.<\/p>\n
There are multiple ways to stave off downtime.\u00a0 Redundancy is highly visible and treated almost like a buzz word, and so receives a lot of focus, but there are other means as well.\u00a0 Good system design is important; avoiding system complexity can heavily reduce downtime simply by removing points of unnecessary risk and fragility.\u00a0 Using quality hardware and software is important as well, as low end hardware that is redundant will often fail just as often as non-redundant enterprise class hardware.\u00a0 Having a rapid supply chain of replacement parts can be a significant factor, often seen in the form of four hour hardware vendor replacement part response contracts.\u00a0 The list goes on.\u00a0 What we will focus on is redundancy, which is where we are most likely to overspend when faced with the fear of downtime.<\/p>\n
Now that we know the costs of failing to have adequate redundancy we can compare this potential cost against the very real, up front cost of providing that redundancy.\u00a0 Some things, such as hard drives, are highly likely to fail and relatively easy and cost effective to make redundant – taking significant risk and trivializing it.\u00a0 These tend to be a first point of focus.\u00a0 But there are many areas of redundancy to consider such as power supplies, network hardware, Internet connections and entire systems – often made redundant through modern virtualization techniques providing new avenues for redundancy previously not accessible to many smaller businesses.<\/p>\n
New types of redundancy, especially those made available through virtualization, are often a point where businesses will be tempted to overspend, perhaps dramatically, compared to the risks of downtime.\u00a0 Worse yet, in the drive to acquire the latest fads in redundancy, companies will often implement these techniques incorrectly and actually introduce greater risk and a higher likelihood of downtime compared to having done nothing at all.\u00a0 It is becoming increasingly common to hear of businesses spending tens or even hundreds of thousands of dollars in an attempt to mitigate a downtime monetary loss of only a few thousand dollars – and then failing in that attempt and ending up increasing their risk anyway.<\/p>\n
When gauging the cost of mitigation it is critical to remember that mitigation is a guaranteed expense, whereas the risk is only a risk.\u00a0 It is much like auto insurance, where you pay a guaranteed small monthly fee in order to fend off a massive, unplanned expense.\u00a0 The theory of risk mitigation is to spend a comparatively small amount of money now in order to reduce the risk of a large expense later, but if the cost of mitigation gets too high then it becomes better to simply accept the risks.<\/p>\n
Systems can be assessed individually, of course.\u00a0 Keeping a web presence and telephone system up and running at all times is far more important than an email system where even hours of downtime are unlikely to be detectable by external clients.\u00a0 Paying only to protect those systems where the cost of downtime is significant is an important strategy.<\/p>\n
Do not be surprised if what you discover is that, beyond some very basic redundancy (such as mirrored hard drives), a simple network design with good backups and restore plans and a good hardware support contract is all that is needed for the majority, if not all, of your systems.\u00a0 By lowering the complexity of your systems you make them naturally more stable and easier to manage – further reducing the cost of your IT infrastructure.<\/p>\n","protected":false},"excerpt":{"rendered":"
Downtime – now that is a word that no one wants to hear.\u00a0 It strikes fear into the heart of businesses, executives and especially IT staff.\u00a0 Downtime costs money and it causes frustration. Because downtime triggers an emotional reaction businesses are often left reacting to it differently than traditional business factors.\u00a0 This emotional approach causes … Continue reading Do You Really Need Redundancy: The Real Cost of Downtime<\/span>