All posts by Scott Alan Miller

Started in software development with Eastman Kodak in 1989 as an intern in database development (making database platforms themselves). Began transitioning to IT in 1994 with my first mixed role in system administration.

Do You Really Need Redundancy: The Real Cost of Downtime

Downtime – now that is a word that no one wants to hear.  It strikes fear into the heart of businesses, executives and especially IT staff.  Downtime costs money and it causes frustration.

Because downtime triggers an emotional reaction, businesses often react to it differently than they do to traditional business factors. This emotional approach causes businesses, especially smaller businesses often lacking in rational financial controls, to treat downtime as being far worse than it is. It is not uncommon to find that smaller businesses have actually done more financial damage to themselves reacting to a fear of potential downtime than the feared downtime would have inflicted had it actually occurred. This is a dangerous overreaction.

The first step is to determine the cost of downtime. In IT we are often dealing with rather complex systems and downtime comes in a variety of flavors such as loss of access, loss of performance or a complete loss of a system or systems. Determining every type of downtime and its associated costs can be rather complex but a high level view is often enough for producing rational budgets or is, at the very least, a good starting point on a path towards understanding the business risks involved with downtime. Keep in mind that, just as spending too much to avoid downtime is bad, spending too much to calculate the costs of downtime is bad too. Don’t spend so much time and resources determining if you will lose money that you would have been better off just losing it. Beware of the high cost of decision making.

We can start by considering only complete system loss. What is the cost of organizational downtime for you – that is, if you had to cease all business for an hour or a day, how much money is lost? In some cases the losses could be dramatic, like in the case of a hospital where a day of downtime would result in a loss of faith and future customer base and potentially result in lawsuits. But in many cases a day of downtime might have nominal financial impact – many businesses could simply call the day a holiday, let their staff rest for the day and have people work a little harder over the next few days to make up the backlog from the lost day. It all comes down to how your business does and can operate and how well suited you are for mitigating lost time. Many businesses will only look at daily revenue figures to determine lost revenue but this can be wildly misleading, because much of that revenue is often delayed rather than lost.
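As a rough illustration of why the daily revenue figure alone can mislead, the sketch below separates revenue that is truly lost from revenue that is merely delayed. The function name, its parameters and all of the dollar figures are hypothetical, chosen only to show the arithmetic.

```python
def net_downtime_cost(revenue_per_day, recoverable_fraction, extra_costs=0.0):
    """Estimate the net cost of one full day of downtime.

    revenue_per_day: average revenue normally earned in one day
    recoverable_fraction: share of that revenue (0.0 to 1.0) expected to be
        made up afterwards by working through the backlog
    extra_costs: overtime, penalties and similar one-off costs of the outage
    """
    if not 0.0 <= recoverable_fraction <= 1.0:
        raise ValueError("recoverable_fraction must be between 0 and 1")
    lost_revenue = revenue_per_day * (1.0 - recoverable_fraction)
    return lost_revenue + extra_costs


# Hypothetical figures, for illustration only.
naive_estimate = net_downtime_cost(20_000, recoverable_fraction=0.0)
with_catch_up = net_downtime_cost(20_000, recoverable_fraction=0.9, extra_costs=1_500)
print(f"naive estimate:  {naive_estimate:,.0f}")
print(f"with catch-up:   {with_catch_up:,.0f}")
```

With these made-up inputs the naive estimate is the full day of revenue, while the estimate that allows for catching up is a small fraction of it. A business that cannot recover lost work, such as the hospital example, would use a recoverable fraction near zero and much larger extra costs.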

Once we have a rough figure for downtime cost we can then consider downtime risk. This is very difficult to assess as good figures on IT system reliability are nearly non-existent and every organization’s systems are so unique that industry data is very nearly useless. Here we are forced to rely on IT staff to provide an overview of risks and, hopefully, a reliable assessment of likelihoods of individual risks. For example, in big round numbers, if we had a line of business application that ran on a server with only one hard drive then we would expect that sometime in the next five to ten years there will be downtime associated with the loss of that drive. If we have that same server with hot swap drives in a mirrored array then the likelihood of downtime associated with that storage system, even over ten years, is quite small. This doesn’t mean that a drive is not likely to fail, it is, but that the system is likely to be unaffected, with redundancy restored without end users noticing that anything has happened.
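The single drive versus mirrored array comparison can be put into big round numbers with a very simple model. The sketch below assumes a constant, independent annual failure rate per drive and assumes a mirror is only lost if the surviving drive also fails during the rebuild window. The 5% failure rate and three day rebuild window are assumptions for illustration, not measured figures, and real drives do not fail this neatly.

```python
def prob_single_drive_loss(annual_failure_rate, years):
    """Chance that one drive fails at least once over `years`."""
    return 1.0 - (1.0 - annual_failure_rate) ** years


def prob_mirror_loss(annual_failure_rate, years, rebuild_days):
    """Rough chance that a two-drive mirror loses both drives.

    Approximation: the array is lost only if the surviving drive also
    fails during the rebuild window that follows the first failure.
    """
    first_failure = 1.0 - (1.0 - annual_failure_rate) ** 2  # either drive, per year
    second_during_rebuild = annual_failure_rate * rebuild_days / 365.0
    array_loss_per_year = first_failure * second_during_rebuild
    return 1.0 - (1.0 - array_loss_per_year) ** years


AFR = 0.05  # assumed 5% annual failure rate, not a measured figure
single = prob_single_drive_loss(AFR, years=10)
mirror = prob_mirror_loss(AFR, years=10, rebuild_days=3)
print(f"single drive, 10 years:  {single:.1%}")
print(f"mirrored pair, 10 years: {mirror:.3%}")
```

Under these assumptions the single drive is quite likely to cause an outage within ten years, while the mirrored pair almost never does. The exact numbers matter far less than the orders of magnitude between them.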

Our last rough estimation tool is to apply applicable business hours. Some businesses run 24×7, of course, but most do not. Is the loss of a line of business application at six in the evening equivalent to the loss of that application at ten in the morning? What about on the weekend? Are people productively using it at three on a Friday afternoon or would losing it barely cost a thing and make for happy employees getting an extra hour or two on their weekends? Can schedules be shifted in case of a loss near lunch time? These factors, while seemingly trivial, can be significant. If downtime is limited to only two to four hours then many businesses can mitigate nearly all of the financial impact simply by asking employees to have a little flexibility in their schedules to accommodate the outage by taking lunch early or leaving work early one day and working an extra hour the next.
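To see how much business hours matter, consider what share of the week a business is actually open. The sketch below is a minimal calculation assuming an outage is equally likely to start at any hour of the week. The schedule and the hourly cost figures are hypothetical.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours in a week


def business_hours_fraction(hours_per_day, days_per_week):
    """Share of the week during which the business is actually operating."""
    return (hours_per_day * days_per_week) / HOURS_PER_WEEK


def expected_cost_per_outage_hour(cost_in_hours, cost_off_hours, fraction_open):
    """Average cost of one hour of outage that starts at a random time."""
    return cost_in_hours * fraction_open + cost_off_hours * (1.0 - fraction_open)


# Hypothetical schedule: eight hours a day, five days a week.
fraction_open = business_hours_fraction(hours_per_day=8, days_per_week=5)
average_cost = expected_cost_per_outage_hour(1_000, 0, fraction_open)
print(f"open {fraction_open:.1%} of the week")
print(f"average cost per outage hour: {average_cost:.0f}")
```

A business on a standard forty hour week is open for less than a quarter of all hours, so an outage at a random time costs, on average, less than a quarter of what the same outage would cost at ten on a weekday morning.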

Now that we have these factors – the cost of downtime, the ability to mitigate downtime impact based on duration and the risks of outage events – we can begin to draw a picture of what a downtime event is likely to look like. From this we can begin to derive how much money it would be worth to reduce the risk of such an event. For some businesses this number will be extremely high and for others it will be surprisingly low. This exercise can expose a great deal about how a business operates that may not be normally all that visible.
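One simple way to combine these factors into a single rough number is an expected annual loss. The sketch below is only one possible way to do the arithmetic; the function, its parameters and every figure in the example are hypothetical.

```python
def expected_annual_loss(cost_per_hour, outage_hours, outages_per_year,
                         mitigated_fraction):
    """Rough expected yearly loss from one kind of outage.

    cost_per_hour: cost of an hour of downtime during business hours
    outage_hours: typical length of the outage, in business hours
    outages_per_year: expected frequency (0.2 means once every five years)
    mitigated_fraction: share of the impact (0.0 to 1.0) absorbed by
        shifting schedules, catching up later and similar measures
    """
    cost_per_event = cost_per_hour * outage_hours * (1.0 - mitigated_fraction)
    return cost_per_event * outages_per_year


# Hypothetical: a four hour outage, expected once every five years,
# costing 1,000 per hour, with three quarters of the impact mitigated.
loss = expected_annual_loss(1_000, outage_hours=4, outages_per_year=0.2,
                            mitigated_fraction=0.75)
print(f"expected annual loss: {loss:,.0f}")
```

The resulting figure is an upper bound on what it is worth spending each year to eliminate that particular risk entirely.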

It is important to note at this point that what we are looking at here is a loss of availability of systems, not a loss of data. We are assuming that good backups are being taken and that those backups are not compromised. Redundancy and downtime are not topics related to data loss, just availability loss. Data loss scenarios should be treated with equal or greater diligence but are a separate topic. It is a rare business that can survive catastrophic data loss, but it is common for a business to experience, and easily survive, even substantial downtime.

There are multiple ways to stave off downtime. Redundancy is highly visible and treated almost like a buzz word and so receives a lot of focus, but there are other means as well. Good system design is important; avoiding system complexity can heavily reduce downtime simply by removing points of unnecessary risk and fragility. Using quality hardware and software is important as well, since low end hardware that is redundant will often fail just as often as non-redundant enterprise class hardware. Having a rapid supply chain of replacement parts can be a significant factor, often seen in the form of four hour hardware vendor replacement part response contracts. The list goes on. What we will focus on is redundancy, which is where we are most likely to overspend when faced with the fear of downtime.

Now that we know the costs of failing to have adequate redundancy we can compare this potential cost against the very real, up front cost of providing that redundancy.  Some things, such as hard drives, are highly likely to fail and relatively easy and cost effective to make redundant – taking significant risk and trivializing it.  These tend to be a first point of focus.  But there are many areas of redundancy to consider such as power supplies, network hardware, Internet connections and entire systems – often made redundant through modern virtualization techniques providing new avenues for redundancy previously not accessible to many smaller businesses.

New types of redundancy, especially those made available through virtualization, are often a point where businesses will be tempted to overspend, perhaps dramatically, compared to the risks of downtime.  Worse yet, in the drive to acquire the latest fads in redundancy companies will often implement these techniques incorrectly and actually introduce greater risk and a higher likelihood of downtime compared to having done nothing at all.  It is becoming increasingly common to hear of businesses spending tens or even hundreds of thousands of dollars in an attempt to mitigate a downtime monetary loss of only a few thousand dollars – and to then fail in that attempt and end up increasing their risk anyway.

When gauging the cost of mitigation it is critical to remember that mitigation is a guaranteed expense whereas risk is only a risk. It is much like auto insurance, where you pay a guaranteed small monthly fee in order to fend off a massive, unplanned expense. The theory of risk mitigation is to spend a comparatively small amount of money now in order to reduce the risk of a large expense later, but if the cost of mitigation gets too high then it becomes better to simply accept the risks.
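The comparison between a guaranteed expense and a possible one can be written down as a one-line test. This is a minimal sketch, assuming the expected yearly loss has already been estimated with and without the mitigation in place. All names and figures are hypothetical.

```python
def mitigation_is_worthwhile(loss_without, loss_with, mitigation_cost_per_year):
    """True if the yearly risk reduction exceeds the guaranteed yearly cost."""
    risk_reduction = loss_without - loss_with
    return risk_reduction > mitigation_cost_per_year


# Hypothetical: a mirrored drive costing 300 spread over five years.
cheap_mirror = mitigation_is_worthwhile(
    loss_without=800, loss_with=50, mitigation_cost_per_year=300 / 5)

# Hypothetical: a 50,000 redundant cluster spread over five years,
# protecting against an expected loss of only 2,000 per year.
costly_cluster = mitigation_is_worthwhile(
    loss_without=2_000, loss_with=200, mitigation_cost_per_year=50_000 / 5)

print("mirrored drive worthwhile:", cheap_mirror)
print("redundant cluster worthwhile:", costly_cluster)
```

With these made-up numbers the inexpensive mirror pays for itself many times over, while the cluster costs several times more each year than the risk it removes. A real decision would also weigh risk tolerance and the chance that the added complexity itself causes downtime.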

Systems can be assessed individually, of course.  Keeping a web presence and telephone system up and running at all times is far more important than an email system where even hours of downtime are unlikely to be detectable by external clients.  Paying only to protect those systems where the cost of downtime is significant is an important strategy.

Do not be surprised if what you discover is that, beyond some very basic redundancy (such as mirrored hard drives), a simple network design with good backups and restore plans and a good hardware support contract is all that is needed for the majority, if not all, of your systems. By lowering the complexity of your systems you make them naturally more stable and easier to manage – further reducing the cost of your IT infrastructure.

Patching in a Small Environment

In enterprise IT shops, system patching is a complicated process involving large numbers of test systems which mirror production systems so that each new patch arriving from operating system and software vendors can be tested in a real world environment to see how it interacts with the hardware and software combinations available in the organization. In an ideal world, every shop would have a managed patching process that immediately responded to newly published patches, tested them instantly and applied them as soon as each patch was deemed safe and applicable. But the world is not an ideal one and in real life we have to make do with limited resources: physical, temporal and financial.

Patches are generally released for a few key reasons: security, stability, performance and, occasionally, to supply new features. Except for the addition of new features, which is normally handled through a different release process, patches represent a fix to a known issue. This is not an “if it is not broken, don’t fix it” scenario but is an “it is broken and has not completely failed yet” scenario which demands attention – the sooner the better. Taking a “sit back and wait” approach to patches is unwise as the existence of a new patch means that malicious hackers have a “fix” to analyze and, even if an exploit did not exist previously, one will very shortly. The release of the patch itself can be the trigger for the immediate need for said patch.

This patch ecosystem creates a need for a “patch quickly” mentality. Patches should never sit; they need to be applied often, as soon as they are released and tested. Waiting to patch can mean running with critical security bugs or keeping systems unnecessarily unreliable.

Small IT shops rarely, if ever, have test environments whether for servers, networking equipment or even desktops.  Not ideal but, realistically, even if those environments were available few small shops have the excess human IT resources available to run those tests in a timely manner.

This is not as bleak as it sounds. The testing done for most patches is redundant with testing already done by the vendor. Vendors cannot possibly test every hardware and software interaction that could ever happen with their products but they generally test wide ranges of permutations and look at areas where interactions are most likely. It is rare for a major vendor to cripple their own software with bad patches. Yes, it does happen and having good backups and rollback plans is important, but in day to day operations, patching is a relatively safe process that is far more important to do promptly than it is to wait for opportunities that may or may not occur.

Like any system change, patches are best applied in frequent, small dosages. If patches are applied promptly then normally only one or a few patches must be applied at the same time. For operating systems you may still have to deal with multiple patches at one time, especially if patching only weekly, but seldom must you patch dozens or hundreds of files at one time when done in this manner. When done like this it is vastly easier to evaluate patches for adverse effects and to roll back if a patch process goes badly.

The worst scenario for a small business lacking a proper patch testing workflow is to wait on patches. Waiting means that systems go without needed care for long periods of time and when patches are finally applied it is often in large, bulk patch processes. Applying many patches at once increases the chances that something will go wrong and, when it does, identifying which patch or patches are at fault and producing a path to remediation can be much more difficult.
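The effect of batch size can be sketched with a simple probability model. The example assumes each patch independently has the same small chance of causing a problem; the 1% figure is an assumption for illustration only, and real patches are neither independent nor equally risky.

```python
def prob_session_has_problem(per_patch_risk, patch_count):
    """Chance that at least one patch in a single session causes trouble."""
    return 1.0 - (1.0 - per_patch_risk) ** patch_count


PER_PATCH_RISK = 0.01  # assumed 1% chance per patch, not a measured figure

for patch_count in (3, 50):
    chance = prob_session_has_problem(PER_PATCH_RISK, patch_count)
    # Every patch in the session is a suspect when something breaks.
    print(f"{patch_count:>3} patches: {chance:.1%} chance of a problem, "
          f"{patch_count} suspects to investigate")
```

Small, frequent sessions rarely go wrong and leave only a handful of suspects when they do. A large bulk session is both far more likely to hit a problem and far harder to untangle afterwards.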

Delayed patching is a process that provides little or no advantage to either IT or a business but does carry substantial risk to security, stability and performance. Best practice for patching in a small environment is either to allow systems to self patch as quickly as possible or to schedule a regular patching process, perhaps weekly, during a time when the business is most prepared for patching to fail and patch remediation to be handled. Whether you choose to patch automatically or simply to do so regularly through a manual process, patch often and promptly for best results.

Never Get Advice from a Reseller (or Vendor)

This is general business advice that often applies to IT but is certainly not limited to that realm alone.  Outside support in IT comes from two main sources: firms who are paid (by you) to advise you and firms paid (by you) to sell you something.  The first are what we generally consider consultants.  The second are what we call resellers.

The simple rule of thumb is – never, ever get advice from a reseller. At least not general advice; at best, get only very specific advice centered purely around the products that that reseller sells. This isn’t to say that resellers are bad, far from it. In fact, the reason that you can’t get advice from a reseller is not because of them but is because of you – let me explain.

When we go to a company to get advice we must pay for that advice.  One way or another, nothing is ever free.  Resellers traditionally earn their money by providing whatever free advice we desire and then making their money by selling us a product that has been marked up to cover their costs and to provide for their profit.  This is fine, but as the customer we need to understand that we are only compensating that reseller if they convince us to buy a product or a service that they sell and we compensate them better the more of that product that they convince us to buy.  The reseller isn’t at fault here, we need resellers and we need them to make money in this manner.  The issue is going to them and attempting to get free, general advice – we are forcing them to either work for us for free or to sell us something whether it is the right thing for us or not.  We’ve backed them into the proverbial corner and the only reasonable response is for them to attempt to sell us what they offer.  That is, after all, their job.

This leads to an additional problem, of course, which is that resellers don’t have skilled, professional, general consultants on staff – at least not as a rule. So if you go to a reseller and ask for advice, that reseller is almost assuredly only trained and knowledgeable on the products that they sell themselves. They may not even be aware of what other solutions are on the market or, if they are, they do not know them to the same depth as their own products and may be unaware of advantages and caveats that you might need to know to make a truly informed decision. Even if they did, it is not in their interest to tell you about them – you are only going to compensate them if they sell you something.

This is not to say that resellers are not good, honest, hard-working folk with value for our industry.  They are, but they aren’t magically free consultants like many people expect them to be.  Resellers are there to add consulting and selection assistance, as well as warehousing, repair, logistics and other value-adds, solely around the products that they represent.  Trying to get general consulting from a reseller is like asking your Chevy dealer to advise you as to what vehicle to buy and hoping that they equally consider all major makes, models and types of transportation as well as the regulations and limitations of all of these and are able to apply this to your unique situation – including knowing when to tell you that you don’t need to buy anything at all.  Of course, all they will do is try to sell you the best Chevy that meets your needs whether the best option for you is to just walk, buy an Impala, take a cab or to buy a fifty foot deep-sea fishing trawler.  Even if they did have the expertise to look at the big scope of your transportation needs you aren’t willing to pay them unless they give a specific answer.  So we can expect that the answer we pay for is the one that we will get.

Resellers are useful only after the decision to buy the products that they sell has already been made.  A reseller can then help you choose the right product from the range that they have.  For example, if you are buying a server from a reseller, that reseller can help you to choose which options like drive types and sizes, out of band management and other add-ons you might want.  But even then, be wary that they are likely earning more to upsell you and will recommend unneeded extras or may advise making configuration changes without understanding the entire scope of the project and how those changes from your original requirements might affect you.

Attempt to limit the advice that you receive to very concrete items such as “does this particular model offer this particular feature that I am seeking?” and avoid subjective valuations between products such as “is this one fast enough or should I buy the bigger one?” or “how does this compare to your competitor’s product?”

When asking subjective questions you are actually pressuring the reseller into either making more money overselling to you or losing money while trying to find the most appropriate product.  Not only do they make more money (generally) selling you the more expensive item but it also mitigates their risk that they didn’t get you what you needed.  There is no reason for them to take on risk, they’ll just try to sell you as much as possible and, if you come back unhappy, they can say “well, we tried to convince you to get a bigger, faster model but you wanted to save money and this is what happens.”  So it is not in their interest in any way to size to your needs but always to pad for safety and profit.  A position that they are put in, again, by their customers.

In most cases, principal vendors are themselves resellers and so can be considered in exactly the same way. If you call Dell to buy a product, they will sell you a Dell no matter what your needs are. This is not their fault; they only have one job, to sell you Dell products, and if you call them for advice they can only assume that you did so because you wanted to buy a Dell. They are no more going to consult on what IBM product to buy than they are on what car to drive or if it’s a good time to sell your house and move to Florida. But they are very helpful in making sure that the Dell product that you order is going to be the one that you wanted and that the extra parts that you are getting will work with that model. That’s what they are there for. They will figure out how long it will take to arrive, go over warranty terms as well as give you pricing and financing options. These are all things that your general consultant cannot do. The two roles are complementary, not competitive.

A perfect example of this entire scenario is one that I see happen in the real world time and time again. With the recent explosion in virtualization, businesses are turning, en masse, to vendors to find out what they need in order to dip their toes into the world of virtualization. What I see, over and over, is that instead of being sold a reasonable virtualization setup they are often sold entire systems, including storage and software, that in no way meet their needs and, often, actually work against their needs while costing as much as ten to twenty times what a better performing, more reliable system would have cost. Often they are upsold into a completely unreasonable category of product for their project and then caught by budget limitations and stuck skimping – leaving them with a crippled virtualization project that could have been completed successfully for a fraction of the money spent while leaving good room for growth over time as needed.

The issue, of course, is that turning to a vendor and asking for advice on virtualization products is exactly like saying “I have no idea what I’m doing, let’s see what you can sell me on.”  And honestly, once the vendor knows you don’t even have your architectural elements worked out before contacting them, they know that the sky is the limit.  The goose has arrived and all they have to do is wait for that golden egg to be delivered.

I’ve heard this exact scenario so many times, I can’t count.  Your vendor is not your friend.  They have one job to do – sell you as many products as possible.  If you ask them what you should buy they will tell you whatever you want to hear.  They will cut corners on safety items or management items that they feel you will not find flashy or cool and will sell you what they think you will get excited about or confused about.  They know their jobs well – they have to, it is a tough market.  A great example is vendors cutting storage costs by selling smaller than appropriate storage arrays and using risky array configurations to make the capacity cost less.  That the client is at heightened risk to a failing array doesn’t impact the vendor and is a very hard issue to quantify, so once the product is sold it is the customer’s concern not the vendor’s.

The answer to this is to leverage a general consultant.  A general consultant gets compensated by delivering good advice and not for selling you a product.  In theory a general consultant will earn a similar amount regardless of whether they convince you to install millions of dollars of products or to do nothing and use what you currently own.  A general consultant should be far more intimate with your environment than a vendor or reseller could ever be and should be able to speak to your technical staff, make presentations to the business and put their advice into the proper context for your business with insight into how the costs, risks and other factors will impact you specifically and advise on what they feel is more appropriate for your specific needs.

In reality you still have to consider the complete role of your general consultant. Most often a general consulting firm will also offer broad support and implementation services. These are loosely tied to their recommendations so caveat emptor applies as always, but since they are compensated in a far more direct manner (paid for their effort) they have a very real reason to deliver you what you are buying. Even general consultants who have some ties to reselling often make only a very small fraction on the resold goods of what they make on the consulting, so anything that puts their consulting work at risk is a major liability to them. Make sure that any general consultant, if offering resold services, is not tied to them and works with other resellers or vendors as well. Sometimes general consultants offer low cost reseller services as a loss leader or at minimal profit just to keep customers from feeling that they must turn to another company, but would prefer that their customers did not use that service – profits are often higher not reselling.

Your general consultant should be able to interface with your resellers or vendors directly or allow you to do so.  Having a consultant handle the transaction can be beneficial because it provides an integrated procedure and consultants are very unlikely to be persuaded to make snap decisions based on sales, “special deals” or to be sold on a different approach by a salesperson who has a specific product to push that month.  The consultant has little emotional tie to the purchasing process and so can be much more methodical and calculating.

Of course we must consider the opposite situation as well – how do we treat our service providers?  For example, if we go to a reseller over and over again asking for advice, making them generate quotes and generally spin their wheels and then buy nothing from them or very little we will, sooner or later, force them to either refuse to work with us at all or do something drastic like supplying less than accurate data or raising prices.  A good vendor or reseller will provide you with the best value when you treat them well.  Loyalty may seem to be dead in business transactions today, but this is not at all true.  Good relationships still pay off.

With consultants the need to treat them well is somewhat built into the equation – you generally pay for what you get so other than being friendly and respectful you don’t normally have too much to worry about as far as how you are structuring your relationship.  But even with a consultant there are still concerns.  If you pay for an “unlimited” service plan, use it well but don’t abuse it, for example.  Always make your consultant happy that you are their customer and, most likely, they will work hard to make sure that you are happy to be their customer too.

The most important concept to take away from this is that with any company with whom you do business, you should have some empathy for them.  Put yourself in their shoes and think about how your relationship with them is structured.  Are your goals mutually aligned?  Is it in both companies’ interests to act in the interest of the other?  Or have you arranged for an adversarial relationship where they can only win at your expense?

Keep in mind that you are the customer so, very likely, your consultant or reseller is, to some degree, at your mercy to make sure that your relationship is a healthy one.  In order to obtain clients they are often pressured into a position of accepting a less than ideal arrangement.  As the client, you have the opportunity to be the client that that consultant or reseller is excited to work for and will go out of their way to make happy.  The choice is very much yours to make in most cases.  Choose well because good relationships can work wonders for your business.

Ask a jeweler what to get your wife for your anniversary and he will say: “You can’t go wrong with jewelry.”

Ask a florist what to get your wife and he will tell you: “Women always love flowers.”

Ask a chocolatier and he will tell you that nothing makes a woman happier than chocolate.

Ask a consultant and he will ask you: “What does your wife like?”

Seven Reasons It Is Time for Windows 7

What’s your reason for not upgrading to Windows 7? Many IT managers wait for the first service pack before deploying an OS upgrade; others update the operating system as part of a hardware refresh. Here are some advantages to upgrading.

Inevitability

If you have been watching Microsoft’s enterprise desktop operating systems over the past two decades then you are aware that a pattern is emerging: Windows 7 is the long term successor to Windows XP, just as XP was the clear successor to NT 4.0. Each of these was the golden child of the Microsoft machine, blessed with prime market positioning, lack of extreme overhauls and sporting a high level of polish. As such, whether you are seeking the latest and greatest or just looking for the best desktop OS investment, Windows 7 meets your needs. Windows 7 is here to stay and adoption rates are already very high.

Once you accept that Windows 7 is coming to your environment sometime over the next several years then the question truly becomes: “What are you waiting for?” The sooner that you get Windows 7 in place, the sooner you can make the transition and the sooner you can start reaping the benefits of the latest technologies and nearly a decade of development since Windows XP was originally released – and let’s face it, most shops are moving from XP to 7 today. You will achieve the greatest benefits from Windows 7 the sooner you put it in place, giving your users maximum time to adapt to it and giving you more time to take advantage of its features.

Performance

One of the biggest complaints of users who switched to Vista from XP was a lack of performance. Windows 7 addressed this very well: it performs better than Vista and has lower minimum requirements, allowing it to be used in the Netbook realm that had previously been reserved for Windows XP up through the Vista era. Windows 7 runs nicely on Vista-era equipment and much of the XP-era equipment while also taking good advantage of new hardware, making it a good option for in-place software upgrades.

Having a Windows operating system that actually outperforms its predecessor on the same hardware is a major feat.  Traditionally an OS was only expected to be comparable or faster when used on hardware current to its release.  Unlike any other Windows upgrade, Windows 7 can be deployed onto existing hardware without needing hardware upgrades and you will still see small performance gains.  This alone removes one of the traditional obstacles to in-place operating system upgrades.

Security

Security is always of concern and Windows 7 comes with a slew of security enhancements. The best one results in an improved user experience as well – the update of User Account Control (UAC). This update makes UAC, the bane of Windows Vista, into the security tool that it was always meant to be. UAC is now easy to use and control but still powerful enough to protect you in critical ways. Moving from XP to 7 provides a very important security update from a technology side while moving from Vista to 7 makes this technology user friendly enough that it can remain enabled without the bulk of users demanding that it be removed.

Solid State Drive Support

With solid state drives rapidly dropping in price and growing in popularity, having specific support for them in Windows 7 is a very big deal today and will be even more so over the next few years as solid state drives move from the realm of power user equipment to mainstream user equipment. Solid state drives work best when the drivers handling them are aware that they are solid state. For maximum performance and reliability benefits, SSDs should not be treated like traditional, spindle-based hard drives.

Windows 7’s solid state enhancements, like TRIM support and the disabling of spindle drive tools like Superfetch and ReadyBoost on SSDs, give SSDs better performance and a longer lifespan on Windows 7 than on previous Windows iterations. These features may not seem like a big deal today but over the lifespan of Windows 7, as SSDs become more and more of an expected desktop component for the average office worker, these SSD-specific features will play a bigger and bigger role.

XP Mode

XP Mode is one of those really stand-out features that sets Windows 7 apart from its predecessors. Previous Windows versions have struggled in handling legacy applications. Windows 7’s new approach of including a Windows XP operating system as a complete virtual machine handles this issue in a graceful way. Now legacy apps are more reliable and the Windows 7 system is not encumbered with extra subsystems needed to handle legacy systems. With Windows XP having been such a dominant player like no Windows platform had been before, this approach is brilliant and a shrewd move on Microsoft’s part. XP Mode delivers a level of confidence that existing apps will continue to work on Windows 7 – even apps that no longer see active development and are not being tested against the newest Microsoft operating systems. Once again, Windows 7 provides more than its predecessor in an area where we would not expect to see this – backwards compatibility. Windows 7 is dramatically more compatible with Windows XP software than Vista is.

Branch Cache

Enterprise customers can leverage Branch Cache, Microsoft’s new WAN optimization technology targeted at supporting branch offices within a larger, enterprise environment. Branch Cache can be a significant feature for the many companies who struggle with providing storage resources out to small, remote offices. Branch Cache’s ability to seamlessly store previously accessed CIFS and web resources out at a branch office can, for some businesses, mean that extra equipment and larger Internet connections need not be purchased which can result in substantial cost savings and branch office productivity gains. Branch Cache will also reduce loads on central storage systems allowing file server dollars to be stretched a little farther too.

Direct Access

Previous versions of Windows have had VPN products included with them but Direct Access takes the idea of “always connected mobility” to a new level.  Direct Access adds seamless VPN to Windows which gives users a unified experience between remote and “in office” computing modes.  No longer do users need to manage their VPN experience – as long as they are online they are connected to the office.  Direct Access leverages IPv6 and IPSec for simple, efficient and extremely secure remote computing.  Direct Access is designed to work with Microsoft’s existing authentication systems allowing it to be used for normal, everyday computing without breaking communications with Active Directory so that both the machine and the user can properly authenticate – even when working remotely.

Summary

At the end of the day, however, what makes Windows 7 compelling isn’t any significant feature. In fact, it is the lack of major features that makes Windows 7 so important. Like XP, its spiritual predecessor, Windows 7 tweaks a working formula. Vista introduced the new kernel, the new interface, UAC and other features. Introducing change is painful. Windows 7 takes what works and makes it better. Windows 7 is the long term, strategic desktop decision because it is a polished system that introduces small, incremental updates and relies on established features to drive its overarching value. Think of 7 as evolutionary where Vista was revolutionary.