
Just Because You Can…

I see this concept appear in discussions surrounding virtualization all of the time. It is a broader, more general concept, but virtualization is the “hot, new technology” facing many IT organizations and seems to be the space where the “just because you can, doesn’t mean you should” problem currently rears its ugly head most often. As with everything in IT, it is critical that all technical decisions be put into a business context so that we understand why we choose to do what we do, rather than blindly making decisions based on popular deployment methodologies or, worse, myths.

Virtualization itself, I should point out, I feel should be the default decision today for those working in the x64 computing space, with systems deployed sans virtualization only when a clear and obvious necessity exists such as specific hardware needs, latency-sensitive applications, etc. Barring any specific need, virtualization is free to implement from many vendors and offers many benefits both today and in future-proofing the environment.

That being said, what I often see today is companies deploying virtualization not as a best practice but as a panacea for all perceived IT problems. This it certainly is not. Virtualization is a very important tool to have in the IT toolbox and one that we will reach for very often, but it does not solve every problem and should be treated like every other tool that we possess and used only when appropriate.

I see several things recur whenever virtualization comes up as a topic. Many companies today are moving towards virtualization not because they have identified a business need but because it is the currently trending topic and people feel that if they do not implement virtualization they will somehow be left behind or miss out on some mythical functionality. This is generally good in that it increases virtualization adoption, but it is bad because good IT and business decision-making processes are being bypassed. What often happens is that, in the wave of virtualization hype, IT departments feel that they must not only implement virtualization itself but do so in ways that may not be appropriate for their business.

There are four things that I often see tied to virtualization and accepted as requirements whether or not they make sense in a given business environment. These are server consolidation, blade servers, SAN storage and high availability or live failover.

Consolidation is so often vaunted as the benefit of virtualization that I think most IT departments forget that there are other important reasons for implementing it. Clearly consolidation is a great benefit for nearly all deployments (mileage may vary, of course) and can nearly always be achieved simply through better utilization of existing resources. It is a pretty rare company running more than a single physical server that cannot shave some amount of cost through limited consolidation, and it is not uncommon to see datacenter footprints decimated in larger organizations.

Even in extreme cases, though, it is not necessary to abandon virtualization projects just because consolidation proves to be out of the question. These cases exist in companies with high-utilization systems and little budget for a preemptive consolidation investment. But these shops can still virtualize systems “in place” on a one-to-one basis to gain the other benefits of virtualization today and look to consolidate when hardware needs to be replaced tomorrow or when larger, more powerful servers become more cost effective in the future. It is important not to rule out virtualization just because its most heralded benefit may not apply in your environment at the current time.

Blade servers are often seen as the natural choice for virtualization environments. Blades may play better in a standard virtualization environment than they do with more traditional computational workloads, but this is both highly disputable and not necessarily relevant. Being a good scenario for blades themselves does not make it a good scenario for the business. Just because blades perform better than normal when used in this way does not imply that they perform better than traditional servers – only that they have potentially closed the gap.

Blades need to be evaluated using the same harsh criteria when virtualizing as when not and, very often, they will continue to fail to provide the long-term business value needed to choose them over more flexible alternatives. Blades remain far from a necessity for virtualization and are often, in my opinion, a very poor choice indeed.

One of the most common misconceptions is that by moving to virtualization one must also move to shared storage such as a SAN. This mindset is an understandable reaction to the desire to also achieve other benefits of virtualization which, even if they do not strictly require a SAN, benefit greatly from one. The ability to load balance or fail over between systems is heavily facilitated by having a shared storage backend. It is a myth that this is a hard requirement, but replicated local storage brings its own complexities and limitations.

But shared storage is far from a necessity of virtualization itself and, like everything, needs to be evaluated on its own.  If virtualization makes sense for your environment but you need no features that require SAN, then virtualize without shared storage.  There are many cases where local storage backed virtualization is an ideal deployment scenario.  There is no need to dismiss this approach without first giving it serious consideration.

The last major feature assumed to be necessary for virtualization is system-level high availability, or instant failover, for your operating system. Without a doubt, high availability at the system layer is a phenomenal benefit that virtualization brings us. However, few companies needed high availability at this level prior to implementing virtualization, and the price tag of the infrastructure and software necessary to do it with virtualization is often too high to justify.

High availability systems are complex and often overkill. It is a very rare business that requires transparent failover for even its most critical systems, and those companies with that requirement would almost certainly have had failover processes in place already. I see companies moving towards high availability all of the time when looking at virtualization simply because a vendor saw an opportunity to dramatically oversell the original requirements. The cost of high availability is seldom justified by the revenue loss that the associated reduction in downtime would prevent. With non-highly-available virtualization, downtime for a failed hardware device might be measured in minutes if backups are handled well. This means that high availability has to justify its cost by eliminating, potentially, just a few minutes of unplanned downtime per year, minus any additional risk assumed through the added system complexity. Even in the biggest organizations this is seldom justified on any large scale, and in a more moderately sized company it is rarer still. Yet today we find many small businesses implementing high availability systems at extreme cost on systems that could easily suffer multi-day outages with minimal financial loss, simply because the marketing literature promoted the concept.
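
To see how that justification plays out, here is a rough back-of-the-envelope sketch in Python. Every figure below is a hypothetical assumption chosen purely for illustration, not a real price or revenue number.

# Hypothetical figures for illustration only.
ha_premium = 25000            # extra cost of HA licensing, shared storage and a second host
revenue_loss_per_hour = 500   # estimated revenue lost per hour of downtime
downtime_hours_avoided = 2    # unplanned downtime per year that HA would eliminate

annual_benefit = revenue_loss_per_hour * downtime_hours_avoided   # $1,000 per year
years_to_break_even = ha_premium / annual_benefit                 # 25 years

print(f"High availability pays for itself in roughly {years_to_break_even:.0f} years")

Even with generous assumptions about downtime costs, the break-even horizon can easily exceed the useful life of the hardware, which is the point of the paragraph above.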

Like anything, virtualization and all of the associated possibilities that it brings to the table need to be evaluated individually in the context of the organization considering them. If an individual feature does not make sense for your business, do not assume that you have to purchase or implement it. Many organizations virtualize but use only a few, if any, of these “assumed” features. Don’t look at virtualization as a black box; look at the parts and consider them as you would any other technology project.

What often happens is a snowball effect where one feature, likely high availability, is assumed to be necessary without the proper business assessment being performed. Then a shared storage system, often assumed to be required for high availability, is added as another assumed cost. Even if the high availability features are never purchased, the decision to use a SAN may already have been made and never revisited after the plan changes. It is very common, in my experience, to find projects of this nature where more than fifty percent of the total expenditure was spent on products that the purchaser cannot even explain the reason for having purchased.

This concept does not stop at virtualization. Extend it to everything that you do. Keep IT in the perspective of the business, and don’t assume that adopting one technology automatically means that you must adopt the other technologies popularly associated with it.

RAID Revisited

Back when I was a novice service tech and barely knew anything about system administration, one of the few topics that we were always expected to know cold was RAID – Redundant Array of Inexpensive Disks. It was the answer to all of our storage woes. With RAID we could scale our filesystems larger, get better throughput and even add redundancy, allowing us to survive the loss of a disk which, especially in those days, happened pretty regularly. With the rise of NAS and SAN storage appliances, the skill set of getting down to the physical storage level and tweaking it to meet the needs of the system in question is rapidly disappearing. This is not a good thing. Just because we are offloading storage to external devices does not change the fact that we need to fundamentally understand our storage and configure it to meet the specific needs of our systems.

A misconception that seems to have entered the field over the last five to ten years is the belief that RAID somehow represents a system backup. It does not. RAID is a form of fault tolerance. Backup and fault tolerance are very different conceptually. Backup is designed to allow you to recover after a disaster has occurred. Fault tolerance is designed to lessen the chance of disaster in the first place. Think of fault tolerance as building a fence at the top of a cliff and backup as building a hospital at the bottom of it. You never really want to be in a situation without both a fence and a hospital, but they are definitely different things.

Once we are implementing RAID for our drives, whether locally attached or on a remote appliance like a SAN, we have four key RAID solutions from which to choose today for business: RAID 1 (mirroring), RAID 5 (striping with parity), RAID 6 (striping with double parity) and RAID 10 (mirroring with striping). There are others, like RAID 0, that should only be used in rare circumstances when you really understand your drive subsystem needs. RAID 50 and 51 are used as well, but far less commonly, and are not nearly as effective. Ten years ago RAID 1 and RAID 5 were common, but today we have more options.

Let’s step through the options and discuss some basic numbers.  In our examples we will use n to represent the number of drives in our array and we will use s to represent the size of any individual drive.  Using these we can express the usable storage space of an array making comparisons easy in terms of storage capacity.

RAID 1: Mirroring. In this RAID type drives are mirrored. You have two drives and they do everything together at the same time, hence “mirroring”. Mirroring is extremely stable as the process is so simple, but it requires you to purchase twice as many drives as you would need if you were not using RAID at all, as your second drive is dedicated to redundancy. The benefit is the assurance that every bit that you write to disk is written twice for your protection. So with RAID 1 our capacity is calculated as (n*s/2). RAID 1 suffers from providing minimal performance gains over non-RAID drives. Write speeds are equivalent to a non-RAID system, while read speeds are almost twice as fast in most situations since during read operations the drives can be accessed in parallel to increase throughput. RAID 1 is limited to two-drive sets.

RAID 5: Striping with Single Parity. In this RAID type data is written in a complex stripe across all drives in the array with a distributed parity block that exists across all of the drives. By doing this RAID 5 is able to use an arbitrarily sized array of three or more disks and only loses the storage capacity equivalent of a single disk to parity, although the parity is distributed and does not exist solely on any one physical disk. RAID 5 is often used because of its cost effectiveness due to its minimal storage capacity loss in large arrays. Unlike mirroring, striping with parity requires that a calculation be performed for each write stripe across the disks, and this creates some overhead. Therefore the throughput is not always an obvious calculation and depends heavily upon the computational power of the system doing the parity calculation. Calculating RAID 5 capacity is quite easy as it is simply ((n-1)*s). A RAID 5 array can survive the loss of any single disk in the array.
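
To make the parity calculation concrete, here is a minimal sketch of the single-parity idea using XOR in Python. Real controllers work on fixed stripe sizes and rotate the parity position across the drives, but the reconstruction principle is the same.

def xor_blocks(blocks):
    # XOR equal-length byte blocks together; this is the parity operation.
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

# Three data blocks headed for the three data positions of a four-drive stripe.
d1, d2, d3 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks([d1, d2, d3])        # written to the parity position

# If the drive holding d2 fails, its contents can be rebuilt from the
# surviving data blocks and the parity block.
rebuilt = xor_blocks([d1, d3, parity])
assert rebuilt == d2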

RAID 6: Striping with Double Parity. RAID 6 is practically identical to RAID 5 but uses two parity blocks per stripe rather than one to allow for additional protection against disk failure. RAID 6 is a newer member of the RAID family, having been added several years after the other levels had become standardized. RAID 6 is special in that it allows for the failure of any two drives within an array without suffering data loss. But to accommodate the additional level of redundancy, a RAID 6 array loses the storage capacity equivalent of two drives in the array and requires a minimum of four drives. We can calculate the capacity of a RAID 6 array with ((n-2)*s).

RAID 10: Mirroring plus Striping. Technically RAID 10 is a hybrid RAID type encompassing a set of RAID 1 mirrors existing within a non-parity stripe (RAID 0). Many vendors use the term RAID 10 (or RAID 1+0) when speaking of only two drives in an array, but technically that is RAID 1, as striping cannot occur until there are a minimum of four drives in the array. With RAID 10 drives must be added in pairs, so only an even number of drives can exist in an array. RAID 10 can survive the loss of up to half of the total set of drives, but at most one from each pair. RAID 10 does not involve a parity calculation, giving it a performance advantage over RAID 5 or RAID 6 and requiring less computational power to drive the array. RAID 10 delivers the greatest read performance of any common RAID type, as all drives in the array can be used simultaneously in read operations, although its write performance is much lower. RAID 10’s capacity calculation is identical to that of RAID 1, (n*s/2).
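
The four capacity formulas above are easy to compare side by side. The short sketch below simply encodes them; the eight 2 TB drives used in the example are a hypothetical configuration.

# n = number of drives in the array, s = size of each drive (in TB here).
def raid1_capacity(n, s):      # mirrored pair
    return n * s / 2

def raid5_capacity(n, s):      # single distributed parity
    return (n - 1) * s

def raid6_capacity(n, s):      # double distributed parity
    return (n - 2) * s

def raid10_capacity(n, s):     # striped mirrors; n must be even
    return n * s / 2

s = 2  # hypothetical 2 TB drives
print(raid1_capacity(2, s))    # 2.0 TB usable from a two-drive mirror
print(raid5_capacity(8, s))    # 14 TB usable from eight drives
print(raid6_capacity(8, s))    # 12 TB usable from eight drives
print(raid10_capacity(8, s))   # 8.0 TB usable from eight drives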

In today’s enterprise it is rare for an IT department to have a serious need to consider any drive configuration outside of the four mentioned here regardless of whether software or hardware RAID is being implemented.  Traditionally the largest concern in a RAID array decision was based around usable capacity.  This was because drives were expensive and small.  Today drives are so large that storage capacity is rarely an issue, at least not like it was just a few years ago, and the costs have fallen such that purchasing additional drives necessary for better drive redundancy is generally of minor concern.  When capacity is at a premium RAID 5 is a popular choice because it loses the least storage capacity compared to other array types and in large arrays the storage loss is nominal.

Today we generally have other concerns, primarily data safety and performance. Spending a little extra to ensure data protection should be an obvious choice. RAID 5 suffers from being able to lose only a single drive. In an array of just three members this is only slightly more dangerous than the protection offered by RAID 1: we can survive the loss of any one out of three drives, which is not too scary compared to losing either of two drives. But what about a large array, say sixteen drives? Being able to safely lose only one of sixteen drives should make us question our reliability a little more thoroughly.
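
A rough way to see why array size matters is to estimate the chance of two or more drives failing within the same window. The sketch below assumes independent, identical drives with a made-up per-drive failure probability; real drives are neither independent nor identical, so this only illustrates the trend.

def prob_two_or_more_failures(n, p):
    # Chance that at least two of n drives fail in the window, assuming
    # each drive fails independently with probability p.
    none = (1 - p) ** n
    exactly_one = n * p * (1 - p) ** (n - 1)
    return 1 - none - exactly_one

p = 0.03  # hypothetical chance of any one drive failing in the window
print(prob_two_or_more_failures(3, p))    # ~0.003 for a three-drive RAID 5
print(prob_two_or_more_failures(16, p))   # ~0.08 for a sixteen-drive RAID 5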

This is where RAID 6 steps in to fill the gap. RAID 6, when used in a large array, introduces only a very small loss of storage capacity and performance while providing the assurance of being able to lose any two drives. Proponents of the striping-with-parity camp will often quote these numbers to reassure management that RAID 5/6 can provide adequate “bang for the buck” in storage subsystems, but there are other factors at play.

Almost entirely overlooked in discussions of RAID reliability, an all too seldom discussed topic as it is, is the question of parity computation reliability.  With RAID 1 or RAID 10 there is no “calculation” done to create a stripe with parity.  Data is simply written in a stable manner.  When a drive fails its partner picks up the load and drive performance is slightly degraded until the partner is replaced.  There is no rebuilding process that impacts existing drive members.  Not so with parity stripes.

RAID arrays with parity have operations that involve calculating what is and what should be on the drives. While this calculation is very simple, it provides an opportunity for things to go wrong. An array controller that fails with RAID 1 or RAID 10 could, in theory, write bad data over the contents of the drives, but there is no process by which the controller makes drive changes on its own, so this is extremely unlikely to ever occur as there is never a “rebuild” process except when creating a mirror.

When arrays with parity perform a rebuild operation they perform a complex process by which they step through the entire contents of the array and write missing data back to the replaced drive.  In and of itself this is relatively simple and should be no cause for worry.  What I and others have seen first hand is a slightly different scenario involving disks that have lost connectivity due to loose connectors to the array.  Drives can commonly “shake” loose over time as they sit in a server especially after several years of service in an always-on system.

What can happen, in extreme scenarios, is that good data on the drives can be overwritten by bad parity data when an array controller believes that one or more drives have failed in succession and been brought back online for rebuild. In this case the drives themselves have not failed and there is no data loss; in theory, all that is required is that the drives be reseated. On hot swap systems the management of drive rebuilding is often automatic, triggered by the removal and replacement of a failed drive, so this process of losing and replacing a drive may occur without any human intervention – and a rebuilding process can begin. During this process the drive subsystem is at risk, and should the same event occur again the array may, based upon the status of the drives, begin striping bad parity data across the drives, overwriting the good filesystem. There are few more depressing sights for a server administrator than a system with no failed drives losing an entire array to an unnecessary rebuild operation.

In theory this type of situation should not occur, and safeguards are in place to protect against it, but the determination by a low-level drive controller of a drive’s current and previous status, and of the quality of the data residing upon it, is not as simple as it may seem, and mistakes can occur. While this situation is unlikely, it does happen, and it adds a nearly impossible-to-calculate risk to RAID 5 and RAID 6 systems. We must consider this risk of parity failure in addition to the traditional risk calculated from the number of drive losses that an array can survive. As drives become more reliable, the parity failure risk grows in relative significance.

Additionally, RAID 5 and RAID 6 parity introduces system overhead due to parity calculation which is often handled by way of dedicated RAID hardware.  This calculation introduces latency into the drive subsystem that varies dramatically by implementation both in hardware and in software making it impossible to state performance numbers of RAID levels against one another as each implementation will be unique.

Possibly the biggest problem with RAID choices today is that the ease with which metrics for storage efficiency and drive-loss survivability can be obtained masks the big picture of reliability and performance, for which statistics are almost entirely unavailable. One of the dangers of metrics is that people will focus upon factors that can be easily measured and ignore those that cannot be easily measured, regardless of their potential for impact.

While all modern RAID levels have their place it is critical that they be considered within context and with an understanding as to the entire scope of the risks.  We should work hard to shift our industry from a default of RAID 5 to a default of RAID 10.  Drives are cheap and data loss is expensive.

[Edit: In the years since this was initially written, the rise of URE (Unrecoverable Read Error) risk during rebuild operations has shifted the primary risk for parity arrays from those listed here to URE-related risks.]
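
As a rough illustration of that URE risk, the sketch below estimates the chance of completing a rebuild without hitting an unrecoverable read error, assuming the commonly quoted consumer-class rate of one URE per 10^14 bits read. The array size and URE rate are assumptions for illustration only.

def rebuild_success_probability(bytes_to_read, ure_rate_per_bit=1e-14):
    # Chance of reading every bit without a URE, assuming independent
    # bit errors at the quoted rate.
    bits = bytes_to_read * 8
    return (1 - ure_rate_per_bit) ** bits

# Hypothetical RAID 5 rebuild that must re-read 12 TB of surviving data.
print(rebuild_success_probability(12 * 10**12))   # roughly 0.38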