Category Archives: Storage

Spotlight on SMB Storage

Storage is a hard nut to crack.  For businesses, storage is difficult because it often involves big price tags for what appear to be nebulous gains.  Most executives understand the need to “store” things, and to store more of them, but they understand very little about performance, access methods, redundancy and risk calculations, backup and disaster recovery.  This makes the job of IT difficult because we need to explain why budgets often need to be extremely large for what appears, to the business stakeholders, to be an invisible system.

For IT, storage is difficult because storage systems are complex – often the single most complex system within an SMB – and, due to their expense and centralization, they exist in very small quantities within a business.  This means that most SMBs, if they have any storage systems, have only one and keep it for a very long time.  This lack of broad exposure, combined with the relatively infrequent need to interact with storage systems, leaves SMB IT departments dealing with a large budget item of incredible criticality to the business that makes up only a small percentage of their “task” range and over which, by the very nature of the beast, they have very little experience.  Other areas of IT are far more accessible for experimentation, testing and education.

Between these two major challenges we are left with a product that is poorly understood, in general, by both management and IT.  Storage is so misunderstood that IT departments are often not even aware of what they need and are doing little more than throwing darts at the storage dart board and starting from wherever the darts land – often by calling vendors rather than consultants, leading them down a path of “decision already made” while seemingly getting advice.

Storage vendors, knowing all of this, do little to aid the situation.  Once contact between an SMB and a vendor is made it is in the vendor’s best interest not to educate the customer, since the customer already made the decision to approach that vendor before having the necessary information at hand.  The vendor simply wants to sell whatever they have available.  Seldom does a single storage vendor have a wide range of products in their own lines, so going directly to a vendor before knowing exactly what is needed goes much, much farther towards the customer having effectively already decided what to buy than it does in other arenas of technology, and this can cause costs to be off by orders of magnitude compared to what is needed.

Example: Most server vendors offer a wide array of servers, both in the x64 family as well as large scale RISC machines and other, niche products.  Most storage vendors offer a small subset of storage products, offering only SAN, or only NAS, or only “mainframe” class storage, or only small, non-replicated storage, etc.  Only a very few vendors have a wide assortment of storage products to meet most needs, and even the best of these lack full market coverage, hitting the smaller SMB market as well as the mid and enterprise markets.

So where do we go from here?  Clearly this is a serious challenge to overcome.

The obvious option, and one that shops should not rule out, is turning to a storage consultant.  Someone who is not reselling a solution or, at the very least, is not reselling a single solution but has a complete solution set from which to choose and who is able to provide a low cost, $1,000 solution as well as a $1,000,000 solution – someone who understands NAS, SAN, scale out storage, replication, failover, etc.  When going to your consultant do not presume that you know what your costs will be – there are many, many factors and by considering them carefully you may be able to spend far less than you had anticipated.  But do have budgets in mind, risk aversion well documented, costs for downtime and a very complete set of anticipated storage use case scenarios.

But turning to a consultant is certainly not the only path.  Doing your own research, learning the basics and following a structured decision making process can get you, if not to the right solution, at least a good way down the right path.  There are four major considerations when looking at storage: function (how storage is used and accessed), capacity, speed and reliability.

The first factor, function, is the most overlooked and the least understood.  In fact, even though this is the most basic of concerns, this is often simply swept under the carpet and forgotten.  We can answer this question by asking ourselves “Why are we purchasing storage?”

Let us address this systematically.  There are many reasons that we will be buying storage.  Here are a few popular ones: to lower costs over having large amounts of storage locally on individual servers or desktops, to centralize management of data, to increase performance and to make data more available in the case of system failure.

Knowing which of these factors, or another factor not listed here, is driving you towards shared storage is important as it will likely provide a starting point in your decision making process.  Until we know why we need shared storage we will be unable to look at the function of that storage, which, as we know already, is the most fundamental decision making factor.  If you cannot determine the function of the storage then it is safe to assume that shared storage is not needed at all.  Do not be afraid to make this decision; the vast majority of small businesses have little or no need for shared storage.

Once we determine the function of our shared storage we can, relatively easily, determine capacity and performance needs.  Capacity is the easiest and most obvious function of storage.  Performance, or speed, is easy to state and explain but much more difficult to quantify as IOPS are, at best, a nebulous concept and at worst completely misunderstood.  IOPS come in different flavours and there are concerns around random access, sequential access, burst speeds, latency and sustained rates – and then come the differences between reading and writing!  It is difficult to even determine the needed performance, let alone the expected performance of a device.  But with careful research, this is achievable and measurable.
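
As a rough illustration of how this sizing exercise might be approached, here is a minimal sketch in Python that adds up the random read and write IOPS of a few workloads and applies a burst allowance.  The workload names and figures are purely hypothetical placeholders; real numbers should come from measurement of your own systems.

```python
# Back-of-envelope IOPS sizing sketch.  Every workload figure below is a
# hypothetical placeholder - substitute numbers measured from your own
# systems (perfmon, iostat, hypervisor monitoring, etc.).

workloads = {
    # name: (random read IOPS, random write IOPS) at steady state
    "mail server": (150, 100),
    "database":    (400, 250),
    "file shares": (100, 50),
}

reads = sum(r for r, _ in workloads.values())
writes = sum(w for _, w in workloads.values())

# Size for bursts, not just the steady-state average.
burst_factor = 1.5
target = int((reads + writes) * burst_factor)

print(f"Sustained requirement: {reads} read / {writes} write IOPS")
print(f"Suggested target with {burst_factor}x burst headroom: {target} IOPS")
```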

Our final factor is reliability.  This, like functionality, seems to be a recurring stumbling point for IT professionals looking to move into shared storage.  It is important, nay, absolutely critical, that the idea that storage is “just another server” be kept in mind and that the concepts of redundancy and reliability that apply to normal servers be applied equally to dedicated shared storage systems.  In nearly all cases, enterprise storage systems are built on enterprise servers – same chassis, same drives, same components.  What is often confusing is that even SMBs will look to mid or high end storage systems to support much lower end servers, which can make storage systems appear mystical in the same way that big iron servers may appear to someone only used to commodity server hardware.  But do not be misled, the same principles of reliability apply and you will need to gauge risk exactly the same as you always have (or should have) to determine what equipment is right for you.

Taking time to assess, research and understand storage needs is very important as your storage system will likely remain a backbone component of your network for a very long time due to its extremely high cost and the complexity of replacing it.  Unlike the latest version of Microsoft Office, buying a new shared storage system will not have a direct impact on an executive’s desktop and so lacks the flash necessary to drive “feature updates” as well.

Now that we have our options in front of us we can begin to look at real products.  Based on our functionality research we now should be able to determine if we are in need of SAN, NAS or neither.  In many cases – far more than people realize – neither is the correct choice.  Often adding drives to existing servers or attaching a DAS drive chassis where needed is more cost effective and reliable than doing something more complex.  This should not be overlooked.  In fact, if DAS will suit the need at hand it would be rare that something else would make sense at all.  Simplicity is the IT manager’s friend.

There are plenty of times when DAS will not meet the current need.  Shared storage certainly has its place, even if only to share files between desktop users.  With today’s modern virtualization systems shared storage is becoming increasingly popular – although even there DAS is too often avoided when it might suit the existing needs well.

With rare exception, when shared storage is needed NAS is the place to turn.  NAS stands for Network Attached Storage.  NAS mimics the behaviour of a fileserver (NAS is simply a fileserver packaged as an appliance) making it easy to manage and easy to understand.  NAS tends to be very multi-purpose, replacing traditional file servers and often being used as the shared backing for virtualization.  NAS is typified by the NFS and CIFS protocols but it is not uncommon to see HTTP, FTP, SFTP, AFS and others available on NAS devices as well.  NAS works well as a connector allowing Windows and UNIX systems to share files easily with each other while only needing to work with their own native protocols.  NAS is commonly used as the shared storage for VMware’s vSphere, Citrix XenServer, Xen and KVM.  With NAS it is easy to use your shared storage in many different roles and easy to get good utilization from your shared storage system.

NAS does not always meet our needs.  Some special applications still need shared storage but cannot utilize NAS protocols.  The most notable products affected by this are Microsoft’s Hyper-V, databases and server clusters.  The answer for these products is SAN.  SAN, or Storage Area Networking, is a difficult concept and even at the best of times is difficult to categorize.  Like NAS, which is simply a different way of presenting traditional file servers, SAN is truly just a different way of presenting direct attached disks.  While the differences between SAN and DAS might seem obvious, actually differentiating between them is nebulous at best and impossible at worst.  SAN and DAS typically share protocols, chassis, limitations and media.  Many SAN devices can be attached and used as DAS.  And most DAS devices can be attached to a switch and used as SAN.  In reality we typically use the terms to refer to the usage scenario more than anything else.

SAN is difficult to utilize effectively for many reasons.  The first is that it is poorly understood.  SAN is actually simple – so simple that it is very difficult to grasp, making it surprisingly complex.  SAN is effectively just DAS that is abstracted, re-partitioned and presented back out to hosts as DAS again.  The term “shared storage” is confusing because while SAN technology, like NAS, can allow multiple hosts to attach to a single storage system it does not provide any form of mediation for hosts attached to the same filesystem.  NAS is intelligent and handles this, making it easy to “share” shared storage.  SAN does not; it is too simple.  SAN is so simple that what in effect happens is that a single hard drive (abstracted as it may be) is wired into controllers on multiple hosts.  Back when shared storage meant attaching two servers to a single SCSI cable this was easy to envision.  Today, with SAN’s abstractions and the commonality of NAS, most IT shops will forget what SAN is doing and disaster can strike.

SAN has its place, to be sure, but SAN is complex to use and to administer and very limiting.  Often it is very expensive as well.  The rule of thumb with SAN is this: unless you need SAN, use something else.  It is that simple.  SAN should be avoided until it is the only option and when it is, it is the right option.  It is rarely, if ever, chosen for performance or cost reasons as it normally underperforms and costs more than other options.  But when you are backing Hyper-V or building a database cluster nothing else is going to be an option for you.  For most use cases in an SMB, using SAN effectively will require a NAS to be placed in front of it in order to share out the storage.

NAS makes up the vast majority of shared storage use scenarios.  It is simple, well understood and it is flexible.

Many, if not most, shared storage appliances today will handle both SAN and NAS and the difference between the two is in their use, protocols and ideology more than anything.  Often the physical devices are similar if not the same as are the connection technologies today.

More than anything it is important to have specific goals in mind when looking for shared storage.  Write these goals down and look at each technology and product to see how or if they meet these goals.  Do not use knee-jerk decision making or work off of marketing materials or what appears to be market momentum.  Start by determining if shared storage is even a need.  If so, determine if NAS meets your needs.  If not, look to SAN.  Storage is a huge investment, take the time to look at alternatives, do lots of research and only after narrowing the field to a few, specific competitive products – turn to vendors for final details and pricing.
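
To make that decision order concrete, here is a minimal sketch, in Python purely for illustration, of the order in which the questions should be asked.  The function name and inputs are hypothetical; the point is simply that DAS is the default, NAS is the answer when sharing is truly required, and SAN is the last resort.

```python
def choose_storage(needs_shared_storage: bool, nas_protocols_work: bool) -> str:
    """Walk the decision order described above: DAS by default, NAS when
    storage genuinely must be shared, SAN only when NAS protocols
    (NFS, CIFS and friends) cannot serve the workload."""
    if not needs_shared_storage:
        return "DAS - add drives or a drive shelf to existing servers"
    if nas_protocols_work:
        return "NAS"
    return "SAN - with a NAS layer in front if files must still be shared out"

# Example: a small virtualization cluster whose hypervisor can use NFS.
print(choose_storage(needs_shared_storage=True, nas_protocols_work=True))
```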

RAID Revisited

Back when I was a novice service tech and barely knew anything about system administration, one of the few topics that we were always expected to know cold was RAID – Redundant Array of Inexpensive Disks.  It was the answer to all of our storage woes.  With RAID we could scale our filesystems larger, get better throughput and even add redundancy allowing us to survive the loss of a disk which, especially in those days, happened pretty regularly.  With the rise of NAS and SAN storage appliances, the skill set of getting down to the physical storage level and tweaking it to meet the needs of the system in question is rapidly disappearing.  This is not a good thing.  Just because we are offloading storage to external devices does not change the fact that we need to fundamentally understand our storage and configure it to meet the specific needs of our systems.

A misconception that seems to have entered the field over the last five to ten years is the belief that RAID somehow represents a system backup.  It does not.  RAID is a form of fault tolerance.  Backup and fault tolerance are very different conceptually.  Backup is designed to allow you to recover after a disaster has occurred.  Fault tolerance is designed to lessen the chance of disaster in the first place.  Think of fault tolerance as building a fence at the top of a cliff and backup as building a hospital at the bottom of it.  You never really want to be in a situation without both a fence and a hospital, but they are definitely different things.

Once we are implementing RAID for our drives, whether locally attached or on a remote appliance like a SAN, we have four key RAID solutions from which to choose today for business: RAID 1 (mirroring), RAID 5 (striping with parity), RAID 6 (striping with double parity) and RAID 10 (mirroring with striping).  There are others, like RAID 0, that should only be used in rare circumstances when you really understand your drive subsystem needs.  RAID 50 and 51 are used as well but far less commonly and are not nearly as effective.  Ten years ago RAID 1 and RAID 5 were common, but today we have more options.

Let’s step through the options and discuss some basic numbers.  In our examples we will use n to represent the number of drives in our array and we will use s to represent the size of any individual drive.  Using these we can express the usable storage space of an array making comparisons easy in terms of storage capacity.

RAID 1: In this RAID type drives are mirrored.  You have two drives and they do everything together at the same time, hence “mirroring”.  Mirroring is extremely stable as the process is so simple, but it requires you to purchase twice as many drives as you would need if you were not using RAID at all, as your second drive is dedicated to redundancy.  The benefit is that you have the assurance that every bit that you write to disk is being written twice for your protection.  So with RAID 1 our capacity is calculated to be (n*s/2).  RAID 1 provides minimal performance gains over non-RAID drives.  Write speeds are equivalent to a non-RAID system while read speeds are almost twice as fast in most situations, since during read operations the drives can be accessed in parallel to increase throughput.  RAID 1 is limited to two-drive sets.

RAID 5: Striping with Single Parity.  In this RAID type data is written in a complex stripe across all drives in the array with a distributed parity block that exists across all of the drives.  By doing this RAID 5 is able to use an arbitrarily sized array of three or more disks and only loses the storage capacity equivalent of a single disk to parity, although the parity is distributed and does not exist solely on any one physical disk.  RAID 5 is often used because of its cost effectiveness due to its minimal storage capacity loss in large arrays.  Unlike mirroring, striping with parity requires that a calculation be performed for each write stripe across the disks and this creates some overhead.  Therefore the throughput is not always an obvious calculation and depends heavily upon the computational power of the system doing the parity calculation.  Calculating RAID 5 capacity is quite easy as it is simply ((n-1)*s).  A RAID 5 array can survive the loss of any single disk in the array.

RAID 6: Striping with Double Parity.  RAID 6 is practically identical to RAID 5 but uses two parity blocks per stripe rather than one to allow for additional protection against disk failure.  RAID 6 is a newer member of the RAID family, having been added several years after the other levels had become standardized.  RAID 6 is special in that it allows for the failure of any two drives within an array without suffering data loss.  But to accommodate the additional level of redundancy a RAID 6 array loses the storage capacity equivalent of two drives in the array and requires a minimum of four drives.  We can calculate the capacity of a RAID 6 array with ((n-2)*s).

RAID 10: Mirroring plus Striping.  Technically RAID 10 is a hybrid RAID type encompassing a set of RAID 1 mirrors existing in a non-parity stripe (RAID 0).  Many vendors use the term RAID 10 (or RAID 1+0) when speaking of only two drives in an array but technically that is RAID 1, as striping cannot occur until there are a minimum of four drives in the array.  With RAID 10 drives must be added in pairs so only an even number of drives can exist in an array.  RAID 10 can survive the loss of up to half of the total set of drives, with a maximum loss of one from each pair.  RAID 10 does not involve a parity calculation, giving it a performance advantage over RAID 5 or RAID 6 and requiring less computational power to drive the array.  RAID 10 delivers the greatest read performance of any common RAID type as all drives in the array can be used simultaneously in read operations, although its write performance is much lower.  RAID 10’s capacity calculation is identical to that of RAID 1, (n*s/2).
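
The capacity formulas above are easy to put side by side.  The following is a small sketch, using the n and s notation from earlier, that computes usable capacity for each of the four levels; the drive counts and sizes in the example are arbitrary.

```python
def usable_capacity(raid_level: int, n: int, s: float) -> float:
    """Usable capacity of an array of n drives of size s, using the
    formulas given above (s may be in any unit - GB, TB, etc.)."""
    if raid_level in (1, 10):          # mirrored: half of raw capacity
        if n % 2:
            raise ValueError("RAID 1/10 requires an even number of drives")
        return n * s / 2
    if raid_level == 5:                # single parity: lose one drive's worth
        if n < 3:
            raise ValueError("RAID 5 requires at least three drives")
        return (n - 1) * s
    if raid_level == 6:                # double parity: lose two drives' worth
        if n < 4:
            raise ValueError("RAID 6 requires at least four drives")
        return (n - 2) * s
    raise ValueError("unsupported RAID level")

print(f"RAID 1,  2 x 2 TB: {usable_capacity(1, 2, 2):.0f} TB usable")
for level in (5, 6, 10):
    print(f"RAID {level}, 8 x 2 TB: {usable_capacity(level, 8, 2):.0f} TB usable")
```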

In today’s enterprise it is rare for an IT department to have a serious need to consider any drive configuration outside of the four mentioned here regardless of whether software or hardware RAID is being implemented.  Traditionally the largest concern in a RAID array decision was based around usable capacity.  This was because drives were expensive and small.  Today drives are so large that storage capacity is rarely an issue, at least not like it was just a few years ago, and the costs have fallen such that purchasing additional drives necessary for better drive redundancy is generally of minor concern.  When capacity is at a premium RAID 5 is a popular choice because it loses the least storage capacity compared to other array types and in large arrays the storage loss is nominal.

Today we generally have other concerns, primarily data safety and performance.  Spending a little extra to ensure data protection should be an obvious choice.  RAID 5 suffers from being able to lose only a single drive.  In an array of just three members this is only slightly more dangerous than the protection offered by RAID 1.  We could survive the loss of any one of three drives.  Not too scary compared to losing either of two drives.  But what about a large array, say sixteen drives?  Being able to safely lose only one of sixteen drives should make us question our reliability a little more thoroughly.

This is where RAID 6 stepped in to fill the gap.  RAID 6, when used in a large array, introduces a very small loss of storage capacity and performance while providing the assurance of being able to lose any two drives.  Proponents of the striping with parity camp will often quote these numbers to assuage management that RAID 5/6 can provide adequate “bang for the buck” in storage subsystems, but there are other factors at play.
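
Those “bang for the buck” discussions usually rest on drive-loss arithmetic alone.  A crude illustration of that arithmetic is sketched below: it assumes each drive fails independently with the same probability over whatever window concerns you, and it deliberately ignores rebuild windows, URE risk and the parity-related failure modes discussed next.  The sixteen-drive array and the 3% per-drive figure are purely illustrative.

```python
from math import comb

def p_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n drives fail, assuming each drive
    fails independently with probability p over the window of interest."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def p_raid10_loss(n: int, p: float) -> float:
    """RAID 10 loses data only if both drives of some mirrored pair fail."""
    return 1 - (1 - p * p) ** (n // 2)

n, p = 16, 0.03   # sixteen drives, 3% chance each fails in the window
print(f"RAID 5  (any 2+ drives lost): {p_at_least(2, n, p):.4f}")
print(f"RAID 6  (any 3+ drives lost): {p_at_least(3, n, p):.4f}")
print(f"RAID 10 (a whole pair lost):  {p_raid10_loss(n, p):.4f}")
```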

Almost entirely overlooked in discussions of RAID reliability, an all too seldom discussed topic as it is, is the question of parity computation reliability.  With RAID 1 or RAID 10 there is no “calculation” done to create a stripe with parity.  Data is simply written in a stable manner.  When a drive fails its partner picks up the load and drive performance is slightly degraded until the partner is replaced.  There is no rebuilding process that impacts existing drive members.  Not so with parity stripes.

RAID arrays with parity have operations that involve calculating what is and what should be on the drives.  While this calculation is very simple it provides an opportunity for things to go wrong.  An array controller that fails with RAID 1 or RAID 10 could, in theory, write bad data over the contents of the drives, but there is no process by which the controller makes drive changes on its own so this is extremely unlikely to ever occur as there is never a “rebuild” process except in creating a mirror.

When arrays with parity perform a rebuild operation they perform a complex process by which they step through the entire contents of the array and write missing data back to the replaced drive.  In and of itself this is relatively simple and should be no cause for worry.  What I and others have seen first hand is a slightly different scenario involving disks that have lost connectivity due to loose connectors to the array.  Drives can commonly “shake” loose over time as they sit in a server especially after several years of service in an always-on system.

What can happen, in extreme scenarios, is that good data on drives can be overwritten by bad parity data when an array controller believes that one or more drives have failed in succession and been brought back online for rebuild.  In this case the drives themselves have not failed and there is no data loss.  All that is required is that the drives be reseated, in theory.  On hot swap systems the management of drive rebuilding is often automatic based on the removal and replacement of a failed drive.  So this process of losing and replacing a drive may occur without any human intervention – and a rebuilding process can begin.  During this process the drive system is at risk and should this same event occur again the drive array may, based upon the status of the drives, begin striping bad data across the drives overwriting the good filesystem.  It is one of the most depressing sights for a server administrator to see when a system with no failed drives loses an entire array due to an unnecessary rebuild operation.

In theory this type of situation should not occur and safeguards are in place to protect against it but the determination of a low level drive controller as to the status of a drive currently and previously and the quality of the data residing upon that drive is not as simple as it may seem and it is possible for mistakes to occur.  While this situation is unlikely it does happen and it adds a nearly impossible to calculate risk to RAID 5 and RAID 6 systems.  We must consider the risk of parity failure in addition to the traditional risk calculated from the number of drive losses that an array can survive out of a pool.  As drives become more reliable the significance of the parity failure risk event becomes greater.

Additionally, RAID 5 and RAID 6 parity introduces system overhead due to parity calculation which is often handled by way of dedicated RAID hardware.  This calculation introduces latency into the drive subsystem that varies dramatically by implementation both in hardware and in software making it impossible to state performance numbers of RAID levels against one another as each implementation will be unique.

Possibly the biggest problem with RAID choices today is that the ease with which metrics for storage efficiency and drive loss survivability can be obtained masks the big picture of reliability and performance, as those statistics are almost entirely unavailable.  One of the dangers of metrics is that people will focus upon factors that can be easily measured and ignore those that cannot be easily measured, regardless of their potential for impact.

While all modern RAID levels have their place it is critical that they be considered within context and with an understanding as to the entire scope of the risks.  We should work hard to shift our industry from a default of RAID 5 to a default of RAID 10.  Drives are cheap and data loss is expensive.

[Edit: In the years since this was initially written, the rise of URE (Unrecoverable Read Error) risks during rebuild operations has shifted the primary risk for parity arrays from those listed here to URE-related risks.]

Using GMail to Backup Your Email

When disaster strikes it is a good time to reflect on what preventative measures might have saved the day.  Working in IT, as I do, mitigating and even preventing disaster is a big part of the job.  No matter how hard we try disaster can still strike and being prepared for it is very important.

Email is one of those systems that almost no business can function without in this day and age.  Getting to lost email messages quickly is almost as important as getting email flowing again.

Recently, I had to deal with a pretty significant email disaster.  Getting email flowing again wasn’t too hard, but a lot of email had been lost.  In a post-mortem we looked into many potential solutions to the ongoing issue of email backups, which are often tricky and difficult to do properly.

One of the best suggestions made in our post-mortem engineering sessions was to have email backed-up, on the fly, message by message via forwarding to Google’s GMail service.

Now, before we get too far, I need to state that this is not a comprehensive backup solution.  Email forwarding to GMail (or any other email service) must be handled account by account, which causes it to scale very poorly.  It would be an administrative nightmare for a shop of any size but is quite easy for a very small organization of up to possibly twenty or thirty people.  It also does not handle outgoing (sent) email in any way but only incoming email – which is almost always where the important messages are located.  A traditional backup of your email system is still necessary.  This is really a complementary service, not a replacement solution.

What is great about forwarding to GMail is that it is free, it is extremely convenient, it can be handled by the email users who want it and ignored by those who do not, it is almost instantaneous and the backups will continue even when an email client is not connected – unlike an IMAP or POP client based backup strategy.  Google provides so much storage capacity that you can likely send years of email messages to GMail without ever needing to clean out your archived messages.

Email forwarding can be set up by individual users or by an email administrator, although if individual users do not manage their own GMail accounts this can be problematic.  There is also the potential option of having several company email accounts forward to a single GMail account, although commingling email in this way can result in all kinds of headaches later, so be sure to be very confident about your legal grounds for combining email in this manner.

Smaller organizations need to carefully consider how to take best advantage of the email options that they have available to them.  Email forwarding is one way in which very small organizations can take advantage of their size.  Large organizations would be forced to use more complex and expensive backup strategies to achieve these same results.