All posts by Scott Alan Miller

Started in software development with Eastman Kodak in 1989 as an intern in database development (making database platforms themselves).  Began transitioning to IT in 1994 with my first mixed role in system administration.

One Big RAID 10 – A New Standard in Server Storage

In the late 1990s the standard rule of thumb for building a new server was to put the operating system onto its own, small, RAID 1 array and separate out applications and data into a separate RAID 5 array.  This was done for several reasons, many of which have swirled away from us, lost in the sands of time.  The main driving factors were that storage capacity was extremely expensive, disks were small, filesystems corrupted regularly and physical hard drives failed at a very high rate compared to other types of failures.  People were driven by a need to protect against physical hard drive failures, protect against filesystem corruption and acquire enough capacity to meet their needs.

Today the storage landscape has changed.  Filesystems are incredibly robust and corruption from the filesystem itself is almost unheard of and, thanks to technologies like journaling, can almost always be corrected quickly and effectively, protecting end users from data loss.  Almost no one worries about filesystem corruption today.

Modern filesystems are also able to handle far more capacity than they could previously.  It was not uncommon in the late 1990s and early 2000s to be able to easily build a drive array larger than any single filesystem could handle.  Today that is no longer reasonably the case, as all common filesystems handle many terabytes at least and often petabytes, exabytes or more of data.

Hard drives are much more reliable than they were in the late 1990s.  Failure rates for entire drives are very low, even in less expensive drives.  So low, in fact, that array failures (the loss of data across the entire RAID array) are now driven primarily by causes other than individual drive failures.  We no longer replace hard drives with wild abandon.  It is not unheard of for large arrays to run their entire lifespans without losing a single drive.

Capacities have scaled dramatically.  Instead of 4.3GB hard drives we are installing 3TB drives.  Nearly one thousand times more capacity on a single spindle compared to less than fifteen years ago.

These factors come together to create a need for a dramatically different approach to server storage design and a change to the “rule of thumb” about where to start when designing storage.

The old approach can be written as RAID 1 + RAID 5.  The RAID 1 space was used for the operating system while the RAID 5 space, presumably much larger, was used for data and applications.  This design split the two storage concerns, placing the operating system (which was very hard to recover in case of disaster and on which the data relied for accessibility) onto highly reliable RAID 1.  Lower cost RAID 5, while somewhat riskier, was typically chosen for data because the cost of storing data on RAID 1 was too high in most cases.  It was a tradeoff that made sense at the time.

Today, with our very different concerns, a new approach is needed, and this new approach is known as “One Big RAID 10” – meaning a single, large RAID 10 array with operating system, applications and data all stored together.  Of course, this is just what we say to make it handy; in a system without performance or capacity needs beyond a single disk we would say “One Big RAID 1”, but many people include RAID 1 in the RAID 10 group so it is just easier to say the former.

To be even handier, we abbreviate this to OBR10.

Because the cost of storage has dropped considerably and, instead of being at a premium, is typically in abundance today, because filesystems are incredibly reliable, because RAID 1 and RAID 10 share performance characteristics and because array failures not triggered by disk failure have moved from background noise to primary causes of data loss, the move to RAID 10 and the elimination of array splitting have become the new standard approach.

With RAID 10 we now have the highly available and resilient storage previously reserved for the operating system available to all of our data.  We get the benefit of mirrored RAID performance plus the benefit of extra spindles for all of our data.  We get better drive capacity utilization and better performance based on that improved utilization.
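
To make the capacity and spindle arithmetic concrete, here is a minimal sketch in Python, assuming six 3TB drives, an operating system footprint of roughly 100GB and the standard usable capacity formulas for each RAID level; all of the figures are illustrative assumptions rather than measurements.

    # Legacy split (RAID 1 + RAID 5) versus OBR10 on the same six 3TB drives.
    DRIVES = 6
    DRIVE_TB = 3.0
    OS_NEED_TB = 0.1  # assumed operating system footprint

    # Legacy split: 2 drives in RAID 1 for the OS, 4 drives in RAID 5 for data.
    os_usable = DRIVE_TB               # a RAID 1 pair yields one drive of capacity
    data_usable = (4 - 1) * DRIVE_TB   # RAID 5 loses one drive to parity
    stranded = os_usable - OS_NEED_TB  # space trapped on the dedicated OS array
    data_spindles = 4                  # data only ever touches the RAID 5 spindles

    # OBR10: all six drives in a single RAID 10 array shared by everything.
    obr10_usable = DRIVES * DRIVE_TB / 2    # mirroring halves raw capacity
    obr10_pool = obr10_usable - OS_NEED_TB  # the remainder is one pool for any workload
    obr10_spindles = DRIVES                 # every workload sees all six spindles

    print(f"Split : {data_usable:.1f}TB for data, {stranded:.1f}TB stranded, {data_spindles} spindles")
    print(f"OBR10 : {obr10_pool:.1f}TB shared pool, 0.0TB stranded, {obr10_spindles} spindles")

In this illustration the space available to data is essentially unchanged, but nothing sits idle on a dedicated operating system array, every workload gains all six spindles and all of the data enjoys RAID 10 protection rather than RAID 5.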

Even the traditional splitting of log files normally done with databases (the infamous RAID 1 + RAID 5 + RAID 1 approach) is no longer needed because RAID 10 keeps the optimum performance characteristics across all data.  With RAID 10 we eliminate almost all of the factors that once caused us to split arrays.
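
The performance side of that claim can be sketched with the widely used RAID write penalty rule of thumb (two back end I/Os per random write for mirrored RAID, four for RAID 5); the per-drive IOPS figure below is an assumed value for a 10K spindle, not a measured one.

    # Rough random write comparison using the common RAID write penalty
    # rule of thumb: RAID 10 costs 2 back end I/Os per write, RAID 5 costs 4.
    DRIVE_IOPS = 150  # assumed random IOPS for a single 10K spindle

    def random_write_iops(drives, penalty, per_drive=DRIVE_IOPS):
        """Approximate front end random write IOPS for an array."""
        return drives * per_drive / penalty

    raid5_data_array = random_write_iops(drives=4, penalty=4)  # legacy dedicated data array
    obr10_array = random_write_iops(drives=6, penalty=2)       # one big RAID 10

    print(f"4 drive RAID 5 data array : ~{raid5_data_array:.0f} random write IOPS")
    print(f"6 drive OBR10             : ~{obr10_array:.0f} random write IOPS")

Caching and sequential log writes change the exact numbers in practice, but the lower write penalty is why RAID 10 removes the need for the old log splitting tricks.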

The only significant factor not yet mentioned for which split arrays were traditionally seen as beneficial is access contention – different processes needing access to different parts of the disk at the same time, causing the drive heads to move around in a less than ideal pattern and reducing drive performance.  Contention was a big deal in the late 1990s when the old rule of thumb was developed.

Today, drive contention still exists but has been heavily mitigated by the use of large RAID caches.  In the late 90s drive caches were a few megabytes at best and often non-existent.  Today 256MB is a tiny cache and average servers are deployed with 1-2GB of cache on the RAID card alone.  Some systems are beginning to integrate additional solid state drive based caches to add a secondary cache beyond the memory cache on the controller.  These can easily add hundreds of gigabytes of extremely high speed cache that can shield nearly any spindle operation from contention.  So the issue of contention has been solved in other ways over the years and, like the other technology changes, this has effectively freed us from the traditional concerns that required us to split arrays.

Like array contention, another, far less common reason for splitting arrays in the late 1990s was to improve communications bus performance because of the limitations of the era’s SCSI and ATA technologies.  These, too, have been eliminated with the move to serial communications mechanisms, SAS and SATA, in modern arrays.  We are no longer limited to the capacity of a single bus for each array and can grow much larger with much more flexibility than previously.  Bus contention has been all but eliminated.

If there is a need to split off space for protection, such as against log file growth, this can be achieved through partitioning rather than through physical array splitting.  In general you will want to minimize partitioning as it increases overhead and lowers the ability of the drives to tune themselves, but there are cases where it is the better approach.  Even then, it does not require that the underlying physical storage be split as it traditionally was.  Even better than partitioning, when available, is logical volume management, which provides partition-like separation without the limitations of partitions.

So at the end of the day, the new rule of thumb for server storage is “One Big RAID 10.”  No more RAID 5, no more array splitting.  It’s about reliability, performance, ease of management and moderate cost effectiveness.  Like all rules of thumb, this does not apply to every single instance, but it does apply much more broadly than the old standard ever did.  RAID 1 + RAID 5, as a standard, was always an attempt to “make do” with something undesirable and to make the best of a bad situation.  OBR10 is not like that.  The new standard is a desired standard – it is how we actually want to run, not something with which we have been “stuck”.

When designing storage for a new server, start with OBR10 and only move away from it when it specifically does not meet your technology needs.  You should never have to justify using OBR10, only justify not using it.

 

Virtualization as a Standard Pattern

Virtualization as an enterprise concept is almost as old as business computing itself.  The value of abstracting computing from the bare hardware was recognized very early on and, almost as soon as computers had the power to manage the abstraction process, work began on implementing virtualization much as we know it today.

The earliest commonly accepted work on virtualization began in 1964 with the IBM CP-40 operating system, developed for the IBM System/360 mainframe.  This was the first real foray into commercial virtualization and the code and design from this early virtualization platform have descended into today’s IBM VM platform, which has been used continuously since 1972 as a virtualization layer for the IBM mainframe families over the decades.  Since IBM first introduced virtualization we have seen enterprise systems adopt this pattern of hardware abstraction almost universally.  Many large scale computing systems, minicomputers and mainframes, moved to virtualization during the 1970s with the bulk of all remaining enterprise systems doing so, as the power and technology became available to them, during the 1980s and 1990s.

The only notable holdout to virtualization for enterprise computing was the Intel IA32 (aka x86) platform, which lacked the advanced hardware resources necessary to implement effective virtualization until the advent of the extended AMD64 64-bit platform, and even then only with specific new technology.  Once this was introduced, the same high performance, highly secure virtualization was available across the board on all major platforms for business computing.

Because low cost x86 platforms lacked meaningful virtualization (outside of generally low performance software virtualization and niche high performance paravirtualization platforms) until the mid-2000s, virtualization was left almost completely off of the table for the vast majority of small and medium businesses.  This has led many dedicated to the SMB space to be unaware that virtualization is a well established, mature technology set that long ago established itself as the de facto pattern for business server computing.  The use of hardware abstraction is nearly ubiquitous in enterprise computing with many of the largest, most stable platforms having no option, at least no officially supported option, for running systems “bare metal.”

There are specific niches where hardware abstraction through virtualization is not advised, but these are extremely rare, especially in the SMB market.  Typical systems that should not be virtualized include latency sensitive systems (such as low latency trading platforms) and multi-server combined workloads, such as HPC compute clusters, where the primary goal is performance above stability and utility.  Neither of these is common in the SMB.

Virtualization offers many advantages.  Often, in the SMB where virtualization is less expected, it is assumed that virtualization’s goal is consolidation, where massive scale cost savings can occur, or providing new ways to achieve high availability.  Both of these are great options that can help specific organizations and situations but neither is the underlying justification for virtualization.  We can consolidate and achieve HA through other means, if necessary.  Virtualization simply provides us with a great array of options in those specific areas.

Many of the uses of virtualization are artifacts of the ecosystem, such as a potential reduction in licensing costs.  These types of advantages are not intrinsic to virtualization but do exist and cannot be overlooked in a real world evaluation.  Not all benefits apply to all hypervisors or virtualization platforms but nearly all apply across the board.  Hardware abstraction is a concept, not an implementation, so how it is leveraged will vary.  Conceptually, abstracting away hardware, whether at the storage layer, at the computing layer or elsewhere, is very important as it eases management, improves reliability and speeds development.

Here are some of the benefits from virtualization.  It is important to note that, outside of specific items such as consolidation and high availability, nearly all of these benefits apply not only when consolidating many workloads onto a single hardware node but even when running a single workload on that node.

  1. Reduced human effort and impact associated with hardware changes, breaks, modifications, expansion, etc.
  2. Storage encapsulation for simplified backup / restore process, even with disparate hardware targets
  3. Snapshotting of entire system for change management protection (see the sketch following this list)
  4. Ease of archiving upon retirement or decommission
  5. Better monitoring capabilities, adding out of band management even on hardware platforms that don’t offer this natively
  6. Hardware agnosticism provides for no vendor lock-in, as the guest operating systems see the hypervisor, rather than the physical hardware itself, as their hardware
  7. Easy workload segmentation
  8. Easy consolidation while maintaining workload segmentation
  9. Greatly improved resource utilization
  10. Hardware abstraction creates a significant, widely realized opportunity for improved system performance and stability while lowering the demands on guest operating systems and their driver writers
  11. Simplified deployment of new and varied workloads
  12. Simple transition from single platform to multi-platform hosting environments which then allow for the addition of options such as cloud deployments or high availability platform systems
  13. Redeployment of workloads to allow for easy physical scaling
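
As a small illustration of benefits two through four, here is a minimal sketch of taking a whole system snapshot before a change, assuming a KVM/QEMU host managed through the libvirt Python bindings; the connection URI and guest name are hypothetical placeholders.

    # Snapshot-before-change sketch using the libvirt Python bindings.
    # Assumes a local KVM/QEMU host and a guest named "app-server-01" (hypothetical).
    import libvirt

    conn = libvirt.open("qemu:///system")     # connect to the local hypervisor
    dom = conn.lookupByName("app-server-01")  # locate the guest to protect

    snapshot_xml = """
    <domainsnapshot>
      <name>pre-change</name>
      <description>Whole system snapshot taken before a planned change</description>
    </domainsnapshot>
    """

    dom.snapshotCreateXML(snapshot_xml, 0)    # capture the guest before the change
    print([snap.getName() for snap in dom.listAllSnapshots()])
    conn.close()

If the change goes badly, the snapshot can be reverted in moments, which is exactly the change management protection described above.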

In today’s computing environments, server-side workloads should be universally virtualized for these reasons.  The benefits of virtualization are extreme while the downsides are few and trivial.  The two common scenarios where virtualization still needs to be avoided are situations where there is specialty hardware that must be used directly on the server (this has become very rare today, but does still exist from time to time) and extremely low latency systems where sub-millisecond latencies are critical.  The second of these is common only in extremely niche business situations such as low latency investment trading systems.  Systems with these requirements will also have extreme networking and geolocational requirements, such as low latency InfiniBand with a fiber run to the trading floor of less than five miles.

Some people will point out that high performance computing clusters do not use virtualization, but this is a grey area as any form of clustering is, in fact, a form of virtualization.  It is simply that this is a “super-system” level of virtualization instead of being strictly at the system level.

It is safe to assume that in any scenario in which you should not use virtualization, you will know it beyond a shadow of a doubt and will be able to empirically demonstrate why virtualization is either physically or practically impossible.  For all other cases, virtualize.  Virtualize if you have only one physical server, one physical workload and just one user.  Virtualize if you are a Fortune 100 with the most demanding workloads.  And virtualize if you are anyone in between.  Size is not a factor in virtualization; we virtualize out of a desire to have a more effective and stable computing environment both today and into the future.

 

Choosing RAID for Hard Drives in 2013

After many, many articles, discussions, threads, presentations, questions and posts on choosing RAID, I have finally decided to publish my 2012-2013 high level guide to choosing RAID.  The purpose of this article is not to broadly explain or defend RAID choices but to present a concise guide to making an educated, studied decision for RAID that makes sense for a given purpose.

Today, four key RAID types exist for the majority of purposes: RAID 0, RAID 1, RAID 6 and RAID 10.  Each has a place where it makes the most sense.  RAID 1 and RAID 10, one simply being an application of the other, can handily be considered as a single RAID type with the only significant difference being the size of the array.  Many vendors refer to RAID 1 incorrectly as RAID 10 today because of this and, while this is clearly a semantic mistake, we will call them RAID 1/10 here to make decision making less complicated.  Together they can be considered the “mirrored RAID” family and the differentiation between them is based solely on the number of pairs in the array.  One pair is RAID 1, more than one pair is RAID 10.

RAID 0: RAID without redundancy.  RAID 0 is very fast and very fragile.  It has practically no overhead and requires the fewest hard disks in order to accomplish capacity and performance goals.  RAID 0 is perfect for situations where data is volatile (such as temporary caches) or where data is read only, there are solid backups and accessibility is not a key concern.  RAID 0 should never be used for live or critical data.

RAID 6: RAID 6 is the market standard today for parity RAID, the successor to RAID 5.  As such, RAID 6 is cost effective in larger arrays (five drives minimum, normally six or more drives) where performance and reliability are secondary concerns to cost.  RAID 6 is focused on cost effective capacity for near-line data.

RAID 1/10: Mirrored RAID provides the best speed and reliability, making it ideally suited for online data – any data where speed and reliability are the top concerns.  It is the only reasonable choice for arrays of four or fewer drives where the data is non-volatile.  With rare exception, mirrored RAID should be the de facto choice for any RAID array where specific technical needs do not clearly mandate a RAID 0 or RAID 6 solution.
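
For quick comparison, here is a minimal sketch of the usable capacity and worst case failure tolerance arithmetic behind these recommendations, assuming equal sized drives; the drive counts and the 3TB size are arbitrary example figures.

    # Usable capacity and worst case drive failure tolerance for the four
    # common RAID levels, assuming n equal drives (illustrative figures only).
    def raid_summary(level, n, size_tb):
        if level == "RAID 0":
            return n * size_tb, 0            # striping only, no redundancy
        if level == "RAID 1":
            return size_tb, n - 1            # every drive mirrors the same data
        if level == "RAID 10":
            return n * size_tb / 2, 1        # one loss per mirror pair; worst case is one
        if level == "RAID 6":
            return (n - 2) * size_tb, 2      # double parity
        raise ValueError(level)

    for level, n in [("RAID 0", 6), ("RAID 1", 2), ("RAID 10", 6), ("RAID 6", 6)]:
        usable, tolerated = raid_summary(level, n, size_tb=3.0)
        print(f"{level:7} x {n} drives: {usable:4.1f}TB usable, "
              f"survives {tolerated} drive failure(s) worst case")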

It is a rare circumstance where RAID 0 is required, very rare.  RAID 6 has a place in many organizations but almost never on its own.  Almost every organization should be relying on RAID 1 or 10 for its primary storage and potentially using other RAID types for special cases, such as backups, archives and caches.  It is a very, very rare business that would not have RAID 10 as the primary storage for the bulk of its systems.

Virtual Eggs and Baskets

In speaking with small business IT professionals, one of the key factors for hesitancy around deploying virtualization arises from what is described as “don’t put your eggs in one basket.”

I can see where this concern arises.  Virtualization allows for many guest operating systems to be contained in a single physical system which, in the event of a hardware failure, causes all guest systems residing on it to fail together, all at once.  This sounds bad, but perhaps it is not as bad as we would first presume.

The idea of the eggs and baskets idiom is that we should not put all of our resources at risk at the same time.  This is generally applied to investing, encouraging investors to diversify and invest in many different companies and types of securities like bonds, stocks, funds and commodities.  In the case of eggs (or money) we are talking about an interchangeable commodity.  One egg is as good as another.  A set of eggs are naturally redundant.

If we have a dozen eggs and we break six, we can still make an omelette, maybe a smaller one, but we can still eat.  Eating a smaller omelette is likely to be nearly as satisfying as a larger one – we are not going hungry in any case.  Putting our already redundant eggs into multiple baskets allows us to hedge our bets.  Yes, carrying two baskets means that we have less time to pay attention to either one so it increases the risk of losing some of the eggs but reduces the chances of losing all of the eggs.  In the case of eggs, a wise proposition indeed.  Likewise, a smart way to prepare for your retirement.

This theory, because it is repeated as an idiom without careful analysis or proper understanding, is then applied to unrelated areas such as server virtualization.  Servers, however, are not like eggs.  Servers, especially in smaller businesses, are rarely interchangeable commodities where having six working, instead of the usual twelve, is good enough.  Typically servers each play a unique role and all are relatively critical to the functioning of the business.  If a server is not critical then it is unlikely to be able to justify the cost of acquiring and maintaining itself in the first place and so would probably not exist.  When servers are interchangeable, such as in a large, stateless web farm or compute cluster, they are configured as such as a means of expanding capacity beyond the confines of a single, physical box and so fall outside the scope of this discussion.

IT services in a business are usually, at least to some degree, a “chain dependency.”  That is, they are interdependent and the loss of a single service may impact other services either because they are technically interdependent (such as a line of business application being dependent on a database) or because they are workflow interdependent (such as an office worker needing the file server working in order to provide a file which he needs to edit with information from an email while discussing the changes over the phone or instant messenger.)  In these cases, the loss of a single key service such as email, network authentication or file services may create a disproportionate loss of working ability.  If there are ten key services and one goes down, company productivity from an IT services perspective likely drops by far more than ten percent, possibly nearing one hundred percent in extreme cases.   This is not always true, in some unique cases workers are able to “work around” a lost service effectively, but this is very uncommon.  Even if people can remain working, they are likely far less productive than usual.

When dealing with physical servers, each server represents its own point of failure.  So if we have ten servers, we have ten times the likelihood of an outage compared to having only one of those same servers.  Each server that we add brings with it its own risk.  If we assume that each server fails, on average, once per decade and that each failure has an outage impact factor of 0.25 – that is, it financially impacts the business for twenty five percent of revenue for, say, one day – then our total average impact over a decade is the equivalent of two and a half total site outages.  I use the concept of factors and averages here to make this easy; determining the length or impact of an average outage is not necessary as we only need to determine relative impact in this case to compare the scenarios.  It’s just a means of comparing the cumulative financial impact of one event type to another without needing specific figures – this doesn’t help you determine what your spend should be, just relative reliability.

With virtualization we have the obvious ability to consolidate.  In this example we will assume that we can collapse all ten of these existing servers down into a single server.  When we do this we often trigger the “all our eggs in one basket” response.  But if we run some risk analysis we will see that this is usually just fear and uncertainty and not a mathematically supported risk.  If we assume the same risks as in the example above, our single server will, on average, incur just a single total site outage per decade.

Compare this to the first example which did the damage equivalent to two and a half total site outages – the risk of the virtualized, consolidated solution is only forty percent that of the traditional solution.
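
For anyone who wants to check the arithmetic, here is a minimal sketch of that comparison in Python, using the assumed figures from above: one failure per server per decade and a quarter of a full one day site outage per individual server failure.

    # Relative financial impact over a decade: ten physical servers versus one
    # consolidated virtualization host, using the assumed figures from the text.
    servers = 10
    failures_per_server_per_decade = 1.0

    impact_per_server_failure = 0.25  # fraction of a full one day site outage
    impact_full_site_failure = 1.00   # the consolidated host takes everything down

    split = servers * failures_per_server_per_decade * impact_per_server_failure
    consolidated = 1 * failures_per_server_per_decade * impact_full_site_failure

    print(f"Ten physical servers : {split:.1f} site outage equivalents per decade")
    print(f"One consolidated host: {consolidated:.1f} site outage equivalents per decade")
    print(f"Relative risk of consolidating: {consolidated / split:.0%}")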

Now keep in mind that this is based on the assumption that losing some services means a financial loss greater than the strict value of the services that were lost, which is almost always the case.  Even if the impact of a lost service is no more than its proportional share, we are only at break even and need not worry.  In rare cases the impact from losing a single system can be less than its “slice of the pie”, normally because people are flexible and can work around the failed system – like if instant messaging fails and people simply switch to using email until instant messaging is restored – but these cases are rare and are normally isolated to a few systems out of many, with the majority of systems, say ERP, CRM and email, having disproportionately large impacts in the event of an outage.

So what we see here is that under normal circumstances moving ten services from ten servers to ten services on one server will generally lower our risk, not increase it – in direct contrast to the “eggs in a basket” theory.  And this is purely from a hardware failure perspective.  Consolidation offers several other important reliability factors, though, that can have a significant impact to our case study.

With consolidation we reduce the amount of hardware that needs to be monitored and managed by the IT department.  Fewer servers means that more time and attention can be paid to those that remain.  More attention means a better chance of catching issues early and more opportunity to keep parts on hand.  Better monitoring and maintenance leads to better reliability.

Possibly the most important factor, however, with consolidation is that there are significant cost savings and these, if approached correctly, can provide opportunities for improved reliability.  With the dramatic reduction in total cost for servers it can be tempting to continue to keep budgets tight and attempt to purely leverage the cost savings directly.  That is understandable, and for some businesses it may be the correct approach.  But it is not the approach that I would recommend when struggling against the notion of eggs and baskets.

Instead, by applying a more moderate approach – keeping significant cost savings but still spending more, relatively speaking, on a single server – you can acquire a higher end (read: more reliable) server, use better parts, have on-site spares, etc.  The cost savings of virtualization can often be turned directly into increased reliability, further shifting the equation in favor of the single server approach.

As I stated in another article, one brick house is more likely to survive a wind storm than either one or two straw houses.  Having more of something doesn’t necessarily make it the more reliable choice.

These benefits come purely from the consolidation aspect of virtualization and not from the virtualization itself.  Virtualization provides extended risk mitigation features separately as well.  System imaging and rapid restores, as well as restores to different hardware, are major advantages of most any virtualization platform.  This can play an important role in a disaster recovery strategy.

Of course, all of these concepts are purely to demonstrate that single box virtualization and consolidation can beat the legacy “one app to one server” approach and still save money – showing that the example of eggs and baskets is misleading and does not apply in this scenario.    There should be little trepidation in moving from a traditional environment directly to a virtualized one based on these factors.

It should be noted that virtualization can then extend the reliability of traditional commodity hardware, providing mainframe-like failover features that are above and beyond what non-virtualized platforms are able to provide.  This moves commodity hardware more firmly into line with the larger, more expensive RISC platforms.  These features can bring an extreme level of protection but are often above and beyond what is appropriate for IT shops initially migrating from a non-failover, legacy hardware server environment.  High availability is a great feature but is often costly and very often unnecessary, especially as companies move from, as we have seen, relatively unreliable environments in the past to more reliable environments today.  Given that we have already increased reliability over what was considered necessary in the past there is a very good chance that an extreme jump in reliability is not needed now, but due to the large drop in the cost of high availability, it is quite possible that it will be cost justified where previously it could not be.

In the same vein, virtualization is often feared because it is seen as a new, unproven technology.  This is certainly untrue but there is an impression of this in the small business and commodity server space.  In reality, though, virtualization was first introduced by IBM in the 1960s and ever since then has been a mainstay of high end mainframe and RISC servers – those systems demanding the best reliability.  In the commodity server space virtualization was a larger technical challenge and took a very long time before it could be implemented efficiently enough to make it effective to use in the real world.  But even in the commodity server space virtualization has been available since the late 1990s and so is approximately fifteen years old today which is very far past the point of being a nascent technology – in the world of IT it is positively venerable.  Commodity platform virtualization is a mature field with several highly respected, extremely advanced vendors and products.  The use of virtualization as a standard for all or nearly all server applications is a long established and accepted “enterprise pattern” and one that now can easily be adopted by companies of any and every size.

Virtualization, perhaps counter-intuitively, is actually a very critical component of a reliability strategy.  Instead of adding risk, virtualization can almost be approached as a risk mitigation platform – a toolkit for increasing the reliability of your computing platforms through many avenues.