Category Archives: Storage

One Big RAID 10 – A New Standard in Server Storage

In the late 1990s the standard rule of thumb for building a new server was to put the operating system onto its own, small RAID 1 array and to put applications and data onto a separate RAID 5 array.  This was done for several reasons, many of which have swirled away from us, lost in the sands of time.  The main driving factors were that storage capacity was extremely expensive, disks were small, filesystems corrupted regularly and physical hard drives failed at a very high rate compared to other types of failures.  People were driven by a need to protect against physical hard drive failures, protect against filesystem corruption and acquire enough capacity to meet their needs.

Today the storage landscape has changed.  Filesystems are incredibly robust and corruption from the filesystem itself is almost unheard of; thanks to technologies like journalling, it can almost always be corrected quickly and effectively, protecting end users from data loss.  Almost no one worries about filesystem corruption today.

Modern filesystems are also able to handle far more capacity than they could previously.  In the late 1990s and early 2000s it was not uncommon to be able to build a drive array larger than any single filesystem could handle.  Today that is rarely the case, as all common filesystems handle many terabytes at least and often petabytes, exabytes or more of data.

Hard drives are much more reliable than they were in the late 1990s.  Failure rates for an entire drive failing are very low, even in less expensive drives.  So low, in fact, that when we think about losing data in a RAID array we are now concerned primarily with the failure of the array as a whole rather than with the failure of individual hard drives.  We no longer replace hard drives with wild abandon.  It is not unheard of for large arrays to run their entire lifespans without losing a single drive.

Capacities have scaled dramatically.  Instead of 4.3GB hard drives we are installing 3TB drives.  Nearly one thousand times more capacity on a single spindle compared to less than fifteen years ago.

These factors come together to create a need for a dramatically different approach to server storage design and a change to the “rule of thumb” about where to start when designing storage.

The old approach can be written RAID 1 + RAID 5.  The RAID 1 space was used for the operating system while the RAID 5 space, presumably much larger, was used for data and applications.  This design split the two storage concerns: maximum effort went into protecting the operating system (which was very hard to recover in case of disaster and on which the data relied for accessibility) by placing it on highly reliable RAID 1, while lower cost RAID 5, though somewhat riskier, was typically chosen for data because the cost of storing data on RAID 1 was too high in most cases.  It was a tradeoff that made sense at the time.

Today, with our very different concerns, a new approach is needed, and this new approach is known as “One Big RAID 10” – meaning a single, large RAID 10 array with operating system, applications and data all stored together.  Of course, this is just shorthand: in a system without performance or capacity needs beyond a single mirrored pair we would say “One Big RAID 1”, but many people include RAID 1 in the RAID 10 group, so it is just easier to say the former.

To be even handier, we abbreviate this to OBR10.

Because the cost of storage has dropped considerably and, instead of being at a premium, is typically in abundance today; because filesystems are incredibly reliable; because RAID 1 and RAID 10 share performance characteristics; and because array failures not triggered by disk failure have moved from background noise to a primary cause of data loss, the move to RAID 10 and the elimination of array splitting has become the new standard approach.

With RAID 10 we now have the highly available and resilient storage previously reserved for the operating system available to all of our data.  We get the benefit of mirrored RAID performance plus the benefit of extra spindles for all of our data.  We get better drive capacity utilization and better performance based on that improved utilization.
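To make the spindle-pooling point concrete, here is a rough, illustrative Python sketch (my own simplification, using the usual textbook write-penalty figures of 2 for mirrors and 4 for single parity) comparing a traditional six-drive split of RAID 1 plus RAID 5 against a six-drive OBR10; the numbers are back-of-the-envelope estimates, not benchmarks.

```python
# Rough, illustrative comparison of the traditional split layout (RAID 1
# for the OS plus RAID 5 for data) versus One Big RAID 10 (OBR10) using
# the same six hypothetical 3TB drives.  The write-penalty figures
# (2 for mirrors, 4 for single-parity RAID) are the usual textbook
# approximations, not measured results.

DRIVE_TB = 3  # hypothetical drive size

def raid1(drives):
    return {"usable_tb": DRIVE_TB, "read_spindles": drives, "write_penalty": 2}

def raid5(drives):
    return {"usable_tb": DRIVE_TB * (drives - 1), "read_spindles": drives, "write_penalty": 4}

def raid10(drives):
    return {"usable_tb": DRIVE_TB * drives // 2, "read_spindles": drives, "write_penalty": 2}

# Traditional split: 2 drives for the OS, 4 drives for data.  Note that the
# spindles and any free space of one array cannot help the other.
split = {"os": raid1(2), "data": raid5(4)}

# OBR10: all six spindles serve every read and write, and all usable space
# sits in one pool.
obr10 = raid10(6)

print("Split :", split)
print("OBR10 :", obr10)
```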

Even the traditional separation of database log files onto their own array (the infamous RAID 1 + RAID 5 + RAID 1 approach) is no longer needed, because RAID 10 maintains optimum performance characteristics across all data.  With RAID 10 we eliminate almost all of the factors that once caused us to split arrays.

The only significant factor not yet mentioned for which split arrays were traditionally seen as beneficial is access contention – different processes needing access to different parts of the disk at the same time, causing the drive heads to move in a less than ideal pattern and reducing drive performance.  Contention was a big deal in the late 1990s when the old rule of thumb was developed.

Today, drive contention still exists but has been heavily mitigated by the use of large RAID caches.  In the late 90s drive caches were a few megabytes at best and often non-existent.  Today 256MB is a tiny cache and average servers are deployed with 1-2GB of cache on the RAID card alone.  Some systems are beginning to integrate additional solid state drive based caches to add a secondary cache beyond the memory cache on the controller.  These can easily add hundreds of gigabytes of extremely high speed cache that can buffer nearly any spindle operation from needing to worry about contention.  So the issue of contention has been addressed in other ways over the years and, like other technology changes, this has effectively freed us from one of the traditional concerns that required us to split arrays.

Like drive contention, another, far less common reason for splitting arrays in the late 1990s was to improve communications bus performance, given the limitations of the era’s SCSI and ATA technologies.  This, too, has been eliminated with the move to serial communications mechanisms, SAS and SATA, in modern arrays.  We are no longer limited to the capacity of a single bus for each array and can grow much larger with much more flexibility than previously.  Bus contention has been all but eliminated.

If there is a need to split off space for protection, such as against log file growth, this can be achieved through partitioning rather than through physical array splitting.  In general you will want to minimize partitioning, as it increases overhead and lowers the ability of the drives to tune themselves, but there are cases where it is the better approach.  Either way, it does not require that the underlying physical storage be split as it traditionally was.  Even better than partitioning, when available, is logical volume management, which provides partition-like separation without the limitations of partitions.

So at the end of the day, the new rule of thumb for server storage is “One Big RAID 10.”  No more RAID 5, no more array splitting.  It’s about reliability, performance, ease of management and moderate cost effectiveness.  Like all rules of thumb, this does not apply to every single instance, but it does apply much more broadly than the old standard ever did.  RAID 1 + RAID 5, as a standard, was always an attempt to “make do” with something undesirable and to make the best of a bad situation.  OBR10 is not like that.  The new standard is a desired standard – it is how we actually want to run, not something with which we have been “stuck”.

When designing storage for a new server, start with OBR10 and only move away from it when it specifically does not meet your technology needs.  You should never have to justify using OBR10, only justify not using it.

 

Choosing RAID for Hard Drives in 2013

After many, many articles, discussions, threads, presentations, questions and posts on choosing RAID, I have finally decided to publish my 2012-2013 high level guide to choosing RAID.  The purpose of this article is not to broadly explain or defend RAID choices but to present a concise guide to making an educated, studied decision for RAID that makes sense for a given purpose.

Today, four key RAID types exist for the majority of purposes: RAID 0, RAID 1, RAID 6 and RAID 10.  Each has a place where it makes the most sense.  RAID 1 and RAID 10, one simply being an application of the other, can handily be considered as a single RAID type with the only significant difference being the size of the array.  Many vendors refer to RAID 1 incorrectly as RAID 10 today because of this and, while this is clearly a semantic mistake, we will call them RAID 1/10 here to make decision making less complicated.  Together they can be considered the “mirrored RAID” family and the differentiation between them is based solely on the number of pairs in the array.  One pair is RAID 1, more than one pair is RAID 10.
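As a trivial restatement of that distinction, here is a tiny Python sketch; the helper name is hypothetical, purely for illustration.

```python
# The “mirrored RAID” family as described above: the only difference
# between RAID 1 and RAID 10 is the number of mirrored pairs in the array.

def mirrored_raid_name(pairs):
    if pairs < 1:
        raise ValueError("a mirrored array needs at least one pair")
    return "RAID 1" if pairs == 1 else "RAID 10"

print(mirrored_raid_name(1))  # RAID 1  (one pair, two drives)
print(mirrored_raid_name(3))  # RAID 10 (three pairs, six drives)
```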

RAID 0: RAID without redundancy.  RAID 0 is very fast and very fragile.  It has practically no overhead and requires the fewest hard disks in order to accomplish capacity and performance goals.  RAID 0 is perfect for situations where data is volatile (such as temporary caches) or where data is read only, solid backups exist and accessibility is not a key concern.  RAID 0 should never be used for live or critical data.

RAID 6: RAID 6 is the market standard today for parity RAID, the successor to RAID 5.  As such, RAID 6 is cost effective in larger arrays (five drives minimum, normally six or more drives) where performance and reliability are secondary concerns to cost.  RAID 6 is focused on cost effective capacity for near-line data.

RAID 1/10: Mirrored RAID provides the best speed and reliability, making it ideally suited for online data – any data where speed and reliability are the top concerns.  It is the only reasonable choice for arrays of four or fewer drives where the data is non-volatile.  With rare exception, mirrored RAID should be the de facto choice for any RAID array where specific technical needs do not clearly mandate a RAID 0 or RAID 6 solution.
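To summarize the guide so far as something executable, here is a small Python sketch; the workload labels are my own illustrative shorthand, not terms from the article or any standard.

```python
# Sketch of the decision guide above.  The workload labels ("volatile",
# "read-only with backups", "near-line capacity", "online") are
# illustrative shorthand only.

def choose_raid(workload):
    if workload in ("volatile", "read-only with backups"):
        return "RAID 0"     # fast and fragile; never for live or critical data
    if workload == "near-line capacity":
        return "RAID 6"     # cost-effective capacity in larger arrays
    return "RAID 1/10"      # default: online data where speed and reliability lead

print(choose_raid("online"))              # RAID 1/10
print(choose_raid("near-line capacity"))  # RAID 6
```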

It is a rare circumstance where RAID 0 is required, very rare.  RAID 6 has a place in many organizations but almost never on its own.  Almost every organization should be relying on RAID 1 or 10 for its primary storage and potentially using other RAID types for special cases, such as backups, archives and caches.  It is a very, very rare business that would not have RAID 10 as the primary storage for the bulk of its systems.

Choosing a RAID Level by Drive Count

In addition to all other factors, the number of drives available to you plays a significant role in choosing what RAID level is appropriate for you.  Ideally RAID is chosen ahead of time in conjunction with chassis and drives in a holistic approach so that the entire system is engineered for the desired purpose, but even in these cases, knowing how drive counts can affect useful RAID choices can be very helpful.

To simplify the list, RAID 0 has been left off of it.  RAID 0 is a viable choice for certain niche business scenarios at any count of drives, so there is no need to display it on the list.  Also, the list assumes that a hot spare, if it exists, is not included in the count, as it sits “outside” of the RAID array and so is not part of the array drive count.  A small code sketch of this mapping appears after the list and its footnotes.

2 Drives: RAID 1

3 Drives: RAID 1 *

4 Drives: RAID 10

5 Drives: RAID 6

6 Drives: RAID 6 or RAID 10

7 Drives: RAID 6 or RAID 7

8 Drives: RAID 6 or RAID 7 or RAID 10 **

9 Drives: RAID 6 or RAID 7

10 Drives: RAID 6 or RAID 7 or RAID 10 or RAID 60/61

11 Drives: RAID 6 or RAID 7 

12 Drives: RAID 6 or RAID 7 or RAID 10 or RAID 60/61

13 Drives: RAID 6 or RAID 7

14 Drives: RAID 6 or RAID 7 or RAID 10 or RAID 60/61 or RAID 70/71

15 Drives: RAID 6 or RAID 7 or RAID 60

16 Drives: RAID 6 or RAID 7 or RAID 10 or RAID 60/61 or RAID 70/71

17 Drives: RAID 6 or RAID 7

18 Drives: RAID 6 or RAID 7 or RAID 10 or RAID 60/61 or RAID 70/71

19 Drives: RAID 6 or RAID 7

20 Drives: RAID 6 or RAID 7 or RAID 10 or RAID 60/61 or RAID 70/71

21 Drives: RAID 6 or RAID 7 or RAID 60 or RAID 70

22 Drives: RAID 6 or RAID 7 or RAID 10 or RAID 60/61 or RAID 70/71

23 Drives: RAID 6 or RAID 7

24 Drives: RAID 6 or RAID 7 or RAID 10 or RAID 60/61 or RAID 70/71

25 Drives: RAID 6 or RAID 7 or RAID 60

………

* RAID 1 is technically viable at any drive count of two or more.  I have included it only up to three drives because using it beyond that point is generally considered absurd and is completely unheard of in the real world.  Technically it would continue to provide equal write performance while continuing to increase in read performance and reliability as more drives are added to the mirror, but for reasons of practicality I have included it only twice on the list, where it would actually be useful.

** At six drives and higher both RAID 6 and RAID 10 are viable options for arrays of even drive counts and RAID 6 alone is a viable option for odd numbered drive array counts.
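The basic pattern of the list can be sketched in Python as below; this is a deliberate simplification that encodes only the core rules and ignores the nested 60/61/70/71 options that appear at larger even counts.

```python
# Simplified sketch of the drive-count list above: RAID 1 at two or three
# drives, RAID 10 at even counts of four or more, RAID 6 from five drives,
# RAID 7 from seven.  RAID 0 is omitted for the same reason it is omitted
# from the list, and the nested levels (60/61/70/71) are left out.

def viable_raid_levels(drives):
    levels = []
    if drives in (2, 3):
        levels.append("RAID 1")
    if drives >= 4 and drives % 2 == 0:
        levels.append("RAID 10")
    if drives >= 5:
        levels.append("RAID 6")
    if drives >= 7:
        levels.append("RAID 7")  # triple parity (e.g. RAIDZ3); non-standard and rare
    return levels

for n in (2, 4, 5, 7, 12):
    print(n, "drives:", viable_raid_levels(n))
```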

For this list I have considered only the standard RAID levels of 0, 1, 4, 5, 6 and 10.  I left 0 off of the list because it is always viable for certain use cases.  RAID 5 never appears because there is no case on spindle hard drives today where it should be used; as RAID 5 is an enhancement of RAID 4, RAID 4 does not appear on the list either.  Non-standard double parity RAID solutions such as NetApp’s RAID-DP and Oracle’s RAIDZ2 can be treated as derivations of RAID 6 and apply accordingly.  Oracle’s triple parity RAIDZ3 (sometimes called RAID 7) would apply at seven drives and higher but is a non-standard level and extremely rare, so it is included on the list only for completeness.

More commonly, RAID 6 makes sense at six drives or more and RAID 7 at eight drives or more.

Like RAID 4 and 5, RAID levels based on them (RAID 40, 50, 41, 51, 55, etc.) are not appropriate any longer due to the failure and fragility modes of spindle-based hard drives.  Complex RAID levels based on RAID 6 and 7 (60, 61, 70, 71, etc.) have a place but are exceedingly rare as they generally have very little cost savings compared to RAID 10 but suffer from performance issues and increased risk.  RAID 61 and 71 are almost exclusively effective when the highest order RAID, the mirror component, is over a network rather than local on the system.

Hardware and Software RAID

RAID, Redundant Array of Inexpensive Disks, systems are implemented in one of two basic ways: software or dedicated hardware.  Both methods are very viable and have their own merits.

In the small business space, where Intel and AMD architecture systems and Windows operating systems rule, hardware RAID is so common that a lot of confusion has arisen around software RAID due, as we will see, in no small part to the wealth of scam software RAID products touted as dedicated hardware and known colloquially as “Fake RAID.”

When RAID was first developed, it was used, in software, on high end enterprise servers running things like proprietary UNIX where the systems were extremely stable and the hardware was very powerful and robust making software RAID work very well.  Early RAID was primarily focused on mirrored RAID or very simplistic parity RAID (like RAID 2) which had little overhead.

As the need for RAID began to spill into the smaller server space, and as parity RAID grew in popularity and demanded greater processing power, it became an issue that the underpowered processors in the x86 space were significantly impacted by the processing load of RAID, especially RAID 5.  This, combined with the fact that almost none of the operating systems heavily used on these platforms had software RAID implementations, led to the natural development of hardware RAID – an offload processor board (similar to a GPU for graphics) that had a complete computer on board, with a CPU, memory and firmware all of its own.

Hardware RAID worked very well at solving the RAID overhead problem in the x86 server space.  As CPUs gained more power and memory became less scarce, popular x86 operating systems like Windows Server began to offer software RAID options.  Windows software RAID, specifically, was known as a poor implementation and was available only on server operating system versions, which caused a lack of appreciation for software RAID in the community of system administrators working primarily with Windows.

Because of these historical implementations in the enterprise server space and the commodity x86 space, a natural separation formed between the two markets, supported initially by technology and later purely by ideology.  If you talk to a system administrator in the commodity space you will almost universally hear that hardware RAID is the only option.  Conversely, if you talk to a system administrator in the mainframe, RISC (Sparc, Power, ARM) or EPIC (Itanium) server space (sometimes called the UNIX server space) you will often be met with surprise, as hardware RAID isn’t available for those classes of systems – software RAID is simply a foregone conclusion.  Neither camp seems to have real knowledge of the situation in the opposite one, and crossover in skill sets between the two was relatively rare until recently, as enterprise UNIX platforms like Linux, Solaris and FreeBSD have started to become very popular and well understood on commodity hardware platforms.

To make matters more confusing for the commodity server space, a large number of vendors began selling non-RAID controller cards along with a “driver” that was actually software RAID, pretending that the resulting product was hardware RAID – filling the vacuum left by the dominant operating system vendor’s lack of software RAID for the non-server operating system market while marketing to a less technically savvy audience.  This created a large amount of confusion at best and an incredible disdain for software RAID at worst, as any product whose core function is to protect data but whose market is built upon deception and confusion will almost universally end in disaster.  Fake RAID systems routinely have issues with performance and reliability.  While, in theory, a third party software RAID package is a reasonable option, the reality of the software RAID market is that essentially all quality software RAID implementations are native components of either the operating system itself (Linux, Mac OS X, Solaris, Windows) or of the filesystem (ZFS, VxFS, BtrFS) and are provided and maintained by primary vendors, leaving little room or purpose for third party products outside of the Windows desktop space, where a few small, legitimate software RAID players do exist but are often overshadowed by the Fake RAID players.

Today there is almost no need for hardware RAID, as commodity platforms are incredibly powerful and there is almost always a dramatic excess of both computational and memory resources.  Hardware RAID instead competes mostly on features rather than on reducing resource load.  Selection of hardware RAID versus software RAID in the commodity server space is almost completely one of preference and market momentum rather than of specific performance or features – both approaches are essentially equal, and individual implementations are far more important to consider when evaluating product options than the hardware versus software distinction on its own.

Today hardware RAID offerings tend to be more “generic”, with rather vanilla implementations of standard RAID levels.  Hardware RAID tends to earn its value through resource utilization reduction (CPU and memory offload), the ability to “blind swap” failed drives, simplified storage management, block level storage agnostically abstracted from the operating system, fast cache close to the drives and battery or flash backed cache.  Software RAID tends to earn its value through lower power consumption, lower cost of acquisition, integrated management with the operating system, unique or advanced RAID features (such as ZFS’ RAIDZ, which does not suffer from the standard RAID 5 write hole) and generally better overall performance.  It is truly not a discussion of better or worse, but of better or worse for a very specific situation, with the most important factor often being familiarity and comfort and/or the default vendor offering.

One of the most overlooked but important differentiators between hardware and software RAID is the change in the job role associated with RAID array management.  Hardware RAID moves the handling of the array to the server administrator (the support role that works on the physical server and is stationed in the datacenter) whereas software RAID moves the handling of the array to the system administrator (the support role working on the operating system and above and rarely sitting in the datacenter.)  In the SMB market this factor might be completely overlooked but in a Fortune 500 the difference in job role can be very significant.  In many cases with hardware RAID, disk replacements and system setup can be done without the need for system administrator intervention.  Datacenter server administrators can discover failed drives either through alerts or by looking for “amber lights” during walkthroughs and do replacements on the fly without needing to contact anyone or know what the server is even running.  Software RAID almost always requires the system administrator to be involved: offlining the failed disk, coordinating the replacement with the datacenter and onlining the new disk once the replacement is complete.

Because of the way that CPU offloading and performance work, and because of some advantages in the way that non-standard RAID implementations often handle parity RAID reconstruction, there is a tendency for mirrored RAID levels to favor hardware RAID and for parity RAID levels to favor software RAID.  Parity RAID is drastically more CPU intensive, so having access to the powerful central CPU resources can be a major factor in speeding up RAID calculations.  With mirrored RAID, where reconstruction is far safer than with parity RAID and where automated rebuilds are more important, hardware RAID brings the benefit of allowing blind drive replacement very easily.
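To illustrate why parity RAID is so CPU hungry compared to simple mirroring, here is a minimal, purely illustrative Python sketch of single-parity (RAID 5 style) XOR parity; a real implementation works on large blocks and, for RAID 6, adds a second Galois-field syndrome that is costlier still.

```python
# Minimal sketch of single-parity (RAID 5 style) XOR parity, purely to
# illustrate why parity RAID consumes CPU: every full-stripe write must
# compute parity across all data chunks, and every degraded read or
# rebuild must recompute the missing chunk the same way.  Not a real
# RAID implementation.

def xor_parity(chunks):
    """XOR all data chunks together to produce the parity chunk."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)

def rebuild_missing(surviving_chunks, parity):
    """Reconstruct one lost chunk from the survivors plus parity."""
    return xor_parity(list(surviving_chunks) + [parity])

# A stripe across three data disks plus one parity disk (four drives total)
data = [b"AAAA", b"BBBB", b"CCCC"]
p = xor_parity(data)

# Simulate losing the second data disk and rebuilding it
recovered = rebuild_missing([data[0], data[2]], p)
assert recovered == data[1]
print("recovered:", recovered)
```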

One aspect of the hardware and software RAID discussion that is extremely paradoxical is that the market that often dismisses software RAID out of hand as being inferior to hardware RAID overlaps almost completely (you can picture the Venn diagram in your head here) with the market that feels that file servers are inferior to commodity NAS appliances – yet those NAS appliances in the SMB range are almost universally based on the same software RAID implementations being casually dismissed.  So software RAID is often considered both inferior and superior simultaneously.  Some NAS devices in the SMB range, and NAS appliance software, that are software RAID based include: Netgear ReadyNAS, Netgear ReadyData, Buffalo Terastation, QNAP, Synology, OpenFiler, FreeNAS, Nexenta and NAS4Free.

There is truly no “always use one way or the other” with hardware and software RAID.  Even giant, six figure enterprise NAS and SAN appliances are undecided as to which to use with part of the industry going each direction.  The real answer is it depends on your specific situation – your job role separation, your technical needs, your experience, your budget, etc.  Both options are completely viable in any organization.