Comments on: When No Redundancy Is More Reliable – The Myth of Redundancy https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/ The Information Technology Resource for Small Business

By: Scott Alan Miller https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-34800 Fri, 31 Aug 2018 04:45:52 +0000 http://www.smbitjournal.com/?p=257#comment-34800 @andrew Actually I think you’ll find that you have the premise backwards. My entire point was that since R0 requires one fewer drive, it is less likely (for the same capacity and performance) to experience a drive failure at all. So rather than focusing on what happens after a failure, as you did, the entire point I was making was the opposite: you have to stop dealing with only that one portion of the risk and look at the risk as a whole. You can’t overlook the trigger for the rebuild and look only at the rebuild; you have to look at the full risk scenario. Just run the math, it’s that simple.

And yes, I believe I did show exactly what you are saying I should show… that by choosing R5 you clearly increase the risk of the initial failure, by a lot. That’s the fundamental premise here. I’m not sure how you missed that, as it was the entire basis of the article.

R5 requires N+1 disks compared to the N disks of RAID 0. That gives R0 equal capacity and a rather significant performance advantage in most scenarios. So the R0 array has one fewer disk that can fail, which is statistically significant in any real-world-sized array.
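To make that concrete, here is a minimal sketch (mine, not from the thread) of the “one fewer disk” point. The 3% annualized failure rate and the four-data-disk array size are illustrative assumptions only.

# Chance that at least one drive in an array fails within a year, for an
# N-disk RAID 0 versus an (N+1)-disk RAID 5 of the same usable capacity.
# The 3% AFR is an assumed figure, not one taken from the comment.

def p_any_drive_failure(drives: int, afr: float = 0.03) -> float:
    """Probability that at least one of `drives` independent disks fails in a year."""
    return 1 - (1 - afr) ** drives

data_disks = 4                              # assumed capacity requirement
p_r0 = p_any_drive_failure(data_disks)      # RAID 0 needs N disks
p_r5 = p_any_drive_failure(data_disks + 1)  # RAID 5 needs N + 1 disks

print(f"RAID 0 ({data_disks} disks): {p_r0:.1%} chance of a drive failure per year")
print(f"RAID 5 ({data_disks + 1} disks): {p_r5:.1%} chance of a drive failure per year")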

That’s before we take into account (and it allows us to ignore) the accepted industry knowledge that R5 increases wear and tear, making each individual drive more likely to experience failure as well. This comes partly from increased wear during reads, but almost entirely from the effects of the 4x write expansion. The drives are read more often, and written to dramatically more often, than their R0 counterparts – often 200 – 400% more. That’s a big number for wear and tear.
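As a rough, hypothetical illustration of that write expansion (the workload numbers below are assumptions, not figures from the thread): a small random write on R5 costs about four back-end I/Os (read data, read parity, write data, write parity) versus one on R0.

# Assumed workload: 1,000 front-end IOPS, half of them writes.
front_end_iops = 1000
write_fraction = 0.5

reads = front_end_iops * (1 - write_fraction)
writes = front_end_iops * write_fraction

backend_r0 = reads + writes            # 1 back-end I/O per read or write
backend_r5 = reads + writes * 4        # parity read-modify-write on each small write

print(f"RAID 0 back-end IOPS: {backend_r0:.0f}")
print(f"RAID 5 back-end IOPS: {backend_r5:.0f}")
print(f"Extra back-end work under RAID 5: {backend_r5 / backend_r0 - 1:.0%}")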

So take the incredibly obvious increased initial risk of having more moving parts to fail. Then work them drastically harder under the same workload and you’ve got a super clear picture of why the R5 array, apples to apples, suffers initial drive failure more often than the R0 array.

Now that we’ve established why R5 arrays need to recover more often, when we talk about how reliable R5 rebuild operations are, it is clear that those rebuilds must be reliable enough to overcome the additional risk posed by the increased initial failures that R5 causes.

By: Andrew https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-32266 Fri, 20 Apr 2018 10:58:28 +0000 http://www.smbitjournal.com/?p=257#comment-32266 I’m sorry but your whole conjecture is ridiculous. I know this is an old post, but in case someone stumbles across it again (like I have), I need to have a say here to correct some serious inaccuracies.

Firstly, though, I will say that the general message is definitely valid – as reiterated even by the non-IT guy:
“I do entirely agree with the position of not relying on half-baked redundancy ideas for long-term data maintenance and those who do deserve the consequences.”

To the point in question, however.

You want to argue that R5 is –less– resilient than R0. But your whole case is based on what happens –AFTER– the first drive failure. Now, no matter –what– the chances of the R5 successfully rebuilding are, there is still –SOME– chance. Whereas with R0, your data is already gone, 100%, with the first drive loss.

The only way that R5 could be -less- resilient than R0 is if you could show that the very fact of creating an array with write+parity write operations statistically increases the risk of an –initial– drive failure, such that the chance of the initial failure AND that of a secondary URE during a rebuild, together, rises past the likelihood of a single disk failure in an R0. And this is clearly never going to be the case. Well, not unless some ridiculously poorly-written RAID controller software does some stupid/insane activity with the disks while in the array.
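For what it’s worth, that condition is easy to put numbers on. Here is a minimal sketch with every input assumed for illustration (the 3% per-drive annual failure rate and the 50% rebuild-failure chance are made-up figures, and whichever array comes out riskier depends entirely on them):

def p_any_failure(drives: int, afr: float) -> float:
    """Chance that at least one of `drives` independent disks fails in a year."""
    return 1 - (1 - afr) ** drives

afr = 0.03               # assumed per-drive annualized failure rate
data_disks = 4           # disks needed for the target capacity
p_rebuild_fails = 0.5    # assumed; in reality depends on array size and URE rate

p_loss_r0 = p_any_failure(data_disks, afr)                        # any failure loses everything
p_loss_r5 = p_any_failure(data_disks + 1, afr) * p_rebuild_fails  # failure AND a failed rebuild

print(f"RAID 0 annual data-loss risk: {p_loss_r0:.1%}")
print(f"RAID 5 annual data-loss risk: {p_loss_r5:.1%}")

With these particular made-up inputs R5 comes out ahead; push the rebuild-failure chance high enough, or fold the extra parity-write wear into the AFR, and the comparison can flip.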

As for all these claims that “no-one uses R5 anymore”, etc.: I have worked, and still do work, in IT with server arrays. That is not the point though – unless you work with –every– single company, no-one here is qualified to say what “everyone” or ‘most’ companies/engineers are doing “these days”, and it makes no sense to claim it.

By: Scott Alan Miller https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-25641 Sat, 31 Dec 2016 22:41:25 +0000 http://www.smbitjournal.com/?p=257#comment-25641 In reply to Peter Kelly.

No, even very large RAID 5 arrays will still recover once in a while – but few large ones do. It does happen, but mostly in contrived tests that don’t reflect the real-world math, because of this effect: https://mangolassi.it/topic/5084/why-non-uniform-ure-distribution-may-make-parity-raid-riskier-than-thought

But keep in mind that effectively no large array is on RAID 5, and hasn’t been for a long time. Large arrays are normally on SAS, which makes RAID 5 overly expensive rather than “always going to fail”, because you are paying for lower URE failure rates. And no matter how risky the math appears, it never hits 100% failure (nor 100% recovery). So we will see large arrays occasionally recover, just as we commonly see tiny ones fail to recover even though their chances of recovery are very high.
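To illustrate the “paying for lower URE failure rates” point, here is a hypothetical calculation using the commonly published spec figures of one unrecoverable read error per 10^14 bits for consumer SATA and one per 10^15 bits for enterprise/SAS-class drives; the 12 TB rebuild-read size is an assumption of mine.

import math

def p_clean_rebuild(bytes_to_read: float, ure_per_bit: float) -> float:
    """Poisson approximation of re-reading the whole array without hitting a URE."""
    return math.exp(-bytes_to_read * 8 * ure_per_bit)

rebuild_read = 12e12   # assumed: 12 TB of surviving data re-read during the rebuild

for label, ure in [("1 per 10^14 bits (SATA)", 1e-14),
                   ("1 per 10^15 bits (SAS)", 1e-15)]:
    print(f"{label}: {p_clean_rebuild(rebuild_read, ure):.1%} chance of a clean rebuild")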

By: Peter Kelly https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-25544 Wed, 21 Dec 2016 12:20:14 +0000 http://www.smbitjournal.com/?p=257#comment-25544 I’m not deeply involved in IT (just a photographer with reasonable storage needs) and came across this article.

I agree wholeheartedly with the main point, which I believe is that people should not confuse redundancy with reliability or with the security of their data. However, this point is somewhat obscured by the completely erroneous mathematics! If it were correct, no moderately large RAID 5 array would ever be recovered. Never. Never, ever. I’m pretty sure, even with my limited experience, that this is wrong.

Indeed, I’ve seen it done. Still, I do entirely agree with the position of not relying on half-baked redundancy ideas for long-term data maintenance and those who do deserve the consequences.

By: metalisours https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-23272 Thu, 16 Jun 2016 16:33:24 +0000 http://www.smbitjournal.com/?p=257#comment-23272 I agree with Mr. Karl. Methinks a straw man is taking a beating here.

Way back when, many of us sysops felt that RAID-5 was a way to sell more disk drives to companies. Most real configurations were simply mirrored. And we also backed up to tape–and most importantly TESTED tape restores.

When we swapped out a failed drive in a mirror, it was always a crap-shoot. We knew it could fail–and hey, that’s why we had the tapes. But if it *didn’t* fail, bonus! We got a key piece of infrastructure up and running before midnight.

Redundancy should be implemented in some form – whether disk-to-disk backup, co-location, streaming tape, streaming replication… flash drives, whatever. RAID5 may not guarantee zero downtime or data loss, but it is only one layer in the sandwich of disaster recovery strategies. Then again, who (other than a salesman) said RAID5 guarantees these things without fail? If your server has a single primary fixed disk, you are dead in the water. With RAID5, there’s a good possibility that there’s still time to keep running, pull critical data, et cetera – depending on configuration and circumstances.

Unfortunately this article will be taken the wrong way by businesses, and since statistics is (unbelievably!) still an optional math discipline in public schools, I can pretty much guarantee that the end result will be a lopsided risk plan where saving a few grand by avoiding a passable DR system will be more important than protecting millions worth of data. And that’s just stupid.

That’s just the business perspective. Every day, millions of photographs, hours of audio, and pages of writing are stuffed into systems with no redundancy and no guarantee of serviceability. Data is lost, careers are ruined, and we all pay down the debt of poor disaster recovery planning with our wages and the prices we pay at the register.

By: Karl Fife https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-2845 Mon, 03 Feb 2014 06:52:50 +0000 http://www.smbitjournal.com/?p=257#comment-2845 I’m surprised how many people here are still talking about using hardware RAID for critical data. To be very frank, the idea of using ANY hardware RAID configuration (including the popular 1 & 5/6 configurations) is bordering on antiquarian. It may be time to re-evaluate. You may be “renting DVDs” when the rest of the world is streaming videos… On the beach… with an iPad-Super-Ultra-Air².

If your data is important to you, you should be storing it in a ZFS array. If your data matters, and you don’t think you need ZFS, you simply haven’t thought about it long enough. We’ve grown accustomed to our processors and memory evolving like crazy. Now it’s time for the storage paradigm to do the same.

With ZFS:
-Problems like “Bit rot”, write holes, data corruption… all disappear.
-You can create historic point-in-time snapshots (which can be re-mounted ‘read/write’ in parallel with your current data). Did a virus just destroy all of your data? No problem. Just go back in time to a few moments before the event.
-Checksums are so ample that the chance of undetected data corruption is effectively zero, even if your dataset has more bits than there are atoms in the universe.
-Monthly scrubs run on schedule to true-up data to its checksums. If data has changed due to bit rot, it can be corrected by a redundant copy.
-Copy on write architecture ensures that all changes to your data are written and validated before the previous version is ‘decoupled’ from the file name.
-Did the power go out right in the middle of writing to a critical file? No problem. The file is not corrupted because the previous state is still intact.
-Abstraction of physical storage pools from the logical datasets contained within them, so different data can have appropriate snapshotting, quota, compression, and replication policies applied.
-You can build as much redundancy as you like. You can create as many mirrored copies as you want (to use the author’s ‘water pump’ example), but better still, you can also replicate the entire file system to other instances on or off site, through a TCP socket or SSH tunnel. Following copies receive changed data blocks, not whole files, so data transfer is manageable. If you’re ultra-paranoid like me, you can use good old-fashioned RSync to another following server for mega-ultra redundancy by way of replicating with a fully dissimilar technology.
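A minimal sketch of that snapshot-and-replicate workflow, driven from Python; the pool, dataset, and host names are made up, and it assumes a system with the standard zfs snapshot / zfs send -i / zfs receive commands plus SSH access to the replica.

import subprocess
from datetime import datetime

dataset = "tank/photos"                  # hypothetical local dataset
replica_host = "backupbox"               # hypothetical SSH target
last_sent = "tank/photos@2014-02-01"     # most recent snapshot already on the replica

# Take a new point-in-time snapshot of the dataset.
new_snap = f"{dataset}@{datetime.now():%Y-%m-%d}"
subprocess.run(["zfs", "snapshot", new_snap], check=True)

# Incremental send: only blocks changed since `last_sent` cross the wire.
send = subprocess.Popen(["zfs", "send", "-i", last_sent, new_snap],
                        stdout=subprocess.PIPE)
subprocess.run(["ssh", replica_host, "zfs", "receive", "backup/photos"],
               stdin=send.stdout, check=True)
send.stdout.close()
send.wait()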

Oh yeah, did I mention this is all free? Sorry. I meant to.
Granted, ZFS on Solaris (or Solaris forks) or on *BSD can be a bit intimidating, but when ZFS is packaged in GUI-driven open-source distributions like FreeNAS, or in paid, supported ZFS products (like NexentaStor and TrueNAS), it becomes something that ‘normal’ admins (like me) can incorporate into enterprise (and personal) workflows. Even my personal Windows workstation libraries replicate to ZFS stores which in turn replicate offsite within minutes. House burned down? Oh well.

EVEN IF you ignore the mind-bending read performance that can be achieved with ZFS through large read caches (DRAM plus an L2 adaptive replacement cache), and EVEN IF you ignore the insane synchronous write performance when paired with a high-performance write cache device, the data integrity achievable with a ZFS instance is unmatched. You owe it to yourself, to your career, and perhaps to your employer to explore this technology.

Just sayin…

https://twitter.com/karlfife/status/429417133290688513

By: Luciano Lingnau https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-2795 Fri, 03 Jan 2014 17:22:05 +0000 http://www.smbitjournal.com/?p=257#comment-2795 Sir, you could not be more right. Thanks for “expressing” this, I definitely agree.

By: Silk https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-2784 Fri, 20 Dec 2013 15:41:10 +0000 http://www.smbitjournal.com/?p=257#comment-2784 A great write-up, I really enjoyed reading it.

James,

With a 4-drive RAID 5 you still get a very large volume. Even a 3TB array gives you roughly a 25% chance of encountering a URE during the rebuild process. I certainly won’t risk it, not in the business world.
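That figure is easy to sanity-check. A quick, hypothetical check assuming a consumer-class URE spec of 1 error per 10^14 bits and roughly 3 TB of data re-read during the rebuild:

import math

bits_read = 3e12 * 8                     # ~3 TB re-read during the rebuild, in bits
ure_per_bit = 1e-14                      # typical consumer SATA spec figure
expected_ures = bits_read * ure_per_bit  # about 0.24 expected errors
p_hit_ure = 1 - math.exp(-expected_ures)

print(f"Expected UREs during the rebuild: {expected_ures:.2f}")
print(f"Chance of hitting at least one: {p_hit_ure:.0%}")   # ~21%, the same ballpark as the 25% quoted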

By: RAID Guide [2] - Seite 23 https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-2765 Wed, 27 Nov 2013 18:13:00 +0000 http://www.smbitjournal.com/?p=257#comment-2765 […] http://www.zdnet.com/blog/storage/wh…ng-in-2009/162 Why RAID 6 stops working in 2019 | ZDNet When No Redundancy Is More Reliable – The Myth of Redundancy | SMB IT Journal RAID 5 and Uncorrectable Read Errors […]

By: james https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-2742 Fri, 08 Nov 2013 02:19:13 +0000 http://www.smbitjournal.com/?p=257#comment-2742 There are several problems with the indictment of RAID5.

First, the problems identified are more related to giant drives, and RAID volumes that span a number of them, than specifically to RAID5. The same URE rates apply to large volumes regardless of the underlying RAID level. The risks outlined are mitigated by using giant drives in smaller sets rather than all in one giant RAID. For example, a 12-drive RAID5 array of giant disks should be set up as three RAID5 arrays, each smaller and less risky to resilver. There is a capacity loss versus using one giant RAID5, but as you have pointed out, giant RAID volumes are not viable anyway.
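A back-of-the-envelope sketch of that trade-off, assuming twelve hypothetical 4 TB drives (the sizes are illustrative, not from the comment):

drives, size_tb = 12, 4

# One 12-drive RAID5: 11 data drives; a rebuild must re-read every surviving disk.
one_array_usable = (drives - 1) * size_tb
one_array_rebuild_read = (drives - 1) * size_tb

# Three 4-drive RAID5 arrays: one parity drive per array; only the affected array resilvers.
three_arrays_usable = 3 * (4 - 1) * size_tb
three_arrays_rebuild_read = (4 - 1) * size_tb

print(f"Single 12-drive RAID5: {one_array_usable} TB usable, {one_array_rebuild_read} TB re-read per rebuild")
print(f"Three 4-drive RAID5s:  {three_arrays_usable} TB usable, {three_arrays_rebuild_read} TB re-read per rebuild")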

Second, the brick house/straw house analogy is flawed. It is not a hurricane that takes out the first member of a redundant pair. Unlike a hurricane, the root cause that fails the first system (the brick house) is not nearly as likely to take out the straw house as well. This makes the redundant second system, even one built of straw, better than the article indicates.

Third, the car analogy is problematic. A system with redundancy built in is not like carrying a spare in the trunk, like a tire or a water pump. A fault-tolerant car with redundant systems has two working water pumps installed and can operate when one dies. It has more than four tires, such that one or more can go flat and the vehicle still operates. Same with the battery: a second one keeps the car working when the first fails.

Finally, the article says that redundancy is itself not a fix for risk, but then says that the fix is redundancy in all components (server, SAN, AND network). It is clear that redundancy is the best mitigation for the risk of failure.

In summary, the take-aways from this article:
1. do not use giant RAID volumes
2. when building redundant systems, make sure to include components that are often overlooked, like the network.

By: David https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-2311 Fri, 07 Jun 2013 06:55:18 +0000 http://www.smbitjournal.com/?p=257#comment-2311 Nice write-up. When using 6 or more drives, I have never considered RAID5 a ‘safe’ method of redundancy. How does RAID6 compare against the shortfalls you covered for RAID5 when rebuilding a degraded array?

By: Owen https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-1639 Wed, 08 Aug 2012 16:53:37 +0000 http://www.smbitjournal.com/?p=257#comment-1639 Fantastic write-up. I saw first-hand the exact scenario you laid out in the last part of your article, so I have seen the danger of jumping on the “Let’s get a SAN” bandwagon.

By: Scott Alan Miller https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-1629 Wed, 11 Jul 2012 19:09:54 +0000 http://www.smbitjournal.com/?p=257#comment-1629 Another good reference for RAID issues can be found here: http://queue.acm.org/detail.cfm?id=1670144

By: David Hay-Currie https://smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/comment-page-1/#comment-1625 Fri, 29 Jun 2012 21:34:03 +0000 http://www.smbitjournal.com/?p=257#comment-1625 SAM, I enjoyed this article quite a lot – a good follow-up after reading the OpenStorage article.
I have been gathering and reading a lot of information, and in the end (a couple of weeks ago) I finally decided that the best route was to use the internal datastore in our server, because of the higher I/O available and drives more reliable than our NAS, but to replicate the data to the NAS in case there is a problem with the local store. Since I already have two VM hosts (to separate resource consumption), this seemed more logical.
The weird thing is that I did not see much about doing this kind of deployment. I talked with a couple of people who put a lot of emphasis on vMotion and HA; however, they do not have large budgets either. I was really feeling lonely 🙂
However, the challenge of setting up this replication is still there.
I found this software:
http://www.stormagic.com/SvSAN.php
It can present a local datastore as a SAN (using iSCSI), but I am still wondering if there are other options out there.
