{"id":681,"date":"2015-01-05T15:26:31","date_gmt":"2015-01-05T20:26:31","guid":{"rendered":"http:\/\/www.smbitjournal.com\/?p=681"},"modified":"2017-02-19T05:09:19","modified_gmt":"2017-02-19T10:09:19","slug":"practical-raid-performance","status":"publish","type":"post","link":"https:\/\/smbitjournal.com\/2015\/01\/practical-raid-performance\/","title":{"rendered":"Practical RAID Performance"},"content":{"rendered":"

Choosing a RAID level is an exercise in balancing many factors including cost, reliability, capacity and, of course, performance. RAID performance can be difficult to understand, especially as different RAID levels use different techniques and, in some cases, behave quite differently from one another. In this article I want to explore the common RAID levels of RAID 0, 5, 6 and 10 to see how performance differs between them.

For the purposes of this article, RAID 1 will be treated as a subset of RAID 10. This is often a handy way to think of RAID 1: as simply a RAID 10 array with only a single mirrored pair. Since RAID 1 truly is a single-pair RAID 10 and behaves as such, this makes RAID performance easy to understand, because RAID 1 simply maps onto the RAID 10 performance curve.

There are two types of performance to look at with all storage: reading and writing. In terms of RAID, reading is extremely easy and writing is rather complex. Read performance is effectively stable across all RAID types. Writing, however, is not.

To make discussing performance easier we need to define a few terms, as we will be working with some equations. In our discussions we will use N to represent the total number of drives, often referred to as spindles, in our array, and we will use X to refer to the performance of each drive individually. This allows us to talk about relative performance as a factor of the drive performance, abstracting away the RAID array so that we do not have to think in terms of raw IOPS. This is important because IOPS are often very hard to define, but we can compare performance in a meaningful way by speaking of it in relation to the individual drives within the array.

It is also important to remember that we are only talking about the performance of the RAID array itself, not an entire storage subsystem. Features such as memory caches and solid state caches will do amazing things to alter the overall performance of a storage subsystem, but they do not fundamentally change the performance of the RAID array itself under the hood. There is no simple formula for determining how different cache options will impact overall performance; suffice it to say that the effect can be very dramatic, but it depends heavily on the cache choices themselves as well as on the workload. Even the biggest, fastest, most robust cache options cannot change the long term, sustained performance of an array.

RAID is complex and many factors influence the final performance. One is the implementation of the RAID system itself. A poor implementation might introduce latency or may fail to take advantage of the available spindles (such as having a RAID 1 array read from only a single disk instead of from both simultaneously!) There is no easy way to account for deficiencies in specific RAID implementations, so we must assume that all are working to the limits of the specification as, indeed, any enterprise RAID system will do. It is primarily hobby and consumer RAID systems that fail to do this.

Some types of RAID also have dramatic amounts of computational overhead associated with them while others do not. It is primarily the parity RAID levels that require heavy processing in order to handle write operations, with different levels requiring different amounts of computation per operation. This introduces latency, but does not curtail throughput. This latency will vary, however, based on the implementation of the RAID level as well as on the processing capability of the system in question. Hardware RAID will use something like a general purpose CPU (often a Power or ARM RISC processor) or a custom ASIC to handle this, while software RAID hands this off to the server's own CPU. Often the server CPU is actually faster here, but it consumes system resources. ASICs can be very fast but are expensive to produce. This latency impacts storage performance but is very difficult to predict and can vary from nominal to dramatic. So I will mention the relative latency impact with each RAID level but will not attempt to measure it. In most RAID performance calculations this latency is ignored, but it is important to understand that it is present and could, depending on the configuration of the array, have a noticeable impact on a workload.

There is, it should be mentioned, a tiny performance impact to read operations due to inefficiencies in the layout of data on the disk itself. Parity RAID requires data to be stored on the disks that is useless during a healthy read operation and cannot be used to speed it up. This actually results in reads being slightly slower. But this impact is ridiculously small, is normally not measured and so can be ignored.

Factors such as stripe size also impact performance, of course, but as that is configurable and not an intrinsic artifact of any RAID level I will ignore it here. It is not a factor when choosing a RAID level itself but only in configuring one once chosen.

The final factor that I want to mention is the read to write ratio of storage operations. Some RAID arrays will be used almost purely for read operations, some almost solely for write operations, but most use a blend of the two, likely something like eighty percent read and twenty percent write. This ratio is very important in understanding the performance that you will get from your specific RAID array and in understanding how each RAID level will impact you. I refer to this as the read/write blend.

We measure storage performance primarily in IOPS. IOPS stands for Input/Output Operations Per Second (yes, I know that the letters don't line up well, it is what it is.) I further use the terms RIOPS for Read IOPS, WIOPS for Write IOPS and BIOPS for Blended IOPS, which comes with a ratio such as 80/20. Many people talk about storage performance with a single IOPS number. When this is done they normally mean Blended IOPS at 50/50. However, rarely does any workload run at 50/50, so that number can be extremely misleading. Two numbers, RIOPS and WIOPS, are what is needed to understand performance, and these two together can be used to find any IOPS blend that is needed. For example, a 50/50 blend is as simple as (RIOPS * .5) + (WIOPS * .5). The more common 80/20 blend would be (RIOPS * .8) + (WIOPS * .2).
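
To make the blend arithmetic concrete, here is a minimal sketch in Python (the function name and default ratio are mine, not from the article):

```python
def blended_iops(riops, wiops, read_fraction=0.8):
    """Blend read and write IOPS for a given read/write ratio.

    read_fraction is the read share of the workload, e.g. 0.8 for an
    80/20 read/write blend.
    """
    return riops * read_fraction + wiops * (1 - read_fraction)

# An array rated at 1,000 Read IOPS and 500 Write IOPS:
print(blended_iops(1000, 500, 0.5))   # 50/50 blend -> 750.0
print(blended_iops(1000, 500, 0.8))   # 80/20 blend -> 900.0
```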

Now that we have established some criteria and background understanding, we will delve into the RAID levels themselves and see how performance varies across them.

For all RAID levels, the Read IOPS number is calculated using NX. This does not address the nominal overhead mentioned above, of course. It is a "best case" number, but the real world number is so close that it is very practical to simply use this formula. Simply take the number of spindles (N) and multiply by the IOPS performance of an individual drive (X). Keep in mind that drives often have different read and write performance, so be sure to use the drive's Read IOPS rating or tested speed for the Read IOPS calculation and its Write IOPS rating or tested speed for the Write IOPS calculation.

RAID 0

RAID 0 is the easiest RAID level to understand because there is effectively no overhead to worry about, no resources consumed to power it, and both read and write get the full benefit of every spindle, all of the time. So for RAID 0 our formula for write performance is very simple: NX. RAID 0 is always the most performant RAID level.

An example would be an eight spindle RAID 0 array. If an individual drive in the array delivers 125 IOPS then our calculation would be from N = 8 and X = 125, so 8 * 125 yields 1,000 IOPS. Since both read and write IOPS are the same here, it is extremely simple: we get 1K RIOPS, 1K WIOPS and 1K with any blending thereof. Very simple. If we didn't know the absolute IOPS of an individual spindle we could refer to an eight spindle RAID 0 as delivering 8X Blended IOPS.
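
As a quick sketch of the formulæ so far: read performance is always N times X, and RAID 0 write performance is the same because there is no write penalty (the helper names and the penalty parameter below are illustrative, not from the article):

```python
def read_iops(n, x):
    """N spindles, each delivering X Read IOPS."""
    return n * x

def write_iops(n, x, penalty=1):
    """N spindles, each delivering X Write IOPS, divided by the RAID
    level's write penalty (1 for RAID 0, which has no penalty)."""
    return (n * x) / penalty

# Eight-spindle RAID 0 with 125 IOPS drives:
print(read_iops(8, 125))        # 1000
print(write_iops(8, 125, 1))    # 1000.0
```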

RAID 10

RAID 10 is the second simplest RAID level to calculate. Because RAID 10 is a RAID 0 stripe of mirror sets, we have no overhead to worry about from the stripe, but each mirror set has to write the same data twice in order to create the mirroring. This cuts our write performance in half compared to a RAID 0 array of the same number of drives, giving us a write performance formula of simply: NX/2, or .5NX.

It should be noted that at the same capacity, rather than the same number of spindles, RAID 10 has the same write performance as RAID 0 but double the read performance – simply because it requires twice as many spindles to match the same capacity.

So an eight spindle RAID 10 array would be N = 8 and X = 125, and our resulting calculation comes out to be (8 * 125)/2, which is 500 WIOPS or 4X WIOPS. A 50/50 blend would result in 750 Blended IOPS (1,000 Read IOPS and 500 Write IOPS.)

This formula applies to RAID 1, RAID 10, RAID 100 and RAID 01 equally.

Uncommon options such as triple mirroring in RAID 10 would alter this write penalty. RAID 10 with triple mirroring would be NX/3, for example.
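
A minimal sketch of the RAID 10 write formula generalized to any mirror width m, where m is 2 for ordinary mirroring and 3 for triple mirroring (the function and parameter names are mine):

```python
def raid10_write_iops(n, x, mirror_width=2):
    """NX / m: every logical write must be written to each member of a mirror set."""
    return (n * x) / mirror_width

print(raid10_write_iops(8, 125))      # standard mirroring, eight drives -> 500.0
print(raid10_write_iops(9, 125, 3))   # triple mirroring, nine drives -> 375.0
```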

RAID 5

While RAID 5 is deprecated and should never be used in new arrays, I include it here because it is a well known and commonly used RAID level and its performance needs to be understood. RAID 5 is the most basic of the modern parity RAID levels. RAID 2, 3 and 4 are no longer found in production systems and so we will not look into their performance here. RAID 5, while not recommended for use today, is the foundation of the other modern parity RAID levels, so it is important to understand.

Parity RAID adds a somewhat complicated need to verify and re-write parity with every write that goes to disk. This means that a RAID 5 array will have to read the data, read the parity, write the data and finally write the parity. Four operations for each effective one. This gives us a write penalty on RAID 5 of four. So the formula for RAID 5 write performance is NX/4.

So following the eight spindle example, where the write IOPS of an individual spindle is 125, we would get the following calculation: (8 * 125)/4, or 2X Write IOPS, which comes to 250 WIOPS. In a 50/50 blend this would result in 625 Blended IOPS.
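
The same arithmetic written out as the four-operation sequence described above, as a small sketch (the variable names are mine):

```python
# RAID 5 read-modify-write: read data, read parity, write data, write parity.
OPS_PER_WRITE = 4          # the RAID 5 write penalty

n, x = 8, 125              # eight spindles, 125 IOPS each
wiops = (n * x) / OPS_PER_WRITE
riops = n * x
print(wiops)                        # 250.0
print(0.5 * riops + 0.5 * wiops)    # 50/50 blend -> 625.0
```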

RAID 6

RAID 6, after RAID 10, is probably the most common and useful RAID level in use today. RAID 6 is based on RAID 5 but adds another level of parity. This makes it dramatically safer than RAID 5, which is very important, but it also imposes a dramatic write penalty: each write operation requires the disks to read the data, read the first parity, read the second parity, write the data, write the first parity and then finally write the second parity. This comes out to a six times write penalty, which is pretty dramatic. So our formula is NX/6.

Continuing our example, we get (8 * 125)/6, which comes out to ~167 Write IOPS or 1.33X. In our 50/50 blend example this is a performance of 583.5 Blended IOPS. As you can see, parity writes cause a very rapid decrease in write performance and a noticeable drop in blended performance.

RAID 7 (aka RAID 5.3 or RAID 7.3)

RAID 7 is a somewhat non-standard RAID level with triple parity, building on the existing single parity of RAID 5 and the double parity of RAID 6. The only current implementation of RAID 7 is ZFS's RAIDZ3. Because RAID 7 contains all of the overhead of both RAID 5 and RAID 6 plus the additional overhead of the third parity component, we have a write penalty of a staggering eight times. So our formula for finding RAID 7 write performance is NX/8.

In our example this would mean that (8 * 125)/8 comes out to 125 Write IOPS or 1X. So with eight drives in our array we would get only the write performance of a single, standalone drive. That is significant overhead. Our blended 50/50 IOPS would come out to only 562.5.
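
Taken together, the three parity levels follow a simple pattern: with p parity blocks, each write must read and write the data plus each parity, for a penalty of 2(p + 1). Here is a small sketch checking that against the examples above (the generalized formula is my restatement of the per-level penalties given in this article):

```python
def parity_write_iops(n, x, parity_count):
    """Write IOPS for single/double/triple parity RAID (RAID 5/6/7).

    Penalty = 2 * (parity_count + 1): read and write the data block
    plus each parity block on every effective write.
    """
    penalty = 2 * (parity_count + 1)
    return (n * x) / penalty

for level, p in (("RAID 5", 1), ("RAID 6", 2), ("RAID 7", 3)):
    print(level, round(parity_write_iops(8, 125, p), 1))
# RAID 5 250.0, RAID 6 166.7, RAID 7 125.0
```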

Complex RAID

The performance of complex or nested RAID levels, such as RAID 50, 60, 61, 16, etc., can be found by breaking the RAID down into its components and applying the formulæ provided above to each. There is no single simple formula for these levels because they have varying configurations. It is necessary to break them down into their components and apply the formulæ multiple times.

RAID 60 with twelve drives, two sets of six drives, where each drive is 150 IOPS, would be handled as two RAID 6s. It would be the NX of RAID 0 where N is two (for the two RAID 6 arrays) and X is the resultant performance of each RAID 6. Each RAID 6 set would be (6 * 150)/6. So the full array would be 2((6 * 150)/6), which results in 300 Write IOPS.

The same example as above but configured as RAID 61, a mirrored pair of RAID 6 arrays, would have the same performance per RAID 6 array, but applied to the RAID 1 formula, which is NX/2 (where X is the resultant performance of each RAID 6 array.) So the final formula would be 2((6 * 150)/6)/2, coming to 150 Write IOPS from twelve drives.
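
A sketch of composing the formulæ for nested levels, following the twelve-drive RAID 60 and RAID 61 examples above (function names are mine):

```python
def raid6_write_iops(n, x):
    """NX / 6 for a single RAID 6 set."""
    return (n * x) / 6

def raid0_of(sets_iops):
    """RAID 0 stripe across arrays: the member performances simply add."""
    return sum(sets_iops)

def raid1_of(sets_iops):
    """RAID 1 mirror of arrays: the total is divided by two."""
    return sum(sets_iops) / 2

six_drive_raid6 = raid6_write_iops(6, 150)     # 150.0 Write IOPS per set
print(raid0_of([six_drive_raid6] * 2))         # RAID 60 -> 300.0 Write IOPS
print(raid1_of([six_drive_raid6] * 2))         # RAID 61 -> 150.0 Write IOPS
```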

Performance as a Factor of Capacity

When we produce RAID performance formulæ we think of them in terms of the number of spindles, which is incredibly sensible. This is very useful in determining the performance of a proposed array, or even an existing one where measurement is not possible, and it allows us to compare the relative performance between different proposed options. It is in these terms that we universally think of RAID performance.

This is not always a good approach, however, because typically we look at RAID as a factor of capacity rather than of performance or spindle count. It would be very rare, but certainly possible, that someone would consider an eight drive RAID 6 array versus an eight drive RAID 10 array. Once in a while this will occur due to a chassis limitation or some other, similar reason. But typically RAID arrays are viewed from the standpoint of total array capacity (i.e. usable capacity) rather than spindle count, performance or any other factor. It is odd, therefore, that we should then switch to viewing RAID performance as a function of spindle count.

If we change our viewpoint and pivot upon capacity as the common factor, while still assuming that individual drive capacity and performance (X) remain constant between comparators, then we arrive at a completely different landscape of performance. In doing this we see, for example, that RAID 0 is no longer the most performant RAID level and that read performance varies dramatically instead of being constant.

Capacity is a fickle thing, but we can distill it down to the number of spindles necessary to reach the desired capacity, which makes this discussion far easier. So our first step is to determine the spindle count needed for raw capacity. If we need a capacity of 10TB and are using 1TB drives, we would need ten spindles, for example. Or if we need 3.2TB and are using 600GB drives we would need six spindles. Unlike before, we will refer to this spindle count as R. As before, performance of the individual drive is represented as X. (R is used here to denote that this is the Raw Capacity Count, rather than the total Number of spindles.)
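
As a tiny sketch of that arithmetic (assuming capacity values are given in consistent units, e.g. terabytes):

```python
import math

def spindles_for_capacity(required_capacity, drive_capacity):
    """R: the number of spindles needed to reach the raw usable capacity."""
    return math.ceil(required_capacity / drive_capacity)

print(spindles_for_capacity(10, 1))      # 10TB from 1TB drives  -> 10
print(spindles_for_capacity(3.2, 0.6))   # 3.2TB from 600GB drives -> 6
```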

RAID 0 remains simple: performance is still RX, as there are no additional drives. Both read and write IOPS are simply RX.

RAID 10 has RX Write IOPS but 2RX Read IOPS. This is dramatic. Suddenly, when viewing performance as a factor of stable capacity, we find that RAID 10 has double read performance over RAID 0!

RAID 5 gets slightly trickier. Write IOPS would be expressed as ((R + 1) * X)/4. The Read IOPS are expressed as ((R + 1) * X).

RAID 6, as we expect, follows the pattern that RAID 5 projects. Write IOPS for RAID 6 are ((R + 2) * X)/6. And the Read IOPS are expressed as ((R + 2) * X).

RAID 7 falls right in line. RAID 7 Write IOPS would be ((R + 3) * X)/8. And the Read IOPS are ((R + 3) * X).

This vantage point changes the way that we think about performance: when looking purely at read performance, RAID 0 becomes the slowest RAID level rather than the fastest, and RAID 10 becomes the fastest for both read and write no matter what the values are for R and X!

If we take a real world example of ten 2TB drives to achieve 20TB of usable capacity, with each drive having 100 IOPS of performance, and assume a 50/50 blend, the resultant IOPS would be: RAID 0 with 1,000 Blended IOPS, RAID 10 with 1,500 Blended IOPS (2,000 RIOPS / 1,000 WIOPS), RAID 5 with 687.5 Blended IOPS (1,100 RIOPS / 275 WIOPS), RAID 6 with 700 Blended IOPS (1,200 RIOPS / 200 WIOPS) and finally RAID 7 with 731.25 Blended IOPS (1,300 RIOPS / 162.5 WIOPS.) RAID 10 is a dramatic winner here.
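
A sketch that reproduces the comparison above from the capacity-based formulæ (the table structure and names are mine; the spindle counts and write penalties are those given in this article):

```python
# R spindles of raw capacity, each drive delivering X IOPS.
R, X = 10, 100

levels = {
    # level: (total spindles N, write penalty)
    "RAID 0":  (R,     1),
    "RAID 10": (2 * R, 2),
    "RAID 5":  (R + 1, 4),
    "RAID 6":  (R + 2, 6),
    "RAID 7":  (R + 3, 8),
}

for level, (n, penalty) in levels.items():
    riops = n * X
    wiops = n * X / penalty
    blended = 0.5 * riops + 0.5 * wiops   # 50/50 blend
    print(f"{level}: {riops} RIOPS / {wiops} WIOPS / {blended} Blended")
# RAID 0: 1000/1000/1000, RAID 10: 2000/1000/1500, RAID 5: 1100/275/687.5,
# RAID 6: 1200/200/700, RAID 7: 1300/162.5/731.25
```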

Latency and System Impact with Software RAID

As I have stated earlier, RAID 0 and RAID 10 have, effectively, no system overhead to consider. The mirroring operation requires essentially no computational effort and is, for all intents and purposes, immeasurably small. Parity RAID does have computational overhead, and this results in latency at the storage layer and in system resources being consumed. Of course, if we are using hardware RAID those resources are dedicated to the RAID array and have no function but to be consumed in this role. If we are using software RAID, however, these are general purpose system resources (primarily CPU) that are consumed for the purposes of the RAID array processing.

The impact on even a very small system with a large amount of RAID is still very small, but it can be measured and should be considered, if only lightly. Latency and system impact are directly related to one another.

There is no simple way to state latency and system impact for different RAID levels except in this way: RAID 0 and RAID 10 have effectively no latency or impact, RAID 5 has some latency and impact, RAID 6 has roughly twice the computational latency and impact of RAID 5, and RAID 7 has roughly triple that of RAID 5.

In many cases this latency and system impact will be so small that they cannot be measured with standard system tools, and as modern processors become increasingly powerful the latency and system impact will continue to diminish. The impact has been considered negligible for RAID 5 and RAID 6 systems on even low end, commodity hardware since approximately 2001. But it is possible, on heavily loaded systems with a large amount of parity RAID activity, that there could be contention between the RAID subsystem and other processes requiring system resources.

Reference: The IT Hollow – Understanding the RAID Penalty

Article originally posted to the StorageCraft Blog – RAID Performance.
