{"id":684,"date":"2015-02-13T18:44:11","date_gmt":"2015-02-13T23:44:11","guid":{"rendered":"http:\/\/www.smbitjournal.com\/?p=684"},"modified":"2017-02-19T03:59:03","modified_gmt":"2017-02-19T08:59:03","slug":"slow-os-drives-fast-data-drives","status":"publish","type":"post","link":"https:\/\/smbitjournal.com\/2015\/02\/slow-os-drives-fast-data-drives\/","title":{"rendered":"Slow OS Drives, Fast Data Drives"},"content":{"rendered":"

Over the years I have found that people often err on the side of high performance, highly reliable data storage for an operating system partition but choose slow, “cost effective” storage for critical data stores. I am amazed by how often I find this occurring and now, with the advent of hypervisors, I see the same behaviour being repeated there as well, compounding the previously existing issues.<\/p>\n

In many systems today we deal with only a single storage array shared by all components of the system. In these cases we do not face the problem of mismatching our storage performance to our workloads. This is one of the big advantages of the single-array approach and a major reason why it comes so highly recommended: all performance sits in a shared pool, and the components that need it have access to it.<\/p>\n

In many cases, whether in pursuit of performance, as part of a reliability design or out of technical necessity, I find that people are separating their storage arrays, putting hypervisors and operating systems on one array and data on another. What I find shocking is that the arrays dedicated to the hypervisor or operating system are often staggeringly large in capacity and extremely high in performance, often involving 15,000 RPM spindles or even solid state drives at great expense, and almost always in RAID 1 (as per common standards from 1998).<\/p>\n

What needs to be understood here is that operating systems themselves have effectively no storage IO requirements. There is a small amount of IO, mostly for system logging, but that is about all. Operating system partitions are almost completely static: required components are loaded into memory, mostly at boot time, and are not accessed again. Even where logging is needed, the logs are often sent to a central logging system rather than to the system storage area, reducing or even removing that need as well.<\/p>\n

With hypervisors this effect is even more extreme. As hypervisors are far lighter and less robust than traditional operating systems, they behave more like embedded systems and, in many cases, actually are embedded systems. Hypervisors load into memory at system boot time and their media is almost never needed again while the system is running, except for occasional logging. And because a hypervisor has such a small footprint, even the total time needed to read it off of storage in full is tiny, even on very slow media.<\/p>\n

For these reasons, storage performance is of little to no consequence for operating systems and especially for hypervisors. The difference between fast storage and slow storage really only impacts system boot time, where the difference between one second and thirty seconds would rarely be noticed, if at all. When would anyone perceive even several extra seconds during the startup of a system? In most cases startups are rare events, happening at most once a week during an automated, routine reboot in a planned maintenance window, or, for systems that are only brought offline in emergencies, perhaps once every several years. Even the slowest conceivable storage system is far faster than necessary for this role.<\/p>\n

Even slow storage is generally many times faster than is necessary for system logging. In those rare cases where logging is very intense we have several ways to tackle the problem. The most obvious and common solution is to send logs to a drive array other than the one used by the operating system or hypervisor; this is easy and ultimately very practical where it is warranted. The other common and highly useful solution is to keep no logs on the local device at all and send them to a remote log collection system such as Splunk, Loggly or ELK.<\/p>\n
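
As a minimal sketch of that second approach, assuming a Linux host running rsyslog and a hypothetical collector at logs.example.com, remote forwarding can be a single rule; the host name and port are placeholders, not a recommendation of any particular product.<\/p>\n

<pre>
# /etc/rsyslog.d/90-forward.conf -- hypothetical example for a host running rsyslog
# Forward all log messages to a central collector over TCP (@@ = TCP, @ = UDP).
# "logs.example.com" and port 514 are placeholders for your own collector.
*.* @@logs.example.com:514

# Then, from a shell, apply the change:
# systemctl restart rsyslog
<\/pre>\n

Both Splunk and an ELK stack can be set up to receive standard syslog streams, so a forwarding rule like this covers either choice.<\/p>\n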

The other major concern that most people have around their operating systems and hypervisors is reliability. It is common to put more effort into protecting these relatively unimportant aspects of a system than into protecting the often irreplaceable data. Yet operating systems and hypervisors are easily rebuilt from scratch with a fresh install and, where needed, manual reconfiguration. The details which could be lost are generally trivial to recreate.<\/p>\n

This does not mean that these system filesystems should not be backed up; of course they should be (in most cases). But even if those backups fail as well, the loss of an OS partition or filesystem is rarely a tragedy, only an inconvenience. There are ways to recover in nearly all cases without access to the original system files, as long as the “data” filesystem is separate. And because operating systems and hypervisors change rarely, backups can generally be less frequent, possibly triggered manually only when updates are applied!<\/p>\n

With many modern systems in the DevOps and cloud computing spaces it has become very common to view operating system and hypervisor filesystems as completely disposable, since they are defined remotely via a system image or a configuration management system. In these increasingly common cases there is no need for protection or backups of the system filesystem, as the entire system is designed to be recreated, nearly instantly, without any special interaction. The system is, in effect, self-replicating. This further trivializes the need for system filesystem protection.<\/p>\n
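
As an illustration of how disposable such a system filesystem can be, something as small as a cloud-init user-data file is often all that defines it; the file below is entirely hypothetical, with placeholder package names and commands rather than anything drawn from this article.<\/p>\n

<pre>
#cloud-config
# Hypothetical user-data: the whole OS disk can be rebuilt from this at any time,
# so there is nothing on it worth backing up.
package_update: true
packages:
  - nginx
runcmd:
  - systemctl enable --now nginx
<\/pre>\n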

Taken together, the minimal performance requirements and the fact that protection and reliability are handled primarily through simple recreation leave us with a system filesystem whose needs are very different from what we commonly assume. This does not mean that we should be reckless with our storage; we still want to avoid storage failure while a system is running, and rebuilding unnecessarily is a waste of time and resources even if it does not prove to be disastrous. So striking a careful balance is important.<\/p>\n

It is, of course, for these reasons that placing the operating system or hypervisor on the same storage array as the data is now common practice. There is little to no need for access to the system files at the same time that the data files are being accessed, so we get great synergy: fast boot times for the OS and no adverse impact on data access times once the system is online. This is the primary means by which system designers today make efficient use of storage.<\/p>\n

When the operating system or hypervisor must be separated from the arrays holding data, which can still happen for myriad reasons, we generally seek reasonable reliability at low cost. With traditional storage (local disks) this means small, slow, low cost spinning drives for operating system storage, generally in a simple RAID 1 configuration. A real world example is the use of 5400 RPM “eco-friendly” SATA drives in the smallest sizes available; these draw little power and are very inexpensive to acquire. SSDs and high speed SAS drives would be avoided, as they cost a premium for protection that is irrelevant here and performance that is completely wasted.<\/p>\n
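
As a rough sketch of that local-disk case using Linux software RAID, assuming two small, slow drives at /dev/sdy and /dev/sdz (hypothetical device names):<\/p>\n

<pre>
# Mirror two small, low cost drives for the OS volume (RAID 1); device names are hypothetical.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdy /dev/sdz

# Create a filesystem on the mirror and check the initial sync.
mkfs.ext4 /dev/md0
cat /proc/mdstat
<\/pre>\n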

In less traditional setups it is common to use a low cost, high density SAN, consolidating the low priority storage for many systems onto shared, slow arrays that are not replicated. This is only effective in environments large enough to justify the additional architectural design and to achieve enough consolidation density to create the necessary cost savings, but in larger environments this is relatively easy: SAN boot devices can leverage very low cost arrays across many servers. In the virtual space this could mean a low performance datastore for OS virtual disks and a separate, high performance pool for data virtual disks. This has the same effect as the boot-from-SAN strategy, but in a more modern setting, and could easily leverage the same SAN architecture under the hood.<\/p>\n

Finally, and most dramatically, it is a general rule of thumb to install hypervisors to SD cards or USB thumb drives rather than to traditional storage, as their performance and reliability needs are even lower than those of traditional operating systems. Normally, if a drive of this nature were to fail while a system was running, the system would simply keep running, as the drive is never used again once the system has booted. The issue would only be found at the next reboot and, at that time, a backup boot device such as a secondary SD card or USB stick could be used. This is the official recommendation for VMware vSphere, is often recommended by Microsoft representatives for Hyper-V and officially supported through Hyper-V’s OEM vendors, and is often recommended, though not as broadly supported, for Xen, XenServer and KVM systems. Using SD cards or USB drives for hypervisor storage effectively turns a virtualization server into an embedded system. While this may feel unnatural to system administrators who are used to thinking of traditional disks as a necessity for servers, it is important to remember that enterprise class, highly critical systems like routers and switches last decades using this exact same strategy for the exact same reasons.<\/p>\n

A common strategy for hypervisors run in this embedded style is to have two such devices, which may in fact be one SD card and one USB drive, each with a copy of the hypervisor. If one device fails, booting from the second is nearly as effective as a traditional RAID 1 mirror. But unlike most traditional RAID 1 setups, we also gain an easy means of testing system updates: update one boot device at a time and verify the result before updating the second, leaving a reliable, well tested fallback in case a version update goes awry. This practice was common on large UNIX RISC systems, especially in AIX and Solaris circles, where boot devices were often local software RAID 1 sets that supported a similar approach.<\/p>\n
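
As a sketch of how the fallback device might be refreshed once an update has been tested, assuming the primary boot device sits at /dev/sdx and the fallback at /dev/sdy (both hypothetical; always verify device names before running dd):<\/p>\n

<pre>
# Clone the updated, tested primary boot device onto the fallback device.
# Device names are hypothetical -- dd will silently overwrite whatever it is pointed at.
dd if=/dev/sdx of=/dev/sdy bs=4M conv=fsync
<\/pre>\n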

It should also be noted that while this approach is best practice for most hypervisor scenarios, there is no reason why it cannot be applied to full operating system filesystems as well, except that it is often more work. Some OSes, especially Linux and BSD, are very adept at being installed in an embedded fashion and can easily be adapted to run from an SD card or USB drive with a little planning. This is not at all common, mostly because an OS should almost never be installed to physical hardware rather than on top of a hypervisor in the first place, but there is no technical reason why, in the right circumstances, it would not be an excellent approach. In those cases where physical installs are necessary, it is well worth considering.<\/p>\n

When designing and planning storage systems, be mindful of what read and write patterns will really look like once a system is running. Remember, too, that storage has changed dramatically since many traditional guidelines were developed, and not all of the knowledge behind them still applies today, or applies equally. Think not only about which subsystems will demand storage performance, but also about how they will interact with each other (for example, will two subsystems never request storage access at the same time, or will they conflict regularly?) and whether their access performance even matters. General operating system functions can be exceedingly slow on a database server without negative impact; all that matters is the speed at which the database can be accessed. Even access to application binaries is often irrelevant, as they too, once loaded into memory, remain there, and only memory speed affects ongoing performance.<\/p>\n
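
One simple way to check these assumptions on a running Linux system, assuming the sysstat and iotop packages are installed, is to watch per-device and per-process IO while the real workload runs:<\/p>\n

<pre>
# Extended per-device statistics, refreshed every 5 seconds;
# note which devices actually see reads and writes once the system is up.
iostat -x 5

# Per-process view, showing only processes currently doing IO (run as root).
iotop -o
<\/pre>\n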

None of this is meant to suggest that separating OS and data storage subsystems from each other is advised; it often is not. I have written in the past about how consolidating these subsystems is quite frequently the best course of action, and that remains true now. But there are also many reasonable cases where splitting certain storage needs apart makes sense, often in large scale systems where we can lower cost by dedicating high cost storage to some needs and low cost storage to others, and it is in those cases that I want to demonstrate that operating systems and hypervisors should be treated as the lowest priority in terms of both performance and reliability, except in the most extreme cases.<\/p>\n","protected":false},"excerpt":{"rendered":"

Over the years I have found that people often err on the side of high performance, highly reliable data storage for an operating system partition but choose slow, “cost effective” storage for critical data stores. \u00a0I am amazed by how often I find this occurring and now, with the advent of hypervisors, I see the … Continue reading Slow OS Drives, Fast Data Drives<\/span> →<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[41],"tags":[76,198,161,197],"class_list":["post-684","post","type-post","status-publish","format-standard","hentry","category-storage-2","tag-array","tag-array-spliting","tag-arrays","tag-partitioning"],"_links":{"self":[{"href":"https:\/\/smbitjournal.com\/wp-json\/wp\/v2\/posts\/684","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/smbitjournal.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/smbitjournal.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/smbitjournal.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/smbitjournal.com\/wp-json\/wp\/v2\/comments?post=684"}],"version-history":[{"count":6,"href":"https:\/\/smbitjournal.com\/wp-json\/wp\/v2\/posts\/684\/revisions"}],"predecessor-version":[{"id":722,"href":"https:\/\/smbitjournal.com\/wp-json\/wp\/v2\/posts\/684\/revisions\/722"}],"wp:attachment":[{"href":"https:\/\/smbitjournal.com\/wp-json\/wp\/v2\/media?parent=684"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/smbitjournal.com\/wp-json\/wp\/v2\/categories?post=684"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/smbitjournal.com\/wp-json\/wp\/v2\/tags?post=684"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}