
The Jurassic Park Effect

“If I may… Um, I’ll tell you the problem with the scientific power that you’re using here, it didn’t require any discipline to attain it. You read what others had done and you took the next step. You didn’t earn the knowledge for yourselves, so you don’t take any responsibility for it. You stood on the shoulders of geniuses to accomplish something as fast as you could, and before you even knew what you had, you patented it, and packaged it, and slapped it on a plastic lunchbox, and now …” – Dr. Ian Malcolm, Jurassic Park

When looking at building a storage server or NAS, there is a common feeling that what is needed is a “NAS operating system.”  This is an odd reaction, I find, since the term NAS means nothing more than a “file server with a dedicated storage interface” or, in other words, just a file server with limited exposed functionality.  The reason that we choose physical NAS appliances is for the integrated support and sometimes for special, proprietary functionality (NetApp being a key example, offering extensive SMB and NFS integration and some really unique RAID and filesystem options, or Exablox, offering fully managed scale out file storage and RAIN style protection.)  Using a NAS to replace a traditional file server is, for the most part, a fairly recent phenomenon and one that I have found is often driven by misconception or by the impression that managing a file server, one of the most basic IT workloads, is special or hard.  File servers are generally considered the most basic form of server; traditionally they were what people meant when using the term “server” unless additional description was added, and they are the only form commonly integrated into the desktop (every Mac, Windows and Linux desktop can function as a file server, and it is very common for them to do so.)

There is, of course, nothing wrong with turning to a NAS instead of a traditional file server to meet your storage needs, especially as some modern NAS options, like Exablox, offer scale out and storage options that are not available in most operating systems.  But it appears that the trend to use a NAS instead of a file server has led to some odd behaviour when IT professionals turn back to considering file servers again.  It is a cascading effect, I suspect: the reasons why a NAS is sometimes preferred, and the goal level thinking behind them, are lost, while the resulting idea of “I should have a NAS” remains, so that when returning to look at file server options there is a drive to “have a NAS” regardless of whether there is a logical reason for feeling that this is necessary.

First we must consider that the general concept of a NAS is a simple one: take a traditional file server, simplify it by removing options and package it with all of the necessary hardware to make a simplified appliance with all of the support included, from the interface down to the spinning drives and everything in between.  Storage can be tricky when users need to determine RAID levels, choose drive types, monitor effectively, etc.  A NAS addresses this by integrating the hardware into the platform.  This makes things simple but can add risk, as you have fewer support options and less ability to fix or replace things yourself.  A move from a file server to a NAS appliance is almost exclusively about support and is generally a very strong commitment to a single vendor.  You choose the NAS approach because you want to rely on one vendor for everything.

When we move to a file server we go in the opposite direction.  A file server is a traditional enterprise server like any other.  You buy your server hardware from one vendor (HP, Dell, IBM, etc.) and your operating system from another (Microsoft, Red Hat, Suse, etc.)  You specify the parts and the configuration that you need and you have the most common computing model for all of IT.  With this model you generally are using standard, commodity parts allowing you to easily migrate between hardware vendors and between software vendors. You have “vendor redundancy” options and generally everything is done using open, standard protocols.  You get great flexibility and can manage and monitor your file server just like any other member of your server fleet, including keeping it completely virtualized.  You give up the vertical integration of the NAS in exchange for horizontal flexibility and standardization.

What is odd, therefore, is returning to the commodity model but seeking what is colloquially known as a “NAS OS.”  Common examples of these include NAS4Free, FreeNAS and OpenFiler.  This category of products is generally nothing more than a standard operating system (often FreeBSD, as it has ideal licensing, or Linux, because it is well known) with a “storage interface” put onto it and no special or additional functionality that would not exist with the normal operating system.  In theory they are a “single function” operating system that does only one thing.  But this is not reality.  They are general purpose operating systems with an extra GUI management layer added on top.  One could say the same thing about most physical NAS products themselves, but those typically include custom engineering even at the storage level, special features and, most importantly, an integrated support stack and true isolation of the “generalness” of the underlying OS.  A “NAS OS” is not a simpler version of a general purpose OS; it is a more complex, yet less functional, version of it.

What is additionally odd is that general purpose OSes, with rare exception, already come with very simple, extremely well known and fully supported storage interfaces.  Nearly every variety of Windows or Linux server, for example, has included simple graphical interfaces for these functions for a very long time.  These included GUIs are often shunned by system administrators as being too “heavy and unnecessary” for a simple file server.  So it is even more unusual that adding a third party GUI, one that is not patched and tested by the OS team and not widely known and supported, would then be desired, as this goes against the common ideals and practices of running a server.

And this is where the Jurassic Park effect comes in – the OS vendors (Red Hat, Microsoft, Oracle, FreeBSD, Suse, Canonical, et al.) are giants with amazing engineering teams, code review, testing, oversight and enterprise support ecosystems.  The “NAS OS” vendors, by contrast, are generally very small companies, some with just one part time person, who stand on the shoulders of these giants and build something that they knew they could, but never stopped to ask if they should.  The resulting products are wholly negative compared to their pure OS counterparts: they do not make systems management easier, nor do they fill a gap in the market’s service offerings.  Solid, reliable, easy to use storage is already available; more vendors are not needed to fill this place in the market.

The logic often applied to looking at a NAS OS is that they are “easy to set up.”  This may or may not be true, as easy, here, must be a relational term.  For there to be any value, a NAS OS has to be easy in comparison to the standard version of the same operating system.  So in the case of FreeNAS, this would mean FreeBSD: FreeNAS would need to be appreciably easier to set up than FreeBSD for the same, dedicated functions.  And this is easily true; setting up a NAS OS is generally pretty easy.  But this ease is a trap, and one of which IT professionals need to be quite aware.  Making something easy to set up is not a priority in IT; making something that is easy to operate and repair when there are problems is what is important.  Easy to set up is nice, but if it comes at the cost of not understanding how the system is configured and makes operational repairs more difficult, it is a very, very bad thing.  NAS OS products routinely make it dangerously easy to put a product into production for a storage role, which is almost always the most critical or nearly the most critical role of any server in an environment, when IT has no experience and likely no skill to maintain, operate or, most importantly, fix it when something goes wrong.  We need exactly the opposite: a system that is easy to operate and fix.  That is what matters.  So we have a second case of “standing on the shoulders of giants” and building a system that we knew we could, but did not know if we should.

What exacerbates this problem is that the very people who feel the need to turn to a NAS OS to “make storage easy” are, by the very nature of the NAS OS, the exact people for whom operational support and the repair of the system are most difficult.  System administrators who are comfortable with the underlying OS would naturally not see a NAS OS as a benefit and, for the most part, avoid it.  It is uniquely the people for whom it is most dangerous to run a not fully understood storage platform who are likely to attempt it.  And, of course, most NAS OS vendors earn their money, as we could predict, on post-installation support calls from customers who deployed, got stuck once they were in production and are left at the mercy of the vendor’s exorbitant support pricing.  It is in the interest of the vendors to make the product easy to install and hard to fix.  Everything is working against the IT pro here.

If we take a common example and look at FreeNAS we can see how this is a poor alignment of “difficulties.”  FreeNAS is FreeBSD with an additional interface on top.  Anything that FreeNAS can do, FreeBSD can do.  There is no loss of functionality by going to FreeBSD.  When something fails, in either case, the system administrator must have a good working knowledge of FreeBSD in order to effect repairs.  There is no escaping this.  FreeBSD knowledge is common in the industry and getting outside help is relatively easy.  Using FreeNAS adds several complications, the biggest being that any and all customizations made by the FreeNAS GUI are special knowledge needed for troubleshooting, on top of the knowledge already needed to operate FreeBSD.  So this is a larger knowledge set as well as more things to fail.  It is also a relatively uncommon knowledge set, as FreeNAS is a niche storage product from a small vendor while FreeBSD is a major enterprise IT platform (all use of FreeNAS is FreeBSD use, but only a tiny percentage of FreeBSD use is FreeNAS.)  So we can see that using a NAS OS just adds risk over and over again.

This same issue carries over into the communities that grow up around these products.  If you look to the communities around FreeBSD, Linux or Windows for guidance and assistance, you deal with large numbers of IT professionals, skilled system admins and those with business and enterprise experience.  Of course hobbyists, the uninformed and others participate too, but these are the enterprise IT platforms and all the knowledge of the industry is available to you when implementing these products.  Compare this to the community of a NAS OS.  By its very nature, only people struggling with the administration of a standard operating system and/or with storage basics would look at a NAS OS package, and so the membership of these communities is naturally filtered down to exactly the people from whom we would be best to avoid taking advice.  This creates an isolated culture of misinformation and misunderstandings around storage and storage products.  Myths abound, guidance often becomes reckless and dangerous, and industry best practices are ignored as if decades of accumulated experience had never happened.

A NAS OS also commonly introduces lags in patching and updates.  A NAS OS will almost always, and almost necessarily, trail its parent OS on security and stability updates and will very often follow months or years behind on major features.  In one very well known scenario, OpenFiler, the product was built on an upstream non-enterprise base (rPath Linux) which lacked community and vendor support, failed and was abandoned, leaving downstream users, including everyone on OpenFiler, without the ecosystem needed to support them.  Using a NAS OS means trusting not just the large, well known enterprise vendor that makes the base OS but trusting the NAS OS vendor as well.  And the NAS OS vendor is orders of magnitude more likely to fail than the enterprise class vendor, even when basing its products on an enterprise class base OS.

Storage is a critical function and should not be treated carelessly, as if its criticality did not exist.  NAS OSes tempt us to install quickly and forget, hoping that nothing ever goes wrong or that we can move on to other roles or companies entirely before bad things happen.  This sets us up for failure where failure is most impactful.  When a typical application server fails we can always copy the files off of its storage and start fresh.  When storage fails, data is lost and systems go down.

“John Hammond: All major theme parks have delays. When they opened Disneyland in 1956, nothing worked!

Dr. Ian Malcolm: Yeah, but, John, if The Pirates of the Caribbean breaks down, the pirates don’t eat the tourists.”

When storage fails, businesses fail.  Taking the easy route to setting up storage, ignoring the long term support needs and seeking advice from communities that have filtered out the experienced storage and systems engineers increases risk dramatically.  Sadly, the nature of a NAS OS is that the very reason that people turn to it (a lack of the deep technical knowledge needed to build the system) is the very reason they must avoid it (an even greater need for support.)  The people for whom NAS OSes are effectively safe to use, those with very deep and broad storage and systems knowledge, would rarely consider these products because for them they offer no benefits.

At the end of the day, while the concept of a NAS OS sounds wonderful, it is not a panacea.  The value of a NAS does not carry over from the physical appliance world to the installed OS world, and the value of standard OSes is far too great for NAS OSes to add real value.

“Dr. Alan Grant: Hammond, after some consideration, I’ve decided not to endorse your park.

John Hammond: So have I.”

When to Consider a SAN?

Everyone seems to want to jump into purchasing a SAN, sometimes quite passionately.  SANs are, admittedly, pretty cool.  They are one of the more fun and exciting, large scale hardware items that most IT professionals get a chance to have in their own shop.  Often the desire to have a SAN of one’s own is a matter of “keeping up with the Joneses,” as using a SAN has become a bit of a status symbol – one of those last bastions of big business IT that you only see in a dedicated server closet and never in someone’s home (well, almost never.)  SANs are pushed heavily, advertised and sold as amazing boxes with internal redundancy making them infallible, speed that defies logic, and loads of features that you never knew you needed.  When speaking to IT pros designing new systems, one of the most common design statements that I hear is “well, we don’t know much about our final design, but we know that we need a SAN.”

In the context of this article, I use SAN in its most common sense, that is, to mean a “block storage device” and not to refer to the entire storage network itself.  A storage network can exist for NAS without using a SAN block storage device at all.  So for this article SAN refers exclusively to SAN as a device, not SAN as a network.  SAN is a soft term used to mean multiple things at different times and can become quite confusing.  A SAN configured without a network becomes DAS.  DAS that is networked becomes SAN.

Let’s stop for a moment.  A SAN is your back end storage.  The need for it would be, in all cases, determined by other aspects of your architecture.  If you have not yet decided upon many other pieces, you simply cannot know that a SAN is going to be needed, or even useful, in the final design.  Red flags.  Red flags everywhere.  Imagine a Roman chariot race with the horses pushing the chariots (if you know what I mean.)

It is clear that the drive to implement a SAN is so strong that often entire projects are devised with little purpose except, it would seem, to justify the purchase of the SAN.  As with any project, the first question that one must ask is “What is the business need that we are attempting to fill?” and work from there, not “We want to buy a SAN; where can we use it?”  SANs are complex, and with complexity comes fragility.  Very often SANs carry a high cost.  But the scariest aspect of a SAN is the widespread lack of deep industry knowledge concerning them.  SANs pose huge technical and business risks that must be overcome to justify their use.  SANs are, without a doubt, very exciting and quite useful, but that is seldom good enough to warrant the desire for one.

We refer to SANs as “the storage of last resort.”  What this means is, when picking types of storage, you hope that you can use any of the other alternatives such as local drives, DAS (Direct Attach Storage) or NAS (Network Attached Storage) rather than SAN.  Most times, other options work wonderfully.  But there are times when the business needs demand requirements that can only reasonably be met with a SAN.  When those come up, we have no choice and must use a SAN.  But generally it can be avoided in favor of simpler and normally less costly or risky options.

I find that most people looking to implement a SAN are doing so under a number of misconceptions.

The first is that SANs, by their very nature, are highly reliable.  While there are certainly many SAN vendors and specific SAN products that are amazingly reliable, the same could be said about any IT product.  High end servers in the price range of high end SANs are every bit as reliable as SANs.  Since SANs are made from the same hardware components as normal servers, there is no magic to making them more reliable; anything that can be used to make a SAN reliable is a trickle down of server RAS (Reliability, Availability and Serviceability) technologies.  Just like SANs, NAS and DAS devices, as well as local disks, can be made incredibly reliable.  SAN only refers to a device being used to serve block storage rather than perform some other task.  A SAN is just a very simple server.  SANs encompass the entire range of reliability, from mainframe-like reliability at the top end down to devices that are nothing more than external hard drives – the most unreliable devices on your network – at the bottom end.  So rather than SAN implying reliability, the category actually includes some of the least reliable devices you can imagine.  But, for all intents and purposes, all servers share roughly equal reliability concerns.  SANs gain a reputation for reliability because businesses often put extreme budgets into their SANs that they do not put into their servers, so the comparison ends up being between a relatively high end SAN and a relatively budget server.

The second is that SAN means “big” and NAS means “small.”  There is no such association.  Both SANs and NASs can be of nearly any scale or quality.  They both run the gamut, and the technology chosen gives not the slightest suggestion of whether a device is large or not.  Again, as above, a SAN can technically come “smaller” than a NAS solution due to its possible simplicity, but this is a specialty case and mostly theoretical; there are SAN products on the market in this category, but it is very rare to find them in use.

The third is that SAN and NAS are dramatically different inside the chassis.  This is certainly not the case, as the majority of SAN and NAS devices today are what is called “unified storage,” meaning a storage appliance that acts simultaneously as both SAN and NAS.  This highlights that the key difference between the two is not backend technology, hardware, size or reliability; the defining difference is the protocol used to present the storage.  SANs are block storage, exposing raw block devices onto the network using protocols like Fibre Channel, iSCSI, SAS, ZSAN, ATA over Ethernet (AoE) or Fibre Channel over Ethernet (FCoE.)  NAS, on the other hand, uses a network file system, exposing files onto the network using application layer protocols like NFS, SMB, AFP, HTTP and FTP, which then ride over TCP/IP.
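
To make that split concrete, here is a tiny, purely illustrative Python sketch (the protocol lists are just a subset of the examples named above, nothing more) that classifies a protocol by the interface it hands to the client:

# Illustrative only: classify a few of the protocols named above by the
# storage interface they expose to the client.
BLOCK_PROTOCOLS = {"fibre channel", "iscsi", "sas", "aoe", "fcoe"}
FILE_PROTOCOLS = {"nfs", "smb", "afp", "http", "ftp"}

def storage_interface(protocol):
    """Return which kind of interface a protocol presents to the client."""
    p = protocol.strip().lower()
    if p in BLOCK_PROTOCOLS:
        return "block (SAN-style): the client sees a raw disk"
    if p in FILE_PROTOCOLS:
        return "file (NAS-style): the client sees a mountable filesystem"
    return "not one of the examples listed in this article"

print(storage_interface("iSCSI"))   # block (SAN-style): the client sees a raw disk
print(storage_interface("SMB"))     # file (NAS-style): the client sees a mountable filesystem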

The fourth is that SANs are inherently a file sharing technology.  That is NAS.  A SAN simply takes your block storage (hard disk subsystem) and makes it remotely available over a network.  The nature of networks suggests that we can attach that storage to multiple devices at once and indeed, physically, we can, just as we used to be able to physically attach multiple controllers to opposite ends of a SCSI ribbon cable with hard drives dangling in the middle.  This will, under normal circumstances, destroy all of the data on the drives, as the controllers, which know nothing about each other, overwrite each other’s data, causing near instant corruption.  There are mechanisms available in special clustered filesystems and their drivers to allow for this, but they require special knowledge and understanding far more technical than many people acquiring SANs realize they need for what they often believe is the very purpose of the SAN – a disaster so common that I probably speak to someone who has done just this almost weekly.  That the SAN puts at risk the very use case most people believe it is designed to handle, and not only fails to deliver the nearly magic protection sought but is, to the contrary, the very cause of the loss of data, exposes the level of risk that misunderstood storage technology carries with it.

The fifth is that SANs are fast.  SANs can be fast; they can also be horrifically slow.  There is no intrinsic speed boost from the use of SAN technology on its own.  It is actually fairly difficult for SANs to overcome the inherent bottlenecks introduced by the network on which they sit.  Since other storage options such as DAS use all the same technologies as SAN but lack the bottleneck and latency of the actual network, an equivalent DAS will be just a little faster than its SAN counterpart.  SANs are generally a little faster than a hardware-identical NAS equivalent, but even this is not guaranteed; SAN and NAS behave differently and in different use cases either may be the better performer.  SAN would rarely be chosen as a solution based on performance needs.

The sixth is that, by being a SAN, the inherent problems associated with storage choices no longer apply.  A good example is the use of RAID 5.  This would be considered bad practice in a server, but when working with a SAN (which in theory is far more critical than a stand alone server) careful storage subsystem planning is often eschewed based on a belief that, being a SAN, it has somehow fixed those issues or that they do not apply.  It is true that some high end SANs do have risk mitigation features unlikely to be found elsewhere, but these are rare and exclusively relegated to very high end units where fragile designs would already be uncommon.  It is a dangerous, but very common, practice to take great care and consideration when planning storage for a physical server but to skip that same planning and oversight when using a SAN, based on the assumption that the SAN handles all of that internally or that it is simply no longer needed.

Having shot down many misconceptions about SANs, one may be wondering if they are ever appropriate.  They are, of course, quite important and incredibly valuable when used correctly.  The strongest points of SANs come from consolidation and special types of shared storage.

Consolidation was the historical driver bringing customers to SAN solutions.  A SAN allows us to combine many filesystems into a single disk array, allowing far more efficient use of storage resources.  Because SAN is block level, it is able to do this anytime that a traditional, local disk subsystem could be employed.  In many servers, and even many desktops, storage space is wasted due to the necessities of growth, planning and disk capacity granularity.  If we have twenty servers, each with 300GB drive arrays but each using only 80GB of that capacity, we have large waste.  With a SAN we could consolidate to just 1.6TB, plus a small amount necessary for overhead, and spend far less on physical disks than if each server maintained its own storage.

Once we begin consolidating storage we begin to look for advanced consolidation opportunities.  Having consolidated many server filesystems onto a single SAN we have the chance, if our SAN implementation supports it, to deduplicate and compress that data which, in many cases such as server filesystems, can result in significant utilization reduction.  So our 1.6TB in the example above might actually end up being only 800GB or less.  Suddenly our consolidation numbers are getting better and better.
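
For concreteness, here is the arithmetic from the example above as a quick back-of-the-envelope sketch in Python (the twenty servers, 300GB arrays, 80GB of usage and the roughly 50% deduplication and compression savings are just the illustrative figures used here, not general expectations):

# Back-of-the-envelope consolidation math from the example above.
servers = 20
provisioned_per_server_gb = 300   # each server's local drive array
used_per_server_gb = 80           # data actually stored per server

provisioned_locally_gb = servers * provisioned_per_server_gb   # 6000 GB bought as local disks
consolidated_gb = servers * used_per_server_gb                 # 1600 GB actually needed on the SAN

# Assumed savings from deduplication and compression, per the example above;
# real ratios vary widely with the data set.
dedup_ratio = 0.5
after_dedup_gb = consolidated_gb * dedup_ratio                 # 800 GB

print(f"Provisioned as local disks: {provisioned_locally_gb} GB")
print(f"Consolidated on the SAN:    {consolidated_gb} GB (plus overhead)")
print(f"After dedup/compression:    {after_dedup_gb:.0f} GB")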

To efficiently leverage consolidation it is necessary to have scale, and this is where SANs really shine: when scale, both in capacity and, more importantly, in the number of attaching nodes, becomes very large.  SANs are best suited to large scale storage consolidation.  This is their sweet spot and what makes them nearly ubiquitous in large enterprises and very rare in small ones.

SANs are also very important for certain types of clustering and shared storage that require single shared filesystem access.  This is actually a pretty rare need outside of one special circumstance – databases.  Most applications are happy to utilize any type of storage provided to them, but databases often require low level block access to be able to manipulate their data most effectively.  Because of this they can rarely be used, or used effectively, on NAS or file servers.  Providing high availability storage environments for database clusters has long been a key use case of SAN storage.

Outside of these two primary use cases, which justify the vast majority of SAN installations, SAN also provides a high level of storage flexibility, making it potentially very simple to move, grow and modify storage in a large environment without needing to deal with physical moves or complicated procurement and provisioning.  Again, like consolidation, this is an artifact of large scale.

In very large environments, a SAN can also be used to provide a point of demarcation between storage and system engineering teams, allowing there to be a handoff at the network layer, generally fibre channel or iSCSI.  This clear separation of duties can be critical in companies that want highly discrete storage, network and systems teams.  It allows the storage team to focus on nothing but storage and the systems team to focus on nothing but the systems, without any need for knowledge of the other team’s implementations.

For a long time SANs also presented themselves as a convenient means to improve storage performance.  This is not an intrinsic property of SAN but an outgrowth of their common use for consolidation.  Similarly to virtualization when used for consolidation, shared SANs have a natural advantage in better utilization of available spindles, centralized caches and bigger hardware than the equivalent storage spread out among many individual servers.  Like shared CPU resources, when the SAN is not receiving requests from multiple clients it has the ability to dedicate all of its capacity to servicing the requests of a single client, providing an average performance experience potentially far higher than what an individual server could affordably achieve on its own.

Using SAN for performance is rapidly fading from favor, however, because of the advent of SSD storage becoming very common.  As SSDs with incredibly low latency and high IOPS performance drop in price to the point where they are being added to stand alone servers as local cache, or potentially even being used as mainline storage, the bottleneck of the SAN’s networking becomes a larger and larger factor, making it increasingly difficult for the consolidation benefits of a SAN to offset the performance benefits of local SSDs.  SSDs are potentially very disruptive for the shared storage market as they bring the performance advantage back towards local storage – just the latest in the ebb and flow of storage architecture design.

The most important aspect of SAN usage to remember is that SAN should not be a default starting point in storage planning.  It is one of many technology choices and one that often does not fit the bill as intended or does so but at an unnecessarily high price point either in monetary or complexity terms.  Start by defining business goals and needs.  Select SAN when it solves those needs most effectively, but keep an open mind and consider the overall storage needs of the environment.

Comparing SAN and NAS

One of the greatest confusions that I have seen in recent years is that between NAS and SAN.  Understanding what each is will go a long way towards understanding where they are useful and appropriate.

Our first task is to strip away the marketing terms and move on to technical ones.  NAS stands for Network Attached Storage but doesn’t mean exactly that, and SAN stands for Storage Area Network but is generally used to refer to a SAN device, not the network itself.  In its most proper form, a SAN is any network dedicated to storage traffic, but in the real world that’s not how the term is normally used.  In this case we are here to talk about NAS and SAN devices and how they compare, so we will not use the definition that includes the network rather than the device.  In reality, both NAS and SAN are marketing terms and are a bit soft around the edges because of it.  They are precise enough to use in a normal technical conversation, as long as all parties know what they mean, but when discussing their meaning we should strip away the cool-sounding names and stick to the most technical descriptions.  Both terms, as used in marketing, imply a certain technology that has been “appliancized,” which makes the use of the terms unnecessarily complicated but no more useful.

So our next task is to define what these two names mean in a device context.  Both devices are storage servers, plain and simple; they are just two different ways of exposing that storage to the outside world.

The simpler of the two is the SAN, which is properly a block storage device.  Any device that exposes its storage externally as a block device falls into this category, and what we call it depends only on how it is attached.  The block storage devices are external hard drives, DAS (Direct Attach Storage) and SAN.  All of these are actually the same thing.  We call it an external hard drive when we attach it to a desktop.  We call it a DAS when we attach it to a server.  We call it a SAN when we add some form of networking, generally a switch, between the device and the final device that is consuming the storage.  There is no technological difference between these devices.  A traditional SAN can be directly attached to a desktop and used like an external hard drive.  An external hard drive can be hooked to a switch and used by multiple devices on a network.  The interface between the storage device and the system using it is the block.  Common protocols for block storage include iSCSI, Fibre Channel, SAS, eSATA, USB, Thunderbolt, IEEE 1394 (aka FireWire), Fibre Channel over Ethernet (FCoE) and ATA over Ethernet (AoE.)  A device attaching to a block storage device will always see the storage presented as a disk drive, nothing more.

A NAS, also known as a “filer”, is a file storage device.  This means that it exposes its storage as a network filesystem.  So any device attaching to this storage does not see a disk drive but instead sees a mountable filesystem.  When a NAS is not packaged as an appliance, we simply call it a file server and nearly all computing devices from desktops to servers have some degree of this functionality included in them.  Common protocols for file storage devices include NFS, SMB / CIFS and AFP.  There are many others, however, and technically there are special case file storage protocols such as FTP and HTTP that should qualify as well.  As an extreme example, a traditional web server is a very specialized form of file storage device.

What separates block storage and file storage devices is the type of interface that they present to the outside world, or to think of it another way, where the division between server device and client device happens within the storage stack.

It has become extremely common today for storage devices to include both block storage and file storage in the same device.  Systems that do this are called unified storage.  With unified storage, whether it is behaving as a block storage device, a file storage device (SAN or NAS in the common parlance) or both is based upon the behavior that you configure, not upon what you purchase.  This is important as it drives home the point that this is purely a protocol or interface distinction, not one of size, capability, reliability, performance, features, etc.

Both types of devices have the option, but not the requirement, of providing extended features beneath the “demarcation point” at which they hand off the storage to the outside.  Both may, or may not, provide RAID, logical volume management, monitoring, etc.  File storage (NAS) may also provide file system features such as Windows NTFS ACLs.

The key advantage to block storage is that the systems that attach to it are given an opportunity to manipulate the storage system as if it were a traditional disk drive.  This means that RAID and logical volume management, which may already have been done inside the “black box” of the storage device, can now be done again, if desired, at a higher level.  The client devices are not aware of what kind of device they are seeing, only that it appears as a disk drive.  So you can choose to trust it (assume that it has RAID of an adequate level, for example) or you can combine multiple block storage devices together into RAID just as if they were regular, local disks.  This is extremely uncommon but is an interesting option, and there are products that are designed to be used in this way.

More commonly, logical volume management such as Linux LVM, Solaris ZFS or Windows Dynamic Disks is applied on top of the exposed block storage from the device and then, on top of that, a filesystem is employed.  This is important to remember: with block storage devices the filesystem is created and managed by the client device, not by the storage device.  The storage device is blissfully unaware of how the block storage that it is presenting is used and allows the end user to use it however they see fit, with total control.  This extends even to the point that you can chain block storage devices together, with one providing the storage to the next to be combined, perhaps, into RAID groups – block storage devices can be layered, more or less, indefinitely.

Alternatively, a file storage device contains the entire block portion of the storage, so any opportunity for RAID, logical volume management and monitoring must be handled by the file storage device.  Then, on top of the block storage, a filesystem is applied.  Commonly this would be Linux’s EXT4, FreeBSD’s and Solaris’ ZFS, or Windows’ NTFS, but other filesystems such as WAFL, XFS, JFS, BtrFS, UFS and more are certainly possible.  On this filesystem, data will be stored.  To then share this data with the outside world, a network file system (also known as a distributed file system) is used, which provides a file system interface that is network enabled – NFS, SMB and AFP being the most common but, as in any protocol family, there are numerous special case and exotic possibilities.

A remote device wanting to use storage on the file storage device sees it over the network the same as it would see a local filesystem and is able to mount it in an identical manner.  This makes file storage especially easy and obvious for the end consumer to use, as it is very natural in every aspect.  We use network file systems every day for normal desktop computing.  When we “map a drive” in Windows, for example, we are using a network file system.

One critical differentiation between block storage and file storage is that, while both can potentially sit on a network and allow multiple client machines to attach to them, only file storage devices have the ability to arbitrate that access.  This is very important and cannot be glossed over.

Block storage appears as a disk drive.  If you simply attach a disk drive to two or more computers at once, you can imagine what will happen – each will know nothing of the other, will be unaware of new files being created or existing ones changing, and the systems will rapidly begin to overwrite each other.  If your file system is read only on all nodes, this is not a problem.  But if any system is writing or changing the data, the others will have problems.  This generally results in data corruption very quickly, typically on the order of minutes.  To see this in extreme action, imagine having two or three client systems all believe that they have exclusive access to a disk drive and have them all defragment it at the same time.  All data on the drive will be scrambled in seconds.
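
A purely conceptual sketch of why this happens (a toy in-memory model in Python, not anything resembling a real filesystem): each client keeps its own private copy of the allocation metadata, just as two ordinary filesystem drivers mounted on the same block device would, and neither has any idea what the other has written.

# Toy model only: a shared "disk" written to by two clients that each track
# free space privately, the way two machines mounting the same block device
# with a non-clustered filesystem would.
BLOCK_SIZE = 4
disk = bytearray(BLOCK_SIZE * 8)        # eight tiny blocks of shared block storage

class NaiveClient:
    def __init__(self, name):
        self.name = name
        self.next_free_block = 0        # private metadata; both clients start at block 0

    def write_file(self, data):
        offset = self.next_free_block * BLOCK_SIZE
        disk[offset:offset + len(data)] = data   # clobbers whatever is already there
        self.next_free_block += 1

a = NaiveClient("A")
b = NaiveClient("B")
a.write_file(b"AAAA")                   # A believes block 0 now holds its file
b.write_file(b"BBBB")                   # B also believes block 0 is free: A's data is gone
print(disk[:BLOCK_SIZE])                # bytearray(b'BBBB')

Nothing in the block layer stops that second write; only a filesystem that coordinates its metadata between the nodes, as described below, can.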

A file storage device, on the other hand, has natural arbitration, as the network file system handles the communications for access to the real file system, and filesystems are multi-user by nature.  So if one system attached to a file storage device makes a change, all systems are immediately aware of the change and will not “step on each other’s toes.”  Even if they attempt to, the file storage device’s filesystem arbitrates access, has the final say and does not let this happen.  This makes sharing data easy and transparent to end users.  (I use the term “end users” here to include system administrators.)

This does not mean that there is no means of sharing storage from a block device, but the arbitration cannot be handled by the block storage device itself.  Block storage devices can be made “shareable” by using what is known as a clustered file system.  These types of file systems originated back when server clusters shared storage resources by having two servers attached via a SCSI controller on either end of a single SCSI cable, with the shared drives attached in the middle of the cable.  The only means by which the servers could communicate was through the file system itself, and so special clustered file systems were developed that allowed communications between the devices, alerting each to changes made by the other, through the file system itself.  This actually works surprisingly well, but clustered file systems are relatively uncommon, with Red Hat’s GFS and Oracle’s OCFS being among the best known in the traditional server world and VMware’s much newer VMFS having become extremely well known through its use for virtualization storage.  Normal users, including system administrators, may not have access to clustered file systems or may have needs that do not allow their use.  Of important note is also that the arbitration is handled through trust, not through enforcement as with a file storage device.  With a file storage device, the device itself handles the access arbitration and there is no way around it.  With block storage devices using a clustered file system, any device that attaches to the storage can ignore the clustered file system and simply bypass the passive arbitration – this is so simple that it can happen accidentally.  It can happen when mounting the filesystem and specifying the wrong file system type, through a drive misbehaving or through any malicious action.  So access security is critical at the network level to protect block level storage.

The underlying concept being exposed here is that block storage devices are dumb devices (think glorified disk drive) and file storage devices are smart devices (think traditional server.)  File storage devices must contain a full working “computer” with CPU, memory, storage, filesystem and networking.  Block storage devices may contain these things but need not.  At their simplest, block storage devices can be nothing more than a disk drive with a USB or Ethernet adapter attached.  It is actually not uncommon for them to be nothing more than a RAID controller with Ethernet or Fibre Channel adapters attached.

In both cases, block storage device and file storage devices, we can scale down to trivially simple devices or can scale up to massive “mainframe class” ultra-high-availability systems.  Both can be either fast or slow.  One is not better or worse, one is not higher or lower, one is not more or less enterprise – they are different and serve generally different purposes.  And there are advanced features that either may or may not contain.  The challenge comes in knowing which is right for which job.

I like to think of block storage protocols as being a “standard out” stream, much like on a command line.  So the base level of any storage “pipeline” is always a block device and numerous block devices or transformations can exist with each being piped one to another as long as the output remains a block storage protocol.  We only terminate the chain when we apply a file system.   In this way hardware RAID, network RAID, logical volume management, etc. can be applied in multiple combinations as needed.  Block storage is truly not just blocks of data but building blocks of storage systems.
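
A minimal sketch of that pipeline idea in Python (the class names are invented for illustration and do not correspond to any real storage stack): each layer consumes block devices and exposes another block device, and only the filesystem terminates the chain.

# Conceptual only: block layers compose like pipes; a filesystem ends the chain.
class BlockDevice:
    def __init__(self, description):
        self.description = description

class RAIDArray(BlockDevice):            # block in -> block out
    def __init__(self, members):
        super().__init__("RAID over [" + ", ".join(m.description for m in members) + "]")

class LogicalVolume(BlockDevice):        # block in -> block out
    def __init__(self, source, name):
        super().__init__("LV '" + name + "' on " + source.description)

class Filesystem:                        # block in -> files out: the chain terminates here
    def __init__(self, source, kind):
        self.description = kind + " on " + source.description

# Two iSCSI LUNs presented by a SAN, mirrored by the client, carved into a
# logical volume and finally formatted; every stage until the last hands a
# block device to the next.
lun1 = BlockDevice("iSCSI LUN 1")
lun2 = BlockDevice("iSCSI LUN 2")
fs = Filesystem(LogicalVolume(RAIDArray([lun1, lun2]), "data"), "ext4")
print(fs.description)   # ext4 on LV 'data' on RAID over [iSCSI LUN 1, iSCSI LUN 2]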

One point that is very interesting is that since block storage devices can be chained, and since file storage devices must accept block storage as their “input,” it is actually quite common for a block storage device (SAN) to be used as the backing storage for a file storage device (NAS), especially in high end systems.  They can coexist within a single chassis or they can work cooperatively on the network.

Choosing a Storage Type

While technicalities defining which type of storage is which can become problematic, the underlying concepts are pretty well understood.  There are four key types of storage that we use in everyday server computing: local disks, DAS, NAS and SAN.  Choosing which we want to use, in most cases, can be broken down into a relatively easy formula.

The quick rule of thumb for storage should be: Local before DAS, DAS before NAS, NAS before SAN.  Or as I like to write it:

Local Disks -> DAS -> NAS -> SAN

To use this rule you simply start with your storage requirements in hand and begin on the left hand side.  If local disks meet your requirements, then almost certainly they are your best choice.  If they don’t meet your requirements, move to the right and check whether DAS will meet them.  If so, great; if not, continue the process.
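
Expressed literally, the rule is nothing more than a left-to-right walk; here is a trivial Python sketch (the requirement check is a placeholder for whatever evaluation you actually perform):

# Walk the tiers left to right and stop at the first one that satisfies the need.
TIERS = ("local disks", "DAS", "NAS", "SAN")

def choose_storage(meets_requirements):
    for tier in TIERS:
        if meets_requirements(tier):
            return tier
    return None

# Example: a requirement that only networked block storage can satisfy
# (say, shared block access for a cluster) rules out everything until SAN.
print(choose_storage(lambda tier: tier == "SAN"))   # SAN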

That’s the rule of thumb, so if that is all you need, there you go.  But we will dive into the “why” of the rule below.  The quick overview is that on the left we get speed and reliability at the lowest cost.  As we move to the right, complexity increases, and typically price does as well.  The last two, while very different, are actually the most alike in many ways due to their networked nature.

Local Disks:  Local drives inside your server chassis are your best bet for most tasks.  Being inside the chassis means the least money spent on extra enclosures to hold and power the drives, the least physical risk, the most solid connection technologies, the shortest distance and the fewest potential bottlenecks.  Being raw disks, local disks are block devices.

Direct Attached Storage:  DAS is, more or less, local drives housed outside of the server chassis.  The server itself will see them exactly like any other local drives making them very easy to use.  DAS is simple but still has extra external containers and extra cables.  This adds cost and some complexity.  DAS makes it easier to attach multiple servers to the same set of drives as this is almost impossible, and always cumbersome, with local disks.  So DAS is effectively our first type of physically sharable storage.  Being identical to local disks, DAS is a form of block device.

Network Attached Storage: NAS is unique in that it is the only non-block device from which we have to choose.  A NAS, or a traditional file server (they are truly one and the same), is the first of our technologies designed to run over a network.  This adds a lot of complication.  NAS shares storage out at the filesystem level.  A NAS is an intelligent device that allows users over the network to easily and safely share storage because the NAS has the necessary logic on board to handle multiple users at one time.  NAS is very easy for anyone to use and is even commonly used by people at home.

Storage Area Network: SAN is an adaptation of DAS with the addition of a network infrastructure, allowing the SAN to behave as a remote hard drive (block device) that an operating system sees as no different from any other hard drive attached to it.  SANs require advanced networking knowledge, are surrounded by a large amount of myth and rumor, are poorly understood by the average IT professional and are generally complex to use and understand; because they lack the logic of a NAS they effectively expose a hard drive directly to the network, making it trivially easy to corrupt and destroy data.  It is, in fact, so easy to lose data on a SAN due to misconfiguration that the most commonly expected use of a SAN is a use case for which a SAN cannot be used.

Of course there is much grey area.  What is normally considered a DAS can be turned into a SAN.  A SAN can be direct connected.  NAS can be direct connected.  Local storage can act as either NAS or SAN depending on configuration such as with a VSA (Virtual Storage Appliance.)  Many devices are simultaneously NAS and SAN and the determination is by configuration, not by the physical device itself.  But in generally accepted use, the terms are mostly straightforward.

The point being that as we move from left to right in our list we move from simple and easy to difficult and complex.  SAN itself is a rock solid technology; it is the introduction of humans, and the ease with which they can do dangerous things with SAN, that makes it a risky storage technique for the average user.  As with everything in IT, keeping our technologies and processes simple brings stability and security and, often, cost savings as well.

There are many times when movement to “the right” is necessary.  Local disks do not scale well and can become too expensive to maintain for certain types of larger deployments.  DAS, likewise, doesn’t scale well in many cases.  NAS scales well but, being a non-block protocol, is a bit unique and doesn’t always work for our purposes, a good example being Hyper-V, which requires a block device for storage.  SAN is the final catchall of storage.  If nothing else works, SAN is always there to fall back on – or, as I like to say, SAN is the storage of last resort.

This is a very high level look at the basics of choosing a storage approach.  This is a common IT task that must be done with great regularity.  I did not intend this post, in any way, to convey any deep knowledge of storage but simply to provide a handy guide to understanding where to start looking at storage options.  Exceptions and special cases abound, but it is extremely common to simply skip the best option, go straight to considering something big, expensive and complex, and rapidly forget that something much simpler might do the same job in a far superior manner.  The underlying concept is that the simplest solution that meets the need is usually the best.