One of the more common, yet trickier, fundamental concepts in computing today is drive appearance: in other words, something that appears to the computer to be a hard drive. This may sound simple, and mostly it is, but it has its subtleties.
First, what is a hard drive? This should be simple. We normally mean a traditional spinning-disk Winchester device such as has been made for decades in the standard three-and-a-half-inch and two-and-a-half-inch form factors. These drives contain platters that spin and a drive head that moves back and forth, and they connect using something like ATA or SCSI connectors. Most of us can pick up a hard drive with our hands and be certain that we have a hard drive. This is what we call the physical manifestation of the drive.
The computer, though, does not see the casing of the drive or the connectors. It has to look through its electronics and “see” the drive digitally. This is very different from how humans view the physical drive. To the computer, a hard drive appears as an ATA, SCSI or Fibre Channel device at the most basic physical level and is generally abstracted at a higher level as a block device. This is what we would call a logical appearance, rather than a physical one. For our purposes here, we will think of all of these drive interfaces as being block devices. They do differ, but only slightly and not in ways that matter to this discussion. What is important is that there is a standard interface, or set of closely related interfaces, that the computer sees as being a hard drive.
Another way to think of the logical drive appearance here is that anything that looks like a hard drive to the computer is something that the computer can format with a filesystem. Filesystems are not drives themselves, but require a drive on which to be placed.
The concept of the interface is the most important one here. To the computer, it is “anything that implements a hard drive interface” that is truly seen as being a hard drive. This is both a simple and a powerful concept.
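To make the idea of "anything that implements a hard drive interface" concrete, here is a minimal Python sketch. The names `BlockDevice`, `RamDisk` and `format_with_label` are hypothetical and invented for this illustration; real block device interfaces (ATA, SCSI and so on) are far richer. The point is only that the consumer is written against the interface, not against any particular kind of hardware.

```python
from typing import Protocol


class BlockDevice(Protocol):
    """Hypothetical 'drive appearance': fixed-size blocks addressed by number."""

    block_size: int
    block_count: int

    def read_block(self, index: int) -> bytes: ...
    def write_block(self, index: int, data: bytes) -> None: ...


class RamDisk:
    """Stands in for any physical medium (platters, flash, and so on)."""

    def __init__(self, block_count: int, block_size: int = 512) -> None:
        self.block_size = block_size
        self.block_count = block_count
        self._blocks = [bytes(block_size) for _ in range(block_count)]

    def read_block(self, index: int) -> bytes:
        return self._blocks[index]

    def write_block(self, index: int, data: bytes) -> None:
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self._blocks[index] = bytes(data)


def format_with_label(device: BlockDevice, label: bytes) -> None:
    """Toy 'format': writes a zero-padded label into block 0 of any device."""
    device.write_block(0, label.ljust(device.block_size, b"\x00"))


disk = RamDisk(block_count=8)
format_with_label(disk, b"TOYFS")
print(disk.read_block(0)[:5])  # b'TOYFS'
```

Nothing in `format_with_label` knows or cares what sits behind the interface, which is the property the rest of this discussion relies on.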
It is because of the use of a standard interface that we were able to take flash memory, attach it to a disk controller that would present it over a standard protocol (both SATA and SAS implementations of ATA and SCSI are common for this today) and create SSDs that look and act exactly like traditional Winchester drives to the computer yet have nothing physically in common with them. They may or may not come in a familiar physical form factor, but they definitely lack platters and a drive head. Looking at the workings of a traditional hard drive and a modern SSD we would not guess that they share a purpose.
This concept applies to many devices. Obviously SD cards and USB memory sticks work in the same way. But importantly, this is how partitions on top of hard drives work. The partitioning system uses the drive impression interface on one side, so that it can be applied to a device, and on the other side it presents a drive impression interface to whatever wants to use it, normally a filesystem. This idea of something using the drive impression interface on both sides is very important. By doing this, we get a uniform and universal building block system for making complex storage systems!
We see this concept of “drive in; drive out” in many cases. Probably the best known is RAID. A RAID system takes an array of hard drives, applies one of a number of algorithms to make the drives act as a team, and then presents them as a single drive impression to the next system up the “stack.” This encapsulation is what gives RAID its power: systems further up the stack looking at a RAID array see literally a hard drive. They do not see the array of drives, and they do not know what is below the RAID. They just see the resulting drive(s) that the RAID system presents.
Because a RAID system takes an arbitrary number of drives and presents them as a standard drive, we have the theoretical ability to layer RAID as many times as we want. Of course this would be extremely impractical to do to any great degree. But it is through this concept that nested RAID arrays are possible. For example, suppose we had many physical hard drives split into pairs, with each pair in a RAID 1 array. Each of those resulting arrays gets presented as a single drive. Each of those resulting logical drives can be combined into another RAID array, such as RAID 0. Doing this is how RAID 10 is built. Going further, we could take a number of RAID 10 arrays, present them all to another RAID system that puts them in RAID 0 again and get RAID 100, and so forth indefinitely.
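The "same interface on both sides" idea behind partitions can be sketched in a few lines of Python. `RamDisk` and `Partition` are hypothetical names made up for this example, and a real partitioning scheme also stores a partition table on the disk, which this sketch leaves out. What it shows is only the translation step: a partition accepts a block device underneath and presents a smaller block device on top.

```python
class RamDisk:
    """In-memory stand-in for a physical drive (an assumption for this sketch)."""

    def __init__(self, block_count, block_size=512):
        self.block_count = block_count
        self.block_size = block_size
        self._blocks = [bytes(block_size) for _ in range(block_count)]

    def read_block(self, index):
        return self._blocks[index]

    def write_block(self, index, data):
        self._blocks[index] = bytes(data)


class Partition:
    """Drive in, drive out: consumes one block device, presents another."""

    def __init__(self, lower, first_block, block_count):
        if first_block < 0 or block_count < 0:
            raise ValueError("partition bounds must be non-negative")
        if first_block + block_count > lower.block_count:
            raise ValueError("partition extends past the end of the device")
        self._lower = lower
        self._first = first_block
        self.block_count = block_count
        self.block_size = lower.block_size

    def _translate(self, index):
        # Map a partition-relative block number to one on the lower device.
        if not 0 <= index < self.block_count:
            raise IndexError("block outside partition")
        return self._first + index

    def read_block(self, index):
        return self._lower.read_block(self._translate(index))

    def write_block(self, index, data):
        self._lower.write_block(self._translate(index), data)


disk = RamDisk(block_count=10)
part1 = Partition(disk, first_block=0, block_count=4)
part2 = Partition(disk, first_block=4, block_count=6)

payload = b"x" * disk.block_size
part2.write_block(0, payload)
# Block 0 of the second partition is block 4 of the underlying disk.
print(disk.read_block(4) == payload)  # True
```

Because `Partition` exposes the same attributes and methods as `RamDisk`, anything that can use a whole disk can use a partition without modification.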
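The nesting described above can be modeled with a short Python sketch. The classes `RamDisk`, `Raid1` and `Raid0` and the helper `make_raid10` are hypothetical names for this illustration. It is a deliberate simplification: real RAID implementations stripe in chunks larger than one block, handle failed members, and keep metadata, none of which appears here. The only thing being demonstrated is that each layer takes drives in and presents a drive out, so the layers compose.

```python
class RamDisk:
    """In-memory stand-in for a physical drive (an assumption for this sketch)."""

    def __init__(self, block_count, block_size=512):
        self.block_count = block_count
        self.block_size = block_size
        self._blocks = [bytes(block_size) for _ in range(block_count)]

    def read_block(self, index):
        return self._blocks[index]

    def write_block(self, index, data):
        self._blocks[index] = bytes(data)


class Raid1:
    """Mirror: every write goes to all members; reads come from the first."""

    def __init__(self, members):
        if not members:
            raise ValueError("a mirror needs at least one member")
        self._members = list(members)
        self.block_size = self._members[0].block_size
        self.block_count = min(m.block_count for m in self._members)

    def read_block(self, index):
        return self._members[0].read_block(index)

    def write_block(self, index, data):
        for member in self._members:
            member.write_block(index, data)


class Raid0:
    """Stripe: logical blocks are dealt round-robin across the members."""

    def __init__(self, members):
        if not members:
            raise ValueError("a stripe needs at least one member")
        self._members = list(members)
        self.block_size = self._members[0].block_size
        smallest = min(m.block_count for m in self._members)
        self.block_count = len(self._members) * smallest

    def _locate(self, index):
        if not 0 <= index < self.block_count:
            raise IndexError("block outside array")
        width = len(self._members)
        return self._members[index % width], index // width

    def read_block(self, index):
        member, block = self._locate(index)
        return member.read_block(block)

    def write_block(self, index, data):
        member, block = self._locate(index)
        member.write_block(block, data)


def make_raid10(disks):
    """Pair the disks into mirrors, then stripe across the mirrors."""
    pairs = [disks[i:i + 2] for i in range(0, len(disks), 2)]
    return Raid0([Raid1(pair) for pair in pairs])


disks = [RamDisk(4) for _ in range(4)]
raid10 = make_raid10(disks)
for i in range(raid10.block_count):
    raid10.write_block(i, bytes([i]) * raid10.block_size)

# Logical block 1 lands on the second mirror, so both of its disks hold it.
print(disks[2].read_block(0) == disks[3].read_block(0))  # True

# RAID 100: a stripe whose members are themselves RAID 10 arrays.
raid100 = Raid0([make_raid10([RamDisk(4) for _ in range(4)]) for _ in range(2)])
print(raid100.block_count)  # 16
```

`Raid0` never checks what kind of object its members are. It only needs them to look like drives, which is why the same class builds both RAID 10 and RAID 100.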
Similarly the logical volume layer uses the same kind of encapsulation as RAID to work its magic. Logical Volume Managers, such as LVM on Linux and Dynamic Disks on Windows, sit on top of logical disks and provide a layer where you can do powerful management such as flexibly expanding devices or enabling snapshots, and then present logical disks (aka drive impression interface) to the next layer of the stack.
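As a rough illustration of the "flexibly expanding devices" feature, the Python sketch below models a logical volume as an ordered list of extents drawn from one or more underlying drives. `RamDisk` and `LogicalVolume` are hypothetical names; this is not how LVM or Dynamic Disks are actually implemented, and snapshots are not modeled at all. It only shows that the volume keeps presenting a drive interface while its size changes underneath.

```python
class RamDisk:
    """In-memory stand-in for a physical drive (an assumption for this sketch)."""

    def __init__(self, block_count, block_size=512):
        self.block_count = block_count
        self.block_size = block_size
        self._blocks = [bytes(block_size) for _ in range(block_count)]

    def read_block(self, index):
        return self._blocks[index]

    def write_block(self, index, data):
        self._blocks[index] = bytes(data)


class LogicalVolume:
    """Presents a run of extents, possibly from several devices, as one drive."""

    def __init__(self, block_size=512):
        self.block_size = block_size
        self._extents = []  # list of (device, first_block, block_count)

    @property
    def block_count(self):
        return sum(count for _, _, count in self._extents)

    def extend(self, device, first_block, block_count):
        """Grow the volume by appending another extent to the end."""
        if device.block_size != self.block_size:
            raise ValueError("block sizes must match")
        if first_block < 0 or first_block + block_count > device.block_count:
            raise ValueError("extent does not fit on the device")
        self._extents.append((device, first_block, block_count))

    def _locate(self, index):
        # Walk the extents until the one containing this block is found.
        if index < 0:
            raise IndexError("block outside volume")
        for device, first, count in self._extents:
            if index < count:
                return device, first + index
            index -= count
        raise IndexError("block outside volume")

    def read_block(self, index):
        device, block = self._locate(index)
        return device.read_block(block)

    def write_block(self, index, data):
        device, block = self._locate(index)
        device.write_block(block, data)


disk_a, disk_b = RamDisk(4), RamDisk(4)
volume = LogicalVolume()
volume.extend(disk_a, first_block=0, block_count=4)
print(volume.block_count)  # 4

# Grow the volume onto a second drive without disturbing existing blocks.
volume.extend(disk_b, first_block=0, block_count=2)
print(volume.block_count)  # 6

payload = b"y" * volume.block_size
volume.write_block(5, payload)
print(disk_b.read_block(1) == payload)  # True
```

The underlying devices passed to `extend` could just as well be RAID arrays or partitions, since all the volume requires of them is the drive interface.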
Because of the uniform nature of drive impressions, the stack can be assembled in any order. A logical volume manager can sit on top of RAID, or RAID can sit on top of a logical volume manager, and of course you can skip one or the other or both!
The concept of drive impressions or logical hard drives is powerful in its simplicity and allows us great potential for customizing storage systems however we need to make them.
Of course there are other uses of the logical drive concept as well. One of the most popular and least understood is that of a SAN. A SAN is nothing more than a device that takes one or more physical disks and presents them as logical drives (this presentation of a logical drive from a SAN is called a LUN) over the network. This is, quite literally, all that a SAN is. Most SANs will incorporate a RAID layer and likely a logical volume manager layer before presenting the final LUNs, or disk impressions, to the network, but that is not required to be a SAN.
This means, of course, that multiple SAN LUNs can be combined in a single RAID array or controlled via a logical volume layer. And of course it means that a SAN LUN, a physical hard drive, a RAID array, a logical volume, a partition and so on can all be formatted with a filesystem, as they are all different means of achieving the same result. To whatever sits above them, they all look the same. They all share the drive appearance interface.
To give a real world example of how you would often see all of these parts come together, we will examine one of the most common “storage stacks” that you will find in the enterprise space. Of course there are many ways to build a storage stack, so do not be surprised if yours is different. At the bottom of the stack there are nearly always physical hard drives, which could include solid state drives. These are located physically within a SAN. Before leaving the SAN, the stack will likely include the actual storage layer of the drives, then a RAID layer combining those drives into a single entity, and then a logical volume layer to allow for features like growth and snapshots. Then there is the physical demarcation between the SAN and the server, which is presented as the LUN. The LUN then has a logical volume manager applied to it on the server / operating system side of the demarcation point. On top of that logical volume is a filesystem, which is our final step, as the filesystem does not continue to present a drive appearance interface but a file interface instead.
Understanding drive appearance, or logical drives, and how it allows components to interface with each other to build complex storage subsystems is a critical building block of IT understanding and is widely applicable to a large number of IT activities.
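The last step of that stack, where a drive interface goes in and a file interface comes out, can be sketched as follows. `RamDisk` and `ToyFileSystem` are hypothetical names for this illustration, and the "filesystem" is deliberately crude: one block per file, with the directory held only in memory. In the stack described above, the device handed to the filesystem would be the logical volume built on the LUN; here a plain in-memory disk stands in for it.

```python
class RamDisk:
    """Stand-in for whatever sits at the top of the block stack (assumed)."""

    def __init__(self, block_count, block_size=512):
        self.block_count = block_count
        self.block_size = block_size
        self._blocks = [bytes(block_size) for _ in range(block_count)]

    def read_block(self, index):
        return self._blocks[index]

    def write_block(self, index, data):
        self._blocks[index] = bytes(data)


class ToyFileSystem:
    """Consumes a block device but presents files, not blocks."""

    def __init__(self, device):
        self._device = device
        self._directory = {}  # file name -> block number (memory only)
        self._sizes = {}      # file name -> length in bytes

    def write_file(self, name, content):
        if len(content) > self._device.block_size:
            raise ValueError("toy filesystem stores at most one block per file")
        if name not in self._directory:
            if len(self._directory) >= self._device.block_count:
                raise OSError("no free blocks")
            # Hand out blocks in order; files are never deleted in this toy.
            self._directory[name] = len(self._directory)
        padded = content.ljust(self._device.block_size, b"\x00")
        self._device.write_block(self._directory[name], padded)
        self._sizes[name] = len(content)

    def read_file(self, name):
        block = self._device.read_block(self._directory[name])
        return block[: self._sizes[name]]


fs = ToyFileSystem(RamDisk(block_count=8))
fs.write_file("notes.txt", b"hello")
print(fs.read_file("notes.txt"))  # b'hello'

# The filesystem offers no block interface, so nothing can stack on it as a drive.
print(hasattr(fs, "read_block"))  # False
```

This is why the filesystem ends the stack: every layer beneath it could be reordered or repeated because each one presents a drive, while the filesystem presents something else.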