The Ripple Effect of Windows 8

Windows 8, with its new, dramatic Metro interface, is a huge gamble for Microsoft.  A huge gamble not only because they risk slowing update cycles and attrition of their desktop installation base but also because the Windows desktop is an underpinning of the Microsoft ecosystem – one that can easily unravel if Microsoft fails to maintain a strong foundation.

As a technologist I have been watching Windows 8 for some time, having used it, in some capacity, since the earliest public betas.  I’ve long struggled to come to terms with how Microsoft envisioned Windows 8 fitting into their existing customer base but have been, more or less, hopeful that the final release would address many of my concerns.  When Windows 8 finally shipped I was, sadly, left still wondering why it was so different from past Windows interfaces, what the ultimate intention was and how users were going to react to it.

It didn’t take long before I got a very thorough introduction to user reaction.  As a technology consultancy we tend to move quickly on new technologies and trends.  We may not deploy beta products into production, but when new products release our update cycles are generally almost instantaneous.  We need to run the latest and greatest all of the time so that we are ready for problems before anyone else, allowing us to stay ahead of our customers.  So Windows 8 started getting prepped for rollout pretty much on the day that it was released to manufacturing.  This is when management got their first chance to try it out before the actual deployments started – the IT department had been playing with it since the early betas.

Management came back to IT to ask critical questions concerning efficiency, usability and training.  Their reaction was that Windows 8’s interface was confusing and highly inefficient, requiring a disruptive “jolt” of leaping to and from full screen menus that caused mental context switching and loss of focus.  Many tasks required power user levels of knowledge to accomplish, while the interface seemed to be designed around low end “consumer” use and poorly suited to the very people with the level of knowledge necessary to make the system functional.

It wasn’t that Windows 8 was unusable; it simply failed to deliver the value traditionally associated with Windows, the value that has let us move from version to version more or less without thinking: that sticking with Windows on the desktop means a predictable user experience, little to no retraining and overall efficiency.  Windows 8 requires extensive retraining, makes workers less efficient even after they adapt to it and expects traditionally casual users to become power users to be effective.  While sticking with Windows is the obvious choice for IT departments with deep investments in Windows knowledge, skills and tools, the value proposition for end users no longer has the continuity that it had in the past.

We read many reviews, and consistently the answer to whether Windows 8 would deliver value to other organizations seemed to come down to it being “good enough”: with extensive training, and with end users learning to “deal with” the interface issues and pick up entirely new habits such as jumping back and forth between mouse and keyboard and memorizing shortcut keys, the system could be made functional.  But never good, never ideal.  The conversation around Windows 8 is not about showing why it is better, only about making it acceptable.  Hardly a position that we want to be in as an IT department.  We want to deliver solutions and value.  We want to make our businesses more efficient, not less.  We want to avoid disruption, not create it.

We even went so far as to visit Microsoft at a trade show where they were putting on a display of Windows 8.  Even Microsoft’s own staff were unable to clarify the value proposition of Windows 8, or even to get it to work “as intended” in their own demonstration environment.  It is clear that even Microsoft themselves are not confident in the product or sure how their customers are expected to react to it.

The decision was made quickly: management wanted a demonstration of a Linux desktop immediately.  The first test was Linux Mint, which ended up being the final choice as well.  The non-IT users were genuinely impressed with how easy Linux Mint was to use for people with a Windows background and nothing else.  It required no training; users literally just sat down and started working, unlike on Windows 8, where users were confused and needed help even with the simplest tasks like opening an application or shutting down the computer.  And there was essentially no pushback: people were universally excited about the opportunities that the new platform could provide, whereas they had been actively concerned about how painful working with Windows 8 would be, both up front and down the road.

That Windows 8 blundered so dramatically as to cause a competing product to get auditioned was not that surprising to me.  These things happen.  That the reaction of the non-IT staff was so dramatically in favor of a Linux distro was quite surprising, however.  Staff with no Linux exposure didn’t just see Linux as a low cost alternative or the lesser of two evils but were downright excited to use it.  Windows 8 caused Microsoft’s worst fears to come true – using Windows is no longer something that users choose because it is familiar and comfortable.  If they feel the need or desire to test alternatives, Windows will no longer compete on a “devil we know” basis as it traditionally has but will need to compete on usability, and Linux Mint, in this case, actually felt far more familiar and comfortable than Windows 8.

What did truly surprise me, however, was the ripple effect that changing the operating system had on the computing infrastructure.  Replacing Windows caused a series of questions to arise around other technology choices.  The first, perhaps obviously, was what to do about Windows-based applications that had no Linux versions.

We were lucky that the shop ran very standard applications, most of them modern and browser-based, so the bulk of our systems worked on Linux transparently.  The only major application to require an alternative was Microsoft Office.  Fortunately the fix was easy: LibreOffice had everything that we needed and is built into the operating system.  Moving from MS Office to LibreOffice can be simple or intimidating depending on outside dependencies, complexity of use scenarios, heavy use of macros and so forth; in our case the move was trivial across the board.

Dropping Microsoft Office left us without an effective email client for our Exchange email system.  So again, management asked, what compelling value was there for us in Exchange?  Shoulder shrugs followed.  Almost immediately a migration effort from a hosted Exchange service to Rackspace Email began.  This resulted in one of the largest cost savings of the entire process.

Next to be questioned was SharePoint.  Without desktop Active Directory integration, Microsoft Office integration and Exchange integration, was the overhead of running a heavy SharePoint installation of appreciable value to our organization?  SharePoint put up the biggest fight as it truly is a nearly irreplaceable system with numerous aspects and features that cannot be trivially compared to other systems.  In the end, however, without the slew of Microsoft integrated components SharePoint was deemed too costly and complex to warrant using on its own in our environment.

One by one, Microsoft products whose value was established through their tight integration with each other began to be eliminated in favor of lower cost, more flexible alternatives.  As each was removed, the value that they had cumulatively created diminished, making every remaining piece less and less valuable without the others.

Before the move to a Linux desktop we had been preparing to install Lync as a replacement for both our instant messaging platform and our telephony platform.  Needless to say, that project was cancelled and our existing systems, which integrate very well with Linux and cost much less, were kept.

As we got to the end of eliminating Microsoft-based applications it became apparent that using Active Directory for centralized authentication was not cost effective.  This last piece will take quite some time to phase out completely, as a new centralized authentication mechanism requires significant planning and implementation work, but the move to a completely different platform has begun.

Even applications that we thought were sacred and untouchable, where plans were in place to keep them running on dedicated Windows instances just for special purposes like accounting, ended up being less sacred than we had anticipated.  New applications were found and systems were migrated.

Of course support infrastructure followed as well with System Center and Windows-focused backup systems no longer needed.  And Windows-based file servers stopped making sense without Windows clients to support.

At the end of the day what was so shocking was that the smallest thing, a concern over the efficiency and usability of Windows 8’s new interface, triggered a series of discoveries that completely unraveled our Microsoft-centered ecosystem.  No single product was unloved or disliked.  We were a team of dedicated Windows 7 desktop users on a wholly Microsoft infrastructure, happy with that decision and happy to be moving more and more over to the Microsoft “way”.  But simply questioning the assumption that we wanted or needed to be using a Windows desktop ended up bringing down an infrastructural house of cards.

From an end user perspective, the move to Linux was effortless.  There has been quite a bit of retraining and rethinking on the support side, of course.  There is a lot to learn there, but that is IT’s job: support the business and do what needs to be done to let it work as efficiently as possible.

Does this portend a dark future for Windows?  Unlikely, but it does highlight that a significant misstep on the desktop platform could easily put Microsoft’s market position on a downward spiral.  Microsoft depends on tight integration between their systems to create their value proposition.  Losing the desktop component of that integration can quickly undermine the remaining pieces.  To be sure, ours is a special case: a small firm with extensive UNIX skills already in house, an ambitious and forward thinking management team, the agility to make broad changes and more than a decade of seeking platform independence in application choices.  But just because we lie on the extreme edge does not mean that our story is not an important one.  For some, Windows 8 might represent not only the tipping point in the Windows desktop value proposition but the tipping point in the Microsoft ecosystem itself.

Keeping IT in Context

Information Technology doesn’t exist in a bubble; it exists to serve a business or organization (for profit, non-profit, government, etc.).  The entity which we, as IT professionals, serve provides the context for IT.  Without this context IT changes; it becomes just “technology.”

One of the biggest mistakes that I see when dealing with companies of all sizes is the proclivity of IT professionals to forget the context in which they are working and to start behaving in one of two ways.  The first is forgetting context completely, leaving IT for “hobbyist land” and looking at the technologies and equipment that we use purely as toys for the enjoyment and fulfillment of the IT department itself, without consideration for the business.  The second is treating the business as generic instead of respecting that every business has unique needs and that IT must adapt to the environment it is in.

The first problem, the hobbyist problem, is the natural extension of the route through which most IT professionals arrive in IT – they love working on computers and would do so on their own, at home, whether they were paid to do so or not.  This often brings a lifetime of “tech for the sake of tech” feeling to an IT pro and is nearly universal in the field.  Few other professionals find themselves so universally drawn to what they do that they would do it paid or not.  But this shared experience creates a culture that often forgets that the IT department exists within the context of a specific corporate entity or business unit and that its mandate exists only within that context.

The second problem stems, most likely, from broad IT and business training that focuses heavily on rules of thumb and best practices which, generally, assume “common scenarios” because these are easy to teach by rote, leaving out the difficult pieces of problem analysis and system design.  Tailoring not only solutions but IT thinking to the context of a specific business with specific needs is difficult; it requires learning a great deal about the business itself and a great deal of thought to put IT into that particular business’ context.

The fault does not necessarily lie with IT alone.  Businesses often treat their IT departments as nothing but hobbyists, focus far too heavily on technical rather than business skills, and keep IT at arm’s length, forgetting that IT holds some of the most important business insight because it tends to cross all company boundaries.  IT needs deep access to business processes, workflows, planning and goals to be able to provide good advisement to the business, but it is often treated as if this information were not needed.  Businesses, especially smaller ones, tend to think of IT as a magic box with a set budget: money goes in and network plumbing comes out.  Print and radio ads promote this thinking.  Treating IT as a product is poor business thinking.

In defense of the business, IT operates in a way that few businesses are really prepared to handle.  IT is a cost center in the sense that there is a base cost needed to keep any company functioning.  Beyond this, though, IT can be an opportunity center in most businesses, but that requires IT and the business to work together to create those opportunities and, even more so, to leverage them.

IT is often put in the inappropriate position of being forced to justify its own existence.  This is nonsensical, as human resources, accounting, legal, management, janitorial, sales, marketing and production departments are never asked to demonstrate their financial viability.  Needing to do so puts an unfair strain on the IT department, requiring non-business people to present business cases, and it wastes resources and hampers thinking in a vain attempt to produce pointless metrics.  This is a flaw in business thinking, often caused by a rift between management and the people they have hired to support them.  The relationship is often cold, cursory or even adversarial when it should be close and involved.  IT should be sitting at the decision table; it brings insight and it needs insight.

One of the biggest challenges that IT faces is that it is often in the position of needing to convince the business to do what is in the business’ own best interest.  This is, for the most part, a flaw in business thinking.  The business should not default to doing the wrong thing, only willing to do the right thing if it can be “sold” on it.  The process should start from good decision making, not from bad decision making that stands unless someone argues otherwise.  Other departments are not presented with a similar challenge.  What other department regularly has to mount a campaign to request necessary resources?

Because of this constant fight for management attention and resources, IT needs to develop internal business skills in order to cope.  This is a reality for most IT departments today.  It is critical not only to keep the business that they support in context and to make IT decisions based on this context, but also to act as marketers and salespeople, taking those decisions and delivering them to the business much as outside vendors and salespeople would.  Outside vendors send skilled salespeople and negotiators to the business in an attempt to do an end run around IT; IT needs the same skills (with the advantage of insider knowledge and the obvious advantage of having the best interest of the business at heart) in order to demonstrate to the business why its solutions, opportunities and needs deserve consideration.

Having good interpersonal, writing and presentation skills is not enough, of course.  Knowing the business context and leveraging it effectively means understanding factors such as risk, opportunity, loss and profit, and being able to apply them to the relationship between the business’ IT investments and the bottom line.  IT pros are often frustrated when the business is unwilling to invest in a solution that they present, forgetting that the business is considering (we hope) the total cost of ownership and the impact on the company’s bottom line.  When asked how a solution will save money or generate revenue, even indirectly, the answers are often, at best, vague and lacking in metrics.  Before going to the business with solutions, IT departments need to vet recommendations internally and ask tough questions like:

How does this solution save money today?  Or how does it make us more money?
How much money is it expected to save or earn?
What business problem are we attempting to solve? (What itch are we trying to scratch?)
What risks do we take on or reduce?

Or similar lines of thinking.  Instead of bringing technology to the business, bring solutions.  Identify problems or opportunities and present a case.  Role play: imagine yourself as a business owner skeptical of a solution.  Would you feel that the requested investment is a good one?  Too often we in IT like a solution because it is advanced, complex, “the right way to do it”, because another company is doing it or because it is the hot trend in IT.  We may have very good reasons for wanting to bring these techniques or technologies into our workplace, but we forget that they may not apply, or apply well, to the business as it is, its financial capabilities or its roadmap.
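
As a purely hypothetical illustration (the project and every number below are invented), the kind of back-of-the-envelope vetting described above might look something like this before a proposal ever reaches management:

```python
# Hypothetical numbers for an invented project, purely to illustrate the kind
# of internal vetting an IT department can do before approaching the business.

acquisition_cost = 24_000   # hardware, software, licensing (year 0)
annual_support = 4_000      # maintenance, subscriptions, admin time per year
annual_benefit = 15_000     # labor saved, downtime avoided, licenses retired
years = 3

total_cost = acquisition_cost + annual_support * years
total_benefit = annual_benefit * years
payback_years = acquisition_cost / (annual_benefit - annual_support)

print(f"{years}-year total cost of ownership: ${total_cost:,}")
print(f"{years}-year expected benefit:        ${total_benefit:,}")
print(f"Simple payback period:               {payback_years:.1f} years")
```

If the inputs cannot be filled in with anything better than guesses, that by itself answers the question of whether the proposal is ready to bring to the business.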

When I speak to IT professionals looking for advice on a system design or approach, my first question is almost universally: “What business need are you attempting to solve?”  Often this question is met with silence.  The business had not been considered in the selection of the solution being presented.  Regularly bringing requests or solutions to the business that do not take into consideration the context of the IT department within the business will rapidly train business decision makers to distrust the advice coming from the IT department.  Not that they will feel the advice is intentionally skewed, but they will suspect, often rightfully, that the decisions are being brought forward on a technical basis alone, isolated from the concerns of the business.  Once this distrust is in place it is difficult to return to a healthier relationship.

Making the IT department continuously act within the context of the business that it serves, encouraging IT to pursue business skills and to approach the business for information and insight, and getting the business to see IT as a partner and supporter with whom information must be shared and from whom insight should be gathered can be a tall order.  The business is not likely to take the first step in improving the relationship.  It is often up to IT to demonstrate that it is considering the needs of the business, often more so than the business itself, and weighing the potential financial impact or benefit of its decisions and recommendations.  There is much to be gained from this process, but it is not an easy one.

It is important to remember that keeping business context is crucial, to some degree, for all members of the IT team, especially those making recommendations, but the ability to judge business need, understand high level workflows, grasp financial ramifications and seek opportunity is shared between IT management (the CIO, Director of IT, etc.) and the IT department as a whole.  Non-managerial technical members need not panic that their lack of holistic business vision and acumen will keep them from adequately performing their role within the business context, though it does limit their ability to provide meaningful guidance to the business outside of narrow scopes.  Even common job roles, such as deskside support, need some understanding of the fiscal responsibilities of the IT department, such as recognizing when the cost of fixing a low cost component may far exceed the cost of replacing it with one that is new and, potentially, better.

Solution Elegance

It is very easy, when working in IT, to become focused on big, complex solutions.  It seems that this is where the good solutions must lie – big solutions, lots of software, all the latest gadgets.  What we do is exciting and it is very easy to get caught up in the momentum.  It’s fun to do challenging, big projects.  Hearing what other IT pros are doing, how other companies solve challenges and talking to vendors with large systems to sell all adds to the excitement.  It is easy to lose a sense of scope and goal, and big, over the top solutions to simple problems are so common that it can seem like this is just how IT is.

But it need not be.  Complexity is the enemy of both reliability and security.  Unnecessarily complex solutions increase cost in acquisition, implementation and maintenance, while being generally slower, more fragile and presenting a larger attack surface that is harder to comprehend and protect.  Simple, or more appropriately, elegant solutions are the best approach.  This does not mean that all designs will be simple, not at all.  Complex designs are often required.  IT is hardly a field that has any lack of complexity.  In fact it is often believed that software development may be the most complex of all human endeavors, at least of those partaken of on any scale.  A typical IT installation includes millions of lines of code, hundreds or thousands of protocols, large numbers of interconnected systems, layers of unique software configurations and more settings than any team could possibly know, and only then do we add in the complexity of hundreds or thousands or hundreds of thousands of unpredictable, irrational humans trying to use these systems, each in a unique way.  IT is, without a doubt, complex.

What is important is to recognize that IT is complex, that this cannot be avoided completely, and to focus on designing and engineering solutions to be as simple, as graceful… as elegant as possible.  This design idea comes from, at least in my mind, software engineering, where complex code is seen as a mistake and simple, beautiful code that is easy to read and easy to understand is considered successful.  One of the highest accolades that can be bestowed upon a software engineer is for her code to be deemed elegant.  How apropos that this famous quote (loosely translated from French) is attributed to Blaise Pascal, after whom one of the most popular programming languages of the 1970s and 1980s was named: “I am sorry I have had to write you such a long letter, but I did not have time to write you a short one.”

It is often far easier to design complex, convoluted solutions than it is to determine what simple approach would suffice.  Whether we are in a hurry or don’t know where to begin an investigation, elegance is always a challenge.  The industry momentum is to promote the more difficult path.  It is in vendors’ interest to sell more gear, not only for the initial sale but because more equipment brings more support dollars; if enough new, complex equipment is sold, support needs stop increasing linearly and begin to increase geometrically, as additional support is needed not just for the equipment or software itself but also for the configuration and care of system interactions and further customization.  The financial influences behind complexity are great, and they do not stop with vendors.  IT professionals gain much job security, or the illusion of it, by managing large sets of hardware and software that are difficult to seamlessly transition to another IT professional.

Often complexity is so assumed, so expected, that the process of selecting a solution begins with great complexity as a foregone conclusion, without any consideration for the possibility that a less complex solution might suffice or even be superior, setting aside the question of complexity and cost itself.  Complexity is sometimes so completely tied to certain concepts that I have actually faced incredulity at the notion that a simple solution might outperform a complex one in price, performance and reliability.

Rhetoric is easy, but what is a real world example?  The best examples that I see today are mostly related to virtualization, whether vis-à-vis storage or a cloud management layer or software or just virtualization itself.  I see quite frequently that a conversation involving just virtualization instantly connotes, for some people, networked shared block storage, expensive virtualization management software, many redundant virtualization nodes and complex high availability software – none of which are intrinsic to virtualization and most of which are rarely in the interest of the business for whom they will be implemented.  Rather than working from business requirements, these concepts arise predominantly from technology preconceptions.  It is simple to point to complexity and appear to be solving a problem – complexity creates a sense of comfort.  Filter many arguments down and you’ll hear “How can it not work, it’s complex?”  Complexity provides an illusion of completeness, of having solved a problem, but it can hide the fact that a solution may not actually be complete or even functional; the degree of complexity makes this difficult to see.  Our minds then resist accepting that a simpler approach can be more complete and solve a problem that a complex one does not, because it feels so counter-intuitive.

A great example of this is that we resort to discussing redundancy rather than reliability.  Reliability is difficult to measure; redundancy is simple to quantify.  A brick is highly reliable, even on its own.  It does not take redundancy for a brick to be stable and robust; its design is simple.  You could make a supporting structure out of many redundant sticks that would not be nearly as reliable as a single brick.  If you talk in terms of reliability – the chance that the structure will not fail – it is clear that the brick is a superior choice to several sticks.  But if you say “there is no redundancy, the brick could fail and there is nothing to take its place,” you sound silly.  When talking about computers and computer systems, however, we find systems so complex that people rarely see whether they have a brick or a stick, and so, since they cannot determine reliability, which matters, they focus on the easy-to-quantify redundancy, which does not.  The entire system is too complex, but seeking the simple solution, the one that directly addresses the crux of the problem, can reduce complexity and provide a far better answer in the end.
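
To put rough numbers on the brick-and-sticks analogy, here is a minimal sketch using invented survival probabilities and the simplifying assumptions of independent failures and perfect failover – assumptions that real systems rarely satisfy:

```python
# Toy model: invented probabilities, independent failures, perfect failover.

def redundant_survival(per_unit: float, units: int) -> float:
    """Probability that at least one of `units` independent components survives."""
    return 1 - (1 - per_unit) ** units

brick = 0.9999                          # one simple, highly reliable component
sticks = redundant_survival(0.95, 2)    # two redundant, less reliable components

print(f"single 'brick':     {brick:.4f} chance of surviving the period")
print(f"redundant 'sticks': {sticks:.4f} chance of surviving the period")
# single 'brick':     0.9999
# redundant 'sticks': 0.9975
```

Even under these ideal assumptions the redundant pair loses to the single reliable component, and in practice the failover mechanism adds failure modes of its own while the components are rarely truly independent, so the gap tends to be wider than the sketch suggests.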

This can even be seen in RAID.  Mirrored RAID is simple: one disk or set of disks is an exact copy of another set.  It’s so simple.  Parity RAID is complex, with calculations on a variable stripe across many devices that must be encoded when data is written and decoded should a device fail.  Mirrored RAID lacks this complexity and solves the problem of disk reliability through simple, elegant copy operations that are highly reliable and very well understood.  Parity RAID is unnecessarily complex, and that complexity makes it fragile.  Yet in undermining its own ability to solve the problem for which it was designed, it simultaneously becomes seemingly more reliable based on no factor other than its own complexity.  The human mind immediately jumps to “it’s complex, therefore it is more advanced, therefore it is more reliable,” but neither progression is a logical one.  Complexity does not imply that something is more advanced, and being advanced does not imply that it is reliable, but the human mind itself is complex and easily misled.
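
To make the contrast concrete, here is a toy sketch of the two recovery paths – illustrative only, since real RAID operates on blocks and stripes inside a controller or software layer, not on little byte strings:

```python
# Toy illustration of mirror recovery versus parity (RAID 5 style) recovery.
from functools import reduce

# Two data blocks that would live on two data members of an array.
block_1 = b"\x10\x20\x30"
block_2 = b"\x0a\x0b\x0c"

# Mirroring: each member holds an exact copy, so recovery is a plain copy
# of the surviving member.
mirror_of_block_1 = bytes(block_1)      # kept in sync on the mirror disk
recovered = mirror_of_block_1           # a "rebuild" is just a copy
assert recovered == block_1

# Parity: the parity block is the XOR of the data blocks...
def xor_blocks(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

parity = reduce(xor_blocks, [block_1, block_2])

# ...and rebuilding a lost block means reading every surviving block in the
# stripe and XOR-ing them back together.
rebuilt_block_1 = xor_blocks(parity, block_2)
assert rebuilt_block_1 == block_1
```

Every write to a parity array also implies recalculating and rewriting the parity block, which is where much of the fragility and performance cost comes from; a mirror simply writes the same data twice.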

There is no simple answer for finding simplicity.  Knowing that complexity is bad by nature but unavoidable at times teaches us to be mindful; however, it does not teach us when to suspect over-complexity.  We must be vigilant, always seeking to determine whether a more elegant answer exists and never accepting complexity as the correct answer simply because it is complex.  We need to question proposed solutions and question ourselves.  “Is this solution really as simple as it should be?”  “Is this complexity necessary?”  “Does this require the complexity that I had assumed?”

In most system design recommendations that I give, the first technical step that I take, after inquiring as to the business need being solved, is to question complexity.  If complexity cannot be defended, it is probably unnecessary and actively defeating the purpose for which it was chosen.

“Is it really necessary to split those drives into many separate arrays?  If so, what is the technical justification for doing so?”

“Is shared storage really necessary for the task that you are proposing it for?”

“Does the business really justify the use of distributed high availability technologies?”

“Why are we replacing a simple system that was adequate yesterday with a dramatically more complex system tomorrow?  What has changed so that a major improvement, while remaining simple, is no longer more than enough, and orders of magnitude more complexity and spending that weren’t justified previously are now required?”

These are just common examples; complexity exists in every aspect of our industry.  Look for simplicity.  Strive for elegance.  Do not accept complexity without rigorously vetting it.  Put it through the proverbial wringer.  Do not allow complexity to creep in where it is not warranted.  Do not err on the side of complexity; when in doubt, fail simply.  Oversimplifying a solution typically results in a minor failure, while making it overly complex allows for a far greater degree of failure.  The safer bet is the simpler solution.  And if a simple solution is chosen and proves inadequate, it is far easier to add complexity than it is to remove it.

The History of Array Splitting

Much of the rote knowledge of the IT field, especially that of the SMB field, arose in the very late 1990s based on a variety of factors.  The biggest factors were that suddenly smaller and smaller businesses were rushing to computerize, Microsoft had gotten Windows NT 4 so stable that there was a standard base for all SMB IT to center around, the Internet era had finally taken hold and Microsoft had introduced the certification and training programs that reshaped knowledge dissemination in the industry.  Put together, this created a need for new training and best practices and caused a massive burst of new thinking, writing, documentation, training, best practices, rules of thumb and so on.

For a few years nearly the entire field was trained on the same small knowledge set, many rules of thumb became de facto standards, and much of the knowledge of the time was learned by rote and passed from mentor to intern in a cycle that carried the technical knowledge of 1998 into the unquestioned, set-in-stone processes of 2012.  At the time this was effective because the practices were relevant, but that was roughly fifteen years ago; technology, economics, use cases and knowledge have changed significantly since then.

One of the best examples of this was the famous Microsoft SQL Server recommendation of RAID 1 for the operating system, RAID 5 for the database files and another RAID 1 for the logs.  This setup has endured for nearly the entire life of the product and was so well promoted that it has spread into almost all aspects of server design in the SMB space.  The use of RAID 1 for the operating system and RAID 5 for data is so pervasive that it is often simply assumed without any consideration as to why this was recommended at the time.

Let’s investigate the history and see why R1/5/1 was good in 1998 and why it should not exist today.  Keep some perspective in mind: the gap between when these recommendations first came out (as early as 1995) and today is immense.  Go back, mentally, to 1995 and consider the equivalent gap at the time.  It would have been like basing early Internet-age recommendations on the home computing needs of the first round of Apple ][ owners!  The 8-bit home computer era was just barely getting started in 1978.  Commodore was still two years away from releasing its first home computer (the VIC-20) and would go through the entire Commodore and Commodore Amiga eras, go bankrupt and vanish, all before 1995.  The Apple ][+ was still a year away.  People were just about to start using analogue cassette drives as storage.  COBOL and Fortran were the only serious business languages in use.  Basically, the gap is incredible.  Things change.

First, we need to look at the factors that existed in the late 1990s that created the need for our historic setup.

  1. Drives were small, very small.  A large database array might have been four 2.1GB SCSI drives in an R5 array for just ~6GB of usable storage space on a single array.  The failure domain for parity RAID failure was tiny (compared to things like URE fail rates.)
  2. Drive connection technologies were parallel and slow.  The hard drives of the time were only slightly slower than drives are today but the connection technologies represented a considerable bottleneck.  It was common to split traffic to allow for reduced bus bottlenecks.
  3. SCSI drive technology was the only one used for servers.  The use of PATA (called IDE at the time) in a server was unthinkable.
  4. Drives were expensive per gigabyte so cost savings was the key issue, while maintaining capacity, for effectively all businesses.
  5. Filesystems were fragile and failed more often than drives.
  6. Hardware RAID was required and only basic RAID levels of 1 and 5 were commonly available.  RAID 6 and RAID 10 were years away from being accessible to most businesses.  RAID 0 is discounted as it has no redundancy.
  7. Storage systems were rarely, if ever, shared between servers so access was almost always dedicated to a single request queue.
  8. Storage caches were tiny or did not exist making drive access limitations pass directly onto the operating system.  This meant having different arrays with different characteristics to handle different read/write or random/sequential access mixes.
  9. Drive failure was common and the principle concern of storage system design.
  10. Drive array size was often limited by physical constraints, so array splitting decisions were frequently made out of necessity, not choice.
  11. A combination of the above factors meant that RAID 1 was best for some parts of the system where small size was acceptable and access was highly sequential or write heavy and RAID 5 was best for others where capacity outweighed reliability and where access was highly random and read heavy.

In the nearly two decades since the original recommendations were released, all of these factors have changed.  In some cases the changes cascade: the move from general use RAID 5 to general use RAID 10 means that what would have been the two common array types, RAID 1 and RAID 10, now share access characteristics, so the need or desire to use one or the other depending on load type is gone.

  1. Drives are now massive.  Rather than struggling to squeeze what we need onto them, we generally have excess capacity.  Single drives over a terabyte are common, even in servers.  Failure domains for parity are massive (compared to things like URE fail rates.)
  2. Drive connections are serial and fast.  The drive connections are no longer a bottleneck.
  3. SATA is now common on servers skewing potential risks for URE in a way that did not exist previously.
  4. Capacity is now cheap but performance and reliability are now the key concerns for dollars spent.
  5. Filesystems are highly robust today and filesystem failures are “background noise” in the greater picture of array reliability.
  6. Hardware RAID and software RAID are both options today and available RAID levels include many options but, most importantly, RAID 10 is available ubiquitously.
  7. Storage systems are commonly shared making sequential access even less common.
  8. Storage caches are common and often very large.  512MB and 1GB caches are considered normal today, large enough that many entire arrays from 1995 would fit into memory on the RAID controller.  With caches growing rapidly relative to storage capacity and with the recent addition of solid state drives as L2 cache in the last two years, it is not out of the question for even a small business to have databases and other performance sensitive applications running completely from cache.
  9. Drive failure is uncommon and of trivial concern to storage system design (compared to other failure types.)
  10. Drive array size is rarely limited by physical limitations.
  11. The use of RAID 1 and RAID 10 as the principle array types today means that there is no benefit to using different array levels for performance tuning.

These factors highlight why the split array system of 1995 made perfect sense at the time and why it does not make sense today.  OBR10 (one big RAID 10), today’s standard, was unavailable at the time and cost prohibitive.  RAID 5 was relatively safe in 1995, but it is not today.  Nearly every factor involved in the decision process has changed dramatically in the last seventeen years and is going to continue to change as SSDs become more common along with auto-tiering, even larger caches and pure SSD storage systems.
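
A rough, back-of-the-envelope illustration of why RAID 5’s safety has eroded: the commonly published unrecoverable read error (URE) rate for consumer SATA drives is on the order of one error per 10^14 bits read (enterprise drives are typically rated an order of magnitude better), and a parity rebuild must successfully read every surviving bit.  Treating UREs as independent events at that published rate, the exposure scales with the amount of data read:

```python
import math

def rebuild_ure_probability(bytes_read: float, ure_rate_bits: float = 1e14) -> float:
    """Chance of hitting at least one URE while reading `bytes_read` bytes,
    assuming independent errors at a fixed published rate (a simplification)."""
    bits = bytes_read * 8
    # expm1/log1p keep the arithmetic accurate for very small per-bit rates
    return -math.expm1(bits * math.log1p(-1.0 / ure_rate_bits))

# Late-1990s array: four 2.1GB drives in RAID 5; a rebuild reads roughly 6.3GB.
print(f"1998-style rebuild: {rebuild_ure_probability(6.3e9):.2%}")   # about 0.05%
# Modern array: four 2TB SATA drives in RAID 5; a rebuild reads roughly 6TB.
print(f"Modern rebuild:     {rebuild_ure_probability(6e12):.2%}")    # about 38%
```

The exact figures depend on the drives and on how conservative the published rates really are, but the orders of magnitude show why a parity failure domain that was trivial in 1998 has become a serious design consideration today.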

The change in storage design over the last two decades also highlights a danger that IT faces: a large portion of the field learns, as is common in engineering, basic “rules of thumb” or “best practices” without necessarily understanding the underlying principles that drive those decisions, making it difficult to know when not to apply those best practices or, even more importantly, when to recognize that a rule no longer applies.  Unlike traditional mechanical or civil engineering, where new advances and significant changes in fundamentals may occur once or possibly never over the course of a career, IT still changes fast enough that complete rethinks of basic rules of thumb are required several times through a career.  Maybe not annually, but once per decade or more is almost always necessary.

The current move from uniprocessing to multithreaded architectures is another similar, significant change requiring the IT field to completely change how system design is handled.
