Category Archives: Business of IT

Understanding Technical Debt

From Wikipedia: “Technical debt (also known as design debt or code debt) is ‘a concept in programming that reflects the extra development work that arises when code that is easy to implement in the short run is used instead of applying the best overall solution’.

Technical debt can be compared to monetary debt. If technical debt is not repaid, it can accumulate ‘interest’, making it harder to implement changes later on. Unaddressed technical debt increases software entropy. Technical debt is not necessarily a bad thing, and sometimes (e.g., as a proof-of-concept) technical debt is required to move projects forward. On the other hand, some experts claim that the “technical debt” metaphor tends to minimize the impact, which results in insufficient prioritization of the necessary work to correct it.”

The concept of technical debt comes from the software engineering world, but it applies to the world of IT and business infrastructure just as much.  As in software engineering, we design our systems and our networks, and taking shortcuts in our designs, which includes working with less than ideal designs, incorporating existing hardware and other bad design practices, produces technical debt.  One of the more significant forms of this comes from investing in the “past” rather than in the “future” and is quite often triggered through the sunk cost fallacy (a.k.a. throwing good money after bad.)

It is easy to see this happening in businesses every day.  New plans are made for the future, but before they are implemented investments are made in making an old system design continue working, work better, expand or whatever.  This investment then either turns into a nearly immediate financial loss or, more often, becomes incentive to not invest in the future designs as quickly, as thoroughly or, possibly, at all.  The investment in the past can become crippling in the worst cases.

This happens in numerous ways and is generally unintentional.  Often investments are needed to keep an existing system running properly and, under normal conditions, would simply be made.  But in a situation where there is a future change that is needed or potentially planned this investment can be problematic.  Better cost analysis and triage planning can remedy this, in many cases, though.

In a non-technical example, imagine owning an older car that has served well but is due for retirement in three months.  In three months you plan to invest in a new car because the old one is no longer cost effective due to continuous maintenance needs, lower efficiency and so forth.  But before your three-month plan to buy a new car comes around, the old car suffers a minor failure and now requires a significant investment to keep it running.  Putting money into the old car would be a new investment in the technical debt.  Rather than spending a large amount of money to make an old car run for a few months, moving up the timetable to buy the new one is obviously drastically more financially sound.  With cars, we see this easily (in most cases.)  We save money, potentially a lot of it, by quickly buying a new car.  If we were to invest heavily in the old one, we either lose that investment in a few months or we risk changing the solid financial planning that was already made for the purchase of a new car.  Both cases are bad financially.

IT works the same way.  Spending a large sum of money to maintain an old email system six months before a planned migration to a hosted email system would likely be very foolish.  The investment is either lost nearly immediately when the old system is decommissioned or it undermines our good planning processes and leads us to not migrate as planned and do a sub-par job for our businesses because we allowed technical debt to drive our decision making rather than proper planning.

Often a poor triage operation, or a failure to give the people doing the triage proper authority, can be the factor that causes emergency technical debt investments rather than rapid, future-looking investments.  This is only one area where major improvements may address issues, but it is a major one.  This can also be mitigated, in some cases, through “what if” planning: having investment plans in place contingent on common or expected emergencies that might arise, which may be as simple as capacity expansion needs due to growth that happen before systems planning comes into play.

Another great example of common technical debt is server storage capacity expansion.  This is a scenario that I see with some frequency and demonstrates technical debt well.  It is common for a company to purchase servers that lack large internal storage capacity.  Either immediately or sometime down the road more capacity is needed.  If this happens immediately we can see that the server purchased was a form of technical debt in improper design and obviously represents a flaw in the planning and purchasing process.

But a more common example is needing to expand storage two or three years after a server has been purchased.  Common expansion choices include adding an external storage array to attach to the server or modifying the server to accept more local storage.  Both of these approaches tend to be large investments in an already old server, a server that is easily forty percent or more of the way through its useful lifespan.  In many cases the same or only slightly higher investment in a completely new server can result in new hardware, faster CPUs, more RAM, the storage needed, purpose designed and built, aligned and refreshed support lifespan, smaller datacenter footprint, lower power consumption, newer technologies and features, better vendor relationships and more, all while retaining the original server to reuse, retire or resell.  One way spends money supporting the past, the other often can spend comparable money on the future.
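One rough way to compare the two paths is to spread each spend over the years of service it actually buys.  The sketch below is only an illustration: the class name, the five-year lifespan and every dollar figure are hypothetical assumptions, not numbers from a real purchase.

```python
from dataclasses import dataclass


@dataclass
class StorageOption:
    name: str
    cost: float              # one-time spend in dollars (hypothetical)
    years_of_service: float  # useful life that the spend actually buys

    @property
    def cost_per_year(self) -> float:
        return self.cost / self.years_of_service


# Assumed: a five-year useful lifespan and a server that is two years old,
# i.e. forty percent of the way through its life.
LIFESPAN_YEARS = 5
SERVER_AGE_YEARS = 2
remaining_years = LIFESPAN_YEARS - SERVER_AGE_YEARS

options = [
    StorageOption("expand old server", cost=9_000, years_of_service=remaining_years),
    StorageOption("buy new server", cost=10_000, years_of_service=LIFESPAN_YEARS),
]

for option in options:
    print(f"{option.name}: ${option.cost_per_year:,.0f} per year of service")

# The option with the lowest spend per year of service it buys.
best = min(options, key=lambda o: o.cost_per_year)
print(f"Lower cost per year: {best.name}")
```

With these made-up figures the slightly more expensive new server is the cheaper choice per year of service, and that is before counting the retained value of the original server.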

Technical debt is a crippling factor for many businesses.  It increases the cost of IT, sometimes significantly, and can lead to high levels of risk through a lack of planning and most spending being emergency based.

 

No One Ever Got Fired For Buying…

It was the 1980s when I first heard this phrase in IT and it was “no one ever got fired for buying IBM.”  The idea was that IBM was so well known, trusted and reliable that it was the safe choice as a vendor for a technology decision maker to select.  As long as you chose IBM, you were not going to get in trouble, no matter how costly or ineffective the resulting solution turned out to be.

The statement on its own feels like a simple one.  It makes for an excellent marketing message and IBM, understandably, loved it.  But it is what is implied by the message that causes so much concern.

First, we need to understand what the role of the IT decision maker in question is.  This might sound simple, but it is surprising how easily it can be overlooked.  Once we delve into the ramifications of the statement itself, it is far too easy to lose track of the real goals. In the role of a decision maker, the IT professional is tasked with selecting the best solution for their organization based on its ability to meet organizational goals (normally profits).  This means evaluating options, shielding non-technical management from sales people and marketing, understanding the marketplace, research and careful evaluation.  These things seem obvious, until we begin to put things into practice.

What we have to then analyze is not that “no one ever got fired for choosing product X”, but what the ramifications of such a statement actually are.

First, the statement implies an organization that is going to judge IT decision making not on its merits or applicability but on the brand name recognition of the product maker.  For a statement like this to have any truth behind it, the entire organization must not only lack either the ability or the desire to evaluate decisions, but must also prefer to see large, expensive brand names over other alternatives (the statement is always made in conjunction with items that are extremely high in cost compared to the alternatives).  An organizational preference towards expensive, harder to justify spends is a dangerous one at best.  We assume not only that buying the most expensive, most famous products will be judged well compared to buying less expensive or less well known ones, but also that buying products is seen as preferable to not buying products, even though often the best IT decisions are to not buy things when no need exists.  Prioritizing spending over savings for its own sake, without consideration for the business need, is very bad indeed.

Second, now that we realize the organizational reality that this implies, we have a strong question of ethics: the IT decision maker is willing to seize this opportunity to leverage corporate politics as a means of avoiding the time and effort of making a true assessment of the needs of the business, skipping that process possibly completely.  Essentially, whether out of fear that the organization will not properly evaluate the results or will blame the decision maker for unforeseeable events after the fact, or out of a desire to take advantage of the situation and be paid for a job that was not done, we have a significant problem either individually, organizationally, or both.

For any IT decision maker to use this mindset, one in which there is safety in a given decision regardless of suitability, there has to be a fundamental distrust of the organization.  Whether this is true of the organization or not is not known, but the IT decision maker has to believe it for such a thought to even exist.  In many organizations it is understandable that politics trump good decision making and it is far more important to make decisions for which you cannot be blamed rather than trying honestly to do a good job.  That is sad enough on its own, but so often it is simply an opportunity to skip the very job for which the IT decision maker is hired and paid: instead of doing a difficult job that requires deep business and technical knowledge, market research, cost analysis and more, the decision maker simply allows a vendor to sell whatever they want to the business.

At best, it would seem, we have an IT decision maker with little to no faith in the ethics or capabilities of those above them in the organization.  At worst we have someone actively attempting to take advantage of a business by being paid to be a key decision maker while, instead of doing the job for which they are hired or even doing nothing at all, actively putting their weight behind a vendor that was not properly evaluated based possibly solely on not needing to do any of the work themselves.

What should worry an organization is not that vendors that could often be considered “safe” get recommended or selected, but rather why they were selected.  Vendors that fall into this category often offer many great products and solutions or they would not earn this reputation in the first place.  But likewise, after gaining such a reputation those same vendors have a strong financial incentive to take advantage of this culture and charge more while delivering less as they are not being selected, in many cases, on their merits but instead on their name, reputation or marketing prowess.

How does an organization address this effect?  There are two ways.  One is to evaluate all decisions carefully in a post mortem structure to understand what good decisions look like and not limit post mortems to obviously failed projects.  The second is to look more critically, rather than less critically, at popular product and solution decisions, as these are red flags that decision making may be being skipped or undertaken with less than the appropriate rigor.  Popular companies, assumed standard approaches, and solutions found commonly in advertising or commonly recommended by sales people, resellers, and vendors should be looked at with a discerning eye, more so than less common, more politically “risky” choices.

 

Buyers and Sellers Agents in IT

When dealing with real estate purchases, we have discrete roles defined legally as to when a real estate agent represents the seller or when they represent the buyer.  Each party gets clear documentation as to how they are being represented.  In both cases, the agent is bound by honesty and ethical limitations, but beyond that their obligations are to their represented party.

Outside of the real estate world, most of us do not deal with buyer’s agents very often.  Seller’s agents are everywhere; we just call them salespeople.  We deal with them at many stores and they are especially evident when we go to buy something large, like a car.

In business, buyer’s agents are actually pretty common and come in some interesting and unspoken forms.  Rarely does anyone talk about buyer’s agents in business terms, mostly because we are not talking about buying objects but about buying solutions, services or designs.  Identifying buyer’s and seller’s agents alone can become confusing and, often, companies may not even recognize when a transaction of this nature is taking place.

We mostly see the engagement of sellers – they are the vendors with products and services that they want us to purchase.  We can pretty readily identify the seller’s agents that are involved.  These include primarily the staff of the vendor itself and the sales people (which includes pre-sales engineering and any “technical” resource that gets compensation by means of the sale rather than being explicitly engaged and remunerated to represent your own interests) of the resellers (resellers being a blanket term for any company that is compensated for selling products, services or ideas that they themselves do not produce; this commonly includes value added resellers and stores.)  The seller’s side is easy.  Are they making money by somehow getting me to buy something?  If so… seller’s agent.

Buyer’s agents are more difficult to recognize.  So much so that it is common for businesses to forget to engage them, overlook them or confuse seller’s agents for them.  Sadly, outside of real estate, the strict codes of conduct and legal oversight do not exist, and ensuring that a seller’s agent is not engaged mistakenly where a buyer’s agent should be is purely up to the organization engaging said parties.

Buyer’s agents come in many forms but the most common, yet hardest to recognize, is the IT department or staff, themselves.  This may seem like a strange thought, but the IT department acts as a technical representative of the business and, because they are not the business themselves directly, an emotional stop gap that can aid in reducing the effects of marketing and sales tactics while helping to ensure that technical needs are met.  The IT team is the most important buyer’s agent in the IT supply chain and the last line of defense for companies to ensure that they are engaging well and getting the services, products and advice that they need.

Commonly, IT departments will engage consulting services to aid in decision making.  The paid consulting firm is the most identifiable buyer’s agent in the process and the one that is most often skipped (or a seller’s agent is mistaken for the consultant.)  A consultant is hired by, paid by and has an ethical responsibility to represent the buyer.  Consultants have an additional air gap that helps to separate them from the emotional responses common to the business itself.  The business and its internal IT staff are easily motivated by having “cool solutions” or expensive “toys” or can be easily caused to panic through good marketing, but consultants have many advantages.

Consultants have the advantage that they are often specialists in the area in question or at least spend their time dealing with many vendors, resellers, products, ideas and customer needs.  They can more easily take a broad view of needs and bring a different type of experience to the decision table.

Consultants are not the ones who, at the end of the day, get to “own” the products, services or solutions in question and are generally judged on their ability to aid the business effectively.  Because of this they have a distinct advantage in being more emotionally distant and therefore more objective in deciding on recommendations.  The coolest, newest solutions have little effect on them while cost effectiveness and business viability do.  More importantly, consultants and internal IT working together provide an important balancing of biases, experience and business understandings that combine the broad experience across many vendors and customers of the one, and the deep understanding of the individual business of the other.

One can actually think of the Buyer’s and Seller’s Agent system as a “stack”.  When a business needs to acquire new services, products or to get advice, the ideal and full stack would look something like this: Business > IT Department > ITSP/Consultants <> Value Added Reseller < Distributor < Vendor.  The <> denotes the reflection point between the buyer’s side and the seller’s side.  Of course, many transactions will not involve and should not involve the entire stack.  But this visualization can be effective in understanding how these pieces are “designed” to interface with each other.  The business should ideally get the final options from IT (IT can be outsourced, of course), IT should interface through an ITSP consultant in many cases, and so forth.  An important part of the process is keeping actors on the left side of the stack (or the bottom) from having direct contact with those high up in the stack (or on the right), because this can short circuit the protections that the system provides, allowing vendors or sales staff to influence the business without the buyer’s agents being able to vet the information.
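The stack can be sketched as a simple ordered list, which makes it easy to see which buyer’s agents are skipped when two parties talk directly.  This is only an illustration of the idea: the layer names come from the stack described above, while the function name and the way the buyer’s side is marked are assumptions made for the example.

```python
# The stack from the text, ordered from the buyer's end to the seller's end.
STACK = [
    "Business",
    "IT Department",
    "ITSP/Consultants",
    "Value Added Reseller",
    "Distributor",
    "Vendor",
]

# Everything up to and including this layer sits on the buyer's side of "<>".
LAST_BUYER_SIDE = "ITSP/Consultants"


def bypassed_buyer_agents(party_a: str, party_b: str) -> list[str]:
    """Return the buyer's-side layers skipped when two parties talk directly."""
    low, high = sorted((STACK.index(party_a), STACK.index(party_b)))
    boundary = STACK.index(LAST_BUYER_SIDE)
    between = STACK[low + 1 : high]
    return [layer for layer in between if STACK.index(layer) <= boundary]


print(bypassed_buyer_agents("Business", "Vendor"))
# → ['IT Department', 'ITSP/Consultants']
print(bypassed_buyer_agents("Business", "IT Department"))
# → []
```

A non-empty result is the “short circuit” case: someone on the seller’s side is reaching the business without the listed buyer’s agents vetting the information.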

Identifying, understanding and leveraging the buyer’s and seller’s agent system is important to getting good, solid advice and sales for any business and is widely applicable far outside of IT.

The Emperor’s New Storage

We all know the story of the Emperor’s New Clothes.  In Hans Christian Andersen’s telling of the classic tale we have some unscrupulous cloth vendors who convince the emperor that they have clothes made from a fabric with the magical property of only being visible to people who are fit for their positions.  The emperor, not being able to see the clothes, decides to buy them because he fears people finding out that he cannot see them.  Everyone in the kingdom pretends to see them as well – all sharing the same fear.  It is a brilliant sales tactic because it puts everyone on the same team: the cloth sellers, the emperor, the people in the street all share a common goal that requires them to all maintain the same lie.  Only when a little boy who cares naught about his status in society but only about the truth points out that the emperor is naked is everyone free to admit that they don’t see the clothes either.

And this brings us to the storage market today.  Today we have storage vendors desperate to sell solutions of dubious value and buyers who often lack the confidence in their own storage knowledge to dare to question the vendors in front of management or who simply have turned to vendors to make their IT decisions on their behalf.  This has created a scenario where vendor confidence and industry uncertainty have engendered market momentum, causing the entire situation to snowball.  The effect is that using big, monolithic and expensive storage systems is so accepted today that often systems are purchased without any thought at all.  They are essentially a foregone conclusion!

It is time for someone to point at the storage buying process and declare that the emperor is, in fact, naked.

Don’t get me wrong.  I certainly do not mean to imply that modern storage solutions do not have value.  Most certainly they do.  Large SAN and NAS shared storage systems have driven much technological development and have excellent use cases.  They were not designed without value, but they do not apply to every scenario.

The idea of the inverted pyramid design, the overuse of SANs where they do not apply, came about because they are high profit margin approaches.  Manufacturers have a huge incentive to push these products and designs because they do much to generate profits.  SANs are one of the most profit-bearing products on the market.  This, in turn, incentivizes resellers to push SANs as well, both to generate profits directly through their sales and to keep their vendors happy.  This creates a large amount of market pressure by which everyone on the “sales” side of the buyer / seller equation has massive pressure to convince you, the buyer, that a SAN is absolutely necessary.  This is so strong a pressure, the incentives so large, that even losing the majority of potential customers in the process is worth it, because the margins on the one customer that goes with the approach are generally worth losing many others.

Resellers are not the only “in between” players with incentive to see large, complex storage architectures get deployed.  Even non-reseller consultants have an incentive to promote this approach because it is big, complex and requires, on average, far more consulting and support than do simpler system designs.  This is unlikely to be a trivial number.  Instead of a ten hour engagement, they may win a hundred hours, for example, and for consultants those hours are bread and butter.

Of course, the media has incentive to promote this, too.  The vendors provide the financial support for most media in the industry and much of the content.  Media outlets want to promote the design because it promotes their sponsors and they also want to talk about the things that people are interested in and simple designs do not generate a lot of readership.  This is the same problem that exists with sensationalist news: the most important or relevant news is often skipped so that news that will gather viewership is shown instead.

This combination of factors is very forceful.  Companies that look to consultants, resellers and VARs, and vendors for guidance will get a unanimous push for expensive, complex and high margin storage systems.  Everyone, even the consultants who are supposed to be representing the client, has a pretty big incentive to let these complex designs get approved because there is just so much money potentially sitting on the table.  You might get paid one hour of consulting time to recommend against overspending, but might be paid hundreds of hours for implementing and supporting the final system.  That’s likely tens of thousands of dollars difference, a lot of incentive, even for the smallest deployments.
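The size of that incentive is easy to put into numbers.  The figures below are purely hypothetical: the hourly rate and the two-hundred-hour engagement are assumptions picked to illustrate the arithmetic, not real billing data.

```python
HOURLY_RATE = 150.0  # hypothetical consulting rate in dollars


def engagement_revenue(hours: float, rate: float = HOURLY_RATE) -> float:
    """Revenue a consultant earns from an engagement of the given length."""
    return hours * rate


# One hour spent recommending against the purchase...
advise_against = engagement_revenue(1)
# ...versus an assumed 200 hours implementing and supporting it.
implement_and_support = engagement_revenue(200)

incentive_gap = implement_and_support - advise_against
print(f"Advising against the purchase: ${advise_against:,.0f}")
print(f"Implementing and supporting it: ${implement_and_support:,.0f}")
print(f"Incentive gap: ${incentive_gap:,.0f}")
```

Under these assumptions the consultant gives up close to thirty thousand dollars by giving the advice that is best for the client.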

This unification of the sales channel and even the front line of “protection” has an extreme effect.  Our only real hope, the only significant one, for someone who is not incentivized to participate in this system is the internal IT staff themselves.  And yet we find that internal staff will very rarely stand up to the vendors on these recommendations, and will often even produce the same recommendations themselves.

There are many reasons why well intentioned internal IT staff (and even external ones) may fail to properly assess needs such as these.  There are a great many factors involved and I will highlight some of them.

  • Little information in the market.  Because no company makes money by selling you less, there is almost no market literature, discussions or material to assist in evaluating decisions.  Without direct access to another business that has made the same decision or to any consultants or vendors promoting an alternative approach, IT professionals are often left all alone.  This lack of supporting experience is enough to cause sufficient doubt to squash dissenting voices.
  • Management often prefers flashy advertising and the word of sales people over the opinions of internal staff.  This is a hard fact, but one that is often true.  IT professionals often face the fact that management may make buying decisions without any technical input whatsoever.
  • Any bid process immediately short circuits good design.  A bid would have to include “storage” and SAN vendors can easily bid on supplying storage while there is no meaningful way for “nothing” to bid on it.  Because there is no vendor for good design, good design has no voice in a bidding or quote based approach.
  • Lack of knowledge.  Often dealing with system architecture and storage concerns is a one-off activity only handled a few times over an entire career.  Making these decisions is not just uncommon, it is often the very first time that it has ever been done.  Even if the knowledge is there, the confidence to buck the trend easily is not.
  • Inexperience in assessing risk and cost profiles.  While these things may seem like bread and butter to IT management, often the person tasked with dealing with system design in these cases will have no training and no experience in determining comparative cost and risk in complex systems such as these.  It is common that risk goes unidentified.
  • Internal staff often see this big and costly purchase as a badge of honour or a means to bragging rights.  They are excited to show off how much they were able to spend and how big their new systems are.  Everyone loves gadgets and these are often the biggest, most expensive toys that we ever touch in our industry.
  • Internal staff often have no access to work with equipment of this type, especially SANs.  Getting a large storage solution in house may allow them to improve their resume and even leverage the experience into a raise or, more likely, a new job.
  • Turning to other IT professionals who have tackled similar situations often results in the same advice as from sales people.  This is for several reasons.  All of the reasons above, of course, would have applied to them plus one very strong one: self preservation.  Any IT professional that has implemented a very costly system unnecessarily will have a lot of incentive to state that they believe that the purchase was a good one.  This may be irrational “reverse rationalization” (the trait where humans tend to apply reason after the fact to a decision that lacked reason when originally made), it may be because they fear that their job may be in jeopardy if it was found out what they had done, or it may be because they have not assessed the value of the system after implementation.  It is even possible that their factors were not the same as yours and the design was applicable to their needs.

The bottom line is that basically everyone, no matter what role they play, from vendors to sales people to those that do implementation and support to even your friends in similar job roles to strangers on Internet forums, has a big incentive to promote costly and risky storage architectures in the small and medium business space.  There is, for all intents and purposes, no one with a clear benefit for providing a counterpoint to this marketing and sales momentum.  And, of course, as momentum has grown the situation has become more and more entrenched, with people even describing the questioning of the status quo and the asking of critical questions as irrational or reckless.

As with any decision in IT, however, we have to ask “does this provide the appropriate value to meet the needs of the organization?”  Storage and system architectural design is one of the most critical and expensive decisions that we will make in a typical IT shop.  Of all of the things that we do, treating this decision as a knee-jerk, foregone conclusion without doing due diligence and not looking to address our company’s specific goals could be one of the most damaging that we make.

Bad decisions in this area are not readily apparent.  The same factors that lead to the initial bad decisions will also hide the fact that a bad decision was made much of the time.  If the issue is that the solution carries too much risk, there is no means to determine that better after implementation than before; such is the nature of risk.  If the system never fails we don’t know if that is normal or if we got lucky.  If it fails we don’t know if this is common or if we were one in a million.  So observation of risk from within a single implementation, or even hundreds of implementations, gives us no statistically meaningful insight.  Likewise when evaluating wasteful expenditures we would have caught a financial waste before the purchase just as easily as after it.  So we are left without any ability for a business to do a post mortem on their decision, nor is there an incentive as no one involved in the process would want to risk exposing a bad decision making process.  Even companies that want to know if they have done well will almost never have a good way of determining this.
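A small calculation shows why a single failure-free deployment tells us so little.  The sketch below assumes, purely for illustration, that each year carries an independent chance of failure; both annual failure probabilities are invented numbers, not measurements of any real design.

```python
def chance_of_no_failure(annual_failure_probability: float, years: int) -> float:
    """Probability of seeing zero failures, assuming independent years."""
    return (1.0 - annual_failure_probability) ** years


YEARS_OBSERVED = 3

# Hypothetical annual failure probabilities for two competing designs.
safer = chance_of_no_failure(0.02, YEARS_OBSERVED)
riskier = chance_of_no_failure(0.10, YEARS_OBSERVED)

print(f"Safer design, no failure in {YEARS_OBSERVED} years: {safer:.0%}")
print(f"Riskier design, no failure in {YEARS_OBSERVED} years: {riskier:.0%}")
```

With these numbers the riskier design is five times as likely to fail in any given year, yet it still runs clean for three years roughly 73% of the time, against roughly 94% for the safer one.  A business that sees no failures cannot tell from that alone which design it bought.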

What makes this determination even harder is that the same architectures that are foolish and reckless for one company may be completely sensible for another.  The use of a SAN based storage system and a large number of attached hosts is a common and sensible approach to controlling costs of storage in extremely large environments.  Nearly every enterprise will utilize this design and it normally makes sense, but is used for very different reasons and goals than apply to nearly any small or medium business.  It is also, generally, implemented somewhat differently.  It is not that SANs or similar storage are bad.  What is bad is allowing market pressure, sales people and those with strong incentives to “sell” a costly solution to drive technical decision making instead of evaluating business needs, risk and cost analysis and implementing the right solution for the organization’s specific goals.

It is time that we, as an industry, recognize that the emperor is not wearing any clothes.  We need to be the innocent children who point, laugh and question why no one else has been saying anything when it is so obvious that he is naked.  The storage and architectural solutions so broadly accepted benefit far too many people and the only ones who are truly hurt by them (business owners and investors) are not in a position to understand if they do or do not meet their needs.  We need to break past the comfort provided by the socially accepted plausible deniability of understanding, or of culpability for not evaluating.  We must take responsibility for protecting our organizations and provide solutions that address their needs rather than the needs of the sales people.

 

For more information see: When to Consider a SAN and The Inverted Pyramid of Doom