
The Risks of Licensing

There are so many kinds of risk to address and consider in IT systems that it is easy to overlook the non-technical ones, especially those we rarely address directly, such as licensing. But licensing carries risks, and costs, that must be considered in everything that we do in IT.

As I write this article, the risks of licensing are very fresh in the news. Just yesterday, one of the largest and best known cloud computing providers suffered a global, three hour outage that was later attributed to accidentally allowing some of its licensing to expire. One relatively minor component in the infrastructure stack, backed by massive global redundancy, was reduced to worthlessness in a stroke when its license lapsed. Having licensing dependencies means having to manage them carefully. Some licenses are more dangerous than others: some only leave you exposed to audits, while others create outages or data loss.

Licensing risk may be intentional, as in the example above where the license expired and the equipment stopped working. Or it can be less intentional, such as remote kill switches, date confusion or misconfiguration causing systems to fail. Either way, it is a risk that must be considered and, quite often, mitigated. The risk of critical systems time bombing or dying in irreparable ways can be very dangerous. Unlike hardware or software failure, there is often no way to repair the system without access to the vendor. A vendor that may be offline, may be out of support, may no longer support the product, may have technical issues of its own or may even be out of business!
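One simple mitigation is to treat license expiration like any other monitored resource and warn well before anything lapses. The sketch below is a minimal, hypothetical example assuming a hand-maintained inventory in a CSV file (license_inventory.csv with name and expires_on columns); the file name, format and warning threshold are illustrative placeholders, not tied to any particular product.

```python
# Minimal sketch of a license-expiry check, assuming a hand-maintained
# CSV inventory with columns: name,expires_on (YYYY-MM-DD). Illustrative only.
import csv
from datetime import date, datetime

WARN_DAYS = 90  # start warning three months before expiration


def expiring_licenses(path="license_inventory.csv", warn_days=WARN_DAYS):
    """Return (name, days_left) for every license expiring within warn_days."""
    today = date.today()
    warnings = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            expires = datetime.strptime(row["expires_on"], "%Y-%m-%d").date()
            days_left = (expires - today).days
            if days_left <= warn_days:
                warnings.append((row["name"], days_left))
    return sorted(warnings, key=lambda item: item[1])


if __name__ == "__main__":
    for name, days_left in expiring_licenses():
        print(f"WARNING: {name} expires in {days_left} days")
```

Run from a scheduler or monitoring hook, a check like this turns a surprise outage into a routine renewal task with months of lead time.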

Licensing outages often give a vendor extreme leverage: the vendor can charge nearly any amount it wants for renewed licensing during a pending or, worse, already ongoing outage. Under that pressure, customers may easily pay many times the normal price for licensing just to get systems back online and restore customer confidence.

While some licensing represents extreme risk and some merely an inconvenience, this risk must be evaluated and understood. In my own experience, I have seen a foreign software vendor revoke the licensing on critical software simply to force a purchasing discussion, causing large losses to environments with little legal recourse, all because the vendor had the ability to remotely kill systems via licensing, even for paying customers. Such behavior is generally illegal and certainly unethical, yet customers often have little recourse in these situations.

And of course, many license issues are technical or accidental: licensing servers go offline, systems break, accidents happen. Systems that are designed to become inaccessible when they cannot validate their licenses carry an entire category of risk that other types of systems do not, a risk that is more common than people realize and often among the hardest to mitigate.

Beyond these kinds of risks, licensing also carries overhead which, as always, is a form of risk, which in turn is a form of cost. Researching, acquiring, tracking and maintaining licenses, even ones that could never cripple your infrastructure, takes time, and time is money. And licensing always carries the risk that you will buy too little (or buy incorrectly) and be exposed to audits, or buy too much and overspend. In any of these cases, the cost must be calculated into the overall TCO of the solution, yet it is often ignored.
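To make that concrete, even a rough back-of-the-envelope model shows how quickly administrative time and audit exposure change the picture. The figures and names in this sketch are entirely hypothetical placeholders; the point is only that licensing TCO is more than the sticker price.

```python
# Hypothetical back-of-the-envelope licensing TCO sketch.
# All figures are made-up placeholders; substitute your own numbers.
LICENSE_COST_PER_YEAR = 12_000   # annual license fees
ADMIN_HOURS_PER_YEAR = 40        # researching, tracking, renewing, audit prep
HOURLY_RATE = 85                 # loaded cost of staff time per hour
AUDIT_RISK_RESERVE = 2_000       # estimated annual exposure to true-ups

admin_overhead = ADMIN_HOURS_PER_YEAR * HOURLY_RATE
annual_licensing_tco = LICENSE_COST_PER_YEAR + admin_overhead + AUDIT_RISK_RESERVE

print(f"Admin overhead:       ${admin_overhead:,}")
print(f"Annual licensing TCO: ${annual_licensing_tco:,}")
```

Even with modest assumptions, the non-license portion is a meaningful share of the total, which is exactly the part that rarely shows up in solution comparisons.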

Licensing time and costs are often among the more significant costs of a solution, but because they are ignored it can be extremely difficult to understand how they play into the long term financial picture, especially as they often go on to impact other decisions in various ways.

Licensing is just a fact of life in IT, but one that is hardly cool or interesting, so it is often ignored or, at least, not discussed heavily. Being mindful that licensing has costs to manage just like any other aspect of IT, and that it carries risk, potentially very large risk, that needs to be addressed, is simply part of good IT decision making.

If It Ain’t Broke, Don’t Fix It

We’ve all heard that plenty, right?  “If it ain’t broke, don’t fix it.”  People use it everywhere as a way to discourage improvements, modernization or refactoring. As with many phrases of this nature, it seems reasonable on the surface. But in application it really is not, or at least not as it is normally used, because it is not well understood.

This is very similar to telling people not to put all of their eggs in one basket, a phrase that is more often than not applied to situations where the eggs-and-basket analogy does not apply or is inverted from reality. Because it is a memorized phrase, people forget that there is a metaphor that needs to hold up for it to work. It can lead to horrible decisions because it invokes an irrational fear founded on nothing.

Likewise, the idea of not fixing things that are not broken comes from the theory that something that is perfectly good and functional should not be taken apart and messed with just for the sake of messing with it. This makes sense. But for some reason, this logic is almost never applied to things where it would make sense (I am not even sure of a good example of one of those), and instead gets applied to complex systems that require regular maintenance and upkeep in order to work properly.

Of course, if your shoe is not broken, don’t tear it apart and attempt to glue it back together. But your business infrastructure systems are nothing like a shoe. They are living systems with enormous levels of complexity that function in an ever changing landscape. They require constant maintenance, oversight, updating and so forth to remain healthy. Much like a car, but dramatically more so.

You never, we hope, hear someone tell you that you don’t need to change the oil in your car until the engine has seized. Of course not; even though it is not yet broken, the point of maintenance is to keep it from breaking. We know with a car that if we wait until it breaks, it will be really broken. Likewise, we would not refuse to put air in the tires until the flat tires had ripped right off of the wheels. It just makes no sense.

Telling someone not to maintain systems until it is too late is the same as telling them to break those systems. A properly maintained car might last hundreds of thousands of miles, maybe more. One without oil will be lucky to make it across town. Taking care of the engine that you have, rather than buying a new one every few days, means you might go a lifetime without ever destroying an engine.

The same goes for your business infrastructure. Code ages, systems wear out, new technology emerges, new needs arise, the network interacts with the outside world, new features are required, vulnerabilities and fragility are identified and fixed, updates come out, new attacks are developed and so forth. Even if new features were never created, systems would need to be diligently managed and maintained in order to ensure safe and reliable operation, like a car but a thousand times more complex.

In terms of IT systems, broken means unnecessarily exposed to hacking, data theft, data loss, downtime and inefficiency. In the real world, we should consider a system broken the moment that maintenance is needed. How much ransomware would not be a threat today if systems were simply maintained properly? As IT professionals, we need to stand up and explain that unmaintained systems are already broken; disaster just hasn’t struck yet.

If we were to follow the mantra of “if it ain’t broke, don’t fix it” in IT, we would wait *until* our data was stolen to patch vulnerabilities, or wait until data was unrecoverable to find out whether we had working backups. Of course, that makes no sense. But this is what is often suggested when people tell you not to fix your systems until they break: they are telling you to let them break! Push back; don’t accept that kind of advice. Explain that the purpose of good IT maintenance is to keep systems from breaking whenever possible. Avoiding disaster, rather than inviting it.

Virtualize Domain Controllers

One would think that the idea of virtualizing Active Directory Domain Controllers would not be a topic needing discussion, and yet I find that the question arises regularly as to whether or not AD DCs should be virtualized.  In theory, there is no need to ask this question because we have far more general guidance in the industry that tells us that all possible workloads should be virtualized and AD certainly presents no special cases with which to create an exception to this long standing and general rule.

Oddly, people regularly go out seeking clarification on this one particular workload, and if you go looking for bad advice, someone is sure to provide it. Tons of people post advice recommending physical servers for Active Directory, but rarely, if ever, with any explanation as to why they would recommend violating best practices at all, let alone for such a mundane and well known workload.

As to why people implementing AD DCs decide that it warrants specific investigation around virtualization when no other workload does, I cannot answer.  But after many years of research into this phenomenon I do have some insight into the source of the reckless advice around physical deployments.

The first mistake comes from a general misunderstanding of what virtualization even is. Sadly, it is incredibly common for people to think that virtualization means consolidation, which of course it does not. They take that mistake and apply the false logic that consolidation means consolidating two AD DCs onto the same physical host. It also requires the leap of assuming that there will always be two or more AD DCs, which is another common belief. So three large mistakes in logic come together to produce some very bad advice that, if you dig into the recommendations, you can normally trace back to this chain of reasoning. This seems to be the root of the majority of the bad advice.

Other cases come from misunderstanding actual best practices, such as the phrase “if you have two AD DCs, each needs to be on a separate physical host.”  This statement tells us that two physically separate machines are needed in this scenario, which is absolutely correct. But it does not imply that either of them should go without a hypervisor, only that two different hosts are needed. The wording of this kind of advice is easy to misread if you do not already understand that a non-virtual workload is essentially never acceptable. Read with that understanding, its meaning is clear and, hopefully, obvious. Sadly, the recommendation often gets repeated out of context, so the underlying meaning is easily lost.

Long ago, as in around a decade ago, some virtualization platforms had issues around timing and system clocks that could play havoc with clustered database systems like Active Directory. This was a legitimate issue at the time, but it was solved long ago, as it had to be for many different workloads. It created a perception that AD might need special treatment, however, and that perception lingers on even though it has been a generation or two, in IT terms, since this was an issue and it should have long since been forgotten.

Another myth leading to bad advice is rooted in the fact that AD DCs, like other clustered databases, should not be restored from snapshots when running in a clustered mode, as restoring only one node of the cluster in that manner easily creates database corruption. This, however, is a general property of storage and databases and is not related to virtualization at all. The same caution applies to physical AD DCs just the same. That snapshots are tied to virtualization is another myth; virtualization implies no such storage artifact.

Still other myths arise from a belief that virtualization must rely on Active Directory itself in order to function, and that AD therefore has to run without virtualization. This is completely a myth and nonsensical; there is no such circular requirement.

Sadly, some areas of technology have given rise to large scale myths, often many of them, that surround them and make it difficult to figure out the truth. Virtualization is just complex enough that many people attempt to learn not just how to use it, but what it is conceptually, by rote, giving rise to misconceptions so far afield that it can be hard to recognize that that is really what we are seeing. And in a case like this, misconceptions around virtualization, history, clustered databases, high availability techniques, storage and more add up, layer upon layer, making it hard to figure out how so many things can come together around one deployment question.

At the end of the day, few workloads are as ideally suited to virtualization as Active Directory Domain Controllers are.  There is no case where the idea of using a physical bare metal operating system deployment for a DC should be considered – virtualize every time.

Hiring IT: Speed Matters

After decades of IT hiring, something that I have learned is that companies serious about hiring top talent always make hiring decisions very quickly.  They may spend months or even years looking for someone that is a right fit for the organization, but once they have found them they take action immediately.

This happens for many reasons. But mostly it comes down to wanting to secure resources once they have been identified. Finding good people is an expensive and time consuming process. Once you have found someone considered the right fit for the need and the organization, there is a strong need to reduce risk by securing them as quickly as possible. A delay in making an offer presents an opportunity for that person to receive another offer or decide to go in a different direction. Spending months seeking a good candidate, only to lose them because of a delay of a few hours or days in making an offer, is a ridiculous way to lose money.

Delays in hiring suggest either that the decision has not yet been made or that the process has not been given priority, and that other decisions or actions inside the company are seen as more important than staffing. And, of course, it may be true that other things are more important.

Other factors being more important is exactly the kind of thing that potential candidates worry about. Legitimate priorities might include major disasters inside the company, which are not a good sign in general. Or worse, maybe the company just doesn’t see acquiring the best talent as important, and the delays are caused by vacations, parties, normal work or not even being sure that they want to hire anyone at all.

It is extremely common for companies to go through hiring rounds just to “see what is out there.”  This does not necessarily mean that they will not hire if the right person comes along, but it easily means that the hiring is not fully approved or funded and might not even be possible. Candidates go through this regularly; a great interview might result in no further action, so they know better than to sit around waiting on positions, even ones that seem very likely. The risks are too high, and if a different, good opportunity comes along, they will normally move ahead with it. Few things signal more clearly that an offer is not forthcoming, or that a job is not an ideal one, than delays in the hiring process.

Candidates, especially senior ones, know that good jobs hire quickly. So if an offer has not arrived promptly, they often assume that offers are being made to other candidates or that something else is wrong. Either way, candidates know to move on.

If hiring is to be a true priority in an organization, it must be prioritized. This should go without saying, but good hiring slips through the cracks more often than not. It is far too often treated as a background activity, approached casually and haphazardly. It is no wonder that so many organizations waste countless hours on unnecessary candidate searches and interviews, and untold time attempting to fill positions, when for all intents and purposes they are turning away their best options all the while.