Getting Started with IT Certifications

This question surfaces regularly: you are at the beginning of your IT career, or maybe have not even started it yet, and are wondering where to get started with certifications. Maybe you are in high school, maybe you have finished college, or perhaps you are six months into your first job and feel that a certification will help move you forward. There are a lot of options and a lot of information about IT industry certifications out there, but the advice around getting started usually comes down to just a few basic opinions. I will share mine, having worked in the certification industry for many years and having spent time both as a hiring manager and as a corporate career counselor in IT.

Certifications that often get mentioned for people “starting” in IT include the CompTIA A+, CompTIA Network+ (often called the Net+), Microsoft’s MTA certifications and Cisco’s CCNA.

When just getting started in IT, though, I recommend starting with a firm foundation. Some certs, like those from Microsoft and Cisco, might be great but begin to take you down very specific career paths which may or may not be the right ones for you. We should hold off on those kinds of certifications until we have a little of the basics firmly under our belts. They can be great as a next step, but we do not want to get ahead of ourselves.

It is also extremely important to note that the Microsoft MTA exams are “pre-professional” certs, not certs for IT Pros. They are not meant to demonstrate a level of skill for an existing IT Pro, or even to show that someone is ready to work as an IT Pro, but instead to show that someone is ready to intern in IT or to attend IT classes. The MTA is targeted at high school students to take after entry level high school classes and is too low level to be considered even at the college level, let alone at the working level. You should never include these on a resume, even if you have them, once you are working in the field. They are excellent for showing initiative in high school classes but should not be treated as goals of their own.

CompTIA is a vendor neutral certification authority and so tends to be a good starting spot in IT. CompTIA also focuses on more entry level and broad certifications than most other providers. This makes them exceptionally well suited to entry level folks looking to certify themselves before moving into more specific career paths. And because they tend to focus on foundational knowledge, the effort spent certifying is rarely wasted at an educational level either.

CompTIA has two major certs that are generally considered here, the A+ and the Network+. The A+ is by far the more well known outside of IT circles, and this is actually because it is not an IT certification at all. The A+ was originally designed to certify the level of experience of someone who had worked on a helpdesk for six months. However, the knowledge tested by the A+ generally covers archaic hardware and tasks that do not exist in IT at all but belong to another, related, field of “bench work.” No amount of IT experience, even decades of it, would prepare you for the exam. This makes the A+ specifically targeted at bench careers, and it has become the industry standard in that area, which includes local computer stores, Best Buy’s Geek Squad, Staples and other non-IT computer “fix it” shops. The skills tested by the A+ are too “low” to be useful for testing in IT and focus on aspects of computers that are rarely, if ever, of concern to IT.

The A+ tends to focus very heavily on hardware and physical repair of consumer equipment. It does not cover tasks common to any level or style of IT. While some entry level IT areas would consider the knowledge in it common, most IT disciplines would not see it as foundational or useful, and even the most senior IT professionals would often find it obscure at best.

CompTIA’s other general purpose certification is the Network+, originally designed to represent the level of knowledge expected after “two years” working in IT. Both of these assessments are very poor. The A+ represents general knowledge, or low level archaic knowledge, that you would hope the general public would have, and the Network+ really represents the level of knowledge that you would look for in a new hire taking their first IT job. The Network+ is therefore not a differentiator between candidates but more of a foundational level of knowledge and a standard requirement. That does not make it bad to have; it makes it good. The Network+, unlike its counterpart, does indeed focus on common and very important IT knowledge that those seeking a career in the field should most certainly have, or acquire if lacking.

The Network+ represents standard knowledge useful for effectively any IT position or career no matter what technology or area of focus one chooses to pursue. For someone looking to go after their very first job or for someone looking to establish that they are well qualified in their first position or even for someone just looking to prepare themselves for the world of certification testing, the Network+ is an ideal starting point.

It is very unlikely that a Network+ on its own is going to lead to a job or promotion, but it establishes a starting point for looking towards other things. It is, more or less, the final “standard” starting point for nearly everyone in the IT field today. Many will not take the Network+ and certainly there are many options to enter the field without it, but I personally recommend it to everyone in every focus of IT. The knowledge needed for it will be useful throughout a career. As a starting point for a certification portfolio it is unrivaled.

The Network+, as the name implies, focuses almost exclusively on networking knowledge. This does not mean that it is only suitable for those interested in networking related IT careers. Networking is a part of everything that we do in IT today and is even important knowledge for non-IT users who want to understand their computers and their networks better. Even very non-network jobs like database administration would benefit from a firm foundation in networking.

Moving forward from the Network+, the world of certifications opens up and this begins a much more complex discussion. CompTIA offers other good general purpose certifications, such as the Security+, but at this stage we should be prepared to begin a bit of soul searching to determine exactly what path we want our careers to take from here. There are so many aspects of the IT field that there is no way to provide a solid, reliable next step without looking at both short term and long term career goals and interests.

Making the Best of Your Inverted Pyramid of Doom

The 3-2-1 or Inverted Pyramid of Doom architecture has become an IT industry pariah for many reasons. Sadly for many companies, they only learn about the dangers associated with this design after the components have arrived and the money has left the accounts.

Some companies are lucky and catch this mistake early enough to be able to return their purchases and start over with a proper design and decision phase prior to the acquisition of new hardware and software. This, however, is an ideal and very rare situation. At best we can normally expect restocking fees and, far more commonly, the equipment cannot be returned at all or the fees are so large as to make it pointless.

What most companies face is a need to “make the best” of the situation moving forward. One of the biggest concerns is that the concerned parties, whether the financial stakeholders who have just spent a lot of money on the new hardware or the technical stakeholders who now look bad for having allowed this equipment to be purchased, will succumb to an emotional reaction and give in to the sunk cost fallacy. It is vital that this emotional, illogical reaction not be allowed to take hold as it will undermine critical decision making.

It must be understood that the money spent on the inverted pyramid of doom has already been spent and is gone. That the money was wasted, or how much was wasted, is irrelevant to decision making at this point. Whether the system was a gift or cost a billion dollars does not matter; that money is gone and now we have to make do with what we have. A potential “trick” here would be to bring in a financial decision maker like a CFO, explain that there is about to be an emotional reaction to money already spent, and discuss the sunk cost fallacy before talking about the actual problem. That way people are aware and logical, and the person trained (we hope) to best handle this kind of situation is there and ready to head off sunk cost emotions. Careful handling of a potentially emotionally-fueled reaction is important. This is not the time to attempt to cover up either the financial or the technical missteps, which is what the emotional reaction pushes people to do. It is necessary for all parties to communicate and remain detached and logical in order to address the needs. Some companies handle this well. Many do not, and become caught trying to forge forward with bad decisions that were already made, probably in the hope that nothing bad happens and that no one remembers or notices. Fight that reaction. Everyone has it; it is the natural amygdala “fight or flight” emotional response.

Now that we are ready to fight the emotional reactions to the problem we can begin to address “where do we go from here.” The good news is that where we are is generally a position of having “too much” rather than “too little.” So we have an opportunity to be a little creative. Thankfully there are generally good options that can allow us to move in several directions.

One thing that is very important to note is that we are looking at solutions exclusively that are more reliable, not less reliable, than the intended inverted pyramid of doom architecture that we are replacing. An IPOD is a very fragile and dangerous design and we could go to great lengths demonstrating concepts like risk analysis, single points of failure, the fallacies of false redundancy, looking at redundancy instead of reliability, dependency chains, etc. but what is absolutely critical for all parties to understand is that a single server, running with local storage is more reliable than the entire IPOD infrastructure would be. This is so important that it has to be said again: if a single server is “standard availability”, the IPOD is lower than that. More risky. If anyone at this stage fears a “lack of redundancy” or a “lack of complexity” in the resulting solutions we have to come back to this – nothing that we will discuss is as risky as what had already been designed and purchased. If there is any fear of risk going forward, the fear should have been greater before we improved the reliability of the design. This cannot be overstated. IPODs sell because they easily confuse those not trained in risk analysis and look reliable when, in fact, they are anything but.
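The reliability claim above can be checked with some rough arithmetic. What follows is a minimal sketch, not a measurement: it assumes that every device (server, switch, storage) has the same hypothetical availability of 99%, that failures are independent, and that the IPOD is modeled as three servers, two switches and one shared storage device in a dependency chain. The function names and all figures are illustrative assumptions.

```python
from math import prod

def parallel(availabilities):
    """Redundant components: the layer is down only if all of them are down."""
    return 1 - prod(1 - a for a in availabilities)

def series(availabilities):
    """Dependency chain: the stack is up only if every layer is up."""
    return prod(availabilities)

# Hypothetical figure: each individual device is up 99% of the time.
DEVICE = 0.99

single_server = DEVICE

ipod = series([
    parallel([DEVICE] * 3),  # three compute nodes
    parallel([DEVICE] * 2),  # two switches
    DEVICE,                  # one shared storage device (the "point")
])

print(f"single server: {single_server:.4%}")
print(f"IPOD (3-2-1):  {ipod:.4%}")
```

Under these assumptions the chain can never be more available than its single storage device, and every additional layer that it depends on pulls the figure slightly lower, so the IPOD lands just below the single server.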

Understanding the above, and using a technique called “reading back,” the accepted IPOD architecture tells us that the company in question was accepting of not having high availability (or even standard availability) at the time of purchasing the IPOD. Perhaps they believed that they were getting that, but the architecture could not provide it, and so moving forward we have the option of “making do” with nothing more than a single server running on its own local storage. This is simple and easy and improves on nearly every aspect of the intended IPOD design. It costs less to run and maintain, is often faster and is much less complex while being slightly more reliable.

But simply dropping down to a single server and hoping to find uses for the rest of the purchased equipment “elsewhere” is likely not going to be our best option. It does have its place: in situations where the IPOD had been meant to be used only for a single workload or set of workloads, and other areas of the business have need for equipment as well, it can be very beneficial to go to the “single server” approach for the intended IPOD workload and utilize the remaining equipment elsewhere in the business.

The most common approach to take with repurposing an IPOD stack is to reconfigure the two (or more) compute nodes to be full stack nodes containing their own storage. Depending on what storage has already been purchased, this step may require no purchases at all, only a movement of drives between systems, or often the relatively small purchase of additional hard drives for this purpose.

These nodes can then be configured into one of two high availability models. In the past a common design choice, for cost reasons, was to use an asynchronous replication model (often known as the Veeam approach) that replicates virtual machines between the nodes and allows VMs to be powered up very rapidly. With this model, the downtime from the moment of compute node failure until recovery can be as little as a few minutes.

Today fully synchronous fault tolerance is so commonly available for free that it has effectively replaced the asynchronous model in nearly all cases. In this model storage is replicated in real time between the compute nodes, allowing failover to happen instantly rather than with a delay of a few minutes, and with zero data loss instead of a small data loss window (i.e. an RPO of zero).
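The difference in data loss between the two models can be illustrated with a toy simulation. This is only a sketch under stated assumptions: replication cycles are assumed to complete instantly at fixed intervals, time is counted in whole minutes, and the function name, workload and figures are all hypothetical rather than taken from any particular product.

```python
def lost_writes(write_times, failure_time, replication_interval=None):
    """Return the writes held by the failed node but not by its partner.

    replication_interval=None models synchronous replication: a write is
    acknowledged only once both nodes have it, so nothing can be lost.
    """
    if replication_interval is None:
        return []
    committed = [t for t in write_times if t <= failure_time]
    # Asynchronous: the partner holds data as of the last completed cycle.
    last_sync = (failure_time // replication_interval) * replication_interval
    return [t for t in committed if t > last_sync]

# Hypothetical workload: one write per minute, node fails at minute 14.
writes = list(range(1, 61))
async_lost = lost_writes(writes, failure_time=14, replication_interval=5)
sync_lost = lost_writes(writes, failure_time=14)

print("asynchronous, writes lost:", len(async_lost))
print("synchronous, writes lost: ", len(sync_lost))
```

In this model the asynchronous data loss window is bounded by the replication interval, which is why shortening the interval narrows the window but only the synchronous model closes it.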

At this point it seems to be common for people to react to replication with a fear of the loss of storage capacity that it causes. Of course the loss is real. But it must be understood that it is this replication, missing from the original IPOD design, that provides the firm foundation for high reliability. If this replication is skipped, high availability is an unobtainable dream and running individual compute nodes on local storage in a “stand alone” mode is the most reliable potential option. High availability solutions rely on replication and redundancy to build the reliability necessary to qualify as high availability.
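The capacity cost of replication is simple to estimate. The sketch below assumes two nodes with a hypothetical 8 TB of local storage each and ignores any RAID overhead inside each node; the function name and figures are illustrative only.

```python
def usable_capacity_tb(per_node_tb, nodes, copies):
    """Usable capacity when every block is stored `copies` times across nodes."""
    if copies > nodes:
        raise ValueError("cannot keep more copies than there are nodes")
    return per_node_tb * nodes / copies

# Hypothetical figures: two nodes with 8 TB of local storage each.
standalone = usable_capacity_tb(8, nodes=2, copies=1)  # no replication
mirrored = usable_capacity_tb(8, nodes=2, copies=2)    # full two-node mirror

print(f"stand alone nodes: {standalone} TB usable")
print(f"replicated pair:   {mirrored} TB usable")
```

Half of the raw capacity goes to the second copy in the mirrored case, and that second copy is exactly what the high availability is built on.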

This solves the question of what to do with our compute nodes but leaves us with what we can do with our external shared storage device, the single point of failure or the “point” of the inverted pyramid design. To answer this question we should start by looking at what this storage might be.

There are three common types of storage devices that would be used in an inverted pyramid design: DAS, SAN and NAS. We can lump DAS and SAN together as they are two different aspects of block storage and can be used essentially interchangeably in our discussion; they are only differentiated by the existence of switching, which can be added or removed as needed in our designs. NAS differs by being file storage rather than block storage.

In both cases, block (DAS or SAN) or file (NAS) storage, one of the most common usages for this now superfluous device is as a backup target for our new virtualization infrastructure. In many cases the device may be overkill for this task, generally with more performance and many more features than are needed for a simple backup target, but good backup storage is important for any critical business infrastructure and erring on the side of overkill is not necessarily a bad thing. Businesses often attempt to skimp on their backup infrastructures and this is an opportunity to invest heavily in one without spending any extra money.

In the same vein as backup storage, the external storage device could be repurposed as archival storage or another “lower tier” of storage where high availability is not warranted. This is a less common approach, generally because every business needs a good backup system but only some have a way to leverage an archival storage tier.

Beyond these two common and universal storage models, a common use case for external storage devices, especially if the device is a NAS, is to leverage it in its native role as a file server separate from the virtualization infrastructure. For many businesses file serving is not as uptime critical as the core virtualization infrastructure, and backups are far easier to maintain and manage. Offloading file serving to an already purchased NAS device reduces the demands on the virtualization infrastructure in two ways: it reduces the number of VMs that need to be run there, and it moves what is typically one of the largest users of storage to a separate device, which can lower both the performance and the capacity requirements of the virtualization infrastructure. By doing this we potentially reduce the cost of obtaining the additional hard drives needed for local storage on the compute nodes, as mentioned earlier, and so this can be a very popular method for many companies to address their repurposing needs.

Every company is unique and there are potentially many places where spare storage equipment could be effectively used, from labs to archives to tiered storage. A little creativity and thinking outside of the box can take your unique set of available equipment and your business’ unique set of needs and demands and find the best place to use this equipment, where it is decoupled from the core, critical virtualization infrastructure but can still bring value to the organization. By avoiding the inverted pyramid of doom we can obtain the maximum value from the equipment that we have already invested in, rather than implementing fresh technical debt that we then have to work to overcome unnecessarily.

Why We Avoid Contract to Hire

Information Technology workers are bombarded with “Contract to Hire” positions, often daily.  There are reasons why this method of hiring and working is fundamentally wrong.  Workers immediately identify these positions as bad choices to make, but few really take the time to move beyond emotional reaction to understand why this working method is so flawed and, more importantly, few companies take the time to explore why using tactics such as this undermines their staffing goals.

To begin we must understand that there are two basic types of technology workers: consultants (also called contractors) and permanent employees (commonly known as FTEs).  Nearly all IT workers desire to be in one of these two categories.  Neither is better or worse; they are simply two different approaches to employment engagements and represent differences in personality, career goals, life situations and so forth.  Workers do not always get to work the way that they desire, but basically all IT workers seek to be in either one camp or the other.

Understanding the desires and motivations of IT workers seeking to be full time employees is generally very easy to do.  Employees, in theory, have good salaries, stable work situations, comfort, continuity, benefits, vacations, protection and so forth.  At least this is how it seems; whether these aspects are real or just illusory can be debated elsewhere.  What is important is that most people understand why people want to be employees, but the opposite is rarely true.  Many people lack empathy for those seeking to not be employees.

Understanding professional or intentional consultants can be difficult.  Consultants live a less settled life but generally earn higher salaries and advance in their careers faster, see more diverse environments, get a better chance to learn and grow, are pushed harder and have more flexibility.  There are many factors which can make consulting or contracting intentionally a sensible decision.  Intentional contracting is very often favored by younger professionals looking to grow quickly and gain experience that they otherwise could not obtain.

What makes this matter more confusing is that the majority of workers in IT wish to work as full time employees but a great many end up settling for contract positions to hold them over until a desired full time position can be acquired.  This is so common that a great many people, both inside and outside of the industry and on both sides of the interview table, may mistakenly believe that all cases are this way and that consulting is a lower form of employment.  This is completely wrong.  In many cases consulting is highly desired and contractors can benefit greatly from their choice of engagement methodology.  I, myself, spent most of my early career, around fifteen years, seeking only to work as a contractor and had little desire to land a permanent post.  I wanted rapid advancement, opportunities to learn, chances to travel and variety.

It is not uncommon at all for the desired mode of employment to change over time.  It is most common for contractors to seek to move to full employment at some point in their careers.  Contracting is often exhausting and harder to sustain over a long career.  But certainly full time employees sometimes choose to move into a more mobile and adventurous contracting mode as well.  And many choose to only work one style or the other for the entirety of their careers.

Understanding these two models is key.  What does not fit into this model is the concept of a Contract to Hire.  This hiring methodology starts by hiring someone willing to work a contract position and then, sometimes after a set period of time and sometimes after an indefinite one, promises a second determination as to whether said team member should be “converted” into an employee or let go.  This does not work well when we attempt to match it up against the two types of workers.  Neither type wants to “start as one thing and then do another.”  Possibly somewhere there is an IT worker who would like to work as a contractor for four months and then become an employee, getting benefits but only after a four month delay, but I am not aware of such a person, and it is reasonable to assume that if there is such a person they are unique, have already gone through this process and would not want to do it again.

This leaves us with two resulting models to match into this situation.  The first is the more common model of an IT worker seeking permanent employment and being offered a Contract to Hire position.  For this worker the situation is not ideal.  The first four months are likely to be jarring, complex and scary, lacking the benefits and stability that are needed, and the second decision point, as to whether to offer the conversion, is very scary.  The worker must behave and plan as if there will be no conversion and must be actively seeking other opportunities during the contract period, opportunities that are pure employment from the beginning.  If there were any certainty of the position becoming a full employment one then there would be no contract period at all.  The risk to the worker that no conversion will be offered is exceptionally high.  In fact, it is almost unheard of in the industry for a conversion to actually be offered.

It must be noted that, for most IT professionals, the idea that a Contract to Hire will truly offer a conversion at the end of the contract duration is so unlikely that it is generally assumed that the enticement of the conversion process is purely a fake one and that there is no possibility of it happening at all.  And for reasons we will discover here it is obvious why companies would not honestly expect to attempt this process.  The term Contract to Hire spells almost certain unemployment for IT workers going down that path.  The “to Hire” portion is almost universally nothing more than a marketing ploy and a very dishonest one.

The other model that we must consider is the model of the contract-desiring worker accepting a Contract to Hire position.  In this model we have the better outcome for both parties.  The worker is happy with the contract arrangement and the company is able to employ someone who is happy to be there and not seeking something that they likely will be unable to get.  In cases where the company was less than forthcoming about the fact that the “to Hire” conversion would never be considered this might actually even work out well, but it is far less likely to do so long term and in repeating engagements than if both parties were up front and honest about their intentions on a regular basis.  Even for professional contractors, seeing the “to Hire” addendum is a red flag that something is amiss.

For a company, however, obtaining an intentional contractor via a Contract to Hire posting is risky.  For one, contractors are highly volatile and are skilled and trained at finding other positions.  They are generally well prepared to leave a position the moment that the original contract is done.

One reason that the term Contract to Hire is used is so that companies can easily “string along” someone desiring a conversion to a full time position by dangling the conversion like a carrot and prolonging contract situations indefinitely.  Intentional contractors will see no carrot in this situation and will normally be prepared to leave immediately upon completion of their contract time.  They can leave without any notice, as they simply need not renew their contract, leaving the company in a lurch of its own making.

Even in scenarios where an intentional contractor is offered a conversion at the end of a contract period there is the very real possibility that they will simply turn down the conversion.  Just as the company maintains the right to not offer the conversion, the IT worker maintains an equal right to not agree to the offered terms.  The conversion process is completely optional for both parties.  This, too, can leave the company in a tight position if they were banking on the assumption that all IT workers are highly desirous of permanent employment positions.

This may be the better situation, however.  Potentially even worse is an intentional contractor accepting a permanent employment position when they were not actually desiring an arrangement of that type.  They are likely to find the position to be something that they do not enjoy, or else they would have been seeking such an arrangement already, and will be easily tempted to leave for greener pastures very soon, again defeating the purpose of having hired an employee into the company.

The idea behind the Contract to Hire movement is the mistaken belief by companies that they hold all of the cards and that IT workers are all desperate for work and thankful to find any job that they can.  This, together with the incorrect assumption that nearly all IT workers truly want stable, traditional employment as a full time employee, makes for a very bad hiring situation.

Based on this, a great many companies attempt to leverage the Contract to Hire term in order to lure more and better IT workers to apply based on false promises or poor matching of employment values.  It is seen as a means of lowering cost, testing out potential employees, hedging bets against future head count needs, etc.

In a market where there is a massive oversupply of IT workers a tactic such as this may actually pay off.  In the real world, however, IT workers are in very short supply and everyone is aware of the game that companies play and what this term truly means.

It might be assumed that IT workers would still consider taking Contract to Hire because they are willing to take on some risk and hope to convince the employer that conversion, in their case, would be worthwhile.  And certainly some companies do this process and for some people it has worked out well.  However, it should be noted that any contract position offers the potential of a conversion offer, and in positions where the term “Contract to Hire” is not used, conversions, or at least offers of conversion, are actually quite common.  It is specifically when a potential future conversion is offered like a carrot that the conversions become exceptionally rare.  There is no need for an honest company and a quality workplace to mention “to Hire” when bringing on contractors.

What happens, however, is more complex and requires study.  In general the best workers in any field are those that are already employed.  It goes without saying that the better you are, the more likely you are to be employed.  This doesn’t mean that great people never change jobs or find themselves unemployed, but the better you are the less time you will spend, on average, seeking employment from a position of being unemployed, and the worse you are the more likely you are to be unemployed non-voluntarily.  That may seem obvious, but when you combine that with other information that we have, something is amiss.  A Contract to Hire position can never, effectively, entice currently working people in any way.  A great offer of true, full time employment with better pay and benefits might entice someone to give up an existing position for a better one; that happens every day.  But good people generally have good jobs and are not going to give up the positions that they have, with their safety and stability, to join an unknown situation that only offers a short term contract and a conversion carrot with almost no chance of materializing.  It just is not going to happen.

Likewise when good IT workers are unemployed they are not very likely to be in a position of desperation, and even then are very unlikely to even respond to a position listed as Contract to Hire (or contract at all), as most people want full time employment and good IT people will generally be far too busy turning down offers to waste time looking at Contract to Hire positions.  Good IT workers are flooded with employment opportunities and being able to quickly filter out those that are not serious is a necessity.  The words “Contract to Hire” are one of the best low hanging fruits of this filtering process.  You don’t need to see what company it is, what region it is in, what the position is or what experience they expect.  The position is not what you are looking for, move along, nothing to see here.

The idea that employers seem to have is that all IT workers, employed and unemployed alike, are desperate and thankful for any possible job opening.  This is completely flawed.  Most of the industry is doing very well and there is no way to fill all of the existing job openings that we have today.  IT workers are in demand.  Certainly there is always a certain segment of the IT worker population that is desperate for work for one reason or another: personal situations, geographic ties, an overstaffed technology specialization or, most commonly, not being very competitive.

What Contract to Hire positions do is filter out the best people.  They effectively filter out every currently employed IT worker completely.  In demand skill groups (like Linux, storage, cloud and virtualization) will be filtered out too, as they are too able to find work anywhere to consider poor offerings.  Highly skilled individuals, even when out of work, will self filter as they are looking for something good, not looking for just anything that comes along.

At the end of the day, the only people in any number seriously considering Contract to Hire positions, often even to the point of being the only ones even willing to respond to postings, are the truly desperate.  This is the group that either has so little experience that they do not realize how foolish the concept is or, more commonly by far, has been long out of work with few prospects and feels that the incredible risks and low quality of work associated with Contract to Hire are acceptable.

This hiring problem begins a vicious loop of low quality, if one did not already exist.  But most likely issues with quality will already exist before a company considers a Contract to Hire tactic.  Once good people begin to avoid a company, and this will happen even if only some positions are Contract to Hire because the quality of the hiring process is exposed, the quality of those able to be hired will begin to decline.  The worse it gets, the harder it is to turn the ship around.  Good people attract good people.  Good IT workers want to work with great IT workers to mentor them, to train them and to provide places where they can advance by doing a good job.  Good people do not seek to work in a shop staffed by the desperate, both because working only with desperate people is depressing and the quality of work is very poor, and because once a shop gains a poor reputation it is very hard to shake, and good people will be very wary of having their own reputation tarnished by having worked in such a place.

Contract to Hire tactics signal desperation and a willingness to admit defeat on the part of an employer.  Once a company sinks to this level with its hiring it is no longer focusing on building great teams, acquiring amazing talent or providing a wonderful work environment.  Contract to Hire is not something that every IT professional can avoid all of the time.  All of us have times when we have to accept something less than ideal.  But it is important for all parties involved to understand their options and just what it means when a company moves into this mode.  Contract to Hire is not a tactic for vetting potential hires; it simply does not work that way.  Contract to Hire causes companies to be vetted and filtered out of consideration by the bulk of potential candidates without those metrics ever being made available to hiring firms.  Potential candidates simply ignore them and write them off, sometimes noting who is hiring this way and avoiding them even when other options come along in the future.

As a company, if you desire to have a great IT department and hire good people, do not allow Contract to Hire to ever be associated with your firm.  Hire full time employees and hire intentional contractors, but do not play games with dangling false carrots, hoping that contractors will change their personalities or that full time employees will take huge personal risks for no reason.  That is simply not how the real world works.

Ferraris and Tractor Trailers

Working in the SMB world, it is actually pretty rare that we need to talk about latency.  The SMB world is almost universally focused on system throughput and generally unaware of latency as a need.  But there are times where latency becomes important and when it does it is critical that we understand the interplay of throughput and latency and just what “speed” means to us.  Once we start moving into the enterprise space, latency is more often going to be viewed as a concern, but even there throughput nearly always reigns supreme, to the point that concepts of speed almost universally revolve around throughput and concepts of latency are often ignored or forgotten.

Understanding the role of latency in a system can be complicated, even though latency itself is relatively simple to understand.

A great comparison between latency and throughput that I like to use is the idea of a Ferrari and a tractor trailer.  Ferraris are “fast” in the traditional sense: they have a high “miles per hour.”  One might say that they are designed for speed.  But are they?

We generally consider tractor trailers to be slow.  They are big and lumbering beasts that have a low top end speed.  But they haul a lot of stuff at once.

In computer terms we normally think of speed like hauling capacity; we think in terms of “items” per second.  A Ferrari going two hundred miles per hour is great, but it can haul maybe one box at a time.  A tractor trailer can only go one hundred miles per hour but can haul closer to one thousand boxes at a time.  When we talk about throughput or speed on a computer this is more what we think about.  In network terms we think of gigabits per second and are rarely concerned with the speed of an individual packet, as a single packet is rarely important.  In computational terms we think about ideas like floating point operations per second, a similar concept.  No one really cares how long a single FLOP (floating point operation) takes, only how many we can get done in one or ten seconds.

So when looking at a Ferrari we could say that it has a useful speed of two hundred box-miles per hour.  That is for every hour of operations, a Ferrari can move one box up to two hundred miles.  A tractor trailer has a useful speed of one hundred thousand box-miles per hour.  In terms of moving packages around, the throughput of the tractor trailer is easily five hundred times “faster” than that of the Ferrari.
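The arithmetic above can be checked with a short Python sketch.  The speeds and box counts are the round figures used in this article; the function name box_miles_per_hour is only an illustrative choice.

```python
def box_miles_per_hour(speed_mph: float, boxes_per_trip: int) -> float:
    """Useful speed: boxes carried multiplied by miles covered in one hour."""
    return speed_mph * boxes_per_trip


# Round figures from the article, not measurements of real vehicles.
ferrari = box_miles_per_hour(speed_mph=200, boxes_per_trip=1)
truck = box_miles_per_hour(speed_mph=100, boxes_per_trip=1000)

print(f"Ferrari: {ferrari:,.0f} box-miles per hour")
print(f"Tractor trailer: {truck:,.0f} box-miles per hour")
print(f"Throughput ratio: {truck / ferrari:.0f}x in favor of the truck")
```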

So in terms of how we normally think of computers and networks a tractor trailer would be “fast” and a Ferrari would be “slow.”

But there is also latency to consider.  Assuming that our payload is tiny, say a letter or a small box, a Ferrari can move that one box over a thousand miles in just five hours!  A tractor trailer would take ten hours to make this same journey (but could have a LOT of letters all arriving at once.)  If what we need is to get a message or a small parcel from one place to another very quickly, the Ferrari is the better choice because it has half the latency of the tractor trailer, latency being the delay from the time we initiate the delivery until the first package is delivered.
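A rough Python sketch makes the crossover between the two vehicles visible.  It uses the same round figures and adds one assumption that is not in the text above: a vehicle that cannot carry everything in one load drives back empty, at the same speed, to collect the next one.  The function name hours_to_deliver is hypothetical.

```python
import math


def hours_to_deliver(boxes: int, speed_mph: float, capacity: int, miles: float) -> float:
    """Hours until the last box arrives at the destination."""
    trips = math.ceil(boxes / capacity)   # number of loaded outbound trips
    one_way = miles / speed_mph           # hours for a single leg
    # Assumption: every load except the last needs an empty return leg.
    return (2 * trips - 1) * one_way


for boxes in (1, 10, 1000):
    ferrari = hours_to_deliver(boxes, speed_mph=200, capacity=1, miles=1000)
    truck = hours_to_deliver(boxes, speed_mph=100, capacity=1000, miles=1000)
    print(f"{boxes:>5} boxes: Ferrari {ferrari:,.0f} h, truck {truck:,.0f} h")
```

For a single box the Ferrari needs five hours and the truck ten.  At ten boxes the Ferrari already needs ninety-five hours under this assumption while the truck still needs ten, which is the point at which throughput starts to matter more than latency.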

As you can imagine, in most cases tractor trailers are vastly more practical because their delivery throughput is so much higher.  And, this being the case, we actually see large trucks on the highways all of the time and the occurrence rate of Ferraris is very low, even though each costs about the same amount to purchase (very roughly.)  But in special cases, the Ferrari makes more sense.  Just not very often.

This is a general case concept and can apply to numerous applications.  It applies to caching systems, memory, CPU, networking, operating system kernels and schedulers, to cars and more.  Low latency and high throughput generally trade off against each other: we accept more latency in order to obtain more throughput.  For most operations this makes the best sense.  But sometimes it makes more sense to tune for latency.

Storage is actually an odd duck in computing in that nearly all focus on storage performance is around IOPS, which is roughly a proxy measurement for latency, instead of throughput, which is measured in “data transferred per second.”  Rarely do we care about this second number as it is almost never the source of storage bottlenecks.  But this is the exception, not the rule.
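One way to see why IOPS behaves like a proxy for latency is Little's law: the number of operations completed per second equals the number of requests in flight divided by the time each request takes.  The sketch below is only an illustration; the latency figures are hypothetical and iops_ceiling is a made-up helper name.

```python
def iops_ceiling(latency_ms: float, queue_depth: int = 1) -> float:
    """Little's law: completions per second = requests in flight / seconds per request."""
    return queue_depth / (latency_ms / 1000.0)


# Hypothetical per-request latencies, chosen only for illustration.
for label, latency_ms in (("slow device", 10.0), ("fast device", 0.1)):
    print(f"{label}: {latency_ms} ms per request -> "
          f"{iops_ceiling(latency_ms):,.0f} IOPS at queue depth 1")
```

With one request outstanding at a time, the only way to raise the IOPS figure is to lower the latency of each request, which is why the two numbers track each other so closely.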

Latency and throughput can have some surprising interactions in the computing world.  When we talk about networks, for example, we typically measure only throughput (Gb/s) but rarely care much about the latency (normally measured in milliseconds.)  Typically this is because nearly all networking systems have similar latency numbers and most applications are pretty much unconcerned with latency delays.  It is only in rare cases, like VoIP over international links or satellite, that latency affects the average person.  It can also surprise people when they attempt something uncommon, like iSCSI over a long distance WAN connection, and latency suddenly pops up as an unforeseen problem.
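The long distance case can be sketched with a deliberately simplified model in Python.  The assumption, which is mine and not a description of how any particular iSCSI implementation behaves, is that each block must be acknowledged before the next one is sent, so every block pays one full round trip.  The block size, link speed and round-trip times are hypothetical.

```python
def effective_mbps(block_kib: int, rtt_ms: float, link_mbps: float) -> float:
    """Throughput in Mb/s when each block waits for an acknowledgement."""
    block_bits = block_kib * 1024 * 8
    wire_seconds = block_bits / (link_mbps * 1_000_000)   # time to put the block on the wire
    total_seconds = wire_seconds + rtt_ms / 1000.0         # plus one round trip per block
    return block_bits / total_seconds / 1_000_000


# Hypothetical figures: 64 KiB blocks on a 1 Gb/s link, LAN versus WAN round trips.
for label, rtt_ms in (("LAN", 0.5), ("WAN", 80.0)):
    print(f"{label} ({rtt_ms} ms RTT): {effective_mbps(64, rtt_ms, 1000):.1f} Mb/s")
```

The link is the same in both cases, yet under this model the WAN path delivers only a small fraction of the rated throughput because most of the time is spent waiting on round trips.  Real protocols keep several requests in flight to soften this, but the waiting never disappears entirely.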

One of the places where the interaction of latency and throughput starts to become shocking and interesting is when we move from electrical or optical data networks to physical ones.  A famous quote in the industry is:

Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.

This is a great demonstration of huge bandwidth with very high latency.  Driving fifty miles across town, a single station wagon or SUV could haul hundreds of petabytes of data, hitting data rates that 10Gb/s fiber could not come close to.  But the time for the first data packet to arrive is about an hour.  We often discount this kind of network because we assume that latency must be bounded at under about 500ms.  But that is not always the case.
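To put rough numbers on this, the following Python sketch compares the drive with a network link.  It assumes a 100 petabyte load (the low end of "hundreds of petabytes"), a one hour drive, decimal units, and reads the fiber figure as 10 gigabits per second.  Loading and unloading time is ignored, and both function names are hypothetical.

```python
def sneakernet_gbps(payload_petabytes: float, trip_hours: float) -> float:
    """Average data rate, in gigabits per second, of physically moving storage media."""
    payload_bits = payload_petabytes * 1e15 * 8   # decimal petabytes to bits
    return payload_bits / (trip_hours * 3600) / 1e9


def link_hours(payload_petabytes: float, link_gbps: float) -> float:
    """Hours a network link needs to carry the same payload at full line rate."""
    payload_bits = payload_petabytes * 1e15 * 8
    return payload_bits / (link_gbps * 1e9) / 3600


# Assumed load: 100 PB of media, one hour of driving, compared with a 10 Gb/s link.
print(f"Vehicle: {sneakernet_gbps(100, 1):,.0f} Gb/s average, first byte after 1 hour")
print(f"10 Gb/s link: {link_hours(100, 10):,.0f} hours for the same payload")
```

Under these assumptions the vehicle averages a little over two hundred thousand gigabits per second, while the link would need more than twenty thousand hours to move the same data.  The link, however, delivers its first byte in milliseconds rather than an hour.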

Australia recently made the news with a test to see whether a pigeon carrying an SD card could, in terms of network throughput, outperform the region's ISP, and the pigeon ended up being faster than the ISP!

In terms of computing performance we often ignore latency to the point of not even being aware of it as a context in which to discuss performance.  But in low latency computing circles it is considered very carefully.  System throughput is generally greatly reduced: it becomes common to target only ten percent CPU utilization where more traditional systems target closer to ninety percent.  Concepts like real time kernels, CPU affinity, processor pinning, cache hit ratios and reduced measurement overhead are all used to obtain the most immediate response possible from a system rather than the most total processing.

Common places where low latency is desired from a computational perspective are critical controller systems (such as manufacturing controllers, where even a millisecond of latency can cause problems on the factory floor) and financial trading systems, where a few milliseconds of delay can mean that investments have changed in price or that products have already been sold and are no longer available.  Speed, in terms of latency, is often the deciding factor between making money and losing money; even a single millisecond can be crippling.

Technically even audio and video processing systems have to be latency sensitive but most modern computing systems have so much spare processing overhead and latency is generally low enough that most systems, even VoIP PBXs and conferencing systems, can function today with only very rarely needing to be aware of latency concerns on the processing side (even networking latency is becoming less and less common as a concern.)  The average system administrator or engineer might easily manage to go through a career without ever needing to work on a system that is latency sensitive or for which there is not so much available overhead as to hide any latency sensitivity.

Defining speed, whether that means throughput, latency, some combination of the two or even something else, is very important in all aspects of IT and in life.  Throughput and latency generally exist in an inverse relationship where improvements in one come at a cost to the other.  Understanding how they affect us in different situations, and learning to balance them as needed to improve the systems that we work on, is very valuable.