
Ferraris and Tractor Trailers

Working in the SMB world, it is actually pretty rare that we need to talk about latency.  The SMB world is almost universally focused on system throughput and generally unaware of latency as a distinct need.  But there are times when latency becomes important, and when it does it is critical that we understand the interplay of throughput and latency and just what “speed” means to us.  Once we move into the enterprise space, latency is more often viewed as a concern, but even there throughput nearly always reigns supreme, to the point that concepts of speed almost universally revolve around throughput while latency is ignored or forgotten.

Understanding the role of latency in a system can be complicated, even though latency itself is relatively simple to understand.

A great comparison between latency and throughput that I like to use is the idea of a Ferrari and a tractor trailer.  Ferraris are “fast” in the traditional sense: they have a high top speed in miles per hour.  One might say that they are designed for speed.  But are they?

We generally consider tractor trailers to be slow.  They are big and lumbering beasts that have a low top end speed.  But they haul a lot of stuff at once.

In computer terms we normally think of speed like hauling capacity – we think in terms of “items” per second.  For a Ferrari, going two hundred miles per hour is great, but it can haul maybe one box at a time.  A tractor trailer can only go one hundred miles per hour but can haul closer to one thousand boxes at a time.  When we talk about throughput or speed on a computer, this is more what we mean.  In network terms we think of gigabits per second and are rarely concerned with the speed of an individual packet, as a single packet is rarely important.  In computational terms we think about ideas like floating point operations per second, a similar concept.  No one really cares how long a single floating point operation takes, only how many we can get done in one or ten seconds.

So when looking at a Ferrari we could say that it has a useful speed of two hundred box-miles per hour.  That is, for every hour of operation, a Ferrari can move one box up to two hundred miles.  A tractor trailer has a useful speed of one hundred thousand box-miles per hour.  In terms of moving packages around, the throughput of the tractor trailer is easily five hundred times “faster” than that of the Ferrari.

So in terms of how we normally think of computers and networks a tractor trailer would be “fast” and a Ferrari would be “slow.”

But there is also latency to consider.  Assuming that our payload is tiny, say a letter or a small box, a Ferrari can move that one box over a thousand miles in just five hours!  A tractor trailer would take ten hours to make the same journey (but could have a LOT of letters all arriving at once.)  If what we need is to get a message or a small parcel from one place to another very quickly, the Ferrari is the better choice, because its latency (the delay from the time we initiate the delivery until the first package arrives) is half that of the tractor trailer.
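
To make the comparison concrete, here is a tiny sketch in Python that computes both measures from the figures above.  The speeds and cargo capacities are the illustrative numbers from the text, not real vehicle specifications.

```python
# Throughput (box-miles per hour) versus latency (hours to first delivery)
# for the two example vehicles. All figures are illustrative.

vehicles = {
    "Ferrari":         {"mph": 200, "boxes": 1},
    "Tractor trailer": {"mph": 100, "boxes": 1000},
}

TRIP_MILES = 1000

for name, v in vehicles.items():
    throughput = v["mph"] * v["boxes"]   # box-miles per hour
    latency = TRIP_MILES / v["mph"]      # hours until the first box arrives
    print(f"{name}: {throughput:,} box-miles/hour, "
          f"first delivery after {latency:.0f} hours on a {TRIP_MILES} mile trip")
```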

As you can imagine, in most cases tractor trailers are vastly more practical because their delivery throughput is so much higher.  And, this being the case, we actually see large trucks on the highways all of the time, while the occurrence rate of Ferraris is very low – even though each costs very roughly the same amount to purchase.  But in special cases, the Ferrari makes more sense.  Just not very often.

This is a general concept and applies to numerous areas: caching systems, memory, CPUs, networking, operating system kernels and schedulers, cars and more.  Latency and throughput are generally inversely related – we accept higher latency in order to obtain more throughput.  For most operations this makes the best sense.  But sometimes it makes more sense to tune for latency.

Storage is actually an odd duck in computing: nearly all focus on storage performance is on IOPS, which is roughly a proxy measurement for latency, rather than on throughput, which is measured in data transferred per second.  Rarely do we care about this second number, as it is almost never the source of storage bottlenecks.  But this is the exception, not the rule.
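
A rough sketch of why IOPS acts as a proxy for latency: at a queue depth of one, a device can only complete one operation per round trip, so IOPS is simply the inverse of latency, and throughput is just IOPS multiplied by the block size.  The latency and block size below are assumed values for illustration, not benchmarks of any particular device.

```python
# Relationship between latency, IOPS and throughput at queue depth 1.
# Both input figures are assumptions chosen for illustration.

latency_s = 0.0005        # assumed 0.5 ms per operation
block_bytes = 4096        # assumed 4 KiB operations

iops = 1 / latency_s                          # one op in flight -> 2,000 IOPS
throughput_mb_s = iops * block_bytes / 1e6    # the resulting data rate

print(f"{iops:,.0f} IOPS, but only about {throughput_mb_s:.0f} MB/s")
```

Even at a healthy 2,000 IOPS the data rate is a trivial 8 MB/s, which is why throughput is so rarely the storage bottleneck.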

Latency and throughput can have some surprising interactions in the computing world.  When we talk about networks, for example, we typically measure only throughput (Gb/s) and rarely care much about latency (normally measured in milliseconds.)  Typically this is because nearly all networking systems have similar latency numbers and most applications are largely unconcerned with latency delays.  It is only the rare application, like VoIP over international links or satellite connections, where latency affects the average person.  Occasionally someone attempts something uncommon, like iSCSI over a long distance WAN connection, and latency suddenly pops up as an unforeseen problem.

One of the places where the interaction of latency and throughput starts to become shocking and interesting is when we move from electrical or optical data networks to physical ones.  A famous quote in the industry is:

Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.  – Andrew S. Tanenbaum

This is a great demonstration of huge bandwidth with very high latency.  Driving fifty miles across town, a single station wagon or SUV could haul many petabytes of data, hitting data rates that a 10Gb/s fiber link could not come close to.  But the time for the first data packet to arrive is about an hour.  We often discount this kind of network because we assume that latency must be bounded at under about 500ms.  But that is not always the case.
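
A back-of-the-envelope version of that claim, with loudly hedged assumptions: suppose the wagon carries one thousand tape cartridges of twelve terabytes each and the trip across town takes one hour.

```python
# Station wagon bandwidth, back of the envelope. The tape count, tape
# capacity and trip time are all assumptions chosen for illustration.

tapes = 1000
tape_tb = 12                 # terabytes per cartridge
trip_seconds = 3600          # a one-hour drive across town

payload_bits = tapes * tape_tb * 1e12 * 8
wagon_gbps = payload_bits / trip_seconds / 1e9

fiber_gbps = 10
fiber_tb_per_trip = fiber_gbps * 1e9 * trip_seconds / 8 / 1e12

print(f"Station wagon: ~{wagon_gbps:,.0f} Gb/s average")          # ~26,667 Gb/s
print(f"10 Gb/s fiber moves ~{fiber_tb_per_trip:.1f} TB in that hour")
```

Even under these modest assumptions the wagon averages tens of terabits per second, while the fiber link delivers only 4.5 TB in the same hour – but the fiber delivers its first packet in milliseconds, not sixty minutes.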

Australia recently made the news with a test to see whether a pigeon carrying an SD card could, in terms of network throughput, outperform the region’s ISP – and the pigeon proved faster!

In terms of computing performance we often ignore latency to the point of not even being aware of it as a context in which to discuss performance.  But in low latency computing circles it is considered very carefully.  System throughput is generally greatly reduced (it becomes common to target only ten percent CPU utilization where more traditional systems target closer to ninety percent), with techniques like real time kernels, CPU affinity, processor pinning and cache hit ratio tuning all being used to obtain the most immediate response possible from a system rather than the most total processing.
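
As one concrete example of these techniques, processor pinning can be done from userspace on Linux with nothing but Python’s standard library.  This is a minimal sketch; the choice of core 2 is an arbitrary assumption, and the calls are Linux-only.

```python
# A minimal sketch of processor pinning on Linux. Pinning a
# latency-sensitive process to one core avoids scheduler migrations and
# keeps its caches warm. Core 2 is an arbitrary choice for illustration.
import os

pid = 0  # 0 means "the calling process"

print("Allowed CPUs before pinning:", os.sched_getaffinity(pid))

# Restrict this process to CPU core 2 only (Linux-only call).
os.sched_setaffinity(pid, {2})

print("Allowed CPUs after pinning:", os.sched_getaffinity(pid))
```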

Common places where low latency is desired from a computational perspective are critical controller systems (such as manufacturing controllers, where even a millisecond of latency can cause problems on the factory floor) and financial trading systems, where a few milliseconds of delay can mean investments have changed in price or products have already been sold and are no longer available.  Speed, in terms of latency, is often the deciding factor between making money and losing money – even a single millisecond can be crippling.

Technically, even audio and video processing systems have to be latency sensitive, but most modern computing systems have so much spare processing overhead, and latency is generally low enough, that most systems, even VoIP PBXs and conferencing systems, can function today while only very rarely needing to be aware of latency concerns on the processing side (even networking latency is becoming less and less common as a concern.)  The average system administrator or engineer might easily go through an entire career without ever needing to work on a system that is latency sensitive, or on one without so much available overhead as to hide any latency sensitivity.

Defining speed, whether that means throughput, latency, some combination of the two or something else entirely, is very important in all aspects of IT and in life.  Understanding how they affect us in different situations, how they interact with each other (generally in an inverse relationship where improvements in throughput come at a cost to latency, or vice versa) and how to balance them as needed to improve the systems that we work on is very valuable.

One Big Flat Network

Networks have a natural tendency to become unnecessarily complicated.  But there is great value in keeping networks clean and simple.  Simple networks are easier to manage, more performant and more reliable, while generally being less expensive.  Every network needs a different level of complexity, and large networks will certainly need an extensive amount of it, but small businesses can often keep networks extremely simple.  This is part of what makes smaller businesses more agile and less expensive, giving them an edge over their larger counterparts.  It is an edge that they must leverage, because they lack the enterprise advantage of scale.

There are two ways to look at network complexity.  The first is the physical network – the actual setup of the switches and routers that make up the network.  The second is the logical network – how IP address ranges are segmented, where routing barriers exist, etc.  Both are important to consider when looking at the complexity of your network.

It should be the goal of any network to be as simple as possible while still meeting all of the goals and requirements of the network.  

The first aspect we will address is the physically flat network.  Reducing a physical network to be flat can have a truly astounding effect on the performance and reliability of that network.  In a very small network this could mean working from a single switch for all connections.  Typically this is only possible for the very smallest networks, as switches are rarely available above forty-eight or possibly fifty-two ports.  But for many small businesses this is completely achievable.  It may require additional cabling for a building, in order to bring all connections back to a central location, but it can often be attained, at least on a site by site basis.  Many businesses today have multiple locations or staff working from home, and this can make the network challenges much greater, although each location can strive for its own simplicity in those cases.

As a network grows, the single switch approach can grow as well through switch stacking.  Stacked switches share a single switching fabric or backplane.  When stacked they behave as a single switch, but with more ports.  (Some switches do true backplane sharing and some mimic this with very high speed uplink ports and shared management over that link.)  A switch stack is managed as a single switch, making network management no more difficult, complex or time consuming for a stack than for a single switch.  It is common for a switch stack to grow to at least three hundred ports, if not more.  This allows for much larger physical site growth before needing to leave the single switch approach.

In some cases, large modular single-switch chassis will grow even larger than this, allowing for four hundred or more ports in a single switch in a “blade like” enterprise switching chassis.

By being creative and looking at simple, elegant solutions it is entirely possible to keep even a moderately large network contained to a single switching fabric allowing all network connections to share a single backplane.

The second area that we have to investigate is the logical complexity of the network.  Even in physically simple networks it is common to find small businesses investing a significant amount of time and energy into implementing unnecessary subnets or VLANs and all of the overhead that comes with them.

Subnetting is rarely necessary in a small or even a smaller medium-sized business.  Traditionally, going back to the 1990s, it was very common to want to keep subnets to a /24 (254 usable hosts) because of packet collisions, broadcasts and other practical issues.  This made a lot of sense in an era when hubs were used instead of switches, broadcasts were common and network bandwidth was lucky to reach 10Mb/s on a shared bus.  Today’s broadcast-light, collision-free networks of dedicated 1Gb/s channels experience load in a completely different manner.  Where filling a /24 was an extremely large network then, having more than 1,000 devices on a single subnet is a non-issue today.
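
The subnet arithmetic is easy to check with Python’s standard ipaddress module; the two example networks below are arbitrary private ranges chosen for illustration.

```python
# Usable host counts for the classic /24 versus a roomier /22.
# Both networks are arbitrary RFC 1918 ranges used for illustration.
import ipaddress

classic = ipaddress.ip_network("192.168.1.0/24")
modern = ipaddress.ip_network("10.0.0.0/22")

# num_addresses counts every address in the block; usable hosts exclude
# the network and broadcast addresses.
print(classic, "->", classic.num_addresses - 2, "usable hosts")  # 254
print(modern, "->", modern.num_addresses - 2, "usable hosts")    # 1022
```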

These changes in how networks behave mean that small and medium businesses almost never need to subnet for reasons of scale and can comfortably use a single subnet for the entire business, reducing complexity and easing network management.  More than a single subnet may still be necessary to support specific network segmentation, like separating production and guest networks, but scale, the reason traditionally given for subnetting, is an issue solely for larger businesses.

It is tempting to implement VLANs in every small business environment as well.  Subnetting and VLANs are often related and often confused, but subnets frequently exist without VLANs, while VLANs do not exist without subnets.

In large environments VLANs are a foregone conclusion; it is simply assumed that they will exist.  This mentality filters down to smaller organizations, which are often tempted to apply it to businesses that lack the scale that makes VLAN management make sense.  VLANs should be relatively uncommon in a small business network.

The most common place where I see VLANs used when they are not needed is in Voice over IP (VoIP) networks.  It is a common assumption that VoIP has special needs that require VLAN support.  This is not true.  VoIP, and the QoS that it sometimes needs, is available without VLANs and will often work better without them.
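
For instance, the standard way to request QoS for voice traffic is DSCP marking, which happens at layer 3 on the socket itself and needs no VLAN at all.  The sketch below sets the conventional Expedited Forwarding marking on a UDP socket; the destination address and port are placeholders, and the setsockopt call as written applies on Linux.

```python
# QoS without a VLAN: mark a UDP socket's traffic with DSCP EF
# (Expedited Forwarding, value 46), the conventional marking for voice.
# The destination address and port below are placeholders.
import socket

EF_DSCP = 46
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# The IP TOS byte carries the DSCP value in its upper six bits.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, EF_DSCP << 2)

sock.sendto(b"rtp-payload-goes-here", ("192.0.2.10", 5004))
```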

VLANs really only become important when management is needed at a scale larger than a single subnet can provision and the network cannot be physically segregated, or when specific network-layer security is needed, which is relatively rare in the SMB market.  VLANs are very useful and do have their place.  They are sometimes used when a dedicated guest network is needed, but generally in a small business guest access is provided via a direct guest connection to the Internet rather than a quarantined network.

The most common practical use of a VLAN in an SMB is likely to be a walled garden DMZ designed for quarantined BYOD access, where BYOD devices connect much like guests but have the ability to reach remote access resources such as RDP, ICA or PCoIP.  VLANs would also be popular for building traditional DMZs for externally facing public services such as web and email servers, except that these services are not commonly hosted on the local network in today’s SMBs, so this classic use of VLANs is rapidly fading.

Another case where VLANs are often used inappropriately is for a Storage Area Network or SAN.  It is best practice for a SAN to be a completely independent (air gapped), physically separate network unrelated to the regular switching infrastructure.  It is generally not advised that a SAN be created using VLANs or subnets; it should instead live on dedicated switches.

It is tempting to add complex switching setups, additional subnets and VLANs because we hear about these things from larger environments, they are fun and exciting, and they appear to add job security by making the network more difficult to maintain.  Complex networks require higher-end skills and can seem like a great way to use that networking certificate.  But in the long run, this is a bad career and IT strategy.  Network complexity should be explored in a lab for learning purposes, not in production networks.  Production networks should be run as simply, elegantly and cost effectively as possible.

With relatively little effort, a small business network can likely be designed to be both physically and logically very simple.  The goal, of course, is to come as close as possible to creating a single, flat network structure where all devices are physical and logical peers, with no unnecessary bottlenecks or protocol escalations.  This improves performance and reliability, reduces costs and frees IT resources to focus on more important tasks.

Originally posted on the StorageCraft Blog.