
Why QuickBooks Can’t Be Stored on Google Drive for Multiple Users

Before we dig into specifics, it is important to understand that this is a general concept; we can distill it to "why can't client/server or shared-database-file applications be stored on synced storage (e.g. Google Drive, Dropbox, Nextcloud, etc.) when access is not limited to a single user?" QuickBooks uses a shared database file mechanism common to 1980s-style applications: a single file, or set of files, is exposed through a file sharing mechanism and each copy of the application accesses that file directly to modify it. Google Drive, by contrast, is a synced storage mechanism, meaning it makes copies of data from one location to another. People working on the same file at the same time can, and often do, overwrite each other's changes, and the expectation is that these conflicts will be manually reconciled later, ignored, or that users will be coordinated so that they never work at the same time.

For some types of multi-user applications, synced storage can be leveraged, but only where the storage system is able to lock the application's data and allow changes only when no conflicting changes exist. That requires a level of integration that is not practical with general-purpose file syncing; most systems that do this build the syncing mechanism into the database or application layer, not a general-purpose layer that has to work blindly. For data integrity to be maintained on synced storage, only one person can edit the file at a time: they make their change, save, and then every other potential user must receive that update before the next person opens the file to edit it and repeats the process. Failing that, the system has to ask the users which changes to keep and which to discard, every single time.
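
To make this concrete, here is a minimal sketch in Python of the only workable discipline for a file living on synced storage: strict turn-taking. Everything in it is invented for illustration; the file name, the imaginary "sync settled" marker and the helper function are not real QuickBooks or Google Drive features. Real sync clients expose no such signal, which is exactly why this discipline ends up being enforced by people, if it is enforced at all.

import time
from pathlib import Path

# Hypothetical names for illustration only.
SYNCED_FILE = Path("company-file.qbw")           # local copy managed by the sync client
SYNC_SETTLED = Path("company-file.qbw.synced")   # imaginary "all peers are up to date" signal

def edit_file(apply_change):
    """Strictly serialized edit: wait for the previous save to propagate, edit, save, repeat."""
    while not SYNC_SETTLED.exists():             # wait until the last editor's save has reached everyone
        time.sleep(5)
    data = SYNCED_FILE.read_bytes()              # open only after the sync has settled
    new_data = apply_change(data)                # make the change locally
    SYNCED_FILE.write_bytes(new_data)            # save; the sync client now uploads the whole file
    # The next user must not open the file until this upload has propagated to them.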

This integrity process cannot be applied to a database file in any realistic way. The file is designed to be open and accessed continuously, not quickly opened, edited and saved. Saving is also not manual, and not always predictable; we generally assume that saves happen continuously during use, but caching can make even those save operations impossible to control manually, and that caching is necessary for performance.

Confusion often arises because a single user, with no fear of another user accessing the system at the same time, can use synced storage like Google Drive or Apple iCloud as a backup mechanism (it simply makes a distant copy automatically) and/or as a means to replicate the file so that the same user can work on it first from one location and then from another without manually moving it. As long as that single user takes enough time moving between locations to ensure that any cache has flushed and that syncs and locks have completed, and never leaves the first instance of the application open, they can reasonably assume the system is safe (though they cannot completely guarantee it; even then the mechanism carries a small risk of race conditions). Because there is "a way" to safely use synced storage with the application in single-user mode, many non-technical accounting or financial workers incorrectly assume that simultaneous multi-user access, which is a wholly different thing, will also work. It will not.

What happens instead is a race condition between the users, and you can never be entirely sure that it has not occurred. Sometimes the data will simply be bad and there is no question that a race condition has happened, because numbers will be wildly inaccurate. More often, though, some transactions will simply be lost even after they have been entered and reviewed.

Let's give an example. User 1 is at home and enters a new receipt into QuickBooks. This change saves to the local computer and, after that has completed, the new file starts to upload to Google Drive in the cloud. User 2 is in the office and starts to enter a customer payment on an invoice during this time. User 2's copy of the QuickBooks data file is open and cannot be overwritten while in use, so the copy being sent to Google Drive cannot reach User 2. Once User 2 saves his change, his copy also wants to upload to Google Drive. Google Drive now has to do something with two files that started out identical but now contain two entirely different changes, and neither copy has both. It has no possible means of merging them, so it can accept User 1's copy as the master and discard the changes from User 2 (first priority), or accept User 2's changes and discard User 1's (latest priority), or discard all changes and accept none. In no case are all users' financial transactions retained, even though each was saved locally. Either User 1 or User 2 (or possibly both) is going to have data that they believed was saved suddenly vanish. Add more users, and the problem only gets bigger.
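
A tiny simulation shows the bind the sync service is in. To the sync layer the company file is just opaque bytes, so once two copies diverge, every resolution policy throws away someone's saved work. The names and contents below are made up purely for illustration and have nothing to do with the real QuickBooks file format.

import hashlib

# The "file" is opaque bytes to the sync layer; contents are invented.
original = b"...shared company file..."
user1_copy = original + b"+receipt entered by User 1"    # saved at home
user2_copy = original + b"+payment entered by User 2"    # saved at the office

def checksum(blob):
    return hashlib.sha256(blob).hexdigest()[:12]

# The sync service only sees that both copies differ from the original and
# from each other. With no record-level view, it can only pick one winner.
policies = {
    "first priority": user1_copy,    # keep the first upload, discard User 2's payment
    "latest priority": user2_copy,   # keep the last upload, discard User 1's receipt
}
for name, winner in policies.items():
    lost = user2_copy if winner is user1_copy else user1_copy
    print(f"{name}: keeps {checksum(winner)}, silently loses {checksum(lost)}")
# No third policy exists that keeps both changes, because merging opaque
# binary database files is not possible at the file-sync layer.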

Part of the problem is that when working at the file access level, and syncing and sharing data as whole files, there is no way to lock a single record or row, keep transactions separate, or merge changes. The file is a single entity and it has changed; it is all or nothing. The individual QuickBooks applications cannot talk to each other, directly or through the database file, to coordinate writes, saves and reads and work around this problem. They are blind: each has its own unique copy of the file, there is nothing "shared" between them to allow for communication, and the copies are not tied to one another in any way (there is no quantum state involved here). Then add the potential problems of one or more application instances running over a slow or flaky Internet connection or, worse, offline. There can be hours or days of changes that have to overwrite, or be overwritten, when synchronization finally happens. We are rarely talking about milliseconds; often it is days of data.

How this problem is handled, when it is handled, by locally shared files is multi-faceted. First, there is only one file, not copies of it, so all changes are available to all copies of the application simultaneously and instantly. When one instance of the application is going to write data to the file, it uses a locking and alerting mechanism, similar to how clustered file systems allow SANs to work, to signal other application instances that a change is being made, that they must wait for it to complete, and that they must refresh their view of the data before continuing. Only one instance can write and all others have to wait. There are no race conditions because the lock is obtained before the write begins and released when it is done. And instances only function as long as the data is currently accessible; if the connection is lost, they cannot proceed, which is a critical data integrity protection. Some applications take this mechanism further and add direct communications channels (rather than going through the shared file) to make the process faster for better performance. But few go that far, because once you reach that level it is generally far easier to move to a modern application than to shoehorn modern multi-user mechanisms onto decades-old designs.
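
For contrast, here is a rough sketch of that lock-before-write discipline, using POSIX advisory locks via Python's fcntl module purely to illustrate the pattern. This is not QuickBooks' actual mechanism (its locking lives inside its database engine, and Windows file shares use their own byte-range locks), but the shape is the same: one shared file, one writer at a time, and a hard failure rather than silent divergence if the file server disappears. The path and function here are hypothetical.

import fcntl

SHARED_FILE = "/fileserver/company-file.qbw"    # one file on the file server, no per-user copies

def write_transaction(record: bytes) -> None:
    # If the file server is unreachable, open() fails outright instead of
    # letting a stale local copy keep accepting changes.
    with open(SHARED_FILE, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)            # block until we hold the exclusive lock
        try:
            f.seek(0, 2)                         # every writer sees the same current end of file
            f.write(record)                      # append our change to the one shared file
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)        # release so the next waiting writer can proceed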

Hopefully this has cleared up why accountants commonly think that synced files will work and why they often claim that it "worked for me" when they should be saying "I got lucky" or "I used it in a totally different scenario that doesn't apply to a multi-user environment." You can absolutely use Google Drive, Nextcloud, iCloud, Dropbox and others with QuickBooks and other legacy-style applications for backups and data transfers, but you cannot use them as a means of obtaining multi-user access, as they simply cannot keep the data intact.

The Risks of Licensing

There are so many kinds of risk that we must address and consider in IT systems that it is easy to overlook the ones that are non-technical, especially those we rarely address directly, such as licensing. But licensing carries risks, and costs, that must be considered in everything that we do in IT.

As I write this article, the risks of licensing are very fresh in the news. Just yesterday, one of the largest and best-known cloud computing providers suffered a global, three-hour outage that was later attributed to accidentally allowing some of their licensing to expire. One relatively minor component in their infrastructure stack, with massive global redundancy, was reduced to worthlessness in a stroke when its licensing expired. Having licensing dependencies means having to manage them carefully. Some licenses are more dangerous than others: some only leave you exposed to audits, others create outages or data loss.

Licensing may be a risk by design, as in the example above where the license expired and the equipment stopped working. Or the risk can be less intentional, such as remote kill switches, date confusion in equipment, or misconfiguration causing systems to fail. Either way it is a risk that must be considered and, quite often, mitigated. The risk of critical systems time-bombing or dying in unrepairable ways can be very dangerous. Unlike hardware or software failure, there is often no way to repair the system without access to the vendor, a vendor that may be offline, out of support, no longer supporting the product, suffering technical issues of their own, or even out of business!

Often, licensing outages put customers into a position where the vendor has extreme leverage and can charge nearly any amount it wants for renewed licensing during a pending or, worse, ongoing outage. Under that pressure, customers may easily pay many times the normal price to get systems back online and restore customer confidence.

While some licensing represents extreme risk and some merely an inconvenience, this risk must be evaluated and understood. In my own experience I have seen critical software have its licensing revoked by a foreign software vendor simply looking to force a purchasing discussion, causing large losses in environments that had little legal recourse, purely because the vendor had the ability to remotely kill systems via licensing even for paying customers. Generally illegal and certainly unethical, yet there is often little practical recourse for customers in these situations.

And of course, many license issues are technical or accidental: licensing servers go offline, systems break, accidents happen. Systems that are designed to become inaccessible when they cannot validate their licenses carry an entire category of risk that other types of systems do not, a risk that is more common than people realize and often among the hardest to mitigate.

Beyond these kinds of risk, licensing also carries overhead which, as always, is a form of risk which, in turn, is a form of cost. Researching, acquiring, tracking and maintaining licenses, even those that could never cripple your infrastructure, takes time, and time is money. And licensing always carries the risk that you will buy too little or incorrectly and be exposed to audits, or buy too much and overspend. In every case this is a cost that must be calculated into the overall TCO of any solution, yet it is often ignored.

Licensing time and costs are often among the more significant costs of a solution, but because they are ignored it can be extremely difficult to understand how they play into the long-term financial picture, especially as they often go on to influence other decisions in various ways.

Licensing is just a fact of life in IT, but one that is hardly cool or interesting, so it is often ignored or at least not discussed heavily. Being mindful that licensing has costs to manage, just like any other aspect of IT, and carries risk, potentially very large risk, that needs to be addressed is simply part of good IT decision making.

If It Ain’t Broke, Don’t Fix It

We've all heard that plenty, right? "If it ain't broke, don't fix it." People use it everywhere as a way to discourage improvements, modernization or refactoring. As with many phrases of this nature, on the surface it seems reasonable. But in practice it really is not, or at least not as it is normally used, because it is not well understood.

This is very similar to telling people not to put all of their eggs in one basket, which is more often than not applied to situations where the eggs-and-basket analogy does not apply or is inverted from reality. Because it is a memorized phrase, people forget that there is a metaphor that needs to hold up for it to work. It can lead to horrible decisions because it invokes an irrational fear founded on nothing.

Likewise, the idea of not fixing things that are not broken comes from the theory that something perfectly good and functional should not be taken apart and messed with just for the sake of messing with it. This makes sense. But for some reason this logic is almost never applied to things where it would make sense (I am not even sure of a good example of one), and is instead applied to complex systems that require regular maintenance and upkeep in order to work properly.

Of course, if your shoe is not broken, don't tear it apart and attempt to glue it back together again. But your business infrastructure systems are nothing like a shoe. They are living systems with enormous complexity that operate in an ever-changing landscape. They require constant maintenance, oversight, updating and so forth to remain healthy. Much like a car, but dramatically more so.

You never, we hope, hear someone tell you not to change the oil in your car until the engine has seized. Of course not: even though the engine is not yet broken, the point of maintenance is to keep it from breaking. We know with a car that if we wait until it breaks, it will be really broken. Likewise, we would not refuse to put air in the tires until the flats have been ripped off of the wheels. It just makes no sense.

Telling someone not to maintain systems until it is too late is the same as telling them to break those systems. A properly maintained car might last hundreds of thousands of miles, maybe millions. One without oil will be lucky to make it across town. Taking care of the engine that you have, rather than buying a new one every few days, means you might go a lifetime without ever destroying an engine.

The same goes for your business infrastructure. Code ages, systems wear out, new technology emerges, new needs arise, the network interacts with the outside world, new features are needed, vulnerabilities and fragility are identified and fixed, updates come out, new attacks are developed and so forth. Even if new features were never created, systems would need to be diligently managed and maintained to ensure safe and reliable operation, like a car but a thousand times more complex.

In terms of IT systems, broken means unnecessarily exposed to hacking, data theft, data loss, downtime and inefficiency. In the real world, we should consider a system broken the moment that maintenance is needed. How much ransomware would not be a threat today if systems were simply properly maintained? As IT professionals we need to stand up and explain that unmaintained systems are already broken; disaster just hasn't struck yet.

If we were to follow the mantra of "if it ain't broke, don't fix it" in IT, we would wait *until* our data was stolen to patch vulnerabilities, or wait until data was unrecoverable to see whether we had working backups. Of course, that makes no sense. But this is what is often being suggested when people tell you not to fix your systems until they break: they are telling you to let them break! Push back; don't accept that kind of advice. Explain that the purpose of good IT maintenance is to keep systems from breaking whenever possible, avoiding disaster rather than inviting it.

Virtualize Domain Controllers

One would think that the idea of virtualizing Active Directory Domain Controllers would not be a topic needing discussion, and yet I find that the question of whether or not AD DCs should be virtualized arises regularly. In theory, there is no need to ask because we already have far more general guidance in the industry telling us that all possible workloads should be virtualized, and AD presents no special case that would create an exception to this long-standing general rule.

Oddly, people regularly go out seeking clarification on this one particular workload, and if you seek bad advice, someone is sure to provide it. Plenty of people post advice recommending physical servers for Active Directory, but rarely, if ever, with any explanation as to why they would recommend violating best practice at all, let alone for such a mundane and well-known workload.

As to why people implementing AD DCs decide that it warrants specific investigation around virtualization when no other workload does, I cannot answer.  But after many years of research into this phenomenon I do have some insight into the source of the reckless advice around physical deployments.

The first mistake comes from a general misunderstanding of what virtualization even is. This is sadly very common: people quite often think that virtualization means consolidation, which of course it does not. They then take that mistake and apply the false logic that consolidation means putting two AD DCs onto the same physical host, which also requires the leap to assuming that there will always be two or more AD DCs, another common belief. So three large mistakes in logic come together to produce some very bad advice which, if you dig into the recommendations, you can normally trace back to this root. This seems to be the source of the majority of the bad advice.

Other causes are sometimes misunderstanding actual best practices, such as the phrase “If you have two AD DCs, each needs to be on a separate physical host.”  This statement is telling us that two physically disparate machines need to be used in this scenario, which is absolutely correct.  But it does not imply that either of them should not have a hypervisor, only that two different hosts are needed.  The wording used for this kind of advice is often hard to understand if you don’t have the existing understanding that under no circumstance is a non-virtual workload acceptable.  If you read the recommendation with that understanding, its meaning is clear and, hopefully, obvious.  Sadly, that recommendation often gets repeated out of context so the underlying meaning can easily get lost.

Long ago, around a decade at this point, some virtualization platforms had issues with timing and system clocks that could play havoc with clustered database systems like Active Directory. This was a legitimate concern at the time, but it was solved long ago, as it had to be for many different workloads. It created a perception that AD might need special treatment, however, and that perception lingers on even though it has been a generation or two, in IT terms, since this was an issue and it should have long since been forgotten.

Another myth leading to bad advice is rooted in the fact that AD DCs, like other clustered databases, should not be snapshotted when running in a clustered mode, as restoring only one node of the cluster from a snapshot can easily corrupt the database. This, however, is a general property of storage and databases and is not related to virtualization at all; the same caution applies to physical AD DCs just the same. The idea that snapshots belong to virtualization is itself another myth; virtualization implies no such storage artefact.

Still other myths arise from a belief that virtualization must rely on Active Directory itself in order to function and that AD therefore has to run without virtualization. This is completely a myth, and nonsensical; there is no such circular requirement.

Sadly, some areas of technology have given rise to large-scale myths, often many of them, that surround them and make it difficult to figure out the truth. Virtualization is just complex enough that many people learn only how to use it, by rote, rather than what it is conceptually, giving rise to misconceptions so far afield that it can be hard to recognize that this is what we are seeing. In a case like this, misconceptions around virtualization, history, clustered databases, high availability techniques, storage and more pile up layer upon layer, making it hard to see how so many things can come together around one deployment question.

At the end of the day, few workloads are as ideally suited to virtualization as Active Directory Domain Controllers are.  There is no case where the idea of using a physical bare metal operating system deployment for a DC should be considered – virtualize every time.