If you’ve read a bit about microservices, you’ll probably have come across the mantra, “Don’t use shared libraries in microservices.” This is bad advice.
While the sentiment is born of real issues and there’s real wisdom to be gained in this area, this little statement is too pithy, lacking the relevant context to make it useful. Consequently, it’s open to misinterpretation, in particular to being applied too liberally, and I’ve seen it misunderstood a number of times recently.
What’s the Context for Understanding Shared Libraries in Microservices?
Only recently, I’ve picked up that different people mean different things when they talk about using shared libraries. People whose main focus is on designing and coding applications are generally talking about writing code that uses a shared library. This is the first context: designing applications. People who are more focussed on deploying code, though, may instead be talking about a binary on the file system of a deployment target that is used by more than one application. This is the second context: designing deployment environments.
In both contexts, the meaning of “library” is similar: it’s a body of code that is not part of the application itself but is packaged as a separate unit so that it can be used by multiple applications. While the “library” concept is similar, the definition of “shared” is quite different. In the application context, sharing means writing code that is compiled against, and requires at runtime, the same library code as another application. By way of example, two applications may both use the same logging library. They may be deployed on separate hosts residing on opposite sides of the globe, but they “share” this library. In the deployment context, “sharing” is actually an extension of the first context. Not only do two applications require the same library, but they are deployed in the same environment (i.e. host, be it virtual or physical) and they share a single binary for that shared library.
So, what’s the problem with shared libraries in microservices? As there are two different contexts, there are different problems in each context…
The Problem with Shared Libraries in Microservices Application Design
In the application design context, sharing libraries is generally a Good Thing. Being able to use bodies of code that are written and maintained by other organisations has obvious productivity advantages. Anyone who tried to write a modern web application or backend service without using shared libraries would likely be perceived to be either grossly ignorant of the software industry, or perhaps as a megalomaniac with a serious case of “Not Invented Here” syndrome.
When are shared libraries an issue in application design? When they create coupling. Specifically with regard to shared libraries in microservices, coupling happens when the use of a library in one application applies strong force on integrating applications to also use the same library and/or platform.
By way of example, the Spring Application Framework popular in the Java ecosystem has a capability called “Spring HTTP Invokers”. It uses serialisation of Java objects to easily create RPC links by generating, at runtime, endpoints on the server side and proxies on the client side. Now if a server provides an HTTP Invoker interface, any client is almost forced to use Spring HTTP Invokers as the client and, by extension, to be written in a JVM language, if not Java itself. Use of this library applies a very strong force on integrating applications. Note that the force is not absolutely irresistible: Turing-complete languages being what they are, one could certainly write a serialised Java object parser in Haskell and then implement a client that’s not Spring+Java-dependent. It’s easy to see that would be a pretty silly waste of resources, though, and that’s the evidence of the server implementation applying force on the implementation of clients.
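The same coupling dynamic appears with any language-native serialisation. Here’s a hypothetical sketch in Python (using pickle as a stand-in for Java serialisation; the `Quote` class and function names are illustrative only) of how a serialisation choice couples clients to the server’s platform:

```python
import pickle


class Quote:
    """A domain object the server exposes over its RPC interface."""

    def __init__(self, symbol, price):
        self.symbol = symbol
        self.price = price


def serialise_response(quote):
    # The wire format here is Python's pickle: compact and convenient,
    # but only a Python process holding this same class definition can
    # decode it. The serialisation choice couples clients to the
    # server's platform, just as Java serialisation does for
    # Spring HTTP Invokers.
    return pickle.dumps(quote)


def deserialise_response(payload):
    return pickle.loads(payload)


# A "client" written in Python can round-trip the payload easily...
payload = serialise_response(Quote("ACME", 101.5))
quote = deserialise_response(payload)
print(quote.symbol, quote.price)
# ...but a client written in Go or Haskell would first have to
# implement a pickle parser -- that's the "force" described above.
```
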
There are other examples that are in the middle of the scale. For example, an interface using Protocol Buffers applies force on clients to also use protobufs. The force of the coupling is lessened, however, by the fact that protobuf implementations are available for many languages. Using the library in the server has forced use of the same library (or, we might say, a closely related one) in integrating artefacts, but the coupling doesn’t extend to the point of applying force toward a particular platform.
Finally, there are library choices that have very little chance of creating a coupling because they are nowhere near the integration points. For example, whatever library you use to write application logs probably applies no force on integrating applications and how they do application logging. Be careful, though – perhaps your choice of log file format will affect whatever tooling you have to read the logs. Integrations are not always visible on the architecture diagram.
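To make that last caveat concrete, here’s a small sketch (the logger name and formatter are illustrative) showing that while the logging *library* is invisible to other applications, the log *format* it emits is an integration point with your tooling. Emitting one JSON object per line keeps the output parseable by any log shipper, whatever library produced it:

```python
import io
import json
import logging

# Capture log output in a string buffer so the example is self-contained;
# a real application would write to a file or stdout.
stream = io.StringIO()
handler = logging.StreamHandler(stream)


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object per line."""

    def format(self, record):
        return json.dumps({"level": record.levelname,
                           "msg": record.getMessage()})


handler.setFormatter(JsonFormatter())
logger = logging.getLogger("payments")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("charge accepted")

# The library choice couples nothing; the line format below is what
# your log-reading tooling actually integrates with.
line = stream.getvalue().strip()
print(line)
```
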
One last note on application coupling. Just because you’ve received a lot of money doesn’t mean you should go out and spend it. Similarly, just because you’ve achieved loose coupling doesn’t mean you should make liberal use of it. It’s great to have the ability to use alternate technologies to implement different parts of your system, but there are also advantages to standardising your tech stack across the team as much as possible, just as using a single OS and a single cloud infrastructure vendor provide a lot of leverage. Economies of scale are real. Ignoring them in a corporate context can be a form of waste.
That’s application design. What about deployment environments?
The Problem with Shared Libraries in Microservices Deployment Environments
Remember that, in this context, a library is pretty much any binary which is used to support an application, including platform runtimes, applications and even the OS.
Historically, when a library has been required by an application, it would often be installed onto the deployment target manually. If you had an application that required Java 7, for instance, you’d install Java 7 on the target environment once, and then install many different versions of your application as it was developed and maintained, and most likely many other applications, and progressive versions of them. The same went for Python, Ruby, Perl, .Net, application servers, web servers, whatever. You would most likely install patches and maybe minor upgrades for such libraries over time to rid them of bugs and security vulnerabilities. If you installed two applications on the same machine, or seventeen, you’d only install the library once and every application could share the same installation. “Yay!” Yay because the installation was manual, so doing it more times incurred costly Systems Engineer time. We’ve reduced a resource constraint through re-use.
Where’s the problem with that sharing? The problem comes when you want to install Java 8! Or BigNewVersion of anything else. Well, we’re all cool, right? Just upgrade the one installation on the host to BigNewVersion and it’s there ready for all the applications. But what if your applications aren’t all ready to move to BigNewVersion, and some of them will be broken by it? Then you’re kind of left with three options:
- Option #1: Don’t upgrade to BigNewVersion until all the applications on the host are deemed ready.
- Option #2: Move applications that aren’t ready to upgrade (or the ones that are) onto a different host.
- Option #3: Figure out a way to install both BigNewVersion and OldSkoolVersion on the one host, and reconfigure each application to use the one it requires.
You can see that #1 constrains progress, while the other two will likely involve significant operational cost and risk. Unless you can find a convincing business case, or your culture routinely invests in rolling to the latest tool versions, your chances of securing resources to work on this problem are slim. There’s a good chance you’ll end up with option #4 by default: Leave all the applications as they are, in maintenance mode, until they’re decommissioned.
How did we end up in this situation again? We tried to save some time by minimising the number of times a manual task was done. Sharing libraries in a deployment environment seems like the cheap way to go when building the environment, but that upfront cost saving ignores the ongoing cost of maintenance, particularly the complexity introduced by unintentionally coupling the applications, which only rears its head once the requirements of the apps start to diverge.
The solution to all this should be obvious these days: instead of minimising the number of times the manual task of installing libraries is performed, we can minimise the cost of each installation by making it automated instead of manual. Once the cost is minimised, there’s less reason to minimise the number of installations, so we can install one copy of each library per application from the outset, eliminating shared libraries in the deployment environment. Doing this has become easier and easier over the last decade thanks to the proliferation of automated provisioning (e.g. CFEngine, Puppet, Chef, Ansible) coupled with virtual machines (e.g. VMWare) and/or containerisation (e.g. Docker). Containers are the most recent iteration of this “automate more, share less” pattern to gain popularity, where the suggested approach is to deliver your application as a package that also contains everything it needs – packages, frameworks, libraries, app servers, even OS – everything except the Linux kernel it runs on.
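As a hypothetical sketch of this “automate more, share less” pattern (image and jar names below are placeholders, not from any real project), a Dockerfile bundles the application together with its own copy of the runtime, so upgrading one service’s Java version cannot break a neighbour:

```dockerfile
# Each service ships its own runtime: this image carries a Java 8 JRE
# even if the host, or a sibling container, runs something newer.
FROM eclipse-temurin:8-jre

# The application and all of its libraries travel inside the image;
# nothing is shared with other applications on the same host except
# the Linux kernel.
COPY build/libs/orders-service.jar /app/orders-service.jar

ENTRYPOINT ["java", "-jar", "/app/orders-service.jar"]
```

A sibling service that is ready for BigNewVersion simply changes its own `FROM` line; the two containers can run side by side on the same host without either noticing the other’s runtime.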
Note that just using these tools doesn’t immediately get you out of the clutches of shared libraries. With Docker, for example, your containers are constrained to using a Linux distro which uses the same kernel version as the host OS. Likewise, if you’re using VMs and your VM-building infrastructure is heavily biased towards building applications on top of a single SOE, you may run into the same challenges once an application appears that needs something different to the current SOE.
What Should We Really Do About Shared Libraries in Microservices?
The simplistic advice “Don’t share libraries in microservices” needs to die. It conflates two different issues and is overly broad to boot.
Instead, I suggest passing on the following advice, which highlights the context and the specific types of sharing to avoid:
- When designing applications, avoid using library code that increases coupling between applications, so that developers building clients are free to make different technology choices if that becomes advantageous.
- When designing deployment environments, avoid sharing library binaries between applications deployed on the same host, so that application dependencies can be upgraded independently.
If you’re a member of a microservices community – or any IT community, for that matter – and you hear people throwing around pithy advice that lacks context, is too absolute, or doesn’t make the targeted issues obvious, do the community a favour and ask the advice-giver to put some more substance around their recommendations.
Image credit: “Pounced” by Henti Smith