If you’ve read a bit about microservices, you’ll probably have come across the mantra, “Don’t use shared libraries in microservices.” This is bad advice.
While the sentiment is born of real issues and there is real wisdom to be gained in this area, the statement is too pithy, lacking the context needed to make it useful. Consequently, it’s open to misinterpretation, in particular to being applied too liberally, and I’ve seen it misunderstood a number of times recently.
What’s the Context for Understanding Shared Libraries in Microservices?
Only recently, I’ve picked up that different people mean different things when they talk about using shared libraries. People whose main focus is on designing and coding applications are generally talking about writing code that uses a shared library. This is the first context: designing applications. People who are more focussed on deploying code, though, may instead be talking about a binary on the file system of a deployment target that is used by more than one application. This is the second context: designing deployment environments.
In both contexts, the meaning of “library” is similar: it’s a body of code that is not part of the application but is available as a separate unit so that it can be used by multiple applications. While the “library” concept is similar, the definition of “shared” is quite different. In the application context, sharing means writing code that is compiled against, and requires at runtime, the same library code as another application. By way of example, two applications may both use the same logging library. They may be deployed on separate hosts residing on opposite sides of the globe, but they “share” this library. In the deployment context, “sharing” is actually an extension of the first context. Not only do two applications require the same library, but they are deployed in the same environment (i.e. host, be it virtual or physical) and they share a single binary for that shared library.
So, what’s the problem with shared libraries in microservices? As there are two different contexts, there are different problems in each context…
The Problem with Shared Libraries in Microservices Application Design
In the application design context, sharing libraries is generally a Good Thing. Being able to use bodies of code that are written and maintained by other organisations has obvious productivity advantages. Anyone who tried to write a modern web application or backend service without using shared libraries would likely be perceived either as grossly ignorant of the software industry or as a megalomaniac with a serious case of “Not Invented Here” syndrome.
When are shared libraries an issue in application design? When they create coupling. Specifically with regard to shared libraries in microservices, coupling happens when the use of a library in one application applies a strong force on integrating applications to also use the same library and/or platform.
By way of example, the Spring Application Framework, popular in the Java ecosystem, has a capability called “Spring HTTP Invokers”. It uses serialisation of Java objects to easily create RPC links, generating endpoints on the server side and proxies on the client side at runtime. Now if a server provides an HTTP Invoker interface, any client is almost forced to use Spring HTTP Invokers as well and, by extension, to be written in a JVM language, if not Java itself. Use of this library applies a very strong force on integrating applications. Note that the force is not absolutely irresistible: Turing-complete languages being what they are, one could certainly write a serialised-Java-object parser in Haskell and then implement a client that isn’t Spring- and Java-dependent. It’s easy to see that this would be a pretty silly waste of resources, though, and that silliness is evidence of the server implementation applying force on the implementation of clients.
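The coupling here comes from the wire format being the platform’s native object serialisation. You don’t need Spring to see the effect: Python’s pickle module has the same property, so the sketch below uses it as a rough stand-in (the `OrderRequest` class is purely illustrative, not from Spring). A “client” can only decode the payload if it has the very same class definition available.

```python
import pickle

# A hypothetical domain class used by the "server" side of an RPC link.
class OrderRequest:
    def __init__(self, sku, quantity):
        self.sku = sku
        self.quantity = quantity

# The server serialises the object using the platform's native format...
payload = pickle.dumps(OrderRequest("ABC-123", 2))

# ...and a client can only decode it if the exact same class is importable.
restored = pickle.loads(payload)
print(type(restored).__name__, restored.sku, restored.quantity)

# Simulate a client that lacks the class definition: deleting it makes
# deserialisation fail, just as a non-JVM client cannot parse a serialised
# Java object without reimplementing both the format and the class.
del OrderRequest
try:
    pickle.loads(payload)
except AttributeError as e:
    print("client without the class definition fails:", e)
```

Protocols like this effectively make every integrating client a part of the server’s codebase and platform, which is exactly the coupling force described above.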
There are other examples that are in the middle of the scale. For example, an interface using Protocol Buffers applies force on clients to also use protobufs. The force of the coupling is lessened, however, by the fact that protobuf implementations are available for many languages. Using the library in the server has forced use of the same library (or, we might say, a closely related one) in integrating artefacts, but the coupling doesn’t extend to the point of applying force toward a particular platform.
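By contrast, a language-neutral wire format keeps the coupling at the schema level rather than the platform level. Demonstrating real Protocol Buffers would require the protobuf compiler and generated code, so as a minimal stand-in this sketch uses JSON, which makes the same point: a client needs a parser for the format, not the server’s classes or runtime.

```python
import json

# The server encodes plain data in a language-neutral format.
request = {"sku": "ABC-123", "quantity": 2}
wire = json.dumps(request)

# Any client, in any language, needs only a JSON parser to decode it;
# none of the server's class definitions travel over the wire.
decoded = json.loads(wire)
print(decoded["sku"], decoded["quantity"])
```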
Finally, there are library choices that have very little chance of creating a coupling because they are nowhere near the integration points. For example, whatever library you use to write application logs probably applies no force on integrating applications and how they do application logging. Be careful, though – perhaps your choice of log file format will affect whatever tooling you have to read the logs. Integrations are not always visible on the architecture diagram.
One last note on application coupling. Just because you’ve received a lot of money doesn’t mean you should go out and spend it. Similarly, just because you’ve achieved loose coupling doesn’t mean you should make liberal use of it. It’s great to have the ability to use alternate technologies to implement different parts of your system, but there are also advantages to standardising your tech stack across the team as much as possible, just as using a single OS and a single cloud infrastructure vendor provides a lot of leverage. Economies of scale are real. Ignoring them in a corporate context can be a form of waste.
That’s application design. What about deployment environments?
The Problem with Shared Libraries in Microservices Deployment Environments
Remember that, in this context, a library is pretty much any binary which is used to support an application, including platform runtimes, applications and even the OS.
Historically, when a library was required by an application, it would often be installed onto the deployment target manually. If you had an application that required Java 7, for instance, you’d install Java 7 on the target environment once, and then install many different versions of your application as it was developed and maintained, and most likely many other applications and progressive versions of them too. The same went for Python, Ruby, Perl, .Net, application servers, web servers, whatever. You would most likely install patches and maybe minor upgrades for such libraries over time to rid them of bugs and security vulnerabilities. If you installed two applications on the same machine, or seventeen, you’d install the library only once and every application could share the same installation. Yay! Yay because the installation was manual, so doing it more times incurred costly systems engineer time. We’ve reduced a resource constraint through re-use.
Where’s the problem with that sharing? The problem comes when you want to install Java 8! Or BigNewVersion of anything else. Well, we’re all cool, right? Just upgrade the one installation on the host to BigNewVersion and it’s there ready for all the applications. But what if your applications aren’t all ready to move to BigNewVersion, and some of them will be broken by it? Then you’re kind of left with three options:
- Option #1: Don’t upgrade to BigNewVersion until all the applications on the host are deemed ready.
- Option #2: Move applications that aren’t ready to upgrade (or the ones that are) onto a different host.
- Option #3: Figure out a way to install both BigNewVersion and OldSkoolVersion on the one host, and reconfigure each application to use the one it requires.
You can see that #1 constrains progress, while the other two will likely involve significant operational cost and risk. Without a convincing business case, or a culture that routinely invests in rolling to the latest tool versions, your chances of securing resources to work on this problem are slim. There’s a good chance you’ll end up with option #4 by default: leave all the applications as they are, in maintenance mode, until they’re decommissioned.
How did we end up in this situation again? We tried to save some time by minimising the number of times a manual task was done. Sharing libraries in a deployment environment seems like the cheap way to go when building the environment, but that upfront cost saving ignores the ongoing cost of maintenance, particularly the complexity introduced by unintentionally coupling the applications, which only rears its head once the requirements of the apps start to diverge.
The solution to all this should be obvious these days: instead of minimising the number of times the manual task of installing libraries is performed, we can minimise the cost of each installation by making it automated instead of manual. Once the cost is minimised, there’s less reason to minimise the number of installations, so we can install one copy of each library per application from the outset, eliminating shared libraries in the deployment environment. Doing this has become easier and easier over the last decade thanks to the proliferation of automated provisioning (e.g. CFEngine, Puppet, Chef, Ansible) coupled with virtual machines (e.g. VMware) and/or containerisation (e.g. Docker). Containers are the most recent iteration of this “automate more, share less” pattern to gain popularity, where the suggested approach is to deliver your application as a package that also contains everything it needs – packages, frameworks, libraries, app servers, even OS – everything except the Linux kernel it runs on.
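In practice, the “automate more, share less” pattern often looks like a container image that bundles the runtime with the application, so each service carries its own copy of, say, its Java runtime. A minimal sketch (the base image name and file paths here are illustrative, not prescriptive):

```dockerfile
# Each service ships its own runtime: there is no host-level Java install,
# so this service can move to a newer JDK without touching its neighbours.
FROM eclipse-temurin:17-jre
COPY target/orders-service.jar /app/app.jar
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```

Upgrading this service to “BigNewVersion” of Java is now a one-line change to its own build, applied and rolled back independently of every other application.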
Note that just using these tools doesn’t immediately get you out of the clutches of shared libraries. With Docker, for example, your containers are constrained to using a Linux distro which uses the same kernel version as the host OS. Likewise, if you’re using VMs and your VM-building infrastructure is heavily biased towards building applications on top of a single SOE (Standard Operating Environment), you may run into the same challenges once an application appears that needs something different to the current SOE.
What Should We Really Do About Shared Libraries in Microservices?
The simplistic advice “Don’t share libraries in microservices” needs to die. It conflates two different issues and is overly broad to boot.
Instead, I suggest passing on the following advice, which highlights the context and the specific types of sharing to avoid:
- When designing applications, avoid using library code that increases coupling between applications, so that developers building clients are free to make different technology choices if that becomes advantageous.
- When designing deployment environments, avoid sharing library binaries between applications deployed on the same host, so that application dependencies can be upgraded independently.
If you’re a member of a microservices community – or any IT community, for that matter – and you hear people throwing around pithy advice that lacks context, is too absolute, or doesn’t make the targeted issues obvious, do the community a favour and ask the advice-giver to put some more substance around their recommendations.
Image credit: “Pounced” by Henti Smith
Good one, Graham! I remember seeing something against shared libraries in Sam Newman’s classic “Building Microservices” book. But you did clear the air with some real examples.
It was nice meeting you at API Days 2016 in Melbourne!
You bring up a very important point – empirically it can be claimed that not sharing libraries makes your R1 delivery 2–5 times more expensive and fragile. It is quite possible that due to this mantra your R1 will never see daylight and all talk about future maintenance will be futile. In fact, not sharing libraries will create another type of maintenance nightmare. For example, team A fixes a bug in service-1; the service-2 owner encounters a similar problem the next week, and so on.
Here is one argument to support above assertions.
In an application, typically 50–60% of the code is common across functional areas, and that is after leaving aside open source libraries. This is data structure, housekeeping and utility code. For example, someone may write an application-specific wrapper for MongoDB which can be used by multiple microservices. If you follow the “not-sharing” mantra, either each developer/team will build the same wrapper, or someone will develop it once and everyone else will copy it into their microservice repo. You can extrapolate the nightmare that will follow until this MongoDB wrapper is hardened.
Indeed, not sharing seems like a waste of resources, as Anil Sharma writes earlier. But once you decide to share common code, you’ll need to find a solution for the common library’s dev cycle. How do you separate the dependency’s dev/master/other branch development, and make sure your application’s dev/master/other branch is using the correct version? It’s not a bucket of common code you simply include in your projects: you’ll need to keep track of versioning, set up a private library repository, etc. Then think about automation and the devops. It’s a lot of work many people forget about when ditching their monolith.
I think you missed the major point of the article. The author isn’t suggesting that code reuse is bad, but he is suggesting that use of shared libraries in finished products can be troublesome. Anyone that has dealt with DLL-hell, of any form, realises that they are simply awful. There is simply no reason anymore to use them: they (a) introduce a lot more performance overhead dispatching across shared-library boundaries, (b) don’t really save on disk space, and (c) push it onto the user to find all dependent libraries, as installers rarely do a fine job of this (case in point: Oracle and their ilk).
Having been an avid C++ developer, working in the database industry for 25 years writing DB internals at various DB companies, I can attest that shared libraries introduce a boatload of complexity.
In the new micro-service world, nearly everything is now written in Golang, which is a freshly welcomed sigh of relief over prior train-wrecks of programming languages (Ruby is a great case in point), and later versions of C++ are awful.
The better word for “Independence” in the article would be “self-sufficient” or “standalone”. Anytime there is a dependency, one opens up doors for misconfiguration or environment issues, the very things that microservices and Docker are trying to resolve. Throwing shared libraries back into the mix is an antipattern.
Pingback: The trouble with client libraries – colum Walsh
Interesting post, but I think you have missed the primary purpose of shared libraries. Shared libraries are not about saving the time to install the library files multiple times. Shared libraries are primarily about saving memory at runtime. If multiple processes running in the same OS need to access the same library, using a shared library allows the library code to be loaded into memory once, and shared. Statically linking libraries results in the same code being loaded into memory multiple times by multiple applications. In a system with memory constraints, this can make a big difference in system performance. Since micro service architectures are more about scalability than efficient use of resources, perhaps forgoing the efficiency benefits of shared libraries in exchange for the improved isolation of static linking makes sense.
There are many legitimate instances where you need to share DTOs/contracts etc. Listen to how Netflix handled this same situation (“Mastering Chaos – A Netflix Guide to Microservices” on YouTube). Rather than saying an absolute no to sharing code between microservices, they made decisions based on the situation and context. There are no absolutes in software engineering.
My impression of this concept was always that “no shared libraries” doesn’t mean avoiding any code owned by another service, but instead refers specifically to libraries intended as a mechanism to just have a dump of shared code. Instead of a “shared code” lib, one might divide up the components into client libs, put them in the source control of another project and version them alongside the stuff they are related to. The real problem “no shared code” is addressing, at least IMO, is the problem of no real ownership.
TL;DR: No shared code does not mean no code from other services; it means make sure your shared code has an actual owner, conceptually and practically, and version it appropriately.
“Common libs” where everyone dumps their code that they think others might want has ended up being a horrible mess in my experience.
Each microservice is autonomous, so each executable will have its own copy of the shared libraries, meaning there is no coupling through the shared library? Spring Boot even packages the language runtime into the artefact, so nothing is shared, not even the runtime, so I don’t see a problem with using a library or common package in a microservice.
Pingback: Shared dependencies: Avoiding this micro antipattern - Benedict Roeser