Skip to content

Stop Joining Tables In Your “Modular” Monolith

Sponsor: Do you build complex software systems? See how NServiceBus makes it easier to design, build, and manage software systems that use message queues to achieve loose coupling. Get started for free.

Learn more about Software Architecture & Design.
Join thousands of developers getting weekly updates to increase your understanding of software architecture and design concepts.


Modular monoliths are all the rage.

You have well defined modules built around business capabilities. Maybe inside those modules you are using something like Clean Architecture. You have separation of concerns. You have direction of dependencies. Everything looks great and you think, “Wow, finally I have a really good structure.”

But do you?

YouTube

Check out my YouTube channel, where I post all kinds of content on Software Architecture & Design, including this video showing everything in this post.

Modular Monlith

You are looking at an order management system, viewing a particular order, and somebody wants to see the quantity on hand in the warehouse for the items on that order.

That seems harmless. Straightforward. Since you are in a monolith, you have the same database. No big deal. When you get the order line items out, if you are using a relational database, you just join with the inventory table.

Problem solved.

Except what looked like a simple answer to show some inventory on a screen may have just welded two modules together.

Your codebase can be lying to you. Even though you have this well structured solution with different projects around modules, it might really just be a nice looking wrapper on top of a very coupled database.

Your code might say Sales and Warehouse. But what do your queries say?

The Query That Crosses the Boundary

You might think it is architectural nonsense to hear “don’t cross a boundary,” especially in a situation like this. You are in a monolith. Why would you create two different queries when you can just have one with a join?

What is really the harm?

That is the trap you might not realize you are putting yourself into.

Jumping back to the order detail page, the request seems simple enough. Add another column that shows quantity on hand from the warehouse.

To do that, it is going to be pretty simple, right? Sales is building out the page. Sales gets the order details and the line items. From there, if we are using a relational database, we can just join to the inventory table wherever that lives as part of Warehousing.

Sure, they are separate things, but just go to the data directly. That is the tempting version.

I say that because it is simple, easy, and probably really fast to get that data and render it with a simple join. But architecturally, something important just happened.

You decided to couple Sales and Warehouse at the database level.

Now it is not just a query. You have a dependency.

That is the hidden coupling. You do not see it in code. Everything can look well structured, but the coupling is hidden inside your queries and how you are dealing with data access.

I know a lot of people immediately think, “Yes, this is why my system is a disaster. It is a rat’s nest.”

Because there is a free for all of coupling and integration at the database level. You can have some level of ownership and organization within code, but if you are not doing that at your database schema level, all bets are off.

Just Put an API in Front of It?

A solution you might be thinking of is to create an API. Sales interacts with that API to Warehouse to get the inventory.

That way, Warehouse defines and uses its explicit schema, and so does Sales. Sales has to do the composition of getting the information from Warehouse through the API.

Sure, that solves a problem.

But you still have coupling. The form of coupling is just different. Before, you were coupled directly to the database. With an API, you are coupled to a contract.

That gives you more options. Maybe you have versioning. Maybe you can deprecate some APIs or mark something as obsolete. You have different ways of changing the behavior of how things work.

For example, say the requirements change and quantity on hand is stored differently. Now you have different warehouses, and you are showing which warehouse has what quantity on hand in what location.

If you were integrating directly into the database, you are handcuffed. That is especially true if you are not in the same codebase and cannot easily find all those queries.

Imagine you have some other codebase somewhere else that is also integrating into your database. Now you cannot change it. You are totally handcuffed.

With the API, at least you have some means to migrate. You have versioning. You have some course of action if you want to evolve and make change.

What Do You Actually Need?

There is more to it than that, because sometimes what you think you need and what you really need are two different things entirely.

I actually love this example because it highlights that.

If we are talking about needing the actual quantity on hand in the warehouse when looking at orders, maybe it is just for view purposes. Maybe it is also for placing an order, where you think you need the quantity on hand.

The thing is, that is not reality.

Sure, you may think this is a trivial example, but I can guarantee you that in your own business domain, this happens all the time.

With Sales, you might think about a particular SKU and the price for the product, or what you are selling it for. Purchasing might care about buying that product from a vendor and what the actual cost is. The Warehouse cares about that SKU, the quantity on hand, where it is in the warehouse, what bin it is in, and whatever else matters physically inside the warehouse.

But the idea of quantity on hand in Sales is not actually a thing.

It is not.

It is a totally different concept. It is a business function called Available to Promise.

Available to Promise is solving the actual issue, not necessarily from a technical perspective, but from a business perspective.

Sales does not actually care how many items are in the warehouse at any given time, or what the system thinks is there, because that is not reality.

Reality is going into the warehouse and seeing what parts or products are actually there and in what quantity. Something might be damaged. Something might be stolen. What is in y

our system is not reality. What is physically on the shelf is reality.

So something like ATP, Available to Promise, is a business function. It is the idea of, “What have we sold right now? What have we purchased from our vendors or manufacturers that we are going to receive? What number can I actually promise to sell because I know more are coming in?”

That is the number Sales cares about.

This goes back to the difference between what you think the requirement is and what you actually need.

When you model that properly, you may not need the coupling. You may not need to call some API to get another boundary’s data. You may not need to reach into its database or do a join or cross that boundary line at all.

You have all the data you need because you aligned everything with the business functions and the requirements within the boundary.

It Is All Tradeoffs

This is all about tradeoffs.

There really is no right answer, per se, because it comes down to what is right in your system.

What is the size of the system? How much is it evolving? If today it is just one extra query, no big deal. Maybe you have to do a little composition. Maybe it is a little bit slower. If that helps you tomorrow deal with coupling, then sure.

But if you have to deal with schema changes, and things are evolving, and you cannot change the schema because all these different things are coupled to the same underlying database, that is a problem.

There is your tradeoff.

How are you going to deal with regressions? If you change something, are you going to break other processes or application code? If you want to make a change, what different applications or parts of the same system do you have to coordinate with because you are changing the underlying schema?

There is no universal right answer here. It is a matter of the tradeoffs and the degree of coupling you are creating.

Which Camp Are You In?

I think there are probably two camps.

One camp is thinking this is insane. Sales and Warehouse? It is just a simple warehousing system. I can do the join. It is one database. It is a monolith. No problem.

And that may be working totally fine for you, depending on the size of your system and what else is accessing that database.

If it works, it works.

However, there are large enough systems where you absolutely need boundaries because you cannot evolve without them.

I can guarantee there are a lot of people living in those types of systems, where they are handcuffed because they cannot make any schema changes at all. They have no idea what application code or what queries are happening anywhere that are calling those tables.

You cannot make a change because you are going to break something else.

In those situations, you want to create an API because it is your contract. It is something you can evolve, at least a little bit easier than when clients are reaching directly into your database.

That is much more difficult to change.

Your Code Might Be Structured, But Your Data Might Not Be

If you just want to make that query and do that join, have at it. But realize the tradeoff you are making and the coupling you are creating.

Your application code might be well structured. It might look like it does not have much coupling. It might look like a modular monolith.

But if all the coupling is happening at the database level, what you still have is a big ball of mud.

It just has a nicer looking code structure wrapped around it.

Join CodeOpinon!
Developer-level members of my Patreon or YouTube channel get access to a private Discord server to chat with other developers about Software Architecture and Design and access to source code for any working demo application I post on my blog or YouTube. Check out my Patreon or YouTube Membership for more info.