Do you need a Distributed Transaction? Maybe not!

If you’re working in a distributed application, you’re bound to run into a design issue where you want data consistency between services. But you don’t have a distributed transaction, so what’s the solution? In this video, I will take an example use case and explain the design challenge and solutions for handling communication and consistency between services.

YouTube

Check out my YouTube channel, where I post all kinds of content accompanying my posts, including this video showing everything in this post.

Workflow

This example use case was asked in my private Discord server by a member of my blog/channel. The domain is a subscription service where you buy a subscription and receive orders daily. The Subscription has a Balance, which is the amount charged to your credit card. Your credit card is charged at an interval to keep a positive balance for your subscription.

One crucial aspect is that a subscription can lapse. For example, the customer’s credit card may have expired, and the customer did not update it before the balance reached $0. Because of this, they allow a grace period of 2 days during which you will still receive your Order, and you’ll go into a negative balance. Once you update your credit card and it is successfully charged, the balance is updated, removing you from a negative balance, and the orders received during the grace period are marked as paid.

Here’s how the current system works when a subscription lapses.

There are two boundaries. One boundary is for handling credit card payments and managing subscriptions. The other boundary is for managing Orders.

Services

The customer updates their credit card, and the Payment service hits the payment gateway to charge their credit card.

Add Payment

Once the credit card is charged successfully, they update the subscription to set the new balance to the amount charged.

Update Balance

Then they use an event-driven architecture, create a PaymentCompleted Event, and publish it to a message broker.

Publish Event

The Order service consumes that event. It looks at its database to determine which orders have not been paid yet.

Consume Event

Then the order service makes a synchronous blocking RPC call (HTTP or gRPC) back to the Payment service to decrease the balance for the orders marked as paid, which were created during the grace period.

RPC to update Balance

Once that RPC call is completed, the order service can mark the orders as fully paid. Now both Order service and Payment service are consistent. All the orders are marked as paid, and the payment service’s balance is correct and consistent.

But what happens if there is an error at that last step, updating the Orders status as paid?

Requires a distributed transaction

Now we’re left in an inconsistent state. We’ve decreased the balance in the Payment service but failed to set the Orders as paid.
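To make the failure concrete, here is a minimal sketch of that workflow. The class and method names (`PaymentService`, `OrderService`, `on_payment_completed`, `fail_on_save`) are hypothetical stand-ins, and the remote call is simulated with a plain method call, so treat it as an illustration rather than the actual system.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentService:
    """Stand-in for the Payment boundary; owns the subscription balance."""
    balance: float = 0.0

    def charge_card(self, amount: float) -> None:
        # Pretend the payment gateway accepted the charge.
        self.balance += amount

    def decrease_balance(self, amount: float) -> None:
        # In the described system this would be the RPC made by the Order service.
        self.balance -= amount


@dataclass
class OrderService:
    """Stand-in for the Orders boundary; owns the paid/unpaid status."""
    payments: PaymentService
    unpaid: dict = field(default_factory=dict)  # order_id -> amount
    paid: set = field(default_factory=set)
    fail_on_save: bool = False  # simulates an error when saving order status

    def on_payment_completed(self) -> None:
        total = sum(self.unpaid.values())
        self.payments.decrease_balance(total)  # step 1: the RPC succeeds
        if self.fail_on_save:
            raise RuntimeError("could not save order status")  # step 2 fails
        self.paid.update(self.unpaid)
        self.unpaid.clear()


payments = PaymentService()
orders = OrderService(
    payments, unpaid={"order-1": 10.0, "order-2": 10.0}, fail_on_save=True
)

payments.charge_card(50.0)
try:
    orders.on_payment_completed()
except RuntimeError:
    pass

# The balance was reduced, but both orders are still unpaid.
print(payments.balance, sorted(orders.unpaid))  # 30.0 ['order-1', 'order-2']
```

The two writes live in different boundaries, so nothing undoes the first when the second fails.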

It sure looks like we need a distributed transaction! Not so fast.

Boundaries

Another solution is to not need a distributed transaction at all. We have these consistency issues because we are keeping track of the order status separately from the balance.

The Orders service contains the status of the order. The Subscription service (the Payment service in the workflow above) has the subscription balance and all the credit card transactions.

Entity Services

One solution is to move the status of an order to the subscription service. This means having the same concept of order in both boundaries but for different purposes. They only share the OrderId and the amount. This is a simplified example but there might be many more pieces of data that are unique to each boundary. For example, the Orders boundary also has the CustomerId, which the Subscription boundary doesn’t care about. It cares about the status and the amount of all the orders.

Data Ownership for consistency

These changes mean we no longer need any communication or workflow between services when a credit card is updated and we need to mark the orders created during the grace period as paid.

When the credit card is updated, we hit the payment gateway to charge the customer’s credit card.

Charge Credit Card

Then we update the balance as we did before; however, now we can also, within the same transaction, update the orders that have not been marked as paid because we have the order status within the Payment service.

Update Balance and Order Status, fully consistent
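As a rough sketch of what "within the same transaction" could look like, the example below uses SQLite from Python's standard library. The table and column names (`subscription`, `subscription_order`, `paid`) and the `apply_payment` function are assumptions made for illustration; the only point is that both updates commit or roll back together in one local database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subscription (id INTEGER PRIMARY KEY, balance REAL NOT NULL);
    -- The Payment boundary's view of an order: only id, amount, and status.
    CREATE TABLE subscription_order (
        order_id TEXT PRIMARY KEY,
        subscription_id INTEGER NOT NULL,
        amount REAL NOT NULL,
        paid INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO subscription VALUES (1, 0);
    INSERT INTO subscription_order VALUES ('order-1', 1, 10, 0), ('order-2', 1, 10, 0);
    """
)


def apply_payment(conn, subscription_id, amount_charged):
    """Set the new balance and settle unpaid orders in one local transaction."""
    with conn:  # commits on success, rolls back if anything raises
        # New balance = amount charged minus what the grace-period orders cost.
        conn.execute(
            "UPDATE subscription SET balance = ? - ("
            "  SELECT COALESCE(SUM(amount), 0) FROM subscription_order"
            "  WHERE subscription_id = ? AND paid = 0"
            ") WHERE id = ?",
            (amount_charged, subscription_id, subscription_id),
        )
        conn.execute(
            "UPDATE subscription_order SET paid = 1 "
            "WHERE subscription_id = ? AND paid = 0",
            (subscription_id,),
        )


apply_payment(conn, 1, 50.0)
balance = conn.execute("SELECT balance FROM subscription WHERE id = 1").fetchone()[0]
unpaid = conn.execute(
    "SELECT COUNT(*) FROM subscription_order WHERE paid = 0"
).fetchone()[0]
print(balance, unpaid)  # 30.0 0
```

If the second `UPDATE` raised, the `with conn:` block would roll back the balance change as well, which is the guarantee the cross-service workflow could not give.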

Entity Services

Why did the Order service need to own the order status, determining if it was paid? This is because we often get caught up in entity services. Services that own everything to do with an entity. However, the entity usually has different purposes for different boundaries. You do not need to have a single Entity live in only one Service. The concept of an entity can exist in many different boundaries, and each owns a portion of the data and behaviors around that data.

Do you need a distributed transaction? Maybe not. Look at data ownership around the consistency you’re looking for.

Join!

Developer-level members of my YouTube channel or Patreon get access to a private Discord server to chat with other developers about Software Architecture and Design and access to source code for any working demo application I post on my blog or YouTube. Check out my Patreon or YouTube Membership for more info.

Follow @CodeOpinion on Twitter

Software Architecture & Design

Get all my latest YouTube Videos and Blog Posts on Software Architecture & Design

How your “Sr.” Devs incurred Technical Debt

Are you overwhelmed by technical debt? Taking the path of least resistance when implementing new features in a large existing codebase will ultimately turn it into a difficult-to-change turd pile. It’s a vicious circle. Making the “quick change” constantly makes it harder to make future changes. So what’s the solution? Be aware of technical debt, stop thinking solely about data, and give yourself options in your architecture.

Path of Least Resistance

One common reason for a system growing over time and becoming unmaintainable is developers choosing to take the path of least resistance when implementing a change.

This happens for various reasons, such as time constraints, unfamiliarity with the system, lack of domain knowledge, poor overall architecture & design, etc.

For example, let’s say we have a typical web application that is using some underlying web framework that invokes some code into our application logic, through to our domain, and then some interaction with a database.

Application Request

When a new feature is implemented, it’s common to look at other features as templates for developing a new feature. Or, worse, it can be using an existing feature and adding the relevant code needed for the new feature throughout the stack. I say worse because this can often confuse two concepts that seem similar but are very distinct. Merging the two concepts within the same code path can add complexity.

Layers

This means we may change existing code through the entire stack, from the client, web API, application code, domain, and our database.
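Here is a small, hypothetical illustration of piggybacking. The order and quote features and all function names are invented for this sketch and are not from any real codebase. The first function merges two concepts into one code path behind a flag; the functions after it keep them separate and share only the calculation they truly have in common.

```python
def place_order_or_quote(items, is_quote=False):
    """Piggybacked version: one code path serves two distinct concepts."""
    total = sum(price * quantity for price, quantity in items)
    result = {"total": total}
    if is_quote:
        result["valid_for_days"] = 30  # only quotes expire
    else:
        result["status"] = "placed"  # only orders are placed
    return result


def calculate_total(items):
    """The one piece of logic both features genuinely share."""
    return sum(price * quantity for price, quantity in items)


def place_order(items):
    """Separate code path for orders."""
    return {"total": calculate_total(items), "status": "placed"}


def create_quote(items):
    """Separate code path for quotes."""
    return {"total": calculate_total(items), "valid_for_days": 30}


items = [(5, 2), (3, 1)]  # (price, quantity) pairs
print(place_order_or_quote(items, is_quote=True) == create_quote(items))  # True
```

Both versions produce the same result today. The difference shows up later: every new rule for quotes adds another branch to the shared function, and every change to it risks breaking orders.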

You may decide to piggyback off another feature because of time constraints. It’s not that the feature is difficult to implement; it’s that implementing it properly is time-consuming or will take more time than you have. Or, if you’re new to the codebase or it’s brittle, you might be afraid to make changes because you know they can break other parts of the system, and you don’t want to cause any regressions.

The path of least resistance is making a change that you know isn’t going to break anything and isn’t overly time-consuming, but it’s not necessarily the ideal one. It’s likely good for right now but not good for the long run.

Technical Debt

Technical debt isn’t inherently bad. For me, technical debt comes in two forms. The first is when you’re aware of it and choose to take on technical debt at that very moment, knowing it adds value now but will cause issues in the future. Making this decision explicitly and with awareness isn’t bad.

The second is when you’re unaware that you’re making these types of decisions. That is when you’re headed in the wrong direction.

If you’re making explicit decisions about the tradeoffs of technical debt, you’re aware of the debt being incurred. You can then explicitly choose when to pay off (refactor) that debt. For example, with a startup, you might incur debt right now so that you have a future.

On the other side, if you’re unaware that you’re incurring technical debt, then when would you realize all the debt that’s been incurred and needs to be addressed? Taking the path of least resistance, without realizing it, is one form of this happening. While it seems like it’s helping you now, it could be hindering you now and even more so in the future.

Coupling & Cohesion

Software Architecture is about making key decisions at a low cost that give you options in the future. Having a good architecture allows you to evolve your system over time. As a codebase and system grow, it should not hinder future development. I’ve talked about this more in my post What is Software Architecture?

Why is a system brittle and hard to change? Generally, it has a high degree of coupling between the higher and lower levels within the system. I find this is often because of the focus on data and informational cohesion rather than functional cohesion.

For example, let’s say we are in an e-commerce and warehouse system. There is the concept of a product. When we primarily think about data first, we think of a singular product. It holds all information for everything related to an individual product: the name, price, location in the warehouse, the quantity on hand, whether it is available for sale, etc.

In reality, a system for e-commerce and a warehouse would be huge. A large codebase that multiple departments would use in an organization. Sales, Purchasing, Warehouse (shipping & receiving), Accounting, and more.

In other words, I’m simplifying this example only to show a few different pieces of data related to a product, but in reality, there would be a lot.

When focusing on data primarily, we lose sight of the behaviors that relate to this data. What does the QuantityOnHand have to do with the Price? What does the Location have to do with the Description?

Nothing.

We’ve lumped all aspects into one concept of a product. However, in a large system like this, the concept of a product would exist in many different forms depending on the behaviors provided.

Product Entities

Sales has a concept of the product that cares about the selling price and whether we’re selling it. It’s customer-focused.

Purchasing cares about the price from the vendor or manufacturer, which is our cost. It’s vendor-centric.

The warehouse cares about the location of the product in the warehouse and the assumed quantity on hand.

Each logical boundary has a concept of a product but has different concerns in each of its own contexts.

This means that instead of mixing all these different concerns together, you should be driven by the capabilities of each boundary and then by the data ownership for those capabilities.
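A minimal sketch of what that could look like in code: each boundary keeps its own small product type instead of one shared product. The class names, fields, and the `sku` identifier are assumptions for illustration; a real system would have far more data and behavior in each.

```python
from dataclasses import dataclass


@dataclass
class SalesProduct:
    """Sales: customer-focused, cares about selling price and availability."""
    sku: str
    selling_price: float
    for_sale: bool

    def discontinue(self) -> None:
        self.for_sale = False


@dataclass
class PurchasingProduct:
    """Purchasing: vendor-centric, cares about our cost."""
    sku: str
    vendor_cost: float


@dataclass
class WarehouseProduct:
    """Warehouse: cares about location and assumed quantity on hand."""
    sku: str
    location: str
    quantity_on_hand: int

    def receive(self, quantity: int) -> None:
        self.quantity_on_hand += quantity


# The boundaries share only the identifier, never each other's data.
sales = SalesProduct(sku="abc", selling_price=19.99, for_sale=True)
warehouse = WarehouseProduct(sku="abc", location="A-12", quantity_on_hand=5)

warehouse.receive(10)  # changing stock never touches pricing
print(warehouse.quantity_on_hand, sales.selling_price)  # 15 19.99
```

Receiving stock changes only the warehouse's view, so a change to how pricing works cannot break shipping and receiving.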

Low functional cohesion will lead to a high degree of coupling.

Defining logical boundaries by grouping related behaviors will lead to higher cohesion, which can then lead to loose coupling.

Awareness

Part of taking the path of least resistance is being aware of the trade-offs you are making between coupling and cohesion. Earlier I mentioned piggybacking off an existing feature to implement a new feature. When you do that, you’re coupling the two. Again, that’s not a bad thing if the decision is explicit.

Over time, left unchecked, if you’re unaware of the technical debt you’re creating, you’ll end up with a large turd pile that’s brittle and hard to change.

If you are aware, you can choose when to pay down debt (refactor). By continuing to make those decisions over time, you can manage the amount of debt incurred and never let it get out of reach.

I often say a system is a turd pile because nothing is perfect. It’s a constant battle to pay down debt, whether you choose it explicitly or not.

Design Patterns: Who gives a 💩?

Should you care about design patterns? There are books devoted to them; heck, even I post videos about specific design patterns. But do they matter? If you’re new to design, it can be overwhelming and cause a lot of unneeded complexity. I will cover how I think of design patterns or how I don’t think of design patterns.

Bell Curve

In my own personal experience, and from witnessing the same happen to other developers in their careers, there is a mid-career explosion of complexity. Design patterns play a big role in this explosion of complexity.

Early on in a developer’s career, they are often writing pretty simple code. While that code may be highly coupled and flawed, they generally write “simple code.”

Beginner Simplicity
The vertical axis is the level of complexity; the horizontal axis is the level of experience.

Now we can argue what “simple code” means, but speaking for myself, it was straightforward without any magic. You aren’t writing “smart” or “clever” code. What you see is what you get. There was little indirection.

As you gain more experience and read or watch various tutorials/courses/books, you start seeing every problem as something to solve with patterns.

There is a similar case to be made for doing the same thing with the latest technology, library, framework, or platform. You learn something new and immediately want to apply it. Unfortunately, this often leads to using it aimlessly. Meaning you have a hammer, so everything starts looking like a nail.

Design Pattern Complexity

Now you might be thinking, “Really, people are just applying patterns for no good reason?” Yes, this is more common than you think. It can also happen because you apply a pattern when you only think you have the problem it solves. More on that later.

Hopefully, you feel enough pain in this phase of your career to realize you aren’t solving problems but rather creating unneeded complexity. On the bright side, you’ll better understand various patterns and the problems they solve, as you’ve used them for the right and wrong reasons.

On the other side of the complexity nightmare phase is a return to simplicity that resembles the naive code you wrote at the beginning of your career.

Simplicity with deeper insight

That’s not to say you’re writing beginner code, but it’s focused, trivial, with no magic, and less useless indirection. It’s direct and to the point. As someone commented on the YouTube video:

It took me four years to paint like Raphael, but a lifetime to paint like a child.

Pablo Picasso

Communication

A key to patterns is communication. Named patterns are a way to communicate between developers about solutions and implementations for various problems. For example, one developer is explaining a problem they are having to another developer. The other developer says that the problem could be solved by [Insert Named Pattern]. If both developers understand the named pattern, they don’t have to get lost in the deep implementation details of that pattern. They already understand it. It’s a communication tool.

According to many comments on various videos I’ve done on YouTube, it’s common for people to apply a pattern without even realizing it’s a named pattern. I have done this countless times over my career. You’re faced with a problem and come up with a solution that turns out to be a named pattern! Once you realize this, great! You understand both the problem the pattern truly solves and how to implement it.

Knowing the names of patterns and the problems they solve is great for communication.

Avoiding the Problem

Earlier I mentioned that you could apply patterns for problems you think you have. However, it’s often helpful to examine why you have the problem. Meaning one solution is to avoid the problem in the first place.

To illustrate this, I’m going to use the example of the Repository Pattern. Now you could take this example more abstractly and apply it to other patterns that add indirection.

There are different definitions of the repository pattern, but for this example, I’ll say it’s used to encapsulate data access logic.

Mixing data access logic with other concerns sounds like a terrible idea. However, there is an underlying issue that’s not talked about. Coupling.

With the repository, you’re coupling to it rather than to the native database provider you would likely otherwise use directly. The purpose of abstractions is to simplify the interface for your purpose. The repository pattern can do this for us. Great. However, you still have the same degree of coupling from your application code using the repository.

If you have hundreds (or thousands) of usages of the repository in your application code, you have a high degree of coupling to your repository. If you make any breaking changes to your repository, you’re faced with changing all calling code that breaks.

Another solution is to limit coupling.

In many situations, it’s not that you need all the data or implementation of the repository. Often you only need a subset. The repository might not be ideal in every situation. I talked about this in my post Should you use the Repository Pattern? With CQRS, Yes and No!

Using your repository abstraction might be helpful in one situation and not ideal in another. You may decide a subset of features is grouped together, use the same related data, and use a repository. Another subgroup of features might choose to access data differently because of its use case.

The problem isn’t that you need to encapsulate data access; the problem is you have a high degree of coupling of calling code that needs data access. One solution is to encapsulate data access with a repository. Another is to limit coupling, so the need for encapsulating data access is less of a concern or decided per situation.
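To sketch the difference, the example below shows a shared repository next to a single feature that reads only the data it needs. Everything here (`ProductRepository`, `get_price_for_quote`, the `product` table) is hypothetical, and SQLite stands in for whatever database provider you actually use.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Product:
    sku: str
    name: str
    price: float
    quantity_on_hand: int


class ProductRepository:
    """Shared abstraction: every caller is coupled to this interface."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def get(self, sku: str) -> Product:
        row = self._conn.execute(
            "SELECT sku, name, price, quantity_on_hand FROM product WHERE sku = ?",
            (sku,),
        ).fetchone()
        return Product(*row)


def get_price_for_quote(conn: sqlite3.Connection, sku: str) -> float:
    """One feature that reads only the column it needs, without the repository."""
    row = conn.execute("SELECT price FROM product WHERE sku = ?", (sku,)).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    "sku TEXT PRIMARY KEY, name TEXT, price REAL, quantity_on_hand INTEGER)"
)
conn.execute("INSERT INTO product VALUES ('abc', 'Widget', 9.99, 5)")

print(ProductRepository(conn).get("abc").name, get_price_for_quote(conn, "abc"))
# Widget 9.99
```

A breaking change to `ProductRepository.get` affects every caller of the repository, while `get_price_for_quote` depends only on one column. Which approach fits is decided per group of features, not once for the whole codebase.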

Design Patterns

Should you care about design patterns? Yes, absolutely. Understanding the names of design patterns and the problems they solve is helpful. It’s a great way to communicate with other developers when discussing problems and solutions. However, don’t apply a pattern unless you truly have the problem it solves. Do you have the problem it solves? Or should you look at ways to eliminate the problem?
