That’s NOT an Aggregate in Domain Driven Design

Are you frustrated that you have to open multiple files across multiple layers to make what seems like a simple change? One of the culprits is following structures and templates that apply patterns or concepts to solve problems you might not have. One typical case is using an aggregate from Domain Driven Design. In this video, I’ll give examples of where an aggregate makes sense and where it doesn’t and only adds useless indirection.

YouTube

Check out my YouTube channel, where I post all kinds of content accompanying my posts, including this video showing everything in this post.

Useless Indirection

The idea for this video/blog came from a comment I received on a YouTube video on my channel where I was talking about indirection.

The sentiment of this comment is all too common. I have many similar comments and have conversations with developers all the time about this. One of the culprits is applying patterns or concepts that are solutions to problems you don’t have in a given context.

One such pattern is the Aggregate from Domain Driven Design. The purpose of an aggregate is to create a consistency boundary. Unfortunately, the way it’s often explained makes it look like an object model or hierarchy instead.

Aggregate Domain Driven Design

The stereotypical example is to model a shopping basket. You would have a basket that would have many basket items. Many think this is an aggregate because you cannot have a basket item without a basket. In this case, this would be the aggregate, and the Basket would be the aggregate root.

Typically you’d then use a Repository to save and fetch the aggregate out, only exposing the aggregate root (Basket) to consumers.

But does this need to be an aggregate?

Aggregates are often incorrectly used to model an object/data hierarchy and to hold domain logic, which in practice is usually trivial validation rather than complex domain logic.

However, an aggregate is about creating a consistency boundary. It’s not about modeling a hierarchy.

Do you need consistency within this aggregate?

Useless Setters

Here’s a made-up example of an aggregate based on a sample I found on GitHub.

This is a simplified example. However, you can see two methods for setting the Name and the Price of this Entity. There is also some logic for setting the price: the price must be greater than zero. To do this, it’s using a specification.
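The listing itself isn't reproduced here, so here's a rough Python sketch of the kind of entity being described. The names `set_name`, `set_price`, and `is_satisfied_by` are my own stand-ins for the methods mentioned above, and assigning the price before checking it is an assumption made to illustrate the problem discussed next.

```python
from dataclasses import dataclass


class ProductNegativePriceSpecification:
    """Hypothetical specification: satisfied when the price is above zero."""

    def is_satisfied_by(self, price: float) -> bool:
        return price > 0


@dataclass
class Product:
    name: str = ""
    price: float = 0.0

    def set_name(self, name: str) -> None:
        # Nothing but a pass-through to the underlying field.
        self.name = name

    def set_price(self, price: float) -> None:
        # Assumed ordering: assign first, validate second, so a bad
        # value sticks on the entity even though the call raises.
        self.price = price
        if not ProductNegativePriceSpecification().is_satisfied_by(price):
            raise ValueError("Price must be greater than zero")
```

A caller that catches the `ValueError` is left holding a `Product` whose `price` is still the rejected value.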

What value does the specification serve? What value do the SetName and SetPrice have? None.

The SetName method is just setting the underlying Name property. It’s useless indirection.

The SetPrice contains some validation logic, which is nice. However, the separate ProductNegativePriceSpecification is useless indirection. The SetPrice is also putting our entity in an invalid state even though it’s throwing. The caller could catch the exception and carry on.

We could just put the conditional check directly in the SetPrice method. But we can also use value objects and types to enforce a valid value directly from the caller.

Now, what value do the SetName and SetPrice have? Zero value. They are just setting the underlying properties. We’ve enforced our product price when the caller needs to construct a ProductPrice type.

We don’t have an aggregate (root). We have a data model with useless setters. Remove the SetPrice and SetName, then set the properties directly from the calling code.
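As a minimal sketch of that end state, assuming a Python dataclass stands in for the entity: `ProductPrice` is a value object that refuses to be constructed with an invalid amount, and `Product` is a plain data model with no setters. The field names are illustrative, not taken from the original sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductPrice:
    """Value object: an instance can only exist with a valid amount."""

    amount: float

    def __post_init__(self) -> None:
        if self.amount <= 0:
            raise ValueError("Price must be greater than zero")


@dataclass
class Product:
    # Plain data model: no SetName/SetPrice wrappers.
    name: str
    price: ProductPrice


# Calling code sets the properties directly. The price rule is enforced
# when the caller constructs a ProductPrice, before Product is touched.
product = Product(name="Widget", price=ProductPrice(9.99))
product.name = "Gadget"
product.price = ProductPrice(12.50)
```

Because the check lives in the type, the entity can never be handed an invalid price in the first place.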

Consistency Boundary

So when do you need an aggregate? Well, here’s an example of an Order Aggregate (root)

This slimmed-down version of the Order Aggregate Root illustrates what’s important. When we add an order item, we do it through the aggregate root (Order) because we want to only have a single unique product per order. Also, if we have a discount for the product, we want to use the discount with the greatest value. This is a consistency boundary. We need an aggregate and all operations to go through the root to perform this logic. We don’t want random data access code or transaction scripts managing order items. This gives us consistency.

Lastly, in the SetStockconfirmedStatus method, we’re making a state change, but we’re also publishing a domain event OrderStatusChangedToStockConfirmed. Other parts of our system likely rely on this event when that state changes. We must always publish this event when the order status changes to StockConfirmed. Again, consistency on state change and publishing an event.
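Here's a hedged Python sketch of an aggregate root along those lines. It is not the original listing: the method names are Python renderings of the ones mentioned above, and the field names, the initial status value, and the choice to add up units for a repeated product are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class OrderItem:
    product_id: int
    unit_price: float
    discount: float
    units: int


@dataclass(frozen=True)
class OrderStatusChangedToStockConfirmed:
    order_id: int


class Order:
    """Aggregate root: every change to the order's items goes through here."""

    def __init__(self, order_id: int) -> None:
        self.id = order_id
        self.status = "Submitted"  # assumed initial status
        self._items = []
        self.domain_events = []

    @property
    def items(self) -> tuple:
        # Read-only view so callers cannot bypass add_order_item.
        return tuple(self._items)

    def add_order_item(self, product_id, unit_price, discount, units=1):
        existing = next(
            (i for i in self._items if i.product_id == product_id), None
        )
        if existing is None:
            self._items.append(OrderItem(product_id, unit_price, discount, units))
            return
        # One line per product, keeping the greatest discount seen.
        existing.discount = max(existing.discount, discount)
        existing.units += units  # assumption: repeated adds accumulate units

    def set_stock_confirmed_status(self) -> None:
        # The state change and the domain event always happen together.
        self.status = "StockConfirmed"
        self.domain_events.append(OrderStatusChangedToStockConfirmed(self.id))
```

Both invariants (one item per product with the best discount, and an event for every status change) hold only because no other code can reach the item list or the status directly.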

Aggregate or Data Model

If you need a consistency boundary, use an aggregate and aggregate root. You’re not getting any of the benefits if you have a data model with just setters. Don’t add useless indirection. Just use a data model with transaction scripts.

Join!

Developer-level members of my YouTube channel or Patreon get access to a private Discord server to chat with other developers about Software Architecture and Design and access to source code for any working demo application I post on my blog or YouTube. Check out my Patreon or YouTube Membership for more info.

Follow @CodeOpinion on Twitter

Software Architecture & Design

Get all my latest YouTube Videos and Blog Posts on Software Architecture & Design

McDonald’s Journey to Event-Driven Architecture

McDonald’s uses Event-Driven Architecture! Luckily for us, they’ve written a couple of blog posts providing some details of their journey into event-driven architecture. I’m going to go a bit deeper by providing my thoughts on how their system works and why they are doing it so that it can give you some ideas about your systems.

McDonald’s Event-Driven Architecture

It’s always interesting to see companies post details of the architecture of various systems they have. It can be insightful to see what they are doing, why, and their challenges. McDonald’s posted behind-the-scenes and how-it-works blog posts detailing their journey to event-driven architecture. More specifically, it’s not that they are new to event-driven architecture, but rather that they wanted a standardized way to implement it across distributed teams of developers with different skill levels.

There are many different components to their platform. Their infrastructure is within AWS, and they use MSK (Managed Streaming for Kafka) along with ECS, DynamoDB, and API Gateway.

Here’s how everything works together.

Schema Registry

One of their challenges was related to data quality, likely because there was no set definition (schema) for the data within events. If multiple producers produce the same event type, they might not compose it in exactly the same way. I believe an event should have a single publisher, the owner of that schema, to avoid this issue. However, a schema registry could also be applicable in a message-driven architecture that’s using queues and commands.

Producers at startup use a custom SDK that retrieves all the event schemas from the registry. This allows the producer to validate the event being produced against the schema.

If validation passes, the producer then uses the SDK to publish the event to the appropriate Kafka topic.
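To make the producer side concrete, here's a small Python sketch of the idea. It is not McDonald's SDK: the `Producer` class, the in-memory `REGISTRY`, the schema format (field name to type), and the `send` callback that stands in for a Kafka client are all assumptions.

```python
import json

# Hypothetical stand-in for the schema registry:
# (event type, version) -> required fields and their types.
REGISTRY = {
    ("OrderPlaced", 1): {"order_id": str, "total": float},
}


class SchemaValidationError(Exception):
    pass


class Producer:
    def __init__(self, registry, send):
        # Load every schema once, at startup.
        self._schemas = dict(registry)
        self._send = send  # stand-in for a Kafka client's send function

    def validate(self, event_type, version, payload):
        schema = self._schemas.get((event_type, version))
        if schema is None:
            raise SchemaValidationError(f"unknown schema {event_type} v{version}")
        for name, expected in schema.items():
            if not isinstance(payload.get(name), expected):
                raise SchemaValidationError(f"{name} must be {expected.__name__}")

    def publish(self, topic, event_type, version, payload):
        # Only events that pass validation ever reach the topic.
        self.validate(event_type, version, payload)
        message = {"type": event_type, "version": version, "data": payload}
        self._send(topic, json.dumps(message))
```

Carrying the type and version in the message is what would let a consumer pick the matching schema when it deserializes.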

As you can expect, on the consumer side, the same thing occurs. Consumers at startup use a custom SDK that retrieves all the schemas from the registry, just like the producers do.

Then the consumers can process messages from the Kafka topics and understand how to deserialize them from the schema and version of the schema.

Everything within any Kafka topic should be valid based on all the schemas (versioned) within the registry. Data quality issues are solved!

Validation

Of course, not everything goes through the happy path. What if a producer tries to publish an event, but it fails to validate against the schema? The producer then publishes the message to a Dead Letter Queue. Kafka isn’t a queue, so this is a Dead Letter Topic.

Producer to DLQ

Once a message is in the “DLQ” there needs to be a way to view, modify and fix the event so it can be re-published to the correct topic.

They have an Admin/Utility UI that provides this functionality.
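The routing decision can be sketched in a few lines of Python. Everything here is hypothetical: `require_order_id` stands in for real schema validation, `send` stands in for a Kafka client, and the topic names are made up.

```python
def require_order_id(event):
    # Hypothetical stand-in for validating against a registry schema.
    if "order_id" not in event:
        raise ValueError("order_id is required")


def publish_or_dead_letter(event, validate, send, topic, dead_letter_topic):
    """Send valid events to `topic`; route invalid ones to the dead letter topic."""
    try:
        validate(event)
    except ValueError as error:
        # Keep the original event and the reason it failed, so it can be
        # inspected, fixed, and re-published later.
        send(dead_letter_topic, {"event": event, "error": str(error)})
        return dead_letter_topic
    send(topic, event)
    return topic
```

Storing the failure reason next to the event is a design choice that makes an admin tool far more useful than a bare copy of the payload.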

Reliable Publishing

The second failure that can occur is failing to publish to Kafka (MSK). Anyone getting involved in Event-Driven Architecture is bound to run into this. You need consistency between making state changes to your business data and publishing your event. When events become critical to your system and possibly workflows, you need guarantees that you publish the relevant events when you make some state change to business data.

McDonald’s chose to use DynamoDB to persist any events that cannot be published to Kafka. This means their Publisher SDK will fall back to storing the event data within DynamoDB if it cannot publish to Kafka.

Using a fallback to some durable storage is a common approach. However, the Outbox Pattern is another common solution. I discussed this and other common issues in a post about the 5 Pitfalls of EDA.

Once the event data is in DynamoDB, they use Lambda to pull it from DynamoDB and then retry and publish it to Kafka. I’d assume they have different retry intervals/backoffs.
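Here's a minimal Python sketch of the fallback-and-retry idea, under loud assumptions: a plain list stands in for the DynamoDB table, `send` stands in for the Kafka client and is assumed to raise `ConnectionError` on failure, and `retry_pending` plays the role of the scheduled Lambda. It has no backoff, and it does not cover the case where the fallback write itself fails.

```python
class ReliablePublisher:
    def __init__(self, send, fallback_store):
        self._send = send                # stand-in for a Kafka client's send
        self._fallback = fallback_store  # stand-in for a durable table

    def publish(self, topic, event):
        try:
            self._send(topic, event)
            return True
        except ConnectionError:
            # Could not reach the broker: keep the event for a later retry.
            self._fallback.append((topic, event))
            return False

    def retry_pending(self):
        """Re-publish stored events; return how many are still pending."""
        still_pending = []
        for topic, event in self._fallback:
            try:
                self._send(topic, event)
            except ConnectionError:
                still_pending.append((topic, event))
        self._fallback[:] = still_pending
        return len(still_pending)
```

Note that events published through the retry path can arrive after newer events, so consumers should not rely on strict ordering with this approach.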

Lambda Retry

Gateway

Lastly, if you’re integrating with 3rd parties or even within a large organization, you’ll need to have them publish events. However, they won’t have direct access to your SDK and Kafka. For this, they use API Gateway as an HTTP interface: it accepts HTTP requests and passes them to a Producer that has the SDK and can publish to Kafka.

Event Gateway

That way, we go through the same validation against the schema in the registry just as if any of our client code is using the producer SDK. This allows external 3rd parties to publish events without using our SDK directly. We can instead have them use our Event Gateway (HTTP API).

Technical Blog Posts

I love when companies have technical blog posts that give insights into their architecture and design. It’s hard to know the full context, but seeing how they solve these issues they run into is interesting. Companies face many common issues when using Event-Driven Architecture, but all have unique constraints.

If you have any recommendations for other technical blog posts to analyze, please leave a comment!

Do you need a Distributed Transaction? Maybe not!

If you’re working in a distributed application, you’re bound to run into a design issue where you want data consistency between services. But you don’t have a distributed transaction, so what’s the solution? In this video, I will take an example use case and explain the design challenge and solutions for handling communication and consistency between services.

Workflow

This example use case was asked in my private Discord server by a member of my blog/channel. The domain is a subscription service where you buy a subscription and receive orders daily. The Subscription has a Balance, which is funded by the amounts charged to your credit card. Your credit card is charged at an interval to keep a positive balance for your subscription.

One crucial aspect is that a subscription can lapse. For example, the customer’s credit card may have expired, and the customer did not update it before the balance reached $0. Because of this, they allow a grace period of 2 days during which you will still receive your Order, and you’ll go into a negative balance. Once you update your credit card and it is successfully charged, the balance is updated, bringing you out of the negative balance, and the orders received during the grace period are marked as paid.

Here’s how the current system works when a subscription lapses.

There are two boundaries. One boundary is for handling credit card payments and managing subscriptions. The other boundary is for managing Orders.

Services

The customer updates their credit card, and the Payment service hits the payment gateway to charge their credit card.

Add Payment

Once the credit card is charged successfully, they update the subscription to set the new balance to the amount charged.

Update Balance

Then they use an event-driven architecture, create a PaymentCompleted Event, and publish it to a message broker.

Publish Event

The Order service consumes that event. It looks at its database to determine which orders have not been paid yet.

Consume Event

Then the order service makes a synchronous blocking RPC call (HTTP or gRPC) back to the Payment service to decrease the balance for the orders marked as paid, which were created during the grace period.

RPC to update Balance

Once that RPC call is completed, the order service can mark the orders as fully paid. Now both Order service and Payment service are consistent. All the orders are marked as paid, and the payment service’s balance is correct and consistent.

But what happens if there is an error at that last step, updating the Orders status as paid?

Requires a distributed transaction

Now we’re left in an inconsistent state. We’ve decreased the balance in the Payment service but failed to set the Orders as paid.
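The failure is easy to reproduce in a toy Python model. The two classes below are hypothetical in-memory stand-ins for the services, a direct method call stands in for the RPC, and the `fail_on_mark_paid` flag simulates the database error at the last step.

```python
class PaymentService:
    def __init__(self, balance):
        self.balance = balance

    def decrease_balance(self, amount):
        # Stands in for the synchronous RPC call (HTTP or gRPC).
        self.balance -= amount


class OrderService:
    def __init__(self, unpaid):
        self.unpaid = dict(unpaid)  # order id -> amount
        self.paid = set()
        self.fail_on_mark_paid = False  # simulates an error at the last step

    def handle_payment_completed(self, payments):
        total = sum(self.unpaid.values())
        payments.decrease_balance(total)  # step 1: remote state changes
        if self.fail_on_mark_paid:
            raise RuntimeError("database unavailable")
        self.paid.update(self.unpaid)     # step 2: local state changes
        self.unpaid.clear()


payments = PaymentService(balance=50)
orders = OrderService({1: 10, 2: 15})
orders.fail_on_mark_paid = True
try:
    orders.handle_payment_completed(payments)
except RuntimeError:
    pass
# The balance was decreased, but both orders are still unpaid.
```

Nothing ties step 1 and step 2 together, so a failure between them leaves each service with a different view of the same orders.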

It sure looks like we need a distributed transaction! Not so fast.

Boundaries

Another option is to design the system so that it doesn’t need a distributed transaction at all. We have these consistency issues because we are keeping track of the order status separately from the balance.

The Orders Service contains the status of the order. The subscription service has the subscription balance and all the credit card transactions.

Entity Services

One solution is to move the status of an order to the subscription service. This means having the same concept of order in both boundaries but for different purposes. They only share the OrderId and the amount. This is a simplified example but there might be many more pieces of data that are unique to each boundary. For example, the Orders boundary also has the CustomerId, which the Subscription boundary doesn’t care about. It cares about the status and the amount of all the orders.

Data Ownership for consistency

These changes now mean we don’t have to communicate or have any workflow between services when a credit card is updated, and we need to mark orders as paid that were created during the grace period.

When the credit card is updated, we hit the payment gateway to charge the customer’s credit card.

Charge Credit Card

Then we update the balance as we did before; however, now we can also, within the same transaction, update the orders that have not been marked as paid because we have the order status within the Payment service.
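Here's a sketch of what that single transaction could look like, using Python's built-in `sqlite3` as a stand-in for the Payment service's database. The table and column names are made up, amounts are in cents, and I'm assuming grace-period orders are deducted from the balance at the moment they are marked as paid.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: both tables live in the Payment service's database.
    conn.execute(
        "CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, "
        "subscription_id INTEGER NOT NULL, amount INTEGER NOT NULL, "
        "status TEXT NOT NULL)"
    )
    conn.commit()


def apply_payment(conn, subscription_id, amount):
    """Credit the charge and settle unpaid orders in one local transaction."""
    with conn:  # commits on success, rolls back if anything raises
        owed = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders "
            "WHERE subscription_id = ? AND status = 'unpaid'",
            (subscription_id,),
        ).fetchone()[0]
        conn.execute(
            "UPDATE subscriptions SET balance = balance + ? - ? WHERE id = ?",
            (amount, owed, subscription_id),
        )
        conn.execute(
            "UPDATE orders SET status = 'paid' "
            "WHERE subscription_id = ? AND status = 'unpaid'",
            (subscription_id,),
        )
```

If either update fails, both are rolled back, so the balance and the order statuses can never disagree.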

Update Balance and Order Status, fully consistent

Entity Services

Why did the Order service need to own the order status, determining if it was paid? This is because we often get caught up in entity services: services that own everything to do with an entity. However, the entity usually has different purposes for different boundaries. You do not need to have a single Entity live in only one Service. The concept of an entity can exist in many different boundaries, and each owns a portion of the data and behaviors around that data.

Do you need a distributed transaction? Maybe not. Look at data ownership around the consistency you’re looking for.
