Should you use the Repository Pattern? With CQRS, Yes and No!

The repository pattern is polarizing. Some developers swear you should always use it to abstract data access logic while others think it’s unnecessary if you’re using an ORM. So should you use it? My answer is Yes and No! If you’re applying CQRS and Vertical Slice Architecture you’ll likely want a repository to build up Aggregates. However, for a Query, you may want to just get the data you need rather than an entire aggregate (or collection of aggregates) to build a view model.

YouTube

Check out my YouTube channel where I post all kinds of content that accompanies my posts including this video showing everything that is in this post.

Repository

As with many terms and concepts in the software industry, a repository can mean different things depending on your definition. I’m using the definition from Martin Fowler’s P of EAA Catalog:

Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.

https://martinfowler.com/eaaCatalog/repository.html

The key part of that definition for me is domain objects. Not data models, but domain objects.

CQRS & Vertical Slices

I often use CQRS to separate the pathways for writes (commands) and reads (queries) to my database. This gives you two distinct paths, and each can decide how it interacts with the database, what its dependencies are, etc. This isn’t a top-level architecture, just a decision you can make in various parts of your system.

CQRS is a concept that is often confused. Check out my post CQRS Myths: 3 Most Common Misconceptions that should clear up any confusion.

When you start focusing on an individual command or query, you ultimately end up organizing your code around features. A feature can be an individual command or query, or a collection of a few.

A Vertical Slice is a concept of taking everything related to a feature and organizing it together. As mentioned, this becomes a natural fit with CQRS. Ultimately a feature is a single use-case or a defined set of functionality within your system.

Vertical Slices are focused on features, not technical concerns. No longer are you organizing and writing code in a layered approach. The layers and technical separation are defined per feature.

This means you can define how each command or query handles various concerns, for example, data access.

If you go back to the definition of the Repository Pattern, it’s for accessing domain objects. Domain Objects that are grouped together are defined as an Aggregate. To interact with the Aggregate, all operations are handled by a primary domain object which is the Aggregate Root.

The common example often used is a Sales Order and all the Line Items. The Sales Order and Line Items are domain objects that form an Aggregate. The Sales Order is the Aggregate Root. All operations are done through the Sales Order and no access is done directly to any Line Items. Check out my post on Aggregate Design: Using Invariants as a Guide for more on how to define and design an aggregate.
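As a rough sketch of that Sales Order example, here is what an Aggregate Root might look like. Python is used only to keep it short. The class names, the credit limit invariant, and the method names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Tuple


@dataclass(frozen=True)
class LineItem:
    sku: str
    quantity: int
    unit_price: Decimal


@dataclass
class SalesOrder:
    """Aggregate Root: the only entry point for changing the order or its line items."""
    order_id: int
    credit_limit: Decimal  # hypothetical invariant, for illustration only
    _items: List[LineItem] = field(default_factory=list)

    @property
    def total(self) -> Decimal:
        return sum((i.quantity * i.unit_price for i in self._items), Decimal("0"))

    @property
    def items(self) -> Tuple[LineItem, ...]:
        # Hand out an immutable copy so callers cannot change line items directly.
        return tuple(self._items)

    def add_item(self, sku: str, quantity: int, unit_price: Decimal) -> None:
        # The root checks the invariants before any state change is accepted.
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        if self.total + quantity * unit_price > self.credit_limit:
            raise ValueError("order total would exceed the credit limit")
        self._items.append(LineItem(sku, quantity, unit_price))
```

Callers only ever talk to SalesOrder. Because every change goes through add_item, the rule about the total cannot be bypassed by touching a LineItem on its own.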

This means that I’m only concerned with an aggregate for making state changes. In other words, an Aggregate is required for Commands, not Queries.

This means we can choose to use an Aggregate for any Commands and simply use a Data Model for any Queries. We do not need an Aggregate for queries because the Aggregate’s responsibility is state changes.

Also, most of the time when creating a Query, you want data to be shaped a certain way. This doesn’t necessarily require everything within an Aggregate. Because of this, you’re often over-fetching data to build the Aggregate when you only need a subset of it for the Query.

To illustrate this, consider the eShopOnWeb sample application. The Order entity is the Aggregate Root that is returned from a Repository.

The sample code has a Query that is using the IOrderRepository to list all the Orders for the logged-in user.

Since we don’t need the Order Aggregate, we don’t really need to use the Repository Pattern. The benefit of not using the repository is that we can select only the data we actually need for this use case. The sample reuses the same OrderViewModel for listing all the Orders and, in another route, for viewing an individual Order.

This re-use is actually not helpful because the Order Listing page does not need any of the Order Items or the Shipping Address.

Rather, we can define our result explicitly for this use case and fetch exactly the data needed. Again, this use case did not need any order items, the product for those order items, or the Shipping Address. The aggregate is fetching and returning all this data that we do not need.
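To make the idea of an explicit, use-case-specific result concrete, here is a minimal sketch of a query that skips the repository and selects only what a listing needs. This is not the eShopOnWeb code. The table layout, the OrderSummary type, and the get_my_orders function are assumptions, and Python with SQLite stands in for whatever data access library you actually use.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderSummary:
    """Result shaped for the order listing only (hypothetical name)."""
    order_id: int
    order_date: str
    total: float


def create_schema(conn: sqlite3.Connection) -> None:
    # Assumed schema for illustration; the shipping address lives on the order row.
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, buyer_id TEXT, order_date TEXT,
                             ship_to_street TEXT, ship_to_city TEXT);
        CREATE TABLE order_items (id INTEGER PRIMARY KEY, order_id INTEGER,
                                  product_name TEXT, unit_price REAL, units INTEGER);
    """)


def get_my_orders(conn: sqlite3.Connection, buyer_id: str) -> list:
    # Only three values per order leave the database. No line item objects,
    # products, or shipping address are loaded; the total is computed in SQL.
    rows = conn.execute(
        """
        SELECT o.id, o.order_date, COALESCE(SUM(i.unit_price * i.units), 0)
        FROM orders o
        LEFT JOIN order_items i ON i.order_id = o.id
        WHERE o.buyer_id = ?
        GROUP BY o.id, o.order_date
        ORDER BY o.id
        """,
        (buyer_id,),
    ).fetchall()
    return [OrderSummary(*row) for row in rows]
```

The query owns both the shape of its result and how it gets the data, which is the freedom you give up when every read has to go through the Aggregate.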

The Repository Pattern

If I’m applying CQRS and Vertical Slices, it means that on the Command side I’m going to use a Repository to build up and return an Aggregate. An aggregate is a consistency boundary and is responsible for state changes that are controlled by invariants.
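A command handler on that side might look roughly like the sketch below. The Order type, the CancelOrder command, and the in-memory repository are hypothetical stand-ins, not code from the sample application, and the "shipped orders cannot be cancelled" rule is an invented invariant.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass
class Order:
    order_id: int
    status: str = "Pending"

    def cancel(self) -> None:
        # Invented invariant for illustration: a shipped order cannot be cancelled.
        if self.status == "Shipped":
            raise ValueError("a shipped order cannot be cancelled")
        self.status = "Cancelled"


class OrderRepository(Protocol):
    """Collection-like interface that returns whole aggregates."""
    def get(self, order_id: int) -> Order: ...
    def save(self, order: Order) -> None: ...


class InMemoryOrderRepository:
    """Stand-in for a repository backed by a real database."""
    def __init__(self) -> None:
        self._orders: Dict[int, Order] = {}

    def get(self, order_id: int) -> Order:
        return self._orders[order_id]

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order


@dataclass(frozen=True)
class CancelOrder:
    order_id: int


def handle_cancel_order(command: CancelOrder, repository: OrderRepository) -> None:
    # Load the aggregate, let it enforce its own rules, then persist it.
    order = repository.get(command.order_id)
    order.cancel()
    repository.save(order)
```

The handler never decides whether cancelling is allowed. It only fetches the aggregate through the repository and asks it to change state.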

On the Query side, since I’m not making any state changes, I do not need an Aggregate. An aggregate is likely far more data than I need to transform into the result I’m creating. Queries are specific use cases for returning data, and a Query encapsulates that concept, including how it accesses the data. You could even decide to use a different library for the underlying data access in your Queries than in your Repository. CQRS enables that option.

Source Code

Developer-level members of my YouTube channel or Patreon get access to the full source for any working demo application that I post on my blog or YouTube. Check out the YouTube Membership or Patreon for more info.

Related Links

Follow @CodeOpinion on Twitter

Software Architecture & Design

Get all my latest YouTube Videos and Blog Posts on Software Architecture & Design

Should you publish Domain Events or Integration Events?

Common advice is to not publish domain events outside of your service boundary. They should only exist within your service boundary. Instead, you should publish integration events for other service boundaries. While this general advice makes sense, it’s not so cut-and-dry. There are many reasons why you would want to publish domain events for other services to consume. Here are how I think of Domain Events and Integration Events and when to use them.

Domain Events

What most people are referring to, and generally implement, when they talk about Domain Events is events within a single boundary. When a state change or something else of interest occurs, a domain event is published and then processed by consumers within the same boundary.

This is generally all done in-process. Meaning the consumers all process the same event within the same process as the publisher. There is no message broker or any asynchronous messaging occurring. Everything is done in-memory and in-process. Because of this, this can often be wrapped within the same database transaction. If a consumer throws an exception, it will go up the call stack to the publisher, which can then roll back the transaction.
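Here is a minimal sketch of that in-process behavior, assuming a hypothetical InProcessPublisher and using SQLite only as a convenient stand-in for the database. The consumer runs on the publisher’s call stack and shares its transaction, so a failing consumer rolls everything back.

```python
import sqlite3
from typing import Callable, List

# A consumer receives the open connection (and so the open transaction) plus the event.
Handler = Callable[[sqlite3.Connection, dict], None]


class InProcessPublisher:
    """Hypothetical in-memory dispatcher: no broker, no async messaging."""
    def __init__(self) -> None:
        self._handlers: List[Handler] = []

    def subscribe(self, handler: Handler) -> None:
        self._handlers.append(handler)

    def publish(self, conn: sqlite3.Connection, event: dict) -> None:
        # Handlers run synchronously; an exception travels up to the publisher.
        for handler in self._handlers:
            handler(conn, event)


def place_order(conn: sqlite3.Connection, publisher: InProcessPublisher, order_id: int) -> None:
    # "with conn" commits on success and rolls back if an exception escapes the block.
    with conn:
        conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
        publisher.publish(conn, {"type": "OrderPlaced", "order_id": order_id})
```

If any subscribed handler raises, neither the order row nor anything the earlier handlers wrote is committed.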

While this seems on the surface like a good pattern it can oftentimes be less desirable than actually having your consumers in isolation and moving the consumer processing out of process. More on that in a future video/blog post.

Integration Events

Integration events are generally used for integrating with other service boundaries. This then means we’re moving work out of process by leveraging a message broker. Consumers will each individually process the event in isolation without having any effect on the publisher or any other consumers.
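To contrast this with the in-process case, here is a toy sketch of consumer isolation. InMemoryBroker is a hypothetical stand-in for a real message broker and leaves out everything a real one provides (durability, retries, dead-lettering). It only shows that publishing does not run consumer code and that one failing consumer does not affect another.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class InMemoryBroker:
    """Toy stand-in for a message broker: one queue per consumer."""
    def __init__(self) -> None:
        self._queues: Dict[str, Deque[dict]] = {}
        self._handlers: Dict[str, Callable[[dict], None]] = {}

    def subscribe(self, name: str, handler: Callable[[dict], None]) -> None:
        self._queues[name] = deque()
        self._handlers[name] = handler

    def publish(self, event: dict) -> None:
        # The publisher only enqueues a copy per consumer; no consumer code runs here.
        for queue in self._queues.values():
            queue.append(dict(event))

    def deliver(self) -> List[str]:
        """Process pending messages and return the names of consumers that failed."""
        failed: List[str] = []
        for name, queue in self._queues.items():
            while queue:
                event = queue.popleft()
                try:
                    self._handlers[name](event)
                except Exception:
                    # A real broker would retry or dead-letter; here we just record it.
                    failed.append(name)
        return failed
```

Delivery happens separately from publishing, so the publisher has already finished its own work by the time any consumer succeeds or fails.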

Integration events differ from Domain Events in that Domain Events are very specific concepts within a boundary. A domain event may not mean anything, or may have a different perceived meaning, to another boundary. Integration Events are specifically for telling other outside boundaries that something has occurred within a boundary.

Inside vs Outside Events

Generally, Domain Events are referred to as “Inside Events” because they don’t leave their boundary, while Integration Events are referred to as “Outside Events” because their intent is to leave their boundary.

Why do people recommend not exposing domain events to other boundaries but rather exposing integration events instead? While that recommendation has a good intent, it’s a bit misleading. For me, you can expose domain events outside your boundary if they fit these 3 requirements: Stability, Understanding, and Consumer Requirements.

Stability

Consider exposing events outside your own boundary if they are stable.

If you’re talking with the business and collaboratively determining with domain experts the various domain events that are part of a specific boundary, then they are likely to be stable business concepts.

If the business concepts are stable, then your events will be stable and unlikely to change. Stability is the primary reason why people advocate for Integration Events: once you expose an event to outside consumers, you have to version it just as you would any change to a public API.

If you’re using stable business concepts then they aren’t likely to change. Domain Events aren’t for data propagation but rather to indicate what has occurred. This is often very useful for a long-running business process where many different boundaries are involved. Domain events are more behavioral rather than derived from CRUD.

Understanding

A service boundary can be a linguistic boundary. One concept in one boundary can have a very different meaning in another boundary. Because of this, a domain event in one boundary might not mean anything, or worse, have a different perceived meaning in another boundary.

If you’re exposing domain events to other boundaries, then there needs to be a clear understanding in those boundaries of what the events mean. There must be a level of shared understanding. As mentioned, this often occurs when domain events are used to notify various boundaries within a system that are all part of a long-running business process.

Consumer Requirements

While the benefit of an event-driven architecture is to decouple producers and consumers, in practice you do actually care about what the consumer requirements are.

Events serve different purposes, mainly notification or data propagation. Consumers are going to care about events for one of these two reasons.

If they want to consume an event because of data propagation, it’s because they want to keep a local cache copy of data that is owned by another boundary. These types of events will often contain a lot more data and are often called “fat events” or referred to as Event-Carried State Transfer.

If the service wants to consume an event because it’s a part of a long-running business process, then it doesn’t really care about the state so much as it does about simply being notified that an event has occurred, and now it must react and do its part of the business process.

For data propagation, events are often derived from CRUD or are property-based. For example, ProductUpdated.

Events used for notifications are generally more behavioral. For example, ProductInventoryAdjusted.
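The difference shows up in the shape of the event itself. The sketch below uses the two event names from above, but every field, the local_cache dictionary, and both consumer functions are hypothetical and only meant to show the contrast.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProductUpdated:
    """Data propagation ("fat event"): carries the current state of the product."""
    product_id: str
    name: str
    description: str
    price: float
    quantity_on_hand: int


@dataclass(frozen=True)
class ProductInventoryAdjusted:
    """Notification: says what happened, with just enough data to react."""
    product_id: str
    quantity_change: int
    reason: str


# Consumer 1 keeps a local copy of data owned by another boundary.
local_cache = {}


def on_product_updated(event: ProductUpdated) -> None:
    local_cache[event.product_id] = asdict(event)


# Consumer 2 does not store the state; it reacts as part of a business process.
def on_inventory_adjusted(event: ProductInventoryAdjusted) -> str:
    if event.quantity_change < 0:
        return f"check reorder level for {event.product_id}"
    return "no action"
```

The first consumer cares about the payload. The second cares that something happened and what it should do next.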

Domain Events or Integration Events?

As always, it depends. If your domain events are stable business concepts and they are understood outside of your boundary as a part of a long-running business process, then yes, publishing domain events outside of your boundary is acceptable. If events are used for data propagation or are more CRUD in nature, then publish Integration Events.

What’s the Cost of Indirection & Abstractions?

Indirection is fundamental to software design. Creating abstractions is one common way of creating indirection. The benefits are reuse, isolating complexity, encapsulation of dependencies, and more. But what’s the cost of indirection & abstractions? The cognitive load of fully understanding all of the layers a request passes through, and limited functionality.


Indirection

To illustrate indirection, first, let’s get down to the basics of having calling code (Caller) that is invoking another piece of code (Target).

Adding indirection simply means adding something in between the caller and the target.

As developers, we’re adding indirection all the time without really thinking of it. It’s not inherently bad and comes with a lot of benefits. Adding indirection is a useful way of isolating complexity, allowing re-use, and abstracting dependencies as we can have many different callers use the same abstraction.

A typical example of indirection is adding an abstraction over data access. Think of creating some type of data abstraction layer, a repository, etc.

If you’ve ever used the decorator pattern or created a request pipeline (check out my post on Separating Concerns with Pipes & Filters), you’re creating indirection.
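As a small illustration of how a decorator puts something between the caller and the target, here is a sketch in Python. The logged decorator and get_order_total function are made up for this example.

```python
import functools


def logged(log):
    """Wrap a target so every call passes through an extra layer that records it."""
    def decorator(target):
        @functools.wraps(target)
        def wrapper(*args, **kwargs):
            log.append(f"calling {target.__name__}")
            result = target(*args, **kwargs)
            log.append(f"{target.__name__} returned")
            return result
        return wrapper
    return decorator


calls = []


@logged(calls)
def get_order_total(prices):
    # The target: it has no idea a layer now sits in front of it.
    return sum(prices)
```

The caller still writes get_order_total([...]), but the call now goes through wrapper first. That is useful, and it is also one more layer to know about when you trace a request.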

Another way to think about indirection which isn’t directly related to application code is infrastructure. Indirection can come from a message queue, topics, load balancer, etc.

Code Example

I’m going to use the eShopOnWeb reference application as the example. Its OrderController has a route to show the signed-in user’s Orders.

The first use of indirection in our code is using MediatR. Instead of having application code in our ASP.NET Controller, we’re using MediatR to invoke our Query Handler which will have that logic.

The second use of indirection is that we’re injecting an IOrderRepository and using it to get the list of Orders, meaning our indirection comes from separating data access. We then take that list and transform it into a list of OrderViewModels that is returned from our Handler.

The OrderRepository is actually adding another layer of indirection because it is using Entity Framework Core within it to get data from the database.

So the call stack from the Controller to our database looks like this: Controller -> MediatR -> Query Handler -> OrderRepository -> Entity Framework Core -> Database.

If we removed most of the indirection it would look like this: Controller -> Entity Framework Core -> Database.

I’ve left Entity Framework as a layer because ultimately you’d be using some type of data access client to get to the database, regardless if that’s Entity Framework or simply ADO.NET directly.

Now I’m not implying you should remove indirection! There are clear benefits. To start with, using MediatR has the benefit of not coupling your application code to ASP.NET Core. Let your application code focus on application logic and let ASP.NET Core handle HTTP.

In the case of the repository, its purpose is to abstract the dependency on Entity Framework Core. Instead of having application logic directly couple to a 3rd party dependency, creating an abstraction of the repository allows you to couple to a type that you own (although I’ll argue later I don’t have to).

Cost of Indirection & Abstractions

The first cost of indirection is cognitive load. If you need to understand the full life of a request and everything that happens, it can be challenging depending on how many layers the request passes through.

The example above is pretty simple, but you can imagine that the more indirection exists, the more difficult it is to understand the full scope.

On the flip side, there can be the benefit of not having to worry about certain layers. Meaning you simply don’t have to concern yourself with them. Until you do.

My point is to keep indirection to a level where you can fully understand the entire request and how it pertains to the application code you’re writing.

The second cost is performance, as a consequence of limited functionality. Not necessarily from a memory allocation or CPU perspective, although that’s possible, but more because when you’re creating indirection through abstractions, you’re often making your abstraction generic or exposing only a limited surface of what you’re abstracting. This occurs often when you’re abstracting a 3rd party dependency.

To illustrate this, look back at the Handler that was using the repository. Does it really need to use a repository? What effect does using the repository have?

To get all the Orders out of the Repository, the Handler passes it a Specification. The Specification’s purpose is to add the Where() so it’s only fetching the orders for a specific user, and to eagerly load the OrderItems and then the ItemOrdered, which is the actual product.

If you’ve used a repository before, the IRepository probably looks pretty familiar.

This is all very generic and limits the ability to really leverage the underlying data access, which is Entity Framework. Because we’re abstracting Entity Framework behind this Repository, we’ve stripped out a bunch of functionality that we can no longer use.

Because of this, the listing page is getting back way more data than it actually needs.

The repository is returning line items and for each line item the associated product. None of this is used within this view.
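The sketch below shows the general shape of the problem. It is not the eShopOnWeb IRepository. The Repository class, its list_all method, and the entities are simplified, hypothetical stand-ins, written in Python and backed by a plain list instead of Entity Framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class OrderItem:
    product_name: str
    unit_price: float
    units: int


@dataclass
class Order:
    id: int
    buyer_id: str
    items: List[OrderItem] = field(default_factory=list)


class Repository(Generic[T]):
    """Generic surface: you can filter, but you always get whole entities back."""
    def __init__(self, rows: List[T]) -> None:
        self._rows = rows

    def list_all(self, specification: Callable[[T], bool]) -> List[T]:
        # There is no way for a caller to ask for only some fields.
        return [row for row in self._rows if specification(row)]


def list_order_ids(repository: Repository, buyer_id: str) -> List[int]:
    orders = repository.list_all(lambda o: o.buyer_id == buyer_id)
    # Every order arrived with all of its items attached, yet only the id is read.
    return [o.id for o in orders]
```

With a real database behind the repository, those unused items are extra joins and extra rows on every request to the listing page.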

Personally, I’d rather not use a repository in this situation. Why? Check out my video on Should you use the Repository Pattern? With CQRS, Yes and No!

Indirection is something we’re constantly creating but it has a cost. Be aware of when you’re adding indirection and if it’s actually adding value. If you’re abstracting a dependency so you can make it more testable, then great. If the dependency is testable and you simply don’t want to directly couple to it, then that might make sense, or it might not!
