Avoiding the Repository Pattern with an ORM

For many years now I’ve advocated against using the repository pattern on top of an ORM such as Entity Framework. There are many reasons why, which I’ll try to cover throughout this post based on the ways I’ve seen it implemented. In other words, this post is about the poorly implemented approaches and pitfalls that I’ve seen.

To clarify, since this topic seems to really fire people up, I’m not saying that you shouldn’t use the repository pattern. I’m going to explain why I don’t find it very useful in certain situations, and describe other situations where I do.

This post was spurred on by a blog post and a tweet.

IQueryable

The first thing I’ve seen is a repository that exposes IQueryable<T> (or DbSet<T>) from the underlying DbContext. This serves no purpose. It’s not abstracting anything at all.

What’s even worse is that consumers don’t necessarily know at what point they will actually be retrieving data (doing I/O). They only know if they are aware that the underlying IQueryable comes from Entity Framework, and that the database is only hit when they call a method that materializes the query (such as ToListAsync()).
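To make the hidden I/O concrete, here is a minimal sketch in C#. It doesn’t use Entity Framework at all: ReadCustomerNames is a hypothetical stand-in for a DbSet<T> that counts every enumeration as one database round trip, and Customers() plays the part of a repository method that hands out IQueryable<T>. All of the names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the pretend database is actually read.
    public static int Reads;

    // Hypothetical stand-in for a DbSet<T>: the body only runs when the
    // sequence is enumerated, and every enumeration counts as one read.
    private static IEnumerable<string> ReadCustomerNames()
    {
        Reads++;
        yield return "Alice";
        yield return "Bob";
    }

    // A repository method that leaks IQueryable<T> to its callers.
    public static IQueryable<string> Customers() => ReadCustomerNames().AsQueryable();

    // Returns { reads after building the query, reads after materializing twice,
    //           size of the list, result of Count() }.
    public static int[] Run()
    {
        Reads = 0;

        // Composing the query reads nothing.
        var query = Customers().Where(name => name.StartsWith("A"));
        int readsAfterBuilding = Reads;

        var list = query.ToList();  // materializes the query: first read
        int count = query.Count();  // materializes it again: second read

        return new[] { readsAfterBuilding, Reads, list.Count, count };
    }
}
```

Nothing in the signature of Customers() tells the caller that Where() is free while ToList() and Count() each go back to the source. That knowledge lives entirely in the caller’s head, which is the leak I’m describing.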

Lazy Loading

Second, and related to this point: if you have navigation properties and consumers are accessing IQueryable<T> from the repository, then either the repository must eager load them (via Include()), or the consumers must call Include() themselves, or nobody realizes that the navigation properties are being lazy loaded.

Again, consumers are now aware of the underlying implementation that is Entity Framework.

IEnumerable

To overcome these issues, usually what comes next is avoiding the IQueryable<T> by returning an IEnumerable<T>.

The issue now is that, since you’re taking control away from the consumer, the repository must decide what data to Include() and Select() inside each query method.

What this often turns into is a pile of methods with various filtering parameters that could have been expressed much more easily as a LINQ expression against the DbSet directly.

Query Objects

So if I don’t generally use the repository pattern, what do I use? Query Objects.

For querying, I’d rather have specialized objects that can return very specific data for the given use case. When implementing in vertical feature slices, as opposed to layers, each query is responsible for how it retrieves data.

The simplest solution is to use the DbContext and query directly.

The primary benefit is that query objects only have the dependencies they actually require. Because each query object defines its own dependencies, you can change those dependencies without affecting other query objects.

A simple example of this is if you wanted to migrate from Entity Framework 6 to Entity Framework Core. You could migrate one query object at a time to EF Core instead of having to change over an entire repository that is highly coupled.
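Here’s roughly what I mean by a query object, as a minimal sketch. Order, UnshippedOrderSummary and GetUnshippedOrdersQuery are hypothetical names, and the IQueryable<Order> passed to the constructor is an assumption that stands in for a DbSet<Order> taken from your DbContext, so the sketch runs without Entity Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data model; in a real application this would be an EF entity.
public class Order
{
    public int Id { get; set; }
    public string CustomerName { get; set; } = "";
    public decimal Total { get; set; }
    public bool Shipped { get; set; }
}

// Exactly the shape this one use case needs, and nothing more.
public class UnshippedOrderSummary
{
    public int OrderId { get; set; }
    public decimal Total { get; set; }
}

// A query object: one class, one use case, only the dependencies it requires.
public class GetUnshippedOrdersQuery
{
    // Assumption: stands in for a DbSet<Order> from the DbContext.
    private readonly IQueryable<Order> _orders;

    public GetUnshippedOrdersQuery(IQueryable<Order> orders) => _orders = orders;

    // The query is filtered, projected and materialized right here,
    // so the caller gets plain data back and never sees an IQueryable.
    public List<UnshippedOrderSummary> Execute(string customerName) =>
        _orders
            .Where(o => o.CustomerName == customerName && !o.Shipped)
            .OrderBy(o => o.Id)
            .Select(o => new UnshippedOrderSummary { OrderId = o.Id, Total = o.Total })
            .ToList();
}
```

Because the class owns its filtering, projection and materialization, swapping its data access for something else only touches this one class.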

Testing

I can see the argument for using a repository because testing was difficult with EF6. However, with EF Core and either the SQLite or the InMemory provider, testing is incredibly easy.

I’ve written a post on how to use the SQLite provider with an in-memory database.

Testing a Query Object becomes incredibly easy without the need to mock.

Caching

Another argument for using the repository pattern is being able to swap out the implementation for a “cached repository”. I do use this pattern, but in very select cases. Most of the time this is across bounded contexts where cached or stale data is acceptable.

If you decide to swap out the implementation of your repository, which was previously always hitting the database (point of truth) and now is using a cache implementation, how does that affect the callers? How quickly is the data invalidated?

Data can be stale the moment you retrieve it from the database. However, adding caching to your repository without your callers knowing about it can have a big impact on behavior.
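A small sketch of how that plays out. IPriceRepository, DatabasePriceRepository and CachedPriceRepository are hypothetical names, a dictionary stands in for the database, and the cache is deliberately never invalidated to keep the example short.

```csharp
using System;
using System.Collections.Generic;

public interface IPriceRepository
{
    decimal GetPrice(string sku);
}

// Stand-in for the database-backed implementation (the point of truth).
public class DatabasePriceRepository : IPriceRepository
{
    public Dictionary<string, decimal> Rows { get; } = new Dictionary<string, decimal>();

    public decimal GetPrice(string sku) => Rows[sku];
}

// Drop-in replacement: callers cannot tell it apart from the original.
public class CachedPriceRepository : IPriceRepository
{
    private readonly IPriceRepository _inner;
    private readonly Dictionary<string, decimal> _cache = new Dictionary<string, decimal>();

    public CachedPriceRepository(IPriceRepository inner) => _inner = inner;

    public decimal GetPrice(string sku)
    {
        if (!_cache.TryGetValue(sku, out var price))
        {
            price = _inner.GetPrice(sku);
            _cache[sku] = price; // assumption: no expiry or invalidation in this sketch
        }
        return price;
    }
}
```

If a caller reads a price through the cached repository, the row is then updated in the database, and the caller reads again, it still gets the old value. The interface is identical, so nothing warned the caller that the guarantees changed.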

Aggregate Root

One place I do often use a repository is when accessing an aggregate (in DDD terms). My repositories often contain only two methods, Get(id) and Save(aggregateRoot).

The reason I do use a repository in this situation is that my repository usually returns an object that encapsulates my EF data model. I want it to fetch the entire object model and construct the aggregate root. The aggregate root does not expose data but only behavior (methods) to change state.
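As a minimal sketch of that shape: ShoppingCartData, ShoppingCart and ShoppingCartRepository are hypothetical names, and a dictionary stands in for the DbContext. The data model is plain state, the aggregate root wraps it and exposes only behavior, and the repository has just Get and Save.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical EF-style data model: plain state, no behavior.
public class ShoppingCartData
{
    public Guid Id { get; set; }
    public Dictionary<string, int> Quantities { get; set; } = new Dictionary<string, int>();

    public ShoppingCartData Clone() =>
        new ShoppingCartData { Id = Id, Quantities = new Dictionary<string, int>(Quantities) };
}

// Aggregate root: wraps the data model and exposes only behavior.
public class ShoppingCart
{
    private readonly ShoppingCartData _data;

    internal ShoppingCart(ShoppingCartData data) => _data = data;

    // Only the repository (same assembly) can reach the underlying state.
    internal ShoppingCartData Data => _data;

    public void AddItem(string sku, int quantity)
    {
        if (quantity <= 0) throw new ArgumentOutOfRangeException(nameof(quantity));
        _data.Quantities.TryGetValue(sku, out var current);
        _data.Quantities[sku] = current + quantity;
    }

    public void RemoveItem(string sku, int quantity)
    {
        _data.Quantities.TryGetValue(sku, out var current);
        if (quantity <= 0 || quantity > current)
            throw new InvalidOperationException("Cannot remove more than the cart contains.");
        _data.Quantities[sku] = current - quantity;
    }
}

public class ShoppingCartRepository
{
    // Stand-in for a DbContext, keyed by aggregate id.
    private readonly Dictionary<Guid, ShoppingCartData> _store = new Dictionary<Guid, ShoppingCartData>();

    // Loads the whole object model and constructs the aggregate root.
    public ShoppingCart Get(Guid id) =>
        _store.TryGetValue(id, out var saved)
            ? new ShoppingCart(saved.Clone())
            : new ShoppingCart(new ShoppingCartData { Id = id });

    // Persists a snapshot of the aggregate's current state.
    public void Save(ShoppingCart cart) => _store[cart.Data.Id] = cart.Data.Clone();
}
```

Callers can only change state through AddItem and RemoveItem, so the invariants live in one place, and nothing outside the repository knows what the persisted model looks like.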

Follow @CodeOpinion on Twitter

Software Architecture & Design

Get all my latest YouTube Videos and Blog Posts on Software Architecture & Design

4+1 Architectural View Model

It’s incredibly difficult to describe a complex system, regardless of whether you are developing a monolith or (micro)services. Use cases, code organization/navigation, interactions between services, and deployment/infrastructure are just some of the aspects that make up the architecture of an entire system.

Depending on your role as a stakeholder, your view of the system can be very different from another stakeholder’s.

This blog post is part of a series.

Context Matters

There are many different stakeholders related to a software system, and they all have different perspectives. Project/Product Managers, Developers, System Engineers, and End Users all view a system in completely different ways. They view the system based on their own context.

In order to describe a system, it would be useful to define all the different viewpoints and how they relate to the overall use cases of the system.

4+1 Architectural View Model

We all have seen many books and articles where one diagram attempts to capture the gist of the architecture of a system. But looking carefully at the set of boxes and arrows shown on these diagrams, it becomes clear that their authors have struggled hard to represent more on one blueprint than it can actually express. Are the boxes representing running programs? Or chunks of source code? Or physical computers? Or merely logical groupings of functionality? Are the arrows representing compilation dependencies? Or control flows? Or data flows? Usually it is a bit of everything.

The paper by Philippe Kruchten, Architectural Blueprints—The “4+1” View Model of Software Architecture, defines 4 concurrent views from the point of view of the various stakeholders.

I recommend reading the paper, but here is an incredibly simplified version of the views:

  • Logical View: The functionality. The service.
  • Process View: Communication between processes and/or services.
  • Physical View: Deployment of your services.
  • Development View: File/Folder Structure of your codebase. What you’re looking at in your IDE/Editor.

The +1 comes from the scenarios view, which is what your end users actually care about. It’s the system functionality/capabilities. The scenarios view is what guides all the other views.

Service Boundaries

Where I think general guidance on developing (micro)services has fallen short is this: just because you’ve identified multiple logical views (services) does not mean that each must have its own independent deployment (physical view) or git repo (development view).

To be clear, there are obvious benefits to having independent deployments, but there are many disadvantages as well. Either way, it’s not a requirement.

As an example, in .NET, you could have a solution with various projects that represent different logical views. None of these projects reference each other; they are simply in the same solution. There may, however, be an ASP.NET Core project that hosts these projects as a single process.

The point is that just because you have multiple services does not necessarily mean they need to be developed independently or deployed independently.

The four views are representations of the system depending on the context. They don’t need to map 1:1.

When not everything maps 1:1, finding service boundaries becomes a bit more challenging. Focus on the scenarios.

Blog Series

More on all of these topics will be covered in greater length in other posts. If you have any questions or comments, please reach out to me in the comments section or on Twitter.

Focus on Service Capabilities, not Entities

One of the most common pitfalls I think I’ve fallen into is focusing too much on data entities rather than service capabilities. What tends to happen is building up a domain model of behaviors related to single entities.

As I’ve mentioned in my post about using language to find service boundaries, the same entity can live in different contexts, with each context owning its own specific behaviors and data.

This blog post is part of a series.

Entities

I’m not entirely sure where the focus on entities comes from. I suspect the rise of ORMs has something to do with it, as well as general relational database table design.

In the world of monolithic applications and databases, it’s common to see a singular table that represents an entity. A massive Product table with 100 columns isn’t unusual.

What is unusual when living in a monolith is to think of that Product table being split up into multiple Product tables across multiple databases. This is a big mental leap.

Service Capabilities

But the reality is your application doesn’t often require all of the data related to an entity. Likely it needs very little of it. One way to see what it actually needs is to look at the business logic related to a particular capability.

In the distribution example I’ve been using throughout this series, if I were to look at the Inventory Adjustment functionality in the Warehouse Service, do you think it requires the Product Selling Price?

An inventory adjustment is used to reconcile the deviation between what is physically in the warehouse for a product and what our system says. Why would there be a deviation from the system? Well, physical products sitting in a warehouse can be broken or stolen. The real point of truth is the physical warehouse, not a number in a database.

As you might have guessed, an Inventory Adjustment doesn’t need the selling price.

What this illustrates is we don’t need to have a singular Product entity backed by a singular product table. We can separate these entities into multiple Product entities that live in various services.

What they will share is a common identifier, in our case a SKU to identify the product.
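To sketch what that split might look like: the Warehouse namespace below mirrors the Warehouse Service from the example, while the Sales namespace, the property names and the AdjustInventory and ChangePrice methods are all assumptions I’ve made up for illustration. The only thing the two Product classes have in common is the SKU.

```csharp
using System;

namespace Warehouse
{
    // The Warehouse service's Product: stock levels, no selling price.
    public class Product
    {
        public Product(string sku, int quantityOnHand)
        {
            Sku = sku;
            QuantityOnHand = quantityOnHand;
        }

        public string Sku { get; }
        public int QuantityOnHand { get; private set; }

        // Reconciles the system with what was physically counted and
        // returns the deviation (negative when stock is missing).
        public int AdjustInventory(int physicalCount)
        {
            if (physicalCount < 0) throw new ArgumentOutOfRangeException(nameof(physicalCount));
            int deviation = physicalCount - QuantityOnHand;
            QuantityOnHand = physicalCount;
            return deviation;
        }
    }
}

namespace Sales
{
    // A hypothetical Sales service's Product: pricing, no stock levels.
    public class Product
    {
        public Product(string sku, decimal sellingPrice)
        {
            Sku = sku;
            SellingPrice = sellingPrice;
        }

        public string Sku { get; }
        public decimal SellingPrice { get; private set; }

        public void ChangePrice(decimal newPrice)
        {
            if (newPrice < 0) throw new ArgumentOutOfRangeException(nameof(newPrice));
            SellingPrice = newPrice;
        }
    }
}
```

Adjusting inventory never touches the selling price, and changing the price never touches stock, so neither class needs the other’s data.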

Once you start focusing on behaviors and capabilities, you can identify the data they encapsulate and start splitting a single entity into multiple entities across the services that own those behaviors.
