Persistence Ignorance is Overrated


If you’re using an ORM and creating “domain entities,” you’re likely trying to force your database structure into your domain model. That can work, but the point of persistence ignorance is to let your domain model shine without knowing how it’s persisted. Your data structure will force you down a certain path if you treat your data model as your domain model.

Persistence Ignorance

Persistence ignorance is a great concept, but in practice, it’s a little harder to achieve, and you’re likely going to need to strike some type of balance. Here’s an example.

Imagine a shipment. Think about a truck going to a warehouse, picking up lots of products for different orders, and then going to deliver those.

Our shipment example here is persisted and managed by an ORM. So we have some backing data: a shipment ID, which is likely the key, some shipper information, and a collection of stops. I’ll get to those in a second. Then we have different methods, one of them being arrive, which is called when the truck arrives at a given stop to record that it’s there. The only real logic here is checking the stops in sequence to ensure that all the previous ones have departed. If not, we throw an exception.

Once we know what the current stop is, we can call methods on the stop itself. This one’s the arrive method, which changes the status to arrived. The same thing applies here because the stop is persisted using our ORM. It has a bunch of data: a stop ID, which again is the key for this particular stop; the stop type, whether it’s a pickup or a delivery; the location, which is the physical address; the status; and the sequence. So we have a combination of behavior in those methods and the backing data, which all looks pretty standard.
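To make that concrete, here’s a minimal sketch of what such an entity could look like. The post is about Entity Framework, but this is plain Python with no ORM involved, and every name in it (`Shipment`, `Stop`, `StopStatus`, the field names) is an assumption for illustration rather than the code from the demo.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class StopType(Enum):
    PICKUP = "pickup"
    DELIVERY = "delivery"


class StopStatus(Enum):
    IN_TRANSIT = "in_transit"
    ARRIVED = "arrived"
    DEPARTED = "departed"


@dataclass
class Stop:
    # Backing data an ORM would persist for each stop (names are illustrative).
    stop_id: int          # key for this stop
    stop_type: StopType   # pickup or delivery
    address: str          # physical location
    sequence: int         # order in which the truck visits the stop
    status: StopStatus = StopStatus.IN_TRANSIT

    def arrive(self) -> None:
        if self.status is not StopStatus.IN_TRANSIT:
            raise ValueError("Stop has already been arrived at.")
        self.status = StopStatus.ARRIVED


@dataclass
class Shipment:
    shipment_id: int      # primary key, only here for persistence
    shipper: str          # plain data, no logic around it
    stops: List[Stop] = field(default_factory=list)

    def arrive(self, stop_id: int) -> None:
        current = next((s for s in self.stops if s.stop_id == stop_id), None)
        if current is None:
            raise ValueError("Stop does not exist.")
        # Every stop earlier in the sequence must already have departed.
        if any(
            s.sequence < current.sequence and s.status is not StopStatus.DEPARTED
            for s in self.stops
        ):
            raise ValueError("Previous stops have not departed.")
        current.arrive()
```

Notice that `arrive` only ever touches the stops. The shipment ID and the shipper are along for the ride.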

Now, what’s the issue with this? Well, it really depends on where you want to be on the spectrum between a model that knows everything about how it’s persisted and one that’s completely persistence ignorant. Even though we were using an ORM, nothing in the class really indicated it or knew about the ORM. My shipment was about exposing behavior so that I could do the arrive, pickup, and deliver. So what does the shipment ID have to do with anything? It’s only there for persistence; it’s the key my ORM needs. Same thing with the shipper: it’s just data with no logic around it. We do have an update method, but that’s simply CRUD.

So there’s a mix: behavior, the data that behavior needs, and data that’s only there because our ORM needs to persist it. And that’s pretty much the question a member of my channel asked me on our private Discord.

The question was

How do I have persistence ignorant domain entities with Entity Framework? I don’t want my domain to care about metadata structure or how it’s persisted, but it feels silly to create all these mappings from domain objects to persistent objects.

It’s a really common question.

Here’s another example that illustrates it.

I mentioned the shipment ID was the primary key. This can be defined in how you configure your ORM’s mapping, or you can use an attribute. But people avoid the attribute like the plague for some reason because:

“oh no, that’s an infrastructure persistence concern in my domain, and I don’t want to be on that side. I want this to be pure or clean!”

The same goes for columns that hold audit information, like a last-updated date and time or the ID of the user who last updated it. We don’t want that in there because it’s a persistence and audit concern, not something focused on what the entity does and the behaviors it exposes.

I think people start realizing, which I believe is correct, that your domain model is not your data model.

The problem is that if you’re using an ORM like Entity Framework, you’re trying to force both to be the same. If you’re using clean architecture, you’ll be familiar with this diagram and this direction of dependencies.

Our infrastructure, where our ORM and Entity Framework live, references the application. The application in turn references the domain, and the domain project is where your Entity Framework entities are living. However, as mentioned, those entities combine the domain entity and the data model.

So, if your domain model and your data model are different things, why are we forcing them to be the same?

That would mean our data model lives in the infrastructure, and our domain model has behavior and only the state it cares about to apply any type of state transition. This means that we have to do some type of mapping, or do we?

I advocate for passing your data model to your domain model. So, in my case, with a repository, I can just fetch out the stops and pass that to our shipment.

The shipment didn’t care about the shipper or the shipment ID; it wasn’t doing anything with them. It was only doing validation and logic around the stops when we performed different methods, different behaviors, and that’s where we were making our state transitions.

So I can pass the stops to it. We can see we have our private member of stops, and we can still have our arrive, but we don’t care about all that other data that was there just for persistence. You may be wondering, what do you do with the data that really is just CRUD and has no logic? You just need to set data. Well, don’t create any more indirection by having useless setter methods. Create a separate data model that has that data, and use it as simply as you can with CRUD. The whole point of that shipment was to expose behavior. If there’s no behavior, then there’s no point in even creating that model.
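Here’s a rough sketch of that idea in Python, under some assumptions: `StopRecord` and `ShipmentRecord` stand in for the data model, `InMemoryShipmentRepository` is a hypothetical stand-in for an ORM-backed repository, and none of these names come from the original demo.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class StopStatus(Enum):
    IN_TRANSIT = "in_transit"
    ARRIVED = "arrived"
    DEPARTED = "departed"


@dataclass
class StopRecord:
    """Data model: the shape that gets persisted (illustrative fields)."""
    stop_id: int
    shipment_id: int
    sequence: int
    status: StopStatus = StopStatus.IN_TRANSIT


@dataclass
class ShipmentRecord:
    """Pure CRUD data with no behavior; set the fields directly."""
    shipment_id: int
    shipper: str


class Shipment:
    """Domain model: behavior only, built from just the stops it needs."""

    def __init__(self, stops: List[StopRecord]) -> None:
        self._stops = sorted(stops, key=lambda s: s.sequence)

    def arrive(self, stop_id: int) -> None:
        current = next((s for s in self._stops if s.stop_id == stop_id), None)
        if current is None:
            raise ValueError("Stop does not exist.")
        # Every stop earlier in the sequence must already have departed.
        if any(
            s.sequence < current.sequence and s.status is not StopStatus.DEPARTED
            for s in self._stops
        ):
            raise ValueError("Previous stops have not departed.")
        # The state transition is applied straight to the data model.
        current.status = StopStatus.ARRIVED


class InMemoryShipmentRepository:
    """Hypothetical stand-in for a repository backed by an ORM."""

    def __init__(self, stops: List[StopRecord]) -> None:
        self._stops = stops

    def get_shipment(self, shipment_id: int) -> Shipment:
        # Fetch only the stops and hand them to the domain model.
        return Shipment([s for s in self._stops if s.shipment_id == shipment_id])
```

The domain model never sees the shipper or the shipment ID, and there’s no mapping layer: it mutates the same records the repository would save. The shipper lives in `ShipmentRecord`, which you’d read and write as plain CRUD.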

But if you’re using clean architecture and you’re going to pass your data model to your domain model, you have a problem if you think of your infrastructure as where your data model lives.

Now you’re going to have a dependency from your domain to your infrastructure because the domain needs to accept that data model. That can’t happen if you’re using this direction of dependencies. But you shouldn’t actually have that problem because data isn’t an infrastructure concern or about mapping; it’s a key part of your domain. Yes, it’s all about the behaviors and exposing those behaviors, but the data behind those behaviors is also important.

A little bit of food for thought to make sense of this: if you understand event sourcing, you’re using events as a means to record state. Where do your events live? They’re a part of your domain. So it really is about what side of the spectrum you want to be on. If you have a complex domain with a lot of logic, you probably don’t want to be on the very left, where your domain understands all the infrastructure, how it’s persisted, and the data structure. That might not be a great idea. But if you don’t have that complexity and it’s really just CRUD, is it really that big of an issue?

On the other side, do you want to jump through hoops with mappings and all kinds of complexity so that your domain model is 100% free of not only understanding how it’s persisted but also any unneeded properties or data structure, so that it has no idea how that data is structured? You’re simply using your domain entities as a way to perform state transitions on a minimal view, and you don’t care about how it’s persisted. You might have to jump through a lot more hoops for that.

I think if you make the distinction that your domain model is not your data model, and you don’t conflate the two, you’ll realize how many more options you have around modeling. Even if you’re not into event sourcing, it illustrates this really well because it does not conflate the two. I’ll have a link to a video at the end of this video that may give you some inspiration even if you’re recording current state with an ORM.
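If event sourcing is new to you, here’s a small Python sketch of why it doesn’t conflate the two. It’s only an illustration under my own assumptions: the `Arrived` and `Departed` events and the way `Shipment` replays them are made-up names, not code from the demo or from any event sourcing library.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass(frozen=True)
class Arrived:
    stop_id: int


@dataclass(frozen=True)
class Departed:
    stop_id: int


# The events are the persisted data, and they live in the domain.
Event = Union[Arrived, Departed]


class Shipment:
    def __init__(self, stop_sequence: List[int], history: List[Event]) -> None:
        self._sequence = stop_sequence      # stop ids in visiting order
        self._arrived: Set[int] = set()
        self._departed: Set[int] = set()
        self.new_events: List[Event] = []   # what a store would append
        for event in history:
            self._apply(event)              # rebuild state from past events

    def _apply(self, event: Event) -> None:
        if isinstance(event, Arrived):
            self._arrived.add(event.stop_id)
        elif isinstance(event, Departed):
            self._departed.add(event.stop_id)

    def arrive(self, stop_id: int) -> None:
        if stop_id not in self._sequence:
            raise ValueError("Stop does not exist.")
        index = self._sequence.index(stop_id)
        # Every stop earlier in the sequence must already have departed.
        if any(s not in self._departed for s in self._sequence[:index]):
            raise ValueError("Previous stops have not departed.")
        event = Arrived(stop_id)
        self._apply(event)
        self.new_events.append(event)
```

The state the shipment keeps in memory is whatever it needs to make decisions, while what gets stored is the list of events. Nothing forces those two shapes to match.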

Hopefully, this video gave you a little food for thought.
