Message Properties

Message Properties

Over the years I’ve learned various aspects of messaging either through using various products, reading other blogs, or unfortunately in a lot of cases, discovering them myself. One of those aspects is is message properties.

Specifically, (meta) data that you might want to include in your messages to solve some common problems.

My plan is to dive deeper with implementation examples for each topic listed in this post. Hopefully, this post can serve as a reference point to your adventures in messaging or possibly gain some insight that provides a solution for a pain point you may (already) be experiencing.

I was inspired to write this post after seeing this tweet:

Type/Name

The naming of messages is incredibly important for a variety of reasons. When talking about events, it’s important to name them in the past tense because it conveys that they are statements of fact.

Most often the name of an event is used to determine how to deserialize into a specific type. Where I’ve personally run into problems, in .NET specifically, is using the CLR type name as the event name to determine how to deserialize. The issue with this is you’re very likely to change the name of your CLR type.

Prefer to having a static name for an event and not using the CLR type name at runtime to create the event name.

Message Version

Most often when needing to version a message is to simply create a new type/name. With events, I think this approach works fairly well when you discover a better way to describe the event and along with it the data to expose.

When a new event isn’t required, but rather a slight change in data is to specify the version in the message. Consuming clients can use the message name along with the version to determine how to process the body of the message.

Message ID

A unique identifier of a message. This can be used for idempotency and concurrency. Message Handlerscan keep track of which messages IDs they have processed and determine if they should act upon receiving a message.

This is important because message handlers should be reentrant and messages can be delivered more than once.

Correlation ID

A unique identifier that allows you to reference the flow of events (event chain) or a transaction. Also, often called a Trace ID. Assigning a correlation ID as soon as possible and any message handler that receives a message with a correlation ID then uses that same ID for any messages it outputs.

This is very useful for logging and error handling in asynchronous and distributed environments.

Contract Owner

Who owns the message contract can be important. Depending on how messages are distributed, there may be many messages from various owners. Knowing who owns a message contract is helpful for troubleshooting as well as how to handle that message.

An OrderPlaced event seems like it could be unique across all services, however, in a distribution domain, is that a Sales OrderPlaced from the Sales service or a Purch OrderPlaced from the purchasing service?

Having the owner of the message provides the ability to distinguish how to handle the message when the name alone doesn’t provide enough distinction.

UTC Timestamp

This seems obvious but it gives more context to when the message was created. A useful is keeping track of latency between a message being created to how long it’s taking a given service to process it.

Sender

Who originated the message. Not the service, but rather the user. This could be an access token that is then used to determine if the sender has the correct permissions or is authorized to perform the given request.

Resource/Entity Version

The version of the resource/entity that you are performing an action on. Similar to the message ID this can be used for concurrency. A wonderful example of this is Azure Cosmos DB which uses an ETag as the resource version which is then used with an If-None-Match header for optimistic concurrency. If you’re using Cosmos DB check out my post on Optimistic Concurrency in Azure Cosmos DB

Message Properties

I’ll be going in-depth in each of the above topics with code examples for the actual implementation of how you would use these message properties. Stay tuned.

If you’d like to add to the list, let me know in the comments or on Twitter.

Related

Enjoy this post? Subscribe!

Subscribe to our weekly Newsletter and stay tuned.

Avoiding the Repository Pattern with an ORM

For many years now I’ve advocated not using the repository pattern on top of an ORM such as Entity Framework. There are many reasons why that I’ll try and cover throughout this post based on ways that I’ve seen it implemented. Meaning, this post is talking about poorly implemented approaches or pitfalls that I’ve seen.

To clarify, since this topic seems to really fire people up, I’m not saying that you shouldn’t use the repository pattern. I’m going to clarify why I don’t think under certain situations it’s very useful and other situations that I do find it useful.

This post was spurred on by a blog post and tweet:

IQueryable

The first thing I’ve seen with a repository is exposing IQueryable<T> (or DbSet<T>) from the underlying DbContext in your repository. This serves no purpose. It’s not abstracting anything at all.

What’s even worst is the consumers/callers don’t necessarily know at what point will they actually be retrieving data (doing I/O), unless you’re aware that the underlying IQueryable is coming from Entity Framework., Now when you call a method that materializes your query and actually hits the database (such as ToListAsync()).

Lazy Loading

Second, to this point is now if you have any type of navigation properties and are accessing IQueryable<T> from repository consumers, you must either eager load (via Include()) or have your consumers do the Include() or not realize all navigation properties are lazy loading.

Again, consumers are now aware of the underlying implementation that is Entity Framework.

IEnumerable

To overcome these issues, usually what comes next is avoiding the IQueryable<T> by returning an IEnumerable<T>.

The issue now is since you’re taking away control from the consumer, you must decide what data to Include() and Select() behind query methods.

What this often turns into is a pile of methods with various filtering parameters that could have been much easier expressed via a LINQ expression against the DbSet directly.

Query Objects

So if I don’t generally use the repository pattern, what do I use? Query Objects.

For querying, I’d rather have specialized objects that can return very specific data for the given use case. When implementing in vertical feature slices, as opposed to layers, each query is responsible for how it retrieves data.

The simplest solution is to use the DbContext and query directly.

The primary benefit is query objects only have dependencies that they actually require. Because each query object defines its own dependencies, you can change those dependencies without affecting other query objects.

A simple example of this is if you wanted to migrate from Entity Framework 6 to Entity Framework Core. You could migrate one query object at a time to EF Core instead of having to change over an entire repository that is highly coupled.

Testing

I can see the argument for using a repository because testing was difficult with EF6. However, with EF Core using the SQLite or the InMemory Provider, testing is incredibly easy.

I’ve written a post on how to use the SQLite provider with an in-memory database.

Testing a Query Object becomes incredibly easy without the need to mock.

Caching

Another argument for using the repository pattern is being able to swap out the implementation for a “cached repository”. I do use this pattern but in very select cases. Most times this is across bounded context were cached or stale data is acceptable.

If you decide to swap out the implementation of your repository, which was previously always hitting the database (point of truth) and now is using a cache implementation, how does that affect the callers? How quickly is the data invalidated?

Data can be stale the moment you retrieve it from the database, however adding caching to your repository without your callers knowing it can have a big impact on behavior.

Aggregate Root

One place I do often use a repository is when accessing an aggregate (in DDD Terms). My repositories often only contain two methods, Get(id) and Save(aggregateRoot).

The reason I do use a repository in this situation is that my repository usually returns an object that encapsulates my EF data model. I want it to fetch the entire object model and construct the aggregate root. The aggregate root does not expose data but only behavior (methods) to change state.

Repository Pattern Related

Enjoy this post? Subscribe!

Subscribe to our weekly Newsletter and stay tuned.

4+1 Architectural View Model

It’s incredibly difficult to describe a complex system, regardless if you are developing a monolith or (micro) services. Use cases, code organization/navigation, interactions between services, and deployment/infrastructure are just some of the aspects that comprise the architecture of an entire system.

Depending on your role as a stakeholder, your view of the system can be very different than another stakeholder.

This blog post is in a series. To catch up check out these other posts:

Context Matters

There are many different stakeholders related to a software system, which all have different perspectives. Project/Project Managers, Developers, System Engineers, End Users all view a system in completely different ways. They view the system based on their own context.

In order to describe a system, it would be useful to define all the different viewpoints and how the overall use cases of the system.

4+1 Architectural View Model

We all have seen many books and articles where one diagram attempts to capture the gist of the architecture of a system. But looking carefully at the set of boxes and arrows shown on these diagrams, it becomes clear that their authors have struggled hard to represent more on one blueprint than it can actually express. Are the boxes representing running programs? Or chunks of source code? Or physical computers? Or merely logical groupings of functionality? Are the arrows representing compilation dependencies? Or control flows? Or data flows? Usually it is a bit of everything.

The paper by Philippe Kruchten, Architectural Blueprints—The “4+1” View Model of Software Architecture, defines 4 concurrent views from the point of view of the various stakeholders.

4+1 Architecture View Model

I recommend reading the paper but for an incredibly simplified version of the views:

  • Logical View: The functionality. The service.
  • Process View: Communication between processes and/or services.
  • Physical View: Deployment of your services.
  • Development View: File/Folder Structure of your codebase. What you’re looking at in your IDE/Editor

The +1 comes in from the scenarios view which is what your end users actually care about. It’s the system functionality/capabilities. The scenarios view is what guides all the other views.

Service Boundaries

Where I think general guidance has fallen short when developing (micro)services is just because you’ve identified multiple logical views ( services), does not mean that each must have their own independent deployment (physical view) or git repo (development view).

To be clear, there are obvious benefits to having independent deployments, but there are many disadvantages as well. But it’s not a requirement.

As an example, in .NET, you could have a solution with various projects that represent different logical views. None of these projects reference each other, they are simply in the same solution. There may however be an ASP.NET Core project that hosts these projects as a single process.

Point being is that just because you have multiple services does not necessarily mean they need to be developed independently or deployed independently.

The four views are representations of the system depending on the context. They don’t need to map 1:1.

When not everything maps 1:1, that’s when it makes it a bit more challenging finding service boundaries. Focus on the scenarios.

Blog Series

More on all of these topics will be covered in greater length in other posts. If you have any questions or comments, please reach out to me in the comments section or on Twitter.

Enjoy this post? Subscribe!

Subscribe to our weekly Newsletter and stay tuned.