My TOP Patterns for Event Driven Architecture

Here are my top 5 patterns and concepts (Outbox, Idempotent Consumers, Event Choreography, Orchestration, Retry/Dead Letter) for Event Driven Architecture that you'll likely implement. Why? If you're new to Event Driven Architecture, you'll run into many different problems, and most of them have well-established patterns or concepts you can leverage to deal with them. These are the ones you'll most likely use.

YouTube

Check out my YouTube channel where I post all kinds of content that accompanies my posts including this video showing everything that is in this post.

Outbox Pattern

Generally, you're publishing events because you want to notify another boundary that something has occurred within your boundary. There was likely some type of state change that you want to notify other boundaries about. This could be from a long-running business process or for state propagation.

The issue is that your state changes are likely stored in a database, while you're publishing messages to a message broker. You cannot reliably do both without a distributed transaction.

If you successfully write to your database, but the message broker or queue you're publishing to is unavailable or fails, then you're never going to publish the event.

This could have some serious implications and cause inconsistencies or broken workflows between services in a long-running business process.

The Outbox Pattern solves this problem by writing the messages to be published to the database, within the same transaction as your state changes. Separately, a "Publisher" process pulls the messages from the database and sends them to your Message Broker or Queue.
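To make this concrete, here's a minimal sketch of the idea in Python, assuming a relational database (SQLite here for brevity) with an outbox table; publish_to_broker is a hypothetical stand-in for your real broker client:

```python
import json
import sqlite3
import uuid

# Hypothetical stand-in for your real Message Broker / Queue client.
def publish_to_broker(topic: str, payload: str) -> None:
    print(f"published to {topic}: {payload}")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
db.execute("CREATE TABLE outbox (id TEXT PRIMARY KEY, topic TEXT, payload TEXT)")

def place_order(order_id: str) -> None:
    # The state change and the event to publish are written in the SAME transaction.
    with db:
        db.execute("INSERT INTO orders (id, status) VALUES (?, ?)", (order_id, "Placed"))
        db.execute(
            "INSERT INTO outbox (id, topic, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "OrderPlaced", json.dumps({"orderId": order_id})),
        )

def run_outbox_publisher() -> None:
    # Separate "Publisher" process: pull outbox records, publish them, then delete them.
    for message_id, topic, payload in db.execute("SELECT id, topic, payload FROM outbox").fetchall():
        publish_to_broker(topic, payload)
        with db:
            db.execute("DELETE FROM outbox WHERE id = ?", (message_id,))

place_order("order-123")
run_outbox_publisher()
```

If the broker is down, the outbox record simply stays in the database and the publisher picks it up on its next run, so the event is never lost.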

The outbox pattern is one of my top patterns for event driven architecture if you need to reliably publish events.

For more info on the Outbox Pattern, check out my post and video with code samples of how it works.

Idempotent Consumers

Most Message Brokers support "at least once" messaging. This means a message will be delivered to a consumer at least once, which also means it could be delivered to a consumer more than once.

Processing a message more than once could have a different effect than what is intended.

There are many different reasons why a message could be delivered more than once, but as an example, one reason is using the Outbox pattern described above. When the Outbox Publisher pulls a message from the database and publishes it to the Message Broker or Queue, it must then delete the record from the database. If for some reason that delete fails, the record will still exist and the Outbox Publisher will ultimately send the message to the broker again.

Idempotent Consumers are able to handle processing the same message more than once without any adverse side effects.

Some consumers may be naturally idempotent, meaning they can consume the same message multiple times without any side effects.

For consumers that would have side effects, the key to handling duplicate messages is to record a unique identifier (Message-ID) for each message that has been processed. Just like the outbox pattern, this means persisting the Message-ID along with any state changes in the same transaction to your database.
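As a minimal sketch, assuming each message carries a unique Message-ID, a consumer can keep an "inbox" table of processed IDs and write to it in the same transaction as its state change (SQLite here; the table and handler names are just for illustration):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE inbox (message_id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER)")

def handle_funds_deposited(message_id: str, account: str, amount: int) -> None:
    with db:  # one transaction for the Message-ID and the state change
        try:
            # Record the Message-ID first; a duplicate violates the primary key.
            db.execute("INSERT INTO inbox (message_id) VALUES (?)", (message_id,))
        except sqlite3.IntegrityError:
            return  # Duplicate delivery: skip processing, no adverse side effects.
        db.execute(
            "INSERT INTO balances (account, amount) VALUES (?, ?) "
            "ON CONFLICT(account) DO UPDATE SET amount = amount + excluded.amount",
            (account, amount),
        )

# The same message delivered twice is only applied once.
handle_funds_deposited("msg-1", "acct-42", 100)
handle_funds_deposited("msg-1", "acct-42", 100)
print(db.execute("SELECT amount FROM balances WHERE account = 'acct-42'").fetchone())  # (100,)
```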

For more info on creating Idempotent Consumers, check out my post and video with code samples of how it works.

Event Choreography & Orchestration

Many different boundaries are often involved together in a long-running business process. The challenge is: what happens if one part of the process fails?

If each service has its own database, there’s no easy way to roll back changes that have happened in the process prior to the failure. How do you handle the lack of a distributed transaction or two-phase commit?

Event Choreography is driven entirely by events being consumed and published by various boundaries within a system. There is no centralized coordination or logic. A long-running process workflow is created by one boundary publishing an event, another consuming it and performing some action, then publishing its own event. Depending on the workflow there could be many services involved but they are entirely decoupled and have no knowledge about how the entire workflow works.

Orchestration provides a centralized place to define the workflow for a long-running business process. The orchestrator consumes events but may send Commands to a specific boundary, generally still asynchronously via a message queue. Orchestration is telling other services to perform a specific action. Those services in turn publish events that the orchestrator consumes to start the next part of the workflow.
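As a rough sketch of the orchestration side, here's a hypothetical order workflow where the orchestrator consumes events and sends commands; send_command stands in for sending to your broker's queues, and the event/command names are made up for illustration. With choreography, by contrast, each boundary would react to the previous boundary's event directly and there would be no central class like this:

```python
# Hypothetical stand-in for sending a command to a specific boundary's queue.
def send_command(queue: str, command: dict) -> None:
    print(f"command -> {queue}: {command}")

class OrderOrchestrator:
    """Centralized definition of the workflow: consume events, send commands."""

    def handle(self, event: dict) -> None:
        if event["type"] == "OrderPlaced":
            send_command("billing", {"type": "ChargeCreditCard", "orderId": event["orderId"]})
        elif event["type"] == "PaymentCompleted":
            send_command("warehouse", {"type": "ShipOrder", "orderId": event["orderId"]})
        elif event["type"] == "PaymentFailed":
            send_command("sales", {"type": "CancelOrder", "orderId": event["orderId"]})

orchestrator = OrderOrchestrator()
orchestrator.handle({"type": "OrderPlaced", "orderId": "order-123"})
orchestrator.handle({"type": "PaymentCompleted", "orderId": "order-123"})
```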

For more info on Event Choreography & Orchestration, check out my post and video with code samples of how it works.

Failures

Transient Faults are unpredictable and could be caused by network issues, availability, or latency with the service you’re communicating with.

In an event driven architecture, you get the benefit of having various ways of handling failures. The most common approach for transient failures is an immediate retry.

For example, say you're consuming a message and have to interact with some other dependency: a database, a cache, or a 3rd party service. If that dependency fails while you're consuming the message, simply retry consuming the message again.

If the failure continues, using an Exponential Backoff adds more delay between retries. You may start with an immediate retry and, if the failure continues, wait 5 seconds before retrying again. If it still fails, wait even longer, say 10 seconds, and retry again. You can configure this exponential backoff with different intervals and a total number of retries.

If all retries are failing you may choose to move the message that cannot be properly consumed to a dead letter queue. This allows you to continue processing other messages while not losing the message that cannot be processed. Moving a message over to a Dead Letter Queue allows you to attempt to process the message later or investigate why the consumer is failing. You’re not losing the actual message.
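Here's a rough sketch of that flow: retry the handler with an exponential backoff, and if every attempt fails, move the message to a dead letter queue (move_to_dead_letter_queue is a stand-in for publishing to a real dead letter queue; the delays and handler are just for illustration):

```python
import time

def move_to_dead_letter_queue(message: dict, error: Exception) -> None:
    # Stand-in: in practice you'd publish the message to a dedicated dead letter queue.
    print(f"dead-lettered {message} after error: {error}")

def consume_with_retries(message: dict, handler, delays=(0, 5, 10, 20)) -> None:
    """Immediate retry first (delay 0), then exponentially longer waits between retries."""
    for attempt, delay in enumerate(delays, start=1):
        try:
            time.sleep(delay)
            handler(message)
            return
        except Exception as error:  # assuming the failure is transient and worth retrying
            last_error = error
            print(f"attempt {attempt} failed: {error}")
    move_to_dead_letter_queue(message, last_error)

def flaky_handler(message: dict) -> None:
    raise ConnectionError("3rd party service unavailable")

consume_with_retries({"orderId": "order-123"}, flaky_handler, delays=(0, 1, 2))
```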

Handling failures with these various patterns is required for an event driven architecture to be resilient and reliable.

For more info on Handling Failures in a Message Driven Architecture, check out my post and video.

Source Code

Developer-level members of my CodeOpinion YouTube channel get access to the full source for any working demo application that I post on my blog or YouTube. Check out the membership for more info.

Processing Large Payloads with the Claim Check Pattern

How do you handle processing large payloads? Maybe a user has uploaded a large image that needs to be resized to various sizes. Or perhaps you need to perform some ETL on a text file and interact with your database. One way is with a Message Broker to prevent any blocking of the calling code, combined with the Claim Check Pattern to keep message sizes small so you don't exceed any message limits or cause performance issues with your message broker.

The pattern is to send the payload data to an external service or blob storage, then put a reference ID/pointer to the blob storage location within the message sent to the Message Broker. The consumer can then use the reference ID/pointer to retrieve the payload from blob storage. Just like a Claim Check! This keeps message sizes small so you don't overwhelm your message broker.

YouTube

Check out my YouTube channel where I post all kinds of content that accompanies my posts including this video showing everything that is in this post.

In-Process

As an example, if a user is uploading a large file to our HTTP API, and we then need to process that file in some way, this could take a significant amount of time. Let’s say it’s simply a large text file where we need to iterate through the contents of the file, extract the data we need, then save the data to our database. This is a typical ETL (Extract, Transform, Load) process.

There are a couple of issues with doing this ETL when the user uploads the file. The first is that we'll be blocking the user while the ETL occurs; again, if this is a long process, that could take a significant amount of time. The second issue is that if there are any failures throughout processing, we may only partially process the file.

What I’d rather do is accept the file in our HTTP API, return back to the user/browser that the upload is complete and the file will be processed.

Out of Process

To move the processing of the file into another separate process, we can leverage a queue.

First, the Client/Browser will upload the file to our HTTP API.

Once the file has been uploaded, we create a message and send it to the queue of our message broker.

Once the message has been sent to the queue, we can then complete the request from the client/browser.

Now asynchronously a consumer can receive the message from the message broker and do the ETL work needed.

Large Messages

There is one problem with this solution. If the file being uploaded is large and we’re putting the contents into the message on our queue, that means we’re going to have very large messages in our queue.

This isn’t a good idea for a few reasons. The first is that your message broker might not even support the size of messages you’re trying to send it. The second is that large messages can have performance implications with the message broker because you’re pushing a large amount of data to them, and then also pulling that large message out. Finally, the third issue is that your message broker may have a total volume limit. It may not be the number of messages but rather the total volume that has a limit. This means that you may only be able to have a limited number of messages because the messages themselves are so large.

This is why it’s recommended to keep messages small. But how do you keep a message small when you need to process a large file? That’s where the claim check pattern comes in.

First, when the file is uploaded to our HTTP API, it will upload the file to shared blob/file storage. Somewhere that both the producer and consumer can access.

Once uploaded to blob/file storage, the producer will then create a message that contains a unique reference to the file in blob/file storage. This could be a key, file path, or anything that is understood by the consumer on how to retrieve the file.

Now the consumer can receive the message asynchronously from the message broker.

The consumer will then use the unique reference or identifier in the message to then read the file out of blob/file storage and perform the relevant ETL work.

Claim Check Pattern

If you have a large payload from a user that you need to process, offload that work asynchronously to a separate process using a queue and message broker. But use the claim check pattern to keep your messages small. Have the producer and consumer share blob or file storage where the producer can upload the file and then create a message that contains a reference to the uploaded file. When the consumer receives the message, it can use the reference to read the file from blob storage and process it.
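Here's a minimal sketch of that whole flow, using a temporary local directory as a stand-in for shared blob storage and a print statement as a stand-in for the message broker; the function and key names are just for illustration:

```python
import json
import tempfile
import uuid
from pathlib import Path

BLOB_STORE = Path(tempfile.mkdtemp())  # stand-in for shared blob/file storage

def send_message(queue: str, message: dict) -> None:
    # Stand-in for the message broker client; note only the small reference is sent.
    print(f"{queue} <- {json.dumps(message)}")

# Producer (HTTP API): put the large payload into blob storage, enqueue only a reference.
def accept_upload(file_name: str, contents: str) -> str:
    blob_key = f"{uuid.uuid4()}-{file_name}"       # the "claim check" ticket
    (BLOB_STORE / blob_key).write_text(contents)   # payload goes to blob storage, not the queue
    send_message("file-processing", {"blobKey": blob_key})
    return blob_key

# Consumer: use the reference in the message to read the file back and do the ETL work.
def handle_file_uploaded(message: dict) -> None:
    contents = (BLOB_STORE / message["blobKey"]).read_text()
    for line in contents.splitlines():
        print(f"processing record: {line}")  # extract, transform, and load into your database

blob_key = accept_upload("orders.csv", "order-1,100\norder-2,250")
handle_file_uploaded({"blobKey": blob_key})
```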

Source Code

Developer-level members of my CodeOpinion YouTube channel get access to the full source for any working demo application that I post on my blog or YouTube. Check out the membership for more info.

Synchronous vs Messaging: When to use which?

Not all communication within a system will be synchronous request/response with HTTP/RPC, nor will it all be asynchronous messaging. But how do you choose between Synchronous vs Messaging? Well, it depends on whether it's a command and/or a query, as well as where the request is originating from. If you want reliability and resiliency, then use messaging where it's appropriate.

YouTube

Check out my YouTube channel where I post all kinds of content that accompanies my posts including this video showing everything that is in this post.

Synchronous

The most common places you’ll encounter making synchronous Request/Response calls are to 3rd party services, infrastructure (like a database, cache, etc), or from a UI/Client.

A typical example of this would be a Javascript frontend application making an HTTP call to a Web API. Or perhaps your backend making a call to a Cloud Service such as Blob Storage. Integration involving newer B2B services is usually done as synchronous request/response calls using HTTP.

Asynchronous

Communicating asynchronously most often happens between internal services that your team or another team owns within an organization. Another good approach is to use asynchronous messaging within your own service or monolith. Check out my post on using Message Driven Architecture to decouple a monolith for more.

In B2B, a common use case for asynchronous communication is EDI where you’re exchanging files via mailboxes.

Commands & Queries

One way to distinguish where to use Synchronous vs Messaging is whether you're performing a command or a query: a command being a request to change state, and a query being a request to return state.

If you're using a javascript application in the browser that's communicating with an HTTP API backend service, you're going to be using HTTP for synchronous request/response. If the backend service is communicating with a database, that will also be synchronous. This is to be expected and is how most applications work.

However, when you’re communicating between internal services, I recommend communicating asynchronously using a Message Driven Architecture via Events and Commands.

This means your services are communicating via a Message Broker to send commands to a queue or publish messages to a topic.

The reason I do not recommend HTTP to communicate between services is the complexity of dealing with latency issues, availability concerns, resilience, difficulty debugging, and most importantly, coupling.

Check out my post on REST APIs for Microservices? Beware! for more info on why you should avoid it as the primary way to communicate between services.

Origination

An important distinction besides commands and queries in the Synchronous vs Messaging debate is where the request originated from. For example, a client UI/browser sends an HTTP Request to our Web API Backend service, which then makes a synchronous call to a 3rd party service.

But what happens when that 3rd party service is unavailable or is timing out? What do we return to the client UI? Do we just send the Client UI back an error message? If the 3rd party service is critical to your application/service, does that mean that as long as it's unavailable, your service is also unavailable?

Take this exact same situation, but change the originator to be a message broker, and the implications are very different.

If the synchronous call from our app/service was caused by a message from a message broker and the 3rd party service is unavailable, then we have many different courses of action.

We can do an immediate retry to resolve any transient errors. We can implement an exponential backoff, where we retry and wait a period of time before retrying again. You can also move the message to a dead letter queue that will allow you to investigate and manually retry the messages once you know the 3rd party service is available again.

You have many different options. Check out my post on Handling Failures in a Message Driven Architecture for more info.

The point is that you aren't losing work that needs to be completed, nor are any users going to be aware that there is potentially an issue.

To accomplish this for the above example, we simply need to change from synchronous to asynchronous at some point through the call path.

This means that our Client UI will still make a synchronous call to our app/service; however, instead of calling the 3rd party immediately, we'll enqueue a message to the message broker and return an immediate response to our client UI.

Then we will consume that same message asynchronously from the message broker and complete the work that needs to communicate with the 3rd party. If there are any failures, we now have the ability to handle those failures with different resiliency and fault tolerance and we do not lose any work that needs to be done.
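Here's a rough sketch of that change, using an in-memory queue as a stand-in for the message broker and a hypothetical invoice example; the point is only that the HTTP handler enqueues a message and responds immediately, while a consumer makes the 3rd party call asynchronously:

```python
import queue

message_broker = queue.Queue()  # in-memory stand-in for a real message broker/queue

# Synchronous edge: handle the client's HTTP request, enqueue, respond right away.
def handle_send_invoice_request(invoice_id: str) -> dict:
    message_broker.put({"type": "SendInvoice", "invoiceId": invoice_id})
    return {"status": 202, "body": "Invoice queued for delivery"}  # immediate response to the UI

# Asynchronous consumer: makes the 3rd party call; failures can be retried or dead-lettered.
def invoice_consumer() -> None:
    while not message_broker.empty():
        message = message_broker.get()
        call_third_party_invoice_api(message["invoiceId"])

def call_third_party_invoice_api(invoice_id: str) -> None:
    print(f"delivered invoice {invoice_id} to the 3rd party service")

print(handle_send_invoice_request("inv-001"))
invoice_consumer()
```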

Use Case

A good example of how this applies in real applications is when you’re in the AWS EC2 Console. This would apply to many different services with any cloud provider. I’ll use AWS EC2, which is an AWS Virtual Machine service.

If you have a running instance and you choose to stop the instance, it doesn’t immediately stop.

When you click the “Stop Instance” menu option, the browser doesn’t sit and wait for the HTTP request to finish. The request is fairly quick and then your browser displays that the Instance state is “Stopping”.

This is because of the exact example I illustrated earlier where the request from the browser was synchronous, but the actual work being performed is asynchronous.

Not all interactions can be done this way. If the end-user expects to immediately see their changes, then forcing an asynchronous workflow will prove challenging. If you can set the correct expectation with the user about long-running processes, then you can leverage events to drive real-time web updates.

Check out my post on using Real-Time Web by leveraging Event Driven Architecture.

Source Code

Developer-level members of my CodeOpinion YouTube channel get access to the full source for any working demo application that I post on my blog or YouTube. Check out the membership for more info.
