Handling Duplicate Messages (Idempotent Consumers)

“At Least Once” delivery guarantees that a message will be delivered to a consumer one or more times. This means you need to build your consumers so they can handle duplicate messages safely. The term for this is having idempotent consumers.

Not doing so could result in some bad outcomes for your system.

For example, processing a message that creates an order twice could create two orders. That is unlikely to be a good outcome.

Why do messages get delivered more than once?

How do you handle duplicates?

Here’s how to make idempotent consumers and be resilient to duplicate messages.

Delivery Guarantees

Before we jump too far ahead, I want to quickly cover message delivery guarantees from message brokers. Different brokers provide different guarantees, but they fall into these three categories.

At Most Once

Consumers will receive a message once or possibly not at all.

At Least Once

Consumers will receive a message once or possibly multiple times. I’ll cover why it may be delivered multiple times and how to handle it in this post.

Exactly Once

This one is tricky and complex. Some brokers and event logs support it: the producer sends a message exactly once, and the message is delivered to consumers exactly once (excluding failures and retries).

Idempotent Consumers

At least once delivery is the most common guarantee among message brokers, but it is not the only reason you need to handle duplicate messages. Here are some reasons why a message can get delivered more than once.

At Least Once Delivery

When a message broker delivers a message to a consumer, it does not consider the message processed until the consumer acknowledges the delivery.

The acknowledgment can happen either implicitly or explicitly, depending on the messaging library you’re using.

Unacknowledged & Timeouts

  1. If your consumer fails (for whatever reason) and never acknowledges the delivery to the broker, then the broker will send the message again to the consumer.
  2. If you’re using a library that requires you to acknowledge in code and, for whatever reason, that acknowledgment never occurs, the message will be delivered to the consumer again.
  3. There is also generally a timeout: a period of time within which the acknowledgment needs to occur. If you do send the acknowledgment, but only after this time has elapsed, the broker will deem the message unacknowledged and resend it to the consumer.

In any of the 3 cases above, if you’ve made a state change to your database, and the message gets delivered again, you’re going to make the same state change again.

This could have some very negative impacts. As mentioned at the start, if you were creating an Order in the consumer and you receive the message more than once, you would end up creating multiple orders. Not ideal.
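To make the failure mode concrete, here is a minimal sketch in Python. `ToyBroker` is a hypothetical in-memory stand-in for an at-least-once broker, not the API of any real messaging library, and the consumer is rigged to fail after its state change but before it acknowledges.

```python
from collections import deque


class ToyBroker:
    """In-memory stand-in for an at-least-once broker (hypothetical, not a real API)."""

    def __init__(self):
        self.queue = deque()

    def publish(self, message):
        self.queue.append(message)

    def deliver_all(self, handler, max_deliveries=10):
        """Deliver until the queue is empty; return the number of deliveries made."""
        deliveries = 0
        while self.queue and deliveries < max_deliveries:
            message = self.queue[0]
            deliveries += 1
            try:
                handler(message)
            except Exception:
                # No acknowledgment: the message stays queued and is redelivered.
                continue
            self.queue.popleft()  # Acknowledged: remove the message.
        return deliveries


def make_flaky_order_consumer(orders):
    """Build a consumer that creates an order, then fails before acking on its first call."""
    calls = {"count": 0}

    def consume(message):
        orders.append(message["order_id"])  # The state change happens first.
        calls["count"] += 1
        if calls["count"] == 1:
            raise RuntimeError("consumer crashed before acknowledging")

    return consume


def simulate_redelivery():
    """Publish one message and return (deliveries, orders created)."""
    orders = []
    broker = ToyBroker()
    broker.publish({"message_id": "m-1", "order_id": "order-42"})
    deliveries = broker.deliver_all(make_flaky_order_consumer(orders))
    return deliveries, orders
```

Calling `simulate_redelivery()` publishes a single message, yet the consumer runs twice and records the same order twice, because the first attempt changed state without ever acknowledging.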

Producer Duplicating

Another reason you could receive duplicates is that the producer itself sends the same message more than once.

This can occur simply because of a bug in your code but also because of the outbox pattern. You can refer to my post on the outbox pattern for the problem it solves, but it does introduce duplicate message issues.

The producer will pull messages/events from the database and then publish those to a message broker. After it does that, it then has to update the database to mark the messages as published. But because these are two different operations, the update could fail. If that happens, the producer will send the messages again to the message broker.

This will result in the consumer receiving the same message.
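A rough Python sketch of that outbox failure, under simplifying assumptions: the outbox is a plain list of dicts rather than a database table, `sent` stands in for the message broker, and the `fail_on_mark` flag is a hypothetical switch that simulates the database update failing.

```python
def run_outbox_once(outbox, sent, fail_on_mark=False):
    """Publish every unpublished outbox record, then mark it as published.

    Publishing and marking are two separate operations, so a failure between
    them leaves the record unpublished and it is sent again on the next run.
    """
    for record in outbox:
        if record["published"]:
            continue
        sent.append(record["message_id"])  # Step 1: publish to the broker.
        if fail_on_mark:
            # Simulated failure of the database update (assumption for the demo).
            raise ConnectionError("could not mark message as published")
        record["published"] = True  # Step 2: mark as published.


def simulate_outbox_duplicate():
    """Return the message IDs the broker received across a failed run and a retry."""
    outbox = [{"message_id": "m-1", "published": False}]
    sent = []
    try:
        run_outbox_once(outbox, sent, fail_on_mark=True)
    except ConnectionError:
        pass  # The publisher will simply try again later.
    run_outbox_once(outbox, sent)
    return sent
```

The first run publishes the message but fails to mark it, so the retry publishes it a second time and the consumer sees the same message ID twice.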

Handling Duplicates

In order to handle duplicate messages, we need to record what messages we’ve previously processed.

You want to record the message ID and the consumer when a consumer processes a message.

If you’re in a concurrent environment, then you also want to save this alongside your state changes to your application within the same database and transaction.

My implementation uses Entity Framework Core. I’ve added two new methods:

IdempotentConsumer(), which adds a new record that contains the messageId and Consumer name.

HasBeenProcessed(), which checks to see if a record exists.

The IdempotentConsumer model has a Primary/Unique key on MessageId, Consumer. This is important in a concurrent environment.
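As an illustration only, here is what the same idea might look like in Python with the standard-library `sqlite3` module. This is not the Entity Framework Core code the post describes; the table name, column names, and function names are assumptions that mirror the two methods above.

```python
import sqlite3


def create_schema(conn):
    """Create the tracking table with a composite primary key."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS idempotent_consumers (
            message_id TEXT NOT NULL,
            consumer   TEXT NOT NULL,
            PRIMARY KEY (message_id, consumer)
        )
        """
    )


def idempotent_consumer(conn, message_id, consumer):
    """Record that `consumer` has processed `message_id`.

    Raises sqlite3.IntegrityError if the pair is already recorded.
    """
    conn.execute(
        "INSERT INTO idempotent_consumers (message_id, consumer) VALUES (?, ?)",
        (message_id, consumer),
    )


def has_been_processed(conn, message_id, consumer):
    """Return True if a record exists for this message and consumer."""
    row = conn.execute(
        "SELECT 1 FROM idempotent_consumers WHERE message_id = ? AND consumer = ?",
        (message_id, consumer),
    ).fetchone()
    return row is not None
```

Keying on both columns lets different consumers each process the same message once, while a second insert for the same pair is rejected by the database.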

Now in our consumer, we’re going to check if the message has been processed at the very beginning using the HasBeenProcessed() method. If it has, just exit early.

Then in the same transaction as our state change, we’re also going to use the IdempotentConsumer() method to add a new record.

If the same message is processed at the exact same time (concurrently), the unique key constraint on MessageId, Consumer will cause an exception when we save or commit the transaction.

We’ve now implemented an idempotent consumer. It can fully handle duplicate messages.

Naturally Idempotent

Not every consumer needs to keep track of the messages it has processed. If your consumer does not have any side effects that will cause issues when they are executed again, then you might consider just letting it run.

For example, consider a consumer that sets the ShippingLabel to Cancelled when an order is cancelled.

If this is executed multiple times by duplicate messages, there are no side-effects that we are concerned with. The state remains the same.
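A small Python sketch of the difference, using hypothetical names: the shipping labels are held in a plain dict keyed by order ID, and a cancellation counter is included purely as a contrasting handler that is not naturally idempotent.

```python
def handle_order_cancelled(shipping_labels, order_id):
    """Naturally idempotent: assigning the same value again changes nothing."""
    shipping_labels[order_id] = "Cancelled"


def count_cancellation(stats):
    """Not idempotent: every duplicate delivery increments the counter again."""
    stats["cancelled_orders"] = stats.get("cancelled_orders", 0) + 1


def apply_twice():
    """Run both handlers twice, as a duplicate delivery would."""
    shipping_labels = {"order-42": "Created"}
    stats = {}
    for _ in range(2):
        handle_order_cancelled(shipping_labels, "order-42")
        count_cancellation(stats)
    return shipping_labels, stats
```

After two deliveries the label is still just Cancelled, while the counter has been bumped twice. The first handler can safely ignore duplicates; the second would need processed-message tracking.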

Having naturally idempotent consumers means you do not need to keep track of processed messages, but it requires diligence. When the code changes, you may introduce other side effects that make a consumer no longer naturally idempotent, at which point you do need to record which messages you’ve processed.
