Avoiding a QUEUE Backlog Disaster with Backpressure & Flow Control

I advocate a lot for asynchronous messaging. It can add reliability, temporal decoupling, and much more. But what are some of the challenges? One of them is backpressure and flow control. This occurs when you’re producing more messages can you can consume. Meaning you’re piling up messages in your queue and you can never catch up. The queue just keeps growing.

YouTube

Check out my YouTube channel where I post all kinds of content that accompanies my posts including this video showing everything in this post.

Producers and Consumers

Producers send messages to a broker/queue and a Consumer processes those messages. For a simplistic view, we have a single producer and a single consumer.

The producer creates a message and sends it to the broker/queue.

Single Producer sending a message to a queue

The message can sit on the broker until the consumer is ready to process it. This enables the producer and the consumer to be temporally decoupled.

Temporal Decoupling provided as the queue holds the message

The consumer then processes the message and it is removed from the broker/queue.

Consumer process the message from the queue

As long as you can consume messages on average faster than messages are produced, you won’t get into having a queue backlog.

But since there can be many producers, or because of load, you may start producing more messages at a faster rate than can be consumed.

For example, if you’re producing a single message every second, yet it takes you 1.5 seconds to process the message, you’re going to start filling up the queue. You’ll never be able to catch up and have an empty queue.

Queue backlog

Most systems will have peaks and valleys in terms of how many messages are produced. During the valleys is where the consumer can catch up. But if again, on average, you’re producing more messages than can be processed, you’re going to build a queue backlog.

Competing Consumers

One solution is to add more consumers so that you can process more messages concurrently. Basically, you’re increasing your throughput. You need to match or exceed the rate of production with consumption.

The competing consumer pattern is having multiple instances of the consumer that are competing for messages on the queue.

Competing Consumers of multiple consumer instances

Since we have multiple consumers, we can now process 2 messages concurrently. Since one consumer is busy processing a message, if another message is sent to the queue, we have another consumer that is available.

Each consumer available competes for the next message

The consumer that is available will compete for the next message and process it.

Competing consumers adds more processing which increases throughput

There are a couple of issues with the Competing consumers’ pattern.

The first is if you’re expecting to be processing messages in order. Just because you’re using a FIFO (first-in, first-out) queue, that does not mean you’ll process messages in order as they were produced. Because you’re processing messages concurrently, you could finish processing messages out of order.

The second issue is you’ve moved the bottleneck. Any downstream services that are used when consuming a message will now experience additional load. For example, if you’re interacting with a database, you’re now going to add additional calls to that database because you’re now processing more messages at a given time.

Competing consumers adding additional load on downstream services

Incoming

A queue is like a buffer. My analogy is to think of a queue as a pond of water. There is a stream of water as an inflow on one end, and a stream of water as an outflow on the other.

If the stream of water coming in is larger than the stream of water going out, the water level on the pond will increase. In order to lower the water level, you need to widen the outgoing stream to allow more water to escape. This will lower the water level.

But another way to maintain the water level is to limit the amount of water entering the pond.

In other words, limit the producer to only be able to add so many messages to the queue.

Setting a limit on the broker/queue itself means when the producer tries to send a message to the queue, if it’s reached its limit, it won’t accept the message.

Queue Backlog handled by limiting producer

Because the producer might be a client/UI, you might want to have built-in retry and other ways of handling this failure if you cannot enqueue a message. Generally, I think this way of handling this backpressure is used as a safeguard to not overwhelm the entire system.

Queue Backlog

Ultimately when dealing with queues (and topics!) you need to understand some metrics. The rate at which you’re producing messages. The rate at which you can consume messages. How long are messages sitting in the queue? What’s the lead time, from when it was produced to when it was consumed to be processed? What’s the processing time, and how long does it take to consume a specific message?

Knowing these metrics will allow you to understand how to handle backpressure and flow control. Look at both sides, producing and consuming.

Look at competing consumers’ pattern to increase throughput. Also, look at if there are optimizations to be made for how a consumer is processing a message. Be aware of downstream services that also will be affected by increasing throughput.

Add a safeguard on the producing side to not get into a queue backlog situation you can’t recover from.

Join!

Developer-level members of my YouTube channel or Patreon get access to a private Discord server to chat with other developers about Software Architecture and Design as well as access to source code for any working demo application that I post on my blog or YouTube. Check out the YouTube Membership or Patreon for more info.

Follow @CodeOpinion on Twitter

Software Architecture & Design

Get all my latest YouTube Vidoes and Blog Posts on Software Architecture & Design

Message Ordering in Pub/Sub or Queues

Do you need message ordering? Processing messages in order as they were sent/published to a queue or topic sounds simple but has some implications. What happens when processing a message fails? What happens to all subsequent messages? How is throughput handled when using the competing consumers’ pattern, which loses the guarantee processing in order. Lastly, is ordering even important?

YouTube

Check out my YouTube channel where I post all kinds of content that accompanies my posts including this video showing everything in this post.

Message Ordering

What are some reasons that most people think they need to process messages in a particular order? The most common reason is workflow. As an example events are expected to have occurred in a particular order such as OrderPlaced, OrderBilled, OrderShipped.

The second most common is because of CRUD & Property-Based Events. Generally, this is more referring to Event Carried State Transfer, where you are trying to propagate the state of entities to other services. As an example, ProductCreated and ProductNameUpdated. You must process these events in order. If you process ProductNameUpdated before ProductCreated, you have no data/record to update and change the Product Name.

So how do you achieve message ordering? Well if you have a message broker that supports FIFO (First In, First Out) and you have a single consumer that is processing messages in a single-threaded or one at a time, you will process the messages in the same order as they were delivered to the broker.

As an example, we have a producer (or many) creating messages and sending them to the broker.

As the prior example, let’s say the first Event was ProductCreated, and then the user immediately changed the name, so another event was generated ProductNameUpdated. Both events are now on Topic waiting to be consumed.

With a single consumer, it will process the first event (ProductCreated).

Once it’s done and updated its local cache of the product, the consumer will then process the next event ProductNameUpdated.

Competing Consumers

When you want to scale and process more messages you’ll start using the competing consumers’ pattern. This is basically having multiple instances of the consumer running so you can process more messages concurrently.

This poses a problem because if we have two events ProductCreated and ProductNameUpdated sitting in our Topic waiting to be consumed, we now have two consumers able to process those events.

Since they are independent and process messages concurrently, this means that one consumer could be processing ProductCreated at the same time that the other consumer is processing ProductNameUpdated.

Now there is a race condition and we need ProductCreated to finish processing first, otherwise, ProductNameUpdated will fail since we haven’t finished processing ProductCreated yet.

Competing consumers applies to Topics, often called Consumer Groups as well as Queues.

One solution to solve this issue with processing messages in order when using competing consumers’ pattern is to process related messages one at a time. To achieve this, different messaging platforms will call these Partitions, Message Groups, or an Ordering Key.

To illustrate this, let’s say we have a Partition/Message Group/Ordering Key on a ProductID. This means that only a single consumer will get process the messages for that ProductID.

We have two messages sent to a Topic. ProductCreated for productID=1 and ProductCreated for ProductID=2.

The first (top in the diagram) consumer is responsible for handling all events where ProductID=1

And the bottom consumer is responsible for ProductID=2.

Now if we get another event ProductNameChanged for ProductID=1, it will to the first partition.

And the same consumer will handle it since it is handling all the messages for that partition.

This strategy allows you to process messages in the order that relate to each other and you want to process them one at a time. However, it still allows you to process many messages concurrently that don’t relate. Consumers can be responsible for many different Partitions/Message Groups/Ordering Keys.

Failures

Failures are another thing to consider. If you’re processing messages in order, how do you want to handle not being able to process a message? What happens to all the messages behind it waiting to be processed that relate to the failed message?

If the first event to be processed is ProductCreated, and the following event is ProductNameChanged. They are waiting to be consumed by a single consumer to be processed one at a time.

If the consumer attempts to process ProductCreated but it fails to do so, because of a bug, serialization issue, or whatever the reason, how do you now process ProductNameChanged?

This is very situational. Maybe you can discard the failed message and continue on. Maybe you need to treat it as a poison pill and stop processing any future messages. Again, this needs to be considered if you want to process messages in order.

WorkFlow

Do you really need to process messages in order? Often times you can create a policy to understand when all the relevant events have occurred so that you can then perform a specific action.

When an order is placed in our system, we need to charge the customer in billing, and then create a shipping label in the warehouse.

Billing will consume the OrderPlaced event and charge the customer.

Once it charges the customer, it will publish an OrderBilled event.

The warehouse will have a Policy (illustrated by an NServiceBus Saga below), that will keep track that both OrderPlaced and OrderBilled have occurred.

Now depending on message ordering and when these are published and how we consume them, these can be processed out of order. We could of processed OrderBilled first, and then later OrderPlaced was consumed, even though that’s not the order they were published.

Once both events have been consumed by the Policy/Saga, it can send a CreateShippingLabel command to the Warehouse.

Here’s an example of what this looks like with NServiceBus.

Message Ordering

So do you need to process messages in order? Maybe, maybe not. I’d take a look at the workflow of what you’re trying to achieve to be sure that you actually require to process messages in order. If you do, you’ll have to leverage FIFO (first-in, first-out) queues or topics. As well as use a broker that supports single consumers to process partitions, message groups, or ordering keys.

Join!

Developer-level members of my YouTube channel or Patreon get access to a private Discord server to chat with other developers about Software Architecture and Design as well as access to source code for any working demo application that I post on my blog or YouTube. Check out the YouTube Membership or Patreon for more info.

Follow @CodeOpinion on Twitter

Software Architecture & Design

Get all my latest YouTube Vidoes and Blog Posts on Software Architecture & Design

Commands & Events: What’s the difference?

One of the building blocks of messaging is, you guessed it, messages! But there are different kinds of messages: Commands and Events. So what’s the difference? Well, they have very distinct purposes, usage, naming, ownership, and more!

YouTube

Check out my YouTube channel where I post all kinds of content that accompanies my posts including this video showing everything in this post.

Commands

The purpose of commands is the intent to invoke behavior. When you want something to happen within your system, you send a command. There is some type of capability your service provides and you need a way to expose that. That’s done through a command.

I didn’t mention CRUD. While you can expose Create, Update, and Delete operations through commands, I’m more referring to specific behaviors you want to invoke within your service. Let CRUD just be CRUD.

Commands have two parts. The first is the actual message (the command), which is the request and intent to invoke the behavior. The second is consumer/handler for that command which is performing and executing the behavior requested.

Commands have only a single consumer/handler that resides in the same logical boundary that defines and owns the schema and definition command.

Commands can be sent from many different logical boundaries. There can be many different senders.

To illustrate this, the diagram below has many different senders, which can be different logical boundaries. The command (message) is being sent to a queue to decouple the sender and consumer.

Commands & Events: What's the difference?

A single consumer/handler, that owns the command, will receive/pull the message from the queue.

When processing the message, it may interact with its database, as an example.

Commands & Events: What's the difference?

As mentioned, there can be many senders, so we could have a completely different logical boundary also sending the same command to the queue, which will be processed the same way by the consumer/handler.

Lastly, naming is important. Since a command is the intent to invoke behavior, you want to represent it by a verb and often a noun. Examples are PlaceOrder, ReceiveShipment, AdjustInventory, and InvoiceCustomer. Again, notice I’m not calling these commands CreateOrder, UpdateProduct, etc. These are specific behaviors that are related to actual business concepts within a domain.

Events

Events are about telling other parts of your system about the fact that something occurred within a service boundary. Something happened. Generally, an event can be the result of the completion of a command.

Events have two parts. The first is the actual message (the event), which is the notification that something occurred. The second is the consumer/handler for that event which is going to react and execute something based on that event occurring.

There is only one logical boundary that owns the schema and publishes an event.

Event consumers can live within many different logical boundaries. There may not be a consumer for an event at all. Meaning there can be zero or many different consumers.

To illustrate, the single publisher that owns the event will create and publish it to a Topic on a Message Broker.

Commands & Events: What's the difference?

That event will then be received by both consumers. Each consumer will receive a copy of the event and be able to execute independently in isolation from each other. This means that if one consumer fails, it will not affect the other.

Commands & Events: What's the difference?

Naming is important. Events are facts that something happened. They should be named in the past tense which reflects what occurred. Examples are OrderPlaced, ShipmentReceived, InventoryAdjusted, and PaymentProcessed. These are the result of specific business concepts.

Full Cycle

So how do commands and events fit together? Since Commands are about invoking intent, and Events are about indicating that something occurred, you can see how the result of a command can be publishing an event.

First, we have a client/browser making a call to our HTTP API. Let’s say its to place an order.

In this specific case, we want to just accept the incoming HTTP request and capture the relevant data and send a PlaceOrder command to a queue on our message broker so we can process the message asynchronously. At which point we can then immediately return the client/browser that we accepted the request and are processing it.

Asynchronously the command handler can pull the message from the queue and execute whatever behavior is required to place the order.

The result of this is now we are going to publish an OrderPlaced event to a topic on the message broker.

We have 2 different consumers for the OrderPlaced event. One is to send an email to the customer saying thank you for your order. The other consumer might be to process their credit card and charge them for the order.

Commands & Events

To summarize commands and events we can compare and contrast their purpose, ownership, consumers, senders, and naming.

Commands & Events: What's the difference?

Join!

Developer-level members of my YouTube channel or Patreon get access to a private Discord server to chat with other developers about Software Architecture and Design as well as access to source code for any working demo application that I post on my blog or YouTube. Check out the YouTube Membership or Patreon for more info.

Follow @CodeOpinion on Twitter

Software Architecture & Design

Get all my latest YouTube Vidoes and Blog Posts on Software Architecture & Design