I’ve recently read a few blogs and watched videos that compare gRPC with REST and GraphQL. The majority seemed to claim that gRPC is the standard for communication between services without giving any real reason. I think it’s more useful to explain the situations where gRPC could be a good fit and where I’d avoid using it.
YouTube
Check out my YouTube channel, where I post all kinds of content accompanying my posts, including this video showing everything in this post.
Query Composition
I can see gRPC being useful where requests are naturally request-response. Queries and performing UI/ViewModel Composition are naturally request-response and would be a good fit for gRPC.
When you have multiple services that own various pieces of data that the UI/Client requires, you need to perform this composition somewhere. One option is to do this with a Backend-for-Frontend (BFF). The BFF makes all the relevant query calls to the services to get the data. It then composes that data and returns it to the client.
Calls from the BFF to each service using gRPC can be synchronous request-response. Typically you’d see an HTTP API (what most would call REST) in this case; however, gRPC is another option here.
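As a minimal sketch of that composition, here is what a BFF might do. The three query functions are hypothetical stand-ins for request-response calls to separate services (gRPC or HTTP); their names and the shape of the data are assumptions made only for illustration.

```python
# Hypothetical stand-ins for query calls to three separate services.
# In a real system each would be a gRPC or HTTP request over the network.
def get_customer(customer_id: str) -> dict:
    return {"id": customer_id, "name": "Ada"}


def get_open_orders(customer_id: str) -> list:
    return [{"order_id": "o-1", "total": 42.0}]


def get_loyalty_points(customer_id: str) -> int:
    return 120


def compose_customer_view(customer_id: str) -> dict:
    """BFF: query each owning service, then compose one view model."""
    return {
        "customer": get_customer(customer_id),
        "orders": get_open_orders(customer_id),
        "loyalty_points": get_loyalty_points(customer_id),
    }


if __name__ == "__main__":
    print(compose_customer_view("c-1"))
```

Every call here is a query, so nothing changes state. If one call fails, the BFF can fail the whole request or return a partial view without leaving anything inconsistent.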
Infrastructure & 3rd Parties
Another place I see gRPC being useful is when you need to make calls to infrastructure (as a service) or 3rd parties.
A database is a good example of infrastructure that is also naturally request-response. Yes, it’s a core part of your system, but it isn’t a service. EventStoreDB provides gRPC clients for interacting with it. This makes sense, as it allows you to interact with the database using gRPC rather than requiring a native SDK to be produced for every language/platform.
Another good fit for gRPC is 3rd party services, meaning services that you don’t own. As an example, this could be a currency exchange or map routing. You could also be the 3rd party providing the service. gRPC could be a good fit here as well.
Service to Service Nightmare
I do not see gRPC being a good fit with service-to-service communication. As mentioned, I’ve read enough blogs/articles/videos that state that gRPC is or should be the standard for service-to-service communication. I disagree.
I’ve mentioned a few times that I think gRPC is a fit in naturally request-response situations. That’s what gRPC is: remote procedure calls that are often blocking. Service-to-service communication using blocking RPC calls can lead to a nightmare of coupling and terrible reliability.
Let’s say we have a client that makes a request to ServiceA. ServiceA then makes a blocking synchronous call to ServiceB, which in turn makes a blocking synchronous call to some external service or 3rd party.
Once the call to ServiceB completes, ServiceA then makes a call to ServiceC.
Little do we know, since it’s a free-for-all where any service can call any other service, ServiceC then makes a call to ServiceD. But guess what? That call fails!
Since we aren’t in a single process, we don’t have an easy way to catch an exception, nor will we get a useful stack trace. ServiceA ultimately has to handle the failure, yet the cause isn’t ServiceC but something further downstream.
If state changes were happening in any of the services called, you do not have a distributed transaction. You must handle these failures and decide how to roll back or perform compensating actions.
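To make the problem concrete, here is a rough sketch of that call chain. Plain function calls stand in for blocking RPC calls, and the service functions, the exception type, and the idea that ServiceB commits a state change are all assumptions for illustration. Across a real network, ServiceA would likely only see a generic error or timeout from ServiceC rather than the original exception.

```python
class ServiceUnavailable(Exception):
    """Stand-in for whatever error a failed remote call surfaces as."""


def external_service() -> None:
    pass  # the 3rd party call succeeds in this scenario


def service_d() -> None:
    raise ServiceUnavailable("ServiceD is down")


def service_c() -> None:
    service_d()  # hidden downstream dependency that ServiceA doesn't know about


def service_b(committed: list) -> None:
    committed.append("ServiceB")  # assume ServiceB commits a state change
    external_service()


def service_a() -> dict:
    committed: list = []
    service_b(committed)
    try:
        service_c()
    except ServiceUnavailable as err:
        # No distributed transaction: ServiceB's change is already committed,
        # so ServiceA must decide how to compensate for it.
        return {"ok": False, "error": str(err), "needs_compensation": committed}
    return {"ok": True, "error": None, "needs_compensation": []}


if __name__ == "__main__":
    print(service_a())
```

The request fails because of ServiceD, a service that ServiceA never called directly, while ServiceB has already changed state.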
Total nightmare. For more headaches that can come with this, check out my post REST APIs for Microservices? Beware! The same would apply to gRPC.
Messaging
What was likely being attempted was a workflow modeled with blocking request-response calls between services. To add reliability to your system and business processes, model them as an asynchronous workflow using messaging.
By using a message broker, you eliminate direct communication between services. Services are no longer temporally coupled. If one service isn’t available, that does not stop or break the workflow.
Messaging can also be done in a request-reply style but still be asynchronous. This allows one service to call another and get a reply, but asynchronously.
When a client requests a service to start some type of workflow, the service can send a command/message to the broker (queue) for some other service to perform some work that is a part of the workflow.
Another service will consume this command and perform whatever actions it needs to take part in the workflow.
Once it’s consumed and finished processing the message, it can provide a reply message for the originating service.
Finally, the originating service can consume the reply message and handle it however it needs to. This could mean it might send another command to the broker for a different service to perform its part of the workflow.
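The flow above can be sketched with a toy in-memory broker. This is only an illustration under assumptions: the Broker class, the queue names, and the message types (ChargeOrder, OrderCharged, ShipOrder) are hypothetical, and a real system would use an actual message broker and a messaging library.

```python
from collections import deque
from typing import Optional


class Broker:
    """Toy in-memory broker with one FIFO queue per name."""

    def __init__(self) -> None:
        self.queues: dict = {}

    def send(self, queue: str, message: dict) -> None:
        self.queues.setdefault(queue, deque()).append(message)

    def receive(self, queue: str) -> Optional[dict]:
        q = self.queues.get(queue)
        return q.popleft() if q else None


def start_workflow(broker: Broker, order_id: str) -> None:
    """Originating service: send a command and say where the reply should go."""
    broker.send("billing.commands", {
        "type": "ChargeOrder", "order_id": order_id, "reply_to": "orders.replies",
    })


def billing_consume(broker: Broker) -> bool:
    """Another service: consume the command, do its work, send a reply."""
    command = broker.receive("billing.commands")
    if command is None:
        return False
    broker.send(command["reply_to"],
                {"type": "OrderCharged", "order_id": command["order_id"]})
    return True


def orders_consume_reply(broker: Broker) -> bool:
    """Originating service: consume the reply and send the next command."""
    reply = broker.receive("orders.replies")
    if reply is None:
        return False
    broker.send("shipping.commands",
                {"type": "ShipOrder", "order_id": reply["order_id"]})
    return True


if __name__ == "__main__":
    broker = Broker()
    start_workflow(broker, "o-1")
    billing_consume(broker)       # can run whenever billing is available
    orders_consume_reply(broker)  # can run whenever the reply has arrived
    print(broker.receive("shipping.commands"))
```

Neither service calls the other directly. Each only reads from and writes to the broker, so the command simply waits in the queue if the consuming service is offline.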
If the client needs to be notified that the entire workflow is done, you can leverage WebSockets or push notifications to add real-time capabilities to the client so they can be notified of completed work.
If there is a failure processing a message, a service can retry, back off and retry again, or ultimately put the message on a dead letter queue. You have many more options for handling processing failures.
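Here is a minimal sketch of that retry, backoff, and dead letter handling. The function name, its parameters, and the use of a plain list as the dead letter queue are assumptions for illustration; messaging libraries and brokers typically provide this behavior through configuration.

```python
import time
from typing import Callable


def process_with_retry(
    message: dict,
    handler: Callable[[dict], None],
    dead_letters: list,
    max_attempts: int = 3,
    base_delay: float = 0.01,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try the handler up to max_attempts times with exponential backoff.

    Returns True if the message was processed; otherwise the message is
    appended to dead_letters and False is returned.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return True
        except Exception:
            if attempt < max_attempts:
                # Back off before the next try: base_delay, then 2x, 4x, ...
                sleep(base_delay * 2 ** (attempt - 1))
    dead_letters.append(message)
    return False


if __name__ == "__main__":
    def always_fails(message: dict) -> None:
        raise RuntimeError("downstream dependency unavailable")

    dead_letters: list = []
    print(process_with_retry({"type": "ChargeOrder"}, always_fails, dead_letters))
    print(dead_letters)
```

A message that ends up on the dead letter queue isn't lost. It can be inspected and reprocessed later, once the underlying problem has been fixed.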
Because each service works independently, they do not all need to be online and available. If one service is down and unavailable, that does not make the entire workflow stop or fail. They aren’t temporally coupled.
For more, check out my post Workflow Orchestration for Resilient Systems.
gRPC
You might have noticed that this post isn’t about gRPC but rather the places where synchronous request-response is appropriate. Should you use gRPC over an HTTP API in those situations? I’ll save that for another post!
Context
Context is king. The context and perspective I’m referring to within this post/video have to do with line-of-business and enterprise-type systems where business processes and workflows are naturally asynchronous. The world is asynchronous, and yet we still try to model these business processes primarily in a synchronous way. Should all workflows be asynchronous? Absolutely not. Context is king.