Don't Use Sync Communication For Every Case

There are other ways as well...

Dec 03, 2024

Stop using synchronous communication for every interaction between services.

Synchronous communication is when one service waits for a response from another before proceeding.

It works well when the caller genuinely needs the answer before it can continue, but relying on it for every interaction between microservices is counterproductive.

The Challenges of Sync Communication

Relying only on synchronous communication creates several challenges.


Let’s look at a few important ones:

1 - Increased Latency

In a synchronous model, a service sends a request to another service and waits for a response.

If multiple services are chained in this manner, the latency of each hop adds up.

For instance, if Service A calls Service B, which then calls Service C, the response time seen by A's caller is at least the sum of the latencies of each call. When services are geographically distributed or under heavy load, this accumulated latency can quickly become a bottleneck.
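To make the arithmetic concrete, here is a minimal sketch. The hop names and millisecond figures are made-up illustrative values, not measurements from any real system.

```python
# Hypothetical per-hop latencies in milliseconds (illustrative numbers only).
HOP_LATENCY_MS = {"A->B": 40, "B->C": 60, "C->database": 25}


def chained_latency_ms(hops):
    """Latency the original caller observes when every hop blocks on the next one."""
    return sum(hops.values())


total_ms = chained_latency_ms(HOP_LATENCY_MS)
print(f"End-to-end latency seen by the caller: {total_ms} ms")  # 125 ms
```

Every extra synchronous hop adds its full latency to the total, even if each individual service looks fast in isolation.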

2 - Downtime and Cascading Failures

When one service depends on another synchronously, any downtime or failure in one service can ripple through the entire system.

If Service B is unavailable, every request to Service A that depends on B fails or stalls as well. This tight coupling makes the entire system more fragile and prone to cascading failures.
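A rough way to see the effect is to multiply availabilities along the chain. The sketch below assumes each service fails independently and uses a hypothetical 99.9% availability per service; both are simplifying assumptions for illustration.

```python
# Assumption: services fail independently and each is available 99.9% of the time.
SERVICE_AVAILABILITY = {"A": 0.999, "B": 0.999, "C": 0.999}


def chain_availability(availabilities):
    """A synchronous chain only succeeds when every service in it is up."""
    combined = 1.0
    for value in availabilities.values():
        combined *= value
    return combined


combined = chain_availability(SERVICE_AVAILABILITY)
print(f"Availability of the whole chain: {combined:.4%}")
```

Under these assumptions, three services at 99.9% each give a chain that is available only about 99.7% of the time, and the number keeps dropping with each service added to the call path.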

3 - Higher Costs

Synchronous communication requires services to handle each request the moment it arrives, because the caller is blocked until it gets a response.

Since work cannot be buffered and processed later, you often need to provision resources for the highest possible load, which leaves them underutilized during off-peak times.

4 - Tight Temporal Coupling

Synchronous communication creates a dependency on the availability and responsiveness of other services at the instance level.

Both service instances must be operational simultaneously for communication to succeed. This temporal coupling can complicate deployments, scaling, and debugging.

Alternatives to Synchronous Communication

The good news is that there are alternative approaches to synchronous communication.

Let’s look at some of these alternatives:

Asynchronous Request-Response

In asynchronous communication, the caller sends a request but doesn’t block while waiting for the response. For example, a "Payment" service can place a "Process Payment" request in a queue. The "Billing" service processes the request asynchronously and places a "Payment Processed" message in the response queue.

A common way to implement this is with a message queue or broker, such as RabbitMQ, Kafka, or AWS SQS.

How It Works:

The sender places a request on a message queue.
The receiver processes the request asynchronously and places the response on a response queue.
The sender retrieves the response at its convenience.
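The steps above can be sketched in-process with Python's standard `queue` and `threading` modules standing in for a real broker. The message shapes, the `correlation_id` field, and the `billing_worker` function are hypothetical names chosen for this illustration.

```python
import queue
import threading
import uuid

# In-memory stand-ins for a broker's request and response queues.
request_queue = queue.Queue()
response_queue = queue.Queue()


def billing_worker():
    """Receiver: handle requests until a None sentinel signals shutdown."""
    while True:
        message = request_queue.get()
        if message is None:
            break
        response_queue.put({
            "correlation_id": message["correlation_id"],
            "type": "PaymentProcessed",
            "amount": message["amount"],
        })


worker = threading.Thread(target=billing_worker)
worker.start()

# Sender: enqueue the request and carry on without blocking on the receiver.
correlation_id = str(uuid.uuid4())
request_queue.put({
    "correlation_id": correlation_id,
    "type": "ProcessPayment",
    "amount": 4200,
})

# ... the sender is free to do other work here ...

# Later, the sender picks up the response and matches it by correlation id.
response = response_queue.get(timeout=5)
assert response["correlation_id"] == correlation_id

request_queue.put(None)  # stop the worker
worker.join()
print(response["type"])
```

The correlation id is what lets the sender pair a response with the request it belongs to once the two are no longer tied together by a single blocking call.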

Benefits:

Non-blocking communication improves responsiveness.
Services are loosely coupled, as they don’t need to be online at the same time.
Message queues can act as a buffer, helping handle spikes in traffic.

Event-Driven Architecture

In an event-driven architecture, services communicate by publishing and subscribing to events. Instead of direct service-to-service calls, services emit events that other services consume.

For example, when an "OrderPlaced" event is emitted by the "Order" service, the "Inventory" service updates stock levels, and the "Notification" service sends a confirmation email—all without direct communication between them.

How It Works:

Services publish events, such as "OrderPlaced" or "PaymentProcessed," to an event broker (e.g., Kafka, AWS SNS, or Redis Streams).
Interested services subscribe to these events and take appropriate actions.
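Here is a minimal in-memory sketch of the publish/subscribe idea, using the "OrderPlaced" example. The `EventBroker` class and the handler names are hypothetical, and this toy broker calls handlers directly in-process, whereas a real broker would deliver events asynchronously over the network.

```python
from collections import defaultdict


class EventBroker:
    """Toy broker: maps event types to the handlers subscribed to them."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The publisher does not know who is listening, or whether anyone is.
        for handler in self._subscribers.get(event_type, []):
            handler(payload)


stock = {"sku-1": 10}
sent_emails = []


def update_inventory(event):
    """'Inventory' service: reduce stock for the ordered item."""
    stock[event["sku"]] -= event["quantity"]


def send_confirmation(event):
    """'Notification' service: record a confirmation email."""
    sent_emails.append(f"Order {event['order_id']} confirmed")


broker = EventBroker()
broker.subscribe("OrderPlaced", update_inventory)
broker.subscribe("OrderPlaced", send_confirmation)

# 'Order' service: emit the event without calling the other services directly.
broker.publish("OrderPlaced", {"order_id": "o-1", "sku": "sku-1", "quantity": 2})

print(stock, sent_emails)
```

Adding another consumer is one more `subscribe` call. The publishing code stays untouched.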

Benefits:

Decouples services, allowing them to operate independently.
Enables scalability and resilience, as services don’t directly depend on each other’s availability.
Simplifies adding new functionality, as new services can subscribe to existing events without modifying the publisher.

Well-Defined Service Boundaries

When services have well-defined responsibilities and minimal overlap, the need for inter-service communication is naturally reduced.

For example, instead of having a "User" service that handles both user data and authentication, split these responsibilities into a "User Profile" service and an "Authentication" service, each owning the data it needs. This minimizes unnecessary interactions and makes each service easier to scale and maintain.

How This Helps:

By encapsulating functionality within a service boundary, you avoid frequent cross-service calls.
Services can operate more independently, improving their reliability and scalability.

Bulk Data Transfers

For scenarios involving large amounts of data, consider using bulk data transfers instead of multiple synchronous requests. For example, a "Reporting" service can retrieve transaction logs from the "Payments" service in bulk, rather than querying for each transaction individually.

How It Works:

Services periodically exchange data using batch jobs, file transfers, or database replication.
These operations run asynchronously, typically during off-peak hours.
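The difference shows up clearly when you count calls. In this sketch, `get_transaction` and `export_transactions` are hypothetical endpoints of the "Payments" service, simulated with an in-memory dictionary.

```python
# Simulated 'Payments' data store with 1,000 transactions.
TRANSACTIONS = {i: {"id": i, "amount": i * 10} for i in range(1, 1001)}
call_count = {"single": 0, "bulk": 0}


def get_transaction(tx_id):
    """Hypothetical per-item endpoint: one call per transaction."""
    call_count["single"] += 1
    return TRANSACTIONS[tx_id]


def export_transactions(after_id, batch_size):
    """Hypothetical bulk endpoint: returns up to batch_size rows after a cursor."""
    call_count["bulk"] += 1
    ids = range(after_id + 1, after_id + 1 + batch_size)
    return [TRANSACTIONS[i] for i in ids if i in TRANSACTIONS]


def load_in_batches(batch_size=250):
    """'Reporting' service: page through the export until it comes back empty."""
    rows, cursor = [], 0
    while True:
        batch = export_transactions(cursor, batch_size)
        if not batch:
            return rows
        rows.extend(batch)
        cursor = batch[-1]["id"]


report_rows = load_in_batches()
print(len(report_rows), "rows in", call_count["bulk"], "bulk calls")
```

Fetching the same 1,000 rows one at a time through `get_transaction` would take 1,000 calls. The batched version needs five, including the final empty page that signals the end.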

Benefits:

Reduces the load on services during real-time operations.
Improves overall system efficiency.

Caching Local Copy

Sometimes, frequent inter-service communication can be replaced by caching or maintaining a local copy of data. For example, a "Product" service can cache product details locally instead of querying the "Catalog" service for every request.

How It Works:

Use caching solutions like Redis or Memcached to store frequently accessed data.
Replicate the data a service needs into its own store, so it does not have to fetch it from the owning service on every request.
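As a minimal sketch of the caching idea, here is a small in-process cache with a time-to-live. The `TTLCache` class and `fetch_from_catalog` function are hypothetical; in practice the cache would often live in Redis or Memcached rather than in a Python dictionary.

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: no remote call needed
        value = loader(key)  # miss or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value


catalog_calls = []


def fetch_from_catalog(product_id):
    """Stand-in for a remote call to the 'Catalog' service."""
    catalog_calls.append(product_id)
    return {"id": product_id, "name": f"Product {product_id}"}


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_load("p-1", fetch_from_catalog)
second = cache.get_or_load("p-1", fetch_from_catalog)

print(len(catalog_calls))  # 1: the second lookup was served from the cache
```

The time-to-live is the trade-off knob here: a longer value means fewer calls to the "Catalog" service but a longer window in which the cached copy may be stale.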

Benefits:

Reduces latency by serving data locally.
Keeps the service responsive even when the upstream service is slow or unavailable.

Best Practices for Adopting Alternatives

Understand the Use Case: Not every scenario requires asynchronous communication. Analyze the trade-offs and choose the approach that fits your requirements.
Monitor and Debug: Use distributed tracing tools like Jaeger or Zipkin to monitor asynchronous flows and troubleshoot issues.
Graceful Degradation: Implement fallbacks and retries to handle communication failures gracefully.
Prioritize Idempotency: Ensure that operations are idempotent to handle duplicate messages in asynchronous systems.
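The last two practices can be sketched together. Below, `handle_payment` ignores messages it has already seen, and `call_with_retry` retries a failing call before returning a fallback. All names and message fields are hypothetical, the deduplication set is in-memory only, and backoff between retries is left out to keep the example short.

```python
processed_ids = set()
balance = {"acct-1": 0}


def handle_payment(message):
    """Idempotent consumer: apply each message at most once, keyed by message_id."""
    if message["message_id"] in processed_ids:
        return False  # duplicate delivery: skip without side effects
    balance[message["account"]] += message["amount"]
    processed_ids.add(message["message_id"])
    return True


def call_with_retry(operation, attempts=3, fallback=None):
    """Try operation up to `attempts` times, then degrade to the fallback value."""
    for _ in range(attempts):
        try:
            return operation()
        except ConnectionError:
            continue  # a real system would wait (with backoff) before retrying
    return fallback


# The same message delivered twice only changes the balance once.
message = {"message_id": "m-1", "account": "acct-1", "amount": 50}
results = [handle_payment(message), handle_payment(message)]


def always_down():
    raise ConnectionError("service unavailable")


price = call_with_retry(always_down, fallback="cached price")
print(results, balance, price)
```

In a real deployment the set of processed ids would need to be stored durably, otherwise a restart would make the consumer forget what it has already handled.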

👉 So, would you add any other alternative to synchronous communication?