Do We Need Additional Backpressure Control?
Background
While exploring full-chain WebFlux reactive programming with gRPC calls, I tried to add backpressure control to the producer, hoping that the consumer could control the producer's production rate through a gRPC interface.
But I soon found I was completely wrong - this kind of backpressure control brings non-trivial logical complexity to the producer.
So the main question this article explores is: why we don't need additional backpressure control, and what complexity it would bring.
Full-Chain WebFlux Reactive Programming with gRPC Calls
Before we begin, we need a practical scenario to illustrate the problem:
Imagine a scenario: you need to read an entire data table from one service into another via gRPC, without worrying about pagination, memory overflow, and so on. Even more magically, when the consumer's processing takes too long, it automatically halts the producer's data stream, and the producer can even pause its reactive database driver (like R2DBC), completely stopping reads from the database.
How much code do you think you need to implement such robust functionality? The answer might surprise you: less than 200 lines of code.
Yes, this is completely achievable with WebFlux.
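The "stop the producer" part isn't gRPC magic - it's plain Reactor semantics: a cancellation signal propagates back up the operator chain. Here's a toy sketch of just that half (no gRPC involved, names illustrative):

import reactor.core.publisher.Flux;

// take(5) cancels its subscription once it has enough items; the cancel
// signal travels upstream, where a real reactive driver (e.g. R2DBC)
// would stop reading from the database.
Flux.range(1, 1_000_000)                // stand-in for a huge table
        .doOnCancel(() -> System.out.println("source: stop reading"))
        .take(5)                        // the consumer only wants 5 items
        .subscribe(System.out::println);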
The development steps are surprisingly simple:
- Both producer and consumer add a gRPC framework dependency
- Define a data_stream.proto file
I won't expand the full setup here - there are plenty of tutorials online - but for orientation, the contract can be sketched as follows.
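A minimal sketch of what data_stream.proto could look like, assuming a single server-streaming RPC; all names here are illustrative, not from any particular project:

syntax = "proto3";

service DataStreamService {
  // Server-streaming RPC: the producer pushes rows, and gRPC's own
  // flow control carries the consumer's demand back upstream.
  rpc StreamData(StreamRequest) returns (stream DataItem);
}

message StreamRequest {
  string table_name = 1; // which table to stream (illustrative)
}

message DataItem {
  string id = 1;
  string payload = 2;
}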
Then, once connected, you have all of the functionality above, even supporting billion-row data transfer without pagination. The performance gain from gRPC is also considerable: throughput rises, and per-call latency can drop below 1 ms at best - hard to match with a typical REST API.
Then I Started Thinking 🤔: Can the Consumer Control the Producer's Production Rate Through a gRPC Interface?
The answer is no - in fact, it's the wrong approach altogether.
I tried to add an additional backpressure control service, with code similar in spirit to the sketch below.
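This is a reconstruction of the idea rather than the original snippet; demand and fetchAllRows() are hypothetical, with demand assumed to be fed by a separate gRPC control call carrying requestedItems counts:

import reactor.core.publisher.Flux;
import reactor.core.publisher.Sinks;

// Demand counts pushed in from the (hypothetical) control RPC.
Sinks.Many<Long> demand = Sinks.many().unicast().onBackpressureBuffer();

// Expand each requestedItems count into that many single-item permits.
Flux<Integer> permits = demand.asFlux()
        .concatMap(n -> Flux.range(0, n.intValue()));

// Pair each row with a permit: rows are pulled from the source (e.g. an
// R2DBC query) only as permits arrive, plus a prefetch of one.
Flux<String> throttled = fetchAllRows()          // hypothetical Flux<String>
        .zipWith(permits, 1, (row, permit) -> row);

Even this toy version just re-implements what request(n) already does for free - and a real version would still need per-stream state, error handling, and reconnect logic.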
In short: based on requestedItems (the number of data items requested), throttle the upstream's production rate for that stream. But after running it in practice, I found:
Unless the downstream passes its backlog information to the upstream through yet another interface, the upstream can never know the downstream's real backlog.
And if every upstream has to run complex logic to control its production rate, maintainability becomes unacceptable. This rate-control logic is tightly coupled to the code before and after data production - even AOP aspects struggle to factor it out.
Later I realized the problem can be analogized like this: suppose you're the village chief, and every month each villager comes to you for a fixed amount of feed for their pigs. What the chief should do is keep feed production stable and set upper and lower limits on each villager's collection - NOT have villagers say: "Chief, I couldn't use all the feed you gave me this month, give me less next month."
Why?
The villager's behavior is wrong. Only the villager knows his own needs best - in other words, only the downstream knows its own performance and load. If the downstream can't keep up, it means the downstream requested too many items at once.
Reflected upstream, such an oversized request simply means pushing data to that consumer at full speed - but that rate must have an upper bound. One consumer's demand cannot be allowed to squeeze the rates of other consumers' streams.
In this model, even when problems occur, the upstream won't crash. If the downstream hasn't configured a backpressure strategy, such as:
.onBackpressureDrop(dropped -> System.out.println("Dropped: " + dropped))
then downstream errors or crashes are to be expected, because the downstream is already making chaotic data requests - and the upstream cannot perceive this. Even with WebFlux, the upstream can't see the downstream's internal backlog. So what the upstream should do is protect itself from being overwhelmed by the downstream's chaotic requests.
Upstream's Real Responsibilities
Upstream's responsibilities are simple:
- Stable output: ensure your service can output its data stream stably and healthily.
- Self-protection: once the downstream makes unreasonable requests (like asking for 10,000 items at once), the upstream can choose to drop, rate-limit, or keep only the latest data - but it never sacrifices its own stability to accommodate the downstream (see the sketch below).
- Protocol contract: Reactive Streams already specifies the request(n) mechanism - backpressure in its natural form - so there's no need to build an extra gRPC interface for artificial control.
So backpressure control isn't something the upstream implements explicitly; it manifests naturally in how the downstream consumes.
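A minimal sketch of that self-protection, assuming a hypothetical reactive repository as the source (repository.findAll() stands in for an R2DBC query; all names illustrative):

import reactor.core.publisher.BufferOverflowStrategy;
import reactor.core.publisher.Flux;

Flux<String> stream = repository.findAll()   // hypothetical Flux<String>
        .limitRate(256)                      // cap in-flight demand toward the DB
        .onBackpressureBuffer(
                1_000,                       // bounded buffer, never unbounded memory
                dropped -> System.err.println("dropping: " + dropped),
                BufferOverflowStrategy.DROP_LATEST); // shed load instead of OOM

Whether to drop, keep the latest, or buffer is a business decision; the point is that the producer guards its own stability instead of trusting every consumer's request(n).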
What Downstream Should Do
Downstream is the party that knows its own capabilities best. What it needs to do is:
- Request reasonably: don't request excessive data all at once. A BaseSubscriber can pull one by one or in batches:
import org.reactivestreams.Subscription;
import reactor.core.publisher.BaseSubscriber;
import reactor.core.publisher.Flux;

Flux.range(1, 1000)
    .subscribe(new BaseSubscriber<Integer>() {
        @Override
        protected void hookOnSubscribe(Subscription subscription) {
            request(10); // initial demand: request 10 items
        }

        @Override
        protected void hookOnNext(Integer value) {
            process(value); // placeholder for your processing logic
            request(1);     // consume one, pull one
        }
    });
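(For simple cases, Reactor's built-in .limitRate(10) achieves a similar batched-pull effect without writing a custom subscriber.)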
- Choose appropriate strategies (a short demo follows this list):
- If worried about memory overflow, use .onBackpressureDrop()
- If the business only cares about the latest data, use .onBackpressureLatest()
- If some buffering is tolerable, use .onBackpressureBuffer(1000)
- Rate limiting and protection
- The downstream can add a rate limiter at its entry point (like RateLimiter or Resilience4j) to avoid overwhelming itself under high concurrency.
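To make the strategy bullets concrete, here's a small runnable Reactor demo (no gRPC involved): the producer ticks every millisecond while the consumer needs about 50 ms per item, so onBackpressureLatest() keeps only the newest tick instead of growing a backlog - the printed tick numbers show gaps where values were dropped.

import java.time.Duration;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

Flux.interval(Duration.ofMillis(1))      // fast producer stand-in
        .onBackpressureLatest()          // when behind, keep only the newest
        .concatMap(tick -> Mono.delay(Duration.ofMillis(50)).thenReturn(tick))
        .take(20)                        // end the demo after 20 handled items
        .doOnNext(tick -> System.out.println("handled tick " + tick))
        .blockLast();                    // block the main thread (demo only)

As for the rate-limiting bullet, the resilience4j-reactor module provides a RateLimiterOperator that plugs into such a pipeline via .transformDeferred(RateLimiterOperator.of(rateLimiter)).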
Why We Don't Need "Additional Backpressure Control"
Summary:
- Backpressure naturally exists: The Reactive Streams protocol already has request(n), no need to reinvent the wheel.
- Complexity issues: if the consumer has to make extra gRPC calls to tell the producer to "slow down", that adds network round-trips and state-synchronization complexity, and is prone to bugs.
- Separation of responsibilities:
- Upstream → Stable output, protect itself.
- Downstream → Request reasonably, protect itself.
- Robustness: Under this model, even if downstream crashes, upstream can continue running healthily; conversely, if upstream changes logic to accommodate downstream, it might be dragged down by downstream.
Final Thoughts
So when we ask "do we need additional backpressure control?", the answer is:
👉 We already have backpressure; there's no need to bolt on another controller. What really matters is how the downstream constrains itself and makes good use of the natural backpressure mechanism Reactive Streams provides.
In other words:
- Backpressure is a natural product at the protocol level, not management logic at the application level.
- We don't need to artificially add a "production rate interface" - it only brings complexity without substantial benefit.