There are several problems with how we process requests:
- The limiter has several problems:
  - It has no limit on the number of concurrent requests, so many goroutines end up competing for a single channel lock. The large number of goroutines creates a potential positive feedback loop: contention reduces performance, which increases contention, and so on.
  - Context cancellation is the only way to remove a request from the limiter, and cancellations are checked only at specific points in the code. As a result, "dead" and timed-out requests accumulate in the limiter because their contexts are never checked. Cancellations are also hard to diagnose. A minimal sketch of a limiter that avoids both issues follows this list.
  - Under high load, the two points above introduce heavy contention that reduces performance exactly when it is needed most.
  - There is no fairness in the order in which requests are served. See "Unlucky requests can get needlessly dropped" #35.
- There is no global limit on the number of requests sent to backends in the zipper. Since the limits for individual backends need to be generous (the connections are unevenly distributed), the total number of connections can skyrocket, introducing critical overhead. This can result in a positive feedback loop involving remote caches (we have seen this happen), which can make the system unstable. A second sketch below illustrates a global cap.
- The current processing components don't have a clear separation of responsibilities. This makes diagnostics hard and introduces unnecessary complexity.
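As an illustration only, here is a minimal sketch of a bounded, cancellation-aware limiter, assuming `golang.org/x/sync/semaphore` is acceptable as a dependency. All package, type, and function names below are hypothetical; this is a sketch of the pattern, not the project's actual code.

```go
// Hypothetical sketch: a limiter that bounds concurrency and observes
// context cancellation while a request is still waiting for a slot.
package limiter

import (
	"context"

	"golang.org/x/sync/semaphore"
)

// Limiter admits at most maxConcurrent requests at a time.
type Limiter struct {
	sem *semaphore.Weighted
}

// New returns a limiter with the given concurrency cap (hypothetical constructor).
func New(maxConcurrent int64) *Limiter {
	return &Limiter{sem: semaphore.NewWeighted(maxConcurrent)}
}

// Do runs fn while holding a slot. Acquire blocks until a slot frees up
// or ctx is cancelled, so a timed-out request is removed from the queue
// immediately instead of lingering until some later checkpoint.
func (l *Limiter) Do(ctx context.Context, fn func(context.Context) error) error {
	if err := l.sem.Acquire(ctx, 1); err != nil {
		return err // cancelled or deadline exceeded while waiting
	}
	defer l.sem.Release(1)
	return fn(ctx)
}
```

A weighted semaphore also queues waiters in FIFO order, which would go some way toward the fairness problem from #35.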
We need to address these issues with a redesign.
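For the zipper, the same primitive could be layered into a global cap on outgoing backend requests. Again a hedged sketch with hypothetical names, not a concrete proposal for the existing code:

```go
// Hypothetical sketch: per-backend limits stay generous, while a shared
// global semaphore keeps the total number of in-flight backend requests
// (and therefore connections) bounded.
package zipper

import (
	"context"

	"golang.org/x/sync/semaphore"
)

type BackendPool struct {
	global     *semaphore.Weighted            // shared cap across all backends
	perBackend map[string]*semaphore.Weighted // generous per-backend caps
}

// Send acquires the per-backend slot first and the global slot second,
// so a request waiting on one slow backend does not occupy a global slot.
// Both slots are released when the request finishes or its context is cancelled.
func (p *BackendPool) Send(ctx context.Context, backend string, send func(context.Context) error) error {
	b := p.perBackend[backend]
	if err := b.Acquire(ctx, 1); err != nil {
		return err
	}
	defer b.Release(1)

	if err := p.global.Acquire(ctx, 1); err != nil {
		return err
	}
	defer p.global.Release(1)

	return send(ctx)
}
```

The acquisition order is a design choice: taking the global slot first would instead mean a request stuck behind one slow backend also occupies part of the global budget.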