Add docs for rate limit aware load balancing #2126
Signed-off-by: Alex Leong <alex@buoyant.io>
raykroeker
left a comment
Some minor language changes, and 1 technical issue (ratio at the bottom).
| failures for this calculation. If this annotation is not present, the default | ||
| value is `0.8` (80% success-rate). | ||
| - `balancer.alpha.linkerd.io/failure-accrual-success-rate-window`: The window of | ||
| time over success-rate is calculated. If this annotation is not present, the |
Instead of:
The window of time over success-rate is calculated...
use:
The window of time over which the success-rate is calculated...
| | `balancer.linkerd.io/failure-accrual-consecutive-max-failures` | number | `7` | Trip if we encounter this many consecutive failures | | ||
| | `balancer.linkerd.io/failure-accrual-consecutive-min-penalty` | duration | `1s` | The minimum duration for which to cut off traffic | | ||
| | `balancer.linkerd.io/failure-accrual-consecutive-max-penalty` | duration | `1m` | The maximum duration for which to cut off traffic | | ||
| | `balancer.linkerd.io/failure-accrual-consecutive-jitter-ratio` | number between 0.0 and 100.0 | `0.5` | The amount of randomness to inject into the backoff | |
Instead of:
number between 0.0 and 100.0
use:
number between 0.0 and 1.0
100.0 is actually the bound: https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/exp-backoff/src/lib.rs#L83-L85
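For reference, the bound under discussion can be sketched as a validation check. This is a minimal Python sketch, not the proxy's Rust code: `validate_backoff` and its error messages are hypothetical, and the min/max ordering check is an assumption. Only the 0.0 to 100.0 jitter range and the documented defaults (`1s`, `1m`, `0.5`) come from this thread.

```python
import math


def validate_backoff(min_penalty_s: float, max_penalty_s: float, jitter_ratio: float) -> None:
    """Raise ValueError if the settings fall outside the ranges discussed above.

    Hypothetical helper for illustration; it does not mirror the proxy's API.
    """
    # Assumption: the minimum penalty may not exceed the maximum penalty.
    if min_penalty_s > max_penalty_s:
        raise ValueError("min penalty must not exceed max penalty")
    if not math.isfinite(jitter_ratio):
        raise ValueError("jitter ratio must be finite")
    # Upper bound of 100.0 as stated in the review reply, not 1.0.
    if not 0.0 <= jitter_ratio <= 100.0:
        raise ValueError("jitter ratio must be between 0.0 and 100.0")


# Documented defaults: 1s minimum, 1m maximum, 0.5 jitter ratio.
validate_backoff(1.0, 60.0, 0.5)
```

If the bound really is 100.0, the table's "number between 0.0 and 100.0" wording matches this check as written.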
Signed-off-by: Alex Leong <alex@buoyant.io>
unleashed
left a comment
Maybe add a note on backwards compatibility to remove these annotations if downgrading?
| | Annotation | Type | Default | | | ||
| |---------------------------------------------------------------|----------|---------|----------------------------------------------------------------------------------------| | ||
| | `balancer.alpha.linkerd.io/load-biaser-penalty` | duration | `5s` | The latency value to inject for rate-limited responses | | ||
| | `balancer.alpha.linkerd.io/failure-accrual-honor-retry-after` | boolean | `false` | If Retry-After response headers or grpc-retry-pushback-ms gRPC trailers are respected. | |
This became a no-op after removing support for retry-after/grpc-retry-pushback-ms hints in breakers.
| amount of pushback. Note that this requires setting the | ||
| `balancer.alpha.linkerd.io/failure-accrual-honor-retry-after=true` annotation on | ||
| the Service in order for these response hints to be used. |
I don't think this is optionally enabled in the biaser; the hints are always respected whenever the biaser is enabled.
| number of consecutive failures, just like the consecutive failures accrual. | ||
|
| To enable the Unified failure accrual circuit breaker on a Service, set the | ||
| following annotation to `"unified"` on the Server resource: |
| following annotation to `"unified"` on the Server resource: | |
| following annotation to `"unified"` on the Service resource: |
|
| | Annotation | Type | Default | | | ||
| |---------------------------------------------------------------|----------|---------|----------------------------------------------------------------------------------------| | ||
| | `balancer.alpha.linkerd.io/load-biaser-penalty` | duration | `5s` | The latency value to inject for rate-limited responses | |
| | `balancer.alpha.linkerd.io/load-biaser-penalty` | duration | `5s` | The latency value to inject for rate-limited responses | | |
| | `balancer.alpha.linkerd.io/load-biaser-penalty` | duration | `5s` | The latency value to inject for rate-limited responses and failures | |
| load balancing algorithm which takes rate-limit responses (HTTP 429 or gRPC | ||
| RESOURCE_EXHAUSTED) into account. This algorithm is called the Load Biaser |
This also takes failures into account: some gRPC statuses (the server-error ones) and all HTTP 5xx responses.
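To make the comment above concrete, here is a minimal Python sketch of how a penalty could be injected into the latency samples that a latency-aware balancer consumes. `biased_latency` is a hypothetical name. Replacing the measured latency with the penalty (rather than adding to it) is an assumption, and gRPC statuses are left out because the thread does not enumerate which ones count. The `5s` default comes from the `load-biaser-penalty` row.

```python
PENALTY_S = 5.0  # documented default for balancer.alpha.linkerd.io/load-biaser-penalty


def biased_latency(http_status: int, measured_s: float, penalty_s: float = PENALTY_S) -> float:
    """Return the latency sample to record for one HTTP response.

    Rate-limited (429) and server-error (5xx) responses are recorded as if
    they had taken `penalty_s` seconds, which makes the endpoint look slow
    to a latency-based balancer. Other responses keep their measured latency.
    """
    penalized = http_status == 429 or 500 <= http_status <= 599
    return penalty_s if penalized else measured_s
```

With this shape, a fast 429 is scored the same as a 5-second response, so the doc text should say the penalty applies to failures as well as rate-limited responses.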
| - `balancer.alpha.linkerd.io/failure-accrual-success-rate-threshold`: If the | ||
| success-rate of responses in the window drops below this threshold, then the | ||
| endpoint will be made unavailable. Must be between `0.0` and `1.0`. | ||
| Rate-limited responses such as HTTP 429 and gRPC RESOURCE_EXHAUSATED count as |
| Rate-limited responses such as HTTP 429 and gRPC RESOURCE_EXHAUSATED count as | |
| Rate-limited responses such as HTTP 429 and gRPC RESOURCE_EXHAUSTED count as |
| @@ -0,0 +1,105 @@ | |||
| --- | |||
| title: Rate Limit Aware Load Balacing | |||
| title: Rate Limit Aware Load Balacing | |
| title: Rate Limit Aware Load Balancing |
|
| {{< warning >}} | ||
|
| Rate Limit Aware Load Balacing is an experimental, opt-in feature. |
| Rate Limit Aware Load Balacing is an experimental, opt-in feature. | |
| Rate Limit Aware Load Balancing is an experimental, opt-in feature. |
| circuit breaker"). |
(i.e. just delete the last two sentences)
| ### Unified | ||
|
| In this failure accrual policy, an endpoint is marked as failing after _either_ | ||
| success-rate drops below a configured threshold _or_ a configured number of |
"success-rate" => "success rate" throughout this PR
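The "either/or" trip condition quoted above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the proxy's implementation: `UnifiedAccrual` and its parameter names are hypothetical, the `0.8` threshold is the documented default, and the default of 7 consecutive failures is borrowed from the consecutive policy's table. The window length has no default here because the thread does not show one. Callers are expected to pass `ok=False` for rate-limited responses, since the docs say those count as failures.

```python
from collections import deque


class UnifiedAccrual:
    """Trip on low success rate OR too many consecutive failures (sketch)."""

    def __init__(self, window_s: float, threshold: float = 0.8, max_consecutive: int = 7):
        self.window_s = window_s
        self.threshold = threshold
        self.max_consecutive = max_consecutive
        self._events = deque()  # (timestamp_seconds, ok) pairs, oldest first
        self._consecutive = 0

    def record(self, now: float, ok: bool) -> bool:
        """Record one response; return True if the endpoint should be marked failing."""
        self._events.append((now, ok))
        # Drop samples that have aged out of the time window.
        while self._events[0][0] <= now - self.window_s:
            self._events.popleft()
        self._consecutive = 0 if ok else self._consecutive + 1
        successes = sum(1 for _, was_ok in self._events if was_ok)
        success_rate = successes / len(self._events)
        return success_rate < self.threshold or self._consecutive >= self.max_consecutive
```

Note that with very few samples in the window a single failure can push the rate below the threshold; a real breaker would likely need a minimum sample count, which this sketch omits.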
| Supported values for this annotation are `consecutive` and `unified`. |
The new unified policy respects 429s and gRPC rate-limited responses. I am not sure whether that is extended to the consecutive policy, but either way we should clarify in this note.
| When backends implement rate limiting and return | ||
| [HTTP 429](https://www.rfc-editor.org/rfc/rfc6585.html#page-3) or | ||
| [gRPC RESOURCE_EXHAUSTED](https://grpc.github.io/grpc/core/md_doc_statuscodes.html) | ||
| responses, the proxy currently treats these as successful responses from a load |
| responses, the proxy currently treats these as successful responses from a load | |
| by default, the proxy treats these as successful responses from a load |
| title: Rate Limit Aware Load Balacing | ||
| description: Routing traffic away from rate limited endpoints |
| title: Rate Limit Aware Load Balacing | |
| description: Routing traffic away from rate limited endpoints | |
| title: Handling Rate-Limited Endpoints | |
| description: Automatically route traffic away from rate-limited endpoints |
Signed-off-by: Alex Leong <alex@buoyant.io>
No description provided.