-
Notifications
You must be signed in to change notification settings - Fork 27
Support for per-zone PDB #253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Includes; * new web-hook handler for pod eviction added to rollout-operator * manifests for new custom resource definition for configuring the PDBZ * updated development example files and environment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add an integration test to cover this functionality?
development/custom-resource-definition-pod-disruption-zone-budget.yaml
Outdated
Show resolved
Hide resolved
development/custom-resource-definition-pod-disruption-zone-budget.yaml
Outdated
Show resolved
Hide resolved
* rename to ZoneAwarePodDisruptionBudget * updated root README * updates to main eviction handler per PR *
* updates per PR review * added ZPDB informer to monitor ZPDB configurations * added Pod informer to work with pod eviction cache * added pod eviction cache * added validating webhook configuration for ZPDB files * added locking for concurrent eviction handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't finished reviewing the tests - will take a look tomorrow.
.golangci.yaml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change seems unrelated - am I missing something?
|
||
This allows an operator to perform maintenance on a single zone whilst ensuring sufficient pod availability in other zones. | ||
|
||
Consider the following topology where the `PDZB` has `maxUnavailability=1`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider the following topology where the `PDZB` has `maxUnavailability=1`; | |
Consider the following topology where the `ZPDB` has `maxUnavailable` set to 1: |
If `ingester-zone-a-0` is to be evicted, it will be allowed if there are no disruptions in either zone `b` or zone `c`. | ||
|
||
If `ingester-zone-a-1` has failed and `ingester-zone-a-0` is to be evicted, it will be allowed if there are no disruptions in either zone `b` or zone `c`. | ||
|
||
*The already disrupted zone can be further disrupted as long as the other zones meet the unavailability criteria.* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I should have caught this earlier, this isn't quite right - we still want to limit the number of pods allowed to become unavailable :
If `ingester-zone-a-0` is to be evicted, it will be allowed if there are no disruptions in either zone `b` or zone `c`. | |
If `ingester-zone-a-1` has failed and `ingester-zone-a-0` is to be evicted, it will be allowed if there are no disruptions in either zone `b` or zone `c`. | |
*The already disrupted zone can be further disrupted as long as the other zones meet the unavailability criteria.* | |
If `ingester-zone-a-0` is to be evicted, it will be allowed if there are no disruptions in either zone `a`, zone `b` or zone `c`. | |
If `ingester-zone-a-1` has failed and `ingester-zone-a-0` is to be evicted, it will not be allowed. |
It might be good to add another example where maxUnavailability
is set higher than 1 to illustrate what happens in that case as well.
This is a draft PR for the implementation of a custom pod disruption budget which is zone aware.
#194
Included in this PR;
Note - this is a draft PR to give early visibility to @charleskorn and to share details on the new custom resource definition.
Relates to grafana/mimir#12315