To allow the Kubernetes Cluster Autoscaler to move pods around, pods (sometimes) need PodDisruptionBudgets, which can specify either minAvailable or maxUnavailable, but not both. There are a number of “gotchas” to look out for.
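As a rough sketch of what such a resource looks like, here is a minimal manifest. The name example-pdb and the label app: example are placeholders for illustration, not taken from any particular chart.

```yaml
# Minimal PodDisruptionBudget sketch; the name and labels are hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-pdb
spec:
  # Set exactly one of minAvailable or maxUnavailable, never both.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: example
```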
You cannot set minAvailable: 0
It's not that Kubernetes won't let you set minAvailable to zero. Rather, since you are not allowed to set both minAvailable and maxUnavailable, most Helm charts have code like:
{{- if .Values.podDisruptionBudget.minAvailable }}
minAvailable: {{ .Values.podDisruptionBudget.minAvailable }}
{{- end }}
{{- if .Values.podDisruptionBudget.maxUnavailable }}
maxUnavailable: {{ .Values.podDisruptionBudget.maxUnavailable }}
{{- end }}
If you set minAvailable: 0, the template's if test treats the zero as false, so it is the same as not setting it. That results in neither value getting set, which is effectively the same as setting maxUnavailable: 0 (which, of course, you also cannot do).
The fix: set minAvailable: "0%". You may wonder, “why bother setting a PodDisruptionBudget at all if you are going to allow all the pods to be deleted?” The reason is that this gives the Autoscaler explicit permission to evict pods that it might otherwise be too cautious about evicting, for example, any pod that uses emptyDir for anything. The contents of an emptyDir are lost when the pod is evicted, so, to avoid deleting something important, the Autoscaler will not evict such pods without explicit permission.
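To illustrate, assuming a chart that reads podDisruptionBudget.minAvailable as in the template above, the values might look like the following sketch:

```yaml
# values.yaml sketch; key names follow the template shown earlier.
podDisruptionBudget:
  # A bare 0 is falsy in the template's `if`, so the line would be skipped.
  # The non-empty string "0%" is truthy, so minAvailable gets rendered.
  minAvailable: "0%"
```

With a template like the one above, this should render a minAvailable: 0% line in the PodDisruptionBudget spec instead of omitting the field.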
It is dangerous to set minAvailable: 1
It may seem innocuous to set minAvailable: 1, but we frequently scale deployments down to 1 replica. If you combine that with minAvailable: 1, you are stuck with a single pod that cannot be evicted, which in turn prevents the instance it runs on from being taken out of service.
Not a fix: minAvailable: 25%. Kubernetes does not round these percentages to the nearest integer; it always rounds up. With 1 replica, 25% rounds up to 1, so this is not a fix at all.
The fix: set maxUnavailable: 50%. That avoids knocking out the service when there are replicas to spare, but does not prevent the pods from being evicted.
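The rounding arithmetic can be worked through in a values sketch like the one below. The replica counts in the comments are illustrative, and the key names assume the chart template shown earlier.

```yaml
# values.yaml sketch; the arithmetic relies on Kubernetes rounding percentages up.
podDisruptionBudget:
  # replicas=1: ceil(0.50 * 1) = 1 pod may be unavailable -> eviction allowed
  # replicas=2: ceil(0.50 * 2) = 1 pod may be unavailable -> 1 pod stays up
  # replicas=4: ceil(0.50 * 4) = 2 pods may be unavailable -> 2 pods stay up
  # Compare minAvailable: "25%" at replicas=1: ceil(0.25 * 1) = 1 pod must
  # stay available, so the only pod can never be evicted.
  maxUnavailable: "50%"
```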
Sometimes you have to set both minAvailable and maxUnavailable
While you are not allowed to set both minAvailable and maxUnavailable in the actual PodDisruptionBudget resource, if the Helm chart provides a default value for one of them and you want to use the other one, you have to set both in the helmfile. Set the one you do not want to "" to hint to readers that you are unsetting a default.
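As an illustration, suppose a chart ships with a default of podDisruptionBudget.minAvailable: 1 (a hypothetical default chosen for this sketch). The values you pass from the helmfile might then look like:

```yaml
# Values sketch; assumes the chart defaults minAvailable to 1.
podDisruptionBudget:
  # The empty string overrides the chart default and is falsy in the
  # template's `if`, so no minAvailable line is rendered.
  minAvailable: ""
  # This is the setting we actually want in the rendered resource.
  maxUnavailable: "50%"
```

Only maxUnavailable should then appear in the rendered PodDisruptionBudget, which keeps the resource valid.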