PodDisruptionBudget "Gotchas"

by Jeremy Grodberg, AWS DevOps Consultant

May 02 2020

To let the Kubernetes Cluster Autoscaler move pods around, pods (sometimes) need PodDisruptionBudgets, which can specify either minAvailable or maxUnavailable, but not both. There are a number of "gotchas" to look out for.

You cannot set minAvailable: 0

It's not that you can't set minAvailable to zero, but since you are not allowed to set both minAvailable and maxUnavailable, most Helm charts have code like:

{{- if .Values.podDisruptionBudget.minAvailable }}
minAvailable: {{ .Values.podDisruptionBudget.minAvailable }}
{{- end  }}
{{- if .Values.podDisruptionBudget.maxUnavailable }}
maxUnavailable: {{ .Values.podDisruptionBudget.maxUnavailable }}
{{- end  }}

If you set minAvailable: 0, the chart treats it the same as not setting it, because the number 0 counts as false in a Helm template "if". That results in neither value getting set, which is effectively the same as setting maxUnavailable: 0 (which, of course, you also cannot do).

The fix: set minAvailable: "0%". Because "0%" is a non-empty string rather than the number 0, the chart's "if" test passes and the value is rendered. You may wonder why you should bother setting a PodDisruptionBudget at all if you are going to allow all the pods to be deleted. The reason is that it gives the Autoscaler explicit permission to evict pods that it might otherwise be too cautious about evicting, for example any pod that uses emptyDir. The contents of an emptyDir are lost when the pod is evicted, so without explicit permission the Autoscaler will not evict such pods, to avoid deleting something important.
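As a rough sketch of that fix, assuming a chart that reads podDisruptionBudget.minAvailable exactly as in the template above (the key names are taken from that snippet; a real chart may name them differently), the values might look like this:

```yaml
# values.yaml (sketch; key names assumed to match the template shown above)
podDisruptionBudget:
  # minAvailable: 0    # the number 0 is falsy in the chart's "if", so nothing is rendered
  minAvailable: "0%"   # a non-empty string is truthy, so the chart renders the field
```

The quotes matter here: they keep the value a string in the values file, which is what makes the template's test succeed.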

It is dangerous to set minAvailable: 1

It may seem innocuous to set minAvailable: 1, but we frequently scale deployments down to 1 replica. Combine that with minAvailable: 1 and you are stuck with a single pod that cannot be evicted, which in turn prevents the instance it runs on from being taken out of service.

Not a fix: minAvailable: 25%. Kubernetes does not round these percentages to the nearest integer; it always rounds up. With 1 replica, 25% rounds up to 1, so this is not a fix at all.
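To make the rounding concrete, here is a small annotated fragment. The replica counts in the comments are illustrative only, and the "allowed disruptions" figures assume every replica is healthy:

```yaml
# PodDisruptionBudget spec fragment (sketch)
spec:
  minAvailable: "25%"   # resolved as ceil(0.25 * replicas)
  # replicas: 1 -> ceil(0.25) = 1 must stay available -> 0 disruptions allowed
  # replicas: 4 -> ceil(1.00) = 1 must stay available -> 3 disruptions allowed
  # replicas: 5 -> ceil(1.25) = 2 must stay available -> 3 disruptions allowed
```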

The fix: set maxUnavailable: 50%. That avoids knocking out the service when there are replicas to spare, but does not prevent the pod from being evicted when only one replica is running, because 50% of 1 also rounds up to 1.
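A minimal sketch of such a budget follows. The name and label "example-app" are placeholders, and the apiVersion assumes a cluster that serves policy/v1 (Kubernetes 1.21 or later); older clusters use policy/v1beta1 instead.

```yaml
apiVersion: policy/v1            # assumption: use policy/v1beta1 on clusters older than 1.21
kind: PodDisruptionBudget
metadata:
  name: example-app              # hypothetical name
spec:
  maxUnavailable: "50%"          # resolved as ceil(0.5 * replicas)
  # replicas: 1 -> 1 pod may be unavailable (the single pod can be evicted)
  # replicas: 2 -> 1 pod may be unavailable
  # replicas: 3 -> ceil(1.5) = 2 pods may be unavailable
  selector:
    matchLabels:
      app: example-app           # hypothetical label; must match the deployment's pods
```

Rounding up works in your favor with maxUnavailable: the budget always allows at least one eviction, so a lone replica never blocks a scale-down.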

Sometimes you have to set both minAvailable and maxUnavailable

While you are not allowed to set both minAvailable and maxUnavailable in the actual PodDisruptionBudget resource, if the Helm chart provides a default value for one of them and you want to use the other one, you have to set both in the helmfile. Set the one you do not want to "" (the empty string), which overrides the default and also hints to readers that you are unsetting it.
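For instance, if a chart defaulted minAvailable to 1 and you wanted maxUnavailable instead, a helmfile release might look roughly like this. The release name and chart reference are placeholders, and the value keys are assumed to match the template shown earlier:

```yaml
# helmfile.yaml (sketch; release and chart names are hypothetical)
releases:
  - name: example-app
    chart: example-repo/example-app
    values:
      - podDisruptionBudget:
          minAvailable: ""        # empty string is falsy, so the chart default is not rendered
          maxUnavailable: "50%"   # the setting actually wanted
```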

