[VPA] Updater doesn't fallback to pod eviction in case it cannot decrease mem limits in-place #8434

@Art3mK

Description

Which component are you using?:

/area vertical-pod-autoscaler

What version of the component are you using?:

Component version: 1.4.1

What k8s version are you using (kubectl version)?:

Server Version: v1.33.1-eks-595af52

What environment is this in?:

AWS EKS

What did you expect to happen?:

The docs here state:

VPA will fall back to pod recreation in the following scenarios:
...

BTW, it looks like k8s 1.33 doesn’t recognize a policy called PreferNoRestart.
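(On 1.33, kubectl explain pod.spec.containers.resizePolicy.restartPolicy lists only NotRequired and RestartContainer as accepted values, so I'm assuming the docs meant NotRequired.)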

So, for a pod with resizePolicy set to NotRequired for memory, I expect the updater to fall back to pod eviction once the in-recommendation-bounds-eviction-lifetime-threshold period is over (1h in my case).
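For reference, that threshold is an updater flag, which I set roughly like this (excerpt from my vpa-updater Deployment; surrounding fields omitted, image tag assumed):

containers:
- name: updater
  image: registry.k8s.io/autoscaling/vpa-updater:1.4.1
  args:
  # evict pods older than 1h even when the recommendation is within bounds
  - --in-recommendation-bounds-eviction-lifetime-threshold=1h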

What happened instead?:

After 24 hours (with my lifetime threshold set to 1 hour), I only see this:

updater.go:286] "In-place update failed" error="Pod \"vpa-recommender-8449858858-dz4gq\" is invalid: spec.containers[0].resources.limits[memory]: Forbidden: memory limits cannot be decreased unless resizePolicy is RestartContainer" pod="vpa/vpa-recommender-8449858858-dz4gq"

Yes, I’m experimenting and have VPA for VPA pods 😁

My pod has this resizePolicy:

resizePolicy:
- resourceName: cpu
  restartPolicy: NotRequired
- resourceName: memory
  restartPolicy: NotRequired

How to reproduce it (as minimally and precisely as possible):

  1. Start an overprovisioned deployment with a pause pod, with an InPlaceOrRecreate VPA defined for it, while the VPA webhook is down. This way, VPA will want to reduce the resources on the pod. (See the manifest sketch after these steps.)

  2. The deployment must:

  • be overprovisioned in resources
  • have NotRequired as the restartPolicy for its containers

  3. Observe that the updater complains and does nothing.
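Something like this minimal pair of manifests should do it; the names (pause-overprovisioned, pause-vpa) are made up for illustration, and the resource values are just deliberately oversized:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pause-overprovisioned   # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pause-overprovisioned
  template:
    metadata:
      labels:
        app: pause-overprovisioned
    spec:
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9
        resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired
        - resourceName: memory
          restartPolicy: NotRequired
        resources:
          requests:
            cpu: 500m       # far more than pause needs, so VPA wants to scale down
            memory: 512Mi
          limits:
            memory: 512Mi   # the limit VPA will try (and fail) to decrease in place
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: pause-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pause-overprovisioned
  updatePolicy:
    updateMode: InPlaceOrRecreate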

And yes, I understand this is an alpha feature. 😄

Use case:

I’m experimenting with VPA to see if it could help me optimize my dev cluster, where workloads spawn randomly in new namespaces. I’d like to adjust resources for these workloads fairly quickly, because they could be gone completely in 6-12 hours.

VPAs for these workloads are created by a separate controller, so recommendations for new VPAs are initially off. I’d like VPA to act on those workloads after 1–3 hours of collecting stats, scaling resources up in place if needed, but without constantly restarting containers when reducing memory (which is what happens if I set the RestartContainer policy for memory).
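For comparison, the variant I’m trying to avoid would look like this (sketch of the same container’s resizePolicy):

resizePolicy:
- resourceName: cpu
  restartPolicy: NotRequired
- resourceName: memory
  restartPolicy: RestartContainer   # lets the kubelet apply memory-limit decreases, at the cost of restarting the container each time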

If I understand correctly, having the PreferNoRestart/NotRequired policy on a container’s memory resource will, by design, cause VPA to evict the pod in order to lower the memory limit. However, the new pod would then be subject to the in-recommendation-bounds-eviction-lifetime-threshold period, which means VPA will not restart pods frequently.
