VPA: Pod memory limit exceeds recommendation and namespace quotas #8401

@tspearconquest

Which component are you using?:

/area vertical-pod-autoscaler

What version of the component are you using?:

Component version: 1.2.1

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: v1.32.5
Kustomize Version: v5.5.0
Server Version: v1.33.1

What environment is this in?: Azure

What did you expect to happen?: The VPA webhook would apply limits no higher than the maxAllowed value.

What happened instead?: The VPA webhook applied a limit far exceeding the VPA resource's maxAllowed value, and also exceeding both the namespace LimitRange memory maximum for pods and the namespace limits.memory ResourceQuota.

How to reproduce it (as minimally and precisely as possible): Apply the VPA, wait a few minutes, then restart the targetRef Deployment.

Anything else we need to know?:

Does VPA set limits at this time? According to #2747, VPA did not support setting limits some years ago, but I'm seeing odd behavior for a resource I have managed by VPA.

My VPA, shown below, has maxAllowed.memory set to 1536Mi (1.5Gi):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: source-controller-platform-admin
  namespace: flux-system
spec:
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      controlledResources:
      - cpu
      - memory
      maxAllowed:
        cpu: 1000m
        memory: 1536Mi
      minAllowed:
        cpu: 50m
        memory: 768Mi
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: source-controller-platform-admin
  updatePolicy:
    minReplicas: 1
    updateMode: Auto

The namespace LimitRange allows up to 2560Mi per pod and per container (1 container per pod), and sets lower defaults to limit any pods not managed by VPA:

apiVersion: v1
kind: LimitRange
metadata:
  name: flux-system
  namespace: flux-system
spec:
  limits:
  - default:
      cpu: "1"
      ephemeral-storage: 1280Mi
      memory: 1280Mi
    defaultRequest:
      cpu: 500m
      ephemeral-storage: 1280Mi
      memory: 1280Mi
    max:
      cpu: "1"
      ephemeral-storage: 2560Mi
      memory: 2560Mi
    type: Container
  - max:
      cpu: "1"
      ephemeral-storage: 2560Mi
      memory: 2560Mi
    type: Pod

The namespace ResourceQuota allows up to 25Gi of total memory limits in the namespace:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: limit-total-namespace-cpu-memory-storage-usage
  namespace: flux-system
spec:
  hard:
    limits.ephemeral-storage: 25Gi
    limits.memory: 25Gi
    requests.cpu: "20"
    requests.ephemeral-storage: 25Gi
    requests.memory: 25Gi

My Deployment and ReplicaSet define container resources as:

resources:
  limits:
    memory: 1536Mi
  requests:
    cpu: 50m
    memory: 64Mi

Yet, when the pod is created by the replicaset, I get this error in the replicaset events (it's a singleton service but there are other singleton pods in the same namespace using some of the memory):

message: 'pods "source-controller-platform-admin-74dfbb9c79-7k7w8" is forbidden:
  exceeded quota: limit-total-namespace-cpu-memory-storage-usage, requested: limits.memory=18Gi,
  used: limits.memory=25440Mi, limited: limits.memory=25Gi'

Namespace memory usage currently without this pod:

limits.memory: 23904Mi/25Gi

Usage is this high because another pod in the same namespace, targeted by a different VPA, also got an 18Gi limit applied by the VPA and took up a large percentage of the available limits.memory quota.
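To make the quota arithmetic concrete, here is a minimal sketch of the check that the error message implies: a pod is admitted only if current usage plus the pod's requested limit stays within the hard limit. The helper name `fits_quota` is hypothetical and the real quota admission logic is more involved; the numbers are the ones quoted in this report.

```python
MI = 1024 ** 2  # bytes in one mebibyte
GI = 1024 ** 3  # bytes in one gibibyte


def fits_quota(used: int, requested: int, hard: int) -> bool:
    """Return True if admitting `requested` keeps total usage within `hard`.

    Hypothetical helper for illustration only.
    """
    return used + requested <= hard


hard = 25 * GI        # limits.memory in the ResourceQuota
used = 23904 * MI     # namespace usage without this pod
headroom = hard - used

print(headroom // MI, "Mi headroom")        # 1696 Mi headroom
print(fits_quota(used, 1536 * MI, hard))    # True: the Deployment's own limit fits
print(fits_quota(used, 18 * GI, hard))      # False: the mutated 18Gi limit does not
```

With 1696Mi of headroom, the 1536Mi limit from the Deployment spec would be admitted, while the 18Gi limit seen at admission cannot be.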

My VPA doesn't seem to be recommending anything close to 18Gi for this pod, as its status shows:

status:
  conditions:
  - lastTransitionTime: "2025-08-03T15:06:50Z"
    status: "True"
    type: RecommendationProvided
  recommendation:
    containerRecommendations:
    - containerName: manager
      lowerBound:
        cpu: 50m
        memory: 768Mi
      target:
        cpu: 63m
        memory: 768Mi
      uncappedTarget:
        cpu: 63m
        memory: "764046746"
      upperBound:
        cpu: 628m
        memory: 1536Mi

Nothing I have configured in the namespace should set the limit that high (18Gi), as the LimitRange above shows.

However, if I remove the VPA, the pod is created with the values from the Deployment and all is well. So it seems clear that the VPA is mutating the pod at admission and setting its limit to 18Gi, which exceeds the namespace resource quota.

So, again, I just want to confirm whether VPA now supports setting limits, and whether there are any known issues with this.
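One possible explanation for the exact 18Gi figure, offered as a sketch rather than a confirmed diagnosis: the VPA documentation describes keeping a container's limit proportional to its request when it rewrites the request, and maxAllowed caps the recommended request rather than the resulting limit. Assuming that behavior applies here, the Deployment's 1536Mi limit over its 64Mi request gives a ratio of 24, and applying that ratio to the 768Mi target yields 18Gi. The function name `proportional_limit` is hypothetical.

```python
from fractions import Fraction

MI = 1024 ** 2  # bytes in one mebibyte
GI = 1024 ** 3  # bytes in one gibibyte


def proportional_limit(orig_request: int, orig_limit: int, new_request: int) -> int:
    """Scale the limit so the original limit:request ratio is preserved.

    Hypothetical helper; assumes the limit is kept proportional to the request.
    """
    ratio = Fraction(orig_limit, orig_request)
    return int(new_request * ratio)


orig_request = 64 * MI    # requests.memory in the Deployment
orig_limit = 1536 * MI    # limits.memory in the Deployment
target = 768 * MI         # VPA target (equal to minAllowed.memory)

new_limit = proportional_limit(orig_request, orig_limit, target)
print(new_limit // MI, "Mi")  # 18432 Mi
print(new_limit / GI, "Gi")   # 18.0 Gi
```

If this is what is happening, a request closer to the intended limit in the Deployment would shrink the ratio and therefore the resulting limit.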

This is the AKS-managed, supported installation of VPA. The recommender pod runs this image:

mcr.microsoft.com/oss/v2/kubernetes/autoscaler/vpa-recommender:v1.2.1-1

Labels

area/vertical-pod-autoscaler, help wanted, kind/bug
