Increase range for expected VPA CPU recommendations in e2e #8386

Open · wants to merge 1 commit into master

Conversation

kamarabbas99
Contributor

What type of PR is this?

/kind flake

What this PR does / why we need it:

These tests can get flaky because the resource consumer consumes 1800m of CPU, which can be unevenly distributed across the 3 pods and lead to failures.

Also, the tests don't need to append recommendations, since the vpa-recommender is running in this suite.

Similar PR before: #4469
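For context, a minimal sketch of the kind of bounds check the widened range feeds into. The `withinRange` helper and the sample recommendation value are illustrative assumptions, not the actual e2e code:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// withinRange reports whether a recommendation falls inside the expected
// [lower, upper] bounds; a stand-in for the assertion the e2e test performs.
func withinRange(rec, lower, upper resource.Quantity) bool {
	return rec.Cmp(lower) >= 0 && rec.Cmp(upper) <= 0
}

func main() {
	lower := resource.MustParse("600m")  // widened lower bound proposed here
	upper := resource.MustParse("1800m") // upper bound matching the 1800m total load

	rec := resource.MustParse("650m") // an example per-container CPU recommendation
	fmt.Println(withinRange(rec, lower, upper)) // true
}
```

The real assertion lives in the VPA e2e suite; this only illustrates how a wider range makes per-container recommendations less likely to fall outside it.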

@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot added the kind/flake and do-not-merge/release-note-label-needed labels on Jul 29, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kamarabbas99
Once this PR has been reviewed and has the lgtm label, please assign jbartosik for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the area/vertical-pod-autoscaler, cncf-cla: yes, and needs-ok-to-test labels on Jul 29, 2025
@k8s-ci-robot
Contributor

Hi @kamarabbas99. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot added the size/S label on Jul 29, 2025
@kamarabbas99
Contributor Author

cc @adrianmoisey

@adrianmoisey
Member

/ok-to-test

k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label on Jul 29, 2025
@omerap12
Member

omerap12 commented Aug 1, 2025

Can you explain a bit more about how this is supposed to solve the flake?
Here we specify 3 replicas, so we have

rc.ConsumeCPU(600 * replicas)

which means 1800m in total. But if this load is unevenly distributed across those 3 pods, how does changing a pod's min/max help? How does

ParseQuantityOrDie("600m"), ParseQuantityOrDie("1800m"))

solve this flake? Couldn't one pod get something like 400m while another gets 800m, for example?
What am I missing?
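To make the arithmetic in this question concrete, here is a small sketch; the split values are purely illustrative, echoing the 400m/800m example above:

```go
package main

import "fmt"

func main() {
	const replicas = 3
	const perReplicaCPU = 600         // millicores of load per replica, as in rc.ConsumeCPU(600 * replicas)
	total := perReplicaCPU * replicas // 1800m in total

	// A hypothetical uneven split of that load across the 3 pods (illustrative only).
	split := []int{400, 800, 600}

	// The widened per-container CPU bounds proposed by this PR, in millicores.
	lower, upper := 600, 1800

	for i, cpu := range split {
		ok := cpu >= lower && cpu <= upper
		fmt.Printf("pod %d uses %dm of %dm total; within [%dm, %dm]: %v\n", i, cpu, total, lower, upper, ok)
	}
}
```

Under this hypothetical split, the 400m pod would still fall below the 600m floor, which is exactly the concern being raised here.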

@kamarabbas99
Contributor Author

That's a good point. I'm actually not sure about that, but a similar thing is done for memory:

ParseQuantityOrDie("900Mi"), ParseQuantityOrDie("4000Mi"))

Maybe it will be a minimum of 600m?

I'm not sure how else to reproduce this, but I am encountering this flake when I add CPU boost logic to the updater (you could maybe add a sleep in RunOnce and it will still flake).
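For comparison, a short sketch that just restates the two ranges quoted in this thread side by side (the existing memory bounds and the widened CPU bounds); it is not the test code:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Expected recommendation ranges quoted in the thread: the existing memory
	// bounds and the widened CPU bounds this PR proposes, which follow the same pattern.
	ranges := map[string][2]resource.Quantity{
		"cpu":    {resource.MustParse("600m"), resource.MustParse("1800m")},
		"memory": {resource.MustParse("900Mi"), resource.MustParse("4000Mi")},
	}
	for name, r := range ranges {
		fmt.Printf("%s: expected recommendation between %s and %s\n", name, r[0].String(), r[1].String())
	}
}
```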

@kamarabbas99
Contributor Author

/cc @omerap12

k8s-ci-robot requested a review from omerap12 on August 1, 2025
@adrianmoisey
Member

Is this change fixing a current flake, or a new upcoming flake?

@omerap12
Member

omerap12 commented Aug 2, 2025

To be honest, I haven't seen this flake happen recently, not even in my current PR.
It makes sense that if you set a lower min value and a higher max value there will probably be fewer flakes, but it doesn't guarantee that the flake is completely fixed.

Labels
area/vertical-pod-autoscaler · cncf-cla: yes · do-not-merge/release-note-label-needed · kind/flake · ok-to-test · size/S