Hetzner(feat): Add ability to specify a subnet for autoscaled node placement #8334

Open
wants to merge 4 commits into
base: master

@tloesch tloesch commented Jul 21, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR enables the optional specification of a subnet where autoscaled nodes will be placed. Previously, it was only possible to define a network ID, which caused issues when multiple subnets existed within that network.

Without subnet specification, Hetzner would place nodes randomly in one of the available subnets, leading to unpredictable network configurations and potential connectivity problems. This feature provides more precise control over node placement.

The implementation ensures that nodes receive consistent IP addresses from the specified subnet through a reservation mechanism that prevents IP conflicts during scaling operations.

Which issue(s) this PR fixes:

Fixes #5263

Special notes for your reviewer:

Notice
I am not experienced in Golang development. Please provide me with guidance on necessary improvements. I will then try to implement them as best as possible.

Concept description:

The IP Reserver operates using a thread-safe data structure that manages all previously assigned IP addresses within a subnet. When creating a new server, the process follows a defined sequence:

  1. First, a free IP address is reserved from the configured subnet, automatically skipping network and broadcast addresses.
  2. The server is initially created without network connectivity.
  3. The reserved IP is stored as a label on the server (cluster-autoscaler/reserved-ip) and internally as map[string]net.IP.
  4. After successful server creation, the server is connected to the network with the reserved static IP address.
  5. When a server is deleted, its associated IP address is automatically released and becomes available again.
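A minimal Go sketch of the reserve/release part of this sequence, for illustration only. It is not the PR's code: the names `ipReserver`, `newIPReserver`, `Reserve`, and `Release` are hypothetical, the sketch handles IPv4 only, and it leaves out the server label and every Hetzner API call.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
	"sync"
)

// ipReserver is a hypothetical, simplified stand-in for the IP Reserver
// described above: a mutex-guarded map from server name to reserved IP.
type ipReserver struct {
	mu       sync.Mutex
	subnet   *net.IPNet
	reserved map[string]net.IP
}

func newIPReserver(cidr string) (*ipReserver, error) {
	_, subnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, err
	}
	if subnet.IP.To4() == nil {
		return nil, errors.New("only IPv4 subnets are supported in this sketch")
	}
	return &ipReserver{subnet: subnet, reserved: map[string]net.IP{}}, nil
}

// Reserve returns the lowest free host address in the subnet for the given
// server, skipping the network and broadcast addresses.
func (r *ipReserver) Reserve(server string) (net.IP, error) {
	r.mu.Lock()
	defer r.mu.Unlock()

	if ip, ok := r.reserved[server]; ok {
		return ip, nil // already reserved for this server
	}
	used := make(map[string]bool, len(r.reserved))
	for _, ip := range r.reserved {
		used[ip.String()] = true
	}
	base := binary.BigEndian.Uint32(r.subnet.IP.To4())
	ones, bits := r.subnet.Mask.Size()
	size := uint32(1) << uint(bits-ones)
	// Offset 0 is the network address and offset size-1 the broadcast address.
	for off := uint32(1); off+1 < size; off++ {
		ip := make(net.IP, 4)
		binary.BigEndian.PutUint32(ip, base+off)
		if !used[ip.String()] {
			r.reserved[server] = ip
			return ip, nil
		}
	}
	return nil, errors.New("no free IP address left in subnet")
}

// Release frees the IP held by the given server, if any.
func (r *ipReserver) Release(server string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.reserved, server)
}

func main() {
	r, err := newIPReserver("10.0.1.0/29")
	if err != nil {
		panic(err)
	}
	a, _ := r.Reserve("node-a")
	b, _ := r.Reserve("node-b")
	fmt.Println(a, b) // 10.0.1.1 10.0.1.2
	r.Release("node-a")
	c, _ := r.Reserve("node-c")
	fmt.Println(c) // 10.0.1.1
}
```

The sketch skips only the network and broadcast addresses, as in step 1. A real deployment may need to exclude further addresses that the provider reserves.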

Does this PR introduce a user-facing change?

NONE

Added the ability to specify a subnet for autoscaled node placement in the Hetzner Cloud Provider. This resolves an issue where nodes could be placed in random subnets when multiple subnets existed within a network, leading to unpredictable network configurations. Refer to the README.md of the Hetzner Cloud Provider for configuration instructions.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. labels Jul 21, 2025

linux-foundation-easycla bot commented Jul 21, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. do-not-merge/needs-area labels Jul 21, 2025
@k8s-ci-robot
Contributor

Welcome @tloesch!

It looks like this is your first PR to kubernetes/autoscaler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/autoscaler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot requested review from apricote and x13n July 21, 2025 09:13
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 21, 2025
@k8s-ci-robot
Contributor

Hi @tloesch. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added area/cluster-autoscaler area/provider/hetzner Issues or PRs related to Hetzner provider and removed do-not-merge/needs-area labels Jul 21, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: tloesch
Once this PR has been reviewed and has the lgtm label, please assign apricote for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jul 21, 2025
@tloesch
Author

tloesch commented Jul 21, 2025

Tests / test-and-verify (pull_request) fails.
Is this because of any code changes within the PR?

EDIT: Found the issue partway through the CI run. I fixed it by formatting the files.

@lukasmetzner
Contributor

Hey, thanks for the contribution. I will take a look at your PR ^^

@lukasmetzner
Contributor

Hey, before I go into detail about the code I would like to discuss the overall concept.

Your approach of storing the IPs in the servers' labels will cause issues with multiple clusters (including an autoscaler) in the same Hetzner Cloud project.

A simpler method could be to fetch the necessary data from the API (or cache) and determine which IPs are free to use before starting the server creation process. These can then be assigned to each server in the Create call and you should be fine. This would not clash with other clusters in the project and would simplify the implementation.

I would like to ask you to revise your concept. If you have any questions feel free to ask them.

Best Regards
Lukas

@tloesch
Author

tloesch commented Aug 8, 2025

> Hey, before I go into detail about the code I would like to discuss the overall concept.
>
> Your approach of storing the IPs in the servers' labels will cause issues with multiple clusters (including an autoscaler) in the same Hetzner Cloud project.
>
> A simpler method could be to fetch the necessary data from the API (or cache) and determine which IPs are free to use before starting the server creation process. These can then be assigned to each server in the Create call and you should be fine. This would not clash with other clusters in the project and would simplify the implementation.
>
> I would like to ask you to revise your concept. If you have any questions feel free to ask them.
>
> Best Regards Lukas

Seems more robust. I will change the implementation and mention you when I am done.
Thanks for your advice.

EDIT:
@lukasmetzner

The code is a bit older and I responded a little too hastily.
Perhaps I didn't describe the concept well enough in the PR.
So here's a hopefully more detailed explanation:

The IP addresses are already obtained from the API/cache.
In addition, the IP addresses are also obtained from the labels.
The reason I decided to add the IP address as a label to the server is that IP assignment within the Hetzner Cloud does not happen immediately.
This is a problem when the autoscaler or autoscalers want to create servers almost simultaneously.
The additional local map object is perhaps a bit excessive here and should be removed for the reason you mentioned. (Multiple autoscalers/clusters within the same Hetzner project and network/subnetwork)
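To make the idea of combining both sources concrete, here is a small Go sketch. It assumes a hypothetical, trimmed-down `serverInfo` type in place of the real API server object. Only the label key `cluster-autoscaler/reserved-ip` comes from the PR description; the type and the function `usedIPs` are illustrative.

```go
package main

import (
	"fmt"
	"net/netip"
	"sort"
)

// Label key taken from the PR description.
const reservedIPLabel = "cluster-autoscaler/reserved-ip"

// serverInfo is a hypothetical, reduced view of a server as returned by the
// API or cache: the private IPs already attached, plus its labels.
type serverInfo struct {
	PrivateIPs []string
	Labels     map[string]string
}

// usedIPs merges the IPs already attached to servers with the IPs that are
// only announced through the reserved-ip label, so that an address whose
// network attachment is still pending is also treated as taken.
func usedIPs(servers []serverInfo) []netip.Addr {
	seen := map[netip.Addr]bool{}
	add := func(s string) {
		if a, err := netip.ParseAddr(s); err == nil {
			seen[a] = true // unparsable values are ignored in this sketch
		}
	}
	for _, s := range servers {
		for _, ip := range s.PrivateIPs {
			add(ip)
		}
		if ip, ok := s.Labels[reservedIPLabel]; ok {
			add(ip)
		}
	}
	out := make([]netip.Addr, 0, len(seen))
	for a := range seen {
		out = append(out, a)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Less(out[j]) })
	return out
}

func main() {
	servers := []serverInfo{
		{PrivateIPs: []string{"10.0.1.2"}},                       // attached
		{Labels: map[string]string{reservedIPLabel: "10.0.1.3"}}, // reserved, not yet attached
		{PrivateIPs: []string{"10.0.1.4"}, Labels: map[string]string{reservedIPLabel: "10.0.1.4"}},
	}
	fmt.Println(usedIPs(servers)) // [10.0.1.2 10.0.1.3 10.0.1.4]
}
```

A label of this kind likely narrows the window for conflicts rather than closing it: two autoscalers that list servers at the same moment could still pick the same address before either label is written.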

Please let me know whether I should remove the local map object or whether we should find another, better conceptual solution.
In my opinion, another solution could be an implementation on the API side of Hetzner Cloud.
For example, the possibility of adding a subnet to a server creation API request.
This would greatly simplify the work on the autoscaler side.
Unfortunately, to my knowledge, this is closed source and therefore cannot be customized by the community.

Labels
area/cluster-autoscaler area/provider/hetzner Issues or PRs related to Hetzner provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hetzner Autoscaler does not allow subnets for HCLOUD_NETWORK
3 participants