Skip to content

Add flag to determine whether containerd config is overwritten #12278

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mkjpryor
Copy link

@mkjpryor mkjpryor commented Jun 2, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

Adds a flag to opt out of overriding the containerd configuration.

Which issue(s) this PR fixes:

Fixes #12277

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Allow opting out of overriding the containerd configuration. This can be desirable when applications that modify the containerd config are installed on the cluster.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 2, 2025
@k8s-ci-robot k8s-ci-robot requested review from MrFreezeex and yankay June 2, 2025 10:57
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mkjpryor
Once this PR has been reviewed and has the lgtm label, please assign mzaian for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 2, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @mkjpryor. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@yankay
Copy link
Member

yankay commented Jun 3, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 3, 2025
@mkjpryor
Copy link
Author

mkjpryor commented Jun 4, 2025

@yankay

Looks like this tested down OK. Is there anything else you need from me?

@yankay
Copy link
Member

yankay commented Jun 5, 2025

Thanks @mkjpryor​​

I appreciate the solution you provided. Personally, I think this parameter is reasonable, as I couldn’t think of a better way to address this issue.

However, there’s one scenario that may need clarification:
If a user needs to modify containerd config.yaml while the cluster has gpu-operator deployed, they must set containerd_config_overwrite: true. After running Ansible, they should also restart the NVIDIA Container Toolkit and ensure the configuration file is correctly updated.

​​/lgtm​​
@mzaian @VannTen Could you please share your thoughts?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2025
@VannTen
Copy link
Contributor

VannTen commented Jun 5, 2025 via email

@mkjpryor
Copy link
Author

mkjpryor commented Jun 9, 2025

@VannTen

I agree with you that this is not an ideal solution, but as @yankay says it is the only reasonable solution I can think of that I can implement right now to stop kubespray and the NVIDIA GPU operator fighting over my containerd config.

In the docs you pointed to, it says that RuntimeClass is used for making Kubernetes aware of the different CRI configurations available on the nodes so that it can request them. However in order to make additional CRI configurations available on a node running containerd, you have to modify the containerd config. The NVIDIA GPU operator does indeed create RuntimeClass objects to make Kubernetes aware of the CRI configurations that it has installed.

The major problem here is actually containerd's lack of support for a config.d style model, which would allow the GPU operator to drop the extra CRI configurations into an include directory. That would be the ideal solution for this but containerd does not support doing this due to some strange decisions they took about which parts of the config are subject to a merge and which are overwritten completely by includes (spoiler - the entire runtimes section is overwritten at once, so you can't have a default runtime in config.toml and add extra runtimes in drop-in files).

I'm open to other solutions if you can think of any. I can't. The NVIDIA GPU operator is a very widely used piece of software so I am surprised to be the only one hitting this. Maybe nobody else uses it with kubespray 🤷‍♂️

@mkjpryor
Copy link
Author

mkjpryor commented Jun 9, 2025

In any case, I think a flag whose default value results in no change in behaviour is a reasonable compromise?

@mkjpryor
Copy link
Author

mkjpryor commented Jun 9, 2025

Thanks @mkjpryor​​

I appreciate the solution you provided. Personally, I think this parameter is reasonable, as I couldn’t think of a better way to address this issue.

However, there’s one scenario that may need clarification: If a user needs to modify containerd config.yaml while the cluster has gpu-operator deployed, they must set containerd_config_overwrite: true. After running Ansible, they should also restart the NVIDIA Container Toolkit and ensure the configuration file is correctly updated.

​​/lgtm​​@mzaian @VannTen Could you please share your thoughts?

@yankay

You are correct that if a user sets this flag in their group vars, then if they really need kubespray to update the containerd config they will need to run with containerd_config_overwrite=true as an extra var and probably roll the NVIDIA daemonsets after. Does this need a documentation patch? I am happy to do it if you can suggest a good place in the docs for it to go.

@yankay
Copy link
Member

yankay commented Jun 11, 2025

In any case, I think a flag whose default value results in no change in behaviour is a reasonable compromise?

agree with your opinion. This PR enhances the configurability of Kubespray without any negative impacts, so it can be merged.
Another reviewer's approval is still needed before this can be merged.

Additionally, we can continue to explore whether there are more elegant solutions in the future.

@VannTen
Copy link
Contributor

VannTen commented Jun 14, 2025

Ok, I see the rationale. I'm still not a very big fan, especially because this opens the door for subtle bugs where the containerd config is not updated with new versions, that kind of things, which would not be a breaking change with normal behavior but could be if we don't update the containerd config.

What's the cri-o approach on this, a config.d/ scheme like you mentioned above ?

Another question regarding the nvidia operator specifically: is the runtime and runtime class available to implement independently ?
Currently the container engine and container runtimes are quite coupled in kubespray, but in the future we should decoupled them, and I'm wondering if that would be then a solution which would allow to get rid of this workaround.

@VannTen
Copy link
Contributor

VannTen commented Jun 14, 2025 via email

@mkjpryor
Copy link
Author

mkjpryor commented Jun 18, 2025

@VannTen

In all honesty, I'd rather use the NVIDIA official approach for similar reasons to the ones that you expressed as concerns for this patch. If the NVIDIA GPU operator changes the installation and I have fixed runtimes defined in containerd_additional_runtimes that are out-of-date with that, I could get in a world of pain that way. I would probably also have to assume responsibility for installing/upgrading the NVIDIA drivers and runtime binaries, which I don't really want to do.

Unfortunately, containerd does not support a config.d style config merging properly right now, so it is not really an option for the NVIDIA GPU operator to use that. The NVIDIA runtime installation process reads the current default runtime and bases the NVIDIA runtimes off that with the required changes, e.g. to point at different runtime binaries.

A stronger decoupling of the installation of the container engine itself and the corresponding runtime config could work, but I'm not sure what that would look like that is much different to what this patch does.

This patch introduces a new option that you specifically opt in to that basically says to kubespray that I, the operator, am taking responsibility for ensuring the containerd config is correct because I know other things need to change it. I'm not sure how much better we can do with the current state of the containerd config merging.

Of course, if a larger refactoring in the future, or changes to containerd make a config.d approach feasible, we should revisit this workaround.

@VannTen
Copy link
Contributor

VannTen commented Jun 20, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kubespray overwrites NVIDIA GPU operator containerd configuration
4 participants