
Commit 02b96f5, committed Dec 12, 2023

Rework documentation

Remove the cgroup schema as it's not really actionable: the link to the Kubernetes documentation and design doc here already covers that.

1 parent f34e136

File tree: 2 files changed, +39 −81 lines
 

docs/cgroups.md (22 additions, 53 deletions)
@@ -2,71 +2,40 @@
 
 To avoid the rivals for resources between containers or the impact on the host in Kubernetes, the kubelet components will rely on cgroups to limit the container’s resources usage.
 
-## Enforcing Node Allocatable
+## Node Allocatable
 
-You can use `kubelet_enforce_node_allocatable` to set node allocatable enforcement.
+Node Allocatable is calculated by subtracting from the node capacity:
+- kube-reserved reservations
+- system-reserved reservations
+- hard eviction thresholds
 
-```yaml
-# A comma separated list of levels of node allocatable enforcement to be enforced by kubelet.
-kubelet_enforce_node_allocatable: "pods"
-# kubelet_enforce_node_allocatable: "pods,kube-reserved"
-# kubelet_enforce_node_allocatable: "pods,kube-reserved,system-reserved"
-```
-
-Note that to enforce kube-reserved or system-reserved, `kube_reserved_cgroups` or `system_reserved_cgroups` needs to be specified respectively.
-
-Here is an example:
+You can set those reservations:
 
 ```yaml
-kubelet_enforce_node_allocatable: "pods,kube-reserved,system-reserved"
-
-# Reserve this space for kube resources
-# Set to true to reserve resources for kube daemons
-kube_reserved: true
-kube_reserved_cgroups_for_service_slice: kube.slice
-kube_reserved_cgroups: "/{{ kube_reserved_cgroups_for_service_slice }}"
+# Kubelet and container engine
 kube_memory_reserved: 256Mi
 kube_cpu_reserved: 100m
-# kube_ephemeral_storage_reserved: 2Gi
-# kube_pid_reserved: "1000"
-# Reservation for master hosts
-kube_master_memory_reserved: 512Mi
-kube_master_cpu_reserved: 200m
-# kube_master_ephemeral_storage_reserved: 2Gi
-# kube_master_pid_reserved: "1000"
+kube_ephemeral_storage_reserved: 2Gi
+kube_pid_reserved: "1000"
 
-# Set to true to reserve resources for system daemons
-system_reserved: true
-system_reserved_cgroups_for_service_slice: system.slice
-system_reserved_cgroups: "/{{ system_reserved_cgroups_for_service_slice }}"
+# System daemons (sshd, network manager, ...)
 system_memory_reserved: 512Mi
 system_cpu_reserved: 500m
-# system_ephemeral_storage_reserved: 2Gi
-# system_pid_reserved: "1000"
-# Reservation for master hosts
-system_master_memory_reserved: 256Mi
-system_master_cpu_reserved: 250m
-# system_master_ephemeral_storage_reserved: 2Gi
-# system_master_pid_reserved: "1000"
+system_ephemeral_storage_reserved: 2Gi
+system_pid_reserved: "1000"
 ```
 
-After the setup, the cgroups hierarchy is as follows:
+By default, the kubelet enforces Node Allocatable for pods, which means pods will be
+evicted when resource usage exceeds Allocatable.
+
+You can optionally enforce the reservations for kube-reserved and system-reserved, but proceed with
+caution (see
+https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#general-guidelines).
 
-```bash
-/ (Cgroups Root)
-├── kubepods.slice
-│   ├── ...
-│   ├── kubepods-besteffort.slice
-│   ├── kubepods-burstable.slice
-│   └── ...
-├── kube.slice
-│   ├── ...
-│   ├── {{container_manager}}.service
-│   ├── kubelet.service
-│   └── ...
-├── system.slice
-│   └── ...
-└── ...
+```yaml
+enforce_allocatable_pods: true # default
+enforce_allocatable_kube_reserved: true
+enforce_allocatable_system_reserved: true
 ```
 
 You can learn more in the [official kubernetes documentation](https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/).
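The Node Allocatable arithmetic this rework documents (capacity minus kube-reserved, minus system-reserved, minus hard eviction thresholds) can be sketched numerically. The 256Mi/512Mi memory reservations below come from the diff; the 8Gi node capacity and 100Mi hard eviction threshold are assumed values for illustration only:

```python
def node_allocatable_mib(capacity: int, kube_reserved: int,
                         system_reserved: int, hard_eviction: int) -> int:
    """Allocatable = Capacity - KubeReserved - SystemReserved - HardEvictionThreshold (all in MiB)."""
    return capacity - kube_reserved - system_reserved - hard_eviction

# kube_memory_reserved: 256Mi and system_memory_reserved: 512Mi (from the diff);
# an 8Gi node with a 100Mi hard eviction threshold is a hypothetical example.
print(node_allocatable_mib(8 * 1024, 256, 512, 100))  # → 7324
```

Pods on such a node could request at most 7324Mi of memory in aggregate; usage beyond that triggers eviction when only pod-level enforcement is active.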

inventory/sample/group_vars/k8s_cluster/k8s-cluster.yml (17 additions, 28 deletions)
@@ -259,47 +259,36 @@ podsecuritypolicy_enabled: false
 # Download kubectl onto the host that runs Ansible in {{ bin_dir }}
 # kubectl_localhost: false
 
-# A comma separated list of levels of node allocatable enforcement to be enforced by kubelet.
-# Acceptable options are 'pods', 'system-reserved', 'kube-reserved' and ''. Default is "".
-# kubelet_enforce_node_allocatable: pods
+## Reserving compute resources
+# https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/
 
-## Set runtime and kubelet cgroups when using systemd as cgroup driver (default)
-# kubelet_runtime_cgroups: "/{{ kube_service_cgroups }}/{{ container_manager }}.service"
-# kubelet_kubelet_cgroups: "/{{ kube_service_cgroups }}/kubelet.service"
-
-## Set runtime and kubelet cgroups when using cgroupfs as cgroup driver
-# kubelet_runtime_cgroups_cgroupfs: "/system.slice/{{ container_manager }}.service"
-# kubelet_kubelet_cgroups_cgroupfs: "/system.slice/kubelet.service"
-
-# Optionally reserve this space for kube daemons.
-# kube_reserved: false
+# Optionally reserve resources for kube daemons.
 ## Uncomment to override default values
-## The following two items need to be set when kube_reserved is true
-# kube_reserved_cgroups_for_service_slice: kube.slice
-# kube_reserved_cgroups: "/{{ kube_reserved_cgroups_for_service_slice }}"
 # kube_memory_reserved: 256Mi
 # kube_cpu_reserved: 100m
 # kube_ephemeral_storage_reserved: 2Gi
 # kube_pid_reserved: "1000"
-# Reservation for master hosts
-# kube_master_memory_reserved: 512Mi
-# kube_master_cpu_reserved: 200m
-# kube_master_ephemeral_storage_reserved: 2Gi
-# kube_master_pid_reserved: "1000"
 
 ## Optionally reserve resources for OS system daemons.
-# system_reserved: true
 ## Uncomment to override default values
 ## The following two items need to be set when system_reserved is true
-# system_reserved_cgroups_for_service_slice: system.slice
-# system_reserved_cgroups: "/{{ system_reserved_cgroups_for_service_slice }}"
 # system_memory_reserved: 512Mi
 # system_cpu_reserved: 500m
 # system_ephemeral_storage_reserved: 2Gi
-## Reservation for master hosts
-# system_master_memory_reserved: 256Mi
-# system_master_cpu_reserved: 250m
-# system_master_ephemeral_storage_reserved: 2Gi
+# system_pid_reserved: "1000"
+#
+# Make the kubelet enforce with cgroups the limits of Pods
+# enforce_allocatable_pods: true
+
+# Enforce kube_*_reserved as limits
+# WARNING: this limits the resources the kubelet and the container engine can
+# use which can cause instability on your nodes
+# enforce_allocatable_kube_reserved: false
+
+# Enforce system_*_reserved as limits
+# WARNING: this limits the resources system daemons can use which can lock you
+# out of your nodes (by OOMkilling sshd for instance)
+# enforce_allocatable_system_reserved: false
 
 ## Eviction Thresholds to avoid system OOMs
 # https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#eviction-thresholds
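The reservations in this inventory use Kubernetes resource-quantity strings (`256Mi`, `100m`). A minimal sketch of how such quantities map to base units, covering only the suffixes that appear in this file (this is an illustration, not the kubelet's actual quantity parser):

```python
def parse_memory(quantity: str) -> int:
    """Convert a binary-suffixed memory quantity ('256Mi', '2Gi') to bytes."""
    suffixes = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
    for suffix, multiplier in suffixes.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * multiplier
    return int(quantity)  # a bare integer means bytes

def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity ('100m' = millicores, '2' = cores) to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)

print(parse_memory("256Mi"))  # → 268435456
print(parse_cpu("500m"))      # → 0.5
```

So `kube_memory_reserved: 256Mi` carves out 256 MiB of memory and `system_cpu_reserved: 500m` half a CPU core out of the node's capacity before Allocatable is computed.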
