Problem Description
When creating an application with a cStor-provisioned volume (3 replicas), the application pod gets stuck in the ContainerCreating state.
Environment details:
Kubeadm-based 4-node (1 master & 3 workers) cluster running K8s 1.25:
[root@k8s-master-640 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master-640 Ready control-plane 20h v1.25.0
k8s-node1-641 Ready <none> 19h v1.25.0
k8s-node2-642 Ready <none> 19h v1.25.0
k8s-node3-643 Ready <none> 19h v1.25.0
Each worker node has 3 disks attached to it.
Steps followed to create a cStor volume:
- Created a CSPC using the 3 disks on each of the 3 worker nodes.
- CSPC was created successfully with provisioned == desired instances (CSPI), and the pool pods are also in a running state.
- Created a cStor volume with 3 replicas specified in the StorageClass (a sketch of such a StorageClass follows this list). The PVC gets bound to its respective PV.
- CVRs are created and all are in a Healthy state.
- Deployed an application with the above-created PVC.
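For reference, a minimal sketch of the kind of StorageClass used in these steps (the names cstor-csi-sc and cstor-disk-pool are placeholders, the latter inferred from the pool names in the events below; the issue does not include the actual manifest):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cstor-csi-sc                  # placeholder name
provisioner: cstor.csi.openebs.io
allowVolumeExpansion: true
parameters:
  cas-type: cstor
  cstorPoolCluster: cstor-disk-pool   # placeholder; must match the CSPC created above
  replicaCount: "3"                   # the 3 replicas mentioned in the steps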
kubectl describe output for the application pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 7m54s default-scheduler 0/4 nodes are available: 4 pod has unbound immediate PersistentVolumeClaims. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.
Normal Scheduled 7m52s default-scheduler Successfully assigned default/wordpress-5fb7bff8dd-csqrb to k8s-node1-641
Warning FailedMount 2m3s (x10 over 7m43s) kubelet MountVolume.MountDevice failed for volume "pvc-14297415-5f2a-406f-bf8b-87a1a5006742" : rpc error: code = Internal desc = Waiting for pvc-14297415-5f2a-406f-bf8b-87a1a5006742's CVC to be bound
Warning FailedMount 77s (x3 over 5m50s) kubelet Unable to attach or mount volumes: unmounted volumes=[wordpress-persistent-storage], unattached volumes=[wordpress-persistent-storage kube-api-access-zwkx9]: timed out waiting for the condition
kubectl describe output for the CVC:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Provisioning 8m22s (x4 over 8m40s) cstorvolumeclaim-controller failed to create PDB for volume: pvc-14297415-5f2a-406f-bf8b-87a1a5006742: failed to list PDB belongs to pools with selector openebs.io/cstor-disk-pool-ffvp=true,openebs.io/cstor-disk-pool-l2fb=true,openebs.io/cstor-disk-pool-54zn=true: the server could not find the requested resource
Warning Provisioning 4m47s (x4 over 8m36s) cstorvolumeclaim-controller failed to create PDB for volume: pvc-14297415-5f2a-406f-bf8b-87a1a5006742: failed to list PDB belongs to pools with selector openebs.io/cstor-disk-pool-l2fb=true,openebs.io/cstor-disk-pool-54zn=true,openebs.io/cstor-disk-pool-ffvp=true: the server could not find the requested resource
Warning Provisioning 3m17s (x18 over 8m42s) cstorvolumeclaim-controller failed to create PDB for volume: pvc-14297415-5f2a-406f-bf8b-87a1a5006742: failed to list PDB belongs to pools with selector openebs.io/cstor-disk-pool-54zn=true,openebs.io/cstor-disk-pool-ffvp=true,openebs.io/cstor-disk-pool-l2fb=true: the server could not find the requested resource
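The repeated "the server could not find the requested resource" error is what the API server returns when a client asks for an API group/version it no longer serves. A minimal Go sketch (a hypothetical standalone check, not part of the operator, assuming in-cluster config) that shows which policy API versions the server actually offers:

package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/rest"
)

func main() {
	// Assumes this runs inside the cluster; use clientcmd for out-of-cluster.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	dc, err := discovery.NewDiscoveryClientForConfig(cfg)
	if err != nil {
		panic(err)
	}
	// On K8s 1.25+ policy/v1beta1 is no longer served, so this fails
	// with a "not found" error, matching the events above.
	if _, err := dc.ServerResourcesForGroupVersion("policy/v1beta1"); err != nil {
		fmt.Println("policy/v1beta1 not served:", err)
	}
	// policy/v1 has been served since K8s 1.21 and should succeed.
	if _, err := dc.ServerResourcesForGroupVersion("policy/v1"); err == nil {
		fmt.Println("policy/v1 is served")
	}
}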
Logs from one of the pool pods:
I0907 06:52:21.373440 8 event.go:282] Event(v1.ObjectReference{Kind:"CStorVolumeReplica", Namespace:"openebs", Name:"pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736-cstor-disk-pool-54zn", UID:"7f1d146f-4c2c-4a91-a3b0-9b0500867ce1", APIVersion:"cstor.openebs.io/v1", ResourceVersion:"138978", FieldPath:""}): type: 'Normal' reason: 'Synced' Received Resource create event
I0907 06:52:21.389429 8 handler.go:226] will process add event for cvr {pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736-cstor-disk-pool-54zn} as volume {cstor-fb027a66-716a-4abf-b643-3a336cc3da6a/pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736}
I0907 06:52:21.393542 8 handler.go:572] cVR 'pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736-cstor-disk-pool-54zn': uid '7f1d146f-4c2c-4a91-a3b0-9b0500867ce1': phase 'Init': is_empty_status: false
I0907 06:52:21.393557 8 handler.go:584] cVR pending: 7f1d146f-4c2c-4a91-a3b0-9b0500867ce1
2022-09-07T06:52:21.527Z INFO volumereplica/volumereplica.go:308 {"eventcode": "cstor.volume.replica.create.success", "msg": "Successfully created CStor volume replica", "rname": "cstor-fb027a66-716a-4abf-b643-3a336cc3da6a/pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736"}
I0907 06:52:21.527245 8 handler.go:469] cVR creation successful: pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736-cstor-disk-pool-54zn, 7f1d146f-4c2c-4a91-a3b0-9b0500867ce1
I0907 06:52:21.527559 8 event.go:282] Event(v1.ObjectReference{Kind:"CStorVolumeReplica", Namespace:"openebs", Name:"pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736-cstor-disk-pool-54zn", UID:"7f1d146f-4c2c-4a91-a3b0-9b0500867ce1", APIVersion:"cstor.openebs.io/v1", ResourceVersion:"138980", FieldPath:""}): type: 'Normal' reason: 'Created' Resource created successfully
I0907 06:52:21.538547 8 event.go:282] Event(v1.ObjectReference{Kind:"CStorVolumeReplica", Namespace:"openebs", Name:"pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736-cstor-disk-pool-54zn", UID:"7f1d146f-4c2c-4a91-a3b0-9b0500867ce1", APIVersion:"cstor.openebs.io/v1", ResourceVersion:"138980", FieldPath:""}): type: 'Warning' reason: 'SyncFailed' failed to sync CVR error: unable to update snapshot list details in CVR: failed to get the list of snapshots: Output: failed listsnap command for cstor-fb027a66-716a-4abf-b643-3a336cc3da6a/pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736 with err 11
Error: exit status 11
I0907 06:52:21.563031 8 event.go:282] Event(v1.ObjectReference{Kind:"CStorVolumeReplica", Namespace:"openebs", Name:"pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736-cstor-disk-pool-54zn", UID:"7f1d146f-4c2c-4a91-a3b0-9b0500867ce1", APIVersion:"cstor.openebs.io/v1", ResourceVersion:"139013", FieldPath:""}): type: 'Warning' reason: 'SyncFailed' failed to sync CVR error: unable to update snapshot list details in CVR: failed to get the list of snapshots: Output: failed listsnap command for cstor-fb027a66-716a-4abf-b643-3a336cc3da6a/pvc-5a9e63ce-1c6d-4c53-bb7f-dd4782360736 with err 11
Error: exit status 11
How to solve
Upon debugging, found that the cStor operators use the v1beta1 version of the PodDisruptionBudget object in their codebase. That API version was deprecated in K8s 1.21 and completely removed in K8s 1.25. We need to upgrade the PodDisruptionBudget usage to policy/v1 in the codebase to enable cStor to work on K8s 1.25 and later versions.
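A minimal sketch of the kind of change needed (listPDBsForPools is an illustrative name, not the actual cstor-operators symbol; the real fix touches every PDB call site, including the create/get/delete paths):

package pdb

import (
	"context"

	policyv1 "k8s.io/api/policy/v1" // was: k8s.io/api/policy/v1beta1
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// listPDBsForPools lists PodDisruptionBudgets matching the pool label
// selector via the policy/v1 API, the only version served on K8s 1.25+.
func listPDBsForPools(clientset kubernetes.Interface, ns, poolSelector string) (*policyv1.PodDisruptionBudgetList, error) {
	// was: clientset.PolicyV1beta1().PodDisruptionBudgets(ns).List(...)
	return clientset.PolicyV1().PodDisruptionBudgets(ns).List(
		context.TODO(),
		metav1.ListOptions{LabelSelector: poolSelector},
	)
}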