Open
Description
/kind bug
What steps did you take and what happened:
clusterctl generate cluster capz-private --infrastructure azure --kubernetes-version v1.31.4 --control-plane-machine-count=1 --worker-machine-count=1
- Then I add on the below to the generated yaml file
image:
computeGallery:
gallery: "ClusterAPI-f72ceb4f-5159-4c26-a0fe-2ea738f0d019"
name: "capi-ubun2-2404"
version: "1.31.4"
networkSpec:
privateDNSZoneName: "capz-private.capz.io"
apiServerLB:
type: Internal
frontendIPs:
- name: lb-private-ip-frontend
privateIP: 172.19.2.100
controlPlaneOutboundLB:
frontendIPsCount: 1
nodeOutboundLB:
frontendIPsCount: 1
subnets:
- name: control-plane-subnet
role: control-plane
cidrBlocks:
- 172.19.2.0/25
- name: node-subnet
role: node
cidrBlocks:
- 172.19.2.128/25
vnet:
name: vnet-capz-private
cidrBlocks:
- 172.19.2.0/24
peerings:
- resourceGroup: rg-name
remoteVnetName: vnet-capi-mgmt-stage
- The cluster is
Provisioned
but: I see 2 errors (below), KubeAdmControlPlane not initialized
E0116 16:26:38.836076 1 cluster_accessor.go:257] "Connect failed" err="error creating HTTP client and mapper: cluster is not reachable: Get \"https://apiserver.capz-private.capz.io:6443/?timeout=5s\": context deadline exceeded" controller="clustercache" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster" Cluster="default/capz-private" namespace="default" name="capz-private" reconcileID="4ec298dc-f392-4809-9426-d35cf8bfa55d"
E0116 16:31:31.645375 1 controller.go:324] "Reconciler error" err=<failed to reconcile AzureMachine: failed to reconcile AzureMachine service vmextensions: extension state failed. This likely means the Kubernetes node bootstrapping process failed or timed out. Check VM boot diagnostics logs to learn more: failed to create or update resource thi-poc-capz-private/CAPZ.Linux.Bootstrapping (service: vmextensions): GET https://management.azure.com/subscriptions/9d54088e-89a6-4a3a-8f91-05699ba3c93a/providers/Microsoft.Compute/locations/northcentralus/operations 6921441d-f760-44e3-8db1-b7aa18c05a5e -------------------------------------------------------------------------------- RESPONSE 200: 200 OK
ERROR CODE: VMExtensionProvisioningError -------------------------------------------------------------------------------- "code": "VMExtensionProvisioningError",
"message": "VM has reported a failure when processing extension 'CAPZ.Linux.Bootstrapping' (publisher 'Microsoft.Azure.ContainerUpstream' and type 'CAPZ.Linux.Bootstrapping'). Error message: 'Enable failed: failed to execute command: command terminated with exit status=1\n[stdout]\n\n[stderr]\n'. More information on troubleshooting is available at https://aka.ms/vmextensionlinuxtroubleshoot.
What did you expect to happen:
- KubeAdmControlPlane gets initialized, CAPZ.Linux.Bootstrapping executes successfully
Anything else you would like to add:
- The logs mentioned in https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/features-linux?tabs=azure-cli#troubleshoot-vm-extensions does not give any additional info..at least to me it doesnt..i can share it if requested
- VNET, Subnets, and Network security groups do not pre-exist..CAPI creates them
- Even though the bootstrapping does not execute successfully, I can do
clusterctl get kubeconfig
and dokubectl get po
to prove that the API server is functioning and there is not network connectivity issue - Compared to the cluster that I don't specify the API SERVER LB as
type: Internal
, I noticed thatkube-proxy
is missing..may be that is related to the boostrapping - Using the same node image, I can create and have a functioning cluster if I don't specify the API SERVER LB as
type: Internal
Environment:
Metadata
Metadata
Assignees
Labels
Type
Projects
Status
Todo