k8ssandra-operator: K8SSAND-1443 ⁃ K8ssandraCluster deployment on OpenShift 4.10 does not complete

I am attempting to install a 3 DC deployment of Cassandra 4 across 3 OpenShift (OKD) 4.10 clusters.

The clusters are isolated from the internet and have pod-level (Layer 3) routing provided by Submariner. I have verified that Submariner allows traffic across the clusters.

The operator installation completes successfully. I am using Kustomize to install the operators.

I am using the first OpenShift cluster as both Control Plane and Data Plane.

The creation of the ClientConfig objects is successful.

A deployment of a 3 DC K8ssandraCluster does not proceed past the creation of the StatefulSet for the first DC.

All of the pods start up and pass their readiness checks, but the deployment then appears to get stuck.

The relevant versions are listed below, as well as the log entries from the k8ssandra-operator and the StatefulSet pods:

System Versions:

OpenShift: 4.10.0-0.okd-2022-03-07-131213
Kubernetes: v1.23.3-2003+e419edff267ffa-dirty
Submariner: 0.12.0
Certificate Manager: v1.8.0

Container Versions:

k8ssandra-operator: v1.0.1
cass-operator: v1.10.3
cass-management-api: 4.0.1
system-logger: v1.10.3
cass-config-builder: 1.0.4-ubi7

K8ssandraCluster:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: k8ssandra-cluster
spec:
  cassandra:
    serverVersion: ${CASS_VER}
    serverImage: ${PROXY_REGISTRY}/k8ssandra/cass-management-api:${CASS_VER}
    storageConfig:
      cassandraDataVolumeClaimSpec:
        storageClassName: rook-ceph-block
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
    config:
      jvmOptions:
        heapSize: 512M
    networking:
      hostNetwork: false
    datacenters:
      - metadata:
          name: dc1
        size: 3
        jmxInitContainerImage:
          registry: ${PROXY_REGISTRY}
          repository: k8ssandra
          name: busybox
          tag: ${BUSYBOX_VER}
      - metadata:
          name: dc2
        k8sContext: okd4-region-02
        size: 3
        jmxInitContainerImage:
          registry: ${PROXY_REGISTRY}
          repository: k8ssandra
          name: busybox
          tag: ${BUSYBOX_VER}
      - metadata:
          name: dc3
        k8sContext: okd4-region-03
        size: 3
        jmxInitContainerImage:
          registry: ${PROXY_REGISTRY}
          repository: k8ssandra
          name: busybox
          tag: ${BUSYBOX_VER}
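
The manifest above still contains the placeholders ${CASS_VER}, ${PROXY_REGISTRY} and ${BUSYBOX_VER}, so it has to be rendered before it is applied. The following is only a minimal sketch of that substitution, assuming the placeholders use shell-style `${NAME}` syntax. The registry host is a made-up value for illustration, and treating CASS_VER as 4.0.1 is an assumption based on the cass-management-api version listed above.

```python
from string import Template

# Fragment of the K8ssandraCluster manifest with its shell-style placeholders.
MANIFEST_FRAGMENT = """\
serverVersion: ${CASS_VER}
serverImage: ${PROXY_REGISTRY}/k8ssandra/cass-management-api:${CASS_VER}
"""


def render(text: str, values: dict) -> str:
    """Replace every ${NAME} placeholder in text with values[NAME]."""
    # Template.substitute raises KeyError when a placeholder has no value,
    # so an unset variable fails loudly instead of yielding a broken image name.
    return Template(text).substitute(values)


# Hypothetical values: the registry host is invented for illustration,
# and CASS_VER=4.0.1 is an assumption taken from the version list.
values = {"CASS_VER": "4.0.1", "PROXY_REGISTRY": "registry.example.internal"}
rendered = render(MANIFEST_FRAGMENT, values)
print(rendered)
```

Using `substitute` rather than `safe_substitute` is deliberate here: a missing variable stops the rendering instead of leaving a literal `${...}` in the image reference.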

Logs from k8ssandra-operator:

2022-04-11T11:49:10.281Z	INFO	controller.k8ssandracluster	Creating endpoints	{"reconciler group": "k8ssandra.io", "reconciler kind": "K8ssandraCluster", "name": "k8ssandra-cluster", "namespace": "k8ssandra-operator", "K8ssandraCluster": "k8ssandra-operator/k8ssandra-cluster", "CassandraDatacenter": "k8ssandra-operator/dc1", "K8SContext": "", "Endpoints": {"namespace": "k8ssandra-operator", "name": "k8ssandra-cluster-dc1-additional-seed-service"}}
2022-04-11T11:49:10.285Z	ERROR	controller.k8ssandracluster	Failed to create endpoints	{"reconciler group": "k8ssandra.io", "reconciler kind": "K8ssandraCluster", "name": "k8ssandra-cluster", "namespace": "k8ssandra-operator", "K8ssandraCluster": "k8ssandra-operator/k8ssandra-cluster", "CassandraDatacenter": "k8ssandra-operator/dc1", "K8SContext": "", "Endpoints": {"namespace": "k8ssandra-operator", "name": "k8ssandra-cluster-dc1-additional-seed-service"}, "error": "endpoints \"k8ssandra-cluster-dc1-additional-seed-service\" is forbidden: endpoint address 10.101.2.40 is not allowed"}
github.com/k8ssandra/k8ssandra-operator/controllers/k8ssandra.(*K8ssandraClusterReconciler).reconcile
	/workspace/controllers/k8ssandra/k8ssandracluster_controller.go:133
github.com/k8ssandra/k8ssandra-operator/controllers/k8ssandra.(*K8ssandraClusterReconciler).Reconcile
	/workspace/controllers/k8ssandra/k8ssandracluster_controller.go:87
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227
2022-04-11T11:49:10.297Z	INFO	controller.k8ssandracluster	updated k8ssandracluster status	{"reconciler group": "k8ssandra.io", "reconciler kind": "K8ssandraCluster", "name": "k8ssandra-cluster", "namespace": "k8ssandra-operator", "K8ssandraCluster": "k8ssandra-operator/k8ssandra-cluster"}
2022-04-11T11:49:10.297Z	ERROR	controller.k8ssandracluster	Reconciler error	{"reconciler group": "k8ssandra.io", "reconciler kind": "K8ssandraCluster", "name": "k8ssandra-cluster", "namespace": "k8ssandra-operator", "error": "endpoints \"k8ssandra-cluster-dc1-additional-seed-service\" is forbidden: endpoint address 10.101.2.40 is not allowed"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
       /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227
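
To pull just the failures out of a longer operator log, the entries can be filtered by level. This is a rough sketch, assuming every entry follows the layout seen in the excerpt above: timestamp, level, logger, message and a JSON object, separated by tabs. The helper names `parse_entry` and `errors` are hypothetical, and the sample lines are shortened copies of the entries above.

```python
import json


def parse_entry(line: str):
    """Return (timestamp, level, logger, message, fields), or None for other lines."""
    parts = line.rstrip("\n").split("\t", 4)
    if len(parts) != 5:
        return None  # stack-trace lines and blank lines do not have five fields
    timestamp, level, logger, message, payload = parts
    try:
        fields = json.loads(payload)
    except json.JSONDecodeError:
        return None
    return timestamp, level, logger, message, fields


def errors(lines):
    """Yield (timestamp, message, error text) for every ERROR entry."""
    for line in lines:
        entry = parse_entry(line)
        if entry and entry[1] == "ERROR":
            yield entry[0], entry[3], entry[4].get("error")


# Shortened sample built from the entries in the excerpt (assumed tab-separated).
error_fields = {
    "name": "k8ssandra-cluster",
    "namespace": "k8ssandra-operator",
    "error": 'endpoints "k8ssandra-cluster-dc1-additional-seed-service" is forbidden: '
             "endpoint address 10.101.2.40 is not allowed",
}
sample = [
    "\t".join(["2022-04-11T11:49:10.281Z", "INFO", "controller.k8ssandracluster",
               "Creating endpoints", json.dumps({"name": "k8ssandra-cluster"})]),
    "\t".join(["2022-04-11T11:49:10.297Z", "ERROR", "controller.k8ssandracluster",
               "Reconciler error", json.dumps(error_fields)]),
    "\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227",
]

for timestamp, message, error in errors(sample):
    print(timestamp, message, error)
```

On a saved log file the same generator could be fed with `errors(open(path))`, where `path` points at the captured operator output.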

Logs from k8ssandra-cluster StatefulSet pods:

INFO  [epollEventLoopGroup-7-3] 2022-04-11 11:52:05,583 NetworkTopologyStrategy.java:88 - Configured datacenter replicas are dc1:rf(1)
WARN  [epollEventLoopGroup-7-3] 2022-04-11 11:52:05,933 K8SeedProvider4x.java:58 - Seed provider couldn't lookup host k8ssandra-cluster-dc1-additional-seed-service
INFO  [epollEventLoopGroup-7-3] 2022-04-11 11:52:05,954 Keyspace.java:386 - Creating replication strategy none params KeyspaceParams{durable_writes=true, replication=ReplicationParams{class=org.apache.cassandra.locator.NetworkTopologyStrategy, dc1=1}}
INFO  [epollEventLoopGroup-7-3] 2022-04-11 11:52:05,955 NetworkTopologyStrategy.java:88 - Configured datacenter replicas are dc1:rf(1)

These log entries are continuous in all three StatefulSet pods:

WARN  [OptionalTasks:1] 2022-04-11 11:52:07,664 CassandraRoleManagerInterceptor.java:75 - CassandraRoleManager skipped default role setup: some nodes were not ready
INFO  [OptionalTasks:1] 2022-04-11 11:52:07,664 CassandraRoleManager.java:369 - Setup task failed with error, rescheduling
WARN  [OptionalTasks:1] 2022-04-11 11:52:17,666 CassandraRoleManagerInterceptor.java:75 - CassandraRoleManager skipped default role setup: some nodes were not ready
INFO  [OptionalTasks:1] 2022-04-11 11:52:17,667 CassandraRoleManager.java:369 - Setup task failed with error, rescheduling

Logs from cass-operator:

2022-04-11T11:52:06.051Z	DEBUG	events	Normal	{"object": {"kind":"CassandraDatacenter","namespace":"k8ssandra-operator","name":"dc1","uid":"200a12e1-eeeb-4ed1-9d98-c2420cd9055b","apiVersion":"cassandra.datastax.com/v1beta1","resourceVersion":"5521769"}, "reason": "CreatedUsers", "message": "Created users"}
2022-04-11T11:52:06.051Z	DEBUG	events	Normal	{"object": {"kind":"CassandraDatacenter","namespace":"k8ssandra-operator","name":"dc1","uid":"200a12e1-eeeb-4ed1-9d98-c2420cd9055b","apiVersion":"cassandra.datastax.com/v1beta1","resourceVersion":"5521769"}, "reason": "CreatedSuperuser", "message": "Created superuser"}
2022-04-11T11:52:06.065Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	reconcile_racks::listPods	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator", "namespace": "k8ssandra-operator", "datacenterName": "dc1", "clusterName": "k8ssandra-cluster"}
2022-04-11T11:52:06.066Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	calling Management API features - GET /api/v0/metadata/versions/features	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator", "pod": "k8ssandra-cluster-dc1-default-sts-0"}
2022-04-11T11:52:06.066Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	client::callNodeMgmtEndpoint	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator"}
2022-04-11T11:52:06.083Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	client::callIsFullQueryLogEnabledEndpoint	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator"}
2022-04-11T11:52:06.083Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	client::callNodeMgmtEndpoint	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator"}
2022-04-11T11:52:06.094Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	calling Management API features - GET /api/v0/metadata/versions/features	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator", "pod": "k8ssandra-cluster-dc1-default-sts-1"}
2022-04-11T11:52:06.094Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	client::callNodeMgmtEndpoint	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator"}
2022-04-11T11:52:06.107Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	client::callIsFullQueryLogEnabledEndpoint	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator"}
2022-04-11T11:52:06.107Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	client::callNodeMgmtEndpoint	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator"}
2022-04-11T11:52:06.121Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	calling Management API features - GET /api/v0/metadata/versions/features	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator", "pod": "k8ssandra-cluster-dc1-default-sts-2"}
2022-04-11T11:52:06.121Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	client::callNodeMgmtEndpoint	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator"}
2022-04-11T11:52:06.137Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	client::callIsFullQueryLogEnabledEndpoint	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator"}
2022-04-11T11:52:06.137Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	client::callNodeMgmtEndpoint	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator"}
2022-04-11T11:52:06.181Z	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	All StatefulSets should now be reconciled.	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "k8ssandra-operator", "namespace": "k8ssandra-operator", "datacenterName": "dc1", "clusterName": "k8ssandra-cluster"}
2022-04-11T11:52:06.181Z	INFO	controllers.CassandraDatacenter	Reconcile loop completed	{"cassandradatacenter": "k8ssandra-operator/dc1", "requestNamespace": "k8ssandra-operator", "requestName": "dc1", "loopID": "46294b95-7abf-4b90-9409-8b0da1ace759", "duration": 0.256523188}

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 18 (8 by maintainers)

Most upvoted comments

The warning about the additional seeds service is just noise. Once k8ssandra-operator creates the Endpoints, that warning should go away.

I’ll try testing with Kubernetes 1.23. Also, feel free to ping me either in the Kubernetes Slack or in the K8ssandra Discord (https://discord.com/invite/qP5tAt6Uwt).