cassandra-operator: Cassandra on GKE port forwarding connection refused

Describe the bug

I deployed the cassandra-operator on a GKE cluster with 3 Kubernetes nodes using the following configuration:

apiVersion: cassandraoperator.instaclustr.com/v1alpha1
kind: CassandraDataCenter
metadata:
  name: test-cluster-dc1
  labels:
    app: cassandra-performance

cluster: test-cluster
datacenter: dc1

spec:
  
  serviceAccountName: cassandra-performance
  
  nodes: 3

  cassandraImage: "gcr.io/cassandra-operator/cassandra-3.11.7:latest"
  sidecarImage: "gcr.io/cassandra-operator/cassandra-sidecar:latest"
  imagePullPolicy: Always

  resources:
    requests:
      cpu: "4"
      memory: 6Gi
    limits:
      cpu: "8"
      memory: 12Gi
  sidecarResources:
    limits:
      memory: 512Mi
    requests:
      memory: 512Mi

  dataVolumeClaimSpec:
    accessModes:
      - ReadWriteOnce
    storageClassName: ssd-expandable
    resources:
      requests:
        storage: 10Gi

  fsGroup: 999

  cassandraAuth:
    roleManager: CassandraRoleManager
    authenticator: AllowAllAuthenticator
    authorizer: AllowAllAuthorizer

  prometheusSupport: true

  optimizeKernelParams: true

  userConfigMapVolumeSource:
    name: test-cluster-dc1
    type: array
    items:
      - key: 100-jvm-memory-gc.options
        path: jvm.options.d/100-jvm-memory-gc.options
      - key: 100-cassandra-custom.yaml
        path: cassandra.yaml.d/100-cassandra-custom.yaml

---

apiVersion: v1
kind: ConfigMap
metadata:
  name: test-cluster-dc1
  labels:
    app: cassandra-performance
data:
  100-jvm-memory-gc.options: |
    -Xms6G
    -Xmx9G
    -Xmn512M
  100-cassandra-custom.yaml: |
    endpoint_snitch: GossipingPropertyFileSnitch

All Cassandra nodes in the datacenter start properly, without issues. Checking with nodetool returns this:

cassandra@cassandra-test-cluster-dc1-rack1-0:/$ nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.24.1.11  204.1 KiB  256          67.0%             a0e1ccc8-a3ab-4379-b38e-358ce7c7cbd5  rack1
UN  10.24.0.11  180.25 KiB  256          70.0%             38088136-01b9-40fa-9577-efa5945fb241  rack1
UN  10.24.2.16  159.06 KiB  256          63.0%             861f9b2f-9031-4c63-8cb9-00db3cd8e3a0  rack1

When I try to connect to Cassandra from my laptop (with clients such as TablePlus, JetBrains DataGrip, or plain cqlsh) through kubectl port forwarding to a pod

kubectl port-forward pod/cassandra-test-cluster-dc1-rack1-0 9042

or to the service

kubectl port-forward svc/cassandra-test-cluster-dc1-nodes 9042

I always get connection refused:

$ cqlsh localhost
Connection error: ('Unable to connect to any servers', {'::1:9042': ConnectionShutdown('Connection to ::1:9042 was closed',), '127.0.0.1:9042': ConnectionShutdown('Connection to 127.0.0.1:9042 was closed',)})

And in the terminal window where the port forwarding is running:

Forwarding from 127.0.0.1:9042 -> 9042
Forwarding from [::1]:9042 -> 9042
Handling connection for 9042
E0907 11:50:49.277682    8117 portforward.go:400] an error occurred forwarding 9042 -> 9042: error forwarding port 9042 to pod 639979598906171a9d3a840bd271fe5bb5ebe9b45c74c8ae1d9054e6eb37e28e, uid : failed to execute portforward in network namespace "/var/run/netns/cni-dfdd1737-86af-8b27-6730-b8788b19abd0": socat command returns error: exit status 1, stderr: "2020/09/07 09:50:49 socat[3823203] E connect(5, AF=2 127.0.0.1:9042, 16): Connection refused\n"
Handling connection for 9042
E0907 11:50:49.404167    8117 portforward.go:400] an error occurred forwarding 9042 -> 9042: error forwarding port 9042 to pod 639979598906171a9d3a840bd271fe5bb5ebe9b45c74c8ae1d9054e6eb37e28e, uid : failed to execute portforward in network namespace "/var/run/netns/cni-dfdd1737-86af-8b27-6730-b8788b19abd0": socat command returns error: exit status 1, stderr: "2020/09/07 09:50:49 socat[3823213] E connect(5, AF=2 127.0.0.1:9042, 16): Connection refused\n"
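
The socat part of these log lines indicates where the refusal happens: inside the pod's network namespace, when socat connects to 127.0.0.1:9042. A minimal sketch for pulling that target address and reason out of such a line follows. The regular expression is derived only from the two log lines above, and `parse_forward_error` is a hypothetical helper name, not part of kubectl or the operator.

```python
import re

# Matches the socat fragment of a kubectl port-forward error line, e.g.
#   E connect(5, AF=2 127.0.0.1:9042, 16): Connection refused
# The pattern is an assumption based on the log lines quoted in this report.
SOCAT_CONNECT = re.compile(
    r"connect\(\d+, AF=(?P<family>\d+) (?P<host>[\d.]+):(?P<port>\d+), \d+\): "
    r"(?P<reason>[A-Za-z ]+)"
)


def parse_forward_error(line):
    """Return the address socat tried inside the pod and why it failed.

    Returns None when the line contains no socat connect error.
    """
    match = SOCAT_CONNECT.search(line)
    if match is None:
        return None
    return {
        "address_family": int(match["family"]),
        "host": match["host"],
        "port": int(match["port"]),
        "reason": match["reason"].strip(),
    }
```

Applied to either error line above, the host field comes out as 127.0.0.1, which is the pod's own loopback interface and not its pod IP.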

To Reproduce

  1. Deploy Cassandra on GKE using the manifests provided above
  2. Use kubectl to port forward the Cassandra k8s service cassandra-test-cluster-dc1-nodes or a single pod
  3. Try to connect using cqlsh localhost
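
Step 3 can also be checked without cqlsh. The cqlsh output reports that the connection "was closed" rather than refused, which suggests the local forwarder accepts the connection and drops it once the pod-side connect fails. The sketch below sends an empty native-protocol OPTIONS frame and reports what comes back. It assumes native protocol v4 (supported by Cassandra 3.11), and `probe_cql` is a hypothetical helper name.

```python
import socket
import struct

# CQL native protocol v4 frame header: version, flags, stream id, opcode, body length.
# Opcode 0x05 is OPTIONS; a server answers with SUPPORTED (opcode 0x06).
OPTIONS_FRAME = struct.pack(">BBhBi", 0x04, 0x00, 0, 0x05, 0)
OPCODE_SUPPORTED = 0x06


def probe_cql(host="127.0.0.1", port=9042, timeout=3.0):
    """Classify what is listening on host:port.

    Returns "cql" if a SUPPORTED frame comes back, "refused" if nothing is
    listening, "closed" if the peer accepts and then drops the connection,
    "timeout" if no answer arrives in time, "unexpected" for any other reply.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(OPTIONS_FRAME)
            header = sock.recv(9)
    except ConnectionRefusedError:
        return "refused"
    except (ConnectionResetError, BrokenPipeError):
        return "closed"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"
    if not header:
        return "closed"
    if len(header) >= 5 and header[4] == OPCODE_SUPPORTED:
        return "cql"
    return "unexpected"


if __name__ == "__main__":
    # Run this while `kubectl port-forward ... 9042` is active in another terminal.
    print(probe_cql())
```

With the forwarding from step 2 running, a result of "closed" would match the behaviour described in this report, while "cql" would mean the native transport is reachable through the tunnel.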

Expected behavior

cqlsh, TablePlus, and DataGrip should be able to connect to the Cassandra cluster without issues.

I also tried a custom application I wrote in Go, but the behaviour is the same.

Environment

  • OS: macOS Catalina v10.15.6 with iTerm Build 3.4.0beta5
  • Kubernetes version: 1.16.13-gke.1
  • kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.0", GitCommit:"e19964183377d0ec2052d1f1fa930c4d7575bd50", GitTreeState:"clean", BuildDate:"2020-08-26T21:54:15Z", GoVersion:"go1.15", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.13-gke.1", GitCommit:"688c6543aa4b285355723f100302d80431e411cc", GitTreeState:"clean", BuildDate:"2020-07-21T02:37:26Z", GoVersion:"go1.13.9b4", Compiler:"gc", Platform:"linux/amd64"}
  • Go version: 1.14
  • Cassandra version: 3.11.6

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 48

Most upvoted comments

@bygui86 this is the PR (1) which solves both of your issues, IP and restore. (I haven't tested restore because I haven't had time, but most probably this is it.)

Please try it all out and let me know; after that I'll do a proper release.

Each commit in master produces images tagged “latest-dev”; these changes are in the images “cassandra-3.11.8” and “cassandra-4.0-beta2” (with that same tag). You can find the images here (2).

(1) https://github.com/instaclustr/cassandra-operator/pull/392
(2) http://console.cloud.google.com/gcr/images/cassandra-operator

Hi @smiklosovic, thanks for your PR! I'm currently on holiday; I will be back on Monday the 21st and will gladly try your improvements. Just to be sure, I have to test “cassandra-3.11.8:latest-dev”, right?