rancher: Jenkins kubernetes plugin can't watch pods after 2.2.5 or 2.2.6 upgrade

Issue: After upgrading Rancher to 2.2.5 or 2.2.6, the Jenkins Kubernetes plugin can no longer run builds inside the cluster. The plugin can still create and delete pods, but the watch on the newly provisioned pod fails with HTTP 403, so provisioning is aborted.

Plugin output looks something like this:

Jul 24, 2019 3:12:24 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud

Building connection to Kubernetes kubernetes URL https://rancher.example.com/k8s/clusters/c-scffx namespace mobilebuilds

Jul 24, 2019 3:12:24 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud

Connected to Kubernetes kubernetes URL https://rancher.example.com/k8s/clusters/c-scffx/

Jul 24, 2019 3:12:24 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch

Created Pod: mobilebuilds/androidbuild-l8zzt

Jul 24, 2019 3:12:24 PM FINE io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager runWatch

Connecting websocket ... io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@3a01ac53

Jul 24, 2019 3:12:24 PM WARNING io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1 onFailure

Exec Failure: HTTP 403, Status: 403 - null
java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
	at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:196)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)


Jul 24, 2019 3:12:24 PM FINE io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager scheduleReconnect

Submitting reconnect task to the executor

Jul 24, 2019 3:12:24 PM FINE io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager close

Force closing the watch io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@3a01ac53

Jul 24, 2019 3:12:24 PM FINE io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2 execute

Scheduling reconnect task

Jul 24, 2019 3:12:24 PM FINE io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager nextReconnectInterval

Current reconnect backoff is 1000 milliseconds (T0)

Jul 24, 2019 3:12:24 PM WARNING org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch

Error in provisioning; agent=KubernetesSlave name: androidbuild-l8zzt, template=PodTemplate{inheritFrom='', name='AndroidBuild', namespace='mobilebuilds', slaveConnectTimeout=300, idleMinutes=1, label='java8_k', nodeSelector='', nodeUsageMode=EXCLUSIVE, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], containers=[ContainerTemplate{name='jnlp', image='docker.alexandria.example.com/mobilebuild/android-build:1.0.0.8', workingDir='/home/jenkins', command='', args='', ttyEnabled=true, resourceRequestCpu='', resourceRequestMemory='', resourceLimitCpu='', resourceLimitMemory='', livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@4d1d8b05}], yamls=[spec:
  containers:
    - name: jnlp
      resources:
         limits:
           memory: "6Gi"
           cpu: "6"]}
io.fabric8.kubernetes.client.KubernetesClientException
	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onFailure(WatchConnectionManager.java:198)
	at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)


Jul 24, 2019 3:12:24 PM FINER org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher

Removing Jenkins node: androidbuild-l8zzt

An entry in the audit log for a failed watch looks like this:

{
  "auditID": "8bf4d6c2-eeed-40e7-9fef-67a15c4518e5",
  "requestURI": "/k8s/clusters/c-scffx/api/v1/namespaces/mobilebuilds/pods?fieldSelector=metadata.name%3Dandroidbuild-qdb0d&watch=true",
  "user": {},
  "method": "GET",
  "remoteAddr": "10.128.91.212:63935",
  "requestTimestamp": "2019-07-25T17:55:50Z",
  "responseTimestamp": "2019-07-25T17:55:50Z",
  "responseCode": 403,
  "requestHeader": {
    "Accept-Encoding": [
      "gzip"
    ],
    "Connection": [
      "Upgrade"
    ],
    "Origin": [
      "https://rancher.epic.com:-1"
    ],
    "Sec-Websocket-Key": [
      "uhirw9aNFMwpU1rLTUjnAw=="
    ],
    "Sec-Websocket-Version": [
      "13"
    ],
    "Upgrade": [
      "websocket"
    ],
    "User-Agent": [
      "okhttp/3.12.0"
    ]
  },
  "responseHeader": {
    "Content-Type": [
      "application/json"
    ]
  }
}
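Note that the `Origin` header in the audit entry ends in `:-1`. That is the value `java.net.URI.getPort()` (and `java.net.URL.getPort()`) returns when the address carries no explicit port. The following is a minimal sketch of how such a header could arise, assuming the client builds `Origin` by concatenating scheme, host, and port; that is an assumption about the client, not something confirmed in this issue. The class and method names are hypothetical.

```java
import java.net.URI;

final class OriginPortDemo {
    // Hypothetical helper: builds an Origin value by plain concatenation.
    // ASSUMPTION: this mimics what the websocket client might do; it is not
    // taken from the plugin or client source.
    static String naiveOrigin(String address) {
        URI uri = URI.create(address);
        // getPort() returns -1 when the address has no explicit port.
        return uri.getScheme() + "://" + uri.getHost() + ":" + uri.getPort();
    }

    public static void main(String[] args) {
        // No port in the address: the port component comes out as -1.
        System.out.println(naiveOrigin("https://rancher.example.com/k8s/clusters/c-scffx"));
        // Explicit port in the address: the port component is 443.
        System.out.println(naiveOrigin("https://rancher.example.com:443/k8s/clusters/c-scffx"));
    }
}
```

If the server validates `Origin` on websocket upgrades, a value with port `-1` would plausibly be rejected, which would fit the 403 on the watch while ordinary create and delete requests succeed.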

Steps to reproduce (as few steps as possible): Upgrade Rancher to 2.2.5 or 2.2.6 and use the Jenkins Kubernetes plugin to run build jobs in the cluster. This happens with versions 1.16 and 1.17 of the plugin. The 403s occur even if the user Jenkins authenticates as is a Rancher-wide admin.

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 4
  • Comments: 17 (4 by maintainers)

Most upvoted comments

Fixed in version 1.18.2 of the Jenkins Kubernetes plugin: https://github.com/jenkinsci/kubernetes-plugin/releases

The workaround of adding the port is no longer necessary.
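For anyone still on a plugin version before the fix, the port workaround presumably means putting an explicit port in the Kubernetes URL configured in Jenkins, so that the `Origin` sent on the watch request no longer ends in `:-1` as it does in the audit entry. The following is a minimal sketch of that rewrite, assuming the default ports 443 for `https` and 80 otherwise. The class and method names are hypothetical and not part of the plugin.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class ExplicitPort {
    // Hypothetical helper: returns the address with an explicit port,
    // adding the scheme's default port when none is present.
    static String withExplicitPort(String address) {
        URI uri = URI.create(address);
        if (uri.getPort() != -1) {
            return address; // a port is already present, leave it alone
        }
        // ASSUMPTION: only http and https addresses are expected here.
        int port = "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
        try {
            return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(), port,
                    uri.getPath(), uri.getQuery(), uri.getFragment()).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rewrite " + address, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withExplicitPort("https://rancher.example.com/k8s/clusters/c-scffx"));
    }
}
```

The resulting address (for example `https://rancher.example.com:443/k8s/clusters/c-scffx`) would then go into the plugin's Kubernetes URL field in place of the port-less one.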