rancher: Jenkins kubernetes plugin can't watch pods after 2.2.5 or 2.2.6 upgrade
Issue: After upgrading Rancher to 2.2.5 or 2.2.6, the Jenkins Kubernetes plugin can no longer run builds inside the cluster. The problem stems from a failure to watch the newly provisioned pod: the plugin can still create and destroy pods, but the watch request is rejected with HTTP 403.
Plugin output looks something like this:
Jul 24, 2019 3:12:24 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud
Building connection to Kubernetes kubernetes URL https://rancher.example.com/k8s/clusters/c-scffx namespace mobilebuilds
Jul 24, 2019 3:12:24 PM FINE org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud
Connected to Kubernetes kubernetes URL https://rancher.example.com/k8s/clusters/c-scffx/
Jul 24, 2019 3:12:24 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
Created Pod: mobilebuilds/androidbuild-l8zzt
Jul 24, 2019 3:12:24 PM FINE io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager runWatch
Connecting websocket ... io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@3a01ac53
Jul 24, 2019 3:12:24 PM WARNING io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1 onFailure
Exec Failure: HTTP 403, Status: 403 - null
java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229)
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:196)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Jul 24, 2019 3:12:24 PM FINE io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager scheduleReconnect
Submitting reconnect task to the executor
Jul 24, 2019 3:12:24 PM FINE io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager close
Force closing the watch io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@3a01ac53
Jul 24, 2019 3:12:24 PM FINE io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2 execute
Scheduling reconnect task
Jul 24, 2019 3:12:24 PM FINE io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager nextReconnectInterval
Current reconnect backoff is 1000 milliseconds (T0)
Jul 24, 2019 3:12:24 PM WARNING org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
Error in provisioning; agent=KubernetesSlave name: androidbuild-l8zzt, template=PodTemplate{inheritFrom='', name='AndroidBuild', namespace='mobilebuilds', slaveConnectTimeout=300, idleMinutes=1, label='java8_k', nodeSelector='', nodeUsageMode=EXCLUSIVE, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], containers=[ContainerTemplate{name='jnlp', image='docker.alexandria.example.com/mobilebuild/android-build:1.0.0.8', workingDir='/home/jenkins', command='', args='', ttyEnabled=true, resourceRequestCpu='', resourceRequestMemory='', resourceLimitCpu='', resourceLimitMemory='', livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@4d1d8b05}], yamls=[spec:
containers:
- name: jnlp
resources:
limits:
memory: "6Gi"
cpu: "6"]}
io.fabric8.kubernetes.client.KubernetesClientException
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onFailure(WatchConnectionManager.java:198)
at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571)
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Jul 24, 2019 3:12:24 PM FINER org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher
Removing Jenkins node: androidbuild-l8zzt
Entries in the audit log look like this:
{
  "auditID": "8bf4d6c2-eeed-40e7-9fef-67a15c4518e5",
  "requestURI": "/k8s/clusters/c-scffx/api/v1/namespaces/mobilebuilds/pods?fieldSelector=metadata.name%3Dandroidbuild-qdb0d&watch=true",
  "user": {},
  "method": "GET",
  "remoteAddr": "10.128.91.212:63935",
  "requestTimestamp": "2019-07-25T17:55:50Z",
  "responseTimestamp": "2019-07-25T17:55:50Z",
  "responseCode": 403,
  "requestHeader": {
    "Accept-Encoding": [
      "gzip"
    ],
    "Connection": [
      "Upgrade"
    ],
    "Origin": [
      "https://rancher.epic.com:-1"
    ],
    "Sec-Websocket-Key": [
      "uhirw9aNFMwpU1rLTUjnAw=="
    ],
    "Sec-Websocket-Version": [
      "13"
    ],
    "Upgrade": [
      "websocket"
    ],
    "User-Agent": [
      "okhttp/3.12.0"
    ]
  },
  "responseHeader": {
    "Content-Type": [
      "application/json"
    ]
  }
}
Steps to reproduce (least amount of steps as possible): Upgrade Rancher to 2.2.5 or 2.2.6 and use the Jenkins Kubernetes plugin to run build jobs in the cluster. This happens with versions 1.16 and 1.17 of the plugin. The 403s occur even if the user that Jenkins authenticates as is a Rancher-wide admin.
About this issue
- State: closed
- Created 5 years ago
- Reactions: 4
- Comments: 17 (4 by maintainers)
Fixed in version 1.18.2 of the Jenkins Kubernetes plugin: https://github.com/jenkinsci/kubernetes-plugin/releases
The workaround of adding the port is no longer necessary.
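The "add the port" workaround lines up with the Origin header in the audit log entry above, which ends in ":-1". The sketch below is only an illustration of how such a value can arise, assuming the client derives the Origin from the configured URL's port: java.net.URL.getPort() returns -1 when the URL has no explicit port. The class and method names (OriginPortSketch, naiveOrigin, safeOrigin) are hypothetical and are not taken from the plugin or from the fabric8 client, and the issue does not confirm that this is the exact code path involved.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class OriginPortSketch {

    // Hypothetical helper: parse a URL string, wrapping the checked exception.
    static URL parse(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Builds scheme://host:port directly from getPort().
    // getPort() returns -1 when the URL carries no explicit port.
    static String naiveOrigin(String spec) {
        URL url = parse(spec);
        return url.getProtocol() + "://" + url.getHost() + ":" + url.getPort();
    }

    // Falls back to the protocol's default port (443 for https) when none is given.
    static String safeOrigin(String spec) {
        URL url = parse(spec);
        int port = url.getPort() != -1 ? url.getPort() : url.getDefaultPort();
        return url.getProtocol() + "://" + url.getHost() + ":" + port;
    }

    public static void main(String[] args) {
        String implicit = "https://rancher.example.com/k8s/clusters/c-scffx";
        String explicit = "https://rancher.example.com:443/k8s/clusters/c-scffx";

        System.out.println(naiveOrigin(implicit)); // https://rancher.example.com:-1
        System.out.println(naiveOrigin(explicit)); // https://rancher.example.com:443
        System.out.println(safeOrigin(implicit));  // https://rancher.example.com:443
    }
}
```

Under that assumption, putting an explicit port in the Kubernetes URL configured in Jenkins (for example https://rancher.example.com:443/k8s/clusters/c-scffx) would avoid the ":-1" Origin on plugin versions before 1.18.2.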