docker-plugin: Could not provision second build - Jenkins master hangs and needs to be restarted

I set up a Docker Cloud with a template under Manage Jenkins/Configure System/Cloud, pointing it to the DOCKER_HOST URI and an image.

When I trigger the test job, the first build finishes successfully, but a second build of the same job, with no other actions or changes in between, is never started: it stays in the build queue as pending with the message "Waiting for next available executor".
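To confirm which items are stuck on this message, the build queue can be read through the Jenkins JSON API at `/queue/api/json`. The following is only a minimal sketch: the base URL passed to `fetch_queue` is a placeholder, the helper names are made up for illustration, and a secured instance would additionally need credentials, which are omitted here.

```python
import json
import urllib.request

# Substring of the queue message reported in this issue.
WAITING = "Waiting for next available executor"


def fetch_queue(base_url: str) -> dict:
    """Fetch the build queue as JSON.

    base_url is a placeholder such as "http://localhost:8080" (an assumption,
    not taken from the report). No authentication is attempted.
    """
    url = base_url.rstrip("/") + "/queue/api/json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def waiting_for_executor(queue: dict) -> list:
    """Return (task name, reason) pairs for queue items blocked on an executor."""
    blocked = []
    for item in queue.get("items", []):
        why = item.get("why") or ""
        if WAITING in why:
            name = item.get("task", {}).get("name", "?")
            blocked.append((name, why))
    return blocked
```

Calling `waiting_for_executor(fetch_queue(...))` with the master's URL right after triggering the second build should list the job if it is in the state described above.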

docker-plugin: 1.0.2
jenkins: 2.73.2
docker engine:

Client:
 Version:      17.09.0-ce
 API version:  1.31 (downgraded from 1.32)
 Go version:   go1.8.3
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:40:09 2017
 OS/Arch:      darwin/amd64

Server:
 Version:      17.07.0-ce
 API version:  1.31 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   8784753
 Built:        Tue Aug 29 17:41:43 2017
 OS/Arch:      linux/amd64
 Experimental: false

Stack trace (registry DNS name masked):

Oct 24, 2017 10:53:52 PM com.nirima.jenkins.plugins.docker.DockerCloud provision
 INFO: Asked to provision 1 slave(s) for: null
 Oct 24, 2017 10:53:52 PM com.nirima.jenkins.plugins.docker.DockerCloud provision
 INFO: Will provision 'registry2.****/aurea/eng.build/jenkins/jnlp-slave:3.10-1-alpine', for label: 'null', in cloud: 'DLB-1'
 Oct 24, 2017 10:53:52 PM com.nirima.jenkins.plugins.docker.DockerCloud addProvisionedSlave
 INFO: Provisioning 'registry2.*****/aurea/eng.build/jenkins/jnlp-slave:3.10-1-alpine' number '0' on 'DLB-1'; Total containers: '174'
 Oct 24, 2017 10:53:52 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
 INFO: Started provisioning Image of registry2.****/aurea/eng.build/jenkins/jnlp-slave:3.10-1-alpine from DLB-1 with 1 executors. Remaining excess workload: 0
 Oct 24, 2017 10:53:52 PM com.nirima.jenkins.plugins.docker.DockerCloud provisionFromTemplate
 INFO: Trying to run container for registry2.****/aurea/eng.build/jenkins/jnlp-slave:3.10-1-alpine
 Oct 24, 2017 10:54:16 PM hudson.TcpSlaveAgentListener$ConnectionHandler run
 INFO: Accepted JNLP4-connect connection #2 from /172.18.0.130:38480
 Oct 24, 2017 10:54:20 PM com.nirima.jenkins.plugins.docker.DockerCloud removeJobTemplate
 WARNING: Couldn't remove template for job with id: 9
 Oct 24, 2017 10:54:24 PM hudson.model.Run execute
 INFO: TestJob #3 main build action completed: SUCCESS
 Oct 24, 2017 10:54:24 PM com.nirima.jenkins.plugins.docker.DockerSlave _terminate
 INFO: Disconnected computer
 Oct 24, 2017 10:54:24 PM jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed
 WARNING: Computer.threadPoolForRemoting [#25] for DLB-1-ae2be65a0456 terminated
 java.nio.channels.ClosedChannelException
 	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
 	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
 	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
 	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
 	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
 	at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
 	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:311)
 	at hudson.remoting.Channel.close(Channel.java:1403)
 	at hudson.remoting.Channel.close(Channel.java:1356)
 	at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:708)
 	at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:96)
 	at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:626)
 	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 	at java.lang.Thread.run(Thread.java:745)
 
 Oct 24, 2017 10:54:40 PM com.nirima.jenkins.plugins.docker.DockerSlave$2 run
 INFO: Stopped container ae2be65a04566fd1a9375d8f05edba0a8644eaa764aeaad81fc46b41d7b74ac3
 Oct 24, 2017 10:54:40 PM com.nirima.jenkins.plugins.docker.DockerSlave$2 run
 INFO: Shutdowned slave for ae2be65a04566fd1a9375d8f05edba0a8644eaa764aeaad81fc46b41d7b74ac3
 Oct 24, 2017 10:54:41 PM com.nirima.jenkins.plugins.docker.DockerSlave$2 run
 INFO: Removed container ae2be65a04566fd1a9375d8f05edba0a8644eaa764aeaad81fc46b41d7b74ac3

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 22 (10 by maintainers)

Most upvoted comments

You’re right that plannedCapacitySnapshot is wrong here; I need to investigate how to force it to be updated. I’ve been able to reproduce this error after interrupting a build. Investigating …
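To illustrate why a wrong planned-capacity value could leave a build waiting, here is a toy model. It is not the Jenkins or docker-plugin code: the class, field and function names are hypothetical, and it assumes that "wrong" means the planned capacity still counts an agent that has already been removed.

```python
from dataclasses import dataclass


@dataclass
class ProvisionerState:
    """Toy stand-in for the counters a provisioner consults (hypothetical names)."""
    queue_length: int      # builds waiting in the queue
    idle_executors: int    # executors that are online and free
    planned_capacity: int  # executors believed to be on their way


def excess_workload(state: ProvisionerState) -> int:
    """Demand not covered by idle or planned executors, never negative."""
    covered = state.idle_executors + state.planned_capacity
    return max(0, state.queue_length - covered)


def should_provision(state: ProvisionerState) -> bool:
    """Provision only when some queued work is not already covered."""
    return excess_workload(state) > 0


# Second build queued after the first agent was removed.
# Assumption: the stale snapshot still counts that removed agent as planned.
stale = ProvisionerState(queue_length=1, idle_executors=0, planned_capacity=1)
fresh = ProvisionerState(queue_length=1, idle_executors=0, planned_capacity=0)

print(should_provision(stale))  # no new agent is requested, the build keeps waiting
print(should_provision(fresh))  # a new agent is requested
```

Under this assumption, the queued build looks already covered, so nothing new is provisioned until the counter is corrected. That would be consistent with a restart clearing the hang, but it remains a guess about the mechanism.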