concourse: unknown handle on repository.
Bug Report
I’m regularly encountering an unknown handle on the repository resource:
unknown handle: 479a6ba9-cc14-4d3b-74fe-7785d4ee9144
The repository is currently configured as follows:
- name: repository
  type: git
  source:
    branch: master
    password: ((git-pwd))
    uri: https://private.git.server/repo.git
    username: ((git-user))
  check_every: 24h
  webhook_token: <webhook_token>
Sometimes the unknown handle goes away after a manual trigger of the webhook, or when re-creating the workers. The webhook does not always work.
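For reference, a manual webhook trigger like the one mentioned above can be sketched as follows. This is a minimal sketch, assuming the documented Concourse webhook endpoint for resource checks; the ATC URL, team name, pipeline name, and token in the example are placeholders and not taken from this report. Only the resource name `repository` comes from the configuration above.

```shell
#!/bin/sh
# Sketch: manually trigger a resource check through the webhook endpoint.
# The ATC URL, team, pipeline and token values are placeholders (assumptions).

webhook_url() {
  # $1 = ATC base URL, $2 = team, $3 = pipeline, $4 = resource, $5 = token
  printf '%s/api/v1/teams/%s/pipelines/%s/resources/%s/check/webhook?webhook_token=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

trigger_check() {
  # POST with an empty body; --fail makes curl exit non-zero on HTTP errors,
  # which helps to notice when the webhook "does not always work".
  curl --fail --silent --show-error -X POST "$(webhook_url "$@")"
}

# Example call (not executed here):
#   trigger_check https://ci.example.com main my-pipeline repository "$WEBHOOK_TOKEN"
```

Checking curl's exit status makes it easier to tell a failed webhook call apart from a check that ran but still hit the unknown handle.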
- Concourse version: 3.6.0
- Deployment type (BOSH/Docker/binary): BOSH
- Infrastructure/IaaS: OpenStack
- Browser (if applicable): Any
- Did this used to work? Yes
About this issue
- State: closed
- Created 7 years ago
- Reactions: 2
- Comments: 18 (2 by maintainers)
@mvdkleijn I was not yet aware that BOSH-based environments have any of those issues; I thought it was just the Docker-based ones. The issue you have is known to be related to mixed persistence of worker state.
That means that, especially in Docker environments, when a worker is stopped or brought down and then started again, volumes like /worker-state are flushed, while the DB volume is persistent and holds partial state, e.g. for existing caches (or something like that). When the worker comes back alive, it could even have a different hostname in Docker environments, so you have to retire the “old” worker properly to book in the new one (even though the worker is not new).
As for your case: when you land a worker before stopping it, persist /worker-state, and do down/up, you will get these handle errors. When you do not land but persist /worker-state and do down/up, you get “file not found” issues. When you land and do not persist /worker-state, you get ‘handle’ issues.
The only way to properly “restart” a worker right now in a Docker environment is to retire the worker before shutting it down, and then register it again on start. Persisting /worker-state over restarts is not an option at all due to that bug.
I explained this briefly so you can maybe match it to the BOSH workflow you use. To fix the issue in Docker, if you migrate there, I use a trap wrapper to retire a worker before KILL:
https://github.com/EugenMayer/docker-image-concourseci-worker-solid
https://github.com/EugenMayer/docker-image-concourseci-worker-solid/blob/master/worker_wrapper.sh#L11
This basically wraps the worker with a simple script but uses concourse/concourse as the base image. Its usage is shown e.g. here: https://github.com/EugenMayer/concourseci-server-boilerplate/blob/master/docker-compose.yml#L49
Doing that, I could set up a proper Docker-based production stack without any of those issues anymore; I am using this with Rancher catalogs.
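The trap-wrapper idea can be sketched roughly like this. It is not a copy of the linked worker_wrapper.sh, only a minimal illustration under assumptions: `run_with_retire` and `RETIRE_CMD` are hypothetical names, and the actual worker and retire commands (for example `concourse worker` and `concourse retire-worker`, with whatever flags the deployment needs) have to be supplied by the image.

```shell
#!/bin/sh
# Sketch of a trap-based wrapper: retire the worker before the container stops.
# run_with_retire and RETIRE_CMD are hypothetical names used for illustration.
# RETIRE_CMD should hold the command that retires this worker.

run_with_retire() {
  # Start the worker command (all arguments) in the background so this
  # shell stays free to react to signals.
  "$@" &
  worker_pid=$!

  # On TERM/INT: run the retire command first, then stop the worker process.
  trap 'sh -c "${RETIRE_CMD:-true}"; kill -TERM "$worker_pid" 2>/dev/null || true' TERM INT

  # wait returns early (status > 128) when a trapped signal arrives.
  status=0
  wait "$worker_pid" || status=$?

  # If wait was interrupted, the worker may still be shutting down: reap it.
  wait "$worker_pid" 2>/dev/null || true

  trap - TERM INT
  return "$status"
}

# Example entrypoint usage (not executed here):
#   RETIRE_CMD='<command that retires this worker>' run_with_retire <worker command>
```

Running the worker in the background and calling `wait` matters here: a shell only runs trap handlers between commands, so a foreground worker would delay the retire step until after the worker had already been killed.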
This should be fixed by #2588 which will be in 5.0.0.