azure-pipelines-agent: Environment agent seems to be stuck in a running state

Agent Version and Platform

Version of your agent? (2.144.0/2.144.1/…): 2.193.1

OS of the machine running the agent? (OSX/Windows/Linux/…): Windows Server 2016 Datacenter

Azure DevOps Type and Version

dev.azure.com

If dev.azure.com, what is your organization name? Itron (https://dev.azure.com/Itron)

What’s not working?

Environment agents seem to get stuck in a running state even though the agent is idle and its job appears to be done. Subsequent jobs/deployments in the pipeline then end up stuck in a queued state. This happens on different agents, but it seems to happen every time we run.

The environment shows that an agent is still running: [image]

The pipeline thinks it is done but is waiting (queued) to start the next phase: [image]

Agent and Worker’s Diagnostic Logs

Logs are located in the agent’s _diag folder. The agent logs are prefixed with Agent_ and the worker logs are prefixed with Worker_. All sensitive information should already be masked out, but please double-check before pasting here.

Attached logs: Worker_20211025-163049-utc.log, Agent_20211022-231144-utc.log

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16 (6 by maintainers)

Most upvoted comments

@tinylabspace I had seen that, but the docs didn’t mention that tasks (only jobs/stages) couldn’t have a displayName with Deployment in it. Also, though it is a bit ambiguous, I believe the docs are saying that the actual name of the job/stage can’t contain deployment; the displayName is not mentioned.

We have managed to work around this issue by reducing the total number of steps in the stages. A few of our templates are called repeatedly with control parameters indicating which parts of the template should be run. We moved that logic from “conditions” to template syntax, so a step isn’t even compiled in if it is not used. That has allowed us to proceed without any more hangs.

Support has indicated that there are limits on the number of steps and that there are backlog items to resolve issues like ours, but they have not definitively correlated our issue with those backlog items.
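For anyone trying the same workaround, here is a minimal sketch of moving a step from a runtime condition to a compile-time template expression. The file name, the runIntegrationTests parameter, and the script step are all hypothetical placeholders, not taken from the pipeline discussed in this issue.

```yaml
# steps-template.yml (hypothetical file name)
parameters:
  - name: runIntegrationTests   # hypothetical control parameter
    type: boolean
    default: false

steps:
  # Before: the step is always compiled into the job and is only skipped
  # at runtime when the condition evaluates to false.
  # - script: echo "running integration tests"
  #   displayName: Integration tests
  #   condition: eq('${{ parameters.runIntegrationTests }}', 'true')

  # After: the ${{ if }} template expression is evaluated when the pipeline
  # is compiled, so the step is left out of the expanded pipeline entirely
  # when the parameter is false.
  - ${{ if eq(parameters.runIntegrationTests, true) }}:
    - script: echo "running integration tests"
      displayName: Integration tests
```

With the second form, each call to the template contributes only the steps it actually needs, which is how the total step count per stage goes down. Whether that count is what triggers the hang is not confirmed in this thread.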

I’m still following this thread. The user experience would improve if the YAML validation simply failed when a user tries to save a pipeline that uses a reserved word for a stage name.

If that can’t be done because users might be using an external editor to edit the pipeline, it would be better if the pipeline run failed immediately, possibly with a YAML parsing error, rather than being allowed to start and then hang on the Deploy stage with no meaningful feedback in the UI.

@anatolybolshakov I will upgrade and do some testing