azure-pipelines-agent: $PATH does not contain paths defined on /etc/environment

Agent Version and Platform

Version of your agent? 2.187.2, 2.188.3

OS of the machine running the agent? Ubuntu 18.04, Ubuntu 20.04

Azure DevOps Type and Version

dev.azure.com (Cloud)

What’s not working?

Issue created after a request from the contributors of the actions/virtual-environments repository (see: https://github.com/actions/virtual-environments/issues/3695), who said this issue is probably related to the agents and not the images themselves.

Description

When running a task on a self-hosted Ubuntu build agent (on either a manual or an automatic scale set), running echo $PATH returns only a very limited set of paths. Connecting to the agent via SSH and running echo $PATH directly, however, returns the full set of paths in the PATH variable.

Virtual environments affected

  • Ubuntu 16.04
  • Ubuntu 18.04
  • Ubuntu 20.04
  • macOS 10.15
  • macOS 11
  • Windows Server 2016
  • Windows Server 2019

Image version and build link

(using the https://github.com/actions/virtual-environments repository)

Ubuntu 18 Image version: releases/ubuntu18/20210606 commit 58b026cedf2363aee66fcdde3981b09704d5bd79 Agent version: 2.187.2

Ubuntu 20 Image version: releases/ubuntu20/20210606 commit a26b241d4791b9af60f069b29c6e993595d75349 Agent version: 2.188.3

Packer: 1.6.2, 1.7.2

Expected behavior

Running a shell script task with echo $PATH on a self-hosted Ubuntu agent should return all of the paths defined inside the /etc/environment file:

PATH=/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:$HOME/.local/bin:/opt/pipx_bin:/usr/share/rust/.cargo/bin:$HOME/.config/composer/vendor/bin:/usr/local/.ghcup/bin:$HOME/.dotnet/tools:/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin

Actual behavior

Running a shell script task with echo $PATH on a self-hosted Ubuntu agent returns only a very limited set of paths:

/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

If I perform the following steps, $PATH gets populated as expected, though:

  1. Connect to the virtual machine
  2. Open the agent folder /opt/vsts/a1
  3. Run the following commands:
./env.sh
sudo ./svc.sh stop
sudo ./svc.sh start

Repro steps

  1. Build a VM image using the Ubuntu image versions provided
  2. Create a virtual machine on a scaleset using the newly generated image (either manual or automatic scaleset via Azure DevOps)
  3. (Manual scale set only) Install a VSTS agent on the virtual machine using a release from the azure-pipelines-agent repo (https://github.com/microsoft/azure-pipelines-agent)
  4. Create a new pipeline (ex.: YAML) and run script echo $PATH

Agent and Worker’s Diagnostic Logs

Agent_20210714-172015-utc.log Worker_20210714-155257-utc.log

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 41 (12 by maintainers)

Most upvoted comments

Why was this closed? This is still a major issue with self-hosted agents. The PATH is read from /etc/sudoers and not from /etc/environment.

You can’t expect us to edit /etc/sudoers in our custom image just for ADO. And adding a step to every pipeline to source /etc/environment is not practical.

The problem lies in the ADO agent extension - and should be fixed there.

We also ran into this problem with ubuntu2204. Even though I like the workaround from @HoLengZai, it's still a workaround. We would prefer a permanent solution from Microsoft.

@HoLengZai You made my day. Thank you very much for your work.

@HoLengZai: Thanks for the script, it saved me a lot of headache after I fell face down into troubleshooting this issue. I also do not understand the difficulty of getting MS to change the install script of the VMSS extension that gets downloaded and run during agent onboarding.

Can confirm your approach makes the PATH correctly populated when running logic through the pipeline. I also use runner-images MS repo for creating the custom vmss image to use, and had some frustration figuring out why I could not call many of the tools directly in the pipeline.

Here is my final script, without comments and with unused commands removed:

sudo useradd -m AzDevOps
sudo usermod -a -G docker AzDevOps
sudo usermod -a -G adm AzDevOps
sudo usermod -a -G sudo AzDevOps

sudo chmod -R +r /home
setfacl -Rdm "u:AzDevOps:rwX" /home
setfacl -Rb /home/AzDevOps

sudo su -c "echo 'AzDevOps  ALL=(ALL) NOPASSWD:ALL' | sudo tee /etc/sudoers.d/01_AzDevOps && chmod 0440 /etc/sudoers.d/01_AzDevOps"

# Must be done after AzDevOps user creation
sudo su -c "find /opt/post-generation -mindepth 1 -maxdepth 1 -type f -name '*.sh' -exec bash {} \;"

pathFromEnv=$(cut -d= -f2 /etc/environment | tail -1)

mkdir /agent && chmod 775 /agent
echo $pathFromEnv > /agent/.path && chmod 444 /agent/.path
echo "PATH=$pathFromEnv" > /agent/.env && chmod 644 /agent/.env
chown -R AzDevOps:AzDevOps /agent

chattr +i /agent/.path
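One fragile spot in this script is the pathFromEnv extraction: cut -d= -f2 truncates the value at the second '=' and tail -1 simply assumes PATH is on the last line of /etc/environment. A slightly more defensive sketch (read_path_from is a hypothetical helper, not part of the extension or the script above):

```shell
#!/bin/bash
# Hypothetical helper: print the PATH value from an /etc/environment-style
# file. Unlike `cut -d= -f2`, it keeps values containing '=' intact,
# matches only the PATH line, and strips optional surrounding quotes.
read_path_from() {
  grep -E '^PATH=' "$1" | tail -n 1 | sed -e 's/^PATH=//' -e 's/^"//' -e 's/"$//'
}

# usage: pathFromEnv=$(read_path_from /etc/environment)
```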

In my Terraform VMSS module, the extension block looks like this. This is how I push my custom script so that it runs before the VM pipeline agent extension:

  extension = {
    name                 = var.extension["name"]
    publisher            = var.extension["publisher"]
    type                 = var.extension["type"]
    type_handler_version = var.extension["type_handler_version"]
    settings = jsonencode({
      "script" = base64encode(data.local_file.sh.content)
    })
  }

  # -
  # - Custom Scripts
  # -
  data "local_file" "sh" {
    #filename = "${path.module}/files/${var.vmss_linux_script_filename}"
    filename = join("/", [path.module, "files", var.vmss_linux_script_filename])
  }

I’ll close the issue for now, will reopen if something arises. Feel free to ping me here or at v-ikuleshov@microsoft.com

I don’t agree with the statement “it’s specific for you”. We use this Microsoft documentation without any customization 🙂

For Windows agents, all is/seems good ✔

For the Linux agent, $PATH is provisioned with the values from /etc/sudoers instead of /etc/environment.

The agent installation is done on the Microsoft side (see the documentation below) and it is not working out of the box. I think there is an issue here (maybe not related to the agent itself), but @marcuslopes was redirected to this repo from actions/virtual-environments#3695.

We have done one more test on our side. We added a Custom Script Extension on the VMSS, then:

  • Read the $PATH variable (values from /etc/sudoers are provisioned)
  • Executed the source command
  • Read the $PATH variable again (values still come from /etc/sudoers, not from /etc/environment)
  • If we connect to the VM with the user, the $PATH is correctly provisioned.

This test shows a potential issue with the extension on the VMSS (note: Microsoft uses an extension to install the agent). As I indicated at the beginning of my reply, we use the Microsoft documentation without any customization. So the feature (Elastic Pools) is not working out of the box with Linux images, possibly due to a specific extension issue.

Microsoft doesn't have this issue with their own Hosted Pools, so there are two possibilities:

  • You don't use the Elastic Pools process internally
  • You use the Elastic Pools process internally but have customized something

Can you find the missing information on your side, or redirect the issue to the team responsible for the Elastic Pools feature?

Thank you!

Hi @kuleshovilya,

We use the 'Azure Virtual Machine Scale Set' agent pool type. There is nothing custom on our part in the VMSS extensions. The extension in place is entirely configured by Azure DevOps itself (extension name: Microsoft.Azure.DevOps.Pipelines.Agent). Just like @AlexrDev mentioned, we also noticed that the path provisioned in the /agent/.path file is the same as the secure_path value inside /etc/sudoers.

In our process to create the generalized image used in our VMSS, we first build a generalized image with Packer and push it to an image gallery.

From there,

  • Create a specialized VM
  • Run the post-generation scripts (as recommended by the Microsoft documentation) via a Custom Script Extension installed on the VM
  • Proceed with generalization and versioning:
      • Run the command sudo waagent -deprovision
      • Shut down the VM
      • Generalize the VM
      • Capture the VM
  • Use this image as a reference in our VMSS

We finally found a way to fix the current issue (as a workaround):

  • Create the AzDevOps user beforehand, using parts of the enableagent scripts as inspiration (this additional step is done on the specialized VM BEFORE running the post-generation scripts)
  • Modify the sudoers file so that secure_path equals the PATH value in /etc/environment (this additional step is done on the specialized VM AFTER running the post-generation scripts)

Here is the specialized vm extension custom script

#!/bin/bash

sudo useradd -m AzDevOps
sudo usermod -a -G docker AzDevOps
sudo usermod -a -G adm AzDevOps
sudo usermod -a -G sudo AzDevOps

sudo chmod -R +r /home
sudo setfacl -Rdm "u:AzDevOps:rwX" /home
sudo setfacl -Rb /home/AzDevOps
# Note: 'sudo echo ... >> /etc/sudoers' would not work (the redirection runs
# unprivileged), so pipe through tee -a instead
echo 'AzDevOps ALL=NOPASSWD: ALL' | sudo tee -a /etc/sudoers

# Run the post-generation scripts as root
# (an interactive 'sudo su AzDevOps' here would swallow the rest of the script)
sudo find /opt/post-generation -mindepth 1 -maxdepth 1 -type f -name '*.sh' -exec bash {} \;

# Remove the secure_path line
sudo sed -i.bak '/secure_path/d' /etc/sudoers

# Add secure_path back with the /etc/environment PATH
pathFromEnv=$(cut -d= -f2 /etc/environment | tail -1)
echo "Defaults secure_path=\"$pathFromEnv\"" | sudo tee -a /etc/sudoers
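Since a syntax error in /etc/sudoers can lock sudo out entirely, it may be safer to generate the edited file first and validate it with visudo -c before installing it. A minimal sketch under that assumption (rewrite_secure_path is a hypothetical helper, not part of the script above):

```shell
#!/bin/bash
# Hypothetical helper: print a sudoers file with its secure_path line
# replaced by a new value. The caller validates the result with
# `visudo -c -f` and installs it only if validation succeeds, so a typo
# can never lock sudo out.
rewrite_secure_path() {
  sed '/secure_path/d' "$1"          # drop the existing secure_path line(s)
  printf 'Defaults secure_path="%s"\n' "$2"
}

# usage (as root / with sudo):
#   rewrite_secure_path /etc/sudoers "$pathFromEnv" > /tmp/sudoers.new
#   visudo -c -f /tmp/sudoers.new && sudo install -m 0440 /tmp/sudoers.new /etc/sudoers
```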

We tried various ways of creating files in /etc/sudoers.d/ and modifying env_keep, env_reset, disabling secure_path, etc. While the behavior seemed OK when connecting to the specialized VM, and secure_path was either removed or replaced in the sudoers.d files, the Azure extension always wrote the default secure_path into the .path file on the resulting VMSS.

Forcing the right path into secure_path is the only thing that worked, which points to odd behavior when using Azure extensions on a VM/VMSS.

When we use this new image version for the vmss, the problem is fixed.

*This is a workaround and not something we consider an official fix, since it modifies the sudoers file directly.

We also noticed a difference on the machines from Azure Pipelines pools. The configuration for the vsts user (vsts ALL=(ALL) NOPASSWD=ALL) is not in the sudoers file but in the file vsts in the /etc/sudoers.d/ directory. The Azure DevOps extension adds this configuration for the AzDevOps user directly to the sudoers file. There seem to be slight differences between the way the vsts user is configured and the way the AzDevOps user is. Maybe this is part of the reason why we have this issue.

Thank you for your support on this issue.

By the way I work in the same team as @marcuslopes and @ChristopheLav

@kuleshovilya This is not a use-case-specific problem of @ChristopheLav's. The workarounds suggested in this thread mean that there is no way to create a reusable scale set image with custom software for agents in an agent pool. In our case, we need Ansible installed on the agent, which is installed via pipx, and therefore the /opt/pipx_bin directory needs to be appended to the path. When installing this at image build time, the PATH variable is updated, but it is then overwritten when the agent software is installed by ADO, causing the incorrect $PATH in the .path file.

Please re-open this issue as it is a problem created by Microsoft’s approach to custom image use, rather than a user specific scenario

Thanks @ChristopheLav, @marcuslopes, @Jean-FrancoisBeaudet Your investigation saved me a bunch of time.

I think I found a workaround (by using the chattr command in the custom script). I know this issue is closed (even though I don't think it should be) and a bit old. It would be great and much easier if the Microsoft dev team could publish the git repo of this extension. It would be much easier to figure out the issue and post it to the correct team directly, as we don't know which dev team developed this VM extension that installs the pipeline agent. I tried everything to trick the VM extension (especially the file enableagent.sh, which calls config.sh, which sources .env, which generates the .path file).

@tonyskidmore, I looked at your workaround but I agree with @ChristopheLav: we should not change the sudoers file. As mentioned in the official Microsoft repo: https://github.com/actions/runner-images#about (aka: https://learn.microsoft.com/en-us/azure/devops/pipelines/agents/scale-set-agents?view=azure-devops#where-can-i-find-the-images-used-for-microsoft-hosted-agents)

I run plenty of pipelines on the Microsoft-hosted "Azure Pipelines" pool, and indeed the two images (MS-hosted and the one built from the official runner-images repo with Packer) are the same; however, the way the pipeline agent is deployed is totally different. For example, MS-hosted creates the user account vsts and puts the agent in /home/vsts/agents/<x.y.z>/, while the VM pipeline extension creates the user AzDevOps and puts the agent in /agent/.

MS-hosted also runs the pipeline agent as a systemd service. That's not the case with the VM pipeline extension. I don't really know how it runs, but I don't see any vsts service in /etc/systemd/system/*vsts*.

So here is my workaround, and I hope it will help others: I use the Custom Script Extension to create the /agent/.path file (the file causing the issue) before the VM pipeline agent extension runs. I know that the VM pipeline agent extension will try to overwrite the /agent/.path file, as the extension runs through the AzDevOps account (in theory) as superuser. As mentioned in the Microsoft docs, the VM pipeline agent extension always runs last (or at least after the Custom Script Extension).

(Aside: I still don't understand why $PATH = /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin. I played with the /etc/sudoers.d folder, runuser config, .bashrc, .profile, .bash_profile (non-interactive shell, non-login shell, login shell, etc.). I tried every combination to get the regular $PATH (= /etc/environment); I always got the secure_path PATH from sudoers.)

Anyway, so here the content of the custom script:

I create the AzDevOps user instead of letting the VM pipeline extension do it, as enableagent.sh checks whether the account already exists. The main reason is to make sure the scripts in /opt/post-generation run after the AzDevOps user is created, so /etc/environment will contain AzDevOps instead of the default user name or $HOME. (@tonyskidmore, that's one thing your workaround does not do, so your /etc/environment contains the default user name instead of AzDevOps. I don't know if that causes any issue, but if you look at the MS-hosted one, the PATH contains the vsts user and not the default user name.)

# copied from enableagent.sh script
sudo useradd -m AzDevOps
sudo usermod -a -G docker AzDevOps
sudo usermod -a -G adm AzDevOps
sudo usermod -a -G sudo AzDevOps

sudo chmod -R +r /home
setfacl -Rdm "u:AzDevOps:rwX" /home
setfacl -Rb /home/AzDevOps

# I do not want to modify the sudoers file like on the MS-hosted agent, so I create the config in sudoers.d
# As recommended in the sudoers man page, I chmod the files to 0440
# NOPASSWD:ALL should be enough too; SETENV can be removed
sudo su -c "echo 'AzDevOps  ALL=(ALL) NOPASSWD:SETENV: ALL' | sudo tee /etc/sudoers.d/02_keep_env_for_AzDevOps && chmod 0440 /etc/sudoers.d/02_keep_env_for_AzDevOps"
# I think this one is useless; the idea is to disable env_reset for that particular user (AzDevOps),
# but it doesn't work: I always get the secure_path from the sudoers file
sudo su -c "echo 'Defaults:AzDevOps    !env_reset' | sudo tee /etc/sudoers.d/01_keep_env_for_AzDevOps && chmod 0440 /etc/sudoers.d/01_keep_env_for_AzDevOps"

# copied from "https://github.com/actions/runner-images/blob/main/docs/create-image-and-azure-resources.md#ubuntu"
sudo su -c "find /opt/post-generation -mindepth 1 -maxdepth 1 -type f -name '*.sh' -exec bash {} \;"

# I believed putting something in profile.d would help, but it doesn't change anything, so this line can be removed too
sudo su -c "echo 'source /etc/environment' > /etc/profile.d/agent_env_vars.sh && chmod 755 /etc/profile.d/agent_env_vars.sh"

# The three lines below are useless: they don't change anything for the AzDevOps PATH when I run a pipeline
echo "source /etc/environment" >> /home/AzDevOps/.bashrc
echo "source /etc/environment" >> /home/AzDevOps/.profile
echo "source /etc/environment" >> /home/AzDevOps/.bash_profile

# Thanks @tonyskidmore,  I reuse your line :)
pathFromEnv=$(cut -d= -f2 /etc/environment | tail -1)

# Create the /agent folder with the same permissions the VM pipeline agent extension will use after this custom script
mkdir /agent && chmod 775 /agent
# Put the proper PATH in the .path file, with the same permissions the VM pipeline agent extension will use
echo "$pathFromEnv" > /agent/.path && chmod 444 /agent/.path
# Do the same for /agent/.env; the VM pipeline agent extension appends to that file instead of overwriting it (unlike /agent/.path)
echo "PATH=$pathFromEnv" > /agent/.env && chmod 644 /agent/.env
# Change the ownership and group of the whole /agent folder (recursively with -R)
sudo -E su -c 'chown -R AzDevOps:AzDevOps /agent'

# THE WORKAROUND!!! : make the /agent/.path immutable, so the VM pipeline agent extension won't be able to overwrite it even though the extension runs as root
# https://www.golinuxcloud.com/restrict-root-directory-extended-attributes/
chattr +i /agent/.path

Then I use Bastion to check that it's really immutable (screenshot omitted).

Then I run a pipeline that prints my PATH and runs ansible --version, and it works! (screenshot omitted)
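For anyone reproducing this, the immutable bit can also be verified from a script rather than over Bastion; a small sketch using lsattr (is_immutable is a hypothetical helper, and /agent/.path is the file from the script above):

```shell
#!/bin/bash
# Hypothetical helper: check whether the immutable bit is set on a file.
# lsattr prints the attribute flags in the first column; an 'i' in that
# column means the file is immutable (set via `chattr +i`).
is_immutable() {
  lsattr -d "$1" 2>/dev/null | awk '{print $1}' | grep -q i
}

# usage: is_immutable /agent/.path && echo "protected from the extension"
```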

I have been working on a three part blog series on Azure DevOps Self-Hosted VMSS Agents. I looked at this issue again while working through that and came up with a workaround that seems to work for me, based on various comments in the history of this issue. Specifically, in the PATH issue in Part 2 I mention what I did to workaround this. Would be interested to get any feedback.

This is still the case and should be fixed!

@ChristopheLav Sure, left a comment in the discussion

Once again, please check the /etc/environment setup and usage; the way you describe it working is fine and how it's supposed to be. So far, we can only recommend adding source to your pipelines, or something else that customizes the behaviour.

It’s not a valid workaround.

It's not required with Azure Pipelines, so why is it required with Elastic Pools? Since MS uses the same master images, there must be a custom, undocumented configuration step here. I would like to know this step.

Unless everyone adapts something, Elastic Pools don't work out of the box. The documentation needs to be updated to indicate this MAJOR point!

Even if it's working as Linux expects, if the Elastic Pools feature doesn't work, there is something to fix on your (MS) end. I don't understand why you don't want to push the issue to the right team if it's not related to you. I really don't understand that point.

@MaksimZhukov the design of the master images, which relies on /etc/environment, is not working with Elastic Pools.

Sorry, I meant if you try sudo login <user> or sudo login -f <user>

Yes if I log on to the VM instance manually. No if I use a VMSS Custom Script Extension.

These are my tests and the associated results:

From | Commands | Result
VMSS Custom Script Extension | echo $PATH | (screenshot omitted)
VMSS Custom Script Extension | sudo login -f AzDevOps, then echo $PATH | (screenshot omitted)
Manual login to the VM instance | echo $PATH | (screenshot omitted)
Manual login to the VM instance | sudo login -f AzDevOps, then echo $PATH | (screenshot omitted)

This is the script used for the tests:

# Create our user account (same as MS)
echo creating AzDevOps account
sudo useradd -m AzDevOps
sudo usermod -a -G docker AzDevOps
sudo usermod -a -G adm AzDevOps
sudo usermod -a -G sudo AzDevOps

echo "Giving AzDevOps user access to the '/home' directory"
sudo chmod -R +r /home
setfacl -Rdm "u:AzDevOps:rwX" /home
setfacl -Rb /home/AzDevOps
echo 'AzDevOps ALL=NOPASSWD: ALL' >> /etc/sudoers

# Diagnostics PATH issue (custom part)
mkdir -p "/opt/dbg"
{ whoami; echo $PATH; } > /opt/dbg/1.txt
sudo login -f AzDevOps
{ whoami; echo $PATH; } > /opt/dbg/2.txt

The AzDevOps account is created the same way Microsoft does it in enableagent.sh.

I noted that in a VMSS Custom Script Extension, the command sudo login -f user doesn't do anything: the logged-in user remains root. So I decided to use the same method Microsoft uses to install the agent (see the previous link):

sudo runuser AzDevOps -c "{ echo \$PATH; } > /opt/dbg/3.txt"

These are the results:

From | Result
VMSS Custom Script Extension | (screenshot omitted)
Manual login to the VM instance | (screenshot omitted)

This time I got the same results whether I log on to the VM instance manually or the script is executed in a VMSS Custom Script Extension.

I modified the command a little to include the source command as you wanted, @kuleshovilya:

sudo runuser AzDevOps -c "{ source /etc/environment; echo \$PATH; } > /opt/dbg/4.txt"

These are the results:

From | Result
VMSS Custom Script Extension | (screenshot omitted)
Manual login to the VM instance | (screenshot omitted)

As I understand from this conversation and some research: the configuration from /etc/sudoers is used to apply certain policies when sudo and runuser are used. When secure_path is defined, the $PATH variable is not loaded from the machine's global environment but set to the specific values from secure_path. Explicitly loading the variables with the source command is required.
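This behaviour can be reproduced without sudo by simulating the stripped environment (envfile below is a stand-in created for the demo, not the real /etc/environment):

```shell
#!/bin/bash
# Simulate the described behaviour: a child process started with a reduced
# PATH (like the one secure_path gives sudo/runuser) only sees the full set
# of paths after sourcing an /etc/environment-style file.
envfile=$(mktemp)
echo 'PATH="/opt/pipx_bin:/usr/local/bin:/usr/bin:/bin"' > "$envfile"

# child shell with a reduced PATH, as secure_path would give you
env PATH=/usr/bin:/bin sh -c 'echo "before: $PATH"'
# prints: before: /usr/bin:/bin

# same child shell, but sourcing the environment file first
env PATH=/usr/bin:/bin sh -c ". $envfile && echo \"after: \$PATH\""
# prints: after: /opt/pipx_bin:/usr/local/bin:/usr/bin:/bin
```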

Going back to how the agent installation is done:

  • Call the config.sh script
  • The script calls env.sh, which reads the values from the $PATH variable and puts them into a .path file

Remember that Elastic Pools automatically install the agent with a VMSS extension, and this is not customizable by users: https://vstsagenttools.blob.core.windows.net/tools/ElasticPools/Linux/6/enableagent.sh

For me, the issue is a couple of things:

  • The agent assumes during installation that the $PATH variable is always up to date and reads it without any option to force a reload from /etc/environment
  • The Elastic Pools feature installs the agent without calling source /etc/environment to ensure it gets the proper values
  • The Elastic Pools feature installs the agent with the sudo command sudo runuser AzDevOps -c "......"
  • The team on @actions/virtual-environments has provisioned the required values in /etc/environment for some months now, and the Elastic Pools feature refers to this repo for the master images, but the installation process is not currently compatible with that

It seems like a deadlock, because I can't customize anything with the Elastic Pools feature, which is not working correctly out of the box due to decisions by different teams.

I think some changes are required:

  • Update env.sh to include the command source /etc/environment (as an option?)
  • Update enableagent.sh to include the command source /etc/environment before starting the agent installation
  • Or a mix of the two previous changes
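The first proposed change could look something like the sketch below. This is illustrative only, not the actual env.sh contents; capture_path and its arguments are hypothetical:

```shell
#!/bin/bash
# Sketch: before capturing $PATH into the agent's .path file, reload it
# from an /etc/environment-style file so the captured value is complete
# even when the process inherited a secure_path-restricted PATH.
capture_path() {
  envfile=$1
  pathfile=$2
  if [ -f "$envfile" ]; then
    # shellcheck disable=SC1090
    . "$envfile"     # picks up PATH="..." assignments
    export PATH
  fi
  echo "$PATH" > "$pathfile"
}

# usage: capture_path /etc/environment .path
```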

Let me know if you see something else.

Thank you for your reply @kuleshovilya

@Jean-FrancoisBeaudet describes what we do to build the master image that we use in an Azure Virtual Machine Scale Set. As indicated by @marcuslopes, we use the official master image sources in https://github.com/actions/virtual-environments.

To manage the scale set, we use the native Azure DevOps Scale Sets feature. As you can read in the Lifecycle of a Scale Set Agent section, the installation of the agent is directly managed by Microsoft with its own extension (automatically injected into the scale set by ADO). The documentation shows the script used, which is not customizable. We currently have no custom script, because we prefer to do customization while building the master image to keep good startup performance, as we want to use the tear-down-after-each-use feature at some point.

Maybe we can attempt to add a Custom Script Extension with what you indicate (source). But the issue always occurs with the master image, so I think the Microsoft script/logic should be updated to work correctly in this case, if it is related to how the agent is installed by the Microsoft script.

I hope that's clearer for you.