WALinuxAgent: [BUG] Agent breaks when deploying template with VM and runCommand

When a Bicep template deploys an Ubuntu VM and also defines a runCommand resource that runs after the VM has been deployed, the agent breaks completely and doesn’t seem to be able to repair itself. It’s as if the agent never finishes setting up in the first place. Removing the runCommand resource from the template lets the agent install properly, and deploying the same template with runCommand after the VM has already been provisioned works as it should. Restarting the walinuxagent service or the VM doesn’t help either.

waagent --version
WALinuxAgent-2.2.45 running on ubuntu 18.04
Python: 3.6.9
Goal state agent: 2.2.45

The log file shows the agent retrying the connection, repeating the same sequence over and over:

2022/02/28 21:14:02.183944 INFO Daemon WireServer endpoint is not found. Rerun dhcp handler
2022/02/28 21:14:02.184409 INFO Daemon Test for route to 168.63.129.16
2022/02/28 21:14:02.187106 INFO Daemon Route to 168.63.129.16 exists
2022/02/28 21:14:02.187485 INFO Daemon Wire server endpoint:168.63.129.16
2022/02/28 21:14:02.200153 INFO Daemon Fabric preferred wire protocol version:2015-04-05
2022/02/28 21:14:02.200873 INFO Daemon Wire protocol version:2012-11-30
2022/02/28 21:14:02.201970 INFO Daemon Server preferred version:2015-04-05
2022/02/28 21:14:02.362426 INFO Daemon Found private key matching thumbprint 93D0A3DD1C418BEC2457ACF4412128541581F3E5
2022/02/28 21:14:02.445277 INFO Daemon Found private key matching thumbprint 93D0A3DD1C418BEC2457ACF4412128541581F3E5
2022/02/28 21:14:02.535893 INFO Daemon Found private key matching thumbprint 93D0A3DD1C418BEC2457ACF4412128541581F3E5
2022/02/28 21:14:02.549331 ERROR Daemon Exception processing goal state, giving up: [the JSON object must be str, bytes or bytearray, not 'NoneType']
2022/02/28 21:14:02.556208 INFO Daemon WireServer is not responding. Reset endpoint
2022/02/28 21:14:02.560111 INFO Daemon Protocol endpoint not found: WireProtocol, [ProtocolError] Exceeded max retry updating goal state
2022/02/28 21:14:02.569141 INFO Daemon Protocol endpoint not found: MetadataProtocol, [ProtocolError] 404 - GET: http://169.254.169.254/Microsoft.Compute/identity?api-version=2015-05-01-preview
2022/02/28 21:14:02.577300 INFO Daemon Retry detect protocols: retry=62
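
The ERROR line in the log above carries the message that Python's `json.loads` raises when it is handed `None` instead of text. The following is a minimal sketch that reproduces that message, assuming (the log does not confirm this) that some part of the goal state the agent expected to contain JSON came back empty. `parse_settings` is a hypothetical helper for illustration, not a function from the agent.

```python
import json


def parse_settings(raw):
    """Hypothetical helper: parse a JSON document that may be missing."""
    if raw is None:
        # Treat a missing document as "no settings" instead of raising.
        return {}
    return json.loads(raw)


# Reproduce the error text seen in the log: json.loads rejects None.
try:
    json.loads(None)
except TypeError as exc:
    error_text = str(exc)
    print(error_text)

# The guarded helper handles both the missing and the present case.
print(parse_settings(None))
print(parse_settings('{"runtimeSettings": []}'))
```

The exact quoting of `NoneType` in the message differs between Python versions, so the wording printed here may not match the log character for character.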

Steps to reproduce.

  1. Deploy VM and runCommand in the same template.
  2. The deployment gets stuck indefinitely, runCommand is never executed, and the agent breaks.

Template used.

param location string = resourceGroup().location
param vmName string = 'VmTestRunCmd2'
param adminUsername string = 'hadmin'
@secure()
param adminPassword string
param vnetRGName string = 'a-natfw-rg'
param vnetName string = 'a-natfw-vnet'
param subnetBackend string = 'snet-pls'

resource vnet 'Microsoft.Network/virtualNetworks@2021-05-01' existing = {
  name: vnetName
  scope: resourceGroup(vnetRGName)
}

resource networkInterface 'Microsoft.Network/networkInterfaces@2020-11-01' = {
  name: '${vmName}-nic'
  location: location
  properties: {
    ipConfigurations: [
      {
        name: 'ipconfig1'
        properties: {
          privateIPAllocationMethod: 'Dynamic'
          subnet: {
            id: '${vnet.id}/subnets/${subnetBackend}'
          }
        }
      }
    ]
  }
}

resource virtualMachine 'Microsoft.Compute/virtualMachines@2020-12-01' = {
  name: vmName
  location: location
  properties: {
    hardwareProfile: {
      vmSize: 'Standard_A2_v2'
    }
    osProfile: {
      computerName: vmName
      adminUsername: adminUsername
      adminPassword: adminPassword
    }
    storageProfile: {
      imageReference: {
        publisher: 'Canonical'
        offer: 'UbuntuServer'
        sku: '18.04-LTS'
        version: 'latest'
      }
      osDisk: {
        name: '${vmName}-OSDisk'
        caching: 'ReadWrite'
        createOption: 'FromImage'
      }
    }
    networkProfile: {
      networkInterfaces: [
        {
          id: networkInterface.id
        }
      ]
    }
    diagnosticsProfile: {
      bootDiagnostics: {
        enabled: true
      }
    }
  }
}

resource runCommand 'Microsoft.Compute/virtualMachines/runCommands@2021-07-01' = {
  name: '${virtualMachine.name}/runCommandnow'
  location: location
  properties: {
    asyncExecution: false
    errorBlobUri: 'https://anatfwlogs.blob.core.windows.net/error/error.txt'
    outputBlobUri: 'https://anatfwlogs.blob.core.windows.net/error/output.txt'
    source: {
      script: 'touch /home/hadmin/test.txt'
    }
    timeoutInSeconds: 120
  }
}

[Screenshot: vmagentnotready]

[Screenshot: vmproperties]

About this issue

  • State: open
  • Created 2 years ago
  • Reactions: 1
  • Comments: 41 (24 by maintainers)

Most upvoted comments

It was a fresh VM set up this morning. I tried reinstalling via apt, but it didn’t seem to help. I wonder if the package got corrupted somehow. Anyway, I’m working on creating a new one.

@bpkroth you would check whether RunCommand shows up in the instance view, but your issue is unrelated; see my previous post.

I’m hitting the same problem with the CustomScript Extension (V2):

2022-10-27T05:53:48.263783Z WARNING Daemon Agent WALinuxAgent-2.2.46 launched with command 'python3 -u /usr/sbin/waagent -run-exthandlers' failed with return code: 1
2022-10-27T05:53:48.271214Z ERROR Daemon Event: name=WALinuxAgent, op=Enable, message=eJw1zLsOgzAMQNFf8cYUokLVgS07e2eXuE2kYCrHFvTveUgdr3R1wodY4RnGzLaFM1zXdu39AQWNp0QR1qwJpmWekSM035+mhXtwBt6q+PrK7FfEy3Fi7GjTdKyFpDbwxlz+hpCa8EFFGuC2AyUXKrQ=, duration=0
2022-10-27T05:53:48.280208Z WARNING Daemon Agent WALinuxAgent-2.2.46 launched with command 'python3 -u /usr/sbin/waagent -run-exthandlers' returned code: 1
2022-10-27T05:53:48.286191Z INFO Daemon Installed Agent WALinuxAgent-2.2.46 is the most current agent
2022-10-27T05:53:49.220645Z INFO ExtHandler Agent WALinuxAgent-2.2.46 is running as the goal state agent
2022-10-27T05:53:49.240505Z INFO ExtHandler Distro info: ubuntu 22.04, osutil class being used: UbuntuOSUtil, agent service name: walinuxagent
2022-10-27T05:53:49.271294Z INFO ExtHandler WireServer endpoint 168.63.129.16 read from file
2022-10-27T05:53:49.291756Z INFO ExtHandler Wire server endpoint:168.63.129.16
2022-10-27T05:53:49.331166Z WARNING ExtHandler Agent WALinuxAgent-2.2.46 failed with exception: str expected, not NoneType
2022-10-27T05:53:49.361480Z WARNING ExtHandler Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/azurelinuxagent/ga/update.py", line 268, in run
    monitor_thread.run()
  File "/usr/lib/python3/dist-packages/azurelinuxagent/ga/monitor.py", line 146, in run
    self.init_sysinfo()
  File "/usr/lib/python3/dist-packages/azurelinuxagent/ga/monitor.py", line 185, in init_sysinfo
    vminfo = self.protocol.get_vminfo()
  File "/usr/lib/python3/dist-packages/azurelinuxagent/common/protocol/wire.py", line 113, in get_vminfo
    goal_state = self.client.get_goal_state()
  File "/usr/lib/python3/dist-packages/azurelinuxagent/common/protocol/wire.py", line 856, in get_goal_state
    self.goal_state = GoalState(xml_text)
  File "/usr/lib/python3/dist-packages/azurelinuxagent/common/protocol/wire.py", line 1381, in __init__
    self._parse(xml_text)
  File "/usr/lib/python3/dist-packages/azurelinuxagent/common/protocol/wire.py", line 1398, in _parse
    os.environ[CONTAINER_ID_ENV_VARIABLE] = self.container_id
  File "/usr/lib/python3.10/os.py", line 684, in __setitem__
    value = self.encodevalue(value)
  File "/usr/lib/python3.10/os.py", line 756, in encode
    raise TypeError("str expected, not %s" % type(value).__name__)
TypeError: str expected, not NoneType
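
The last frames of the traceback show `os.environ` rejecting a value that is `None`. The sketch below reproduces that failure, assuming the container id parsed from the goal state was `None` at that point. The value given to `CONTAINER_ID_ENV_VARIABLE` is a placeholder (the traceback does not show the real one), and `export_container_id` is a hypothetical guard, not the agent's actual fix.

```python
import os

# Placeholder name; the real constant's value is not shown in the traceback.
CONTAINER_ID_ENV_VARIABLE = "EXAMPLE_CONTAINER_ID"


def export_container_id(container_id):
    """Hypothetical guard: export the id only when one was actually parsed."""
    if container_id is None:
        return False
    os.environ[CONTAINER_ID_ENV_VARIABLE] = container_id
    return True


# Reproduce the TypeError from the traceback: os.environ only accepts str.
try:
    os.environ[CONTAINER_ID_ENV_VARIABLE] = None
except TypeError as exc:
    error_text = str(exc)
    print(error_text)

# With the guard, a missing id is skipped instead of crashing the caller.
print(export_container_id(None))
```

A guard like this only hides the symptom. Why the goal state lacks a container id on the affected images is a separate question that this sketch does not answer.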

Update: I used the Ubuntu 18_04-lts-gen2 image and it worked fine. The problem seems to be with Ubuntu 20.04 and 22.04.