iotedge: Simple C# custom module stops sending mqtt msgs and cannot recover

Expected Behavior

A slim custom module should run alongside edgeAgent and edgeHub on a small device while sending MQTT messages to IoT Hub.

Current Behavior

The custom module fails after some time and cannot restart.

Steps to Reproduce


  1. Set up edgeAgent with the following environment variables:
{
    "RocksDB_MaxTotalWalSize": 4194304,
    "SendRuntimeQualityTelemetry": false
}

And the following create options:

{
    "HostConfig": {
        "Dns": [
            "1.1.1.1"
        ],
        "Memory": 209715200
    }
}
  2. Set up edgeHub with the following environment variables:
{
    "OptimizeForPerformance": false,
    "amqpSettings__enabled": false,
    "mqttSettings__enabled": true,
    "httpSettings__enabled": false,
    "UpstreamProtocol": "Mqtt",
    "RocksDB_MaxTotalWalSize": 4194304
}

And the following create options:

{
    "HostConfig": {
        "Dns": [
            "1.1.1.1"
        ],
        "PortBindings": {
            "443/tcp": [
                {
                    "HostPort": "443"
                }
            ],
            "5671/tcp": [
                {
                    "HostPort": "5671"
                }
            ],
            "8883/tcp": [
                {
                    "HostPort": "8883"
                }
            ]
        },
        "Memory": 131072000
    }
}
  3. Create a custom C# module (SimpleSharpModule) with the following logic:
using Microsoft.Azure.Devices.Client;
using Microsoft.Azure.Devices.Client.Transport.Mqtt;
using System;
using System.Runtime.Loader;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
static async Task Main(string[] args)
{
	Console.WriteLine("Initializing");
	var moduleClient = await Init();
	Console.WriteLine("Successfull initialization");

	try
	{
		while (true)
		{
			Console.WriteLine("Start " + DateTime.UtcNow.ToString("s"));
			if (moduleClient == null)
			{
				Console.WriteLine("SIMPLESHARPMODULE - Module client creating");
				moduleClient = await Init();
				Console.WriteLine("SIMPLESHARPMODULE - Module client created");
			}

			//build msg
			var freetext = "\"im alive\"";
			byte[] bytes = Encoding.UTF8.GetBytes(freetext);
			var msg = new Message(bytes);

			Console.WriteLine("SIMPLESHARPMODULE - before send");
			await moduleClient.SendEventAsync("output1", msg);
			Console.WriteLine("SIMPLESHARPMODULE - after send");

			Thread.Sleep(60000); //1 minute sleep
		}
	}
	catch (Exception e)
	{
		Console.WriteLine(e);
	}

	// Wait until the app unloads or is cancelled
	var cts = new CancellationTokenSource();
	AssemblyLoadContext.Default.Unloading += (ctx) => cts.Cancel();
	Console.CancelKeyPress += (sender, cpe) => cts.Cancel();
	WhenCancelled(cts.Token).Wait();
}

/// <summary>
/// Handles cleanup operations when app is cancelled or unloads
/// </summary>
public static Task WhenCancelled(CancellationToken cancellationToken)
{
	var tcs = new TaskCompletionSource<bool>();
	cancellationToken.Register(s => ((TaskCompletionSource<bool>)s).SetResult(true), tcs);
	return tcs.Task;
}

/// <summary>
/// Initializes the ModuleClient and sets up the callback to receive
/// messages containing temperature information
/// </summary>
static async Task<ModuleClient> Init()
{
	MqttTransportSettings mqttSetting = new MqttTransportSettings(TransportType.Mqtt_Tcp_Only);
	ITransportSettings[] settings = { mqttSetting };

	// Open a connection to the Edge runtime
	ModuleClient ioTHubModuleClient = await ModuleClient.CreateFromEnvironmentAsync(settings);
	await ioTHubModuleClient.OpenAsync();
	Console.WriteLine("IoT Hub module client initialized.");

	// Register callback to be called when a message is received by the module
	//await ioTHubModuleClient.SetInputMessageHandlerAsync("input1", PipeMessage, ioTHubModuleClient);
	return ioTHubModuleClient;
}

Context (Environment)

Output of iotedge check


Configuration checks
--------------------
√ config.yaml is well-formed - OK
√ config.yaml has well-formed connection string - OK
√ container engine is installed and functional - OK
√ config.yaml has correct hostname - OK
× config.yaml has correct URIs for daemon mgmt endpoint - Error
    Unable to find image 'mcr.microsoft.com/azureiotedge-diagnostics:1.1.11' locally
    docker: Error response from daemon: Get "https://mcr.microsoft.com/v2/": net/http: TLS handshake timeout.
    See 'docker run --help'.
‼ latest security daemon - Warning
    Installed IoT Edge daemon has version 1.1.11 but 1.1.15 is the latest stable version available.
    Please see https://aka.ms/iotedge-update-runtime for update instructions.
‼ host time is close to real time - Warning
    Time on the device is out of sync with the NTP server. This may cause problems connecting to IoT Hub.
    Please ensure time on device is accurate, for example by installing an NTP daemon.
× container time is close to host time - Error
    Could not query local time inside container
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
√ production readiness: identity certificates expiry - OK
√ production readiness: certificates - OK
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
‼ production readiness: Edge Agent's storage directory is persisted on the host filesystem - Warning
    The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
‼ production readiness: Edge Hub's storage directory is persisted on the host filesystem - Warning
    The edgeHub module is not configured to persist its /tmp/edgeHub directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.

Connectivity checks
-------------------
√ host can connect to and perform TLS handshake with DPS endpoint - OK
√ host can connect to and perform TLS handshake with IoT Hub AMQP port - OK
√ host can connect to and perform TLS handshake with IoT Hub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with IoT Hub MQTT port - OK
× container on the default network can connect to IoT Hub AMQP port - Error
    Container on the default network could not connect to ccj-dev-iothub.azure-devices.net:5671
× container on the default network can connect to IoT Hub HTTPS / WebSockets port - Error
    Container on the default network could not connect to ccj-dev-iothub.azure-devices.net:443
× container on the default network can connect to IoT Hub MQTT port - Error
    Container on the default network could not connect to ccj-dev-iothub.azure-devices.net:8883
√ container on the IoT Edge module network can connect to IoT Hub AMQP port - OK
√ container on the IoT Edge module network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the IoT Edge module network can connect to IoT Hub MQTT port - OK

13 check(s) succeeded.
6 check(s) raised warnings. Re-run with --verbose for more details.
5 check(s) raised errors. Re-run with --verbose for more details.


Device Information

  • Host OS: Buildroot 2020.11.2
  • Architecture: arm32
  • Container OS: Linux

Runtime Versions

  • aziot-edged (iotedge version): 1.1.11
  • Edge Agent (image tag): 1.1.11
  • Edge Hub (image tag): 1.1.11
  • Docker/Moby (docker version):
Output of docker version

Client:
 Version:           19.03.13
 API version:       1.40
 Go version:        go1.15.6
 Git commit:        19.03.13
 Built:             unknown-buildtime
 OS/Arch:           linux/arm
 Experimental:      false

Server:
 Engine:
  Version:          19.03.13
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.15.6
  Git commit:
  Built:            Mon Jun 27 12:03:38 CEST 2022
  OS/Arch:          linux/arm
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:
 runc:
  Version:          1.0.0-rc92
  GitCommit:


Logs

Logs are posted in the next comments because of character restrictions.

Additional Information

I previously opened the same issue with a custom Python module: https://github.com/Azure/iotedge/issues/6140

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 35 (16 by maintainers)

Most upvoted comments

@henrik2424 your example of using SetConnectionStatusChangesHandler seems fine to me, with one exception: you must not use a blocking API in an async context, so replace Thread.Sleep() with await Task.Delay().

UPD: also, add the volatile keyword to your static bool NotConnected.
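
For illustration, here is a minimal sketch of both suggestions, using only the base class library so it runs without the IoT SDK. The names SendLoopSketch, OnConnectionChanged, and RunAsync are hypothetical, and the sendAsync delegate stands in for a call such as moduleClient.SendEventAsync. Only the NotConnected flag name comes from the discussion. It is not the code from the thread, just one way the pattern could look.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SendLoopSketch
{
    // Hypothetical flag mirroring the "static bool NotConnected" from the comment.
    // volatile keeps a write made on a callback thread visible to the send loop.
    private static volatile bool NotConnected = true;

    // Hypothetical stand-in for what a connection status callback would do.
    public static void OnConnectionChanged(bool connected) => NotConnected = !connected;

    // Calls sendAsync once per interval while connected; returns the number of sends.
    public static async Task<int> RunAsync(Func<Task> sendAsync, TimeSpan interval, CancellationToken token)
    {
        int sent = 0;
        try
        {
            while (!token.IsCancellationRequested)
            {
                if (!NotConnected)
                {
                    await sendAsync();
                    sent++;
                }

                // Non-blocking wait: the thread goes back to the pool,
                // whereas Thread.Sleep would hold it for the whole interval.
                await Task.Delay(interval, token);
            }
        }
        catch (OperationCanceledException)
        {
            // Cancellation during the delay is the normal shutdown path.
        }
        return sent;
    }

    public static async Task Main()
    {
        // Short interval and timeout so the sketch finishes quickly.
        using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(500));
        OnConnectionChanged(true);
        int sent = await RunAsync(
            () => { Console.WriteLine("send"); return Task.CompletedTask; },
            TimeSpan.FromMilliseconds(100),
            cts.Token);
        Console.WriteLine($"sent {sent} message(s)");
    }
}
```

In a real module the interval would be the one-minute period from the original loop, and the cancellation token would come from the shutdown handling the module already has.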