iotedge: ModuleClient cannot connect due to TLS authentication error

Expected Behavior

ModuleClient should be able to connect to IoTHub without issue.

Current Behavior

During the last few weeks, multiple of our IoT edge devices started to log errors indicating that the ModuleClient could not connect to IoT Hub due to a TLS authentication error. The other built-in modules (edge-agent and edge-hub) seem to work fine and don’t show any error.

We tried to restart the modules (built-in and custom) but it didn’t fix the issues. We tried to delete all our custom modules and recreate them, but this doesn’t fix the issue either.

Here is the stack of our app showing the error:

Unhandled exception. System.AggregateException: One or more errors occurred. (TLS authentication error.)
 ---> System.Security.Authentication.AuthenticationException: TLS authentication error.
 ---> System.Security.Authentication.AuthenticationException: The remote certificate was rejected by the provided RemoteCertificateValidationCallback.
   at System.Net.Security.SslStream.SendAuthResetSignal(ProtocolToken message, ExceptionDispatchInfo exception)
   at System.Net.Security.SslStream.CompleteHandshake(SslAuthenticationOptions sslAuthenticationOptions)
   at System.Net.Security.SslStream.ForceAuthenticationAsync[TIOAdapter](TIOAdapter adapter, Boolean receiveFirst, Byte[] reAuthenticationData, Boolean isApm)
   at System.Threading.Tasks.TaskToApm.End(IAsyncResult asyncResult)
   at System.Net.Security.SslStream.EndAuthenticateAsClient(IAsyncResult asyncResult)
   at Microsoft.Azure.Amqp.Transport.TlsTransport.HandleOpenComplete(IAsyncResult result, Boolean syncComplete)
--- End of stack trace from previous location ---
   at Microsoft.Azure.Amqp.ExceptionDispatcher.Throw(Exception exception)
   at Microsoft.Azure.Amqp.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.Azure.Amqp.AmqpObject.OpenAsyncResult.End(IAsyncResult result)
   at Microsoft.Azure.Amqp.AmqpObject.EndOpen(IAsyncResult result)
   at Microsoft.Azure.Amqp.Transport.TlsTransportInitiator.HandleTransportOpened(IAsyncResult result)
   at Microsoft.Azure.Amqp.Transport.TlsTransportInitiator.OnTransportOpened(IAsyncResult result)
--- End of stack trace from previous location ---
   at Microsoft.Azure.Amqp.ExceptionDispatcher.Throw(Exception exception)
   at Microsoft.Azure.Amqp.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.Azure.Amqp.Transport.AmqpTransportInitiator.ConnectAsyncResult.End(IAsyncResult result)
   at Microsoft.Azure.Amqp.Transport.AmqpTransportInitiator.<>c.<ConnectAsync>b__17_1(IAsyncResult r)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location ---
   at Microsoft.Azure.Devices.Client.Transport.AmqpIot.AmqpIotTransport.InitializeAsync(CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.Amqp.AmqpIotConnector.OpenConnectionAsync(CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.Amqp.AmqpConnectionHolder.EnsureConnectionAsync(CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.Amqp.AmqpConnectionHolder.OpenSessionAsync(IDeviceIdentity deviceIdentity, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.AmqpIot.AmqpUnit.EnsureSessionIsOpenAsync(CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.AmqpIot.AmqpUnit.OpenAsync(CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.Amqp.AmqpTransportHandler.OpenAsync(CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.ProtocolRoutingDelegatingHandler.OpenAsync(CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.ErrorDelegatingHandler.<>c__DisplayClass27_0.<<ExecuteWithErrorHandlingAsync>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.Azure.Devices.Client.Transport.ErrorDelegatingHandler.ExecuteWithErrorHandlingAsync[T](Func`1 asyncOperation)
   --- End of inner exception stack trace ---
   at Microsoft.Azure.Devices.Client.Transport.ErrorDelegatingHandler.ExecuteWithErrorHandlingAsync[T](Func`1 asyncOperation)
   at Microsoft.Azure.Devices.Client.Transport.RetryDelegatingHandler.OpenInternalAsync(Boolean withRetry, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.RetryDelegatingHandler.EnsureOpenedAsync(Boolean withRetry, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.RetryDelegatingHandler.<>c__DisplayClass28_0.<<EnableTwinPatchAsync>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.Azure.Devices.Client.TransientFaultHandling.RetryPolicy.<>c__DisplayClass34_0.<<RunWithRetryAsync>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.Azure.Devices.Client.TransientFaultHandling.RetryPolicy.RunWithRetryAsync[T](Func`1 taskFunc, ShouldRetry shouldRetry, Func`2 isTransient, Action`3 onRetrying, Boolean fastFirstRetry, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.TransientFaultHandling.RetryPolicy.RunWithRetryAsync(Func`1 taskFunc, ShouldRetry shouldRetry, Func`2 isTransient, Action`3 onRetrying, Boolean fastFirstRetry, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Transport.RetryDelegatingHandler.EnableTwinPatchAsync(CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.InternalClient.SetDesiredPropertyUpdateCallbackAsync(DesiredPropertyUpdateCallback callback, Object userContext, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.InternalClient.SetDesiredPropertyUpdateCallbackAsync(DesiredPropertyUpdateCallback callback, Object userContext)
   at PulseModule.Data.IotHub.Utilities.DesiredStateConfigurationProvider.LoadAsync() in /app/PulseModule/PulseModule.Data.IotHub/Utilities/DesiredStateConfigurationProvider.cs:line 27
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at PulseModule.Data.IotHub.Utilities.DesiredStateConfigurationProvider.Load() in /app/PulseModule/PulseModule.Data.IotHub/Utilities/DesiredStateConfigurationProvider.cs:line 20
   at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList`1 providers)
   at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
   at Microsoft.Extensions.Hosting.HostBuilder.BuildAppConfiguration()
   at Microsoft.Extensions.Hosting.HostBuilder.Build()
   at PulseModule.ProgramBootstrap.ProgramRunner.MainLoop(String[] args, Boolean checkIotHubConnectionStringEnvVar) in /app/PulseModule/PulseModule.ProgramBootstrap/ProgramRunner.cs:line 33
   at PulseModule.Program.Main(String[] args) in /app/PulseModule/PulseModule/Program.cs:line 11
   at PulseModule.Program.<Main>(String[] args)

Steps to Reproduce

Not sure how to reproduce. It was working fine and randomly(?), we strated to see these errors popping up. But when it happens on a device, we cannot fix it and we are stuck there.

  • We see this behavior in different IoT Hubs (dev, prod)
  • We use DPS to provision our devices
  • We use TPM to authenticate our devices

Context (Environment)

Output of iotedge check

Click here
Configuration checks (aziot-identity-service)
---------------------------------------------
√ keyd configuration is well-formed - OK
√ certd configuration is well-formed - OK
√ tpmd configuration is well-formed - OK
√ identityd configuration is well-formed - OK
√ daemon configurations up-to-date with config.toml - OK
√ identityd config toml file specifies a valid hostname - OK
‼ aziot-identity-service package is up-to-date - Warning
    Installed aziot-identity-service package has version 1.4.0 but 1.4.1 is the latest stable version available.
    Please see https://aka.ms/aziot-update-runtime for update instructions.
√ host time is close to reference time - OK
√ preloaded certificates are valid - OK
√ keyd is running - OK
√ certd is running - OK
√ tpmd is running - OK
√ identityd is running - OK
√ read all preloaded certificates from the Certificates Service - OK
√ read all preloaded key pairs from the Keys Service - OK
√ check all EST server URLs utilize HTTPS - OK
√ ensure all preloaded certificates match preloaded private keys with the same ID - OK

Connectivity checks (aziot-identity-service)
--------------------------------------------
√ host can connect to and perform TLS handshake with iothub AMQP port - OK
√ host can connect to and perform TLS handshake with iothub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with iothub MQTT port - OK
√ host can connect to and perform TLS handshake with DPS endpoint - OK

Configuration checks
--------------------
√ aziot-edged configuration is well-formed - OK
√ configuration up-to-date with config.toml - OK
√ container engine is installed and functional - OK
√ configuration has correct URIs for daemon mgmt endpoint - OK
‼ aziot-edge package is up-to-date - Warning
    Installed IoT Edge daemon has version 1.4.0 but 1.4.2 is the latest stable version available.
    Please see https://aka.ms/iotedge-update-runtime for update instructions.
√ container time is close to host time - OK
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
‼ production readiness: Edge Agent's storage directory is persisted on the host filesystem - Warning
    The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
‼ production readiness: Edge Hub's storage directory is persisted on the host filesystem - Warning
    The edgeHub module is not configured to persist its /tmp/edgeHub directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
√ Agent image is valid and can be pulled from upstream - OK
√ proxy settings are consistent in aziot-edged, aziot-identityd, moby daemon and config.toml - OK

Connectivity checks
-------------------
√ container on the default network can connect to upstream AMQP port - OK
√ container on the default network can connect to upstream HTTPS / WebSockets port - OK
√ container on the IoT Edge module network can connect to upstream AMQP port - OK
√ container on the IoT Edge module network can connect to upstream HTTPS / WebSockets port - OK
31 check(s) succeeded.
6 check(s) raised warnings. Re-run with --verbose for more details.
2 check(s) were skipped due to errors from other checks. Re-run with --verbose for more details.

Device Information

  • Host OS: Ubuntu 20.04.5 LTS (Focal Fossa)
  • Architecture: amd64
  • Container OS: Linux

Runtime Versions

  • aziot-edged: 1.4.0
  • Edge Agent: 1.4
  • Edge Hub: 1.4
  • Docker/Moby: 20.10.17+azure-3

Logs

aziot-edged logs

After restarting our module that dhows the error.

Nov 04 17:18:53 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:18:53Z [INFO] - Watchdog checking Edge runtime status
Nov 04 17:18:53 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:18:53Z [INFO] - Edge runtime is running
Nov 04 17:18:54 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:18:54Z [INFO] - <-- GET /modules?api-version=2021-12-07 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 04 17:18:54 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:18:54Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 04 17:18:59 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:18:59Z [INFO] - <-- GET /modules?api-version=2021-12-07 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 04 17:18:59 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:18:59Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 04 17:19:04 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:04Z [INFO] - <-- GET /modules?api-version=2021-12-07 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 04 17:19:04 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:04Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 04 17:19:09 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:09Z [INFO] - <-- GET /modules?api-version=2021-12-07 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 04 17:19:09 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:09Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 04 17:19:14 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:14Z [INFO] - <-- GET /modules?api-version=2021-12-07 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 04 17:19:14 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:14Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 04 17:19:19 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:19Z [INFO] - <-- GET /modules?api-version=2021-12-07 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 04 17:19:19 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:19Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 04 17:19:24 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:24Z [INFO] - <-- GET /modules?api-version=2021-12-07 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 04 17:19:24 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:24Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 04 17:19:29 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:29Z [INFO] - <-- GET /modules?api-version=2021-12-07 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 04 17:19:29 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:29Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 04 17:19:34 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:34Z [INFO] - <-- GET /modules?api-version=2021-12-07 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 04 17:19:34 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:34Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 04 17:19:39 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:39Z [INFO] - <-- GET /modules?api-version=2021-12-07 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 04 17:19:39 thinkedgejoproulx aziot-edged[1483]: 2022-11-04T17:19:39Z [INFO] - --> 200 {"content-type": "application/json"}
edge-agent logs After restarting the module.
2022-11-04 17:13:48  Starting Edge Agent
2022-11-04 17:13:48  Changing ownership of storage folder: /tmp/edgeAgent to 13622
2022-11-04 17:13:48  Changing ownership of backup folder: /tmp/edgeAgent_backup to 13622
2022-11-04 17:13:48  Changing ownership of management socket: /var/run/iotedge/mgmt.sock
2022-11-04 17:13:48  Completed necessary setup. Starting Edge Agent.
2022-11-04 17:13:48.174 +00:00 Edge Agent Main()
<6> 2022-11-04 17:13:48.317 +00:00 [INF] - Initializing Edge Agent.
<6> 2022-11-04 17:13:48.443 +00:00 [INF] - Version - 1.4.0.59620487 (157650c3b1c5fb58b17f5975e6f1417a5a5f4f89)
<6> 2022-11-04 17:13:48.443 +00:00 [INF] - 
        █████╗ ███████╗██╗   ██╗██████╗ ███████╗
       ██╔══██╗╚══███╔╝██║   ██║██╔══██╗██╔════╝
       ███████║  ███╔╝ ██║   ██║██████╔╝█████╗
       ██╔══██║ ███╔╝  ██║   ██║██╔══██╗██╔══╝
       ██║  ██║███████╗╚██████╔╝██║  ██║███████╗
       ╚═╝  ╚═╝╚══════╝ ╚═════╝ ╚═╝  ╚═╝╚══════╝

 ██╗ ██████╗ ████████╗    ███████╗██████╗  ██████╗ ███████╗
 ██║██╔═══██╗╚══██╔══╝    ██╔════╝██╔══██╗██╔════╝ ██╔════╝
 ██║██║   ██║   ██║       █████╗  ██║  ██║██║  ███╗█████╗
 ██║██║   ██║   ██║       ██╔══╝  ██║  ██║██║   ██║██╔══╝
 ██║╚██████╔╝   ██║       ███████╗██████╔╝╚██████╔╝███████╗
 ╚═╝ ╚═════╝    ╚═╝       ╚══════╝╚═════╝  ╚═════╝ ╚══════╝

<6> 2022-11-04 17:13:48.445 +00:00 [INF] - ModuleUpdateMode: NonBlocking
<6> 2022-11-04 17:13:48.480 +00:00 [INF] - Experimental features configuration: {"Enabled":false,"DisableCloudSubscriptions":false}
<6> 2022-11-04 17:13:48.595 +00:00 [INF] - Installing certificates [CN=iotedged workload ca thinkedgejoproulx:11/02/2022 15:48:07] to Root
<6> 2022-11-04 17:13:48.709 +00:00 [INF] - Starting metrics listener on Host: *, Port: 9600, Suffix: metrics
<6> 2022-11-04 17:13:48.718 +00:00 [INF] - Updating performance metrics every 05m:00s
<6> 2022-11-04 17:13:48.722 +00:00 [INF] - Started operation Get system resources
<6> 2022-11-04 17:13:48.723 +00:00 [INF] - Collecting metadata metrics
<6> 2022-11-04 17:13:48.798 +00:00 [INF] - Set metadata metrics: 1.4.0.59620487 (157650c3b1c5fb58b17f5975e6f1417a5a5f4f89), {"Enabled":false,"DisableCloudSubscriptions":false}, {"OperatingSystemType":"Linux","Architecture":"x86_64","Version":"1.4.0","Provisioning":{"Type":"ProvisioningType","DynamicReprovisioning":false,"AlwaysReprovisionOnStartup":false},"ServerVersion":"20.10.17+azure-3","KernelVersion":"#49~20.04.1-Ubuntu SMP Thu Aug 4 19:15:44 UTC 2022","OperatingSystem":"ubuntu","NumCpus":4,"TotalMemory":0,"Virtualized":"no"}, True
<6> 2022-11-04 17:13:48.812 +00:00 [INF] - Started operation Checkpoint Availability
<6> 2022-11-04 17:13:48.815 +00:00 [INF] - Started operation refresh twin config
<6> 2022-11-04 17:13:48.828 +00:00 [INF] - Edge agent attempting to connect to IoT Hub via Amqp_Tcp_Only...
<6> 2022-11-04 17:13:49.259 +00:00 [INF] - Created persistent store at /tmp/edgeAgent
<6> 2022-11-04 17:13:49.320 +00:00 [INF] - Started operation Metrics Scrape
<6> 2022-11-04 17:13:49.320 +00:00 [INF] - Started operation Metrics Upload
Scraping frequency: 01:00:00
Upload Frequency: 1.00:00:00
<6> 2022-11-04 17:13:49.494 +00:00 [INF] - Registering request handler UploadModuleLogs
<6> 2022-11-04 17:13:49.494 +00:00 [INF] - Registering request handler GetModuleLogs
<6> 2022-11-04 17:13:49.494 +00:00 [INF] - Registering request handler UploadSupportBundle
<6> 2022-11-04 17:13:49.494 +00:00 [INF] - Registering request handler RestartModule
<6> 2022-11-04 17:13:49.648 +00:00 [INF] - Edge agent connected to IoT Hub via Amqp_Tcp_Only.
<6> 2022-11-04 17:13:49.736 +00:00 [INF] - Received direct method call - GetModuleLogs
<6> 2022-11-04 17:13:49.737 +00:00 [INF] - Received request GetModuleLogs with payload
<6> 2022-11-04 17:13:49.756 +00:00 [INF] - Processing request to get logs for {"schemaVersion":"1.0","items":{"id":"\\bedgeAgent\\b","filter":{"tail":1500,"since":"15m","until":null,"loglevel":null,"regex":""}},"encoding":1,"contentType":1}
<6> 2022-11-04 17:13:49.763 +00:00 [INF] - Received direct method call - GetModuleLogs
<6> 2022-11-04 17:13:49.763 +00:00 [INF] - Received request GetModuleLogs with payload
<6> 2022-11-04 17:13:49.764 +00:00 [INF] - Processing request to get logs for {"schemaVersion":"1.0","items":{"id":"\\bedgeAgent\\b","filter":{"tail":1500,"since":"15m","until":null,"loglevel":null,"regex":""}},"encoding":1,"contentType":1}
<6> 2022-11-04 17:13:49.783 +00:00 [INF] - Initialized new module client with subscriptions enabled
<6> 2022-11-04 17:13:49.785 +00:00 [INF] - Initiating streaming logs for edgeAgent
<6> 2022-11-04 17:13:49.792 +00:00 [INF] - Initiating streaming logs for edgeAgent
<6> 2022-11-04 17:13:49.848 +00:00 [INF] - Obtained Edge agent twin from IoTHub with desired properties version 343 and reported properties version 1275.
<6> 2022-11-04 17:13:49.942 +00:00 [INF] - Received 2000 bytes of logs for edgeAgent
<6> 2022-11-04 17:13:49.942 +00:00 [INF] - Received 2000 bytes of logs for edgeAgent
<6> 2022-11-04 17:13:49.948 +00:00 [INF] - Successfully handled request GetModuleLogs
<6> 2022-11-04 17:13:49.950 +00:00 [INF] - Successfully handled request GetModuleLogs
<6> 2022-11-04 17:13:50.216 +00:00 [INF] - Updated reported properties
<6> 2022-11-04 17:16:45.888 +00:00 [INF] - Received direct method call - GetModuleLogs
<6> 2022-11-04 17:16:45.888 +00:00 [INF] - Received request GetModuleLogs with payload
<6> 2022-11-04 17:16:45.889 +00:00 [INF] - Processing request to get logs for {"schemaVersion":"1.0","items":{"id":"\\bedgeAgent\\b","filter":{"tail":1500,"since":"15m","until":null,"loglevel":null,"regex":""}},"encoding":1,"contentType":1}
edge-hub logs
2022-11-04 17:21:30  Starting Edge Hub
2022-11-04 17:21:30  Starting Edge Hub
2022-11-04 17:21:30  Changing ownership of storage folder: /tmp/edgeHub to 13623
2022-11-04 17:21:30  Changing ownership of backup folder: /tmp/edgeHub_backup to 13623
2022-11-04 17:21:30.287 +00:00 Edge Hub Main()
<6> 2022-11-04 17:21:30.790 +00:00 [INF] - Installing certificates [CN=iotedged workload ca thinkedgejoproulx:11/02/2022 15:48:07] to Root
<6> 2022-11-04 17:21:30.829 +00:00 [INF] - Installing certificates [CN=iotedged workload ca thinkedgejoproulx:11/02/2022 15:48:07] to Root
<6> 2022-11-04 17:21:30.834 +00:00 [INF] - Enabling SSL protocols: Tls12
<6> 2022-11-04 17:21:30.926 +00:00 [INF] - Experimental features configuration: {"Enabled":false,"DisableCloudSubscriptions":false,"DisableConnectivityCheck":false}
<6> 2022-11-04 17:21:31.301 +00:00 [INF] - Created persistent store at /tmp/edgeHub
<6> 2022-11-04 17:21:31.332 +00:00 [INF] - Initializing Edge Hub
<6> 2022-11-04 17:21:31.332 +00:00 [INF] - 
        █████╗ ███████╗██╗   ██╗██████╗ ███████╗
       ██╔══██╗╚══███╔╝██║   ██║██╔══██╗██╔════╝
       ███████║  ███╔╝ ██║   ██║██████╔╝█████╗
       ██╔══██║ ███╔╝  ██║   ██║██╔══██╗██╔══╝
       ██║  ██║███████╗╚██████╔╝██║  ██║███████╗
       ╚═╝  ╚═╝╚══════╝ ╚═════╝ ╚═╝  ╚═╝╚══════╝

 ██╗ ██████╗ ████████╗    ███████╗██████╗  ██████╗ ███████╗
 ██║██╔═══██╗╚══██╔══╝    ██╔════╝██╔══██╗██╔════╝ ██╔════╝
 ██║██║   ██║   ██║       █████╗  ██║  ██║██║  ███╗█████╗
 ██║██║   ██║   ██║       ██╔══╝  ██║  ██║██║   ██║██╔══╝
 ██║╚██████╔╝   ██║       ███████╗██████╔╝╚██████╔╝███████╗
 ╚═╝ ╚═════╝    ╚═╝       ╚══════╝╚═════╝  ╚═════╝ ╚══════╝

<6> 2022-11-04 17:21:31.333 +00:00 [INF] - Version - 1.4.0.59620487 (157650c3b1c5fb58b17f5975e6f1417a5a5f4f89)
<6> 2022-11-04 17:21:31.333 +00:00 [INF] - OptimizeForPerformance=True
<6> 2022-11-04 17:21:31.333 +00:00 [INF] - MessageAckTimeoutSecs=30
<6> 2022-11-04 17:21:31.333 +00:00 [INF] - Loaded server certificate with expiration date of "2022-11-02T15:48:07.0000000+00:00"
<6> 2022-11-04 17:21:31.338 +00:00 [INF] - Using Asp Net server for metrics
<6> 2022-11-04 17:21:31.384 +00:00 [INF] - Created new message store
<6> 2022-11-04 17:21:31.385 +00:00 [INF] - Started task to cleanup processed and stale messages
<6> 2022-11-04 17:21:31.433 +00:00 [INF] - Created DeviceConnectivityManager with connected check frequency 00:05:00 and disconnected check frequency 00:02:00
<6> 2022-11-04 17:21:31.476 +00:00 [INF] - Add node: thinkedge_joproulx
<6> 2022-11-04 17:21:31.477 +00:00 [INF] - Add node: thinkedge_joproulx/$edgeAgent
<6> 2022-11-04 17:21:31.477 +00:00 [INF] - Add node: thinkedge_joproulx/$edgeHub
<6> 2022-11-04 17:21:31.477 +00:00 [INF] - Add node: thinkedge_joproulx/BacnetStub
<6> 2022-11-04 17:21:31.477 +00:00 [INF] - Add node: thinkedge_joproulx/IoTEdgeMetricsCollector
<6> 2022-11-04 17:21:31.477 +00:00 [INF] - Add node: thinkedge_joproulx/IoTHubDeviceUpdate
<6> 2022-11-04 17:21:31.477 +00:00 [INF] - Add node: thinkedge_joproulx/PulseApp
<6> 2022-11-04 17:21:31.477 +00:00 [INF] - Add node: thinkedge_joproulx/PulseToolbox
<6> 2022-11-04 17:21:31.478 +00:00 [INF] - Created device scope identities cache
<6> 2022-11-04 17:21:31.487 +00:00 [INF] - Starting refresh of device scope identities cache
<6> 2022-11-04 17:21:31.490 +00:00 [INF] - Initialized storing twin manager
<6> 2022-11-04 17:21:31.497 +00:00 [INF] - Updated node: thinkedge_joproulx/$edgeHub
<6> 2022-11-04 17:21:31.497 +00:00 [INF] - Initializing configuration
<6> 2022-11-04 17:21:31.509 +00:00 [INF] - New device connection for device thinkedge_joproulx/$edgeHub
<6> 2022-11-04 17:21:31.517 +00:00 [INF] - Client thinkedge_joproulx/$edgeHub connected to edgeHub, processing existing subscriptions.
<6> 2022-11-04 17:21:31.552 +00:00 [INF] - Attempting to connect to IoT Hub for client thinkedge_joproulx/$edgeHub via AMQP...
<6> 2022-11-04 17:21:31.566 +00:00 [INF] - Processing pending subscriptions for thinkedge_joproulx/$edgeHub
<6> 2022-11-04 17:21:31.718 +00:00 [INF] - Experimental features configuration: {"Enabled":false,"DisableCloudSubscriptions":false,"DisableConnectivityCheck":false}
<6> 2022-11-04 17:21:31.730 +00:00 [INF] - Starting timer to authenticate connections with a period of 300 seconds
<4> 2022-11-04 17:21:31.781 +00:00 [WRN] - Server certificate is expired ("-2.01:33:24.7808729"). Not scheduling renewal.
<6> 2022-11-04 17:21:31.894 +00:00 [INF] - Created cloud endpoint iothub with max batch size 10 and fan-out factor of 10.
<6> 2022-11-04 17:21:32.017 +00:00 [INF] - Set the following 1 route(s) in edge hub
<6> 2022-11-04 17:21:32.017 +00:00 [INF] - PulseAppToIoTHub: FROM /messages/modules/PulseApp/outputs/* INTO $upstream
<6> 2022-11-04 17:21:32.017 +00:00 [INF] - Updated message store TTL to 604800 seconds
<6> 2022-11-04 17:21:32.018 +00:00 [INF] - Updated the edge hub store and forward configuration
<6> 2022-11-04 17:21:32.028 +00:00 [INF] - Started operation Get EdgeHub config
<6> 2022-11-04 17:21:32.028 +00:00 [INF] - Initialized edge hub configuration
<6> 2022-11-04 17:21:32.029 +00:00 [INF] - Starting protocol heads - (MQTT, AMQP, HTTP)
<6> 2022-11-04 17:21:32.030 +00:00 [INF] - Starting MQTT head
<6> 2022-11-04 17:21:32.067 +00:00 [INF] - Initializing TLS endpoint on port 8883 for MQTT head.
<6> 2022-11-04 17:21:32.106 +00:00 [INF] - Starting AMQP head
<6> 2022-11-04 17:21:32.111 +00:00 [INF] - Started MQTT head
<6> 2022-11-04 17:21:32.149 +00:00 [INF] - Started AMQP head
<6> 2022-11-04 17:21:32.150 +00:00 [INF] - Starting HTTP head
<6> 2022-11-04 17:21:32.200 +00:00 [INF] - Not changed node: thinkedge_joproulx
<6> 2022-11-04 17:21:32.200 +00:00 [INF] - Not changed node: thinkedge_joproulx/$edgeAgent
<6> 2022-11-04 17:21:32.200 +00:00 [INF] - Updated node: thinkedge_joproulx/$edgeHub
<6> 2022-11-04 17:21:32.225 +00:00 [INF] - Service identity for thinkedge_joproulx/$edgeHub in device scope was updated.
<6> 2022-11-04 17:21:32.225 +00:00 [INF] - Not changed node: thinkedge_joproulx/IoTHubDeviceUpdate
<6> 2022-11-04 17:21:32.225 +00:00 [INF] - Not changed node: thinkedge_joproulx/IoTEdgeMetricsCollector
<6> 2022-11-04 17:21:32.225 +00:00 [INF] - Not changed node: thinkedge_joproulx/BacnetStub
<6> 2022-11-04 17:21:32.225 +00:00 [INF] - Not changed node: thinkedge_joproulx/PulseApp
<6> 2022-11-04 17:21:32.225 +00:00 [INF] - Not changed node: thinkedge_joproulx/PulseToolbox
<6> 2022-11-04 17:21:32.227 +00:00 [INF] - Done refreshing device scope identities cache. Waiting for 60 minutes.
<4> 2022-11-04 17:21:32.279 +00:00 [WRN] - Overriding address(es) '"http://+:80"'. Binding to endpoints defined via IConfiguration and/or UseKestrel() instead.
<6> 2022-11-04 17:21:32.291 +00:00 [INF] - Started HTTP head
<6> 2022-11-04 17:21:32.366 +00:00 [INF] - Exiting disconnected state
<6> 2022-11-04 17:21:32.377 +00:00 [INF] - Device connected to cloud, processing subscriptions for connected clients.
<6> 2022-11-04 17:21:32.378 +00:00 [INF] - Processing subscriptions for client thinkedge_joproulx/$edgeHub on device connected to cloud.
<6> 2022-11-04 17:21:32.378 +00:00 [INF] - Skipping thinkedge_joproulx/$edgeHub for subscription processing, as it is currently being processed.
<6> 2022-11-04 17:21:32.380 +00:00 [INF] - Received device connected callback
<6> 2022-11-04 17:21:32.384 +00:00 [INF] - Skipping cache refresh, waiting 119 seconds until refreshing again.
<6> 2022-11-04 17:21:32.391 +00:00 [INF] - Entering connected state
<6> 2022-11-04 17:21:32.393 +00:00 [INF] - Cloud connection for thinkedge_joproulx/$edgeHub is True
<6> 2022-11-04 17:21:32.395 +00:00 [INF] - Connection status for thinkedge_joproulx/$edgeHub changed to ConnectionEstablished
<6> 2022-11-04 17:21:32.396 +00:00 [INF] - Client thinkedge_joproulx/$edgeHub connected to cloud, processing existing subscriptions.
<6> 2022-11-04 17:21:32.396 +00:00 [INF] - Skipping thinkedge_joproulx/$edgeHub for subscription processing, as it is currently being processed.
<6> 2022-11-04 17:21:32.398 +00:00 [INF] - Created cloud proxy for client thinkedge_joproulx/$edgeHub via AMQP, with client operation timeout 20 seconds.
<6> 2022-11-04 17:21:32.400 +00:00 [INF] - Initialized cloud proxy ff7e051d-efea-4fbe-bcf5-acd53b30903f for thinkedge_joproulx/$edgeHub
<6> 2022-11-04 17:21:32.401 +00:00 [INF] - Created cloud connection for client thinkedge_joproulx/$edgeHub
<6> 2022-11-04 17:21:32.546 +00:00 [INF] - Processing pending subscriptions for thinkedge_joproulx/$edgeHub
<6> 2022-11-04 17:21:32.644 +00:00 [INF] - Updated reported properties for thinkedge_joproulx/$edgeHub
<6> 2022-11-04 17:21:32.717 +00:00 [INF] - Obtained edge hub config from module twin
<6> 2022-11-04 17:21:32.746 +00:00 [INF] - Set the following 1 route(s) in edge hub
<6> 2022-11-04 17:21:32.746 +00:00 [INF] - PulseAppToIoTHub: FROM /messages/modules/PulseApp/outputs/* INTO $upstream
<6> 2022-11-04 17:21:32.746 +00:00 [INF] - Updated message store TTL to 604800 seconds
<6> 2022-11-04 17:21:32.746 +00:00 [INF] - Updated the edge hub store and forward configuration

Additional Information

Not sure if this could be related, but we also see these errors on some of our devices:


Unhandled exception. System.Net.Sockets.SocketException (111): Connection refused
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.CreateException(SocketError error, Boolean forAsyncThrow)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ConnectAsync(Socket socket)
   at System.Net.Sockets.Socket.ConnectAsync(EndPoint remoteEP, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.ConnectAsync(EndPoint remoteEP)
   at System.Net.Sockets.SocketTaskExtensions.ConnectAsync(Socket socket, EndPoint remoteEP)
   at Microsoft.Azure.Devices.Client.HsmAuthentication.Transport.HttpUdsMessageHandler.GetConnectedSocketAsync()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Microsoft.Azure.Devices.Client.HsmAuthentication.Transport.HttpUdsMessageHandler.GetConnectedSocketAsync()
   at Microsoft.Azure.Devices.Client.HsmAuthentication.Transport.HttpUdsMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Microsoft.Azure.Devices.Client.HsmAuthentication.Transport.HttpUdsMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpMessageInvoker.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Microsoft.Azure.Devices.Client.HsmAuthentication.GeneratedCode.HttpHsmClient.TrustBundleAsync(String api_version, CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Microsoft.Azure.Devices.Client.HsmAuthentication.GeneratedCode.HttpHsmClient.TrustBundleAsync(String api_version, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.HsmAuthentication.GeneratedCode.HttpHsmClient.TrustBundleAsync(String api_version)
   at Microsoft.Azure.Devices.Client.Edge.TrustBundleProvider.<>c__DisplayClass3_0.<GetTrustBundleWithRetryAsync>b__0()
   at Microsoft.Azure.Devices.Client.TransientFaultHandling.RetryPolicy.RunWithRetryAsync[T](Func`1 taskFunc, ShouldRetry shouldRetry, Func`2 isTransient, Action`3 onRetrying, Boolean fastFirstRetry, CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Microsoft.Azure.Devices.Client.TransientFaultHandling.RetryPolicy.RunWithRetryAsync[T](Func`1 taskFunc, ShouldRetry shouldRetry, Func`2 isTransient, Action`3 onRetrying, Boolean fastFirstRetry, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.TransientFaultHandling.RetryPolicy.RunWithRetryAsync[TResult](Func`1 taskFunc, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.TransientFaultHandling.RetryPolicy.RunWithRetryAsync[TResult](Func`1 taskFunc)
   at Microsoft.Azure.Devices.Client.Edge.TrustBundleProvider.GetTrustBundleWithRetryAsync(HttpHsmClient hsmHttpClient, String apiVersion)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Microsoft.Azure.Devices.Client.Edge.TrustBundleProvider.GetTrustBundleWithRetryAsync(HttpHsmClient hsmHttpClient, String apiVersion)
   at Microsoft.Azure.Devices.Client.Edge.TrustBundleProvider.GetTrustBundleAsync(Uri providerUri, String apiVersion)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Microsoft.Azure.Devices.Client.Edge.TrustBundleProvider.GetTrustBundleAsync(Uri providerUri, String apiVersion)
   at Microsoft.Azure.Devices.Client.Edge.EdgeModuleClientFactory.CreateInternalClientFromEnvironmentAsync()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Microsoft.Azure.Devices.Client.Edge.EdgeModuleClientFactory.CreateInternalClientFromEnvironmentAsync()
   at Microsoft.Azure.Devices.Client.Edge.EdgeModuleClientFactory.CreateAsync()
   at Microsoft.Azure.Devices.Client.ModuleClient.CreateFromEnvironmentAsync(ITransportSettings[] transportSettings, ClientOptions options)
   at Microsoft.Azure.Devices.Client.ModuleClient.CreateFromEnvironmentAsync(TransportType transportType, ClientOptions options)
   at Microsoft.Azure.Devices.Client.ModuleClient.CreateFromEnvironmentAsync(ClientOptions options)
   at PulseModule.ProgramBootstrap.ProgramRunner.MainLoop(String[] args, Boolean checkIotHubConnectionStringEnvVar) in /app/PulseModule/PulseModule.ProgramBootstrap/ProgramRunner.cs:line 28
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at PulseModule.ProgramBootstrap.ProgramRunner.MainLoop(String[] args, Boolean checkIotHubConnectionStringEnvVar)
   at PulseModule.Program.Main(String[] args) in /app/PulseModule/PulseModule/Program.cs:line 11
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at PulseModule.Program.Main(String[] args)
   at PulseModule.Program.<Main>(String[] args)
--- End of stack trace from previous location ---
   at Microsoft.Azure.Devices.Client.HsmAuthentication.Transport.HttpUdsMessageHandler.GetConnectedSocketAsync()
   at Microsoft.Azure.Devices.Client.HsmAuthentication.Transport.HttpUdsMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at Microsoft.Azure.Devices.Client.HsmAuthentication.GeneratedCode.HttpHsmClient.TrustBundleAsync(String api_version, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.TransientFaultHandling.RetryPolicy.RunWithRetryAsync[T](Func`1 taskFunc, ShouldRetry shouldRetry, Func`2 isTransient, Action`3 onRetrying, Boolean fastFirstRetry, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Client.Edge.TrustBundleProvider.GetTrustBundleWithRetryAsync(HttpHsmClient hsmHttpClient, String apiVersion)
   at Microsoft.Azure.Devices.Client.Edge.TrustBundleProvider.GetTrustBundleAsync(Uri providerUri, String apiVersion)
   at Microsoft.Azure.Devices.Client.Edge.EdgeModuleClientFactory.CreateInternalClientFromEnvironmentAsync()
   at PulseModule.ProgramBootstrap.ProgramRunner.MainLoop(String[] args, Boolean checkIotHubConnectionStringEnvVar) in /app/PulseModule/PulseModule.ProgramBootstrap/ProgramRunner.cs:line 28
   at PulseModule.Program.Main(String[] args) in /app/PulseModule/PulseModule/Program.cs:line 11
   at PulseModule.Program.<Main>(String[] args)

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 16 (8 by maintainers)

Most upvoted comments

Ok, thanks. I’ve updated the release notes for 1.3 and 1.4 LTS to call out this quirk in addition to our more general upgrade docs linked earlier.

image

image

Ok, and is the module certificate used to authenticate modules to the IoT Hub?

No, Edge CA generates module server certificates. For IoT Hub, it’s the device identity (client) certificate (and SAS keys generated off of it) https://learn.microsoft.com/en-us/azure/iot-edge/iot-edge-certs?view=iotedge-1.4#iot-hub-verifies-iot-edge-device-identity

Or its only use is to authenticate modules when communicating with each others (inside a device and between devices)?

Correct

  • In this situation, we assumed that it was ok to keep the Quickstart Edge CA even in prod environment. Does this make sense? If not, could you please explain why?

Microsoft’s official position on using quickstart Edge CA in production is, quoted from our docs:

Never use the quickstart Edge CA for production because the locally generated certificate in it isn’t connected to a PKI.

The security of a certificates-based identity derives from a well-operated PKI (the infrastructure) in which the certificate (a document) is only a component. A well-operated PKI enables definition, application, management, and enforcements of security policies to include but not limited to certificates issuance, revocation, and lifecycle management.

  • We are using the default config for the QuickStart Edge CA (we didn’t modify the related section in config.toml). From the doc, we assume that the QuickStart Edge CA will be renewed at 80% of its lifetime. Is this valid?

In 1.3 and above, yes. To ensure that the default quickstart Edge CA auto-renew mechanism is engaged, run sudo iotedge config apply. Running this command after every IoT Edge version update is the general recommendation, but in the case of updating from 1.2 -> 1.3+ this step is critical to ensure the new quickstart Edge CA auto-renew mechanism begins working. This is because prior in 1.2 and below, quickststart Edge CA renews only with a full IoT Edge restart, and the 1.3+ updated behavior is injected at the configuration apply stage.