iotedge: EdgeHub not reconnecting to IotHub after network failure.

Last night we had a network failure disconnecting the edgeDevice from the internet at 03:07. The device came online again at 06:39 but the edgeHub did not reconnect to iotHub.

The output of iotedge list said that all modules are running (and have been running for a week). The IotEdgeExtension in VSCode displayed the edgeDevice as disconnected.

To reconnect the device to IotHub we had to restart the edgeHub module.

Logging output of edgeAgent:

2019-03-28 23:55:08.600 +00:00 [INF] - Starting compaction of stores
2019-03-28 23:55:08.601 +00:00 [INF] - Starting compaction of store moduleState
2019-03-28 23:55:08.601 +00:00 [INF] - Starting compaction of store default
2019-03-28 23:55:08.602 +00:00 [INF] - Starting compaction of store deploymentConfig
2019-03-29 01:55:08.605 +00:00 [INF] - Starting compaction of stores
2019-03-29 01:55:08.606 +00:00 [INF] - Starting compaction of store moduleState
2019-03-29 01:55:08.606 +00:00 [INF] - Starting compaction of store default
2019-03-29 01:55:08.607 +00:00 [INF] - Starting compaction of store deploymentConfig
2019-03-29 03:25:54.644 +00:00 [ERR] - Error refreshing edge agent configuration from twin.
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Client.InternalClient.<>c.<ApplyTimeoutTwin>b__64_2(Task`1 t)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.Tasks.Task.<>c.<.cctor>b__278_1(Object obj)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Agent.IoTHub.EdgeAgentConnection.RefreshTwinAsync() in /home/vsts/work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.IoTHub/EdgeAgentConnection.cs:line 190
2019-03-29 03:25:58.943 +00:00 [WRN] - Reconcile failed because of the an exception
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.ReconcileAsync(CancellationToken token) in /home/vsts/work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 115
2019-03-29 03:26:28.978 +00:00 [WRN] - Updating reported properties failed with error Operation timeout expired. type System.TimeoutException
2019-03-29 03:26:34.063 +00:00 [WRN] - Reconcile failed because of the an exception
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.ReconcileAsync(CancellationToken token) in /home/vsts/work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 115
2019-03-29 03:27:04.088 +00:00 [WRN] - Updating reported properties failed with error Operation timeout expired. type System.TimeoutException
2019-03-29 03:27:09.180 +00:00 [WRN] - Reconcile failed because of the an exception
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.ReconcileAsync(CancellationToken token) in /home/vsts/work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 115
2019-03-29 03:27:39.207 +00:00 [WRN] - Updating reported properties failed with error Operation timeout expired. type System.TimeoutException
2019-03-29 03:27:44.297 +00:00 [WRN] - Reconcile failed because of the an exception
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.ReconcileAsync(CancellationToken token) in /home/vsts/work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 115
2019-03-29 03:28:14.332 +00:00 [WRN] - Updating reported properties failed with error Operation timeout expired. type System.TimeoutException
2019-03-29 03:28:19.416 +00:00 [WRN] - Reconcile failed because of the an exception
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.ReconcileAsync(CancellationToken token) in /home/vsts/work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 115
2019-03-29 03:28:49.458 +00:00 [WRN] - Updating reported properties failed with error Operation timeout expired. type System.TimeoutException
2019-03-29 03:28:54.509 +00:00 [WRN] - Reconcile failed because of the an exception
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.ReconcileAsync(CancellationToken token) in /home/vsts/work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 115
2019-03-29 03:29:24.531 +00:00 [WRN] - Updating reported properties failed with error Operation timeout expired. type System.TimeoutException
2019-03-29 03:29:29.633 +00:00 [WRN] - Reconcile failed because of the an exception
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.ReconcileAsync(CancellationToken token) in /home/vsts/work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Core/Agent.cs:line 115

The rest of that logfile is the same error over and over again.

Logging output of edgeHub does not reach back to the time of the network failure (because it’s set to debug and produces a lot of output) but it is repeating the following error when trying to send messages upstream:

2019-03-29 09:32:35.053 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Core.Routing.CloudEndpoint] - Sending 1 message(s) upstream.
2019-03-29 09:32:35.191 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient] - Operation OpenAsync timed out for bewaPi1/OpcDataToMessage
2019-03-29 09:32:35.191 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.CloudProxy.DeviceConnectivityManager] - IotHub call timed out
2019-03-29 09:32:35.205 +00:00 [WRN] [Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection] - Error creating cloud connection for client bewaPi1/OpcDataToMessage
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Client.InternalClient.<>c.<ApplyTimeout>b__62_2(Task t)
   at System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
   at System.Threading.Tasks.Task.<>c.<.cctor>b__278_1(Object obj)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ModuleClientWrapper.OpenAsync() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ModuleClientWrapper.cs:line 48
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient.<>c__DisplayClass28_0.<<InvokeFunc>b__0>d.MoveNext() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ConnectivityAwareClient.cs:line 174
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient.InvokeFunc[T](Func`1 func, String operation, Boolean useForConnectivityCheck) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ConnectivityAwareClient.cs:line 134
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.ConnectToIoTHub(ITokenProvider newTokenProvider) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 125
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.CreateNewCloudProxyAsync(ITokenProvider newTokenProvider) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 97
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.Create(IIdentity identity, Action`2 connectionStatusChangedHandler, ITransportSettings[] transportSettings, IMessageConverterProvider messageConverterProvider, IClientProvider clientProvider, ICloudListener cloudListener, ITokenProvider tokenProvider, TimeSpan idleTimeout, Boolean closeOnIdleTimeout, TimeSpan operationTimeout) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 86
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnectionProvider.<>c__DisplayClass15_1.<<Connect>b__2>d.MoveNext() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnectionProvider.cs:line 134
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnectionProvider.Connect(IIdentity identity, Action`2 connectionStatusChangedHandler) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnectionProvider.cs:line 129
2019-03-29 09:32:35.222 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Core.ConnectionManager] - Error getting cloud connection for device bewaPi1/OpcDataToMessage
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Client.InternalClient.<>c.<ApplyTimeout>b__62_2(Task t)
   at System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
   at System.Threading.Tasks.Task.<>c.<.cctor>b__278_1(Object obj)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ModuleClientWrapper.OpenAsync() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ModuleClientWrapper.cs:line 48
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient.<>c__DisplayClass28_0.<<InvokeFunc>b__0>d.MoveNext() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ConnectivityAwareClient.cs:line 174
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient.InvokeFunc[T](Func`1 func, String operation, Boolean useForConnectivityCheck) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ConnectivityAwareClient.cs:line 134
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.ConnectToIoTHub(ITokenProvider newTokenProvider) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 125
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.CreateNewCloudProxyAsync(ITokenProvider newTokenProvider) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 97
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.Create(IIdentity identity, Action`2 connectionStatusChangedHandler, ITransportSettings[] transportSettings, IMessageConverterProvider messageConverterProvider, IClientProvider clientProvider, ICloudListener cloudListener, ITokenProvider tokenProvider, TimeSpan idleTimeout, Boolean closeOnIdleTimeout, TimeSpan operationTimeout) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 86
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnectionProvider.<>c__DisplayClass15_1.<<Connect>b__2>d.MoveNext() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnectionProvider.cs:line 134
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnectionProvider.Connect(IIdentity identity, Action`2 connectionStatusChangedHandler) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnectionProvider.cs:line 129
2019-03-29 09:32:35.230 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Core.TwinManager] - Processing ConnectionEstablished for device bewaPi1/OpcDataToMessage
2019-03-29 09:32:35.231 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Core.IDeviceScopeIdentitiesCache] - Getting service identity for bewaPi1/OpcDataToMessage
2019-03-29 09:32:35.232 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection] - Creating cloud connection for client bewaPi1/OpcDataToMessage using EdgeHub credentials
2019-03-29 09:32:35.232 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection] - Attempting to connect to IoT Hub for client bewaPi1/OpcDataToMessage via AMQP...
2019-03-29 09:32:35.230 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Core.ConnectionManager] - Error getting cloud connection for device bewaPi1/OpcDataToMessage
System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Client.InternalClient.<>c.<ApplyTimeout>b__62_2(Task t)
   at System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
   at System.Threading.Tasks.Task.<>c.<.cctor>b__278_1(Object obj)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ModuleClientWrapper.OpenAsync() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ModuleClientWrapper.cs:line 48
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient.<>c__DisplayClass28_0.<<InvokeFunc>b__0>d.MoveNext() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ConnectivityAwareClient.cs:line 174
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient.InvokeFunc[T](Func`1 func, String operation, Boolean useForConnectivityCheck) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ConnectivityAwareClient.cs:line 134
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.ConnectToIoTHub(ITokenProvider newTokenProvider) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 125
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.CreateNewCloudProxyAsync(ITokenProvider newTokenProvider) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 97
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.Create(IIdentity identity, Action`2 connectionStatusChangedHandler, ITransportSettings[] transportSettings, IMessageConverterProvider messageConverterProvider, IClientProvider clientProvider, ICloudListener cloudListener, ITokenProvider tokenProvider, TimeSpan idleTimeout, Boolean closeOnIdleTimeout, TimeSpan operationTimeout) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 86
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnectionProvider.<>c__DisplayClass15_1.<<Connect>b__2>d.MoveNext() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnectionProvider.cs:line 134
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnectionProvider.Connect(IIdentity identity, Action`2 connectionStatusChangedHandler) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnectionProvider.cs:line 129
2019-03-29 09:32:35.255 +00:00 [WRN] [Microsoft.Azure.Devices.Edge.Hub.Core.Routing.CloudEndpoint] - Could not get an active Iot Hub connection for device bewaPi1/OpcDataToMessage

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 28 (7 by maintainers)

Most upvoted comments

This is fixed with 1.0.8 release coming out by end of this week. Please give it a try when available and re-open issue as needed.

@veyalla We’re using 1.0.7.1 and use MqttWs as the upstream protocol. (We tried several other upstream protocols but had the same problem with those as well)

The same huge delay before reconnecting also occurs in our tests. This issue is a show-stopper for a platform that is operated via the cloud! Especially when the products developed on top of it need either human interaction (you can’t/shouldn’t expect a user to wait 15 minutes) or are limited in storage (thus requiring a cloud connection to upload data).

@veyalla I hope that you can prioritize releasing a fix as soon as possible.