iotedge: edgehub fails to clean messages for endpoint 'Value not found in store'

Expected Behavior

EdgeHub should clean up messages and device memory should maintain stable level.

Current Behavior

‘System.InvalidOperationException: Value not found in store’ error is thrown while edge attempts to clean up messages for an endpoint. Full exception below was found in edgeHub logs.

<4> 2020-10-25 14:58:06.346 +00:00 [WRN] - Error cleaning up messages for endpoint Core-c6f13dd7/RadioServiceModule/UdpRfPayloadReceived System.InvalidOperationException: Value not found in store at Microsoft.Azure.Devices.Edge.Util.Option1.Expect[TException](Func1 exception) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/Opt ion.cs:line 117 at Microsoft.Azure.Devices.Edge.Storage.EntityStore2.Update(TK key, Func2 updator, CancellationToken cancellationToken) at Microsoft.Azure.Devices.Edge.Util.TaskEx.TimeoutAfter[T](Task1 task, TimeSpan timeout) at Microsoft.Azure.Devices.Edge.Hub.Core.Storage.MessageStore.CleanupProcessor.<>c__DisplayClass9_0.<<CleanupMessages>g__DeleteMessageCallback|0>d.MoveNext() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Core/storage/MessageStore.cs:line 267 --- End of stack trace from previous location where exception was thrown --- at Microsoft.Azure.Devices.Edge.Storage.SequentialStore1.<>c__DisplayClass16_0.<<RemoveFirst>b__0>d.MoveNext() in /home/vsts/work/1/s/edge-util/src/Microsof t.Azure.Devices.Edge.Storage/SequentialStore.cs:line 86 — End of stack trace from previous location where exception was thrown — at Microsoft.Azure.Devices.Edge.Storage.SequentialStore1.RemoveFirst(Func3 predicate, CancellationToken cancellationToken) in /home/vsts/work/1/s/edge-util /src/Microsoft.Azure.Devices.Edge.Storage/SequentialStore.cs:line 82 at Microsoft.Azure.Devices.Edge.Hub.Core.Storage.MessageStore.CleanupProcessor.CleanupMessages() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices. Edge.Hub.Core/storage/MessageStore.cs:line 294

Context (Environment)

We have around 10 devices onsite broadcasting and listening to messages broadcasted over the local network using udp. All of the devices onsite are listening and operating correctly except one. Each device has an iot edge deployment with one module that listens to udp messages over a specified port and another module to process messages. The Udp listener module sends messages to the processing module via inter module edge message. We only have one device with the memory leak with the edgehub log exception posted above. All the devices have a very similar traffic volume where the udp listener recieves 600 messages per 5 minutes.

Steps to Reproduce

Below is a basic list of steps to reproduce. The Exception doesn’t always occur under similar circumstances.

  1. Get a raspbery pi device with raspbian stretch os
  2. Install IotEdge
  3. set deployment for a module (running .net core 2.1) to listen for udp messages via ethernet port.
  4. set deployment for a second module to recieve messages from udp listener…
  5. Broad cast udp message with byte array payload over local network.

Output of iotedge check

Click here
Configuration checks
--------------------
√ config.yaml is well-formed - OK
√ config.yaml has well-formed connection string - OK
√ container engine is installed and functional - OK
√ config.yaml has correct hostname - OK
√ config.yaml has correct URIs for daemon mgmt endpoint - OK
‼ latest security daemon - Warning
    Installed IoT Edge daemon has version 1.0.9.5 but 1.0.10 is the latest stable version available.
    Please see https://aka.ms/iotedge-update-runtime for update instructions.
√ host time is close to real time - OK
√ container time is close to host time - OK
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
        caused by: Could not open container engine config file /etc/docker/daemon.json
        caused by: No such file or directory (os error 2)
√ production readiness: certificates - OK
√ production readiness: container engine - OK
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
        caused by: Could not open container engine config file /etc/docker/daemon.json
        caused by: No such file or directory (os error 2)
‼ production readiness: Edge Agent's storage directory is persisted on the host filesystem - Warning
    The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
‼ production readiness: Edge Hub's storage directory is persisted on the host filesystem - Warning
    The edgeHub module is not configured to persist its /tmp/edgeHub directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.

Connectivity checks
-------------------
√ host can connect to and perform TLS handshake with IoT Hub AMQP port - OK
√ host can connect to and perform TLS handshake with IoT Hub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with IoT Hub MQTT port - OK
√ container on the default network can connect to IoT Hub AMQP port - OK
√ container on the default network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the default network can connect to IoT Hub MQTT port - OK
√ container on the IoT Edge module network can connect to IoT Hub AMQP port - OK
√ container on the IoT Edge module network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the IoT Edge module network can connect to IoT Hub MQTT port - OK

18 check(s) succeeded.
5 check(s) raised warnings.

Device Information

  • Host OS: Raspbian GNU/Linux 9 (stretch)
  • Architecture: ARMv7 Processor rev 4 (v7l), 32 bit
  • Container OS: Rasbian stretch linux containers
  • Hardware: Raspberry Pi cm3+

Runtime Versions

  • iotedged: iotedge 1.0.9.5
  • Edge Agent: mcr.microsoft.com/azureiotedge-agent:1.0
  • Edge Hub: mcr.microsoft.com/azureiotedge-hub:1.0
  • Docker/Moby: 3.0.13+azure
  • Module Code: C# .Net core 2.1

Note: when using Windows containers on Windows, run docker -H npipe:////./pipe/iotedge_moby_engine version instead

Logs

edge-agent logs

<6> 2020-10-26 11:36:19.935 +00:00 [INF] - Obtained Edge agent twin from IoTHub with desired properties version 11 and reported properties version 61.
<6> 2020-10-26 11:36:19.955 +00:00 [INF] - Successfully completed periodic operation refresh twin config
<6> 2020-10-26 12:36:07.417 +00:00 [INF] - Starting compaction of stores
<6> 2020-10-26 12:36:07.468 +00:00 [INF] - Starting compaction of store deploymentConfig
<6> 2020-10-26 12:36:07.540 +00:00 [INF] - Starting compaction of store default
<6> 2020-10-26 12:36:07.541 +00:00 [INF] - Starting compaction of store moduleState
<6> 2020-10-26 12:36:19.958 +00:00 [INF] - Starting periodic operation refresh twin config...
<6> 2020-10-26 12:36:20.095 +00:00 [INF] - Obtained Edge agent twin from IoTHub with desired properties version 11 and reported properties version 61.
<6> 2020-10-26 12:36:20.117 +00:00 [INF] - Successfully completed periodic operation refresh twin config
<6> 2020-10-26 13:36:20.127 +00:00 [INF] - Starting periodic operation refresh twin config...
<6> 2020-10-26 13:36:20.265 +00:00 [INF] - Obtained Edge agent twin from IoTHub with desired properties version 11 and reported properties version 61.
<6> 2020-10-26 13:36:20.292 +00:00 [INF] - Successfully completed periodic operation refresh twin config
<6> 2020-10-26 14:25:54.085 +00:00 [INF] - Updated reported properties
<6> 2020-10-26 14:36:07.419 +00:00 [INF] - Starting compaction of stores
<6> 2020-10-26 14:36:07.420 +00:00 [INF] - Starting compaction of store deploymentConfig
<6> 2020-10-26 14:36:07.495 +00:00 [INF] - Starting compaction of store default
<6> 2020-10-26 14:36:07.496 +00:00 [INF] - Starting compaction of store moduleState
<6> 2020-10-26 14:36:20.289 +00:00 [INF] - Starting periodic operation refresh twin config...
<6> 2020-10-26 14:36:20.441 +00:00 [INF] - Obtained Edge agent twin from IoTHub with desired properties version 11 and reported properties version 62.
<6> 2020-10-26 14:36:20.469 +00:00 [INF] - Successfully completed periodic operation refresh twin config
<6> 2020-10-26 15:36:20.475 +00:00 [INF] - Starting periodic operation refresh twin config...
<6> 2020-10-26 15:36:20.637 +00:00 [INF] - Obtained Edge agent twin from IoTHub with desired properties version 11 and reported properties version 62.
<6> 2020-10-26 15:36:20.651 +00:00 [INF] - Successfully completed periodic operation refresh twin config
<6> 2020-10-26 16:36:07.428 +00:00 [INF] - Starting compaction of stores
<6> 2020-10-26 16:36:07.493 +00:00 [INF] - Starting compaction of store deploymentConfig
<6> 2020-10-26 16:36:07.562 +00:00 [INF] - Starting compaction of store default
<6> 2020-10-26 16:36:07.562 +00:00 [INF] - Starting compaction of store moduleState
<6> 2020-10-26 16:36:20.646 +00:00 [INF] - Starting periodic operation refresh twin config...
<6> 2020-10-26 16:36:20.780 +00:00 [INF] - Obtained Edge agent twin from IoTHub with desired properties version 11 and reported properties version 62.
<6> 2020-10-26 16:36:20.792 +00:00 [INF] - Successfully completed periodic operation refresh twin config


edge-hub logs

<6> 2020-10-26 16:47:25.972 +00:00 [INF] - Updated reported properties for Core-c6f13dd7/LocalNetworkModule
<6> 2020-10-26 16:47:31.137 +00:00 [INF] - Updated reported properties for Core-c6f13dd7/BacnetRouterModule
<6> 2020-10-26 16:47:39.797 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/RulesEngineModule/sharedCommands
<6> 2020-10-26 16:47:39.798 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/RulesEngineModule/sharedCommands and 0 messages from message store.
<6> 2020-10-26 16:48:09.800 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/BacnetRouterModule/udpinput
<6> 2020-10-26 16:48:09.801 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/BacnetRouterModule/udpinput and 0 messages from message store.
<6> 2020-10-26 16:48:39.799 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/LocalNetworkModule/SharedDeviceBroadcast
<6> 2020-10-26 16:48:39.800 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/LocalNetworkModule/SharedDeviceBroadcast and 0 messages from message store.
<6> 2020-10-26 16:49:09.801 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/LocalNetworkModule/SharedDosingRecipeBroadcast
<6> 2020-10-26 16:49:09.802 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/LocalNetworkModule/SharedDosingRecipeBroadcast and 0 messages from message store.
<6> 2020-10-26 16:49:39.803 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/LocalNetworkModule/RfBroadcastPayload
<6> 2020-10-26 16:50:06.681 +00:00 [INF] - Cleaned up 4420 messages from queue for endpoint Core-c6f13dd7/LocalNetworkModule/RfBroadcastPayload and 4420 messages from message store.
<6> 2020-10-26 16:50:36.687 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/LocalNetworkModule/RouteReadRequest
<6> 2020-10-26 16:50:36.688 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/LocalNetworkModule/RouteReadRequest and 0 messages from message store.
<6> 2020-10-26 16:51:06.693 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/BacnetRouterModule/BacnetOperation
<6> 2020-10-26 16:51:06.695 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/BacnetRouterModule/BacnetOperation and 0 messages from message store.
<6> 2020-10-26 16:51:36.699 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/RadioServiceModule/configurationUpdated
<6> 2020-10-26 16:51:36.700 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/RadioServiceModule/configurationUpdated and 0 messages from message store.
<6> 2020-10-26 16:52:02.067 +00:00 [INF] - Entering periodic task to reauthenticate connected clients
<6> 2020-10-26 16:52:06.702 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/BacnetRouterModule/configurationUpdated
<6> 2020-10-26 16:52:06.705 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/BacnetRouterModule/configurationUpdated and 0 messages from message store.
<6> 2020-10-26 16:52:25.946 +00:00 [INF] - Updated reported properties for Core-c6f13dd7/LocalNetworkModule
<6> 2020-10-26 16:52:31.180 +00:00 [INF] - Updated reported properties for Core-c6f13dd7/BacnetRouterModule
<6> 2020-10-26 16:52:36.706 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint iothub
<6> 2020-10-26 16:52:36.707 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint iothub and 0 messages from message store.
<6> 2020-10-26 16:53:06.694 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/RulesEngineModule/configurationUpdated
<6> 2020-10-26 16:53:06.695 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/RulesEngineModule/configurationUpdated and 0 messages from message store.
<6> 2020-10-26 16:53:36.705 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/RadioServiceModule/UdpRfPayloadReceived
<4> 2020-10-26 16:53:36.784 +00:00 [WRN] - Error cleaning up messages for endpoint Core-c6f13dd7/RadioServiceModule/UdpRfPayloadReceived
System.InvalidOperationException: Value not found in store
   at Microsoft.Azure.Devices.Edge.Util.Option`1.Expect[TException](Func`1 exception) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/Option.cs:line 117
   at Microsoft.Azure.Devices.Edge.Storage.EntityStore`2.Update(TK key, Func`2 updator, CancellationToken cancellationToken)
   at Microsoft.Azure.Devices.Edge.Util.TaskEx.TimeoutAfter[T](Task`1 task, TimeSpan timeout)
   at Microsoft.Azure.Devices.Edge.Hub.Core.Storage.MessageStore.CleanupProcessor.<>c__DisplayClass9_0.<<CleanupMessages>g__DeleteMessageCallback|0>d.MoveNext() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Core/storage/MessageStore.cs:line 267
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Storage.SequentialStore`1.<>c__DisplayClass16_0.<<RemoveFirst>b__0>d.MoveNext() in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Storage/SequentialStore.cs:line 86
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Storage.SequentialStore`1.RemoveFirst(Func`3 predicate, CancellationToken cancellationToken) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Storage/SequentialStore.cs:line 82
   at Microsoft.Azure.Devices.Edge.Hub.Core.Storage.MessageStore.CleanupProcessor.CleanupMessages() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Core/storage/MessageStore.cs:line 294
<6> 2020-10-26 16:53:37.067 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/ExpansionModule/configurationUpdated
<6> 2020-10-26 16:53:37.069 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/ExpansionModule/configurationUpdated and 0 messages from message store.
<6> 2020-10-26 16:54:07.065 +00:00 [INF] - Started task to cleanup processed and stale messages for endpoint Core-c6f13dd7/ConfigurationModule/expansionModuleDetected
<6> 2020-10-26 16:54:07.067 +00:00 [INF] - Cleaned up 0 messages from queue for endpoint Core-c6f13dd7/ConfigurationModule/expansionModuleDetected and 0 messages from message store.

Additional Information

Below is a memory available chart for the device throwing the exception. The sharp increases are points where we restart iotedge or reboot the device which in tern brings back some memory available.

DeviceMemoryProfile

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 15 (7 by maintainers)

Most upvoted comments

so after upgrade the edgeHub memory usage stabilized. Thanks alot… didn’t expect the upgrade to help honestly.