aws-iot-device-sdk-python-v2: Publish future raises AWS_ERROR_MQTT_CONNECTION_DESTROYED
Describe the bug
In our design I add a callback to the future returned by the publish method:
publish_future, pkt_id = self._mqtt_connection.publish(topic, pkt[1], mqtt.QoS.AT_LEAST_ONCE, retain=False)
publish_future.add_done_callback(lambda x: self._publish_future_callback(x, pkt, topic))
This normally works fine, but I have been testing the code for robustness by repeatedly cycling in and out of the connected state, and on one occasion I got this exception: AWS_ERROR_MQTT_CONNECTION_DESTROYED
After this the SDK still appears to respond correctly but never publishes any packets. It even indicates it is resuming a connection, but I don’t think it really is.
I have added a workaround to recreate the connection after I get this exception but I’d like to understand why it occurs.
Expected Behavior
This exception should not be thrown. If it is thrown, other library methods should then fail instead of appearing to work.
Current Behavior
Occasionally throws AWS_ERROR_MQTT_CONNECTION_DESTROYED and then fails to publish afterwards, although the library seems to respond as if it were working:
exception calling callback for <Future at 0xffff74059160 state=finished raised AwsCrtError>
Traceback (most recent call last):
  File "/bba_app/aws_iot_core.py", line 154, in _publish_future_callback
    result = publish_future.result(timeout=0.0)
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 433, in result
    return self.__get_result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
awscrt.exceptions.AwsCrtError: AWS_ERROR_MQTT_CONNECTION_DESTROYED: Connection has started destroying process, all uncompleted requests will fail.
Reproduction Steps
- Create connection
- Start connection
- Publish every 10 seconds for 5 minutes
- Shutdown connection
- Idle 2 minutes
- Repeat
Possible Solution
Are the AWSCRT objects persistent even with create context calls?
Additional Information/Context
Can someone explain what this message means and what can cause it? If there is an error in how I am using the sdk then perhaps that will make it clear.
SDK version used
1.11.3
Environment details (OS name and version, etc.)
Debian, Python 3.8
About this issue
- State: closed
- Created 2 years ago
- Comments: 15 (6 by maintainers)
Hi @TwistedTwigleg, thanks so much for the detailed response, that’s very helpful. I think I arrived at your suggestions independently, and my new build is working well. As for the crash, I had to get the destroyed message many times (100+ over a few days) before it occurred. I added a catch for the destroyed message and rescue the packet like this:
That said, I no longer see this message, because I no longer recreate the connection when waking from sleep. I agree it just seems unnecessary.
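For readers following along, the catch-and-rescue idea described in this comment might look roughly like the sketch below. It is not the code from the thread. The function name `rescue_on_destroyed`, the `retry_queue` parameter, and the returned status strings are hypothetical. It assumes the error raised by the future carries a `name` attribute holding the error name, as `awscrt.exceptions.AwsCrtError` does.

```python
import logging
from collections import deque
from concurrent.futures import Future

log = logging.getLogger(__name__)

DESTROYED = "AWS_ERROR_MQTT_CONNECTION_DESTROYED"


def rescue_on_destroyed(publish_future: Future, pkt, topic, retry_queue: deque):
    """Done-callback body (hypothetical): re-queue the packet if the
    connection was destroyed while the publish was still in flight."""
    try:
        # The future is already finished when a done callback runs,
        # so a zero timeout never blocks here.
        publish_future.result(timeout=0.0)
    except Exception as exc:
        # Assumption: the exception exposes the error name via `.name`.
        if getattr(exc, "name", None) == DESTROYED:
            retry_queue.append((topic, pkt))
            return "requeued"
        log.error("publish to %s failed: %r", topic, exc)
        return "failed"
    return "sent"
```

It could be attached the same way as the original callback, for example `publish_future.add_done_callback(lambda f: rescue_on_destroyed(f, pkt, topic, retry_queue))`, with the queue replayed after the next successful connect.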
Thanks @edcloudcycle! I will try and see if I can reproduce this issue on my end.
In the meantime, I would highly recommend persisting the MqttConnection until all the in-flight/pending messages are processed before reassigning the MqttConnection to avoid this issue. 👍
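One way to follow this advice is to track the futures returned by `publish()` and wait for them before dropping or reassigning the connection. The sketch below is only an illustration under that assumption. `PendingPublishes`, `track`, and `drain` are hypothetical names and are not part of the SDK.

```python
import threading
from concurrent.futures import Future, wait


class PendingPublishes:
    """Hypothetical tracker for publish futures that have not completed yet."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = set()

    def track(self, future: Future) -> Future:
        # Register first, then attach the callback; if the future is already
        # done, the callback fires immediately and removes it again.
        with self._lock:
            self._pending.add(future)
        future.add_done_callback(self._discard)
        return future

    def _discard(self, future: Future) -> None:
        with self._lock:
            self._pending.discard(future)

    def drain(self, timeout=None) -> bool:
        """Block until every tracked publish has finished or the timeout
        expires. Returns True when nothing is left in flight."""
        with self._lock:
            snapshot = list(self._pending)
        if not snapshot:
            return True
        _, not_done = wait(snapshot, timeout=timeout)
        return not not_done
```

In the code from this issue that would mean calling `tracker.track(publish_future)` right after `publish(...)`, and calling `tracker.drain(timeout=...)` before `self._mqtt_connection` is disconnected or replaced.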
Hi @TwistedTwigleg,
This function was being called when waking from sleep, and it caused the destroyed message when a packet was in flight. Posted as an image, as the code formatting seems to mangle Python.
Hi, The message occurs during the creation of a connection. I am shutting down using:
try:
    self._mqtt_connection.disconnect()
except:
    pass
Perhaps I should add some debug here just to check if it does throw. It occurs to me that a good use of this exception would be to put the unsent packet back on the queue of packets for use on reconnection.
It looks like the not sending packets symptom might be related to my code falling out of its correct state machine path and not recovering after the exception. I have tightened that up and will start testing again now. Thanks very much for your response. Ed
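A note on the shutdown snippet in this comment: in awscrt, `disconnect()` returns a future, so a `try`/`except` around the call alone would not see a failure that is reported through that future. A minimal sketch of a shutdown that waits for the result and logs any error is shown here. The helper name `disconnect_and_wait`, the logger name, and the default timeout are assumptions for illustration.

```python
import logging

# Hypothetical logger name, borrowed from the module in the traceback.
log = logging.getLogger("aws_iot_core")


def disconnect_and_wait(connection, timeout=10.0) -> bool:
    """Return True if the disconnect completed cleanly, False otherwise.

    `connection` is expected to behave like awscrt.mqtt.Connection:
    disconnect() returns a concurrent.futures.Future.
    """
    try:
        # Waiting on the future surfaces errors and timeouts that the
        # bare disconnect() call would not raise by itself.
        connection.disconnect().result(timeout=timeout)
    except Exception:
        log.exception("MQTT disconnect failed or timed out")
        return False
    return True
```

Logging instead of silently passing gives the "debug to check if it does throw" mentioned above, and the boolean result lets the state machine decide whether it is safe to build a new connection.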