chirpstack-gateway-bridge: Subscriptions are not re-obtained after MQTT disconnects or connection resets
- [x] The issue is present in the latest release.
- I have searched the issues of this repository and believe that this is not a duplicate.
What happened?
Unexpected behavior occurs when connections to the MQTT broker (EMQX v4.1.0 in my case) are interrupted.
Inconsistently following a connection error like:
time="2020-10-22T17:11:09Z" level=error msg="mqtt: connection error" error="read tcp IP:38074->IP:31709: read: connection reset by peer"
time="2020-10-22T18:36:01Z" level=error msg="mqtt: connection error" error=EOF
time="2020-10-22T18:36:17Z" level=error msg="mqtt: connection error" error="write tcp IP:33194->IP:31709: write: broken pipe"
The subscriptions for each gateway topic on the MQTT broker are not re-established. When I enabled debug and the paho logging, I could see an attempt to re-add the subscriptions, but it would log this line once and then never log it for the 2nd gateway, and neither subscription was obtained.
What did you expect?
I would expect the subscriptions to be re-obtained. This may be a bug with the paho client.
I am currently testing gateway bridge at master + paho client at master.
Steps to reproduce this issue
Steps:
Could you share your log output?
Your Environment
| Component | Version |
|---|---|
| Application Server | v?.?.? |
| Network Server | |
| Gateway Bridge | master |
| Chirpstack API | |
| Geolocation | |
| Concentratord | |
About this issue
- State: closed
- Created 4 years ago
- Comments: 23 (14 by maintainers)
Commits related to this issue
- Refactor MQTT (un)subscribe. This separates the SetGatewaySubscription from the actual MQTT (un)subscribe. When SetGatewaySubscription is called multiple times for the same gateway while the MQTT cli... — committed to chirpstack/chirpstack-gateway-bridge by brocaar 4 years ago
- Refactor MQTT (un)subscribe. This separates the SetGatewaySubscription from the actual MQTT (un)subscribe. When SetGatewaySubscription is called multiple times for the same gateway while the MQTT cli... — committed to vfylyk/chirpstack-gateway-bridge by brocaar 4 years ago
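The idea described in the commit message, keeping `SetGatewaySubscription` separate from the actual MQTT (un)subscribe, can be sketched roughly as follows. This is not the code from the commit: `fakeClient`, `Reconcile`, `OnConnectionLost`, `topicFor` and the topic layout are all hypothetical names made up for illustration, and the gateway ID is simplified to a string. The point is only that recording the desired state never touches the network, so a later reconnect can replay it.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// fakeClient is a hypothetical in-memory stand-in for the MQTT client,
// so the sketch runs without a broker. It is not the paho API.
type fakeClient struct {
	connected bool
	topics    map[string]bool
}

func (c *fakeClient) Subscribe(topic string) error {
	if !c.connected {
		return errors.New("not connected")
	}
	c.topics[topic] = true
	return nil
}

func (c *fakeClient) Unsubscribe(topic string) error {
	if !c.connected {
		return errors.New("not connected")
	}
	delete(c.topics, topic)
	return nil
}

// Backend tracks what should be subscribed (desired) separately from what
// has been subscribed on the current connection (actual).
type Backend struct {
	mu      sync.Mutex
	desired map[string]bool
	actual  map[string]bool
	client  *fakeClient
}

func NewBackend(c *fakeClient) *Backend {
	return &Backend{desired: map[string]bool{}, actual: map[string]bool{}, client: c}
}

// SetGatewaySubscription only records intent; it never waits on the network.
func (b *Backend) SetGatewaySubscription(subscribe bool, gatewayID string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if subscribe {
		b.desired[gatewayID] = true
	} else {
		delete(b.desired, gatewayID)
	}
}

// OnConnectionLost forgets the subscriptions of the dropped connection.
func (b *Backend) OnConnectionLost() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.actual = map[string]bool{}
}

// Reconcile computes the difference under the lock, then performs the
// (un)subscribe calls without holding it.
func (b *Backend) Reconcile() {
	b.mu.Lock()
	var add, remove []string
	for id := range b.desired {
		if !b.actual[id] {
			add = append(add, id)
		}
	}
	for id := range b.actual {
		if !b.desired[id] {
			remove = append(remove, id)
		}
	}
	b.mu.Unlock()

	for _, id := range add {
		if b.client.Subscribe(topicFor(id)) == nil {
			b.setActual(id, true)
		}
	}
	for _, id := range remove {
		if b.client.Unsubscribe(topicFor(id)) == nil {
			b.setActual(id, false)
		}
	}
}

func (b *Backend) setActual(id string, subscribed bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if subscribed {
		b.actual[id] = true
	} else {
		delete(b.actual, id)
	}
}

// topicFor uses an illustrative topic layout, not necessarily the configured one.
func topicFor(gatewayID string) string {
	return "gateway/" + gatewayID + "/command/#"
}

func main() {
	c := &fakeClient{topics: map[string]bool{}}
	b := NewBackend(c)
	b.SetGatewaySubscription(true, "0102030405060708")
	b.SetGatewaySubscription(true, "0807060504030201")
	b.Reconcile() // still disconnected: intent is kept, nothing is subscribed
	fmt.Println("subscribed while offline:", len(c.topics))
	c.connected = true
	b.Reconcile()
	fmt.Println("subscribed after connect:", len(c.topics))
}
```

In a real client, something like `Reconcile` would presumably be triggered from the on-connect callback and after each `SetGatewaySubscription` call; that wiring is left out here.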
I deployed an EMQX cluster after I read about it in this thread and will test it, too. 😃
Thanks @JohnRoesler for testing this!
For this reason, I have split up the mutex into different variables. It was used both for connection purposes and to guard against concurrent access to the gateways map.
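A minimal sketch of what splitting the mutex can look like, assuming one lock for the connection and a separate one for the gateways map. The field and function names (`connMux`, `gatewaysMux`, `newBackend`, `addGateway`, `hasGateway`) are assumptions for illustration and may not match the actual code. The `main` function shows that a slow connect no longer blocks access to the map.

```go
package main

import (
	"fmt"
	"sync"
)

// Backend uses two independent locks (hypothetical field names).
type Backend struct {
	connMux     sync.Mutex   // serializes connect / reconnect attempts
	gatewaysMux sync.RWMutex // guards only the gateways map
	gateways    map[string]struct{}
}

func newBackend() *Backend {
	return &Backend{gateways: map[string]struct{}{}}
}

// connect holds only connMux while dialing, however long that takes.
func (b *Backend) connect(dial func() error) error {
	b.connMux.Lock()
	defer b.connMux.Unlock()
	return dial()
}

func (b *Backend) addGateway(id string) {
	b.gatewaysMux.Lock()
	defer b.gatewaysMux.Unlock()
	b.gateways[id] = struct{}{}
}

func (b *Backend) hasGateway(id string) bool {
	b.gatewaysMux.RLock()
	defer b.gatewaysMux.RUnlock()
	_, ok := b.gateways[id]
	return ok
}

func main() {
	b := newBackend()
	started := make(chan struct{})
	release := make(chan struct{})
	done := make(chan error, 1)

	go func() {
		done <- b.connect(func() error {
			close(started)
			<-release // simulate a dial that has not finished yet
			return nil
		})
	}()

	<-started
	// The connect call is still in progress, yet the map is accessible.
	b.addGateway("0102030405060708")
	fmt.Println("gateway registered during connect:", b.hasGateway("0102030405060708"))

	close(release)
	<-done
}
```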
@JohnRoesler I’m looking forward to your feedback on https://github.com/brocaar/chirpstack-gateway-bridge/commit/6cc9cafd4b1f18aa5ab3fbf394ac61ab8e042ab6 (https://github.com/brocaar/chirpstack-gateway-bridge/tree/subscribe_refactor). Could you try to reproduce your issue with these changes?
FYI: I have just merged in some other improvements. During another project we found some bottlenecks in how the channels were set up. These channels have been removed, and callbacks executed in goroutines are now used. This does not solve this issue, however. I’m currently testing various scenarios and might have found the potential issue, so I’m now testing some modifications.
Note that in the original implementation, the mutex might not be the issue. While `SetGatewaySubscription` would hold the lock until the (un)subscribe is completed, this function should finish once re-connected. This means that `onConnected` will be able to acquire the lock only after `SetGatewaySubscription` has been completed, but that is fine. The potential race is with `connect`, as this also tries to acquire a lock. So if `SetGatewaySubscription` is not able to (un)subscribe because the client is disconnected, then `connect` is blocked forever and there is a deadlock.

I’m going to make some modifications and let you know as soon as I have something to test with 😃