Android-nRF-Mesh-Library: Potential ConcurrentException in MeshNetworkDb:InsertNetworkAsyncTask()
Hello, I’d like to report a second bug found during importing a MeshNetwork (The first one and solution has been post under BubblyNetDev post named “Migration issue in 3.1.5”, not sure if it has been read and is planned to be fixed in next releases).
By the way, In class MeshNetworkDb.Java
Problem :
In class :
private static class InsertNetworkAsyncTask extends AsyncTask<Void, Void, Void> {
[...]
if (!meshNetwork.nodes.isEmpty()) {
nodesDao.insert(meshNetwork.nodes);
}
if (meshNetwork.groups != null) {
groupsDao.insert(meshNetwork.groups);
}
if (meshNetwork.scenes != null) {
scenesDao.insert(meshNetwork.scenes);
}
[...]
}
This can cause troubles down the line, in our case it caused several (still rare) ConcurrentExceptions :
Fatal Exception: java.lang.RuntimeException: An error occurred while executing doInBackground()
at android.os.AsyncTask$4.done(AsyncTask.java:415)
...
Caused by java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.next(ArrayList.java:860)
at androidx.room.EntityInsertionAdapter.insert(EntityInsertionAdapter.java:95)
at no.nordicsemi.android.meshprovisioner.data.ProvisionedMeshNodesDao_Impl.insert(ProvisionedMeshNodesDao_Impl.java:303)
at no.nordicsemi.android.meshprovisioner.MeshNetworkDb$InsertNetworkAsyncTask.doInBackground(MeshNetworkDb.java:307)
at no.nordicsemi.android.meshprovisioner.MeshNetworkDb$InsertNetworkAsyncTask.doInBackground(MeshNetworkDb.java:271)
Cause :
meshNetwork.nodes meshNetwork.groups and meshNetwork.scenes are arrayList. They can be modified from outside the MeshNetworkDb.java for any reason, it shouldn’t happen, but if it happens this leads to ConcurrentException.
Solution I fixed the problem cloning the arrayList before passing it to the Dao, this makes sure we never pass an array which could be potentially modified from outside:
if (!meshNetwork.nodes.isEmpty()) {
ArrayList<ProvisionedMeshNode> copyNodes = new ArrayList<>(meshNetwork.nodes);
nodesDao.insert(copyNodes);
}
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 45 (23 by maintainers)
Commits related to this issue
- Fixes #432 by passing an unmodifiable list for insert and update operations. — committed to NordicSemiconductor/Android-nRF-Mesh-Library by roshanrajaratnam 2 years ago
- Fixes #432 network exclusions list must be unmodifiable. — committed to NordicSemiconductor/Android-nRF-Mesh-Library by roshanrajaratnam 2 years ago
- Removes a multiple call to an internal delete function #432 — committed to NordicSemiconductor/Android-nRF-Mesh-Library by roshanrajaratnam 2 years ago
- Contains a possible fix #432 by using an unmodifiable list when importing a mesh network. — committed to NordicSemiconductor/Android-nRF-Mesh-Library by roshanrajaratnam 2 years ago
- Fix for #432 by returning a un-modifiable network exclusions list and fixes a possible concurrent modification exception when resetting a node. — committed to NordicSemiconductor/Android-nRF-Mesh-Library by roshanrajaratnam 2 years ago
- Copies the network exclusion list to a new HashMap #432. — committed to NordicSemiconductor/Android-nRF-Mesh-Library by roshanrajaratnam 2 years ago
I will use the approach that @Mavericksb has suggested since it’s a working solution. All updates and inserts will be done after copying the data in to a new list.
What is the line of code at MeshNetworkDb.java:149 ? If there is an array passed to an insert() method, just copy the array first, and then pass the copy to the insert method instead of the original array, to avoid concurrency.
We are hitting this repeatedly when making multiple changes to the network in quick succession
Hi, thanks for reporting this. Asynctask has been deprecated in the library. However like you mentioned the migration may cause an issue. I will take a look.
Great, yes pull the latest Dev branch and you should get some other fix relating to an invalid 16-bit validation.
@FluffyBunniesTasteTheBest will take a look at this issue today and get back to you!
Any node that is reset gets added to the exclusion list. I still haven’t had time to test against your json file. I will push a fix, could you test it for me on your network?
@R0m4in-dooz could you export and share your network with me? I can test against your large exclusion list!
ok, I haven’t released yet. Please try it and let me know thanks.
Could you try this please?
Thanks for your feedback. You are correct I did forget to include network exclusions in the latest release.
Hello @roshanrajaratnam ! So I’ve pulled the new release to test the fixes, and it seems better, thanks for the update 🥳 Though the only place I flagged as “maybe our fault” is the one I can still get lol… Of course you didn’t apply the fix to the network exclusions list (by my fault?). But actually, I think it needs the fix too. (here)
PS: I’m not getting it every time so it seems to be a sporadic issue 🤔 Now, most of the time I can receive ConfigNodeResetStatus without any issue, so again : thanks for the update 👍
Just for the records, there’s one more situation where the
ConcurrentModificationExceptioncan occur and crash the app:@Mavericksb Thank you very much for the clarification!
Yes. During startup, the app creates a bunch of provisioners, to enable multiple app instances to operate the mesh.
That’s all handled by the library, but…
… that perfectly explains the cause of the
SQLiteConstraintException:If the app crashes because of a
ConcurrentException, it tries to re-create the provisioners again during the next start, which, the way it looks, then causesSQLiteConstrainExceptionbecause some of the provisioners already exist.Thank you very much for sharing this! You’ve just saved me from quite some sleepless nights…
Ok, BTW adding a delay shouldn’t be necessary.
SQLiteConstrainException isn’t related to the previous problem, it means that you are passing an Object (or list of Objects) that violate the constraints that are defined in Table constructor. As far as I can tell you are trying to insert a Provisioner Node. It’s hard to help without any more detail on the Table you are trying to write on, its definition, and the objects you are passing. One of the most common causes (but it is just an example) could be that the Provisioner Table defined in MeshNetworkDb class needs an UNIQUE Key, and you are trying to insert an object (A Provisioner Node in this case) with an UNIQUE ID which already exists in the Table.
You need to do some debugging to find out where the problem lies. I suspect there is some circumstance where you don’t clear the Provisioner table before inserting a new Provisioner Node or you don’t change the Unique ID for the new one. Honestly I didn’t face it on our App, sorry I can’t help more than this.