ndk: Runtime problems when including multiple versions of ndk-glue
Having multiple versions of ndk-glue in the dependency tree leads to crashes because of uninitialized NATIVE_ACTIVITY.
Example: Including ndk-glue 0.5. When entering main(), NATIVE_ACTIVITY 0.5 gets initialized. Including Oboe crate that uses ndk-glue 0.4.0. When enumerating audio devices, i get:
thread 'tokio-runtime-worker' panicked at 'called `Option::unwrap()` on a `None` value', /Users/ric/.cargo/registry/src/github.com-1ecc6299db9ec823/ndk-glue-0.4.0/src/lib.rs:59:39
I tried using [patch.crates-io] but that does not seem to work.
This bug might be related to using resolver = "2" as required by wgpu.
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 28 (26 by maintainers)
Commits related to this issue
- ndk-glue: Initialize ndk-context on 0.5 stable releases In order to show that [#223] solves our problem ([#211]) with multiple ndk-glue versions in tree, all having their own `static` globals, backpo... — committed to rust-mobile/ndk by MarijnS95 2 years ago
- ndk-glue: Initialize ndk-context on 0.5 stable releases (#225) In order to show that [#223] solves our problem ([#211]) with multiple ndk-glue versions in tree, all having their own `static` globals... — committed to rust-mobile/ndk by MarijnS95 2 years ago
- android: Migrate from `ndk-glue` to `ndk-context` `ndk-glue` suffers one fatal flaw: it's "only" supposed to be used by the crate providing `fn main()` and only supposed to end up in the dependency g... — committed to MarijnS95/app_dirs2 by MarijnS95 2 years ago
- ndk-glue: Initialize ndk-context on 0.4 stable releases In order to show that [#223] solves our problem ([#211]) with multiple ndk-glue versions in tree, all having their own `static` globals, backpo... — committed to rust-mobile/ndk by MarijnS95 2 years ago
- android: Migrate from `ndk-glue` to `ndk-context` `ndk-glue` suffers one fatal flaw: it's "only" supposed to be used by the crate providing `fn main()` and only supposed to end up in the dependency g... — committed to MarijnS95/cpal by MarijnS95 2 years ago
- android: Migrate from `ndk-glue` to `ndk-context` `ndk-glue` suffers one fatal flaw: it's "only" supposed to be used by the crate providing `fn main()` and only supposed to end up in the dependency g... — committed to MarijnS95/cpal by MarijnS95 2 years ago
- android: Migrate from `ndk-glue` to `ndk-context` `ndk-glue` suffers one fatal flaw: it's "only" supposed to be used by the crate providing `fn main()` and only supposed to end up in the dependency g... — committed to MarijnS95/cpal by MarijnS95 2 years ago
- ndk-glue: Initialize ndk-context on 0.4 stable releases (#226) In order to show that [#223] solves our problem ([#211]) with multiple ndk-glue versions in tree, all having their own `static` globals... — committed to rust-mobile/ndk by MarijnS95 2 years ago
Opened a PR #223
Between @MarijnS95 and @dvc94ch ideas I prefer the latter. While @MarijnS95 idea is cleaner, @dvc94ch idea is the more practical IMO. Just one nit, I think “ndk-context” is clearer than “ndk-ctx”. The ndk-context pointers can be
RwLock<Option<NonNull<c_void>>>, and the ndk crate can use some constructors likefrom_context()that returnRwLock<Type>. But the raw pointers could also also be used unsafely directly from ndk-context (this would be the case for openxrs). Do you need me to make a PR?I’m currently leaning towards the separate ptr crate approach as it seems to minimize the risk of breaking changes compared to just using a version range for
ndk.Regarding the
ANativeActivity::instanceI’m not totally sure if I understand it in detail. In particular who is owning the memory storing the pointers? TheANativeActivity::instanceis then only for internal usage?Indeed, and increased use of
ndk-gluelately shows.Not easily possible since
ndk-gluehas to know of and call/use some of the functions/types from thendk. We have better solutions.Nah, I think the
ndkcrate should purely hold the (safe) abstractions to the NDK.ndk-glueis “just” a helper (like Android’s “official” glue layer) that users can leverage to simplify their Android setup. Either way it’ll suffer from breaking changes to thendkthere too… Pick your poison.I’ve been thinking about the exact same, we can even apply this retroactively to previous crate-releases as an additional patch. It’ll just need a dependency on one of the crates that provides a
Mutex/RWLockwith amapable inner value on the*LockGuard. That way we can convert from the pointer to the typed handle, and give the user what they asked for while held inside the*LockGuard.@msiglreith @francesca64 @dvc94ch What are your thoughts on this? I/we can start working on a little proof of concept for this, having to make sure that we don’t need to release any breaking changes to it for a good long while.
Alternatively we could do away with statics entirely and work with a heap allocation to which a pointer is stored in
ANativeActivity::instance. Users would then have to pass this object around to get access to theNativeActivityandNativeWindow, and code would simply not compile if differing versions ofndk/ndk-glueare combined. Safer and more predictable, but the same dependency management hell for all the library crates involved.think we can close now
I think we should close this issue as soon as most popular crates migrated to
ndk-context:https://crates.io/crates/ndk-glue/reverse_dependencies
webbrowserwinit(needs more than just theVM/Context, https://github.com/MarijnS95/android-ndk-rs/compare/anativeactivity-instance?)@zarik5 When making these PRs at least link back to the original issue and PR 😃
ndk-glue 0.6.1 and ndk-context 0.1.0 are released.
ndk-ctx is likely still useful. The reason being that if a cross platform crate wants to add android support they’ll likely not be very happy with having to pass around the android context, as evidence by the use of ndk-glue. A simple solution is likely going to have to be provided that doesn’t force users to have to add new_with_extra_android_context to all their apis. This recursively infects everything, so the results are not pretty and could lead to substandard support for android through the eco system. I think my proposal is likely the best way forward to provide the convenience required to see a larger number of crates add some android logic.
@msiglreith
We can do both, the dependency range should result in less minor (breaking) updates for at least
ndk-gluebecause we’re not forcing the update of another breaking-change crate dependency anymore. But then again we’d have to validate howcargoselects the dependency and also make sure at least min and max versions are tested in the CI? If we do this for multiple crates possible combinations may explode exponentially though.It seems to be the typical userdata pointer that is untouched by the NDK, you’re free to assign any data you want but of course have to
dropit when theANativeActivityis cleaned up, presumably inonDestroy.@dvc94ch Thanks for those insights! It indeed seems that many crates are erroneously using
ndk-glueto get hold of theANativeActivityfor JNI purposes (or in one of my test-crates: theinternal_data_path()).Hence going back to my comment and your suggestion, also taking Kotlin into account: those crates should consume the necessary objects through an init function or constructor. We could pass it the needed JNI or NDK types, and get rid of static state and public getters altogether.
On the
ndk-glueside we will then pass anANativeActivityintofn main()fromfn init(), and internally hold a pointer to our own state ininstanceso that there’s nostaticglobals anymore. We’ll still have to wrap things in locks to be able to access them concurrently fromANativeActivityCallbacks.If there’s an NDK version mismatch things will simply refuse to compile (instead of the hard-to-debug failures we’re observing right now), and we can use the existing
ptr/unsafe from_ptrhelpers to turn one into the other (leveraging the more stablendk-syscrate types, and if all else a raw pointer can easily be cast too).so how about a crate that looks like this:
ndk-glue can initialize it and all library crates can get the stuff they need from the ndk-context crate? If you spawn a new thread you can use the
get_java_vmmethod and then attach to the new thread. this would mean that downstream crates don’t depend on ndk-glue (which they shouldn’t).@enfipy As @repi mentioned such crates cannot be published… But it should be possible to publish them with a range of compatible crate revisions. Perhaps
ndk-gluecould set something (imaginary, I didn’t compatibility-test this) likendk = ">= 0.3, <= 0.6". Then we only need to bump these (usually only the upper bound) in a patch release. (We can probably do this regardless of introducing anndk-ptrs/staticscrate mentioned above?)Note that an upper bound is mandatory in my eyes to prevent a situation like https://github.com/gwihlidal/vk-mem-rs/issues/54. It’s much safer for us to validate a setup and push a patch release enlarging the available range of versions than to retroactively publish a new patch release with the upper bound fixed to the last working/compatible version, and yanking everything without that upper bound from crates.io.
The only thing I am not sure about is crate selection. I assume, but have not tested this, that if a crate were to select
ndk = "0.5"any crate withndk-gluewith the imaginaryndk = ">= 0.3, <= 0.6"dependency from above would get justndk 0.5.0? Though perhaps if cargo were to try and select the highest or the lowest (-Zminimal-versions) you could end up with duplicates on different versions after all. Then only Embark’scargo-denycould save you (or the aforementioned suggestion to removestatics from the crate altogether).Don’t think you can publish a crate with wildcard version? Believe it only works for local crates