notifee: RemoteServiceException when using foreground service on Android API 26
We started seeing the following crash after rolling out our new version, which uses a foreground service:
Fatal Exception: android.app.RemoteServiceException
Context.startForegroundService() did not then call Service.startForeground()
android.app.ActivityThread$H.handleMessage (ActivityThread.java:1881)
android.os.Handler.dispatchMessage (Handler.java:105)
android.os.Looper.loop (Looper.java:164)
android.app.ActivityThread.main (ActivityThread.java:6938)
java.lang.reflect.Method.invoke (Method.java)
com.android.internal.os.Zygote$MethodAndArgsCaller.run (Zygote.java:327)
com.android.internal.os.ZygoteInit.main (ZygoteInit.java:1374)
So far the crash is limited to Android 8.0 devices, but I’m not sure whether that is an artifact of the small rollout percentage or an actual version-specific issue.
Based on https://developer.android.com/about/versions/oreo/android-8.0-changes & https://developer.android.com/reference/android/app/Service#startForeground(int, android.app.Notification) it seems that startForeground wasn’t called within 5 seconds of the foreground service being created. Without any visibility into the code base, it’s difficult to debug the issue further.
About this issue
- State: closed
- Created 3 years ago
- Comments: 46 (38 by maintainers)
@helenaford Whoa! Found the root cause! If I set largeIcon to a slightly large local file (256x256 25KB PNG), it’ll lead to a crash. Resizing the image down to 128x128 fixed the problem. I suspect Android 8 has a slow/buggy PNG decoding that leads to the eventual 5s timeout. Here’s the canonical code to reproduce it.
Thanks for open sourcing the library! Will definitely take a deeper look into this one during the weekends.
You always log the tough ones @mars-lan ! 😆
I know it is a really tricky bug to fix, but any idea when the fix can be delivered? And is there any way that we can contribute to the native side of the project?
I dug in a bit more. From my findings, it is not related to the trampoline effect. It is only used because Notifee assumes that a foreground push may be started from the background, and since Build.VERSION_CODES.O that is not allowed except under a few conditions.
So: ContextHolder.getApplicationContext().startService(intent) and not worry about the 5s limit.
I think that starting a foreground service from the background in a way that does not satisfy the conditions linked in (2) is rather an edge case.
Hence I propose we add a setting in displayNotification, like startServiceWithPromise, to be used alongside asForegroundService. If startServiceWithPromise is false then we would always use the older ContextHolder.getApplicationContext().startService(intent).
For example, startServiceWithPromise: false could be used for a high-priority push that initialises a call.
We just got hit by this issue today - after releasing to production we saw at least 20% of crashes. Locally it is extremely hard to reproduce (both debug and release variant).
Before integrating Notifee for foreground push, we used our own solution, which never crashed, so I was wondering why Notifee does crash. The only difference in our case is that we always used ContextHolder.getApplicationContext().startService(intent) regardless of Android version, but we got hit by the trampoline issue, hence we integrated Notifee for a fix.
I am wondering whether Notifee really has to use ContextHolder.getApplicationContext().startForegroundService(intent) for Build.VERSION.SDK_INT >= Build.VERSION_CODES.O in ForegroundService.java
…I see that in startForegroundService it says:
So I guess ContextHolder.getApplicationContext().startForegroundService(intent) has to be used to avoid the trampoline issue, doesn’t it?
But then it looks like Notifee could change the condition from Build.VERSION.SDK_INT >= Build.VERSION_CODES.O to Build.VERSION.SDK_INT >= Build.VERSION_CODES.S so it only affects Android 12+, at least until the real fix is implemented.
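To make the proposed condition change concrete, here is a minimal sketch of the version gate in plain Java, so it can run off-device. ServiceStartGate, StartCall and choose are hypothetical names for illustration only and are not part of Notifee; the only facts taken from Android are the API levels behind Build.VERSION_CODES.O (26) and Build.VERSION_CODES.S (31).

```java
/**
 * Hypothetical model of the version gate discussed above.
 * None of these names exist in Notifee; this only illustrates
 * which start call each threshold would select.
 */
final class ServiceStartGate {
    // API levels behind the real Build.VERSION_CODES constants.
    static final int ANDROID_O = 26; // Android 8.0
    static final int ANDROID_S = 31; // Android 12

    /** Which Context method the library would end up calling. */
    enum StartCall { START_SERVICE, START_FOREGROUND_SERVICE }

    /**
     * Mirrors a check of the form `Build.VERSION.SDK_INT >= threshold`.
     * At or above the threshold the 5s startForeground deadline applies;
     * below it the older startService path is used.
     */
    static StartCall choose(int sdkInt, int threshold) {
        return sdkInt >= threshold
                ? StartCall.START_FOREGROUND_SERVICE
                : StartCall.START_SERVICE;
    }
}
```

With the threshold at ANDROID_O, an Android 8.0 device (API 26) takes the startForegroundService path. With the threshold at ANDROID_S, the same device falls back to startService, which only helps if the app is allowed to start services at that moment (see the background-start restriction mentioned earlier in the thread).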
@helenaford after rolling out an update that uses Android drawable for largeIcon, we still observed a handful of crashes on Android 8 in production. Looks like we may need to implement a proper fix as per @mikehardy’s suggestion to avoid the 5s timeout.
I must admit I was dreading the ping on this one - all I can offer is that the previous one from @sambegin is one of only a handful of messages I leave marked as unread in my software mail box. I have not made specific progress on it but am still slowly working towards it on the stack of todo items I’ve got in my Invertase queue. Cold comfort probably, but it is on the radar.
I think we have occurrences of the crash on Android 11 as well. We are still at the beginning of our investigation, and we’ve halted our rollout until we have a better idea of the impact and how to reproduce it.
Wow! Great digging @mars-lan - lot of electrons spent discussing this one. Looks like https://stackoverflow.com/a/57521350/9910298 is a great example of how to use bindService.
I’m a little confused on your last paragraph though - not to read too much in to it, but could your last sentence (“As a developer…”) be considered a new paragraph?
To be precise, are you saying that unless this API is fixed in general (“the ability to invoke displayNotification in background”) you’d rather not touch it, but you think the bindService path could fix it? Or, are you saying that by your read on the links you posted it is unfixable?
I ask because it seemed as though the bindService style (or even JobService maybe - but bindService easier…) would fix it.
Then you’d have displayNotification in the background and no crashes, which would be ideal. But I might be missing something you’ve already realized
@helenaford Based on https://stackoverflow.com/questions/44425584/context-startforegroundservice-did-not-then-call-service-startforeground & https://issuetracker.google.com/issues/76112072, this seems like a common issue for Android. In fact, there’s simply no way to guarantee that the system will even start the foreground service within 5s, regardless of how little work there is between startForegroundService & startForeground. This is consistent with what we observed in the wild even after dropping largeIcon altogether.
Looks like the only foolproof way to avoid ANR is context.bindService + startService + service.startForeground. As a developer, I’d rather lose the ability to invoke displayNotification in background than deal with potential crashes that I can’t control.