notifee: RemoteServiceException when using foreground service on Android API 26

We started seeing the following crash after rolling out our new version, which uses a foreground service:

Fatal Exception: android.app.RemoteServiceException
Context.startForegroundService() did not then call Service.startForeground()

android.app.ActivityThread$H.handleMessage (ActivityThread.java:1881)
android.os.Handler.dispatchMessage (Handler.java:105)
android.os.Looper.loop (Looper.java:164)
android.app.ActivityThread.main (ActivityThread.java:6938)
java.lang.reflect.Method.invoke (Method.java)
com.android.internal.os.Zygote$MethodAndArgsCaller.run (Zygote.java:327)
com.android.internal.os.ZygoteInit.main (ZygoteInit.java:1374)

So far the crash is limited to Android 8.0 devices, but I’m not sure whether that’s an artifact of the small rollout percentage or an actual version-specific issue.

Based on https://developer.android.com/about/versions/oreo/android-8.0-changes & https://developer.android.com/reference/android/app/Service#startForeground(int, android.app.Notification), it seems that startForeground wasn’t called within 5 seconds of the foreground service being created. Without any visibility into the code base, it’s difficult to debug the issue further.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 46 (38 by maintainers)

Most upvoted comments

@helenaford Whoa! Found the root cause! If I set largeIcon to a slightly large local file (256x256, 25KB PNG), it leads to a crash. Resizing the image down to 128x128 fixed the problem. I suspect Android 8 has slow or buggy PNG decoding that eventually leads to the 5s timeout. Here’s the minimal code to reproduce it.

import React from 'react';
import { AppRegistry } from 'react-native';
import notifee from '@notifee/react-native';

notifee.registerForegroundService(async () => {
  return new Promise(() => {});
});

notifee.createChannel({
  id: 'channel',
  name: 'channel',
});

function Test() {
  React.useEffect(() => {
    notifee.displayNotification({
      title: 'Title',
      body: 'Body',
      android: {
        channelId: 'channel',
        largeIcon: require('./image.png'),
        asForegroundService: true,
      },
    });
  }, []);
  return null;
}

AppRegistry.registerComponent('Test', () => Test);

Thanks for open sourcing the library! Will definitely take a deeper look into this one during the weekends.

You always log the tough ones @mars-lan ! 😆

I know it is a really tricky bug to fix, but any idea when the fix can be delivered? And is there any way we can contribute to the native side of the project?

So I guess ContextHolder.getApplicationContext().startForegroundService(intent) has to be used to avoid the trampoline issue, doesn’t it?

I dug in a bit more. From what I found, it is not related to the trampoline effect. startForegroundService is only used because Notifee assumes that a foreground push may be started from the background, which has not been allowed since Build.VERSION_CODES.O except under a few conditions.

So:

  1. If the foreground service is started from the foreground, we can always use ContextHolder.getApplicationContext().startService(intent) and not worry about the 5s limit.
  2. Based on this doc, a foreground service can still be started from the background when some conditions are satisfied, e.g. after receiving a high-priority message.

I think that starting a foreground service from the background without satisfying the conditions linked in (2) is rather an edge case.

Hence I propose we add a setting in displayNotification, such as startServiceWithPromise, to be used alongside asForegroundService. If startServiceWithPromise is false, we would always use the older ContextHolder.getApplicationContext().startService(intent).

diff --git a/android/src/main/java/app/notifee/core/ForegroundService.java b/android/src/main/java/app/notifee/core/ForegroundService.java
index 9150047..7d16768 100644
--- a/android/src/main/java/app/notifee/core/ForegroundService.java
+++ b/android/src/main/java/app/notifee/core/ForegroundService.java
@@ -45,7 +45,7 @@ public class ForegroundService extends Service {
     intent.putExtra("notification", notification);
     intent.putExtra("notificationBundle", notificationBundle);
 
-    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
+    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O && startServiceWithPromise) {
       ContextHolder.getApplicationContext().startForegroundService(intent);
     } else {
       // TODO test this on older device

For example, startServiceWithPromise: false could be used when a high-priority push initialises a call.
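
The branch in the diff above can be modelled as a plain Java decision function. This is only a sketch under stated assumptions: StartPolicy, StartCall, and choose are hypothetical names, the API level is passed in as an int instead of being read from Build.VERSION.SDK_INT so that it runs without the Android SDK, and startServiceWithPromise is the option proposed in this comment, not an existing Notifee setting.

```java
// Hypothetical model of the proposed branch; not Notifee's actual code.
final class StartPolicy {
    // Value of Build.VERSION_CODES.O (Android 8.0).
    static final int API_O = 26;

    enum StartCall { START_SERVICE, START_FOREGROUND_SERVICE }

    // sdkInt stands in for Build.VERSION.SDK_INT.
    // startServiceWithPromise is the proposed option, assumed for illustration.
    static StartCall choose(int sdkInt, boolean startServiceWithPromise) {
        if (sdkInt >= API_O && startServiceWithPromise) {
            // This path carries the deadline to call startForeground().
            return StartCall.START_FOREGROUND_SERVICE;
        }
        // Older path; assumes the app is currently allowed to start services.
        return StartCall.START_SERVICE;
    }
}
```

With this model, choose(26, false) selects the startService path on Android 8.0, while choose(26, true) keeps the current behaviour.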

We just got hit by this issue today - after releasing to production we saw at least 20% of crashes. Locally it is extremely hard to reproduce (both debug and release variants).

Before integrating Notifee for foreground push, we used our own solution, which never crashed, so I was wondering why Notifee does crash. The only difference in our case is that we always used ContextHolder.getApplicationContext().startService(intent) regardless of Android version, but we got hit by the trampoline issue, hence we integrated Notifee as a fix.

I am wondering whether Notifee really has to use ContextHolder.getApplicationContext().startForegroundService(intent) for Build.VERSION.SDK_INT >= Build.VERSION_CODES.O in ForegroundService.java.

I see that in startForegroundService it says:

Note: Beginning with SDK Version Build.VERSION_CODES.S, apps targeting SDK Version Build.VERSION_CODES.S or higher are not allowed to start foreground services from the background. See Behavior changes: Apps targeting Android 12 for more details.

So I guess ContextHolder.getApplicationContext().startForegroundService(intent) has to be used to avoid the trampoline issue, doesn’t it?

But then it looks like Notifee could change the condition from Build.VERSION.SDK_INT >= Build.VERSION_CODES.O to Build.VERSION.SDK_INT >= Build.VERSION_CODES.S so it only affects Android 12+.

At least until the real fix is implemented.
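
To make the effect of that stopgap concrete, here is a small sketch comparing the two thresholds. StopgapCondition and usesForegroundStart are hypothetical names, and the API level is passed in as an int so the snippet runs without the Android SDK. It only models the branch; whether plain startService is permitted from the background on API 26 to 30 is a separate question.

```java
// Hypothetical sketch of the stopgap condition; not Notifee's actual code.
final class StopgapCondition {
    static final int API_O = 26; // Build.VERSION_CODES.O (Android 8.0)
    static final int API_S = 31; // Build.VERSION_CODES.S (Android 12)

    // True when the startForegroundService() branch would be taken
    // for the given device API level and version threshold.
    static boolean usesForegroundStart(int sdkInt, int threshold) {
        return sdkInt >= threshold;
    }
}
```

Under this model, an Android 8.0 device (API 26) takes the startForegroundService branch with the current threshold (API_O) but not with the proposed one (API_S), so only Android 12+ would remain on that path.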

@helenaford after rolling out an update that uses an Android drawable for largeIcon, we still observed a handful of crashes on Android 8 in production. Looks like we may need to implement a proper fix as per @mikehardy’s suggestion to avoid the 5s timeout.

I must admit I was dreading the ping on this one - all I can offer is that the previous one from @sambegin is one of only a handful of messages I leave marked as unread in my software mail box. I have not made specific progress on it but am still slowly working towards it on the stack of todo items I’ve got in my Invertase queue. Cold comfort probably, but it is on the radar.

I think we have occurrences of the crash on Android 11 as well. We are still at the beginning of our investigation; we’ve halted our rollout until we have a better idea of the impact and how to reproduce it.


Wow! Great digging @mars-lan - lot of electrons spent discussing this one. Looks like https://stackoverflow.com/a/57521350/9910298 is a great example of how to use bindService.

I’m a little confused by your last paragraph though - not to read too much into it, but could your last sentence (“As a developer…”) be considered a new paragraph?

To be precise, are you saying that unless this API is fixed in general (“the ability to invoke displayNotification in background”) you’d rather not touch it, but you think the bindService path could fix it? Or, are you saying that by your read on the links you posted it is unfixable?

I ask because it seemed as though the bindService style (or even JobService maybe - but bindService easier…) would fix it.

Then you’d have displayNotification in the background and no crashes, which would be ideal. But I might be missing something you’ve already realized.

@helenaford Based on https://stackoverflow.com/questions/44425584/context-startforegroundservice-did-not-then-call-service-startforeground & https://issuetracker.google.com/issues/76112072, this seems like a common issue on Android. In fact, there’s simply no way to guarantee that the system will even start the foreground service within 5s, regardless of how little work there is between startForegroundService & startForeground. This is consistent with what we observed in the wild even after dropping largeIcon altogether.

Looks like the only foolproof way to avoid ANR is context.bindService + startService + service.startForeground. As a developer, I’d rather lose the ability to invoke displayNotification in the background than deal with potential crashes that I can’t control.