node-apn: Getting app write error

Hi,

I have been trying to use the latest app version 2.x and keep encountering following issue. I am using the token base and everything works fine (able to receive notifications etc) and ~50% of the time it would return the following error after calling the provider.send() method. Only way to recover is to restart my node app.

Error after send method with 2 device token (take a while for the following error to return)

[{"device":"..masked..","error":{"jse_shortmsg":"apn write failed","jse_cause":{"code":"ETIMEDOUT","errno":"ETIMEDOUT","syscall":"read"},"jse_info":{},"message":"apn write failed: read ETIMEDOUT"}},{"device":"..masked...","error":{"jse_shortmsg":"apn write failed","jse_cause":{"code":"ETIMEDOUT","errno":"ETIMEDOUT","syscall":"read"},"jse_info":{},"message":"apn write failed: read ETIMEDOUT"}}]

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Reactions: 5
  • Comments: 69 (20 by maintainers)

Commits related to this issue

Most upvoted comments

I’ve reported this to Apple as rdar://30168506.

PR #480 is now available in 2.1.3.

Hey, friends. I’m the primary author of Pushy, and we’re receiving reports of similar issues. Since we’re seeing similar behavior (HTTP/500 after a period of inactivity) with an entirely separate codebase on an entirely separate platform (Java), I’m pretty convinced that this is an upstream problem.

To be clear, I think I’m hearing two different issues here, and want to make sure we’re on the same page about which one we’re also seeing:

  1. If left completely idle for a long time (i.e. no PING frames), the connection will become completely unresponsive.
  2. If a connection is left idle except for PING frames for a long time, messages sent with an (expired?) authentication token on that connection will result in an HTTP/500 error instead of a “token is expired” error.

If that’s accurate, let me be clear that we’re seeing the second problem, but not the first. My strong recommendation in that case is to file a bug report with Apple about the issue if you have been affected and can reproduce the issue.

@florianreinhart - I think we should make a call. This is essentially making the library unusable with Token based authentication. I think we should get PR by @cmdkoh reviewed ASAP, check-in and release a patch such that more ppl can test it. If there is no clear evidence that that PR resolves all the problems, we should make it clear in the docs that Token authentication has issues.

Anything I can do to help out? - Eager to see this resolved

Hello, friends!

I’m the maintainer of Pushy, an APNs provider library for Java. We’ve also seen this problem (please see https://github.com/relayrides/pushy/issues/408). As part of the debugging effort, I opened a bug report with Apple back in January. They’ve just responded wondering if this is still an issue and seeking more information.

If any of you are still getting a cascade of HTTP/500 responses, could you please email me directly at jon@turo.com?

Thanks kindly!

Hi Guys,

I’ve been testing my changes and is been a week and it seems like it solve my issue. Two things I changed.

  1. Adding Ping Frame (interval 60 seconds): This take care of my issue with hosting in Azure which disconnect TCP connection after 4 minutes idling - was mentioned here

  2. Add Token expiration check prior to sending Push: This take care of issue from APN when no 403 is returned (when token is expired over certain amount of time, e.g. x mins). Currently hardcoded to 55 minutes threshold - if token is issued over 55 minutes, a new token will be regenerated

If you guys are interested, you can test it using below GitHub repository (i submitted a pull request, hope the main contributor can verify the code and suggest better approach)

  • Update your package.json to use node-apn from the my branch:
{
  "dependencies": {
    "apn": "https://github.com/cmdkoh/node-apn/archive/fix_Getting_app_write_error_%23449_part4.tar.gz"
  }
}
  • You can change the default heartbeat of 60 seconds in config (in milliseconds)
var options = {
  token: {
    key: "path/to/key.p8",
    keyId: "...",
    teamId: "...",
  },
  production: false,
  heartBeat: 30000
};

var apnProvider = new apn.Provider(options);

One issue I did encounter is occasionally the node core returns

Warning: Possible EventEmitter memory leak detected. 11 wakeup listeners added. Use emitter.setMaxListeners() to increase limit

which I have not gotten the chance to check where I can solve this warning - this happens in the Ping method I believe

@quixoticaxis I know, I’ve read that doc again and again … 😃 But for now we are in a situation where our users did not received notifications at all… And I told myself, "what if you deploy in prod every 1 hour ? (That can happen in our case) Your connections between Apple would also restart each hour… " So clearly it needs to support that.

For now this is just a work around as I said, this is not the way we should handle that but I’ve put hours and hours in this package to find what’s the real cause of the problem and did not find anything… Need more time on that and so far it solved the problem 😉

I am testing the solution mentioned by @Discountrobot ( #454 and Code => HealthCheck) - So far seems promising…

I got the same. Usually I get this error after an interval (0.5 - 1 hour) of passive connection (i.e. when I keep APNS connection opened but don’t send no notifications).

I have to pull back my issue confirmation. In my case most likely it was broken json in rawPayload. At least when I fixed this issue on my side, I no longer saw this problem. I’ll come back if anything change. I’m sorry!

Upgraded last weekend. Service was idle (meaning no pushes were sent) for ~24h before a push was successfully delivered on Monday. Previously, pushes were not delivered once the service was idle for more than a few hours. So for me it was definitively an improvement.

@zaumreit Can you try again using the latest commit? (I updated the comment here https://github.com/node-apn/node-apn/issues/449#issuecomment-262670555)

P.S. you need to catch the error 500 and recreate the provider.

@jkoepcke We’ve implemented a “workaround” to let us time to find the real problem.

We’ve reset the provider (shutdown and creating new one) every 50 minutes… It’s been three days and we did not have any error yet again. Note that this is only a “dirty fix” before finding what’s the real problem.

Can confirm I am seeing this issue with the latest release too. Everything works fine when I start the server, some time later without any pushes sent, only getting 500. Restarting solves it. Any workaround available other than downgrading?

@sambegin , @casevictor I tried setting heart rate (PING interval) to about 5 minutes interval and all seems good until it reaches the 1 hour Token validity threshold (imposed by APNS) then all sub-sequence push will result in “500 InternalServerError”. I would expect APNS to return “403 ExpiredProviderToken” which should already been taken care in the client.js

@vzayko could you please elaborate on what might have been wrong with your payload?

We are seeing the same issue described by @cmdkoh but also sometimes getting "status":"500","response":{"reason":"InternalServerError"}}

I have tried to enable DEBUG=apn and are not seeing anything in the logs

Since the 2.x rewrite for the new HTTP/2 provider interface there isn’t a lot of internal logging in node-apn. Setting DEBUG=apn enables logging for node-apn, but you probably won’t get anything useful from that.

We should definitely add more debug logging. However, I am currently short on time. So any help would be appreciated.