google-cloud-node: Error: Endpoint read failed at /user_code/node_modules/@google-cloud/logging/node_modules/grpc/src/node/src/client.js:569:15

We are using Stackdriver to log errors for Firebase functions. We unpredictably get the error in the title. Typically it happens the first time we try to log an error after a function re-deploys. Subsequent error log writes will go through to Stackdriver without issue, but occasionally we’ll get the error again.

We’re using "@google-cloud/logging": "^1.0.2", and deployed via Firebase Functions.

Below is our module that implements the logging…

Anybody have any idea what is causing this?

const Logging = require('@google-cloud/logging');

// To keep on top of errors, we should raise a verbose error report with Stackdriver rather
// than simply relying on console.error. This will calculate users affected + send you email
// alerts, if you've opted into receiving them.
const logging = Logging();

// This is the name of the StackDriver log stream that will receive the log
// entry. This name can be any valid log stream name, but must contain "err"
// in order for the error to be picked up by StackDriver Error Reporting.
const logName:string = 'errors-fb-func';

// Enum of StackDriver severities
enum Severities {
  ERROR = 500, // ERROR	(500) Error events are likely to cause problems.
  CRITICAL = 600, // CRITICAL	(600) Critical events cause more severe problems or outages.
  ALERT = 700, // ALERT	(700) A person must take an action immediately.
  EMERGENCY = 800 // EMERGENCY	(800) One or more systems are unusable.
}

// Provide an error object, an optional severity level, and an optional user identifier
export function log(err:Error, logLevel:number=Severities.ERROR, user?:string): Promise<any> {
  // https://cloud.google.com/functions/docs/monitoring/error-reporting#advanced_error_reporting

  const FUNCTION_NAME = process.env.FUNCTION_NAME;
  const log = logging.log(logName);

  const metadata = {
    // MonitoredResource
    // See https://cloud.google.com/logging/docs/api/ref_v2beta1/rest/v2beta1/MonitoredResource
    resource: {
      // MonitoredResource.type
      type: 'cloud_function',
      // MonitoredResource.labels
      labels: {
        function_name: FUNCTION_NAME
      }
    },
    severity: logLevel
  };

  const context:any = {};
  if (user && typeof user === 'string') {
    // ErrorEvent.context.user
    context.user = user;
  }

  // ErrorEvent
  // See https://cloud.google.com/error-reporting/reference/rest/v1beta1/ErrorEvent
  let structPayload:any = {
    // ErrorEvent.serviceContext
    serviceContext: {
      // ErrorEvent.serviceContext.service
      service: `cloud_function:${FUNCTION_NAME}`,
      // ErrorEvent.serviceContext.resourceType
      resourceType: 'cloud_function'
    },

  };

  if (context) {
    // ErrorEvent.context
    structPayload.context = context;
  }

  structPayload.message = getMsgForError(err);

  return writeLog(log, metadata, structPayload);
}

function getMsgForError(error:Error): string {
  // https://cloud.google.com/functions/docs/monitoring/error-reporting#advanced_error_reporting
  // ErrorEvent.message
  if (error instanceof Error && typeof error.stack === 'string') {
    return error.stack;
  } else if (typeof error === 'string') {
    return error;
  } else if (typeof error.message === 'string') {
    return error.message;
  } else {
    logFatalError(error, "Error message type not supported");
    return "";
  }
}

function writeLog(log:any, metadata:any, structPayload:any): Promise<any> {
  console.log(metadata);
  console.log(structPayload);
  // Write the error log entry
  return new Promise((resolve, reject) => {
    try {
      log.write(log.entry(metadata, structPayload), (error:any) => {
        if (error) {
          logFatalError(error);
          reject(error);
        }
        resolve();
      });
    } catch(error) {
      reject(error);
    }
  });
}

// Utility function to log error if Logger fails
function logFatalError(error:Error, msg?:string): void {
  console.error(error, msg);
  throw error;
}

// error, critical, alert, and emergency each accept an Error object
// and set error.stack as the message
export function error(error:Error, user?:string): Promise<any> {
  return log(error, Severities.ERROR, user);
}

export function critical(error:Error, user?:string): Promise<any> {
  return log(error, Severities.CRITICAL, user);
}

export function alert(error:Error, user?:string): Promise<any> {
  return log(error, Severities.ALERT, user);
}

export function emergency(error:Error, user?:string): Promise<any> {
  return log(error, Severities.EMERGENCY, user);
}

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 82 (33 by maintainers)

Most upvoted comments

I’m seeing

{ Error: 14 UNAVAILABLE: TCP Read failed
    at new createStatusError (/user_code/node_modules/@google-cloud/datastore/node_modules/grpc/src/client.js:64:15)
    at /user_code/node_modules/@google-cloud/datastore/node_modules/grpc/src/client.js:583:15
  code: 14,
  metadata: Metadata { _internal_repr: {} },
  details: 'TCP Read failed' }

When trying to save() on datastore via GCF.

Greetings folks! I think we’ve resolved this issue. With the latest and greatest version of all libraries, you should now be getting a dependency on @grpc/grpc-js instead of grpc. This module is rewritten from the ground up, and uses the native HTTP/2 support in node core. We think the combination of this new module, and some fixes to timeouts on the Cloud Functions/App Engine backend should resolve the issue.

If you are running into this still … please let us know! Just make sure you’re using the latest version of the module, and that you post a stack trace (as above).

Edit: I was wrong

Also seeing an issue using logging-bunyan within Firebase Functions

Error: 14 UNAVAILABLE: TCP Read failed 
at Object.exports.createStatusError (/user_code/node_modules/@google-cloud/logging-bunyan/node_modules/grpc/src/common.js:87:15) 
at Object.onReceiveStatus (/user_code/node_modules/@google-cloud/logging-bunyan/node_modules/grpc/src/client_interceptors.js:1188:28)
at InterceptingListener._callNext (/user_code/node_modules/@google-cloud/logging-bunyan/node_modules/grpc/src/client_interceptors.js:564:42) 
at InterceptingListener.onReceiveStatus (/user_code/node_modules/@google-cloud/logging-bunyan/node_modules/grpc/src/client_interceptors.js:614:8) 
at callback (/user_code/node_modules/@google-cloud/logging-bunyan/node_modules/grpc/src/client_interceptors.js:841:24)

Also having this issue with Firestore and the PHP client. The first request after ~5 minutes returns an error; the second goes fine.

https://github.com/googlecloudplatform/google-cloud-php#cloud-firestore-beta

$store = new FirestoreClient([ 
    'keyFile' => ['permissions']
]);

$document = $store->document('records/1');

{
    "message": "OS Error",
    "code": 14,
    "status": "UNAVAILABLE",
    "details": []
}

Surprisingly, this method works fine.

$collection = $store->collection('records');

As an update, it seems to be more of an issue the first time a call is made to Stackdriver after a Firebase function is updated and restarted. Subsequent requests seem to go through OK, although the error does re-emerge sporadically. We really haven’t been able to get to the bottom of it.

Unfortunately, this causes the functions to time out and throw an “Unhandled Exception”, which is not going to work for us in production. So unless we can resolve this issue, we will have to replace Stackdriver with another logger.
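The pattern reported above (the first call fails with code 14 UNAVAILABLE, the next one succeeds) is the kind of transient failure a bounded retry can sometimes absorb. The following is only a minimal sketch of that idea, not a fix proposed in this thread: withRetryOnUnavailable, GrpcLikeError, and the attempt and delay defaults are hypothetical names and values chosen for illustration, and the sketch assumes the rejected error carries a numeric code property as in the traces above.

```typescript
// gRPC status code seen in the traces in this thread.
const UNAVAILABLE = 14;

// Hypothetical error shape: an Error with an optional numeric gRPC status code.
interface GrpcLikeError extends Error {
  code?: number;
}

// Runs `op`, retrying only when it rejects with code 14 (UNAVAILABLE).
// Any other error, or the final failed attempt, is rethrown unchanged.
async function withRetryOnUnavailable<T>(
  op: () => Promise<T>,
  maxAttempts = 2, // assumed default: one retry
  delayMs = 100 // assumed default pause between attempts
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      const code = (err as GrpcLikeError).code;
      if (code !== UNAVAILABLE || attempt === maxAttempts) {
        throw err;
      }
      // Brief pause before the next attempt.
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error('maxAttempts must be at least 1');
}
```

A caller would wrap the write, for example withRetryOnUnavailable(() => writeLog(log, metadata, structPayload)). Note that a retry only helps if the function is still running when the retry fires, so it does not replace waiting for the write before the function completes.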

Not currently, but we’re making progress on a fix

Same error here using @google-cloud/logging: 4.1.1 on GAE NodeJS8 Standard Environment.

Error: 14 UNAVAILABLE: TCP Read failed 
  at Object.exports.createStatusError (/srv/node_modules/grpc/src/common.js:87:15) 
  at Object.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:1188:28) 
  at InterceptingListener._callNext (/srv/node_modules/grpc/src/client_interceptors.js:564:42) 
  at InterceptingListener.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:614:8) 
  at callback (/srv/node_modules/grpc/src/client_interceptors.js:841:24) 

First errors detected in three different projects in the first week of December.

Smells a little like an internal networking issue (TCP protocol support surface), which may be unrelated to gRPC or the client libraries specifically. @JustinBeckwith perhaps create an internal bug and reference this issue for context?

I do believe it is a different issue. I have split it out here: https://github.com/GoogleCloudPlatform/google-cloud-node/issues/2544.

As for the original issue with TCP timeouts / endpoint read failures: those occur because Cloud Functions get aggressively throttled once you call the completion callback. This means that any background work still pending at that point is not guaranteed to be able to execute.

The solution is to make sure that any outstanding work you care about (e.g. log.write as above) is done before you call the completion callback.
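The advice above can be sketched as follows. This is a minimal illustration rather than the library's API: writeEntry is a hypothetical stand-in for a promise-returning log write (such as the writeLog function in the original post), and done stands for whatever completion callback the function runtime provides.

```typescript
// Records which entries have actually been "sent" (for illustration only).
const delivered: string[] = [];

// Hypothetical stand-in for a log write: resolves once the entry is delivered.
function writeEntry(message: string): Promise<void> {
  return new Promise<void>((resolve) => {
    // Simulate network latency for the write.
    setTimeout(() => {
      delivered.push(message);
      resolve();
    }, 10);
  });
}

// Anti-pattern: signals completion while the write is still in flight.
// After done() the instance may be throttled, so the pending write
// is not guaranteed to finish.
function handlerFireAndForget(done: () => void): void {
  writeEntry('fire-and-forget'); // the returned promise is dropped
  done();
}

// Safer: wait for the write to settle, then signal completion.
async function handlerAwaited(done: () => void): Promise<void> {
  try {
    await writeEntry('awaited');
  } finally {
    // Completion is signalled only after the write has resolved or rejected.
    done();
  }
}
```

For promise-based functions the same idea applies: return (or await) the promise from the log write instead of letting it float.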

We were experimenting with a few things, and one thing we tried was moving const logging = Logging(); inside the log() function. Since we’ve done that, we haven’t seen the error.

If ok, can we leave this open for a couple of days so that I can report back on if that fixed it or not?

If that is in fact the issue, then Firebase may want to update their docs: https://firebase.google.com/docs/functions/reporting-errors#importing_dependencies
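The change described above (moving const logging = Logging(); from module scope into log()) can be sketched roughly as below. LoggingClient and createLoggingClient are hypothetical stand-ins for the real client and its constructor, so that the sketch runs without the library. Why this helped is not confirmed in the thread; a plausible but unverified assumption is that a client created at module load holds a connection that has gone stale by the time a later invocation uses it.

```typescript
// Hypothetical stand-in for the object returned by Logging().
interface LoggingClient {
  id: number;
  write(message: string): Promise<void>;
}

// Counts how many clients have been constructed (for illustration only).
let clientsCreated = 0;

// Hypothetical stand-in for calling Logging().
function createLoggingClient(): LoggingClient {
  clientsCreated += 1;
  const id = clientsCreated;
  return {
    id,
    // A real client would send the entry over the network here.
    write: async (_message: string): Promise<void> => {},
  };
}

// Before: a single client created at module load and shared by every call.
// const logging = createLoggingClient();

// After: the client is created inside log(), once per call.
async function log(message: string): Promise<number> {
  const logging = createLoggingClient();
  await logging.write(message);
  return logging.id; // returned only so the sketch can show a fresh client per call
}
```

The trade-off is that every call pays the cost of constructing a client, which may matter for functions that log frequently.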