nodejs-storage: v6.4.1 seems to introduce a bug with gzipped files

Environment details

  • OS: Cloud Run
  • Node.js version: 18.3.0 (ALPINE: 18.3-alpine3.15 Docker image)
  • npm version: 8.12.1
  • @google-cloud/storage version: 6.4.1

Steps to reproduce

I have a log file processing pipeline on Cloud Run/Node which is pretty well monitored and managed via CI/CD (GitHub -> Cloud Build -> Cloud Run). Dependabot raised a PR for 6.4.1 of this lib which passed all my tests and ran fine overnight on pre-live so I merged it which deploys to live. Immediately, the number of log files being rejected by the pipeline jumped and I saw that the Node app was logging errors: gzip error: unexpected end of file (code: Z_BUF_ERROR) The processing time per file also jumped massively (by an order of magnitude or 2). Once I reverted this version of the lib, the errors disappeared and processing time returned to normal.

The Node app Does this:

  1. Receives a “new GCS file” push from Eventarc (HTTP lib is Fastify)
  2. Streams the new file through the processing pipeline, using gunzip-maybe
  3. Streams resultant data into BigQuery

So I think the usage of this lib is relatively simple - i am only using a few methods (via async/await):

  • setMetadata()
  • exists()
  • bucket()
  • file()
  • delete()
  • createReadStream() - probably the most likely cuplrit here, i guess

Please let me know if you have more Qs. Sorry this isn’t a simple set of steps, I thought ^ was more practical in this case.

Making sure to follow these steps will guarantee the quickest resolution possible.

Thanks!

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 15 (9 by maintainers)

Most upvoted comments

Thanks for the additional information @neilstuartcraig, I believe https://github.com/googleapis/nodejs-storage/issues/2055 is related to the issue you are facing.

I will do some more digging and resolve this for you.

Hey @neilstuartcraig, thanks for your patience - we were finally able to reproduce the issue and a fix has been merged. The next release will public shortly:

* [chore(main): release 6.4.2 #2050](https://github.com/googleapis/nodejs-storage/pull/2050)

To better mitigate and catch issues like this in testing in the future we’re planning to implement a revamp of our stream implementation in the next major release:

* [refactor(deps)!: Remove `duplexify` dependency + Revamp Stream Implementation #2041](https://github.com/googleapis/nodejs-storage/issues/2041)

Sorry for the delay in reply…busy times! I rolled this out to dev and prod environments a few days back and it looks good. Thanks so much for your work and time on this, I really appreciate it.