graphql-upload: FileStreamDisconnectUploadError when checking file validity

Hello,

The problem

I have a problem when validating the files sent in a multipart request from the client.

The client sends a multipart request with at most 5 different files, then the server loops through them to check that every file is an instance of stream.Readable. While looping through the files an error occurs: FileStreamDisconnectUploadError: Request disconnected during file upload stream parsing..

The first file is checked, but the other files are never read.

This is what my loop looks like:

for (const pic of data.pictures) {
  const resolved = await pic;
  if (!(resolved.stream instanceof stream.Readable)) return new Error('Wrong file');
}

The error is raised when executing the line const resolved = await pic;. The first file fills the resolved variable, but it is never filled again with the next files.

What I tried

  • I tried to destroy the stream after the check: resolved.stream.destroy();
  • I tried to replace the for...of loop with a map

Some thoughts

Before making the requests from the client, I sent them using curl according to these examples, but the error still occurred.

I work on a MacBook with macOS Sierra 10.12.6 and curl 7.54.0, which uses SecureTransport, as it is the native TLS library.

What is interesting is that my coworkers, who work with the same MacBook as mine but with macOS High Sierra and curl 7.54.0 using LibreSSL as the TLS library, manage to get the requests to work with the server. Of course we tried the exact same requests against a server running on my Mac.

Do you have any idea why the error occurs, and why await pic never manages to complete?

Thanks for your time.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 38 (8 by maintainers)

Most upvoted comments

Apologies for my unconstructive tone.

I finally see what you mean now. The query example explains it all.

Again, I realised that talking about code but not seeing the code often leads to an unconstructive discussion.

PS: I’ve refactored the code back to its original state a little above in this thread: https://github.com/jaydenseric/graphql-upload/issues/65#issuecomment-382922064

Your “less ideal” condition is impossible. There is literally no scenario in the world where uploading file B depends on the contents of file A. (Btw, the great and mighty graphql-upload module has a feature called “variables” to carry any data besides the files.)

You may want to reread my post… my “less ideal” scenario has nothing to do with files depending on each other, or on other data. Instead, it has to do with the HTTP request, which is a linear stream of bytes that can contain uploads in a different order than you expect in your resolver. Consider this example query:

query ($uploadA: Upload!, $uploadB: Upload!, $uploadC: Upload!, $uploadX: Upload!) {
  foo: findPeople(byFacePhotos: [$uploadB, $uploadX, $uploadA]) {
    name
  }
  bar: findPeople(byFacePhotos: [$uploadA, $uploadB, $uploadC]) {
    name
  }
}

This kind of request is valid GraphQL and must still be valid when using the upload spec to do variable replacement. Therefore, the upload spec cannot constrain the order of uploads in the HTTP request such that it matches the order of appearance in every list.

It is completely possible that your upload stream will contain the contents of $uploadC first, followed by $uploadB, then $uploadA. You don’t get to decide this, because you aren’t sending the request; the client is. If your code is waiting on $uploadA before processing $uploadB, then it is inefficient.
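For context, the request shape this library implements (the GraphQL multipart request spec) looks roughly like the sketch below. The operations and map field names come from the spec; the boundary, query, and filenames here are made up for illustration. The key point is that the file parts after map can arrive in whatever order the client chose:

```text
POST /graphql HTTP/1.1
Content-Type: multipart/form-data; boundary=xyz

--xyz
Content-Disposition: form-data; name="operations"

{ "query": "query ($uploadA: Upload!, $uploadB: Upload!) { ... }",
  "variables": { "uploadA": null, "uploadB": null } }
--xyz
Content-Disposition: form-data; name="map"

{ "0": ["variables.uploadA"], "1": ["variables.uploadB"] }
--xyz
Content-Disposition: form-data; name="1"; filename="b.jpg"

(bytes of b.jpg — note that part "1" may precede part "0")
--xyz
Content-Disposition: form-data; name="0"; filename="a.jpg"

(bytes of a.jpg)
--xyz--
```

Because the body is a single linear byte stream, a resolver that blocks on one upload promise can prevent the parser from ever reaching a later part.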

Also, there are scenarios where business logic requires one upload to be present before another becomes useful; just because you haven’t run into one doesn’t mean they don’t exist. Your tone here is painfully hyperbolic and very much in contrast to the meticulous attention to edge cases that has made this library suitable for such a wide range of use cases. If you have good feedback, we want to hear it, and dismissive hand-waving is counterproductive to meaningful communication of an intricate subject.


Promise.all was invented to execute several things at the same time, like saving data to 3 databases simultaneously. Suggesting Promise.all implies that we are consuming A, B, and C simultaneously. But this is a very false assumption. We are confusing newbie developers with it too much.

That’s not how promises work. Promises execute eagerly, meaning that they run as soon as they are created. Something like Promise.all does not cause them to run; instead it “pauses” the current flow when using await (or waits to call any functions registered with .then()) until all its promises (which are already running) resolve.
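A minimal demonstration of this eagerness (the names and timings here are made up for illustration): both executors log before anything is awaited, and Promise.all merely waits for work that is already in flight.

```javascript
// Promises start running when created, not when awaited.
const a = new Promise((resolve) => {
  console.log("a started"); // logs immediately
  setTimeout(() => resolve("a"), 100);
});
const b = new Promise((resolve) => {
  console.log("b started"); // also logs immediately, before any await
  setTimeout(() => resolve("b"), 50);
});

// Promise.all does not "start" anything; it waits for the
// already-running promises and preserves their input order.
Promise.all([a, b]).then((results) => {
  console.log(results); // [ 'a', 'b' ]
});
```

Note that b settles first (50 ms vs 100 ms), yet the results array keeps the input order.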


I suspect your example does not work the way you think it works, and if it does, I don’t believe it represents a common use case. In my experience, it’s been standard practice to ensure the file is stored without error before returning a “successful” response.

Keeping in mind that the HTTP response won’t make it to the browser until the request has finished, your resolver should almost certainly be refactored to:

async function resolver(root, { files }, context) {
  try {
    await Promise.all(
      files.map(async (file) => {
        const { stream, filename /*, mimetype, encoding */ } = await file;
        await s3.upload({ Key: filename, Body: stream }).promise();
      })
    );
    return { success: true };
  } catch (err) {
    log.error(err);
    return { success: false, message: err.message };
  }
}

@koresar @mike-marcacci Thanks for the additional discussion and explanation. For posterity, I was not so much concerned with the order of file uploads being non-deterministic; I was more worried about whether the uploads to the second endpoint (S3 in the examples above) would all be initiated at the beginning, meaning whichever upload was not first would leave that endpoint’s connection hanging while the first upload completed (from the originating client’s multipart form upload), potentially leading to timeouts.

I created a test repo that performed all these steps. Looking through the log messages I added, it looks like the uploads to the second endpoint are not all initiated at the beginning of the mutation.

Edit: I suppose the resolution of the await file promise (file being the graphql Upload) is ambiguous to me: does the resolution of that promise mean the server has begun receiving data for that file in the multipart upload, or that the upload completed?

[0] 2019-07-16 12:51:18 info [graphql]: uploadFiles mutation start (upload server url: http://localhost:4001/upload)
[0] 2019-07-16 12:51:18 info [upload-client]: [file1] await "file" start
[0] 2019-07-16 12:51:18 info [upload-client]: [file2] await "file" start
[0] 2019-07-16 12:51:18 info [graphql]: uploadFiles await promises start
[0] 2019-07-16 12:51:18 info [upload-client]: [file1] await "file" complete: filename: file1.jpg
[0] 2019-07-16 12:51:18 info [upload-client]: [file1:file1.jpg] upload request start
[1] 2019-07-16 12:51:18 info [upload-server]: new POST request to /upload, transfer-encoding: chunked
[1] 2019-07-16 12:51:18 info [upload-server]: [file1.jpg] multipart file (image/jpeg) begin: -> /var/folders/0f/5j5z00r51lx69t070qf1p8jc0000gn/T/upload_7215a7aaa8b484db9d2a1f88a4ed6333
[0] 2019-07-16 12:51:18 info [upload-client]: [file2] await "file" complete: filename: file2.jpg
[0] 2019-07-16 12:51:18 info [upload-client]: [file2:file2.jpg] upload request start
[1] 2019-07-16 12:51:18 info [upload-server]: new POST request to /upload, transfer-encoding: chunked
[1] 2019-07-16 12:51:18 info [upload-server]: [file2.jpg] multipart file (image/jpeg) begin: -> /var/folders/0f/5j5z00r51lx69t070qf1p8jc0000gn/T/upload_3e0ca4ba5ae70692288d9e6f00212e8d
[1] 2019-07-16 12:51:18 info [upload-server]: [file1.jpg] multipart file (image/jpeg) end: -> /var/folders/0f/5j5z00r51lx69t070qf1p8jc0000gn/T/upload_7215a7aaa8b484db9d2a1f88a4ed6333
[0] 2019-07-16 12:51:18 info [upload-client]: [file1:file1.jpg] upload request end: '{"ok":true}'
[1] 2019-07-16 12:51:18 info [upload-server]: [file2.jpg] multipart file (image/jpeg) end: -> /var/folders/0f/5j5z00r51lx69t070qf1p8jc0000gn/T/upload_3e0ca4ba5ae70692288d9e6f00212e8d
[0] 2019-07-16 12:51:18 info [upload-client]: [file2:file2.jpg] upload request end: '{"ok":true}'
[0] 2019-07-16 12:51:18 info [graphql]: uploadFiles await promises end
[0] 2019-07-16 12:51:18 info [graphql]: uploadFiles mutation end: [ { ok: true }, { ok: true }, [length]: 2 ]

So I tried the solution of @hkjeffchan, and indeed consuming the stream in the loop makes the process work. However, I need to get the stream again to upload it to S3; I can’t do that at the same time as the validation of the files. I’ll try to pipe it to another stream as you suggested, or change my workflow.

I’m closing this issue, as it isn’t a bug but rather a misuse of the library.

Thanks for your help and your time everyone.

I just hacked the middleware.js from this package to log as many events as possible. After testing with different file sizes and sequence samples, I can confirm @koresar is correct.

  1. Files are provided sequentially via the multipart upload HTTP request.
  2. A for loop is enough; Promise.all is not required.
  3. The root problem with the original code below is that the stream is not CONSUMED; it is unrelated to Promise.all or the for loop.
for (const pic of data.pictures) {
  const resolved = await pic;
  if (!(resolved.stream instanceof stream.Readable)) return new Error('Wrong file');
}

TL;DR: You must consume the stream whenever a file resolves (const file = await fileList;): either pipe it to another stream, or call file.stream.resume() to discard it. Otherwise the underlying busboy will wait for you to consume the stream chunk and will eventually time out and crash the Node server.

SELF EDIT: this comment is WRONG, you may just skip to https://github.com/jaydenseric/apollo-upload-server/issues/65#issuecomment-383017000

I ran into a similar problem. Solution: create a new promise for your operation on each stream, then await Promise.all on your new promises. Do NOT await Promise.all for the streams to resolve and then do your operation with map; the first stream will block all the other streams from resolving.

This is similar to RxJS switchMap followed by combineAll.

@jaydenseric We changed computers recently and we all have the exact same MacBook, so this can’t be a hardware problem.

I’ll investigate some more to find the cause of the error.