msgraph-sdk-javascript: Node.js stream support missing in LargeFileUploadTask
Bug Report
Prerequisites
- [x] Can you reproduce the problem?
- [x] Are you running the latest version?
- [x] Are you reporting to the correct repository?
- [x] Did you perform a cursory search? Yes, I found a gist that suggested reading the entire file into memory for each chunk slice 😃
Description
Attempting to pass a stream into `LargeFileUploadTask` as the file content fails with:

```
this.file.content.slice is not a function
```

```javascript
const fileObject = {
  size,
  content: fileStream,
  name: targetFileName,
};
const uploadTask = new LargeFileUploadTask(
  graphClient,
  fileObject,
  uploadSession
);
const response = await uploadTask.upload();
```
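The failure comes from the task slicing `content` for each range: `Buffer` (and the browser's `Blob`) expose a `.slice()` method, but a Node.js `Readable` stream does not. A quick demonstration of the mismatch (not SDK code, just the shape of the problem):

```javascript
const fs = require("fs");

const buffer = Buffer.from("hello world");
console.log(typeof buffer.slice); // "function": a Buffer can be sliced per range

const stream = fs.createReadStream("some-large-file.bin");
console.log(typeof stream.slice); // "undefined": hence "slice is not a function"
```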
Steps to Reproduce
- Take the example at https://github.com/microsoftgraph/msgraph-sdk-javascript/blob/HEAD/docs/tasks/LargeFileUploadTask.md
- Use `var readStream = fs.createReadStream(filename);` to get a file stream and pass it in as `content` (a fuller repro sketch follows below)
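A fuller repro sketch, assuming an authenticated Graph `client` and a OneDrive target (the upload-session URL and payload follow the pattern in the linked doc; treat the exact shapes as illustrative):

```javascript
const fs = require("fs");
const { LargeFileUploadTask } = require("@microsoft/microsoft-graph-client");

async function repro(client, filename, targetFileName) {
  const size = fs.statSync(filename).size;
  const readStream = fs.createReadStream(filename);

  // Create the upload session against the OneDrive createUploadSession endpoint.
  const uploadSession = await LargeFileUploadTask.createUploadSession(
    client,
    `/me/drive/root:/${targetFileName}:/createUploadSession`,
    { item: { "@microsoft.graph.conflictBehavior": "rename", name: targetFileName } }
  );

  const fileObject = { size, content: readStream, name: targetFileName };
  const uploadTask = new LargeFileUploadTask(client, fileObject, uploadSession);

  // Throws "this.file.content.slice is not a function": the task tries to
  // slice the stream as if it were a Buffer/Blob.
  return uploadTask.upload();
}
```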
Expected behavior: The stream flows and is accumulated into chunks that get uploaded.
Actual behavior: An error, stemming from the task's assumption that `content` is a Buffer.
Additional Context
Streams would allow a container limited to 100 MB of RAM to successfully upload a multi-GB file; LargeFileUploadTask is where that breaks down. Users of the SDK can't afford to load entire files into RAM if they're building a multi-tenant web app.
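To make the ask concrete, here is a minimal sketch (my own, not SDK code) of accumulating fixed-size chunks from a Node.js `Readable`, so memory stays bounded by the range size rather than the file size:

```javascript
const RANGE_SIZE = 5 * 1024 * 1024; // 5 MiB per uploaded range (illustrative)

// Read up to `size` bytes from a paused Readable, resolving with a Buffer,
// or with null once the stream is exhausted.
function readChunk(stream, size) {
  return new Promise((resolve, reject) => {
    const onReadable = () => {
      // read(size) returns null until `size` bytes are buffered; once the
      // stream has ended it returns whatever remains.
      const chunk = stream.read(size);
      if (chunk !== null) {
        cleanup();
        resolve(chunk);
      }
    };
    const onEnd = () => { cleanup(); resolve(null); };
    const cleanup = () => {
      stream.off("readable", onReadable);
      stream.off("end", onEnd);
      stream.off("error", reject);
    };
    stream.on("readable", onReadable);
    stream.on("end", onEnd);
    stream.on("error", reject);
    onReadable(); // data may already be buffered
  });
}

// Only one RANGE_SIZE chunk is ever held in memory at a time.
async function uploadInChunks(stream, uploadRange /* async (buffer, offset) */) {
  let offset = 0;
  let chunk;
  while ((chunk = await readChunk(stream, RANGE_SIZE)) !== null) {
    await uploadRange(chunk, offset);
    offset += chunk.length;
  }
}
```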
I'm willing to contribute a solution if one of the maintainers can help me out (I have little experience with TypeScript, and typing a stream is difficult).
Usage Information
SDK Version - [SDK version you are using]
- [x] Node (Check, if using Node version of SDK)
Node Version - 10+ (I tested multiple versions)
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 22 (12 by maintainers)
I am really interested in this too. In case someone needs it, I am using this drop-in alternative to the default LargeFileUploadTask in order to allow an upload from a generic Stream: https://gist.github.com/fabiofenoglio/388a80204c2f203893882644567205f7
usage (a rough sketch; the class name `LargeFileUploadStreamTask` below is illustrative, so check the gist for the exact API):
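```javascript
// Illustrative only: mirrors LargeFileUploadTask's interface but accepts a
// Node.js Readable stream as `content`. See the gist for the real class name
// and constructor shape.
const fileObject = {
  size,                 // total size in bytes, known up front
  content: fileStream,  // a Readable stream instead of a Buffer
  name: targetFileName,
};
const uploadTask = new LargeFileUploadStreamTask(graphClient, fileObject, uploadSession);
const response = await uploadTask.upload();
```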
@nikithauc thank you very much for your feedback. I will test this as soon as possible, but it will take a few days before I'll be able to get back on this. Have a nice day!
@nikithauc I'm in agreement. Our best bet is to use the documentation as spec as much as possible. With that said, it is a challenge to provide prescriptive guidance, as limits can change without us knowing. I'd expect an implementer of large file upload to test different range sizes to determine what is optimal for their scenario. So, we should leave it to the implementer to specify their range size based on their testing.

I can imagine that the optimal range size will be different for an application that occasionally uploads a file for a single account versus an application that uploads thousands of files across many accounts and across many organizations. Even different geo-locations may be serviced by different clouds that have a different number of servers, so that may be a factor. Applications that are uploading many large files may want to consider implementing heuristics to adjust the range size for subsequent uploads based on the performance of past uploads. I've gone on a bit of a tangent…
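For what it's worth, one constraint is documented: OneDrive/SharePoint upload sessions require every byte range except the last to be a multiple of 320 KiB. A small helper for letting implementers pick a range size while keeping that alignment (the helper itself is just a sketch):

```javascript
// OneDrive/SharePoint upload sessions require non-final ranges to be a
// multiple of 320 KiB (327,680 bytes).
const RANGE_GRANULARITY = 320 * 1024;

// Round a desired range size down to the nearest valid multiple,
// clamping up to one granule for very small inputs.
function alignRangeSize(desiredBytes) {
  const granules = Math.max(1, Math.floor(desiredBytes / RANGE_GRANULARITY));
  return granules * RANGE_GRANULARITY;
}

alignRangeSize(5 * 1024 * 1024); // 5 MiB -> 5242880 (exactly 16 granules)
alignRangeSize(100 * 1024);      // too small -> 327680 (one granule)
```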
We also need to make sure that we support large file attachment upload and the potential for future service APIs to support a similar mechanism.
I've been working on this in the Java SDK recently. The upload services behave a bit differently between OneDrive consumer (live accounts), OneDrive for Business, SharePoint, and file attachments on Outlook. It's worth testing all these scenarios to catch potential issues.
@naugtur We’d be happy to have you contribute here. @nikithauc will work with you on this.