snowflake-connector-nodejs: SNOW-1049322: Failing to load large data sets with snowflake-sdk ^v1.9.0 with message "Request to S3/Blob failed", works well with lower versions

  1. What version of NodeJS driver are you using? ^1.9.0

  2. What operating system and processor architecture are you using? Darwin and arm

  3. What version of NodeJS are you using? node 16.20.0 and npm 8.19.4

  4. What are the component versions in the environment (npm list)? └── snowflake-sdk@1.9.3

  5. Server version? 8.5.1

  6. What did you do?

Tried out this sample code from https://docs.snowflake.com/en/developer-guide/sql-api/submitting-requests, but because my data set is 6-7 MB in size, it fails with the message "Request to S3/Blob failed". We are observing this while upgrading snowflake-sdk from 1.6.23 to ^1.9.0. Things work fine with versions 1.6.*, 1.7.0, and 1.8.0. Is there a resolution for fetching large data sets with sdk version ^1.9.0?

```javascript
// Load the Snowflake Node.js driver.
var snowflake = require('snowflake-sdk');

// Create a Connection object that we can use later to connect.
var connection = snowflake.createConnection({
    account: "MY_SF_ACCOUNT",
    database: "MY_DB",
    schema: "MY_SCHEMA",
    warehouse: "MY_WH",
    username: "MY_USER",
    password: "MY_PWD"
});

// Optional: store the connection ID.
var connection_ID;

// Try to connect to Snowflake, and check whether the connection was successful.
connection.connect(
    function(err, conn) {
        if (err) {
            console.error('Unable to connect: ' + err.message);
        } else {
            console.log('Successfully connected to Snowflake.');
            connection_ID = conn.getId();
        }
    }
);

var statement = connection.execute({
  sqlText: "Select * from LargeDataSet limit 100",
  // sqlText: "Select * from LargeDataSet", // fails with "Request to S3/Blob failed"
  complete: function(err, stmt, rows) {
    if (err) {
      console.error('Failed to execute statement due to the following error: ' + err.message);
    } else {
      console.log('Successfully executed statement: ' + stmt.getSqlText());
    }
  }
});
```
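As a possible mitigation while the regression is investigated, the driver's streaming API can fetch large result sets incrementally instead of buffering every row in the statement object. The sketch below wraps `execute` with `streamResult: true` and consumes `stmt.streamRows()`; it assumes an already-connected `connection` and is illustrative only, not a confirmed fix for this bug.

```javascript
// Sketch: fetch a large result set incrementally via the driver's
// streaming API (streamResult + streamRows) rather than buffering all
// rows at once. Assumes `connection` is already connected; illustrative,
// not a confirmed fix for the S3/Blob regression discussed here.
function fetchLargeResult(connection, sqlText) {
  return new Promise(function (resolve, reject) {
    connection.execute({
      sqlText: sqlText,
      streamResult: true, // do not buffer all rows in the statement object
      complete: function (err, stmt) {
        if (err) {
          return reject(err);
        }
        var rows = [];
        stmt.streamRows()
          .on('error', reject)
          .on('data', function (row) { rows.push(row); }) // process row-by-row
          .on('end', function () { resolve(rows); });
      }
    });
  });
}
```

In a real mitigation you would process each row inside the `'data'` handler instead of accumulating them all, which is what keeps memory bounded.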
  7. What did you expect to see? With a minor version upgrade, we were expecting the code to be backward compatible. Expected data to be returned the same way as with v1.6.23, v1.7.0, or v1.8.0.

  8. Can you set logging to DEBUG and collect the logs? Can't upload logs due to company security policies.

  9. What is your Snowflake account identifier, if any? (Optional)

About this issue

  • State: closed
  • Created 5 months ago
  • Reactions: 1
  • Comments: 20

Most upvoted comments

@sfc-gh-dszmolka I was able to avoid using the outdated library and use the latest https-proxy-agent library that you guys use, and I was able to fetch data from S3 when I specified a specific url and headers. So, I tried doing the same in the sdk code by directly using the https-proxy-agent instead of the HttpsProxyOcspAgent class that is created, but that didn’t seem to affect anything.

So, I tried something weird. Normally my code for trying a certain url and headers is as follows:

```javascript
import { HttpsProxyAgent } from "https-proxy-agent";
import axios from 'axios';

let httpsAgent = new HttpsProxyAgent({
    host: hostString,
    port: portNum
});

let requestOptions = {
    method: 'GET',
    url: requestURL,
    headers: {
        "header1": "",
        "header2": ""
    },
    httpsAgent: httpsAgent
};

axios.request(requestOptions)
    .then(res => {
        console.log(res);
    })
    .catch(e => {
        console.log(e);
    });
```

This usually is able to fetch me all the data.

However, on a whim, I decided to try this exact same code inside sendRequest() in large_result_set.js and removed everything else. I even used the same hard-coded url and headers instead of the ones passed through options. The only difference was that I had to use require instead of import for the two packages. For some reason, this request does not go through and returns a 400. I find this behavior a little odd, as it should ignore any setting previously defined in the code and essentially mimic the same environment in which I originally tested this code, which was just another random folder on my computer.

Since HttpsProxyOcspAgent just extends https-proxy-agent, I was wondering if you could try sending one of the S3 requests that failed for you outside of the sdk like above to see if it seems to go through. This makes me think the agent setup and everything is fine but wonder if the sdk code flow is doing something strange, or if it’s somehow still an Axios bug.
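One axios-specific detail that may be worth ruling out in such experiments: when an explicit `httpsAgent` is supplied for proxying, axios can still apply its own environment-based proxy handling (from `HTTP_PROXY`/`HTTPS_PROXY`) unless `proxy: false` is set, which can effectively double-proxy the request. A hedged sketch, with illustrative names not taken from the sdk:

```javascript
// Sketch: when handing axios a pre-built proxy agent, explicitly disable
// axios's own proxy logic so the request is not proxied twice. The function
// and parameter names here are illustrative, not from the sdk code.
function buildProxiedRequestOptions(url, proxyAgent, headers) {
  return {
    method: 'GET',
    url: url,
    headers: headers || {},
    httpsAgent: proxyAgent, // e.g. new HttpsProxyAgent({ host, port })
    proxy: false            // stop axios from also reading HTTP(S)_PROXY
  };
}
```

If the standalone test was run in an environment without the proxy env vars set while the sdk process had them, that alone could explain differing behavior between the two runs.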

> have you tried submitting the same request url and headers through a different client to see if you get the same errors?

yes, this is exactly my point.

when I managed to reproduce the error symptoms (HTTP 400 and ssl3_get_record:wrong version number, using a random proxy I found on the net), the error was also reproducible with other clients, e.g. curl, as mentioned

so while it did reproduce the symptoms, it did not reproduce the actual error, which is exclusively specific to axios and the various bugs we discussed in this thread. Hope I'm making sense 😃

the bug is indeed with axios, but judging from the number of still-open issues, I'm not sure of the timeline on which they can address it. However, we can indeed make changes on our own side to mitigate the issues brought in by axios. The big challenge, or blocker I might say, is that there's apparently a very specific proxy setup needed to reproduce the issue only in axios and not in other http clients.

I have not yet had time to move forward with this and find a proxy that behaves well with other http clients and only breaks axios.

Of course, since you already have the environment which I don't have, it would be a massive help if you could tweak the agent setup on your side. Otherwise I'll keep researching how to set up the reproduction environment so we can see the exact issue for ourselves (only with axios, not with other http clients) and therefore be able to write a fix and tests for it.

@sfc-gh-dszmolka I'm working on this issue as well, and I've tried what you mentioned about unsetting the noProxy value. I've never actually initialized the Snowflake object with that property, but I tried doing so and it results in the same error either way. As far as versions go, this issue persists from snowflake-sdk 1.9.0 onward, not just in 1.9.3.

When I do debug the S3 request being sent by the sdk and try fetching the response manually through an http client, I’m actually able to successfully fetch a response without an issue with the proper headers. Although admittedly, I haven’t been able to replicate the same behavior through a manual axios fetch in code (or urllib), so that’s something I’ll play around with to see if I can make any progress.