google-cloud-cpp: [Q] Failed to use GCS client to download file in a basic scenario

I am using the C++ client to test basic upload/download scenarios with GCS.

TEST(StorageTest, TestGcsSync) {
  auto fs = nebula::storage::makeFS("gs", "nebula-com");
  auto lfs = nebula::storage::makeFS("local");
  auto content = "test";
  auto local = lfs->temp(false);
  std::ofstream out(local);
  out << content;
  out.close();

  // upload
  LOG(INFO) << "upload local file";
  fs->sync(local, "cdn/test.txt");

  // download
  auto local2 = lfs->temp(false);
  fs->sync("cdn/test.txt", local2);
  std::ifstream in(local2);
  std::string v;
  std::getline(in, v);
  in.close();
  EXPECT_EQ(v, content);
}

The download path just makes this one call:

   google::cloud::Status status = client_->DownloadToFile(bucket_, key, local);

In this test the upload works fine, but the download fails with these traces:

...
<< curl(Recv Header): alt-svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
<< curl(Recv Header):
>> curl(Recv Data): size=4
test                     74657374
 (/Users/shawncao/nebula/build/gcp/src/gcp/google/cloud/storage/internal/curl_handle.cc:151)
2021-02-18T20:31:19.766902000Z [DEBUG] <0x10e7c3dc0> ~CurlHandle == curl(Info): Failure writing output to destination
== curl(Info): stopped the pause stream!
== curl(Info): Connection #0 to host storage.googleapis.com left intact
 (/Users/shawncao/nebula/build/gcp/src/gcp/google/cloud/storage/internal/curl_handle.cc:151)
W0218 12:31:19.766983 243023296 GCS.cpp:97] Failed to download: DownloadFileImpl(ReadObjectRangeRequest={bucket_name=nebula-com, object_name=cdn/test.txt, disable-md5-hash=1}, /tmp/nebula.M1aNqa): cannot open download source object - status.message=Permanent error in Read(): EasyPause() - CURL error [43]=A libcurl function was given a bad argument [UNKNOWN]
/Users/shawncao/nebula/src/storage/test/TestStorage.cpp:240: Failure
...

From the trace, it looks like the content was downloaded successfully but could not be written to the temp file passed in as variable “local2”.

Not sure how to dig more, can you help?

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 35 (14 by maintainers)

Most upvoted comments

Just to close the loop. It seems like you are using a super build, which starts here, with no option to pick any particular libcurl:

https://github.com/varchar-io/nebula/blob/745fcbcb3d04b2e95b11e099f8256d182cd2b592/build.mac.sh#L74

That calls our CMake configuration steps here:

https://github.com/varchar-io/nebula/blob/745fcbcb3d04b2e95b11e099f8256d182cd2b592/ext/Gcp_Ext.cmake#L82

Our CMake code to find libcurl is here:

https://github.com/googleapis/google-cloud-cpp/blob/eeeb29f61d31cc407b17dd251c3dcab93c27bd0e/cmake/FindCurlWithTargets.cmake#L26

Most likely FindCURL finds the Xcode or the system libcurl version. You need to set up the search path for CMake (for example, via CMAKE_PREFIX_PATH) so that it finds the one in /usr/local/opt/curl.

PS: We recently added options so now you only need to list the libraries you want, and avoid the long list of ...=OFF:

https://github.com/varchar-io/nebula/blob/745fcbcb3d04b2e95b11e099f8256d182cd2b592/ext/Gcp_Ext.cmake#L60-L74

There do not seem to be further questions, closing for now. Feel free to reopen if needed.

Oh, I see: this macro comes from curlver.h in the system Xcode SDK, which defines LIBCURL_VERSION_NUM for version 7.64.1.

I was about to ask if there are two versions of libcurl in your system.

Question: shouldn’t the GCP library override it by detecting the real curl version at build time?

Which one is real? I mean, if both are available, how are we to choose for you? Dependency management is a gigantic headache in C and C++. 😭
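
For illustration, here is a minimal sketch of how a header/runtime mismatch like this one could be detected. libcurl packs its version into a single number of the form 0xXXYYZZ, so 7.64.1 is 0x074001 and 7.74.0 is 0x074A00. The helper names `DecodeCurlVersion` and `CurlVersionMismatch` are hypothetical and not part of google-cloud-cpp or libcurl. In a real program the two inputs would presumably be `LIBCURL_VERSION_NUM` from the headers you compiled against and `curl_version_info(CURLVERSION_NOW)->version_num` from the library loaded at run time.

```cpp
#include <cstdint>
#include <string>

// Decode libcurl's packed version number (0xXXYYZZ) into "major.minor.patch".
// Hypothetical helper, shown only to illustrate the encoding.
std::string DecodeCurlVersion(std::uint32_t version_num) {
  const unsigned major = (version_num >> 16) & 0xFF;
  const unsigned minor = (version_num >> 8) & 0xFF;
  const unsigned patch = version_num & 0xFF;
  return std::to_string(major) + "." + std::to_string(minor) + "." +
         std::to_string(patch);
}

// Returns true when the version seen in the headers at compile time differs
// from the version reported by the library at run time.
// Hypothetical helper; the caller supplies both numbers.
bool CurlVersionMismatch(std::uint32_t header_version,
                         std::uint32_t runtime_version) {
  return header_version != runtime_version;
}
```

With the versions from this thread, `DecodeCurlVersion(0x074001)` yields "7.64.1" and `DecodeCurlVersion(0x074A00)` yields "7.74.0", and comparing the two reports a mismatch. Any code guarded by `#if LIBCURL_VERSION_NUM ...` is decided by the headers alone, which is why a mismatch can pick the wrong branch.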

Can you tell us how you configured google-cloud-cpp? What arguments did you pass to cmake? Maybe that gives us some ideas to improve our CMake files to check for things better.

You might consider using vcpkg to install your dependencies, including google-cloud-cpp. That will give you a consistent set of libraries to work with.

Just FYI, I think we can rule out libcurl-7.74 by itself as a source of problems. I built libcurl from source myself, though; it is possible that the version compiled by homebrew has a lot more features enabled, or that it links a weird version of another library.

@shawncao I appreciate that you have already spent quite a while troubleshooting this, but we really will need your help with a way to reproduce this.

$ CLOUD_STORAGE_ENABLE_TRACING=raw-client,http GOOGLE_CLOUD_CPP_ENABLE_CLOG=yes ./cmake-build-debug/google/cloud/storage/examples/storage_object_file_transfer_samples download-file coryan-test-bucket foo/bar/baz/test.txt /tmp/test-2.txt

2021-02-19T02:44:59.012598000Z [INFO] <0x111e71e00> Enabling logging for http (../google/cloud/storage/client_options.cc:155)
2021-02-19T02:44:59.012776000Z [INFO] <0x111e71e00> Enabling logging for RawClient functions (../google/cloud/storage/client_options.cc:159)
2021-02-19T02:44:59.014811000Z [INFO] <0x111e71e00> ReadObject() << ReadObjectRangeRequest={bucket_name=coryan-test-bucket, object_name=foo/bar/baz/test.txt, disable-md5-hash=1} (../google/cloud/storage/internal/logging_client.cc:75)
2021-02-19T02:44:59.216938000Z [INFO] <0x111e71e00> Read() current_offset=0 (../google/cloud/storage/internal/retry_object_read_source.cc:49)
2021-02-19T02:44:59.266620000Z [DEBUG] <0x111e71e00> Wait == curl(Info):   Trying 2607:f8b0:4006:811::2010:443...
== curl(Info): Immediate connect fail for 2607:f8b0:4006:811::2010: No route to host
== curl(Info):   Trying 2607:f8b0:4006:802::2010:443...
== curl(Info): Immediate connect fail for 2607:f8b0:4006:802::2010: No route to host
== curl(Info):   Trying 2607:f8b0:4006:818::2010:443...
== curl(Info): Immediate connect fail for 2607:f8b0:4006:818::2010: No route to host
== curl(Info):   Trying 2607:f8b0:4006:803::2010:443...
== curl(Info): Immediate connect fail for 2607:f8b0:4006:803::2010: No route to host
== curl(Info):   Trying 172.217.7.16:443...
 (../google/cloud/storage/internal/curl_handle.cc:151)
2021-02-19T02:44:59.305350000Z [DEBUG] <0x111e71e00> Wait == curl(Info): Connected to storage.googleapis.com (172.217.7.16) port 443 (#0)
== curl(Info): ALPN, offering http/1.1
== curl(Info): successfully set certificate verify locations:
== curl(Info):  CAfile: /etc/ssl/cert.pem
== curl(Info):  CApath: /etc/ssl/certs
== curl(Info): TLSv1.3 (OUT), TLS handshake, Client hello (1):
 (../google/cloud/storage/internal/curl_handle.cc:151)
2021-02-19T02:44:59.331731000Z [DEBUG] <0x111e71e00> Wait == curl(Info): TLSv1.3 (IN), TLS handshake, Server hello (2):
== curl(Info): TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
== curl(Info): TLSv1.3 (IN), TLS handshake, Certificate (11):
== curl(Info): TLSv1.3 (IN), TLS handshake, CERT verify (15):
== curl(Info): TLSv1.3 (IN), TLS handshake, Finished (20):
== curl(Info): TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
== curl(Info): TLSv1.3 (OUT), TLS handshake, Finished (20):
== curl(Info): SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
== curl(Info): ALPN, server accepted to use http/1.1
== curl(Info): Server certificate:
== curl(Info):  subject: C=US; ST=California; L=Mountain View; O=Google LLC; CN=*.storage.googleapis.com
== curl(Info):  start date: Jan 26 09:04:57 2021 GMT
== curl(Info):  expire date: Apr 20 09:04:56 2021 GMT
== curl(Info):  subjectAltName: host "storage.googleapis.com" matched cert's "*.googleapis.com"
== curl(Info):  issuer: C=US; O=Google Trust Services; CN=GTS CA 1O1
== curl(Info):  SSL certificate verify ok.
>> curl(Send Header): GET /coryan-test-bucket/foo%2Fbar%2Fbaz%2Ftest.txt HTTP/1.1
Host: storage.googleapis.com
User-Agent: gcloud-cpp/v1.25.0+d9f08aa0e libcurl/7.74.0 OpenSSL/1.1.1i zlib/1.2.11 c-ares/1.14.0 libssh2/1.9.0 AppleClang 12.0.0.12000032
Accept: */*
Authorization: Bearer [censored]
x-goog-api-client: gl-cpp/AppleClang-12.0.0.12000032-ex-2011 gccl/v1.25.0+d9f08aa0e

 (../google/cloud/storage/internal/curl_handle.cc:151)
2021-02-19T02:44:59.544046000Z [DEBUG] <0x111e71e00> WriteCallback == curl(Info): TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
== curl(Info): TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
== curl(Info): old SSL session ID is stale, removing
== curl(Info): Mark bundle as not supporting multiuse
<< curl(Recv Header): HTTP/1.1 200 OK
<< curl(Recv Header): X-GUploader-UploadID: ABg5-Uyq7lQxUMR7hUDhqrSBwr2ozf3m2F5vG-4yHt-jCuYNIkF-IE8JlF5WbsvdKeSnkRetFHwl6Vy1K_QnyjaAuvw
<< curl(Recv Header): Expires: Fri, 19 Feb 2021 02:44:58 GMT
<< curl(Recv Header): Date: Fri, 19 Feb 2021 02:44:58 GMT
<< curl(Recv Header): Cache-Control: private, max-age=0
<< curl(Recv Header): Last-Modified: Fri, 19 Feb 2021 01:58:55 GMT
<< curl(Recv Header): ETag: "098f6bcd4621d373cade4e832627b4f6"
<< curl(Recv Header): x-goog-generation: 1613699934960790
<< curl(Recv Header): x-goog-metageneration: 1
<< curl(Recv Header): x-goog-stored-content-encoding: identity
<< curl(Recv Header): x-goog-stored-content-length: 4
<< curl(Recv Header): Content-Type: application/octet-stream
<< curl(Recv Header): x-goog-hash: crc32c=hqBywA==
<< curl(Recv Header): x-goog-hash: md5=CY9rzUYh03PK3k6DJie09g==
<< curl(Recv Header): x-goog-storage-class: MULTI_REGIONAL
<< curl(Recv Header): Accept-Ranges: bytes
<< curl(Recv Header): Content-Length: 4
<< curl(Recv Header): Server: UploadServer
<< curl(Recv Header): Alt-Svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
<< curl(Recv Header): 
>> curl(Recv Data): size=4
test                     74657374                                        
 (../google/cloud/storage/internal/curl_handle.cc:151)
2021-02-19T02:44:59.552679000Z [INFO] <0x111e71e00> xsgetn(): count=1572864, in_avail=4, status= [OK] (../google/cloud/storage/internal/object_streambuf.cc:128)
2021-02-19T02:44:59.552766000Z [INFO] <0x111e71e00> Read() current_offset=4 (../google/cloud/storage/internal/retry_object_read_source.cc:49)
2021-02-19T02:44:59.552821000Z [DEBUG] <0x111e71e00> Read == curl(Info): Connection #0 to host storage.googleapis.com left intact
 (../google/cloud/storage/internal/curl_handle.cc:151)
2021-02-19T02:44:59.552890000Z [INFO] <0x111e71e00> xsgetn(): count=1572864, in_avail=0, offset=4, read_result->bytes_received=0 (../google/cloud/storage/internal/object_streambuf.cc:201)
Downloaded foo/bar/baz/test.txt to /tmp/test-2.txt

I am a bit at a loss here. Can I ask you to do a little more troubleshooting? Well, maybe a lot more. The level of tracing we need is normally disabled at compile-time. Can you change this line:

https://github.com/googleapis/google-cloud-cpp/blob/eeeb29f61d31cc407b17dd251c3dcab93c27bd0e/google/cloud/storage/internal/curl_download_request.cc#L50

to read GCP_LOG(INFO) instead of GCP_LOG(TRACE) and run your test again? The log may contain all kinds of confidential information. Consider scrubbing it, or we can figure out a way for you to send it to my email address (should be guessable, I work at google, my username is the same as my github name).

Thanks, I missed the full error message. The error starts with EasyPause(); fortunately there are only two places where we call this:

https://github.com/googleapis/google-cloud-cpp/blob/eeeb29f61d31cc407b17dd251c3dcab93c27bd0e/google/cloud/storage/internal/curl_download_request.cc#L153-L165

and

https://github.com/googleapis/google-cloud-cpp/blob/eeeb29f61d31cc407b17dd251c3dcab93c27bd0e/google/cloud/storage/internal/curl_download_request.cc#L97-L99

I think it is the first one, since it says it was in Read(). I never liked that ugly #ifdef; it means we are doing something wrong, or at least weird… Did you mention what version of curl you were using? The first time we noticed the problem was with 7.69 or 7.69.1, but maybe it was a problem in earlier versions too and we never tested with (for example) 7.68.

This is a long shot, but I see in your documentation that you recommend installing libcurl4-gnutls-dev but we test with libcurl4-openssl-dev. What happens if you use libcurl4-openssl-dev instead?

From the trace, it looks like the content was downloaded successfully but could not be written to the temp file passed in as variable “local2”.

Ack.
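
One way to rule the local destination in or out is to check, independently of the GCS client, that the temp path can be opened and written. The sketch below uses only the standard library. `CanWriteTo` is a hypothetical helper name, and it assumes the path is a scratch file that is safe to truncate.

```cpp
#include <fstream>
#include <string>

// Returns true if `path` can be opened for writing and a small write succeeds.
// Note: this truncates the file, so only use it on scratch paths.
// Hypothetical helper for troubleshooting, not part of any library.
bool CanWriteTo(const std::string& path) {
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out.is_open()) return false;
  out << "probe";
  out.flush();
  return out.good();
}
```

If this returns true for the path passed as “local2”, the destination file is probably fine and the failure is more likely inside the HTTP layer.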

Not sure how to dig more, can you help?

The DownloadToFile() API returns a Status object that may have more information. Can you log it?
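
A minimal sketch of what that logging could look like. It relies only on the `ok()`, `code()`, and `message()` accessors that `google::cloud::Status` provides. `DescribeStatus` is a hypothetical helper, and `FakeStatus` is a hypothetical stand-in so the sketch compiles without the library installed.

```cpp
#include <sstream>
#include <string>

// Formats any status-like object exposing ok(), code(), and message().
// Hypothetical helper; the numeric code is printed via static_cast<int>
// so it works whether code() returns an int or an enum.
template <typename StatusLike>
std::string DescribeStatus(const StatusLike& status) {
  if (status.ok()) return "OK";
  std::ostringstream os;
  os << "code=" << static_cast<int>(status.code())
     << " message=" << status.message();
  return os.str();
}

// Hypothetical stand-in for google::cloud::Status, used only so this
// sketch is self-contained.
struct FakeStatus {
  int code_value;
  std::string text;
  bool ok() const { return code_value == 0; }
  int code() const { return code_value; }
  const std::string& message() const { return text; }
};
```

With the real client, the result of `client_->DownloadToFile(bucket_, key, local)` could be passed to `DescribeStatus` and written to the log next to the existing "Failed to download" warning.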