dbt-core: [Bug]

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

packages:
  - git: "https://github.com/dbt-labs/dbt-utils.git"
    revision: 0.9.2

  - git: "https://github.com/dbt-labs/dbt-codegen.git"
    revision: 0.10.0

I tried this YAML file. The first package, git: "https://github.com/dbt-labs/dbt-utils.git", installed without problems, but when I tried the second one I got SSLError(SSLCertVerificationError(...)).

Expected Behavior

I expect to be able to install the dbt-codegen package.

Steps To Reproduce

dbt deps

Relevant log output

14:33:36  Encountered an error:
External connection exception occurred: HTTPSConnectionPool(host='hub.getdbt.com', port=443): Max retries exceeded with url: /api/v1/index.json (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1002)')))
(dbt_venv)

Environment

- OS: Windows 11 Enterprise
- Python: 3.11.4
- dbt: 1.6.1

Which database adapter are you using with dbt?

other (mention it in “Additional Context”)

Additional Context

I am using Databricks as the database adapter.

About this issue

  • State: closed
  • Created 10 months ago
  • Comments: 19 (9 by maintainers)

Most upvoted comments

@EricFabianHG The error message you are getting is on step 2 in the “Context” section below. My understanding is that requests is using a different CA certificate store than nc (Netcat) which explains why the latter could work while the former doesn’t. This is probably due to a configuration issue in your local environment rather than a bug in dbt-core.

See below for several different ideas for solutions.

Context

In a Python environment, when you’re using the requests library to initiate an HTTPS connection to a web server, the underlying machinery performs an SSL/TLS handshake to establish a secure channel. A critical part of this handshake involves verifying the server’s SSL certificate to ensure the connection is secure and the server is who it claims to be.

Here’s a high-level rundown:

  1. Server Certificate Presentation: The server presents its digital certificate, which includes the server’s public key and has been signed by a Certificate Authority (CA).

  2. Certificate Verification: The requests library contains a bundled CA certificate store (usually from certifi). During the handshake, it verifies the server’s certificate against this local store to ensure:

    • The certificate is signed by a trusted CA.
    • The certificate is not expired.
    • The hostname matches the CN (Common Name) or SAN (Subject Alternative Name) on the certificate.
  3. Session Key Exchange: After successful verification, the client and server agree on shared key material. In the classic RSA key exchange, the client generates a pre-master secret and encrypts it with the server’s public key; the server then decrypts it using its private key. Modern TLS configurations typically use an ephemeral Diffie-Hellman exchange instead. Either way, both parties derive the same symmetric session keys for data encryption and decryption.

  4. Secure Communication: Now, both the client and the server use the session key to encrypt and decrypt the data being sent between them, ensuring confidentiality and integrity.
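
To make the steps above concrete, here is a minimal sketch that performs the same kind of verified handshake using only Python's standard library. The helper names make_verifying_context and describe_handshake are made up for illustration. One caveat: the ssl module loads the platform's default trust store unless you pass cafile, whereas requests normally uses the certifi bundle, so the two can disagree on the same machine.

```python
import socket
import ssl


def make_verifying_context(cafile=None):
    """Return a client TLS context that verifies the chain and the hostname.

    cafile is an optional path to a PEM bundle of trusted CAs. When omitted,
    Python's ssl module loads the platform's default trust store.
    """
    return ssl.create_default_context(cafile=cafile)


def describe_handshake(host, port=443, cafile=None, timeout=10):
    """Perform a TLS handshake with host and summarize the verified certificate.

    Raises ssl.SSLCertVerificationError if certificate verification (step 2) fails.
    """
    context = make_verifying_context(cafile)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # wrap_socket runs the handshake, so steps 1-3 happen on this line.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "tls_version": tls.version(),
                "issuer": cert.get("issuer"),
                "not_after": cert.get("notAfter"),
            }


# Example (requires network access):
#   print(describe_handshake("hub.getdbt.com"))
```

If this call fails with the same "self-signed certificate in certificate chain" message, the issuer your machine sees is probably not in the trust store being used, which is typical when a corporate proxy re-signs traffic.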

Python script for troubleshooting

Here’s a Python script that you can use to troubleshoot your environment that mimics what dbt deps is doing here – the key is to only change things in your environment rather than changing anything within the script itself. See below for ideas of environmental changes that have worked for others.

test_requests_https.py

import os
import requests
from requests.utils import DEFAULT_CA_BUNDLE_PATH

url = "https://hub.getdbt.com"

try:
    response = requests.get(url, timeout=10)
    print("It worked")
except Exception as e:
    print("It didn't work\n")

    # Show the underlying error and the settings that affect certificate verification
    print(f"Error: {e}")
    print(f"HTTPS_PROXY: {os.environ.get('HTTPS_PROXY')}")
    print(f"REQUESTS_CA_BUNDLE: {os.environ.get('REQUESTS_CA_BUNDLE')}")
    print(f"CURL_CA_BUNDLE: {os.environ.get('CURL_CA_BUNDLE')}")
    print(f"DEFAULT_CA_BUNDLE_PATH: {DEFAULT_CA_BUNDLE_PATH}")

Run the script:

python test_requests_https.py
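
The script prints REQUESTS_CA_BUNDLE and CURL_CA_BUNDLE because, as far as I understand it, requests prefers those environment variables over its default bundle when verification is left at the default. Here is a rough, standard-library-only sketch of that precedence. resolve_ca_bundle is a hypothetical helper, not part of requests, and it assumes verify=True with environment settings trusted.

```python
import os


def resolve_ca_bundle(default_bundle, environ=None):
    """Return the CA bundle path that would take effect.

    Assumed precedence: REQUESTS_CA_BUNDLE, then CURL_CA_BUNDLE, then the
    default bundle path (DEFAULT_CA_BUNDLE_PATH in the script above).
    """
    env = os.environ if environ is None else environ
    return (
        env.get("REQUESTS_CA_BUNDLE")
        or env.get("CURL_CA_BUNDLE")
        or default_bundle
    )
```

For example, calling resolve_ca_bundle(DEFAULT_CA_BUNDLE_PATH) in the same session as the test script shows which file is actually being consulted.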

Solutions that have worked for others

⚠️ Since some of the following ideas will modify your Python environment, I’d recommend using an isolated virtual environment to test them out.

  1. Install/upgrade certifi: This Python package provides an up-to-date bundle of CA certificates derived from Mozilla’s CA trust store. It can help resolve issues related to expired or missing CA bundles.

    python -m pip install --upgrade certifi
    

    See here for a 2nd step that may or may not be required.

  2. Install python-certifi-win32: On Windows, this package augments certifi to use certificates from the Windows system’s trust store (Windows Certificate Store) instead of certifi’s built-in trust store.

    python -m pip install python-certifi-win32
    
  3. Install pip-system-certs: This package is an alternative to python-certifi-win32, and it patches pip and requests at runtime to use certificates from the default system store.

    python -m pip install pip-system-certs
    
  4. Set the REQUESTS_CA_BUNDLE environment variable: If you have a custom CA bundle (e.g., for a corporate environment), you can set the REQUESTS_CA_BUNDLE environment variable to its path. This will instruct requests to use it for all outgoing connections.

    export REQUESTS_CA_BUNDLE=/path/to/custom/cabundle.pem
    
  5. Set the HTTPS_PROXY environment variable: If you’re behind a proxy, setting this variable tells requests to route HTTPS traffic through it. Whether this helps or causes further problems depends on whether your proxy intercepts TLS traffic and re-signs certificates.

    export HTTPS_PROXY=http://your.proxy.server:port
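
For option 4, one common way to produce a custom bundle is to append your organization's root certificate to an existing bundle and point REQUESTS_CA_BUNDLE at the result. The sketch below assumes you already have the corporate root CA exported as a PEM file. build_combined_bundle and all file names are placeholders, not anything dbt or requests provides.

```python
from pathlib import Path

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def build_combined_bundle(base_bundle, extra_ca, out_path):
    """Write base_bundle followed by extra_ca to out_path and return out_path.

    All three arguments are paths to PEM files (hypothetical names).
    """
    # CA bundles may contain non-ASCII comment lines, so read as UTF-8.
    base_text = Path(base_bundle).read_text(encoding="utf-8")
    extra_text = Path(extra_ca).read_text(encoding="utf-8")
    if PEM_HEADER not in extra_text:
        raise ValueError(f"{extra_ca} does not look like a PEM certificate")
    combined = base_text.rstrip("\n") + "\n" + extra_text.strip() + "\n"
    Path(out_path).write_text(combined, encoding="utf-8")
    return out_path
```

The base bundle could be the DEFAULT_CA_BUNDLE_PATH printed by the test script. Writing to a separate output file leaves the installed bundle untouched, so upgrading certifi later will not discard the change.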
    

Summary

After trying one or more of the solutions above, re-run the Python script to see whether it works:

python test_requests_https.py

Once the test script is working, then dbt deps should work as well.

#7153 has the same error that you reported. Another Windows user reported that the following worked for them. Could you give it a try and see if it works for you also?

python -m pip install python-certifi-win32

Then:

dbt deps

It works on my computer. Thanks a lot @dbeatty10

corporate VPN

Hi @dbeatty10

Thanks a lot for all the suggestions to solve my issue. I tried them all, but none of them worked for me.

You are right, I’m on a MacBook Pro M2 on a corporate VPN.

After asking around here, I found a solution: export HTTPS_PROXY="http://proxy-<your-proxy-here>"

Just setting this variable solves the issue for Mac and Windows users.
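
To confirm that Python actually picks up the proxy setting, something like the following sketch may help. It uses only the standard library; detect_https_proxy is a made-up helper name, and I am assuming requests ends up consulting the same environment variables.

```python
import urllib.request


def detect_https_proxy():
    """Return the proxy URL Python would use for https:// URLs, or None."""
    # getproxies() reads variables such as HTTPS_PROXY / https_proxy and may
    # fall back to system proxy settings on macOS and Windows.
    return urllib.request.getproxies().get("https")


print(f"Detected HTTPS proxy: {detect_https_proxy()}")
```

If this prints None after you set HTTPS_PROXY, the variable was probably set in a different shell session than the one running dbt deps.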

@andrea-montes-yello thanks for sending those logs 💪 This looks like a key portion:

Failed to establish a new connection: [Errno -5] No address associated with hostname

Your situation and error message look different from the other reports in this thread, and I suspect we may need to open an issue with sqlfluff.

In the meantime, I’ve opened https://github.com/dbt-labs/dbt-core/issues/8567 so we can investigate further.
