delta-rs: Simple delta write in Fabric notebook failing with SSL error

Delta-rs version: Python 0.12.0
Cloud provider: Microsoft (UK South)
Environment: Fabric Notebook


Bug

What happened: Writing a pandas DataFrame to a Delta table in Microsoft Fabric fails with an SSL error:

OSError: Generic MicrosoftAzure error: response error "request error", after 10 retries: error sending request for url (https://onelake.blob.fabric.microsoft.com/xxx/yyy.Lakehouse/Tables/Test/_delta_log/_last_checkpoint): error trying to connect: error:0A000086:SSL routines:tls_post_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1889: (self-signed certificate)

How to reproduce it: Install deltalake through Fabric library management, then run the following in a notebook:

import pandas as pd
from deltalake.writer import write_deltalake
from trident_token_library_wrapper import PyTridentTokenLibrary

# Storage token for OneLake, issued by the Fabric notebook runtime
aadToken = PyTridentTokenLibrary.get_access_token("storage")

TablePath = "abfss://xxx@onelake.dfs.fabric.microsoft.com/yyy.Lakehouse/Tables/Test"

df = pd.DataFrame({"id": [1, 2], "value": ["foo", "boo"]})

write_deltalake(TablePath, df, storage_options={"bearer_token": aadToken, "use_fabric_endpoint": "true"})

About this issue

  • State: closed
  • Created 8 months ago
  • Comments: 20 (10 by maintainers)

Most upvoted comments

Lastly, here is a really ugly workaround if you are still keen on trying runtime 1.2.

  1. Run !openssl s_client -connect onelake.blob.fabric.microsoft.com:443 to get the certificate.
  2. Copy the certificate, including the -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- lines, and write it out to a local file, e.g.
cert = """-----BEGIN CERTIFICATE-----
MIIFGzCCBAOgAwIBAgIUFO5FzvkmKVoyIlO8gQM8vkcNJ0kwDQYJKoZIhvcNAQEL
BQAwgZ8xCzAJBgNVBAYTAlVTMRMwEQYDVQQIDApXYXNoaW5ndG9uMRAwDgYDVQQH
<rest of the cert value here, shortened for brevity>
-----END CERTIFICATE-----
"""
with open("ca.cert", "w") as out:
    out.write(cert)
  3. Export the created file name in the SSL_CERT_FILE environment variable and things should work, e.g.
import os
from deltalake import DeltaTable

os.environ["SSL_CERT_FILE"] = "./ca.cert"

workspace_id = "<your workspace id here>"
lakehouse_id = "<your lakehouse id here>"
# aadToken is the storage token obtained earlier from PyTridentTokenLibrary
dt = DeltaTable(f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/{lakehouse_id}/Tables/bad", storage_options={"bearer_token": aadToken, "use_fabric_endpoint": "true"})
print(dt.version())
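
The manual copy-and-paste in the steps above could probably be scripted. The following is only a sketch: the helper names fetch_server_cert_pem and trust_cert_file are made up for illustration, and it assumes (as the workaround does) that the writer honours SSL_CERT_FILE when it is set before the first connection. It uses ssl.get_server_certificate from the Python standard library, which returns whatever certificate the server presents without verifying it.

```python
import os
import ssl


def fetch_server_cert_pem(host: str, port: int = 443) -> str:
    """Return the certificate the server presents, PEM-encoded (not verified)."""
    return ssl.get_server_certificate((host, port))


def trust_cert_file(pem: str, path: str = "ca.cert") -> str:
    """Write a PEM certificate to `path` and point SSL_CERT_FILE at it.

    Hypothetical helper: mirrors steps 2 and 3 of the manual workaround.
    """
    if "-----BEGIN CERTIFICATE-----" not in pem:
        raise ValueError("expected a PEM-encoded certificate")
    with open(path, "w") as out:
        out.write(pem)
    # Use an absolute path so a later change of working directory does not break it.
    os.environ["SSL_CERT_FILE"] = os.path.abspath(path)
    return os.environ["SSL_CERT_FILE"]
```

In a notebook this would be called as trust_cert_file(fetch_server_cert_pem("onelake.blob.fabric.microsoft.com")) before creating the DeltaTable. Note that it blindly trusts whatever certificate is returned, so it is no less ugly than the manual version.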

With this workaround you may want to consider closing this ticket; in my opinion this is not the place to resolve it.

Thanks @r3stl355 and @djouallah, really appreciate your time. Indeed, reverting the Fabric runtime lets it work fine! Really excited, as I work for a group of schools, so data volumes aren’t huge and I’m always looking for ways to keep compute costs low.

Much appreciated

Ben

Hmm, OK, I tried with deltalake 0.13 and got the same errors. I think the regression was introduced in the Fabric 1.2 runtime; for now it is better to use runtime 1.1, where it works fine.

OK @bcdobbs, ignore everything I wrote before 😁. This looks like a problem with the writer, because the Spark writer works and so does the direct API call (i.e. I can create a file under Files with a PUT). I’ll carry on digging.
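
The comment above does not show the direct API call it refers to. As a rough idea of what such a check might look like, here is a sketch that only builds the PUT request. The helper name build_create_file_request is hypothetical, and the URL shape and the resource=file query parameter are assumptions based on ADLS Gen2-style "create path" semantics; check them against the OneLake documentation before relying on them.

```python
import urllib.request


def build_create_file_request(
    workspace: str, lakehouse: str, file_path: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a PUT that would create an empty file under Files.

    Assumption: OneLake accepts an ADLS Gen2-style create call on its DFS endpoint.
    """
    url = (
        "https://onelake.dfs.fabric.microsoft.com/"
        f"{workspace}/{lakehouse}/Files/{file_path}?resource=file"
    )
    return urllib.request.Request(
        url,
        data=b"",  # empty body: the call only creates the file
        method="PUT",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Sending it with urllib.request.urlopen(build_create_file_request("xxx", "yyy.Lakehouse", "probe.txt", aadToken)) from the notebook would show whether the storage token and network path work independently of the delta-rs writer.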