dbt-snowflake: [CT-1813] [CT-1378] [Bug] Python models not picking up custom schema

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When I try to create a python model using v1.3, I get an error that a schema was not provided. I have set a schema in my dbt_project.yml file for the directory that the model is in, and have the generate_schema_name set to change the schema based on the environment.

Expected Behavior

The python model will run and create a model in the same schema as sql models in the same directory

Steps To Reproduce

  1. Set the dbt_project.yml to a use a given schema, e.g.
models:
  dbt:
    datamarts:
      dim:
        schema: dim
  1. add a python file dbt/models/datamarts/dim/test_python.py in that directory, e.g:
def model(dbt, session):
    dbt.config(materialized="table")
    df = dbt.ref("my_other_model").to_pandas()
    return df
  1. create the macros generate_schema_name_for_env and generate_schema_name:
{% macro generate_schema_name_for_env(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if target.name = 'prod' and custom_schema_name is not none -%}
        {{ custom_schema_name | trim }}
    {%- else -%}
        {{ default_schema }}_{{custom_schema_name | trim}}
    {%- endif -%}
{%- endmacro %}


{% macro generate_schema_name(custom_schema_name, node) -%}
    {{ generate_schema_name_for_env(custom_schema_name, node) }}
{%- endmacro %}
  1. Run the python model

Relevant log output

"/usr/lib/python_udf/431102875043084cdb1968114cbff94386b72d5e184e97acf96b1780c382f509/lib/python3.8/site-packages/snowflake/connector/pandas_tools.py", line 122, in write_pandas
      raise ProgrammingError(
  snowflake.connector.errors.ProgrammingError: Schema has to be provided to write_pandas when a database is provided
   in function TEST_PYTHON__DBT_SP with handler main
  compiled Code at target/run/dbt/models/datamarts/dim/test_python.py

Environment

- OS: dbt Cloud IDE
- Python: python in dbt cloud IDE
- dbt: 1.3

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 4
  • Comments: 32 (17 by maintainers)

Most upvoted comments

@patkearns10 and I had a chance to live-debug this with a user who’s been running into the issue, and I think we got to the bottom of it! I’m still not sure why this is cropping up for only some users, and not others (me!) — and I’m pretty sure the root-cause fix here was (or should have been) in https://github.com/snowflakedb/snowflake-connector-python/pull/1274 — but this managed to reliably solve the problem.

The quickest fix was to add two lines within the Python model code, right before the return of the Pandas dataframe:

def model(dbt, session):
    dbt.config(schema="custom_schema")
    pandas_df = dbt.ref("my_model").to_pandas()
    
    # add these lines
    session.use_database(dbt.this.database)
    session.use_schema(dbt.this.schema)
    
    return pandas_df

I believe we could just fix this for everyone by updating these lines to include those session.use_* calls as well: https://github.com/dbt-labs/dbt-snowflake/blob/0f77a0886f3308aea0d8313ff2fa82e7378bb8ce/dbt/include/snowflake/macros/materializations/table.sql#L49-L51

Namely, by changing that to:

        if isinstance(df, pandas.core.frame.DataFrame):
          session.use_database(target_relation.database)
          session.use_schema(target_relation.schema)
          # session.write_pandas does not have overwrite function
          df = session.createDataFrame(df)

Hey all, we’re lining this up for our next sprint. So hopefully we should have something out in the next 2 weeks.

I’m getting the same error when running my project on dbt cloud. It works perfectly when building from the IDE but I get this error when I try running it in production.

The SPs from the build are left in my production schema and I tried calling them manually from the snowflake UI and that works, if I’ve first selected the correct database and schema, maybe that it also a hint about what’s going wrong?

Hey @satpai1215 and @eaphilli,

To cover the basics:

  • Custom schema values are concatenated to the base target schema, so @eaphilli your resulting schema will be ADMIN_ DEV_MODEL_python_model_test, not DEV_MODEL_python_model_test. I’m wondering if in your case the resulting schema doesn’t exist and you don’t have the privilege to create the custom schema, resulting in a weird behavior
  • I can’t reproduce the issue in the versions you are both using with the simplest setup detailed below. Could you please both try to see if that works for you too?

Starting from a new project (dbt init), configure a schema in your profiles.

Then add the following files in your project: In /models, my_python_model.py:

def model(dbt, session):
    dbt.config(
        materialized="table",
        schema="customSchemaSuffix"
    )

    df = dbt.ref("my_first_dbt_model")
    
    return df

In /models, my_python_model.yml:

version: 2

models:
  - name: my_python_model
    description: my_python_model
    config:
        materialized: table
        schema: customSchemaSuffix
        tags: ['python']

You can define the custom schema property in either the property file or the model file. Both should work.

Do make sure that all of the following schemas exist and you have access to them (or you have enough privilege to have dbt create them on the fly):

  • baseSchema
  • baseSchema_customSchemaSuffix

The (hopeful) fix for this was included in snowflake-connector-python==3.0.0 just a few weeks ago (after the dbt Core v1.4 release): https://community.snowflake.com/s/article/Snowflake-Connector-for-Python-Release-Notes

Since it’s a major version bump, there’s some additional due diligence warranted before including the upgrade in a v1.4.x patch release.

bumping this from another community report: https://getdbt.slack.com/archives/C03QUA7DWCW/p1673461860389249

@jtcohen6 looks like the underlying Snowflake issues is fixed, could we fix here as well?

Will close out once we confirm that Snowflake has fixed this on their end https://github.com/snowflakedb/snowflake-connector-python/issues/1034#issuecomment-1266583831