dbt-snowflake: [CT-333] Macro for warehouse selection
Describe the feature
It would be nice to have a macro similar to generate_schema_name for the selection of the snowflake warehouse to use. For example for change the warehouse depending on the environment.
Describe alternatives you’ve considered
I tried to do a pre hook with the use warehouse query but don’t gives the flexibility needed, for example if you use a wildcard in the name all the queries that don’t use the pre hook will fail.
Additional context
Please include any other relevant context here.
Who will this benefit?
Every one that wants more control and flexibility on the warehouse selected.
Are you interested in contributing this feature?
Yes, I have some idea about how to do it. I’ve performed some local tests and it’s working. With some help as it’s my first contribution I will be able to raise a PR.
About this issue
- Original URL
- State: open
- Created 2 years ago
- Reactions: 3
- Comments: 18 (9 by maintainers)
+1 for this.
For anyone looking for a workaround, we ended up using a custom macro that gets called in the model config.
The macro was similar to this:
So this enabled models to “opt in” to the warehouse size they need by calling the
get_warehousemacro, while models that didn’t use the macro continued to run on the default warehouse fromprofiles.yml.Hi @manugarri, The topic is thornier than it looks. We will have a deeper look shortly to find an alternative approach, but I’m not sure yet when we’ll get to solving it out of the box. We may have a good workaround available soon though.
@Fleid as requested, im sharing my last comment in https://github.com/dbt-labs/dbt-snowflake/issues/533 to keep track of the bug:
my original intentions where to setup a macro that select different warehouses depending on whether an analyst is testing dbt locally versus running automated runs on airflow.
We use environment variables on our profiles.yml
My original implementation for the macro was something like this:
this was failing with the error i mentioned.
In the end some very helpful person at the dbt slack helped me and suggested replacing
env_var('SNOWFLAKE_USER')fortarget.user, which solves my current needs.But still, it sounds like allowing environment variables in this macro like any other macro would be useful for other use cases.
Another use case that just arose at my company, we want to have different warehouses per environment, but we want a single profiles.yml (since the prod dbt runs on docker ECS but local runs for multiple analysts) if we have an environment variable to define the environment dbt is running, we can include that environment variable in the mapping function i shared above
Re-opening as we’ve reverted the PR solving it, the implementation was causing side effects.
@javiCalvo closing due to being covered by pr #503
It would be awesome to have such macro, and be able to choose warehouse depending on:
I saw that https://github.com/dbt-labs/dbt-snowflake/pull/115 is still open, and then came across this discussion to read more.
Everything said above is super valid! After some more reflection, and time spent looking at https://github.com/dbt-labs/dbt-snowflake/pull/115, I’m going to reopen this issue for two reasons:
use warehouseinside the codebase is pretty janky: a Python f-string. We should wrap SQL in Jinja wherever possible, since it’s easier for us to keep consistent and more feature-ful for end users.use warehouse@javiCalvo If this is something you’d still be interested in contributing, I’d be open to it. I’ll leave a comment or two on the PR with recommendations for the implementation.
Hi @VersusFacit ! Thanks a lot for the answer.
Yes, sorry for the ambiguity I was talking about snowflake’s virtual warehouse.
We are using already the
+snowflake_warehousetag all over our dbt_project and we can insert there the jinja code as you suggest (in fact we are already doing it).The improve is precisely to avoid having so much jinja code in the dbt_project, specially repeated code. We have a project which involves 12 snowflake databases and we are using around 60 different virtual warehouses splitted in 3 different environments. Righ now out dbt_project have the
+snowflake_warehousetag 9 times and growing so a macro as I suggest would be great to have a more controlled project.Example of what we have now in the dbt_project:
+snowflake_warehouse: "WH_{{ {'stg': 'D', 'tst': 'Q', 'prd': 'P'}[target.name] }}_IMPORT_XL"And this structure is working but makes our dbt_project somehow not pretty 😃