dbt-core: [CT-201] Reconcile configs + properties for sources
Let’s convert some source properties into configs that can be set inside config: blocks and within dbt_project.yml:
# models/src_whatever.yml
version: 2
sources:
  - name: my_source
    description: ... # not a config
    config:
      enabled: ...
      quoting: {dict}
      freshness: {dict}
      loader: ...
      loaded_at_field: ...
      database: ... # or 'project' in dbt-bigquery
      schema: ... # or 'dataset' in dbt-bigquery
      identifier: ... # this is like an alias for alias
      meta: {dict}
      tags: ...
    tables:
      - name: my_src_table
        config: # all the same stuff as above. these take precedence for specific tables
        description: ... # not a config
        tests: ... # not a config
        columns: ... # not a config
# dbt_project.yml
sources:
  project_name:
    subdirectory:
      +database: raw
      +loader: fivetran
    another_subdirectory:
      +enabled: false
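To illustrate the layering this implies (project-level values from dbt_project.yml, then the source's config: block, then a specific table's config: block), here is a minimal Python sketch. It is not dbt-core's implementation: the function name and the sample values are hypothetical, and it simply lets the more specific layer overwrite every key, whereas dbt's real merge rules for dict- or list-valued configs may differ.

```python
from typing import Any, Dict


def resolve_source_config(
    project: Dict[str, Any],
    source: Dict[str, Any],
    table: Dict[str, Any],
) -> Dict[str, Any]:
    """Layer configs from least to most specific; later layers win."""
    resolved: Dict[str, Any] = {}
    for layer in (project, source, table):
        resolved.update(layer)
    return resolved


# Values mirroring the dbt_project.yml example (+database, +loader).
project_cfg = {"database": "raw", "loader": "fivetran"}
# Hypothetical values set in the source's config: block.
source_cfg = {"loaded_at_field": "updated_at"}
# Hypothetical override set in one table's config: block.
table_cfg = {"database": "raw_special"}

resolved = resolve_source_config(project_cfg, source_cfg, table_cfg)
print(resolved)
```

The table-level database wins over the project-level one, while loader and loaded_at_field are inherited unchanged.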
For backwards compatibility, we should still support setting these as top-level properties:
sources:
  - name: my_source
    loaded_at_field: updated_at
But raise an error if the same config is set in both places (even if it’s with the same value):
sources:
  - name: my_source
    loaded_at_field: updated_at
    config:
      loaded_at_field: updated_at
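A rough sketch of that check, written in Python against a source entry that has already been parsed into a dict. Everything here is an assumption for illustration: the CONFIG_KEYS set is taken from the list proposed above, and DuplicateConfigError and collect_source_config are hypothetical names, not dbt-core APIs.

```python
from typing import Any, Dict

# Properties proposed above as settable under config: (assumed list).
CONFIG_KEYS = frozenset({
    "enabled", "quoting", "freshness", "loader", "loaded_at_field",
    "database", "schema", "identifier", "meta", "tags",
})


class DuplicateConfigError(ValueError):
    """Hypothetical error: a key is set at the top level and under config:."""


def collect_source_config(source: Dict[str, Any]) -> Dict[str, Any]:
    """Fold legacy top-level properties into the config dict.

    Raises if the same key appears in both places, even with equal values.
    """
    config = dict(source.get("config") or {})
    for key in sorted(CONFIG_KEYS):
        if key not in source:
            continue
        if key in config:
            raise DuplicateConfigError(
                f"'{key}' is set both as a top-level property and under "
                f"config: for source '{source.get('name')}'"
            )
        config[key] = source[key]
    return config


# Legacy style: still accepted.
legacy = {"name": "my_source", "loaded_at_field": "updated_at"}
print(collect_source_config(legacy))

# Same key in both places: rejected, even though the values match.
both = {
    "name": "my_source",
    "loaded_at_field": "updated_at",
    "config": {"loaded_at_field": "updated_at"},
}
try:
    collect_source_config(both)
except DuplicateConfigError as exc:
    print(f"error: {exc}")
```

Checking for presence rather than comparing values keeps the rule simple to document: a config lives in exactly one place per source.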
Notes
- We’re thinking that descriptions shouldn’t be configs. They’re rendered with a different context (docs), and they don’t make sense to set hierarchically. The same goes for tests and columns; this would be quite tricky to figure out, since tests actually generate new nodes, rather than adding properties to the existing node.
- As far as the manifest / graph.sources context, I’m open to suggestions. For backwards compatibility, we’d want to store things like loaded_at_field in both node.config and node-level keys. But I think there’s a valid argument for removing this as a top-level key and only storing it in node.config, so long as we communicate clearly that such a move is taking place.
- For better state:modified comparisons, we’d want to store the un-rendered version of these configs in node.unrendered_config, regardless of whether they’re set in dbt_project.yml or models/src_whatever.yml. Original issue for this is https://github.com/dbt-labs/dbt/issues/2744.
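To make the last note more concrete, here is a toy Python sketch of why comparing un-rendered values can be more stable than comparing rendered ones. It is only an illustration under stated assumptions: render is a stand-in for Jinja rendering rather than dbt's renderer, is_modified is a hypothetical helper, and the target_db placeholder is invented for the example.

```python
from typing import Dict


def render(value: str, env: Dict[str, str]) -> str:
    """Toy stand-in for Jinja: substitutes '{{ name }}' placeholders."""
    for name, replacement in env.items():
        value = value.replace("{{ " + name + " }}", replacement)
    return value


def is_modified(current: Dict[str, str], previous: Dict[str, str]) -> bool:
    """Hypothetical comparison: any difference counts as a modification."""
    return current != previous


# The config as written in the YAML file, before rendering.
unrendered = {"database": "{{ target_db }}"}

# The same file rendered in two environments.
dev = {k: render(v, {"target_db": "dev_raw"}) for k, v in unrendered.items()}
prod = {k: render(v, {"target_db": "prod_raw"}) for k, v in unrendered.items()}

# Rendered values differ only because the environments differ.
rendered_differs = is_modified(dev, prod)
# The un-rendered definition is identical, so nothing was actually edited.
unrendered_differs = is_modified(unrendered, unrendered)

print(rendered_differs, unrendered_differs)
```

Comparing the rendered values would flag the source as modified even though nobody changed the file, which is the kind of false positive the un-rendered copy is meant to avoid.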
About this issue
- State: closed
- Created 3 years ago
- Reactions: 4
- Comments: 16 (14 by maintainers)
Please make it possible to add a “config:” section to my sources.yml at the source table level to store information! Configs are reachable programmatically at runtime with {{ config.get('my_config_key1') }}, and I need to get watermark_field for my source.
@alexrosenfeld10 Totally fair question. v1.1 will include the work completed by https://github.com/dbt-labs/dbt-core/pull/5008: supporting config.enabled defined in each source’s .yml definition. That’s what we were able to get done with the time and appetite we had to work on this. Unfortunately, I don’t have an estimate for when we’ll be able to prioritize remaining steps, since we need to move on to other initiatives.
Context:
- [...] enabled config in dbt_project.yml
- [...] override, which allows users to redefine a source, with new properties, that take precedence over the same properties for the source with the same name defined in a package.
In summary, the goal of this issue:
- [...] dbt_project.yml.
- [...] .yml file definition
Main exit criteria:
- [...] .yml files, accept a new property, config:
- [...] config property accepts enabled: true|false, which has the effect of enabling or disabling the source
- [...] config property, i.e. as configs instead. These configs can be defined at the project, source, and source-table level, and they are inherited/overridden in that order of specificity.
- [...] manifest.json. This offers backwards compatibility for metadata use cases (such as dbt-docs) that depend on accessing those attributes at the top level. If users want fine-grained control over config resolution, they should define these as configs.
Totally fair question around how to handle source overrides. This might be out of scope for the initial effort, but it’s worth thinking through, since we want to eventually support property overrides for other resource types, too: #4157.
I think configurations should be resolved first, and then override second, but that’s not a strongly held view, as long as we can document consistent behavior. To make this concrete:
I would expect both configurations to be resolved, then the override applied, such that my_src_name ends up pointing to db_two.
This is great! Just commenting here to show my support of this feature.