dbt-clickhouse: Problem with the distributed_table materialization
Hello everyone, I am not 100% sure if this is a bug or because I did not correctly use the connector.
- No same issue with
tableandviewmaterializations.
Describe the bug
Here my table definition event.sql: it takes a parquet file from s3, then creates the table with the configurations.
{{ config(
cluster="default",
schema="db_test_dbt",
materialized="distributed_table",
sharding_key="rand()",
unique_key="(event_id)",
full_refresh=true,
engine="ReplicatedReplacingMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}') ",
settings={"allow_nullable_key": 1},
order_by="(event_id)"
)
}}
SELECT
*
FROM
s3('path/to/parquet/file/on/s3/event/*', 'Parquet')
😊The first dbt run works well, the table is correctly created on my clickhouse cluster.
ðŸ˜When I do dbt run again, it does not work any more:
Database Error in model event (models/event.sql)
Code: 62.
DB::Exception: Empty query. Stack trace:
The file target/run/dbt_project_name/models/event.sql is empty;
The file target/compiled/dbt_project_name/models/event.sql contains the select * from s3 query;
Steps to reproduce
dbt run --select models/event.sql: Completed successfully.😊dbt run --select models/event.sqlagain: ERROR!ðŸ˜dbt run --select models/event.sqlagain: SAME ERROR!ðŸ˜dbt run --select models/event.sqlagain: SAME ERROR!ðŸ˜- drop the tables manually, and retry
dbt run --select models/event.sql: Completed successfully. 😊
Configuration
Environment
- dbt version:
1.4.6. - dbt-clickhouse version:
1.4.7and same problem for previous versions. - clickhouse-driver version (if using native)
- clickhouse-connect version (if using http):
1.4.7
ClickHouse server
- ClickHouse Server version:
23.3.1 - multiple shards * 2 replicates
Is it a bug?
Or something under development?
Or something wrongly set in my config clause?
Thanks in advance.
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 15 (15 by maintainers)
PR merged into 1.4.8 release.
@zli06160 Yes, I’ve noticed this also when trying pytest with
clustersetting. thus I changed this block toIt seems like the assumption for the purpose of using a ClickHouse cluster within this repo is to replicate data (Replicated table engines).
@zli06160 Can you possibly check to see if changing the macro as in the PR https://github.com/ClickHouse/dbt-clickhouse/pull/180 fixes your situation, since we don’t have tests? Some additional validation would be nice before we merge and release the fix.
raised a merge request for this specific case. In the meanwhile creating test cases for distributed materializations.
I will try to reproduce this with the NYC taxi data example. Looks like sth is broken before the actual SQL run.
can you please provide dbt.log? The error was returned by ClickHouse, guess we need to know what exactly the query looks like.
This is a new, experimental feature so I suspect it’s a bug. Any thoughts @gladkikhtutu?