ClickHouse: MaterializedPostgreSQL: possible segfault when creating a table with an incorrect structure
Describe what’s wrong
The ClickHouse service crashes after I try to create a table using the MaterializedPostgreSQL engine.
I'm running version 23.9.1.1761 on a single-node Ubuntu Server 22.04.
How to reproduce
Create a replication from a PostgreSQL source database to a ClickHouse database as the destination, using the MaterializedPostgreSQL engine with the following command:
CREATE TABLE slang_app.activities
(
id UInt64 CODEC(ZSTD),
created_at DateTime CODEC(ZSTD),
updated_at DateTime CODEC(ZSTD),
topic_id Int CODEC(ZSTD),
data String CODEC(ZSTD),
correct_answer String CODEC(ZSTD),
answer_map String CODEC(ZSTD),
primary_sense_ids String CODEC(ZSTD),
secondary_sense_ids String CODEC(ZSTD),
sentence_ids String CODEC(ZSTD),
sense_difficulty String CODEC(ZSTD),
sentence_difficulty String CODEC(ZSTD),
activity_hash String CODEC(ZSTD),
bucket Int CODEC(ZSTD),
key_term Bool CODEC(ZSTD),
cefr_level String CODEC(ZSTD),
sense_ids String CODEC(ZSTD),
active Bool CODEC(ZSTD),
activity_language String CODEC(ZSTD),
primary_grammar_pattern_ids UInt64 CODEC(ZSTD),
secondary_grammar_pattern_ids UInt64 CODEC(ZSTD),
interaction_info_id Int CODEC(ZSTD),
material_ids String CODEC(ZSTD)
)
ENGINE = MaterializedPostgreSQL('db_host:5432', 'db_name', 'activities', 'db_user', 'db_password')
PRIMARY KEY (id)
SETTINGS materialized_postgresql_tables_list = 'activities';
This is the source schema on the PostgreSQL side (psql \d+ output); a type-matched ClickHouse definition is sketched after the table:
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
id | bigint | | not null | nextval('activities_id_seq'::regclass) | plain | |
created_at | timestamp without time zone | | not null | | plain | |
updated_at | timestamp without time zone | | not null | | plain | |
topic_id | integer | | not null | | plain | |
data | jsonb | | not null | | extended | |
correct_answer | jsonb | | not null | | extended | |
answer_map | jsonb | | | | extended | |
primary_sense_ids | integer[] | | not null | | extended | |
secondary_sense_ids | integer[] | | | | extended | |
sentence_ids | integer[] | | | | extended | |
sense_difficulty | double precision | | | | plain | |
sentence_difficulty | double precision | | | | plain | |
activity_hash | character varying | | not null | | extended | |
bucket | integer | | not null | | plain | |
key_term | boolean | | not null | false | plain | |
cefr_level | cefr_level | | | | plain | |
sense_ids | integer[] | | not null | '{}'::integer[] | extended | |
active | boolean | | | true | plain | |
activity_language | character varying | | | | extended | |
primary_grammar_pattern_ids | bigint[] | | not null | '{}'::bigint[] | extended | |
secondary_grammar_pattern_ids | bigint[] | | not null | '{}'::bigint[] | extended | |
interaction_info_id | integer | | | | plain | |
material_ids | integer[] | | not null | '{}'::integer[] | extended | |
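For comparison, below is a sketch of a definition whose column types line up with the PostgreSQL schema above. This is a hypothetical mapping, assuming the commonly documented PostgreSQL-to-ClickHouse type correspondences (integer → Int32, bigint → Int64, double precision → Float64, integer[] → Array(Int32), nullable columns wrapped in Nullable(...)); jsonb and the custom cefr_level enum are kept as String, and the ZSTD codecs are omitted for brevity. It only illustrates where the reported definition diverges from the source types and is not confirmed to avoid the crash:
CREATE TABLE slang_app.activities
(
    id Int64,
    created_at DateTime,
    updated_at DateTime,
    topic_id Int32,
    data String,
    correct_answer String,
    answer_map Nullable(String),
    primary_sense_ids Array(Int32),
    secondary_sense_ids Array(Int32),
    sentence_ids Array(Int32),
    sense_difficulty Nullable(Float64),
    sentence_difficulty Nullable(Float64),
    activity_hash String,
    bucket Int32,
    key_term Bool,
    cefr_level Nullable(String),
    sense_ids Array(Int32),
    active Nullable(Bool),
    activity_language Nullable(String),
    primary_grammar_pattern_ids Array(Int64),
    secondary_grammar_pattern_ids Array(Int64),
    interaction_info_id Nullable(Int32),
    material_ids Array(Int32)
)
ENGINE = MaterializedPostgreSQL('db_host:5432', 'db_name', 'activities', 'db_user', 'db_password')
PRIMARY KEY (id)
SETTINGS materialized_postgresql_tables_list = 'activities';
Note that PostgreSQL array columns cannot be wrapped in Nullable(...) on the ClickHouse side, since ClickHouse does not support Nullable(Array); a plain Array type is used for them here.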
Expected behavior
The table should be created and replication should start. I have this same setup for several other tables from this same source database, and it works, but not for this one specifically.
Error message and/or stacktrace
As soon as I run the command, I instantly get:
Query id: 02b01da9-cb3e-4b76-8376-23d0322755a8
Exception on client:
Code: 32. DB::Exception: Attempt to read after eof: while receiving packet from localhost:9000. (ATTEMPT_TO_READ_AFTER_EOF)
There were no stack traces in the logs, only these entries:
clickhouse-server.log
2023.10.09 13:58:13.500286 [ 379896 ] {b84057e6-6d65-4117-8f97-dc3edc54f5fb} <Debug> StorageMaterializedPostgreSQL(slangapp.activities): Creating clickhouse table for postgresql table slang_app.activities (381720fd-a661-4253-a906-9fa03feb67b1)
2023.10.09 13:58:13.506057 [ 379896 ] {b84057e6-6d65-4117-8f97-dc3edc54f5fb} <Debug> slang_app.`381720fd-a661-4253-a906-9fa03feb67b1_nested` (e32a86e4-063f-46c5-b706-85d7b1689e8e): Loading data parts
2023.10.09 13:58:13.506289 [ 379896 ] {b84057e6-6d65-4117-8f97-dc3edc54f5fb} <Debug> slang_app.`381720fd-a661-4253-a906-9fa03feb67b1_nested` (e32a86e4-063f-46c5-b706-85d7b1689e8e): There are no data parts
2023.10.09 13:58:45.329586 [ 385637 ] {} <Trace> LoadingDeps: Table slang_app.`381720fd-a661-4253-a906-9fa03feb67b1_nested` has no dependencies (level 0)
2023.10.09 13:58:47.800505 [ 385894 ] {} <Debug> slang_app.`381720fd-a661-4253-a906-9fa03feb67b1_nested` (e32a86e4-063f-46c5-b706-85d7b1689e8e): Loading data parts
2023.10.09 13:58:47.800633 [ 385894 ] {} <Debug> slang_app.`381720fd-a661-4253-a906-9fa03feb67b1_nested` (e32a86e4-063f-46c5-b706-85d7b1689e8e): There are no data parts
clickhouse-server.err.log
2023.10.09 13:58:13.951739 [ 374044 ] {} <Fatal> Application: Child process was terminated by signal 11.
2023.10.09 13:58:44.511311 [ 385637 ] {} <Warning> Application: Integrity check of the executable skipped because the reference checksum could not be read.
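Signal 11 here is a segmentation fault. As a side note, one way to confirm such a crash from SQL is the system.crash_log table, assuming the crash log is enabled in the server configuration (it is present in the default config.xml):
-- Show the most recent recorded crash (signal number and the query that triggered it)
SELECT event_time, signal, thread_id, query_id
FROM system.crash_log
ORDER BY event_time DESC
LIMIT 1;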
About this issue
- State: closed
- Created 9 months ago
- Comments: 26 (12 by maintainers)
A segmentation fault is never expected to happen 😃 But yes, the schema change could be the reason. Maybe @kssenii has some ideas.
Reproduced, and I confirm that this segfault with an incorrect structure is fixed in https://github.com/ClickHouse/ClickHouse/pull/57567.
As for the segfault when inserting an incorrect data type, I have a strong impression that it is the same issue as https://github.com/ClickHouse/ClickHouse/issues/54297 and that it is fixed in https://github.com/ClickHouse/ClickHouse/pull/57567 (not merged yet). I will write a test to verify that in a moment.
It’s because of https://clickhouse.com/docs/en/operations/optimizing-performance/sampling-query-profiler
You can disable stopping on those signals:
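The snippet itself was not preserved in the archive; a minimal sketch of the idea, assuming the session is running under gdb and that the sampling query profiler delivers SIGUSR1/SIGUSR2 (the signals ClickHouse's query profiler uses):
# gdb commands: ignore the profiler's signals instead of pausing on them
handle SIGUSR1 nostop noprint pass
handle SIGUSR2 nostop noprint pass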
Sure, let me try it!