clickhouse-sink-connector: [sink-connector-lightweight] Initial snapshot slows down and then OOM

Good day.

When creating the initial snapshot, everything starts fast but then slows down considerably. It manages to export a couple of million rows without a problem, and then this happens:

    74164 2023-10-05 17:52:41.416 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 347670 of 2867292 records for table 'schema.table' after 00:00:10.009
    84165 2023-10-05 17:52:51.417 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 628341 of 2867292 records for table 'schema.table' after 00:00:20.01
    96286 2023-10-05 17:53:03.538 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 843087 of 2867292 records for table 'schema.table' after 00:00:32.131
    96297 2023-10-05 17:53:03.549 [pool-271-thread-1] INFO com.altinity.clickhouse.sink.connector.db.DbWriter  - *** QUERY***insert into table(columns)
    106292 2023-10-05 17:53:13.544 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1064731 of 2867292 records for table 'schema.table' after 00:00:42.137
    116617 2023-10-05 17:53:23.869 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1109216 of 2867292 records for table 'schema.table' after 00:00:52.462
    127343 2023-10-05 17:53:34.595 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1156627 of 2867292 records for table 'schema.table' after 00:01:03.188
    138743 2023-10-05 17:53:45.995 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1162867 of 2867292 records for table 'schema.table' after 00:01:14.588
    150097 2023-10-05 17:53:57.349 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1167905 of 2867292 records for table 'schema.table' after 00:01:25.942
    161378 2023-10-05 17:54:08.630 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1169915 of 2867292 records for table 'schema.table' after 00:01:37.223
    164166 2023-10-05 17:54:11.418 [main] INFO io.debezium.connector.common.BaseSourceTask  - 749911 records sent during previous 00:01:27.143, last recorded offset of {server=embeddedconnector} partition is {ts_sec=1696528351, file=mysql-bin-changelog.000517, pos=157, snapshot=true}
    181053 2023-10-05 17:54:28.305 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1171289 of 2867292 records for table 'schema.table' after 00:01:56.898
    192186 2023-10-05 17:54:39.438 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1171670 of 2867292 records for table 'schema.table' after 00:02:08.031
    222901 2023-10-05 17:55:10.153 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1172039 of 2867292 records for table 'schema.table' after 00:02:38.746
    234068 2023-10-05 17:55:21.320 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1172245 of 2867292 records for table 'schema.table' after 00:02:49.913
    245211 2023-10-05 17:55:32.463 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1172488 of 2867292 records for table 'schema.table' after 00:03:01.056
    256355 2023-10-05 17:55:43.607 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1172644 of 2867292 records for table 'schema.table' after 00:03:12.2
    267478 2023-10-05 17:55:54.730 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1172871 of 2867292 records for table 'schema.table' after 00:03:23.323
    279701 2023-10-05 17:56:06.953 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1173115 of 2867292 records for table 'schema.table' after 00:03:35.546
    293680 2023-10-05 17:56:20.932 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1173426 of 2867292 records for table 'schema.table' after 00:03:49.525
    304822 2023-10-05 17:56:32.074 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1173970 of 2867292 records for table 'schema.table' after 00:04:00.667
    315965 2023-10-05 17:56:43.217 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1174096 of 2867292 records for table 'schema.table' after 00:04:11.81
    327084 2023-10-05 17:56:54.336 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1174406 of 2867292 records for table 'schema.table' after 00:04:22.929
    338216 2023-10-05 17:57:05.468 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1174680 of 2867292 records for table 'schema.table' after 00:04:34.061
    349328 2023-10-05 17:57:16.580 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1174997 of 2867292 records for table 'schema.table' after 00:04:45.173
    366045 2023-10-05 17:57:33.297 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1175345 of 2867292 records for table 'schema.table' after 00:05:01.89
    380959 2023-10-05 17:57:48.211 [debezium-mysqlconnector-embeddedconnector-change-event-source-coordinator] INFO io.debezium.relational.RelationalSnapshotChangeEventSource  -  Exported 1175675 of 2867292 records for table 'schema.table' after 00:05:16.804

    Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HTTP-Dispatcher"

After the snapshot is restarted, the connector starts to tank even earlier, after ~200K rows.

Both the MySQL and ClickHouse instances are under negligible load, but the connector instance runs at 100% CPU and constantly consumes all of its RAM, even when pushing only tens of rows per second.

What I tried:

  • I’ve tried -Xmx6144M and -Xmx10240M on a 16 GB instance without any noticeable effect (both settings slow down and OOM).
  • Tried tweaking parameters like snapshot.max.threads, offset.flush.interval.ms, offset.flush.timeout.ms, max.request.size, max.queue.size, and max.batch.size with no effect (I’m not even sure all of them apply to this jar file); a sketch of these overrides appears just after this list.
  • Tried changing the offset storage to org.apache.kafka.connect.storage.FileOffsetBackingStore, with no effect (except that it removed the "transactions not supported" log message). I’ve also seen Timed out waiting to flush EmbeddedEngine{id="engine"} offsets to storage even with file storage, which I don’t think should happen normally.
  • Tried switching to 0.6: same result.
  • Java 11 and Java 17 behave the same.
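
For reference, those overrides went into config.yml. A minimal sketch of that shape (the property names are the Debezium/Kafka Connect ones listed above; the values here are illustrative assumptions, not the exact ones I used):

    # Illustrative values only - not the exact ones tried above
    snapshot.max.threads: "1"
    offset.flush.interval.ms: "5000"
    offset.flush.timeout.ms: "60000"
    max.request.size: "10485760"
    max.queue.size: "8192"
    max.batch.size: "2048"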

Thanks in advance.

Most upvoted comments

I was facing the same issue, and tuning snapshot.fetch.size in config.yml allowed me to throttle the snapshot creation, which solved it.
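
A minimal sketch of what that tuning might look like in config.yml, assuming the same flat property style as the rest of the connector config (the value 10000 is an illustrative assumption, not a recommendation):

    # How many rows Debezium reads per fetch while snapshotting;
    # tune this to throttle the snapshot and keep memory steady.
    snapshot.fetch.size: "10000"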

@AdamKatzDev, I have faced this issue too. The reason is that the put() method in ClickHouseSinkTask receives records faster than ClickhouseBatchRunnable processes them. I solved it by adding code to control buffer filling: if the record limit in the buffer (the buffer is the records field in the ClickHouseSinkTask class) is reached, put() polls until there is room for a new batch.

    @Override
    public void put(Collection<SinkRecord> records) {
        long taskId = this.config.getLong(ClickHouseSinkConnectorConfigVariables.TASK_ID.toString());
        long receivedRecordsCount = records.size();
        totalRecords += receivedRecordsCount;
        log.debug("******** CLICKHOUSE received records **** " + totalRecords + " Task Id: " + taskId);
        // Back-pressure: block until the buffer has room for this batch.
        // bufferMaxRecords is a new field holding the configured buffer limit.
        while (this.getBufferCount() + receivedRecordsCount > bufferMaxRecords) {
            try {
                Thread.sleep(100);
            }
            catch (InterruptedException e) {
                log.error("******** SLEEP ERROR {} ****", e.getMessage());
                Thread.currentThread().interrupt();
            }
        }
        // ... the original put() logic that appends the batch to this.records follows here ...
    }

    // Sums the sizes of all per-table queues to get the current buffer fill level.
    public long getBufferCount() {
        long recordsInBuffer = 0;
        for (Map.Entry<String, ConcurrentLinkedQueue<ClickHouseStruct>> entry : this.records.entrySet()) {
            recordsInBuffer += entry.getValue().size();
        }
        return recordsInBuffer;
    }
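
A note on the design: since put() here is driven by the ingestion path, blocking inside it effectively applies back-pressure upstream to the snapshot reader, which is what keeps the buffer (and the heap) bounded. A bounded queue or a semaphore would be a more idiomatic way to express the same idea, but the sleep-poll above is enough to stop the OOM.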

Thank you for the swift responses, folks.

@subkanthi

My guess is that the Xmx settings did not get applied. You can do a ps -ef in the Docker container and check if it's applied to the Java process.

It 100% applies; I’ve verified it using htop: after applying -Xmx10240M, RAM consumption increased to, yeah, about 10 GB. The command is the following:

    java -Xmx10240M -jar clickhouse-debezium-embedded-1.0-SNAPSHOT.jar config.yaml

I’ve also tried spinning up the Docker image. It looks like it does not respect the JAVA_OPTS env variable; I had to override the entrypoint with:

    java -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005 $JAVA_OPTS -jar /app.jar /config.yml com.altinity.clickhouse.debezium.embedded.ClickHouseDebeziumEmbeddedApplication

The result is the same: the import slows down to low values, then OOMs.

@aadant

This project has all the tooling to do it

Yeah, I saw mentions of the ClickHouse Loader, but the documentation link on the main page leads to a 404 (https://github.com/Altinity/clickhouse-sink-connector/blob/develop/python/README.md). Could you please point me to the tool?

In order to initialize large databases

The logical dump I am trying to create the snapshot from weighs 29 GB uncompressed. The largest table is 14 GB, but the snapshot process stalls after processing ~5 GB (two tables of 1.7 GB and 3 GB respectively). Is this really the limit?

Hi @AdamKatzDev, this is the updated path: https://github.com/Altinity/clickhouse-sink-connector/tree/develop/sink-connector/python

In order to initialize large databases, you may need to take a snapshot from MySQL and restore it to ClickHouse. This project has all the tooling to do it.
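
For anyone landing here, the general shape of that approach with generic tools (not the project's loader; hosts, credentials, and table names below are placeholders, and the target table must already exist in ClickHouse):

    # Dump one table from MySQL as tab-separated values (placeholder connection details).
    # --batch escapes tabs/newlines; NULLs and exotic types may need extra handling.
    mysql -h mysql-host -u user -p -N --batch \
      -e "SELECT * FROM schema.table" > table.tsv

    # Bulk-load the dump into ClickHouse.
    clickhouse-client --host ch-host \
      --query "INSERT INTO schema.table FORMAT TabSeparated" < table.tsv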