ByConity: Why can't INSERT VALUES queries be automatically forwarded from the server to workers?

Currently, in InterpreterInsertQuery.cpp, the logic for forwarding an INSERT query to workers is as follows:

......
    StorageCnchMergeTree * cnch_merge_tree = dynamic_cast<StorageCnchMergeTree *>(table.get());
    /// Directly forward the query to cnch worker if select or infile
    if (getContext()->getServerType() == ServerType::cnch_server && (query.select || query.in_file) && cnch_merge_tree)
    {
        /// Handle the insert commit for insert select/infile case in cnch server.
        BlockInputStreamPtr in = cnch_merge_tree->writeInWorker(query_ptr, metadata_snapshot, getContext());

        bool enable_staging_area_for_write = settings.enable_staging_area_for_write;
        if (const auto * cnch_table = dynamic_cast<const StorageCnchMergeTree *>(table.get());
            cnch_table && metadata_snapshot->hasUniqueKey() && !enable_staging_area_for_write)
        {
            /// for unique table, insert select|infile is committed from worker side
            res.in = std::move(in);
        }
        else
        {
            auto txn = getContext()->getCurrentTransaction();
            txn->setMainTableUUID(table->getStorageUUID());
            res.in = std::make_shared<TransactionWrapperBlockInputStream>(in, std::move(txn));
        }
        return res;
    }

    BlockOutputStreams out_streams;
    ......
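
To make the branch above easier to discuss, here is a minimal, self-contained sketch of the decision it encodes. This is not ByConity code: InsertInfo, InsertPath and chooseInsertPath are hypothetical names, the ServerType enum is reduced to two illustrative values, and the boolean fields only stand in for the expressions in the excerpt.

```cpp
// Hypothetical model of the forwarding branch in the excerpt; not ByConity code.
enum class ServerType { cnch_server, other };  // "other" is a placeholder value

enum class InsertPath
{
    ForwardWorkerCommits,  // forwarded; the commit happens on the worker side
    ForwardServerCommits,  // forwarded; the server wraps the stream in its transaction
    NotForwarded           // falls through to the code after the branch
};

struct InsertInfo
{
    ServerType server_type;  // stands in for getContext()->getServerType()
    bool has_select;         // stands in for query.select
    bool has_infile;         // stands in for query.in_file
    bool is_cnch_table;      // stands in for the dynamic_cast succeeding
    bool has_unique_key;     // stands in for metadata_snapshot->hasUniqueKey()
    bool staging_area;       // stands in for settings.enable_staging_area_for_write
};

constexpr InsertPath chooseInsertPath(const InsertInfo & q)
{
    // Same condition as the outer if: server + (select or infile) + Cnch table.
    const bool forward = q.server_type == ServerType::cnch_server
        && (q.has_select || q.has_infile) && q.is_cnch_table;
    if (!forward)
        return InsertPath::NotForwarded;

    // Same condition as the inner if: unique-key table without the staging area.
    if (q.has_unique_key && !q.staging_area)
        return InsertPath::ForwardWorkerCommits;

    return InsertPath::ForwardServerCommits;
}
```

In this model, a plain INSERT VALUES on the server has neither has_select nor has_infile set, so chooseInsertPath returns NotForwarded. That is the behaviour the questions below ask about.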

My questions:

  1. Why are only INSERT SELECT and INSERT INFILE queries forwarded to workers here?
  2. Are INSERT VALUES queries left out because forwarding them is difficult to implement, or because it would cause performance issues?

Thank you very much.

About this issue

  • State: closed
  • Created 9 months ago
  • Comments: 24 (1 by maintainers)

Most upvoted comments

Hi @xiaomo728, may I know what kind of INSERT query you are executing? Also, the setting prefer_cnch_catalog is only supposed to be set in the worker CLI; do not set it in the config file or on the server. If you give me an example query, I will try to reproduce it. If you want to debug it yourself: I think that at the moment the query is stuck, it is actually executing in a worker, so you can check which worker the query was forwarded to and take a look at that worker's log.

One more thing: what does it mean that prefer_cnch_catalog should be set only in the worker CLI? If I connect over JDBC, do I need to write INSERT... VALUES SETTINGS prefer_cnch_catalog = 1 every time? Isn't that a bit of a hassle?

Could I enable this setting by default in the worker's config and disable it in the server's config?

I think enabling this setting in the worker config might break some use cases, such as Kafka insertion. For now you can try it, and if it doesn't break any of your use cases, that is OK. In the future we will try to remove this setting.

Hi @xiaomo728, in that case you can connect to a worker directly and enable this setting. Please see this example:

    SET prefer_cnch_catalog=1;
    INSERT INTO test.test_direct_insert VALUES (1, 1);

@dmthuc OK, thanks for your guidance. I will try it first.