ClickHouse: Crash on MV Insert Into Locked Buffer Table With LowCardinality Key Column

ClickHouse 21.4.4

We are testing Buffer tables for materialized views to reduce the number of parts created during inserts. When we tried to read from one of those Buffer tables (through a Distributed table), ClickHouse crashed on multiple shards. This is the table definition:

(
    `datetime` DateTime,
    `svc_type` LowCardinality(String),
    `svc` LowCardinality(String),
    `cache_group` LowCardinality(String),
    `client_status` Enum8('NONE' = 0, 'SUCCESS' = 1, 'CLIENT_ERROR' = 2, 'SERVER_ERROR' = 3),
    `cache_result` Enum8('HIT' = 1, 'MISS' = 2, 'ERROR' = 3),
    `event_count` UInt64,
    `served_bytes` UInt64,
    `parent_bytes` UInt64,
    `ttms_avg` AggregateFunction(avg, UInt32),
    `ttms_quants` AggregateFunction(quantilesTiming(0.99, 0.95, 0.9), UInt32),
    `chi_count` AggregateFunction(uniq, FixedString(16)),
    `manifest_count` UInt64,
    `fragment_count` UInt64
)
ENGINE = Buffer('comcast_xcr', 'atsec_svc_1h_mt', 1, 30, 300, 10000, 1000000, 1000000, 100000000)

Note that this table has only one buffer “layer” (we don’t expect a huge number of inserts). It also relies on an implicit cast of String to LowCardinality(String).
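
For readers unfamiliar with the positional arguments in the ENGINE clause above, here is a small sketch that names them in the order the Buffer engine documentation gives (database, table, num_layers, then min/max time, rows, bytes) and models the documented flush rule: flush when all min thresholds are met or when any max threshold is met. The class and function names are hypothetical, and the strict comparison at the exact threshold values is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferParams:
    """Positional arguments of ENGINE = Buffer(...), in documented order."""
    database: str
    table: str
    num_layers: int
    min_time: int   # seconds
    max_time: int   # seconds
    min_rows: int
    max_rows: int
    min_bytes: int
    max_bytes: int


def should_flush(p: BufferParams, age_seconds: int, rows: int, size_bytes: int) -> bool:
    """Model of the flush rule: all min thresholds met, or any max threshold met.

    Assumption: thresholds are compared strictly; the behaviour at the exact
    boundary value is not taken from the issue.
    """
    all_min = (age_seconds > p.min_time
               and rows > p.min_rows
               and size_bytes > p.min_bytes)
    any_max = (age_seconds > p.max_time
               or rows > p.max_rows
               or size_bytes > p.max_bytes)
    return all_min or any_max


# The values from the table definition in this report.
params = BufferParams("comcast_xcr", "atsec_svc_1h_mt",
                      1, 30, 300, 10000, 1000000, 1000000, 100000000)

if __name__ == "__main__":
    # A small, young buffer stays in memory; an old one is flushed by max_time.
    print(should_flush(params, age_seconds=10, rows=5, size_bytes=100))
    print(should_flush(params, age_seconds=301, rows=5, size_bytes=100))
```

With these settings a lightly loaded buffer is flushed mainly by the 300 second max_time, which matches the background flush frames (backgroundFlush, flushBuffer) in the trace.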

The crash (on 5 or 6 servers):

2021.05.15 20:51:20.446960 [ 399962 ] {} <Fatal> BaseDaemon: (version 21.4.4.30 (official build), build id: E3FA92117218D182F17C14F864FF4ED3D3689BFE) (from thread 2448209) (no query) Received signal Segmentation fault (11)
2021.05.15 20:51:20.446976 [ 399962 ] {} <Fatal> BaseDaemon: Address: NULL pointer. Access: read. Unknown si_code.
2021.05.15 20:51:20.447003 [ 399962 ] {} <Fatal> BaseDaemon: Stack trace: 0xf1439f7 0xf142358 0xf140dce 0xfa59948 0xfd529a0 0xfd52120 0xf4ba024 0xf4c3fdb 0xf4c45fc 0xf4c4679 0xf4bbb6f 0xfd4e458 0xfd4bd03 0xfd4eff5 0xf33b820 0xf33d817 0xf33e5e2 0x8954fef 0x8958a83 0x7fb34a24314a 0x7fb349f74f23
2021.05.15 20:51:20.475409 [ 399962 ] {} <Fatal> BaseDaemon: 1. DB::ReverseIndex<unsigned long, DB::ColumnString>::insert(StringRef const&) @ 0xf1439f7 in /usr/bin/clickhouse
2021.05.15 20:51:20.475428 [ 399962 ] {} <Fatal> BaseDaemon: 2. COW<DB::IColumn>::mutable_ptr<DB::IColumn> DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeImpl<char8_t>(DB::IColumn const&, unsigned long, unsigned long, unsigned long, DB::ColumnVector<char8_t>::MutablePtr&&, DB::ReverseIndex<unsigned long, DB::ColumnString>*, unsigned long) @ 0xf142358 in /usr/bin/clickhouse
2021.05.15 20:51:20.475436 [ 399962 ] {} <Fatal> BaseDaemon: 3. DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeFrom(DB::IColumn const&, unsigned long, unsigned long) @ 0xf140dce in /usr/bin/clickhouse
2021.05.15 20:51:20.475447 [ 399962 ] {} <Fatal> BaseDaemon: 4. DB::ColumnLowCardinality::insertRangeFrom(DB::IColumn const&, unsigned long, unsigned long) @ 0xfa59948 in /usr/bin/clickhouse
2021.05.15 20:51:20.475456 [ 399962 ] {} <Fatal> BaseDaemon: 5. DB::BufferBlockOutputStream::insertIntoBuffer(DB::Block const&, DB::StorageBuffer::Buffer&) @ 0xfd529a0 in /usr/bin/clickhouse
2021.05.15 20:51:20.475461 [ 399962 ] {} <Fatal> BaseDaemon: 6. DB::BufferBlockOutputStream::write(DB::Block const&) @ 0xfd52120 in /usr/bin/clickhouse
2021.05.15 20:51:20.475470 [ 399962 ] {} <Fatal> BaseDaemon: 7. DB::PushingToViewsBlockOutputStream::write(DB::Block const&) @ 0xf4ba024 in /usr/bin/clickhouse
2021.05.15 20:51:20.475476 [ 399962 ] {} <Fatal> BaseDaemon: 8. DB::AddingDefaultBlockOutputStream::write(DB::Block const&) @ 0xf4c3fdb in /usr/bin/clickhouse
2021.05.15 20:51:20.475483 [ 399962 ] {} <Fatal> BaseDaemon: 9. DB::SquashingBlockOutputStream::finalize() @ 0xf4c45fc in /usr/bin/clickhouse
2021.05.15 20:51:20.475488 [ 399962 ] {} <Fatal> BaseDaemon: 10. DB::SquashingBlockOutputStream::writeSuffix() @ 0xf4c4679 in /usr/bin/clickhouse
2021.05.15 20:51:20.475493 [ 399962 ] {} <Fatal> BaseDaemon: 11. DB::PushingToViewsBlockOutputStream::writeSuffix() @ 0xf4bbb6f in /usr/bin/clickhouse
2021.05.15 20:51:20.475499 [ 399962 ] {} <Fatal> BaseDaemon: 12. DB::StorageBuffer::writeBlockToDestination(DB::Block const&, std::__1::shared_ptr<DB::IStorage>) @ 0xfd4e458 in /usr/bin/clickhouse
2021.05.15 20:51:20.475506 [ 399962 ] {} <Fatal> BaseDaemon: 13. DB::StorageBuffer::flushBuffer(DB::StorageBuffer::Buffer&, bool, bool, bool) @ 0xfd4bd03 in /usr/bin/clickhouse
2021.05.15 20:51:20.475511 [ 399962 ] {} <Fatal> BaseDaemon: 14. DB::StorageBuffer::backgroundFlush() @ 0xfd4eff5 in /usr/bin/clickhouse
2021.05.15 20:51:20.475519 [ 399962 ] {} <Fatal> BaseDaemon: 15. DB::BackgroundSchedulePoolTaskInfo::execute() @ 0xf33b820 in /usr/bin/clickhouse
2021.05.15 20:51:20.475524 [ 399962 ] {} <Fatal> BaseDaemon: 16. DB::BackgroundSchedulePool::threadFunction() @ 0xf33d817 in /usr/bin/clickhouse
2021.05.15 20:51:20.475529 [ 399962 ] {} <Fatal> BaseDaemon: 17. ? @ 0xf33e5e2 in /usr/bin/clickhouse
2021.05.15 20:51:20.475539 [ 399962 ] {} <Fatal> BaseDaemon: 18. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x8954fef in /usr/bin/clickhouse
2021.05.15 20:51:20.475544 [ 399962 ] {} <Fatal> BaseDaemon: 19. ? @ 0x8958a83 in /usr/bin/clickhouse
2021.05.15 20:51:20.475557 [ 399962 ] {} <Fatal> BaseDaemon: 20. start_thread @ 0x814a in /usr/lib64/libpthread-2.28.so
2021.05.15 20:51:20.475567 [ 399962 ] {} <Fatal> BaseDaemon: 21. clone @ 0xfcf23 in /usr/lib64/libc-2.28.so
2021.05.15 20:51:20.576475 [ 399962 ] {} <Fatal> BaseDaemon: Checksum of the binary: 21B45BF98BF6821B2FE099092F5117E8, integrity check passed.
2021.05.15 20:51:42.120885 [ 2446750 ] {} <Fatal> Application: Child process was terminated by signal 11.

Everything seems fine if we don’t try to read from the buffer table.
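
When comparing a trace like the one above across shards or versions, it can help to reduce the BaseDaemon lines to (frame number, symbol, address, binary) tuples. The following is a rough sketch that assumes the log line layout shown in this report; the regular expression and the function name are illustrative, not part of ClickHouse.

```python
import re

# Matches lines such as:
#   ... <Fatal> BaseDaemon: 5. DB::Foo::bar(int) @ 0xfd529a0 in /usr/bin/clickhouse
FRAME_RE = re.compile(r"BaseDaemon: (\d+)\. (.+?) @ (0x[0-9a-f]+) in (\S+)$")


def parse_frames(log_text: str):
    """Return a list of (frame_number, symbol, address, binary) tuples.

    Lines that are not numbered stack frames (signal info, checksum, and so on)
    are skipped.
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line.strip())
        if match:
            number, symbol, address, binary = match.groups()
            frames.append((int(number), symbol, address, binary))
    return frames


if __name__ == "__main__":
    sample = (
        "2021.05.15 20:51:20.475461 [ 399962 ] {} <Fatal> BaseDaemon: "
        "6. DB::BufferBlockOutputStream::write(DB::Block const&) "
        "@ 0xfd52120 in /usr/bin/clickhouse\n"
        "2021.05.15 20:51:20.475529 [ 399962 ] {} <Fatal> BaseDaemon: "
        "17. ? @ 0xf33e5e2 in /usr/bin/clickhouse\n"
    )
    for frame in parse_frames(sample):
        print(frame)
```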

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 24 (16 by maintainers)

Most upvoted comments

even easier:

clickhouse-benchmark -c 5 --database dw <<< 'SELECT count() FROM low_card_buffer_test GROUP BY test_text format Null'
clickhouse-benchmark -c 2 --database=dw <<< "INSERT INTO low_card_buffer_test values('TEST1')"

I have prepared a minimal example that reproduces this crash. The issue occurs when a Buffer engine table with a LowCardinality(String) column is being inserted into while it is also being read with a SELECT statement that uses a GROUP BY clause.

ClickHouse client version 21.9.2.17 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 21.9.2 revision 54449.

The buffer is as follows:

CREATE TABLE low_card_buffer_test (
    `test_text` LowCardinality(String)
)
ENGINE Buffer('', '', 16, 60, 360, 100, 1000, 10000, 100000);

Inserting into and selecting from the buffer is done with the following Python program:

inserts.py

import requests


def inserts():
    for i in range(0, 10000):
        if i % 2 == 0:
            text = "TEST1"
        else:
            text = "TEST2"
        result = requests.post('http://localhost:8123', data=f'''INSERT INTO low_card_buffer_test (test_text) VALUES (\'{text}\')''')
        if result.status_code != 200:
            print(result.text)
            break

selects.py

import requests
from time import sleep


def selects():
    for i in range(0, 100):
        result = requests.post('http://localhost:8123', data=f'''SELECT count() FROM low_card_buffer_test GROUP BY test_text''')
        sleep(0.1)
        print(result.text)

main.py

from inserts import inserts
from selects import selects
from threading import Thread

if __name__ == "__main__":
    thread1 = Thread(target=inserts, args=())
    thread2 = Thread(target=selects, args=())

    thread1.start()
    thread2.start()

    thread1.join()
    thread2.join()

When the program is run, it manages to make one insert and then crashes ClickHouse:

[mks@cave testing_clickhouse_buffers]$ python3 main.py 
1

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
    six.raise_from(e, None)
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
  File "<string>", line 3, in raise_from
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.9/http/client.py", line 1371, in getresponse
    httplib_response = conn.getresponse()
    response.begin()
  File "/usr/lib/python3.9/http/client.py", line 319, in begin
  File "/usr/lib/python3.9/http/client.py", line 1371, in getresponse
    version, status, reason = self._read_status()
  File "/usr/lib/python3.9/http/client.py", line 288, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 439, in send
    response.begin()
  File "/usr/lib/python3.9/http/client.py", line 319, in begin
    resp = conn.urlopen(
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 755, in urlopen
    version, status, reason = self._read_status()
  File "/usr/lib/python3.9/http/client.py", line 288, in _read_status
    retries = retries.increment(
  File "/usr/lib/python3.9/site-packages/urllib3/util/retry.py", line 532, in increment
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3.9/site-packages/urllib3/util/retry.py", line 532, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3.9/site-packages/urllib3/packages/six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
  File "/usr/lib/python3.9/site-packages/urllib3/packages/six.py", line 769, in reraise
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
    raise value.with_traceback(tb)
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = self._make_request(
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
    httplib_response = conn.getresponse()
  File "<string>", line 3, in raise_from
  File "/usr/lib/python3.9/http/client.py", line 1371, in getresponse
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
    response.begin()
  File "/usr/lib/python3.9/http/client.py", line 319, in begin
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.9/http/client.py", line 1371, in getresponse
    version, status, reason = self._read_status()
    response.begin()
  File "/usr/lib/python3.9/http/client.py", line 288, in _read_status
  File "/usr/lib/python3.9/http/client.py", line 319, in begin
    raise RemoteDisconnected("Remote end closed connection without"
    version, status, reason = self._read_status()
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

  File "/usr/lib/python3.9/http/client.py", line 288, in _read_status
Traceback (most recent call last):
  File "/usr/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.9/threading.py", line 910, in run
    self.run()
  File "/usr/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/warehouse/code/python experiments/testing_clickhouse_buffers/selects.py", line 7, in selects
    self._target(*self._args, **self._kwargs)
  File "/warehouse/code/python experiments/testing_clickhouse_buffers/inserts.py", line 10, in inserts
    result = requests.post('http://localhost:8123', data=f'''SELECT count() FROM low_card_buffer_test GROUP BY test_text''')
  File "/usr/lib/python3.9/site-packages/requests/api.py", line 117, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
    result = requests.post('http://localhost:8123', data=f'''INSERT INTO low_card_buffer_test (test_text) VALUES (\'{text}\')''')
  File "/usr/lib/python3.9/site-packages/requests/api.py", line 117, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/api.py", line 61, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 498, in send
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
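
The output above is hard to read because both threads fail at the same moment and their tracebacks interleave. A possible way to get a cleaner signal from a reproduction script is to catch the connection failure and report it as a single line. The sketch below is only an illustration: it uses the standard library's urllib instead of requests (a substitution, so it has no third-party dependency), and `run_query` is a hypothetical helper name. The URL is the same HTTP endpoint used by the scripts in this comment.

```python
import urllib.request

CLICKHOUSE_URL = "http://localhost:8123"  # same endpoint as inserts.py / selects.py


def run_query(query: str, url: str = CLICKHOUSE_URL, timeout: float = 5.0):
    """POST a query to the ClickHouse HTTP interface.

    Returns (True, response_text) on success, or (False, description) when the
    request fails for any reason: connection refused, connection dropped
    mid-request (as happens when the server crashes), timeout, or an HTTP
    error status. All of these surface as OSError subclasses.
    """
    request = urllib.request.Request(url, data=query.encode("utf-8"), method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return True, response.read().decode("utf-8")
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    ok, detail = run_query(
        "SELECT count() FROM low_card_buffer_test GROUP BY test_text"
    )
    print("ok" if ok else "request failed", detail)
```

Each thread can then stop its loop on the first failed request and print one line, which makes it easier to see which query was in flight when the server went away.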

Crash log:

SELECT *
FROM system.crash_log
ORDER BY event_time DESC
LIMIT 1
FORMAT Vertical

Query id: 1b78bba7-aa13-4cc0-8301-af6bc3b4eb26

Row 1:
──────
event_date:   2021-10-04
event_time:   2021-10-04 22:27:31
timestamp_ns: 1633386451101892646
signal:       11
thread_id:    152032
query_id:     01011361-f4cc-48ba-ba20-7cde7d641c34
trace:        [273707351,273701720,273696110,284895880,288457377,288455060,293478753,279873127,279895503,279948971,279950636,279950793,293726621,293712241,293704143,293703577,284059330,292832940,292847154,293322160,339157167,339163962,340421049,340405322,140370000085593,140369999132131]
trace_full:   ['1. DB::ReverseIndex<unsigned long, DB::ColumnString>::insert(StringRef const&) @ 0x10507157 in /usr/bin/clickhouse','2. COW<DB::IColumn>::mutable_ptr<DB::IColumn> DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeImpl<char8_t>(DB::IColumn const&, unsigned long, unsigned long, unsigned long, DB::ColumnVector<char8_t>::MutablePtr&&, DB::ReverseIndex<unsigned long, DB::ColumnString>*, unsigned long) @ 0x10505b58 in /usr/bin/clickhouse','3. DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeFrom(DB::IColumn const&, unsigned long, unsigned long) @ 0x1050456e in /usr/bin/clickhouse','4. DB::ColumnLowCardinality::insertRangeFrom(DB::IColumn const&, unsigned long, unsigned long) @ 0x10fb2a88 in /usr/bin/clickhouse','5. DB::BufferSink::insertIntoBuffer(DB::Block const&, DB::StorageBuffer::Buffer&) @ 0x113182a1 in /usr/bin/clickhouse','6. DB::BufferSink::consume(DB::Chunk) @ 0x11317994 in /usr/bin/clickhouse','7. DB::ISink::work() @ 0x117e2161 in /usr/bin/clickhouse','8. DB::PushingToSinkBlockOutputStream::write(DB::Block const&) @ 0x10ae8667 in /usr/bin/clickhouse','9. DB::PushingToViewsBlockOutputStream::write(DB::Block const&) @ 0x10aeddcf in /usr/bin/clickhouse','10. DB::AddingDefaultBlockOutputStream::write(DB::Block const&) @ 0x10afaeab in /usr/bin/clickhouse','11. DB::SquashingBlockOutputStream::finalize() @ 0x10afb52c in /usr/bin/clickhouse','12. DB::SquashingBlockOutputStream::writeSuffix() @ 0x10afb5c9 in /usr/bin/clickhouse','13. ? @ 0x1181e99d in /usr/bin/clickhouse','14. DB::PipelineExecutor::executeStepImpl(unsigned long, unsigned long, std::__1::atomic<bool>*) @ 0x1181b171 in /usr/bin/clickhouse','15. DB::PipelineExecutor::executeImpl(unsigned long) @ 0x118191cf in /usr/bin/clickhouse','16. DB::PipelineExecutor::execute(unsigned long) @ 0x11818f99 in /usr/bin/clickhouse','17. 
DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, std::__1::shared_ptr<DB::Context>, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>, std::__1::optional<DB::FormatSettings> const&, std::__1::function<void ()>) @ 0x10ee66c2 in /usr/bin/clickhouse','18. DB::HTTPHandler::processQuery(DB::HTTPServerRequest&, DB::HTMLForm&, DB::HTTPServerResponse&, DB::HTTPHandler::Output&, std::__1::optional<DB::CurrentThread::QueryScope>&) @ 0x117446ac in /usr/bin/clickhouse','19. DB::HTTPHandler::handleRequest(DB::HTTPServerRequest&, DB::HTTPServerResponse&) @ 0x11747e32 in /usr/bin/clickhouse','20. DB::HTTPServerConnection::run() @ 0x117bbdb0 in /usr/bin/clickhouse','21. Poco::Net::TCPServerConnection::start() @ 0x143720af in /usr/bin/clickhouse','22. Poco::Net::TCPServerDispatcher::run() @ 0x14373b3a in /usr/bin/clickhouse','23. Poco::PooledThread::run() @ 0x144a69b9 in /usr/bin/clickhouse','24. Poco::ThreadImpl::runnableEntry(void*) @ 0x144a2c4a in /usr/bin/clickhouse','25. start_thread @ 0x9259 in /usr/lib/libpthread-2.33.so','26. clone @ 0xfe5e3 in /usr/lib/libc-2.33.so']
version:      ClickHouse 21.9.2.17
revision:     54454
build_id:     E4F05ABB2100332308613C22030A03F5A4621821

1 rows in set. Elapsed: 0.002 sec.