ByConity: DB::Exception: Cannot read all marks from file xxx/xxx/data, eof: 0, buffer size: 6866, file size: 176688: While executing MergeTreeThread SQLSTATE: 22000.

I was using the official ClickHouse Star Schema Benchmark to test ByConity. When I tried to create a wide flat table, lineorder_flat (about 40 columns), by INNER JOINing 4 tables (the largest has about 600 million rows), I got this exception:

Received exception from server (version 21.8.7): Code: 33. DB::Exception: Received from 127.0.0.1:56871. DB::Exception: Received from 127.0.0.1:53749. DB::Exception: Received from 127.0.0.1:41655. DB::Exception: Cannot read all marks from file c1133162-7809-4269-8ad5-41243831e384/1996_439329587933478912_439330151713472512_2_439330217603104768_0/data, eof: 0, buffer size: 6866, file size: 176688: While executing MergeTreeThread SQLSTATE: 22000.

It seems to be a memory issue. I have never seen this error when using ClickHouse, so I assume it was newly added in ByConity. Is this a problem with the HDFS-related configurations, or do I need to set some ByConity settings correctly?
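For context, the flat-table build follows the SSB recipe from the ClickHouse documentation; a condensed sketch of the statement that triggers the error (column lists abbreviated, engine clause per your setup):

```sql
-- Condensed sketch of the SSB flat-table build; table and key names
-- follow the ClickHouse SSB schema, column projections are abbreviated.
CREATE TABLE lineorder_flat
ENGINE = MergeTree ORDER BY (LO_ORDERDATE, LO_ORDERKEY)  -- adapt engine for ByConity
AS SELECT l.*, c.*, s.*, p.*
FROM lineorder AS l
INNER JOIN customer AS c ON c.C_CUSTKEY = l.LO_CUSTKEY
INNER JOIN supplier AS s ON s.S_SUPPKEY = l.LO_SUPPKEY
INNER JOIN part     AS p ON p.P_PARTKEY = l.LO_PARTKEY;
```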

Thank you very much.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 34 (5 by maintainers)


Most upvoted comments

@xiaomo728 Can you pull the latest version? This issue is fixed.

@canhld94 Hi, the logs of the Code: 49, e.displayText() = DB::Exception: Not a Valid Block SQLSTATE: HY000 (version 21.8.7.1) exception caused by CREATE STATS IF NOT EXISTS ALL are as follows:

(xx is just my file path)

2023.02.10 09:48:15.411582 [ 2524357 ] {1e8b9930-fb32-477f-9bd7-9103dd9f5c4b} <Error> CreateStats: Code: 49, e.displayText() = DB::Exception: Not a Valid Block SQLSTATE: HY000, Stack trace (when copying this message, always include the lines below):

0. Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x1874f632 in /xx/xx/xxx/ByConity/bin/clickhouse
1. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xae247e0 in /xx/xx/xxx/ByConity/bin/clickhouse
2. DB::Statistics::getOnlyRowFrom(DB::Statistics::SubqueryHelper&) @ 0x14e8ec20 in /xx/xx/xxx/ByConity/bin/clickhouse
3. DB::Statistics::StatisticsCollectorStepFull::collectFirstStep(std::__1::vector<DB::NameAndTypePair, std::__1::allocator<DB::NameAndTypePair> > const&) @ 0x14e8f7b6 in /xx/xx/xxx/ByConity/bin/clickhouse
4. DB::Statistics::StatisticsCollectorStepFull::collect(std::__1::vector<DB::NameAndTypePair, std::__1::allocator<DB::NameAndTypePair> > const&) @ 0x14e8f5fd in /xx/xx/xxx/ByConity/bin/clickhouse
5. DB::Statistics::StatisticsCollector::collect(std::__1::vector<DB::NameAndTypePair, std::__1::allocator<DB::NameAndTypePair> > const&) @ 0x14e8caf9 in /xx/xx/xxx/ByConity/bin/clickhouse
6. DB::collectStatsOnTarget(std::__1::shared_ptr<DB::Context const>, DB::Statistics::CollectorSettings const&, DB::CollectTarget const&) @ 0x13af0522 in /xx/xx/xxx/ByConity/bin/clickhouse
7. DB::(anonymous namespace)::CreateStatsBlockInputStream::readImpl() @ 0x13af3090 in /xx/xx/xxx/ByConity/bin/clickhouse
8. DB::IBlockInputStream::read() @ 0x1359c845 in /xx/xx/xxx/ByConity/bin/clickhouse
9. DB::AsynchronousBlockInputStream::calculate() @ 0x135984e4 in /xx/xx/xxx/ByConity/bin/clickhouse
10. void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<DB::AsynchronousBlockInputStream::next()::$_0, void ()> >(std::__1::__function::__policy_storage const*) @ 0x135987d0 in /xx/xx/xxx/ByConity/bin/clickhouse
11. ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) @ 0xae5f7cf in /xx/xx/xxx/ByConity/bin/clickhouse
12. ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()&&...)::'lambda'()::operator()() @ 0xae6156c in /xx/xx/xxx/ByConity/bin/clickhouse
13. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0xae5c2c5 in /xx/xx/xxx/ByConity/bin/clickhouse
14. void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()> >(void*) @ 0xae607da in /xx/xx/xxx/ByConity/bin/clickhouse
15. ? @ 0x8f4b in /usr/lib64/libpthread-2.28.so
16. __clone @ 0xf874f in /usr/lib64/libc-2.28.so
 (version 21.8.7.1)

I tried to execute this SQL many times, but every attempt ended with this exception and never finished.

Thank you all for so much valuable help! I will keep giving feedback about ByConity to this community.

foundationdb

Hi @xiaomo728 , I just noticed that you have to change the permission and owner of the new folder to foundationdb. After you do that, remove all the new logs and stop the foundationdb service with systemctl stop foundationdb.service, then copy the logs again and start the foundationdb service again with systemctl start foundationdb.service.
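The steps above as a shell sketch (all folder paths are illustrative placeholders; substitute your actual data/log locations):

```shell
# Stop the foundationdb service before touching its folders.
sudo systemctl stop foundationdb.service

# Hand ownership of the new folder to the foundationdb user/group.
sudo chown -R foundationdb:foundationdb /path/to/new/folder   # illustrative path

# Remove the freshly generated logs, then copy the old logs back in.
sudo rm -f /path/to/new/folder/log/*                          # illustrative path
sudo cp -a /path/to/old/log/. /path/to/new/folder/log/        # illustrative path

# Start the service again.
sudo systemctl start foundationdb.service
```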

Hi @dmthuc , thank you very much! It works!


@hustnn Yes, I think I have solved that. I just re-installed the fdb. Many thx.

Hi @xiaomo728 , thank you for trying ByConity. We can reproduce this issue and will fix it ASAP; we will let you know soon.

In addition, may I ask a few questions:

  1. How did you deploy ByConity? We are continuously merging new performance optimizations, so it’s recommended to use the latest version for performance testing. If you’re using a docker/k8s deployment, let me update the newest image version.

  2. Specifically for SSB testing, I have some recommendations: a. Before creating the flat table, build the statistics of the tables. This allows our CBO to work more efficiently, and join performance will be much better. To build statistics for all tables (for example, if your tables are in database ssb100), you can run: (1) use ssb100, then (2) create stats if not exists all. This creates statistics for all tables in the current database. Note that the second query will take a while, but you only need to run it once 😃 b. While creating the flat table, set enable_optimizer=1 to use the CBO. c. In addition to the flat queries, you can also test the Star Schema queries (JOIN queries) to compare with the original ClickHouse (hint: it’s an order of magnitude faster and uses much less memory).
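The recommended preparation steps above, sketched as SQL (the database name ssb100 is just the example from the thread; adapt it to yours):

```sql
-- 1. Build statistics for every table in the current database,
--    so the cost-based optimizer (CBO) can plan the joins well.
USE ssb100;
CREATE STATS IF NOT EXISTS ALL;  -- takes a while; only needs to run once

-- 2. Enable the CBO for the session before building the flat table.
SET enable_optimizer = 1;
```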