ClickHouse: Remote Table Function Connection Error
Describe the bug
The remote function unexpectedly throws a connection error when attempting to connect to a “connectable” server.
How to reproduce
- Clickhouse server version 19.3.6 revision 54415
- ACL settings:
<!-- Users and ACL. -->
<users>
<!-- If user name was not specified, 'default' user is used. -->
<default>
<!-- Password could be specified in plaintext or in SHA256 (in hex format).
If you want to specify password in plaintext (not recommended), place it in 'password' element.
Example: <password>qwerty</password>.
Password could be empty.
If you want to specify SHA256, place it in 'password_sha256_hex' element.
Example: <password_sha256_hex>65e84be33532fb784c48129675f9eff3a682b27168c0ea744b2cf58ee02337c5</password_sha256_hex>
How to generate decent password:
Execute: PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha256sum | tr -d '-'
In first line will be password and in second - corresponding SHA256.
-->
<password></password>
<!-- List of networks with open access.
To open access from everywhere, specify:
<ip>::/0</ip>
To open access only from localhost, specify:
<ip>::1</ip>
<ip>127.0.0.1</ip>
Each element of list has one of the following forms:
<ip> IP-address or network mask. Examples: 213.180.204.3 or 10.0.0.1/8 or 10.0.0.1/255.255.255.0
2a02:6b8::3 or 2a02:6b8::3/64 or 2a02:6b8::3/ffff:ffff:ffff:ffff::.
<host> Hostname. Example: server01.yandex.ru.
To check access, DNS query is performed, and all received addresses compared to peer address.
<host_regexp> Regular expression for host names. Example, ^server\d\d-\d\d-\d\.yandex\.ru$
To check access, DNS PTR query is performed for peer address and then regexp is applied.
Then, for result of PTR query, another DNS query is performed and all received addresses compared to peer address.
Strongly recommended that regexp is ends with $
All results of DNS requests are cached till server restart.
-->
<networks incl="networks" replace="replace">
<ip>::/0</ip>
</networks>
<!-- Settings profile for user. -->
<profile>default</profile>
<!-- Quota for user. -->
<quota>default</quota>
</default>
<!-- Example of user with readonly access. -->
<readonly>
<password></password>
<networks incl="networks" replace="replace">
<ip>::1</ip>
<ip>127.0.0.1</ip>
</networks>
<profile>readonly</profile>
<quota>default</quota>
</readonly>
</users>
Expected behavior
host01 :) select count(*) from remote('host02', system, "settings")
SELECT count(*)
FROM remote('host02', system, settings)
[host01] 2019.04.30 20:19:17.257224 {2bd95337-4089-43ef-82f4-3790614907cd} [ 39 ] <Debug> executeQuery: (from [::ffff:127.0.0.1]:53780) select count(*) from remote('host02', system, "settings")
→ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.) [host02] 2019.04.30 20:19:17.341413 {ca1f7f65-4d3a-4b15-879c-d316b502389b} [ 39 ] <Debug> executeQuery: (from [::ffff:10.00.000.100]:46242, initial_query_id: 2bd95337-4089-43ef-82f4-3790614907cd) DESC TABLE system.settings
[host02] 2019.04.30 20:19:17.341676 {ca1f7f65-4d3a-4b15-879c-d316b502389b} [ 39 ] <Debug> executeQuery: Query pipeline:
One
[host02] 2019.04.30 20:19:17.341966 {ca1f7f65-4d3a-4b15-879c-d316b502389b} [ 39 ] <Information> executeQuery: Read 4 rows, 266.00 B in 0.000 sec., 8242 rows/sec., 535.27 KiB/sec.
[host02] 2019.04.30 20:19:17.341979 {ca1f7f65-4d3a-4b15-879c-d316b502389b} [ 39 ] <Debug> MemoryTracker: Peak memory usage (for query): 1.07 MiB.
[host02] 2019.04.30 20:19:17.426472 {631cae5e-6772-4e36-95c9-40d03d487a52} [ 25 ] <Debug> executeQuery: (from [::ffff:10.00.000.100]:46260, initial_query_id: 2bd95337-4089-43ef-82f4-3790614907cd) DESC TABLE system.settings
[host02] 2019.04.30 20:19:17.426782 {631cae5e-6772-4e36-95c9-40d03d487a52} [ 25 ] <Debug> executeQuery: Query pipeline:
One
[host02] 2019.04.30 20:19:17.427089 {631cae5e-6772-4e36-95c9-40d03d487a52} [ 25 ] <Information> executeQuery: Read 4 rows, 266.00 B in 0.001 sec., 6987 rows/sec., 453.80 KiB/sec.
[host02] 2019.04.30 20:19:17.427103 {631cae5e-6772-4e36-95c9-40d03d487a52} [ 25 ] <Debug> MemoryTracker: Peak memory usage (for query): 1.07 MiB.
[host01] 2019.04.30 20:19:17.424397 {2bd95337-4089-43ef-82f4-3790614907cd} [ 39 ] <Debug> executeQuery: Query pipeline:
Remote
┌─count()─┐
│ 198 │
└─────────┘
↘ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.) [host02] 2019.04.30 20:19:17.468844 {c33e01a2-0188-4d47-b87e-07357a3502f6} [ 39 ] <Debug> executeQuery: (from [::ffff:10.00.000.100]:46242, initial_query_id: 2bd95337-4089-43ef-82f4-3790614907cd) SELECT count() FROM system.settings
[host02] 2019.04.30 20:19:17.469382 {c33e01a2-0188-4d47-b87e-07357a3502f6} [ 39 ] <Debug> executeQuery: Query pipeline:
Expression
Expression
Aggregating
Concat
Expression
One
↓ Progress: 198.00 rows, 29.13 KB (946.36 rows/s., 139.23 KB/s.) [host02] 2019.04.30 20:19:17.469713 {c33e01a2-0188-4d47-b87e-07357a3502f6} [ 39 ] <Information> executeQuery: Read 198 rows, 28.45 KiB in 0.001 sec., 235637 rows/sec., 33.06 MiB/sec.
[host02] 2019.04.30 20:19:17.469726 {c33e01a2-0188-4d47-b87e-07357a3502f6} [ 39 ] <Debug> MemoryTracker: Peak memory usage (for query): 1.09 MiB.
[host01] 2019.04.30 20:19:17.464816 {2bd95337-4089-43ef-82f4-3790614907cd} [ 39 ] <Information> executeQuery: Read 198 rows, 28.45 KiB in 0.208 sec., 954 rows/sec., 137.07 KiB/sec.
[host01] 2019.04.30 20:19:17.464839 {2bd95337-4089-43ef-82f4-3790614907cd} [ 39 ] <Debug> MemoryTracker: Peak memory usage (for query): 7.04 MiB.
1 rows in set. Elapsed: 0.210 sec.
Error message and/or stacktrace
host01 :) select count(*) from remote('host02', system, "settings");
SELECT count(*)
FROM remote('host02', system, `settings`)
[host01] 2019.04.26 18:26:43.079540 {69e973ab-e784-4fd8-a3b1-5e6655c3b0d5} [ 39 ] <Debug> executeQuery: (from [::ffff:127.0.0.1]:60802) select count(*) from remote('host02', system, "settings");
[host01] 2019.04.26 18:26:43.130489 {69e973ab-e784-4fd8-a3b1-5e6655c3b0d5} [ 39 ] <Warning> ConnectionPoolWithFailover: Connection failed at try №1, reason: Code: 209, e.displayText() = DB::NetException: Timeout: connect timed out: 10.10.1.111:9000 (host02:9000, 10.10.1.111)
[host01] 2019.04.26 18:26:43.180652 {69e973ab-e784-4fd8-a3b1-5e6655c3b0d5} [ 39 ] <Warning> ConnectionPoolWithFailover: Connection failed at try №2, reason: Code: 209, e.displayText() = DB::NetException: Timeout: connect timed out: 10.10.1.111:9000 (host02:9000, 10.10.1.111)
[host01] 2019.04.26 18:26:43.230794 {69e973ab-e784-4fd8-a3b1-5e6655c3b0d5} [ 39 ] <Warning> ConnectionPoolWithFailover: Connection failed at try №3, reason: Code: 209, e.displayText() = DB::NetException: Timeout: connect timed out: 10.10.1.111:9000 (host02:9000, 10.10.1.111)
[host01] 2019.04.26 18:26:43.230940 {69e973ab-e784-4fd8-a3b1-5e6655c3b0d5} [ 39 ] <Debug> MemoryTracker: Peak memory usage (total): 97.50 KiB.
[host01] 2019.04.26 18:26:43.302733 {69e973ab-e784-4fd8-a3b1-5e6655c3b0d5} [ 39 ] <Error> executeQuery: Code: 279, e.displayText() = DB::NetException: All connection tries failed. Log:
Code: 209, e.displayText() = DB::NetException: Timeout: connect timed out: 10.10.1.111:9000 (host02:9000, 10.10.1.111)
Code: 209, e.displayText() = DB::NetException: Timeout: connect timed out: 10.10.1.111:9000 (host02:9000, 10.10.1.111)
Code: 209, e.displayText() = DB::NetException: Timeout: connect timed out: 10.10.1.111:9000 (host02:9000, 10.10.1.111)
(from [::ffff:127.0.0.1]:60802) (in query: select count(*) from remote('host02', system, "settings");), Stack trace:
0. /usr/bin/clickhouse-server(StackTrace::StackTrace()+0x16) [0x6f1eb16]
1. /usr/bin/clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)+0x22) [0x33992d2]
2. /usr/bin/clickhouse-server(PoolWithFailoverBase<DB::IConnectionPool>::getMany(unsigned long, unsigned long, std::function<PoolWithFailoverBase<DB::IConnectionPool>::TryResult (DB::IConnectionPool&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)> const&, std::function<unsigned long (unsigned long)> const&, bool)+0x1e5d) [0x66dc18d]
3. /usr/bin/clickhouse-server(DB::ConnectionPoolWithFailover::getManyImpl(DB::Settings const*, DB::PoolMode, std::function<PoolWithFailoverBase<DB::IConnectionPool>::TryResult (DB::IConnectionPool&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)> const&)+0xcd) [0x66d3bad]
4. /usr/bin/clickhouse-server(DB::ConnectionPoolWithFailover::getManyChecked(DB::Settings const*, DB::PoolMode, DB::QualifiedTableName const&)+0x8e) [0x66d412e]
5. /usr/bin/clickhouse-server() [0x61bbad7]
6. /usr/bin/clickhouse-server(DB::RemoteBlockInputStream::sendQuery()+0x40) [0x61bdf20]
7. /usr/bin/clickhouse-server(DB::getStructureOfRemoteTable(DB::Cluster const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, DB::Context const&, std::shared_ptr<DB::IAST> const&)+0x9d3) [0x6595913]
8. /usr/bin/clickhouse-server(DB::TableFunctionRemote::executeImpl(std::shared_ptr<DB::IAST> const&, DB::Context const&) const+0xa12) [0x3517a32]
9. /usr/bin/clickhouse-server(DB::ITableFunction::execute(std::shared_ptr<DB::IAST> const&, DB::Context const&) const+0x4e) [0x676ea9e]
10. /usr/bin/clickhouse-server(DB::Context::executeTableFunction(std::shared_ptr<DB::IAST> const&)+0x2be) [0x629f5be]
11. /usr/bin/clickhouse-server(DB::InterpreterSelectQuery::InterpreterSelectQuery(std::shared_ptr<DB::IAST> const&, DB::Context const&, std::shared_ptr<DB::IBlockInputStream> const&, std::shared_ptr<DB::IStorage> const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, DB::QueryProcessingStage::Enum, unsigned long, bool)+0xb7b) [0x62fb2db]
12. /usr/bin/clickhouse-server(DB::InterpreterSelectQuery::InterpreterSelectQuery(std::shared_ptr<DB::IAST> const&, DB::Context const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, DB::QueryProcessingStage::Enum, unsigned long, bool)+0x56) [0x62fbc16]
13. /usr/bin/clickhouse-server(DB::InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(std::shared_ptr<DB::IAST> const&, DB::Context const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, DB::QueryProcessingStage::Enum, unsigned long, bool)+0x7e7) [0x6307e07]
14. /usr/bin/clickhouse-server(DB::InterpreterFactory::get(std::shared_ptr<DB::IAST>&, DB::Context&, DB::QueryProcessingStage::Enum)+0x3b0) [0x62e33d0]
15. /usr/bin/clickhouse-server() [0x6441c24]
16. /usr/bin/clickhouse-server(DB::executeQuery(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool)+0x81) [0x6443aa1]
17. /usr/bin/clickhouse-server(DB::TCPHandler::runImpl()+0x4a6) [0x33a92b6]
18. /usr/bin/clickhouse-server(DB::TCPHandler::run()+0x2b) [0x33aa48b]
19. /usr/bin/clickhouse-server(Poco::Net::TCPServerConnection::start()+0xf) [0x705866f]
20. /usr/bin/clickhouse-server(Poco::Net::TCPServerDispatcher::run()+0x16a) [0x7058a4a]
21. /usr/bin/clickhouse-server(Poco::PooledThread::run()+0x77) [0x7134f57]
22. /usr/bin/clickhouse-server(Poco::ThreadImpl::runnableEntry(void*)+0x38) [0x7130e18]
23. /usr/bin/clickhouse-server() [0xacce88f]
24. /lib/x86_64-linux-gnu/libpthread.so.0(+0x8064) [0x7fdf27b7c064]
25. /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fdf271a462d]
Received exception from server (version 19.3.6):
Code: 279. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::NetException. DB::NetException: All connection tries failed. Log:
Code: 209, e.displayText() = DB::NetException: Timeout: connect timed out: 10.10.1.111:9000 (host02:9000, 10.10.1.111)
Code: 209, e.displayText() = DB::NetException: Timeout: connect timed out: 10.10.1.111:9000 (host02:9000, 10.10.1.111)
Code: 209, e.displayText() = DB::NetException: Timeout: connect timed out: 10.10.1.111:9000 (host02:9000, 10.10.1.111)
.
0 rows in set. Elapsed: 0.226 sec.
Additional context The following two commands work.
host01:~$ clickhouse-client -h $host02IP -q 'select count(*) from system."settings"'
198
host01:~$ clickhouse-client -h host02 -q 'select count(*) from system."settings"'
198
There are also no logs generated on host02 in the failure case.
Results of netstat:
host01:~$ sudo netstat -nlp and netstat -nap | grep clickhouse
tcp 0 0 127.0.0.1:39432 127.0.0.1:9000 ESTABLISHED 83776/clickhouse-cl
tcp6 0 0 :::9000 :::* LISTEN 53591/clickhouse-se
tcp6 0 0 :::9009 :::* LISTEN 53591/clickhouse-se
tcp6 0 0 :::8123 :::* LISTEN 53591/clickhouse-se
tcp6 0 0 127.0.0.1:9000 127.0.0.1:39432 ESTABLISHED 53591/clickhouse-se
unix 3 [ ] STREAM CONNECTED 116688898 53591/clickhouse-se
host02:~$ sudo netstat -nlp and netstat -nap | grep clickhouse
tcp6 0 0 :::9000 :::* LISTEN 70251/clickhouse-se
tcp6 0 0 :::9009 :::* LISTEN 70251/clickhouse-se
tcp6 0 0 :::8123 :::* LISTEN 70251/clickhouse-se
unix 3 [ ] STREAM CONNECTED 133148652 70251/clickhouse-se
Host02 hostname/IP and port combo are reachable from host01.
Results of tcpdump on host02 in the error case:
host02:~$ sudo tcpdump -v -n dst port 9000
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:30:09.179092 IP (tos 0x0, ttl 49, id 61232, offset 0, flags [DF], proto TCP (6), length 60)
$HOST01IP.58670 > $HOST02IP.9000: Flags [S], cksum 0x87c9 (correct), seq 2184761190, win 29200, options [mss 1460,sackOK,TS val 1228808278 ecr 0,nop,wscale 7], length 0
20:30:09.227461 IP (tos 0x0, ttl 49, id 21551, offset 0, flags [DF], proto TCP (6), length 60)
$HOST01IP.58678 > $HOST02IP.9000: Flags [S], cksum 0x08a4 (correct), seq 1550021709, win 29200, options [mss 1460,sackOK,TS val 1228808290 ecr 0,nop,wscale 7], length 0
20:30:09.239830 IP (tos 0x0, ttl 49, id 14756, offset 0, flags [DF], proto TCP (6), length 40)
$HOST01IP.58670 > $HOST02IP.9000: Flags [R], cksum 0xcb4d (correct), seq 2184761191, win 0, length 0
20:30:09.279520 IP (tos 0x0, ttl 49, id 45150, offset 0, flags [DF], proto TCP (6), length 60)
$HOST01IP.58686 > $HOST02IP.9000: Flags [S], cksum 0x427e (correct), seq 110137393, win 29200, options [mss 1460,sackOK,TS val 1228808303 ecr 0,nop,wscale 7], length 0
20:30:09.284833 IP (tos 0x0, ttl 49, id 14758, offset 0, flags [DF], proto TCP (6), length 40)
$HOST01IP.58678 > $HOST02IP.9000: Flags [R], cksum 0x4c34 (correct), seq 1550021710, win 0, length 0
20:30:09.340346 IP (tos 0x0, ttl 49, id 14768, offset 0, flags [DF], proto TCP (6), length 40)
$HOST01IP.58686 > $HOST02IP.9000: Flags [R], cksum 0x861b (correct), seq 110137394, win 0, length 0
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 17 (9 by maintainers)
https://clickhouse.yandex/docs/en/operations/settings/settings/#connect-timeout-with-failover-ms connect_timeout_with_failover_ms Default value: 50.
cat /etc/clickhouse-server/conf.d/user_substitutes.xml