ClickHouse: Integration tests fail due to docker-compose pull timeout

Sometimes, 900 seconds is not enough for docker to pull the images, so:

  • maybe there are some problems with http://dockerhub-proxy.dockerhub-proxy-zone:5000/?
  • or just with CI infrastructure?
  • or maybe it is worth simply increasing this timeout and enabling --debug mode for dockerd (maybe it will produce more logs, e.g. on retries or something)? But I doubt that this is a good idea, since the tests would then take even longer.

@Felixoid what do you think?

Examples:

About this issue

  • State: closed
  • Created 8 months ago
  • Comments: 20 (19 by maintainers)

Most upvoted comments

Together with support, we narrowed the issue down to the OS. The syslog contains the following lines:

Dec 7 18:17:07 i-00b68f90e176e90ce CRON[3107]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Dec 7 18:27:39 i-00b68f90e176e90ce kernel: [29603.038581] TCP: out of memory -- consider tuning tcp_mem
Dec 7 18:27:41 i-00b68f90e176e90ce kernel: [29604.715673] TCP: out of memory -- consider tuning tcp_mem
Dec 7 18:27:43 i-00b68f90e176e90ce kernel: [29606.806509] TCP: out of memory -- consider tuning tcp_mem
Dec 7 18:27:43 i-00b68f90e176e90ce kernel: [29607.391224] TCP: out of memory -- consider tuning tcp_mem
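For anyone who wants to check their own runners for the same symptom, here is a rough Python sketch that picks these warnings out of syslog lines. The regex and the function name `tcp_oom_uptimes` are my own assumptions based only on the line format shown above, not something taken from the CI scripts.

```python
import re

# Assumed pattern, derived from the kernel lines quoted above:
# "... kernel: [29603.038581] TCP: out of memory -- consider tuning tcp_mem"
TCP_OOM = re.compile(r"kernel: \[\s*(?P<uptime>\d+\.\d+)\] TCP: out of memory")


def tcp_oom_uptimes(lines):
    """Return the kernel uptime stamp (seconds) of every TCP out-of-memory warning."""
    stamps = []
    for line in lines:
        match = TCP_OOM.search(line)
        if match:
            stamps.append(float(match.group("uptime")))
    return stamps


# Sample input copied from the syslog excerpt above.
sample = [
    "Dec 7 18:17:07 i-00b68f90e176e90ce CRON[3107]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)",
    "Dec 7 18:27:39 i-00b68f90e176e90ce kernel: [29603.038581] TCP: out of memory -- consider tuning tcp_mem",
    "Dec 7 18:27:41 i-00b68f90e176e90ce kernel: [29604.715673] TCP: out of memory -- consider tuning tcp_mem",
    "Dec 7 18:27:43 i-00b68f90e176e90ce kernel: [29606.806509] TCP: out of memory -- consider tuning tcp_mem",
    "Dec 7 18:27:43 i-00b68f90e176e90ce kernel: [29607.391224] TCP: out of memory -- consider tuning tcp_mem",
]

stamps = tcp_oom_uptimes(sample)
print(f"{len(stamps)} warnings within {stamps[-1] - stamps[0]:.1f} s of uptime")
```

The CRON line is skipped and only the four kernel warnings are counted, so a burst like the one above is easy to spot.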

Based on the Linux manual and some random pages, I am trying the following configuration to mitigate it:

net.core.netdev_max_backlog=2000
net.core.rmem_max=1048576
net.core.wmem_max=1048576
net.ipv4.tcp_max_syn_backlog=1024
net.ipv4.tcp_rmem=4096 131072 16777216
net.ipv4.tcp_wmem=4096 87380 16777216
net.ipv4.tcp_mem=4096 131072 16777216

References: https://dzone.com/articles/tcp-out-of-memory-consider-tuning-tcp-mem and https://www.kernel.org/doc/html/latest/networking/ip-sysctl.html
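One thing worth keeping in mind when reading these numbers: per the kernel documentation, tcp_rmem and tcp_wmem are given in bytes, while tcp_mem is measured in units of the system page size. A small Python sketch to see what the tcp_mem triple means in bytes, assuming 4096-byte pages (check with `getconf PAGE_SIZE`); the helper name `tcp_mem_bytes` is just for illustration.

```python
PAGE_SIZE = 4096  # assumption: typical x86_64 page size, verify with `getconf PAGE_SIZE`


def tcp_mem_bytes(value, page_size=PAGE_SIZE):
    """Convert a 'low pressure high' tcp_mem string (in pages) to a tuple of bytes."""
    return tuple(int(field) * page_size for field in value.split())


low, pressure, high = tcp_mem_bytes("4096 131072 16777216")
print(f"{low // 2**20} MiB {pressure // 2**20} MiB {high // 2**30} GiB")
# prints: 16 MiB 512 MiB 64 GiB
```

So under that page-size assumption the upper tcp_mem bound is far above what the host is likely to have, which effectively lifts the limit rather than tuning it.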

Update: I caught another case of resets and fixed it with net.ipv4.tcp_mem=4096 131072 16777216. I will apply it everywhere in a moment.

The query I’m monitoring

updated one