podman: /v1.32/networks/create "failed to respond"

Issue Description

I have a suite of integration tests using testcontainers to automate Apache Kafka tests. Each test creates containers for Kafka and Zookeeper and a Podman Network which is torn down after each test. After succession of several passing tests, I see a /v1.32/networks/create call fail. Podman appears to close the socket without sending a HTTP response.

The problem is 100% repeatable for me and always occurs in exactly the same spot in the test suite.

If I fix a bug in testcontainers that allows it to retry the network POST, the problem disappears.

client:       Podman Engine
Version:      4.4.2
API Version:  4.4.2
Go Version:   go1.20.1
Git Commit:   74afe26887f814d1c39925a1624851ef3590e79c
Built:        Thu Feb 23 15:59:21 2023
OS/Arch:      darwin/amd64

Server:       Podman Engine
Version:      4.4.1
API Version:  4.4.1
Go Version:   go1.19.5
Built:        Thu Feb  9 10:58:53 2023
OS/Arch:      linux/amd64
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.impl.classic.MainClientExec:102 - ex-00000011: executing POST /v1.32/networks/create HTTP/1.1
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.impl.io.DefaultManagedHttpClientConnection:134 - http-outgoing-0: set socket timeout to 44000000000 NANOSECONDS
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.impl.classic.InternalHttpClient:213 - ep-00000010: start execution ex-00000011
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManager:594 - ep-00000010: executing exchange ex-00000011 over http-outgoing-0
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.headers:169 - http-outgoing-0 >> POST /v1.32/networks/create HTTP/1.1
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.headers:172 - http-outgoing-0 >> accept: application/json
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.headers:172 - http-outgoing-0 >> content-type: application/json
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.headers:172 - http-outgoing-0 >> x-tc-sid: 16246381-601c-4abb-8e67-5cb92c70a7f5
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.headers:172 - http-outgoing-0 >> Accept-Encoding: gzip, x-gzip, deflate
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.headers:172 - http-outgoing-0 >> Content-Length: 363
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.headers:172 - http-outgoing-0 >> Host: localhost:2375
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.headers:172 - http-outgoing-0 >> Connection: keep-alive
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.headers:172 - http-outgoing-0 >> User-Agent: Apache-HttpClient/5.0.3 (Java/11.0.16.1)
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:92 - http-outgoing-0 >> "POST /v1.32/networks/create HTTP/1.1[\r][\n]"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:92 - http-outgoing-0 >> "accept: application/json[\r][\n]"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:92 - http-outgoing-0 >> "content-type: application/json[\r][\n]"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:92 - http-outgoing-0 >> "x-tc-sid: 16246381-601c-4abb-8e67-5cb92c70a7f5[\r][\n]"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:92 - http-outgoing-0 >> "Accept-Encoding: gzip, x-gzip, deflate[\r][\n]"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:92 - http-outgoing-0 >> "Content-Length: 363[\r][\n]"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:92 - http-outgoing-0 >> "Host: localhost:2375[\r][\n]"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:92 - http-outgoing-0 >> "Connection: keep-alive[\r][\n]"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:92 - http-outgoing-0 >> "User-Agent: Apache-HttpClient/5.0.3 (Java/11.0.16.1)[\r][\n]"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:92 - http-outgoing-0 >> "[\r][\n]"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:106 - http-outgoing-0 >> "{"enableIPv6":null,"Name":"1664a022-9886-424a-8028-65c38b953248","Driver":null,"IPAM":null,"Options":{},"CheckDuplicate":true,"Internal":null,"EnableIPv6":null,"Attachable":null,"Labels":{"org.testcontainers":"true","org.testcontainers.lang":"java","org.testcontainers.version":"unspecified","org.testcontainers.sessionId":"16246381-601c-4abb-8e67-5cb92c70a7f5"}}"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.wire:106 - http-outgoing-0 << "end of stream"
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.impl.io.DefaultManagedHttpClientConnection:125 - http-outgoing-0: Close connection
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.impl.classic.InternalHttpClient:239 - ep-00000010: endpoint closed
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.impl.classic.InternalHttpClient:243 - ep-00000010: discarding endpoint
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManager:349 - ep-00000010: releasing endpoint
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManager:372 - ep-00000010: connection is not kept alive
2023-02-26 19:29:55 DEBUG com.github.dockerjava.zerodep.shaded.org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManager:381 - ep-00000010: connection released [route: {}->unix://localhost:2375][total available: 1; route allocated: 1 of 2147483647; total allocated: 1 of 2147483647]
2023-02-26 19:29:55 ERROR 🐳 [quay.io/ogunalp/kafka-native:latest-snapshot]:535 - Could not start container
java.lang.RuntimeException: com.github.dockerjava.zerodep.shaded.org.apache.hc.core5.http.NoHttpResponseException: localhost:2375 failed to respond
	at com.github.dockerjava.zerodep.ApacheDockerHttpClientImpl.execute(ApacheDockerHttpClientImpl.java:195) ~[docker-java-transport-zerodep-3.3.0.jar:?]
	at com.github.dockerjava.zerodep.ZerodepDockerHttpClient.execute(ZerodepDockerHttpClient.java:8) ~[docker-java-transport-zerodep-3.3.0.jar:?]
	at org.testcontainers.dockerclient.HeadersAddingDockerHttpClient.execute(HeadersAddingDockerHttpClient.java:23) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:228) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder.post(DefaultInvocationBuilder.java:124) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.shaded.com.github.dockerjava.core.exec.CreateNetworkCmdExec.execute(CreateNetworkCmdExec.java:27) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.shaded.com.github.dockerjava.core.exec.CreateNetworkCmdExec.execute(CreateNetworkCmdExec.java:12) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.shaded.com.github.dockerjava.core.exec.AbstrSyncDockerCmdExec.exec(AbstrSyncDockerCmdExec.java:21) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.shaded.com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:35) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.containers.Network$NetworkImpl.create(Network.java:100) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.containers.Network$NetworkImpl.getId(Network.java:64) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.containers.GenericContainer.applyConfiguration(GenericContainer.java:874) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:378) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:344) ~[testcontainers-unspecified.jar:unspecified]
	at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:81) ~[duct-tape-1.0.8.jar:?]
	at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:334) ~[testcontainers-unspecified.jar:unspecified]
	at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:322) ~[testcontainers-unspecified.jar:unspecified]
	at java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783) ~[?:?]
	at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:829) ~[?:?]

Steps to reproduce the issue

I haven’t yet been able to reproduce the issue outside test suite. There’s something about the workload of the test that appears to trip the defect.

Steps to reproduce the issue

  1. git clone https://github.com/k-wall/kroxylicious-junit5-extension/tree/minimal
  2. TESTCONTAINERS_RYUK_DISABLED=true mvn clean verify -Dmaven.javadoc.skip=true #Use OpenJDK 11
  3. StaticFieldSubclassExtensionTest will fail with the failed network creation call.
  4. Commenting out any test from TemplateTest will allow the test to succeed

Describe the results you received

No response to network creation HTTP request.

Describe the results you expected

Network created successfully.

podman info output

podman info
host:
  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.5-1.fc37.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.5, commit: '
  cpuUtilization:
    idlePercent: 95.76
    systemPercent: 2.6
    userPercent: 1.64
  cpus: 1
  distribution:
    distribution: fedora
    variant: coreos
    version: "37"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.1.11-200.fc37.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 74768384
  memTotal: 2065817600
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8-1.fc37.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8
      commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
      rundir: /run/user/501/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 22h 9m 22.00s (Approximately 0.92 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 0
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106769133568
  graphRootUsed: 4927156224
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 3
  runRoot: /run/user/501/containers
  transientStore: false
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.4.1
  Built: 1675940333
  BuiltTime: Thu Feb  9 10:58:53 2023
  GitCommit: ""
  GoVersion: go1.19.5
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

Additional environment details

Additional information

Additional information like issue happens only occasionally or issue happens with a particular architecture or on a particular setting

I checked the podman machine’s journalctl, I see nothing suspicious in that log.

podman-machine-journalctl.txt

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 22 (7 by maintainers)

Most upvoted comments

Thanks @Luap99, there’s also a non-standard response header Keep-Alive that could be added to give a hint to clients about how long the connection can be kept alive, so adding something like Keep-Alive: timeout=10 to responses if idle timeout is 10 seconds.

FYI: We are also exploring client-side fixes as it’s perhaps better to make the client resilient, just wondering if there’s any improvements on the server.

updating to testcontainers 1.18.0 fixed it for me

@abentele - you are probably benefiting from https://github.com/testcontainers/testcontainers-java/issues/6667 which was part of 1.18.0. That fix allows testcontainers to retry the network creation if it fails. Previously network creation was not subjected to the retry.

We are running 1.18.0 and confirm we still see the issue in podman/mac. We are working around by setting org.testcontainers.containers.GenericContainer#withStartupAttempts on our containers.