buildx: network mode "custom_network" not supported by buildkit
Background: Running a simple integration test fails with network option:
docker network create custom_network
docker run -d --network custom_network --name mongo mongo:3.6
docker buildx build --network custom_network --target=test .
Output:
network mode "custom_network" not supported by buildkit
Is this still not supported? Related code: https://github.com/docker/buildx/blob/master/build/build.go#L462-L463
About this issue
- State: open
- Created 5 years ago
- Reactions: 12
- Comments: 51 (8 by maintainers)
Indeed @Hronom, it feels like gaslighting when they act like there's no legitimate use case for this.
But I’d like to spin up a network for each build - and have all the stuff running that would be needed for the integration tests. But again, I have to loop back around and either do weird stuff with iptables, or run postgres on the host and share it with all builds (contention/secrets/writing to the same resources/etc).
You could see how it would be so much more encapsulated and attractive if I could spin up a network per build with a bunch of stub services and tear it down afterwards?
Ok, so for those who are looking for a solution like me, I created the repo https://github.com/Hronom/buildx-add-host-example with a workaround aggregated from this thread and from this topic.
There I put examples for local usage and GitHub Actions usage with `setup-buildx-action` and `build-push-action`. @poconnor-lab49, big respect to you! Thanks for the inspiration in this topic.
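For readers skimming the thread, the core of that workaround is roughly the following (a minimal sketch using the container and network names from the issue description; the builder name is illustrative):

```sh
# Resolve the IP of the service container on the custom network, then map
# its hostname into the build with --add-host so RUN steps can reach it.
MONGO_IP=$(docker inspect -f \
  '{{.NetworkSettings.Networks.custom_network.IPAddress}}' mongo)

# Builder attached to the custom network (docker-container driver).
docker buildx create --name net-builder \
  --driver docker-container --driver-opt network=custom_network --use

docker buildx build --add-host "mongo:${MONGO_IP}" --target=test .
```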
Shame on buildkit/buildx that you still haven't been able to find a proper solution within 4 years for such a common and popular use case. This is a really sad story.
OK, so once you’ve got it set up, how do you get name resolution to work? If I have a container `foo` that’s running on my custom network, and I do `docker run --rm --network custom alpine ping -c 1 foo`, it’s able to resolve the name `foo`. Likewise, if I create a builder with `docker buildx create --driver docker-container --driver-opt network=custom --name example --bootstrap`, and then `docker exec buildx_buildkit_example0 ping -c 1 foo`, that works. But if I have a Dockerfile with `RUN ping -c 1 foo` and then run `docker buildx build --builder example .`, I get `bad address foo`. If I manually specify the IP address, that works, but hard-coding an IP address into the Dockerfile hardly seems reasonable.

Adding another use case where specifying the network would be useful: “hermetic builds”.
I’m defining a docker network with `--internal` that has one other container on the network, a proxy that provides all the external libraries and files needed for the build. I’d like the docker build to run on this network without access to the external internet, but with access to that proxy.

I can do this with the classic docker build today, or I can create an entire VM with the appropriate network settings; perhaps it would also work if I set up a DinD instance, but it would be useful for buildkit to support this natively.
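A minimal sketch of that hermetic setup with the classic builder (the proxy image name is illustrative):

```sh
# The --internal network has no route to the outside world; only the proxy
# container is reachable from containers attached to it.
docker network create --internal hermetic
docker run -d --network hermetic --name proxy my-artifact-proxy:latest

# The classic builder still accepts a custom network name for RUN steps.
DOCKER_BUILDKIT=0 docker build --network hermetic -t myapp .
```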
Sorry, I’m not sure if we will ever start supporting this as it makes the build dependent on the configuration of a specific node and limits the build to a single node.
I will send $100 to anyone who adds custom networks back to docker compose by default (without disabling BuildKit). Sorry it is low, but I’m not valued in the billions of dollars, unlike the Docker corporation.
I can’t cope anymore. Going in circles trying to mitigate functionality issues between compose/docker/buildx/bake/buildkit when all I want is to be in one coherent environment (compose). It just overcomplicates everything in an already complicated environment which was previously working well. Until then I’ll continue to disable BuildKit like the folks above. Considering moving my infra over to podman or just baremetal.
Maybe @bryanhuntesl or others wouldn’t mind chipping in if my offer is too meager on its own.
I found a nice workaround; it’s also relevant to any other frontend framework. The short workaround strategy is this:

Besides the default sandbox, BuildKit only has the `host` network mode available, meaning by default the build happens in sandboxed, isolated mode, so we switch our Next.js app’s build-time network from that isolation to `host` mode and follow this rule: “From your container’s perspective localhost is the container itself. If you want to communicate with the host machine, you will need the IP address of the host itself. Depending on your Operating System and Docker configuration, this IP address varies. host.docker.internal is recommended from Docker version 18.03 onward.” So we use `host.docker.internal` as the Strapi app hostname; if we take into account the Strapi+Postgres container up and running with the `1337` port exposed for Strapi, we need a connection string value like this: `NEXT_PUBLIC_STRAPI_API_URL=http://host.docker.internal:1337/graphql`. That way we have build and runtime both working just fine, without an external network setup, just two independent docker-compose files started locally on the same machine. It works like that:

the build switches to `host` mode from the sandboxed mode -> it builds and asks GraphQL for data -> connects to `host.docker.internal:1337` -> Docker’s internal network redirects it to `localhost:1337` -> Docker routes this to the locally exposed Strapi running at port `1337` -> the connection is fine.

So the frontend build stage can connect straight to the internet to fetch a cloud CMS instance, or use the `host.docker.internal` connection to access another local docker-compose setup, but it can’t connect directly to `localhost:1337` or attach to a custom network and use DNS there like `http://strapi:1337/`. The build-stage network setup is therefore very limited and needs some tweaking like this: only `https://cms-in-the-cloud.com/graphql` or only `http://host.docker.internal:1337/graphql`, working outside the default sandboxed mode set for docker build.

`host.docker.internal` will work for you on a Windows machine with the latest Docker version (Docker v3 and above with the BuildKit engine); for Linux/macOS consider using the 127.0.0.1 address instead if you have older versions of Docker.
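In plain `docker build` terms, that switch might look like this (the image name and the use of a build-arg for the connection string are illustrative assumptions):

```sh
# Build on the host network instead of the default sandbox so the build
# stage can reach the locally exposed Strapi via host.docker.internal.
docker build --network host \
  --build-arg NEXT_PUBLIC_STRAPI_API_URL=http://host.docker.internal:1337/graphql \
  -t nextjs-app .
```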
To make this work, follow the network configuration in Docker like this:

Docker-compose A for Next.js: specify the .env connection to Strapi like this: `NEXT_PUBLIC_STRAPI_API_URL=http://host.docker.internal:1337/graphql`

Docker-compose B for Strapi + PostgreSQL for data storage.

Run Docker-compose B to get Strapi up, and then run Docker-compose A to run the Next.js build+run process, which will go through its `build` and `run` phases nicely through the `NEXT_PUBLIC_STRAPI_API_URL=http://host.docker.internal:1337/graphql` connection, that’s it!
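For reference, the run order described above might look like this locally (the compose file names are illustrative):

```sh
# Compose B first: Strapi + PostgreSQL, exposing port 1337 on the host.
docker compose -f docker-compose.strapi.yml up -d

# Then compose A: the Next.js build + run, which fetches data during the
# build via http://host.docker.internal:1337/graphql.
docker compose -f docker-compose.nextjs.yml up --build
```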
More docs: read https://github.com/docker/buildx/issues/591#issuecomment-816843505 to see:

“Buildkit supports none, host, default(sandboxed) as network mode. Supporting custom network names is currently not planned as it conflicts with the portability and security guarantees.”

So don’t use any other custom networks with BuildKit; that’s the new reality of the `docker build` command. In short, a custom network cannot be specified for the build step of docker-compose anymore, meaning that connecting two docker-compose files into a single custom external network to have the build running as before is no longer feasible after Docker moved to the https://docs.docker.com/build/buildkit/ engine, which was enabled by default in v23.0.0 on 2023-02-01 (https://docs.docker.com/engine/release-notes/23.0/#2300).

With a custom network the runtime phase is always fine, but this Docker v3 update breaks the build phase completely, making the build impossible if any `getStaticProps` static asset can’t be fetched during the build phase (`docker build`); that fails the build, so it never reaches the runtime phase (`docker run`) (if you skip the build and go straight to runtime, it will work).

The use case for this setup: your organization has `website/Strapi` and `website/Nextjs` repositories separately, but you just want to run them both locally on the same Docker network, running the `website/Strapi` docker-compose file and the `website/Nextjs` docker-compose file with a build step and getStaticProps. This is a stupidly simple use case we all use in everyday life, and this is where Docker v3 introduced that breaking change.

The only other solution I found is to use
`docker buildx ...`, but that way you lose the `docker-compose` features, or you merge Strapi and Next.js into one big monorepo with a big single docker-compose file.

It seems GCE’s metadata server IP is `169.254.169.254` (but I’m not sure if this is always the case), so this worked for me in Google Cloud Build:

and inside the `Dockerfile` (or using Cloud Client Libraries, which use Application Default Credentials):

This is particularly needed in environments such as Google Cloud Build, where ambient credentials (via the special-IP metadata service) are available only on a particular named network, not on the default network, in order to keep their exposure to build steps opt-in.
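The original snippets did not survive in this copy of the thread, but based on the surrounding description the approach was presumably along these lines (a hedged sketch; the builder, image name, and exact flags are assumptions):

```sh
# Builder attached to the named "cloudbuild" network, where the metadata
# service is reachable.
docker buildx create --name gcb --driver docker-container \
  --driver-opt network=cloudbuild --use

# Map the metadata hostname to the well-known metadata IP, since DNS from
# the custom network is not passed through to the build sandbox.
docker buildx build \
  --add-host metadata.google.internal:169.254.169.254 -t myimage .

# Inside a RUN step, a token can then be fetched from the metadata server
# (Cloud Client Libraries do the equivalent via Application Default Credentials).
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
```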
Why would someone do that? ssh-agent is something that needs to be fairly well locked down - why would someone forward it across an insecure connection?
I mean, that’s a tangent anyway. Being able to run integration-tests in a docker build was an incredibly useful feature, one less VM to spin up, and one less iceberg to melt, it’s just useful because it’s efficient.
It’s also great to not have to run nodejs, ruby, etc on the build host but instead just have them as container dependency, if you can do all your tests in a docker build container it’s one less thing to lock down.
Anyhow, I apologise for running off on a tangent. All I’m saying is, it would be awesome if you could bring that functionality into the latest version of docker along with the means to temporarily mount secrets. It’s just a really lightweight way to run disposable VMs without touching the host or even giving any rights to run any scripts or anything on the host.
That horse has bolted - SSH mount makes the build dependent upon the configuration of a single node - where did that dogma even get started?
I am currently hitting this issue, too, with the following setup on my Jenkins. I want to a) spin up a postgres docker image b) build a python library inside a Dockerfile, while running tests against said postgres database with a fixed name.
The issue is that my company wants me to use the docker plugin for Jenkins (https://plugins.jenkins.io/docker-workflow/, see https://docs.cloudbees.com/docs/cloudbees-ci/latest/pipelines/docker-workflow)
The Jenkinsfile code looks similar to this:
Now, I could rewrite this code to work with buildx as said above, but then I’d need to use basic shell syntax as opposed to the plugin, which will perform clean-up activities in case of failures automatically.
I’ve been seeing similar. You can run the build in a user specified network. But the buildkit container on that network has DNS set to the docker’s localhost entry which won’t get passed through to nested containers. So the RUN steps within the build don’t have that DNS resolution. I’m not sure of the best way to get that to pass through, perhaps a proxy running in the buildkit container that lets DNS get set to the container IP instead of localhost?
I have the same problem as @philomory. Name resolution doesn’t work. I am using `network=cloudbuild` on Google Cloud Platform, so I can’t hardcode any IP address. The builder was created with the following command: `docker buildx create --driver docker-container --driver-opt network=cloudbuild --name test --use`

The recommendation is to use `buildx create --driver-opt network=custom` instead when you absolutely need this capability. The same applies to the Google Cloud Build use case.

Indeed, ideally it would be simpler. These workarounds are ugly as hell.
My pipelines have still been working fine for some time, but trying to create a new container locally I now notice that `docker build` is not working anymore. Apparently `docker build` is an alias for `buildx` (whatever that is), but buildx can’t have a simple ‘network’ flag to build with a network? (The recommendation is still to disable the default network on local installs and create a custom bridge network, so this seems quite essential??)

So, to get a build going locally, I now have to ‘create a buildx node’ with the network flag, then tell buildx to use that node, then use ‘buildx build’ instead of just ‘build’, and the first thing it does is load up some buildx image.
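Spelled out, that local dance looks roughly like this (the builder and network names are illustrative):

```sh
# One-time: create a builder attached to the custom bridge network and
# make it the current builder (--use).
docker buildx create --name localnet \
  --driver docker-container --driver-opt network=my_bridge --use

# Then build with that builder instead of plain `docker build`.
docker buildx build -t myapp .
```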
… why? If the build system is being replaced by buildx, at least make it seamless and reach feature parity before doing things like this (or make users opt in to the experimental stuff).
Making the `docker build` command line compatible by including an option to set the network to use would already make it a lot better, but I’m sure other people have other issues with it, reading this thread.

This is crazy:
`docker compose` was used to spin up dependencies, e.g. db, redis, etc., but `build-push-action` is not able to connect to those containers. This is critical… I tried to use:
in `docker/setup-buildx-action@v3` and then run
`docker/metadata-action@v5`, and during the build it is not able to resolve the host names for `db` and `redis`. What should I do?
Any updates on this? I have also checked this issue https://github.com/moby/buildkit/issues/978, but can’t find a straight answer. I’ve disabled BuildKit in the Docker Desktop configuration to be able to build my containers, but I’m guessing that is just a workaround. Any progress on this would be appreciated.

Good point, I should have mentioned I was doing that too for git dependencies, and… Docker themselves have blogged about using it to augment the docker cache. Now I just burn the network, take lots of coffee breaks, and do my bit to melt the ice caps.
Why would that connection be insecure? Forwarding the agent is more secure than build secrets because your nodes never get access to your keys.
We have solutions for build secrets, privileged execution modes (where you needed `docker run` before for more complicated integration tests) and persistent cache for your apt/npm cache etc. https://github.com/moby/buildkit/issues/1337 is implementing sidecar containers support. None of this breaks the portability of the build. And if you really want it, host networking is available for you.

I think it’s a lot of different workarounds, for something that could be simple.
With the buildx plug-in (which is in the install docs these days) the ‘build’ command doesn’t manage to pass the ‘--network’ parameter along to the buildx builder being automatically created.
And since the ‘--network’ parameter exists and tries to do something with buildx, it seems like a bug to me, but that’s probably the view from my limited bubble.
I was able to reach the containers using `--network "host"`, but this is not good enough, since some people that would run this command are not using Linux, and this flag does not work in other major OSes like Mac and Windows (even with WSL).

That’s not nearly enough information to get help, I fear. Is this repository public? Then we could actually see what you’ve tried to do, and possibly work out what’s gone wrong.
Edit: Looking back, e.g. this comment and the preceding few suggest that DNS resolution during building specifically may not be correctly honouring the custom network.
That’s already been specifically reported at https://github.com/docker/buildx/issues/1347, but turns out to be a BuildKit issue, rather than buildx, so the real issue to track (AFAICT) is https://github.com/moby/buildkit/issues/3210 or https://github.com/moby/buildkit/issues/2404. Probably both, at an initial glance. I suspect the former is a duplicate of the latter, since when using a custom network for the docker-container driver, from the perspective of the BuildKit running in that container, that custom network is the host network.
You might be able to use the workaround shown in https://github.com/docker/buildx/issues/175#issuecomment-1099567863 with add-hosts, but possibly you’ll need a separate build step to extract the IPs from the custom network, as (I hope) you can’t smuggle arbitrary subshell executions into `build-push-action`…

Right, that’s why I opened this a couple weeks ago: https://github.com/docker/buildx/issues/1761
I haven’t been able to, but I need to put more effort into it. It just feels like too many nested workarounds at this point to set myself up to depend on. What stopped me from `docker buildx build` was that I couldn’t get it to find the (first layer of) images that were already created, which `buildx build` extends via `FROM` in this service’s Dockerfile. It would just try to download them from `docker.io`. It also wouldn’t read the `${project}` env var that has always worked in the Dockerfile.

But then there’s a bunch of compose-specific build stuff you don’t have access to…
Makes no sense. Also, provably false; it used to work and therefore was already supported. Who exactly is “we”?
Is there some version of docker-compose v1 where DOCKER_BUILDKIT=1 somehow does not result in BuildKit getting used? Because docker-compose build on Docker Compose version 2.2.3 produces output that looks like the classic builder.
You don’t pass the custom network name with build commands. Your builder instance is already part of that network.
@bryanhuntesl The proxy vars are still supported. For this use case, cache mounts might be a better solution now https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/syntax.md#run---mounttypecache
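As a concrete illustration of that suggestion (a minimal sketch; the base image and the package being installed are just examples):

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu:22.04
# Keep downloaded .deb files so the cache mount below is actually useful.
RUN rm -f /etc/apt/apt.conf.d/docker-clean
# Persist the apt cache across builds with a cache mount instead of going
# through a caching proxy on a custom network.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends build-essential
```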
Also, it does work in compose while build secrets do not.
The sidecar also sounds great and very clever and well structured. But again, 3 years ago I could build with secrets and talk to network services to run integration tests.