earthly: SAVE IMAGE is slow, even when there's no work to be done

I was observing that for highly optimized builds the slowest part can be saving images. For example, here’s a repo where earth +all takes 12s if everything is cached (like I run earth +all twice). Yet if I comment the SAVE IMAGE lines out the total time drops to 2s. This implies that SAVE IMAGE is doing a lot of work, even when nothing has changed.

Is there anything I can do speed up SAVE IMAGE in instances like this? I’m surprised that SAVE IMAGE does anything if the image hasn’t changed, is it possible for it do some more sophisticated content negotiation with the layers?

After talking with @agbell in Slack I hypothesized that it might not be possible for Earthly to what images/layers the host has. This is all conjecture on my part, but:

If Earth is running in a container then it doesn’t know the state of the registry on the host machine, and what layers it has. It’s only option is to export the entire image to the host, which on a Mac could be slow because containers on a mac are actually running in a VM.

Maybe if Earth could mount/be aware of the host docker registry it could just do docker push? This reminds me of similar problems that are being solved in the Kubernetes local cluster space https://github.com/kubernetes/enhancements/tree/master/keps/sig-cluster-lifecycle/generic/1755-communicating-a-local-registry

About this issue

Original URL
State: closed
Created 4 years ago
Comments: 20 (19 by maintainers)

Most upvoted comments

Hi @jazzdan - you are spot-on. The output of the build is passed to docker as a tar file, so there are inherent inefficiencies resulting from that. I think indeed there may be ways in which Earthly could talk to host docker while it’s building and decide to skip generating tars (or even individual layers) if the image (or parts of it) is already available.

vladaionescu on Nov 11, 2020

After some deliberation, we decided not to add encryption for the connection as it is not necessary.

Reasoning:

It seems that docker pull just works with insecure 127.0.0.1:*. There’s no need for any additional configuration (yay!).
The registry is read-only - there’s no way for anyone to push anything to it - so that means that you couldn’t cause damage by doing a rogue docker push somehow.
The ability for anyone to docker pull from localhost is enabled already. So we’re not opening up any new vulnerability to the docker system.
Adding certificates, on the other hand, requires that we use sudo to place the certificates in the right place for docker to find and accept them for docker pulls AND on mac & windows it additionally requires a docker restart, which some users will forget to do.
On the local system there’s no way to sniff or manipulate loopback traffic unless you already have root.

This means that this feature is code-complete and will be available in the next release. We will roll it out as disabled by default initially. To enable the feature you will be able to run the following.

earthly config global.local_registry_host 'tcp://127.0.0.1:8371'

vladaionescu on May 28, 2021

We ended up with rules to either build the image or output a docker context via artifacts like this:

setup:
   COPY Dockerfile .
   SAVE ARTIFACT ./*

image-local:
    FROM +setup
    SAVE ARTIFACT ./* AS LOCAL out/

image:
    FROM DOCKERFILE +setup
    SAVE IMAGE image-name

So we can earth +image-local && docker build -t image-name out/ for faster local-dev iteration.

loganfsmyth on Dec 3, 2020

This is now released as experimental in v0.5.16. It’s disabled by default for now, but can be enabled using

earthly config global.local_registry_host 'tcp://127.0.0.1:8371'

vladaionescu on Jun 3, 2021

Ok, running a headless registry isn’t very hard. I was able to do it here - as far as I can tell it’s working fine as I was able to push a test image to it. Will look into implementing the storage driver next.

vladaionescu on Apr 14, 2021

Hi @jazzdan - I’ll get back to this in April. Haven’t yet had a chance to dig into the local registry idea.

vladaionescu on Mar 31, 2021

Can you describe the use-case a little better?

Yeah, there are two use cases. tl;dr though: mostly one image that we’re iterating on constantly and only the top layers are changing.

When we use Earthly + Tilt Tilt will automatically trigger Earthly builds when files change, build the resulting docker image, and push it in to a local k8s’s cluster image registry. This should be very fast but because there’s this Earthly export step it can take a long time, even for tiny changes to a single layer.
As you said there are some images we want to iterate on locally, manually. In this case you’re often changing just a top layer or two and exports take a long time. We use a lot of ubuntu images so having to export + import all of Ubuntu across the macOS docker VM boundary takes a long time.

We do have the other problem (many images, most of which have not changed at all) but this only comes in deploys I think so I’m not that worried about it. Plus those run on Linux anyways so it’s faster.

FWIW I’ve seen the docker pull method be very effective for transferring images in to local Kubernetes clusters.

jazzdan on Feb 26, 2021