COPY --link is a new BuildKit feature that can significantly speed up your Docker image builds. It works by copying files into independent image layers that don’t rely on the presence of their predecessors. You can add new content to images without the base image even existing on your system.
In this article, we show what --link does and explain how it works. We also look at some of the situations in which it should not be used.
What is --link?
--link is a new optional argument for the existing Dockerfile COPY instruction. It changes how copies work by creating a new snapshot layer each time it is used.
Regular COPY instructions add files to the layer that precedes them in the Dockerfile. The content of that layer must exist on your disk so that the new content can be merged in:
FROM alpine
COPY my-file /my-file
COPY another-file /another-file
The Dockerfile above copies my-file into the layer created by the previous command. After the FROM instruction, the image consists of Alpine’s content:
bin/
dev/
etc/
...
The first COPY instruction creates an image that includes everything from Alpine, as well as my-file:
my-file
bin/
dev/
etc/
...
The second COPY instruction then adds another-file on top of this image:
another-file
my-file
bin/
dev/
etc/
...
The layer created by each instruction contains everything that came before it plus whatever was newly added. At the end of the build, Docker uses a diffing process to work out the changes within each layer. The final image blobs contain only the files added at each snapshot stage, but this is not reflected in the assembly process during the build.
Introducing “--link”
--link modifies COPY to create a new standalone file system each time it is used. Instead of copying the new files on top of the previous layer, they are sent to a completely different location to become an independent layer. The layers are then linked together to create the final image.
Let’s change the sample Dockerfile to use --link:
FROM alpine
COPY --link my-file /my-file
COPY --link another-file /another-file
The result of the FROM instruction is unchanged: it yields the Alpine layer with all of that image’s content:
bin/
dev/
etc/
...
The first COPY instruction has a significantly different effect. This time another independent layer is created. It is a new file system that contains only my-file.
The second COPY instruction then creates another new snapshot that contains only another-file.
When the build is complete, Docker saves these independent snapshots as new layer archives (tarballs). The tarballs are linked back into the chain of preceding layers to form the final image. It consists of all three snapshots merged together, so the file system that containers see matches the original:
my-file
another-file
bin/
dev/
etc/
...
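The merge step can be imitated with ordinary directories and tar archives. The sketch below is only an analogy, not how BuildKit stores snapshots: the directory names, the file contents, and the "unpack in chain order" merge are all assumptions made for illustration. What it shows is that each added layer is archived without ever reading the base directory, and that the merged result still contains everything.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# Stand-in for the base image layer (hypothetical, mostly empty directories).
mkdir -p "$work/base/bin" "$work/base/dev" "$work/base/etc"

# Two independent "layers", each holding only the file it adds.
mkdir -p "$work/layer1" "$work/layer2"
echo "first" > "$work/layer1/my-file"
echo "second" > "$work/layer2/another-file"

# Archive every layer on its own; the layer1/layer2 steps never touch base/.
for layer in base layer1 layer2; do
  tar -C "$work/$layer" -cf "$work/$layer.tar" .
done

# Merge: unpack the archives in chain order into a single root file system.
mkdir "$work/rootfs"
for layer in base layer1 layer2; do
  tar -C "$work/rootfs" -xf "$work/$layer.tar"
done

# List the merged file system.
ls "$work/rootfs"
```

Because neither file layer depends on the base directory, swapping base.tar for a different archive would leave layer1.tar and layer2.tar untouched.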
This image from the BuildKit project illustrates the differences between the two approaches.
Adding “COPY --link” to your builds
COPY --link is only available when you use BuildKit to build your images. Either run your build with docker buildx build, or use docker build with the DOCKER_BUILDKIT=1 environment variable set.
You also need to opt in to the Dockerfile v1.4 syntax using a comment at the top of your file:
# syntax=docker/dockerfile:1.4
FROM alpine:latest
COPY --link my-file /my-file
COPY --link another-file /another-file
Now you can build your image with linked copy support:
DOCKER_BUILDKIT=1 docker build -t my-image:latest .
Images built from Dockerfiles that use COPY --link can be used like any other. You can start a container with docker run and push them straight to registries. The --link flag only affects how content is added to the image layers during the build.
Why Linked Copies Matter
Using the --link flag allows build caches to be reused even when the content you COPY in changes. Additionally, builds can complete without their base image even being present on your machine.
Going back to the example above, standard COPY behavior requires the alpine image to be present on your Docker host before the new content can be added. The image is downloaded automatically during the build if you haven’t pulled it beforehand.
With linked copies, Docker doesn’t need the content of the alpine image. It pulls the alpine manifest, creates new independent layers for the copied files, and then creates a revised manifest that links those layers into the ones provided by alpine. The content of the alpine image, meaning its layer blobs, is only downloaded when you start a container from your new image or export it to a tar archive. When you push the image to a registry, that registry stores the new layers and fetches the alpine layers remotely.
This functionality also facilitates efficient image rebases. Perhaps you are currently distributing a Docker image based on the Ubuntu 20.04 LTS release:
FROM golang AS build
...
RUN go build -o /app .
FROM ubuntu:20.04
COPY --link --from=build /app /bin/app
ENTRYPOINT ["/bin/app"]
You can build the image with caching enabled using BuildKit’s --cache-to flag. The inline cache type stores cache data inside the output image, where it can be reused in subsequent builds:
docker buildx build --cache-to type=inline -t example-image:20.04 .
Now let’s say you want to deploy an image based on the next LTS after its release, Ubuntu 22.04:
FROM golang AS build
...
RUN go build -o /app .
FROM ubuntu:22.04
COPY --link --from=build /app /bin/app
ENTRYPOINT ["/bin/app"]
Rebuild the image with the cache data embedded in the original version:
docker buildx build --cache-from example-image:20.04 -t example-image:22.04 .
The build will complete almost immediately. Using the cached data from the existing image, Docker can verify that the files needed to build /app have not changed. This means the cache for the independent layer created by the COPY instruction remains valid. Since this layer does not depend on any other layer, the ubuntu:22.04 image is not pulled either. Docker simply links the snapshot layer containing /bin/app into a new manifest within the ubuntu:22.04 layer chain. The snapshot layer is effectively “rebased” onto a new parent image without any file system operations taking place.
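As a rough mental model, a manifest can be treated as an ordered list of layer identifiers. The toy below makes that assumption explicit: the layer names are made up, real manifests are JSON documents that reference content digests, and link_layer is a hypothetical helper rather than anything Docker provides. It only illustrates why a rebase can be a metadata operation: the same application layer identifier is appended to two different base chains, and the layer itself is never rebuilt.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# Hypothetical layer identifiers standing in for real content digests.
printf '%s\n' ubuntu-20.04-layer-a ubuntu-20.04-layer-b > "$work/ubuntu-20.04.manifest"
printf '%s\n' ubuntu-22.04-layer-a > "$work/ubuntu-22.04.manifest"
echo "app-layer" > "$work/app.layer-id"

# Hypothetical helper: a new manifest is the base chain followed by the linked layer.
link_layer() {
  cat "$1" "$2"
}

# The original image and the rebased image reuse the same app layer identifier.
link_layer "$work/ubuntu-20.04.manifest" "$work/app.layer-id" > "$work/example-20.04.manifest"
link_layer "$work/ubuntu-22.04.manifest" "$work/app.layer-id" > "$work/example-22.04.manifest"

cat "$work/example-22.04.manifest"
```

Only the base entries differ between the two resulting manifests; the final entry is identical in both.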
The model also optimizes multi-stage builds, where changes can occur between stages:
FROM golang AS build
RUN go build -o /app .
FROM config-builder AS config
RUN generate-config --out /config.yaml
FROM ubuntu:latest
COPY --link --from=config /config.yaml build.conf
COPY --link --from=build /app /bin/app
Without --link, any change to the generated configuration file causes ubuntu:latest to be pulled and the file to be copied into it. The binary must then be recompiled, since its cache is invalidated by the file system changes. With linked copies, a change to config.yaml allows the build to continue without pulling ubuntu:latest or recompiling the binary. The snapshot layer with build.conf inside is simply replaced with a new version that is independent of all the other layers.
When not to use it
There are situations in which the --link flag will not work correctly. Since files are copied into a new layer instead of being added on top of the previous one, you cannot use ambiguous references as the destination path:
COPY --link my-file /data
With a regular COPY instruction, my-file is copied to /data/my-file if /data already exists as a directory in the image. With --link, the target layer’s file system is always empty, so my-file is written to /data instead.
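The same ambiguity exists with the ordinary cp command, which makes it easy to reproduce outside Docker. The sketch below is an analogy under stated assumptions: the "existing" and "empty" directories are hypothetical stand-ins for a layer that already contains /data and for the empty layer a linked copy writes into.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
echo "payload" > "$work/my-file"

# Regular COPY analogy: the destination already contains data/ as a directory,
# so the file lands inside it.
mkdir -p "$work/existing/data"
cp "$work/my-file" "$work/existing/data"

# COPY --link analogy: the destination starts out empty,
# so "data" becomes the name of the copied file itself.
mkdir -p "$work/empty"
cp "$work/my-file" "$work/empty/data"

ls -R "$work/existing" "$work/empty"
```

In a Dockerfile, spelling out the full destination (/data/my-file) or ending it with a trailing slash (/data/) should make the intent explicit, so the result does not depend on what earlier layers contain.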
The same consideration applies to symlink resolution. The default COPY behavior automatically resolves destination paths that are symbolic links in the image. With --link, this behavior is not supported because the symbolic link does not exist in the copy’s independent layer.
It is recommended that you start using --link wherever these restrictions do not apply. Adopting this feature speeds up your builds and makes caching more effective. If you cannot remove ambiguous or symlinked destination paths immediately, you can continue to use the existing COPY instruction. These backwards-incompatible changes are the reason --link is an optional flag rather than the new default.
COPY --link is a new Dockerfile feature that can make builds faster and more efficient. Images using linked copies do not need to pull previous layers just so files can be copied into them. Instead, Docker creates a new independent layer for each COPY and then links those layers back into the chain.
You can start using linked copies now when building images with BuildKit and the latest version of Buildx or the Docker CLI. Adopting --link is a new best-practice Docker build step, provided you are not affected by the changes to destination path resolution.