This issue was moved to a discussion.

half-baked idea: Dockerfile support for explicit layer generation #12530

Closed
cgwalters opened this issue Dec 7, 2021 · 6 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@cgwalters
Contributor

Today, Dockerfile is basically the lowest common denominator of the container ecosystem. There are better tools to build container images out there; I'm filing this because it relates to this ostree-container issue, but really I found https://grahamc.com/blog/nix-and-layered-docker-images very inspirational. I honestly hadn't fully internalized the fact that it's possible to build Docker/OCI images in a not-completely-stupid way until seeing that post.

However: one really can't do this from Dockerfile. The core semantic is that each RUN invocation generates a new layer. Choices such as when and how to squash aren't in control of the container.

Here's my strawman proposal: Something like:

OUTPUT /layers

When specified, /layers will already have a pre-populated directory /layers/input/blobs/sha256 with blobs chosen by the build system that do not need to be regenerated by this build. This could be empty, but e.g. a simple useful implementation might pre-populate it with layers from the previous build of the same image tag.

Hmm, and a build system would need some way to name those blobs, so maybe something like /layers/history.json which corresponds to the entries in the config history, so the build system can e.g. insert into its builds something like "bash-5.2.0-3-amd64.deb" or whatever and map from that to the blob sha256.

Then finally, the build system can output any new blobs to /layers/output/blobs/sha256.
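A minimal sketch of how an in-container build step might use this layout, in Python. Everything beyond the paths named above is an assumption made for illustration: `history.json` is treated as a flat JSON object mapping a layer name to a hex sha256 digest (the proposal does not define its format), and `emit_layer` and its `build` callback are hypothetical names.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def emit_layer(layers_root: Path, name: str, build: Callable[[], bytes]) -> tuple[str, bool]:
    """Return (digest, reused) for one named layer.

    Assumed layout (only the paths come from the proposal):
      <layers_root>/history.json          name -> sha256 hex digest (assumed format)
      <layers_root>/input/blobs/sha256/   blobs pre-populated by the build system
      <layers_root>/output/blobs/sha256/  blobs newly generated by this build
    """
    history_path = layers_root / "history.json"
    history = json.loads(history_path.read_text()) if history_path.is_file() else {}

    # Reuse a pre-populated blob when the history names it and it is present.
    known_digest = history.get(name)
    if known_digest and (layers_root / "input" / "blobs" / "sha256" / known_digest).is_file():
        return known_digest, True

    # Otherwise generate the layer and write it to the output directory.
    data = build()
    digest = hashlib.sha256(data).hexdigest()
    out_dir = layers_root / "output" / "blobs" / "sha256"
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / digest).write_bytes(data)
    return digest, False
```

The point of the name lookup is that `build` is never called for a layer the build system already supplied, which is what lets a package-per-layer build skip regenerating unchanged packages.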

Basically, what this would allow is something like:

FROM quay.io/nixos/nix
OUTPUT /layers
COPY . . 
RUN nix build

And we wouldn't even need a multi-stage build here - the in-container build process itself gets to have much more power over the generation of the final image. IOW here the final image wouldn't derive from quay.io/nixos/nix or whatever at all (usually).

@mheon
Member

mheon commented Dec 7, 2021

I'll tag in @nalind and @rhatdan here for opinions. I do recall that we have been reluctant in the past to extend the syntax of Dockerfiles, to prevent a split in supported syntaxes between Podman/Buildah and other tools.

@mheon added the kind/feature label Dec 7, 2021
@rhatdan
Member

rhatdan commented Dec 8, 2021

I am traveling and have not been able to concentrate on this. But this could be something we could extend with Containerfile support as opposed to Dockerfile.

@rhatdan
Member

rhatdan commented Dec 8, 2021

@cgwalters
Contributor Author

There may be a better place to discuss this; totally agree that were we to try this, the benefit is much greater if it's "standardized" by being shared with docker/moby.

This isn't a burning urgent issue or anything. But I just wanted to get it written down in a persistent place so I can link to it and we can potentially revisit it at some point.

@vrothberg
Member

I have a hard time wrapping my head around it. My first impression is that multi-stage builds allow for selectively copying data from other images/layers.

@flouthoc
Collaborator

flouthoc commented Dec 8, 2021

@cgwalters Please correct me if I missed or misinterpreted anything. I just have a few doubts.

What would the exact behavior of OUTPUT /layers be? Would it commit a new layer every time it is invoked? And how would caching work with this?

I might be misinterpreting this, but don't multi-stage builds already let users generate layers per stage? Layers from those stages can also be reused by other builds.

So how is this fundamentally different from an explicit FROM?

Choices such as when and how to squash aren't in control of the container.

Do we need something like
SQUASH as <squash-nickname> in a Containerfile, which would let users squash all already-built stages or layers into one?
The squashed layer could then be reused in the Containerfile with something like COPY --from=<squash-nickname>
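To make "squash into one" concrete, here is a rough sketch of the merge semantics in Python, assuming each layer is modeled as an in-memory mapping of path to content. It follows the OCI convention that a file named `.wh.<name>` deletes `<name>` from lower layers; opaque-directory markers are not handled. `squash` is a hypothetical helper for illustration, not how Buildah or any SQUASH instruction is implemented.

```python
WHITEOUT_PREFIX = ".wh."


def squash(layers: list[dict[str, bytes]]) -> dict[str, bytes]:
    """Merge layers (lowest first) into a single path -> content mapping."""
    merged: dict[str, bytes] = {}
    for layer in layers:
        # Pass 1: apply this layer's whiteouts to everything merged so far.
        for path in layer:
            parent, _, base = path.rpartition("/")
            if not base.startswith(WHITEOUT_PREFIX):
                continue
            target = base[len(WHITEOUT_PREFIX):]
            victim = f"{parent}/{target}" if parent else target
            for existing in list(merged):
                # Remove the file itself, or everything under a removed directory.
                if existing == victim or existing.startswith(victim + "/"):
                    del merged[existing]
        # Pass 2: add or overwrite the regular files from this layer.
        for path, content in layer.items():
            if not path.rpartition("/")[2].startswith(WHITEOUT_PREFIX):
                merged[path] = content
    return merged


layers = [
    {"etc/a": b"1", "etc/b": b"2"},
    {"etc/.wh.a": b"", "etc/c": b"3"},  # deletes etc/a, adds etc/c
    {"etc/b": b"4"},                    # overwrites etc/b
]
result = squash(layers)
```

Here `result` contains only `etc/b` (with the newer content) and `etc/c`; the deleted `etc/a` leaves no trace in the squashed output.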

@containers locked and limited conversation to collaborators Dec 15, 2021
@vrothberg converted this issue into discussion #12605 Dec 15, 2021
