Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: Deprecate and disrecommend Platform's os.features key. #1147

Closed
TBBle opened this issue Nov 2, 2023 · 14 comments
Closed

Suggestion: Deprecate and disrecommend Platform's os.features key. #1147

TBBle opened this issue Nov 2, 2023 · 14 comments

Comments

@TBBle
Copy link

TBBle commented Nov 2, 2023

The platform object (in both Image Index and Config) contains a string-array property os.features, which is currently defined only for Windows, only for one possible value (plus empty), and only as a SHOULD.

This field describes a requirement of the Host OS for the container to run, per #632 (comment) and the spec documentation.

However, it's become both not-useful and not-usable.

The specific requirement was for containers that could not run on a Windows Nano Server host, due to absent win32k.sys. However, in 2017, Windows Nano Server lost the ability to be run as anything except a container base image, and no Nano Server edition before that change has been in a supported state since 2019.

So there's been no supported platforms where this check is relevant for about 3 years.

As mentioned at #632 (comment), Microsoft Windows base images served from mcr.microsoft.com, except Nano Server, are marked with the win32k feature, even today.

However, neither containerd (windows-specific, generic) nor moby can use os.features to select an image from an image index, or ensure an image will be runnable. (containerd also does not advertise the feature in its own platform definition, so if it was filtering as specified, it would be getting the wrong result). This was noted as the case in Docker in 2017, and presumably never changed.

This lack of support also turns out to be irrelevant because it seems that the Docker legacy builder (the only supported Windows Container image builder AFAIK) does not propagate this field into derived images, and so images on the Internet that should have this annotation do not. (Or I did something wrong in my research. I haven't tracked down specific code in Docker that leads to this result, and os.version is correctly propagated.)

Further, examination of a couple of projects (PowerShell-Docker and DotNet-Docker) which are:

  • MS Supported, i.e. should be best-practice
  • Generate images based on both Server Core and Nanoserver base image
  • Generate an Image Index

showed that none of the images contained the win32k feature, and also showed no attempt to include both Nanoserver- and Server Core-based images in the same index. These two projects have fallen on opposite sides of which one to make default, as it happens.

So my suggestion is move the spec to reflect the status quo:

  • Deprecate os.features in both places it appears
  • Recommend image builds may either ignore or propagate os.features into derived images. They should not provide a mechanism to modify os.features in any other way, e.g., platform overrides should not expose os.features for overriding.
    • BuildKit might turn out to propagate this by accident. We're still getting the final rough edges sorted, and this showed up during that process.
  • Container runtimes SHOULD ignore os.features, and not use it for purposes of selecting, ranking, or filtering images from an image index.
    • Putting images in an image index that differ only in os.features is a bad idea, and won't work correctly today, so we should standardese-discourage that.

I suggest we deprecate (I'd say remove, but I suspect that's hard for back-compat reasons) this field completely because, even if a future need for this appears, the field name has become inappropriate, as it describes a requirement on the Host OS, but sits next to os.version, which describes the Container's OS. At the time these fields were introduced, that distinction was minor (basically equality-matching versions), but improvements in the Windows Container ecosystem have complicated the use of os.version.

Because I'm aiming to reflect the status quo here, such a change to the spec is not urgent for 1.1.0; if it's considered a good idea, then various implementations need only not start supporting it and they're compliant. (This is relevant right now because containerd probably needs to start exposing os.version in its stringified Platfom object representation, and a simple approach would probably also expose os.features for consistency)

@tianon
Copy link
Member

tianon commented Nov 2, 2023

I obviously can't speak for all the maintainers, but I'm +1 (image spec maintainer, maintainer of windows images, maintainer of moby, to name some relevant angles I'm agreeing from).

@tianon
Copy link
Member

tianon commented Nov 2, 2023

(there's probably a pretty strong argument to be made about the similar non-OS features field too)

@sudo-bmitch
Copy link
Contributor

I would have been a +1 on this before, but now that the image compatibility working group has started, my preference is to wait for them to finish. Knowing that this field is currently unused is very helpful, so thank you for doing the research and providing the context. If the working group does not need this field in their output, then I'll gladly add my own +1 to this.

@jprendes
Copy link

jprendes commented Nov 2, 2023

We are also looking into using the os.features field for WebAssembly containers.

This is very much WIP, so I would also prefer to wait until that work is completed.

@jsturtevant
Copy link

Agree it isn't used on the Windows side. I don't know of any plans to use it either @kevpar @kiashok for awareness

From WASI perspective, the idea would be use it to flag WIT Worlds a given runtime supports or other specific features that might be runtime specific. Nothing has been done with this yet but it is something being considering, so open to other ideas and approaches.

I will also join the image compatibility wg to gain insight and add some WASI perspective as it sounds similar to what we are looking at

@TBBle
Copy link
Author

TBBle commented Nov 3, 2023

Yeah, I wasn't aware of the image compatibility working group but it sounds like they're looking to address this question in a more general way (which is nice) and from a quick skim of those docs, seems like it'd replace os.features and perhaps even os.version with something more-general and more-defined, i.e. documented compatibility rules.

Although I'm not clear yet how it is expected to work with image indexes, unless the puller is going to be expected to check each possibly-compatible image's referenced compat-spec to see if it's valid where right now that information is (shoehorned...) into Platform.

Anyway, this does cross over into their space, so I'm happy to wait for that work to progress. I think that's a good enough basis to argue "Don't start adding os.features support to runtimes/builders" in the short-term, which is my most-immediate concern.

I do note that matching a runtime's WIT Worlds does seem like a clear example of what this field was actually intended for, although matching rules might be different. There'd need to be a bunch more work to deliver that flow though, since (for example) I don't recall a path for a containerd runtime v2 shim to deliver back to containerd any details about the platforms it supports (but I could be wrong... plugins can announce their platforms, I now recall). There'd be benefit for WASI Wit Worlds and Windows if containerd could ask the shim "Will this image work" and "Which image should I pull from this index" rather than having to maintain those rules inside containerd as we do now, particularly since for Windows the answer varies by shim configuration.

Hopefully whatever the Image Compatibility OCI Working Group delivers will replace/fulfill that need.

@TBBle
Copy link
Author

TBBle commented Nov 3, 2023

Closing (for now) as duplicate of Image Compatibility OCI Working Group.

@TBBle TBBle closed this as not planned Won't fix, can't repro, duplicate, stale Nov 3, 2023
@tianon
Copy link
Member

tianon commented Nov 3, 2023

I bet they'd really appreciate your participation, especially as someone really familiar with Windows containers inner workings 👀

@TBBle
Copy link
Author

TBBle commented Nov 3, 2023

Yeah, I was considering that. My time availability varies, so I'm not sure if I'll join the Slack chat or not yet. I did notice that MS wasn't marked as a participant, but Windows did make it into scope at least, so I'm definitely thinking about how I can best-contribute to this effort.

@TBBle
Copy link
Author

TBBle commented Mar 15, 2024

In containerd/platforms#6 (comment) it was pointed out that https://github.com/unikraft/kraftkit is using the osversion field to carry a record of Linux Kernel config options. I'm not sure if they're doing anything else to keep that use-case separate from the wild world (since they aren't containers) apart from using their own registry, but this seems like something that the Image Compatibility OCI Working Group proposals should address, but the currently active proposals (B&D per meeting minutes) both punt "runtime image selection using image index metadata" as out-of-scope, and are looking at stuffing all that stuff into registry-hosted artifacts using the Referrers system.

Anyway, I thought I'd note it here for if-and-when we come back to this.

@ChaoyiHuang
Copy link

In containerd/platforms#6 (comment) it was pointed out that https://github.com/unikraft/kraftkit is using the osversion field to carry a record of Linux Kernel config options. I'm not sure if they're doing anything else to keep that use-case separate from the wild world (since they aren't containers) apart from using their own registry, but this seems like something that the Image Compatibility OCI Working Group proposals should address, but the currently active proposals (B&D per meeting minutes) both punt "runtime image selection using image index metadata" as out-of-scope, and are looking at stuffing all that stuff into registry-hosted artifacts using the Referrers system.

Anyway, I thought I'd note it here for if-and-when we come back to this.

already sent a message to Unikraft community, and hope that someone will join the discussion.

considering the image compatibility, "runtime image selection using image index metadata" is in fact included in the group's scope, please follow recent proposals.

@TBBle
Copy link
Author

TBBle commented Apr 7, 2024

considering the image compatibility, "runtime image selection using image index metadata" is in fact included in the group's scope, please follow recent proposals.

The current proposals is where I got this as being out-of-scope. Proposal B explicitly states "Image selection by runtimes is out of scope for this proposal" and both it and Proposal D have not marked either of the use-cases involving image indexes as addressed.

It's definitely within the group's scope, but both active proposals (as far as I can see) have left it out-of-scope for now. I assume there may be future proposals addressing what (if anything) is done to image indexes for these two use-cases, but at this time I have no visibility on what that might be.

@ChaoyiHuang
Copy link

please refer to the PR Add proposal E and Proposal H - compatibilities in flat format, these PRs have not been merged yet

@nderjung
Copy link

nderjung commented Apr 8, 2024

In containerd/platforms#6 (comment) it was pointed out that https://github.com/unikraft/kraftkit is using the osversion field to carry a record of Linux Kernel config options. I'm not sure if they're doing anything else to keep that use-case separate from the wild world (since they aren't containers) apart from using their own registry, but this seems like something that the Image Compatibility OCI Working Group proposals should address, but the currently active proposals (B&D per meeting minutes) both punt "runtime image selection using image index metadata" as out-of-scope, and are looking at stuffing all that stuff into registry-hosted artifacts using the Referrers system.

Anyway, I thought I'd note it here for if-and-when we come back to this.

Hi folks, I'm Alex and one of the main authors and a maintainer of KraftKit over at Unikraft. Thank you to @ChaoyiHuang for notifying me about this on-going discussion. Maybe I can give some context as to how we're using the OCI spec today and the problems we ran into.

For context, at Unikraft, we build highly specialized VM kernel images which are then bundled as part of the OCI Manifest along with the application. There are essentially two ways to construct a unikernel: (1) the application is compiled against library OS that is Unikraft and the final artifact is a single kernel binary image (i.e. the OCI Manifest has only one entry: the kernel) or (2) the application is external and is loaded into a slimmer unikernel binary through a filesystem (at least two entries). In both cases, the unikernel is highly specialized. Since Unikraft's build system allows for such specialization, we go beyond compiling a single application against just different architectures: we also vary the hypervisor, the VMM (e.g. QEMU or Firecracker), as well as internal properties to Unikraft (everything from having more debug logs, to changing the kernel scheduler, memory allocator, network stack, its properties, etc.), and so on. This means that for a single application there can be many final binary permutations.

When it came to packaging and distributing these artifacts, the OCI specification made a lot of sense thanks to being able to specify arch, os and os.features within the context of an OCI Index. In the latter two attributes os and os.features, we set the VMM or hypervisor and then all the configuration which is used to construct the unikernel, respectively. This makes it possible in an OCI Index to have images of the same canonical name, e.g. nginx:latest, but vary many attributes and therefore binary permutations.

Ultimately we propagate this back to the user at the client-side, and our toolchain selects the relevant image appropriately from the OCI Index based on their host environment as well as any desired properties of the unikernel (e.g. whether a specific os.feature is enabled or not.)

Here is a sample image, as part of a manifest:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:182e131937a07305607faae984e17ae7ff7fabaaecf4e1e7cd8c81f995838c1d",
    "size": 9401,
    "platform": {
      "architecture": "x86_64",
      "os": "qemu",
      "os.version": "0.16.3~21bf34c",
      "os.features": [
        "CONFIG_ARCH_X86_64=y",
        "CONFIG_CPU_EXCEPT_STACK_SIZE_PAGE_ORDER=4",
        "CONFIG_CROSS_COMPILE=",
        "CONFIG_DEBUG_SYMBOLS_LVL3=y",
        "CONFIG_HAVE_APIC=y",
        "CONFIG_HAVE_BOOTENTRY=y",
        "CONFIG_HAVE_INTCTLR=y",
        "CONFIG_HAVE_LIBC=y",
        "CONFIG_HAVE_MMIO=y",
        // [...]

A few additional notes:

  • We originally set os to the VMM/hypervisor as opposed to simply "unikraft" because we needed a clean and fast way of being able to specify the VMM/hypervisor to make this lookup. Of course, this value is incorrect because it is not an OS but since other implementations would select based on the actual OS, populating this value with the VMM meant we would not run into any conflicts. At the same time, there was no equivalent field.
  • Right now the client does most of the heavy lifting of calculating whether the provided set of os.features is compatible with the remote set of os.features (see example). Preferentially this would be part of the OCI Distribution Spec, where a query for such features is possible.
  • Finally, not sure whether this thread is is the correct place for this discussion but along the same lines, we had significant issues injecting our own mediaType within the manifest. In an ideal world we could use our own mediaType to represent the unikernel binary image without having to touch other elements in the manifest. Right now we are forced (largely due to container runtimes such as containerd which do not pull artifacts of unrecognized mediaTypes) to turn the unikernel binary into a traditional layer. It does not make sense to distribute it this way since it is external to the rootfs and always requires a heavy re-packaging process to place the unikernel as a file in a filesystem when it is in fact external to it. Instead, we'd prefer if we could simply inject a new entry into the OCI Manifest and be able to query Indexes and Manifests based on the mediaType entries and their metadata attributes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants