layer: clarify wording around applying changesets #317

Merged
merged 2 commits into from
Oct 20, 2016
24 changes: 11 additions & 13 deletions layer.md
@@ -204,31 +204,29 @@ The resulting tar archive for `rootfs-c9d-v1.s1` has the following entries:
./etc/.wh.my-app-config
```

Where the basename name of `./etc/my-app-config` is now prefixed with `.wh.`, and will therefore be removed when the changeset is applied.
To signify that the resource `./etc/my-app-config` MUST be removed when the changeset is applied, the basename of the entry is prefixed with `.wh.`.

## Applying
## Applying Changesets

Layer Changesets of [mediatype](./media-types.md) `application/vnd.oci.image.layer.v1.tar+gzip` are applied rather than strictly extracted in normal fashion for tar archives.
Layer Changesets of [mediatype](./media-types.md) `application/vnd.oci.image.layer.v1.tar+gzip` are _applied_, rather than simply extracted as tar archives.

Applying a layer changeset requires consideration for the [whiteout](#whiteouts) files.
In the absence of any [whiteout](#whiteouts) files in a layer changeset, the archive is extracted like a regular tar archive.
Applying a layer changeset requires special consideration for the [whiteout](#whiteouts) files.

In the absence of any [whiteout](#whiteouts) files in a layer changeset, the archive is extracted like a regular tar archive.

### Changeset over existing files

This section covers applying an entry in a layer changeset, if the file path already exists.
This section specifies applying an entry from a layer changeset if the target path already exists.

If the file path is a directory, then the existing path just has it's attribute set from the layer changeset for that filepath.
If the file path is any other file type (regular file, FIFO, etc), then the:
* file path is unlinked (See [`unlink(2)`](http://linux.die.net/man/2/unlink))
* create the file
* If a regular file then content written.
* set attributes on the filepath
If the entry and the existing path are both directories, then the existing path's attributes MUST be replaced by those of the entry in the changeset.
In all other cases, the implementation MUST do the semantic equivalent of the following:
- removing the file path (e.g. [`unlink(2)`](http://linux.die.net/man/2/unlink) on Linux systems)
Contributor

“removing the file” → “(recursively) removing the file”? Most of the discussion in the GNU docs you linked is for “folks would be mad if we blew away a full tree and replaced it with a broken symlink”. That makes sense for working filesystems, but we're building the rootfs from scratch here, so losing information is not a concern. However, you don't want folks blindly using unlink, having it fail because the target is a directory and either dying or leaving it in place. We do want them to get that old thing off the filesystem (whatever it takes ;), and the “(recursively)” hint may be enough to get that across.

Or maybe it's sufficient as it stands :p.

Contributor Author

I'm open to adding "recursively" as long as that is actually what is expected here - @vbatts @stevvooe ?

Contributor

This cannot act recursively. It must only act on the immediate resource. For example, a link should not be followed.

There are some ordering requirements for this to work correctly. Let's take the following directories from the applying layer:

/a/b
/a/b/c

/a/b should probably come before /a/b/c for this to work correctly. If /a/b/d is encountered, it needs to be left alone, unless the layer above specifies a removal.

Contributor Author

@stevvooe what is supposed to happen if /a/b/ is a directory and a changeset contains /a/b as a file?

Contributor

I suppose I missed the distinction here. I think we need to be clear that this is recursive only when the lstat (shallow) type of the resource is dir.

If /a/b/... exists, a non-directory entry /a/b would cause removal, recursively. Further entries prefixed with an already encountered, non-directory entry, would be ignored or cause an error.

If /a/b/ exists and new directory entries /a/b/... are encountered, the result will be a union of the two directory trees, unless whiteouts apply.

Contributor

> I think we need to be clear that this is recursive only when the lstat (shallow) type of the resource is dir.

No, it's only not a recursive delete when both the existing entry (lstat) and tar entry are dirs. See the implementation in opencontainers/image-tools#42 for machine-readable phrasing.

> If /a/b/… exists, a non-directory entry /a/b would cause removal, recursively.

Yes.

> Further entries prefixed with an already encountered, non-directory entry, would be ignored or cause an error.

Already-encountered-ness has nothing to do with it. I think we should just unpack as we read through the tarball. Only the current .wh.* handling jumps the tar order, and the remove-before-unpacking behavior is for non-whiteout entries.

> If /a/b/ exists and new directory entries /a/b/… are encountered, the result will be a union of the two directory trees, unless whiteouts apply.

This is true even without the “directory” limit on /a/b/…. If the filesystem has /a/b/c and an /a/b/not-c entry is found in the tarball, then the result will have both /a/b/c and /a/b/not-c. If an /a/b entry is found in the tarball, then it's just a metadata clobber (the result will still have /a/b/c).

- recreating the file path, based on the contents and attributes of the changeset entry
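
For illustration only, the remove-then-recreate rule above, together with the both-directories case, could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the helper name `apply_entry` is hypothetical, only directories and regular files are handled, the mode bits stand in for "attributes", parent directories are assumed to exist already, and paths containing `..` are not sanitized. Removing an existing directory together with its contents when a non-directory entry replaces it follows the reading given in the review thread, not explicit spec text.

```python
import os
import shutil
import stat
import tarfile


def apply_entry(tar, member, rootfs):
    """Apply one non-whiteout tar entry on top of an existing rootfs tree.

    Hypothetical helper: covers only directories and regular files and
    treats the mode bits as the only attributes.
    """
    # Keep the entry inside rootfs even if its name is absolute.
    target = os.path.join(rootfs, member.name.lstrip("/"))

    try:
        # lstat() inspects the path itself and does not follow symlinks.
        existing = os.lstat(target)
    except FileNotFoundError:
        existing = None

    if existing is not None:
        if member.isdir() and stat.S_ISDIR(existing.st_mode):
            # Both are directories: keep the children, replace attributes.
            os.chmod(target, member.mode)
            return
        if stat.S_ISDIR(existing.st_mode):
            # Assumption from the review thread: a directory replaced by a
            # non-directory entry is removed together with its contents.
            shutil.rmtree(target)
        else:
            # Removes the link itself, never the file it may point to.
            os.unlink(target)

    # Recreate the path from the changeset entry.
    if member.isdir():
        os.mkdir(target)
    elif member.isreg():
        with tar.extractfile(member) as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
    else:
        raise NotImplementedError("entry type not covered by this sketch")
    os.chmod(target, member.mode)
```

Calling `apply_entry` for each member in archive order gives the union behavior described in the thread: an existing `/a/b/c` survives a directory entry for `/a/b` and a new entry `/a/b/not-c`, while a regular-file entry for `/a/b` replaces the whole directory.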

## Whiteouts

A whiteout file is an empty file with a special filename that signifies a path should be deleted.
A whiteout filename consists of the prefix .wh. plus the basename of the path to be deleted.
A whiteout filename consists of the prefix `.wh.` plus the basename of the path to be deleted.
As files prefixed with `.wh.` are special whiteout markers, it is not possible to create a filesystem which has a file or directory with a name beginning with `.wh.`.

Once a whiteout is applied, the whiteout itself MUST also be hidden.
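
As a rough illustration of the naming rule above, an implementation might map a whiteout entry back to the path it deletes along these lines. The function name `whiteout_target` and the constant `WHITEOUT_PREFIX` are hypothetical, and the sketch covers only the plain `.wh.` + basename form described here.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."


def whiteout_target(entry_path):
    """Return the path a whiteout entry deletes, or None for ordinary entries."""
    # Tar entry names use "/" separators, so posixpath is used on every OS.
    dirname, basename = posixpath.split(entry_path)
    if not basename.startswith(WHITEOUT_PREFIX):
        return None
    # Strip the prefix from the basename only; the directory part is kept.
    return posixpath.join(dirname, basename[len(WHITEOUT_PREFIX):])
```

With the example from earlier in the file, `whiteout_target("./etc/.wh.my-app-config")` yields `./etc/my-app-config`, and an ordinary entry such as `./etc/my-app-config` yields `None`.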