entrypoint-aws-batch: Keep ../ path parts in ZIP archive members during extraction #241
Merged
b759049
to
4630445
Compare
…ng extraction

The default of stripping ../ parts in member paths is a (good!) restriction for safety and security, but such paths do not pose any (additional) risk in the context of our Nextstrain runtime containers. We're already downloading and executing arbitrary user-supplied code, so the ability to potentially overwrite system files with ZIP archive members is not any additional privilege. And it's only potential at that due to most files being owned by root in the image, not the default container user of nextstrain.

Keeping the ../ parts will allow Nextstrain CLI to construct ZIP archives for jobs which write to new sibling paths of /nextstrain/build in the container. This will be used for including pathogen workflow source separate (e.g. in /nextstrain/pathogen) from the analysis working directory (/nextstrain/build). It can also be used to support Nextstrain CLI's existing --augur, --auspice, etc. overlays on AWS Batch, though a few other changes are required for that too (coming soon).

Note that Nextstrain CLI does *not* permit ../ path parts when extracting from these same ZIP archives (e.g. after a job completes to download results), as that *would* be additional risk. Currently it strips ../ parts, like unzip's default behaviour, but that will change soon to entirely skip archive members containing ../ parts.
4630445
to
e05ddfb
Compare
…archive extraction

This allows Nextstrain CLI's --augur, --auspice, etc. overlays to start working with AWS Batch when previously they did not, by bundling them up with appropriate ../ path parts into the workdir ZIP archive.

See "entrypoint-aws-batch: Keep ../ path parts in ZIP archive members during extraction" (e05ddfb) for the rationale of why this is not particularly unsafe.
This can (and should) be merged ahead of Nextstrain CLI's usage of |
jameshadfield
approved these changes
Feb 13, 2025
Didn't test, but the concept seems sensible!
tsibley
added a commit
to nextstrain/cli
that referenced
this pull request
Mar 4, 2025
The latest image, nextstrain/base:build-20250304T041009Z, provides a mechanism in the entrypoint to support bundling of overlays in the workdir ZIP archive.¹

Extending overlay support to AWS Batch has been very low priority and something I thought was unlikely to ever happen. However, in the course of working on AWS Batch support for `nextstrain run`, it turned out to be easiest/most straightforward/most minimal changes to bundle the pathogen source directory with the working analysis directory in the workdir ZIP archive, i.e. as a "pathogen" overlay. This naturally led to supporting overlays more generally, which I've done here.

¹ <nextstrain/docker-base#241>
tsibley
added a commit
to nextstrain/cli
that referenced
this pull request
Mar 5, 2025
The latest image, nextstrain/base:build-20250304T041009Z, provides a mechanism in the entrypoint to support bundling of overlays in the workdir ZIP archive by way of upwards-traversing archive member paths.¹ For example, an Augur overlay is bundled into the workdir ZIP archive with member paths starting with ../augur/ and ends up overwriting files in the image's /nextstrain/augur/ since the AWS Batch workdir is always /nextstrain/build/.

Extending overlay support to AWS Batch has been very low priority and something I thought was unlikely to ever happen. However, in the course of working on AWS Batch support for `nextstrain run`, it turned out to be easiest/most straightforward/most minimal changes to bundle the pathogen source directory with the working analysis directory in the workdir ZIP archive, i.e. as a "pathogen" overlay. This naturally led to supporting overlays more generally, which I've done here.

One caveat compared to overlays in runtimes with the concept of volume mounts (Docker, Singularity) is that any files in the image that do not exist in the overlaid files will remain present since nothing removes them. This is potentially problematic and will be annoying if run into but most of the time should be a non-issue. It is also solvable if we care to exert the effort and extra code to do so. I don't right now.

¹ <nextstrain/docker-base#241>
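To make the caveat above concrete, here is a minimal Python sketch (not the entrypoint's actual code, which is not shown in this conversation). It simulates the image layout inside a temporary directory, extracts an archive while keeping ../ parts, and shows that a file absent from the overlay is left in place. Python's `zipfile` strips `..` components in `extract()`, so the sketch writes members out by hand. The names `extract_keeping_parent_parts`, `demo`, `kept.py`, and `removed_upstream.py` are hypothetical.

```python
import io
import os
import tempfile
import zipfile


def extract_keeping_parent_parts(archive, workdir):
    """Write each member relative to workdir WITHOUT stripping ../ parts.

    Only reasonable where the archive is already trusted to run arbitrary
    code, as argued in the PR description. Do not use on untrusted input.
    """
    written = []
    for info in archive.infolist():
        if info.is_dir():
            continue
        # "../augur/x" relative to <root>/build resolves to <root>/augur/x.
        target = os.path.normpath(os.path.join(workdir, info.filename))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with archive.open(info) as src, open(target, "wb") as dst:
            dst.write(src.read())
        written.append(target)
    return written


def demo():
    with tempfile.TemporaryDirectory() as root:
        # Stand-ins for /nextstrain/build and /nextstrain/augur in the image.
        workdir = os.path.join(root, "build")
        augur = os.path.join(root, "augur")
        os.makedirs(workdir)
        os.makedirs(augur)
        for name in ("kept.py", "removed_upstream.py"):
            with open(os.path.join(augur, name), "w") as f:
                f.write("image version\n")

        # A workdir archive whose overlay only contains kept.py.
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w") as archive:
            archive.writestr("Snakefile", "rule all:\n")
            archive.writestr("../augur/kept.py", "overlay version\n")

        with zipfile.ZipFile(buffer) as archive:
            extract_keeping_parent_parts(archive, workdir)

        with open(os.path.join(augur, "kept.py")) as f:
            content = f.read()
        return sorted(os.listdir(augur)), content
```

Calling `demo()` returns the listing of the simulated augur directory and the contents of `kept.py`: the overlaid file is overwritten, while `removed_upstream.py` is still there because nothing deletes it. A volume mount would instead hide the image's directory entirely.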
tsibley
added a commit
to nextstrain/cli
that referenced
this pull request
Mar 5, 2025
tsibley
added a commit
to nextstrain/cli
that referenced
this pull request
Mar 11, 2025
entrypoint-aws-batch: Keep ../ path parts in ZIP archive members during extraction
The default of stripping ../ parts in member paths is a (good!) restriction for safety and security, but such paths do not pose any (additional) risk in the context of our Nextstrain runtime containers. We're already downloading and executing arbitrary user-supplied code, so the ability to potentially overwrite system files with ZIP archive members is not any additional privilege. And it's only potential at that due to most files being owned by root in the image, not the default container user of nextstrain.
Keeping the ../ parts will allow Nextstrain CLI to construct ZIP archives for jobs which write to new sibling paths of /nextstrain/build in the container. This will be used for including pathogen workflow source separate (e.g. in /nextstrain/pathogen) from the analysis working directory (/nextstrain/build). It can also be used to support Nextstrain CLI's existing --augur, --auspice, etc. overlays on AWS Batch, though a few other changes are required for that too (coming soon).
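As a rough illustration of the archive layout this enables (a sketch only, not Nextstrain CLI's actual implementation), the following uses Python's standard `zipfile` module to put the working directory's files at the archive root and each sibling directory under a `../<name>/` prefix. The function name `build_workdir_archive` and its arguments are hypothetical.

```python
import io
import zipfile


def build_workdir_archive(build_files, siblings):
    """Return ZIP bytes for a job.

    build_files: {relative path: bytes}, destined for the workdir
                 (/nextstrain/build in the container).
    siblings:    {sibling name: {relative path: bytes}}, destined for
                 sibling directories such as /nextstrain/pathogen.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, data in build_files.items():
            archive.writestr(path, data)
        for name, files in siblings.items():
            for path, data in files.items():
                # writestr() stores the member name as given, so the
                # upwards-traversing ../ prefix is kept in the archive.
                archive.writestr(f"../{name}/{path}", data)
    return buffer.getvalue()
```

For example, `build_workdir_archive({"config.yaml": b"..."}, {"pathogen": {"Snakefile": b"..."}})` yields an archive with members `config.yaml` and `../pathogen/Snakefile`. Whether the second member lands outside the workdir depends entirely on the extractor keeping the ../ parts.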
Note that Nextstrain CLI does not permit ../ path parts when extracting from these same ZIP archives (e.g. after a job completes to download results), as that would be additional risk. Currently it strips ../ parts, like unzip's default behaviour, but that will change soon to entirely skip archive members containing ../ parts.
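The planned download-side behaviour (skip members containing ../ parts rather than strip them) could look roughly like the sketch below. This is an assumption about the shape of such a check, not code from Nextstrain CLI; `has_parent_parts`, `extract_results`, and `demo` are hypothetical names.

```python
import io
import os
import tempfile
import zipfile
from pathlib import PurePosixPath


def has_parent_parts(member_name):
    """True if any component of the ZIP member path is '..'."""
    return ".." in PurePosixPath(member_name).parts


def extract_results(archive_file, dest):
    """Extract into dest, skipping (not rewriting) members with ../ parts."""
    extracted = []
    with zipfile.ZipFile(archive_file) as archive:
        for info in archive.infolist():
            if has_parent_parts(info.filename):
                continue  # skip entirely instead of stripping the ../ parts
            archive.extract(info, dest)
            extracted.append(info.filename)
    return extracted


def demo():
    # An archive with one ordinary result and one upwards-traversing member.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("results/tree.nwk", "(A,B);\n")
        archive.writestr("../augur/injected.py", "print('unexpected')\n")
    with tempfile.TemporaryDirectory() as dest:
        extracted = extract_results(buffer, dest)
        return extracted, sorted(os.listdir(dest))
```

Here `demo()` extracts only `results/tree.nwk`. With stripping, the second member would instead have been written as `augur/injected.py` inside the destination, which is the behaviour the description says will change.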