Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #2439 (COPY with Heredocs eats quotes) #2442

Merged
merged 3 commits into from
Jan 31, 2022

Conversation

jedevc
Copy link
Member

@jedevc jedevc commented Nov 1, 2021

Woops 😄

From the commit message:

In the contents of COPY/ADD, we perform expansion on variables using a lexer. However, this lexer, by default, removes quotes as well expanding variables - this isn't really the kind of behavior we're after, as it feels quite unintuitive.

To fix this, we introduce a new ExpandRaw function, which commands can implement that implement an alternative expansion that preserves quotes (and possibly other characters/features in the future).

Additionally, we introduce new tests to more clearly define the behavior. One major note is that backslashes are not passed and are processed, following normal escape rules (so that we can use $ symbols directly).

This fixes #2439 but does also leave backslashes in COPY not working quite as one might expect - I think this can probably be fixed with some documentation, happy to patch that if needed. Additionally, users can get "raw" content by quoting the heredoc to prevent expansion, i.e. <<"EOF".

@jedevc jedevc changed the title Fix #2439 Fix #2439 (COPY with Heredocs eats quotes) Nov 1, 2021
@jedevc jedevc requested a review from tonistiigi November 1, 2021 20:50
@jedevc jedevc added the kind/bug label Nov 1, 2021
Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this done for the other commands as well? Where does it break in these cases?

@jedevc
Copy link
Member Author

jedevc commented Nov 3, 2021

It shouldn't cause any problems in the RUN command - because with those, we leave those alone completely until container runtime, so they're not expanded at all by buildkit, but by the native shell in whatever container image is being built.

COPY/ADD are special, because we do some expansion at build, specifically doing environment variable replacement in the file contents. However, since we use the Lex type for that, by default it strips quotes in processing words, which is why they get removed.

This fix is pretty much an elaborate "don't remove those quotes for heredocs" - we do still want to remove them for all other commands though, preserving the behaviour we had before, which this fix does.

@tonistiigi
Copy link
Member

Can you give me an example where the old behavior is preferred for COPY/ADD (outside heredocs). When you use quotes in path?

The expansion we do just for display in progress bar would be better with quotes kept I think? Eg. RUN ./myapp "foo $bar" baz

FYI @thaJeztah

@jedevc
Copy link
Member Author

jedevc commented Nov 4, 2021

Of course 🎉

ENV foo="bar"

Currently, this sets the variable foo to the contents bar - choosing not to strip quotes like this in Expand would instead set foo to "bar".

@thaJeztah
Copy link
Member

choosing not to strip quotes like this in Expand would instead set foo to "bar".

Ah, hm, yes, that would be problematic. Originally that was the behaviour (anything after = would be treated as value, including quotes (if any)), but this confused users, so support for stripping quotes was added at some point (long time ago though, in the classic builder)

@tonistiigi tonistiigi added this to the v0.10.0 milestone Nov 8, 2021
@jedevc
Copy link
Member Author

jedevc commented Nov 17, 2021

Is there anything left blocking this being merged? Let me know if there's still anything missing 👍

EOF

COPY <<EOF slashfile1
\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have a case with <<'EOF' as well to verify that \ is retained in that case. If these cases already exist in other tests then comment pointing to these cases from here would do.

@thaJeztah
Copy link
Member

thaJeztah commented Jan 28, 2022

Ah, I think I wanted to compare the behavior with doing the equivalent in a shell. Here's doing so:

No quotes (plain EOF)

export world=world

cat > myfile <<EOF
\
One
\\
Two
\
Hello "$world"
Hello '$world'
\$
EOF

cat myfile
One
\
Two
Hello "world"
Hello 'world'
$

Single quotes ('EOF')

cat > myfile <<'EOF'
\
One
\\
Two
\
Hello "$world"
Hello '$world'
\$
EOF

cat myfile
\
One
\\
Two
\
Hello "$world"
Hello '$world'
\$

Double quotes ("EOF")

cat > myfile <<"EOF"
\
One
\\
Two
\
Hello "$world"
Hello '$world'
\$
EOF

\
One
\\
Two
\
Hello "$world"
Hello '$world'
\$

(edit: added \$ and single quotes within the here doc)

@thaJeztah
Copy link
Member

I agree with @tonistiigi #2442 (comment) - it would make sense to verify both the "plain" EOF and the quoted variant(s); perhaps something along the cases I wrote above

@jedevc jedevc force-pushed the heredoc-copy-symbols branch 3 times, most recently from c43276f to 41e0564 Compare January 31, 2022 08:09
@jedevc
Copy link
Member Author

jedevc commented Jan 31, 2022

Have taken the test cases from @thaJeztah's comment and split them up across a couple tests - I've put the backslashes and dollars and such into testCopyHeredocSpecialSymbols and the quotes related ones into testHeredocVarSubstitution.

Everything seems ok to me, except for the case:

COPY <<EOF /dest/q1
Hello '${name}'!
EOF

For this, we were producing the wrong result, and not expanding ${name} - this is because POSIX treats the whole heredoc as one double-quoted word. So, to achieve the same results, I've added a new property to the lexer which will do the same thing by ignoring and passing over quotes entirely - then we can use this for the ExpandRaw function.

@jedevc jedevc mentioned this pull request Jan 31, 2022
@jedevc
Copy link
Member Author

jedevc commented Jan 31, 2022

Not quite sure what the test failure is - doesn't look to me like anything that should've broken? I also can't seem to reproduce it.

@crazy-max
Copy link
Member

Not quite sure what the test failure is - doesn't look to me like anything that should've broken? I also can't seem to reproduce it.

Yeah that's a flaky one. I re-run the workflow.

Previously, we'd only write raw escapes out as words, and not out to the
result. We weren't using or relying on this behaviour, but it could
easily have caused a bug if we were. This patch just cleans rawEscapes
to behave like rawQuotes.

Signed-off-by: Justin Chadwell <[email protected]>
In the contents of COPY/ADD, we perform expansion on variables using a
lexer. However, this lexer, by default, removes quotes as well as
expanding variables - this isn't really the kind of behavior we're
after, as it feels quite unintuitive.

To fix this, we introduce a new ExpandRaw function, which commands can
implement that implement an alternative expansion that preserves quotes
(and possibly other characters/features in the future).

Additionally, we introduce new tests to more clearly define the desired
behavior. One major note is that backslashes are not passed directly,
and are processed, following normal escape rules (so that we can use `$`
symbols directly).

Signed-off-by: Justin Chadwell <[email protected]>
This adds a couple more integration tests to more fully define the
behavior of these interactions. Additionally, through this, a minor
difference to POSIX was discovered where quotes are supposed to be
properly preserved in a heredoc (since a heredoc is treated as a
double-quoted word).

To handle this, a new property, SkipProcessQuotes is added to the shell
lexer which simply treats quotes as ordinary characters. This is the
only behavioral change needed to actually get the new tests working.

Signed-off-by: Justin Chadwell <[email protected]>
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, but deferring to @tonistiigi and @crazy-max, who are more familiar with the code

@tonistiigi tonistiigi merged commit 2633c96 into moby:master Jan 31, 2022
@LecrisUT
Copy link

I wonder if this PR introduced an issue like containers/buildah#5422

@jedevc
Copy link
Member Author

jedevc commented Mar 20, 2024

@LecrisUT this is an old PR - the linked issue is in buildah, which is unrelated to this project, so it seems unlikely to have been caused here.

@LecrisUT
Copy link

Oh, sorry about that. I had a tab open with a similar PR in buildah

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

COPY with Heredocs eats quotes
5 participants