
Ensure that containers do not get stuck in stopping #22641

Merged: 1 commit into containers:main on May 13, 2024

Conversation

@mheon (Member) commented May 8, 2024

The scenario for inducing this is as follows:

  1. Start a container with a long stop timeout and a PID1 that ignores SIGTERM
  2. Use podman stop to stop that container
  3. Simultaneously, in another terminal, run `kill -9 $(pidof podman)` (the container is now in ContainerStateStopping)
  4. Now kill that container's Conmon with SIGKILL.
  5. No commands are able to move the container from Stopping to Stopped now.

The cause is a logic bug in our exit-file handling logic. Conmon being dead without an exit file causes no change to the state. Add handling for this case that tries to clean up, including stopping the container if it still seems to be running.
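A minimal shell sketch of the reproduction described above, assuming a local podman and an image whose shell runs as PID1 (the image, container name, and timeout are illustrative):

    # Terminal 1: start a container with a long stop timeout whose PID1 ignores SIGTERM
    podman run -d --name stuckctr --stop-timeout 240 \
        alpine sh -c 'trap "" TERM; while :; do sleep 1; done'

    # Terminal 1: this hangs for the full timeout because SIGTERM is ignored
    podman stop stuckctr

    # Terminal 2: kill the hanging podman stop; the container is now in "stopping"
    kill -9 $(pidof podman)

    # Terminal 2: now kill the container's conmon as well
    kill -9 $(podman inspect --format '{{.State.ConmonPid}}' stuckctr)

    # Before this fix, no command could move the container out of "stopping"
    podman inspect --format '{{.State.Status}}' stuckctr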

Fixes #19629

Does this PR introduce a user-facing change?

Fixed a bug where Systemd-managed containers could be stuck in the Stopping state, unable to be restarted, if systemd killed the unit before `podman stop` finished stopping the container ([#19629](https://github.com/containers/podman/issues/19629)).

@mheon (Member, Author) commented May 8, 2024

Still needs a test

@openshift-ci bot added the `approved` label (PR has been approved by an approver from all required OWNERS files) on May 8, 2024
@Luap99 (Member) left a comment:

logic looks good

Comment on lines 1459 to 1475
if err := unix.Kill(c.state.PID, 0); err == nil {
// We have a runaway container, unmanaged by
// Conmon. Invoke OCI runtime stop.
// Use 0 timeout for immediate SIGKILL as things
// have gone seriously wrong.
// Ignore the error from stopWithAll, it's just
// a cgroup check - more important that we
// continue.
// If we wanted to be really fancy here, we
// could open a pidfd on container PID1 before
// this to get the real exit code... But I'm not
// that dedicated.
all, _ := c.stopWithAll()
if err := c.ociRuntime.StopContainer(c, 0, all); err != nil {
logrus.Errorf("Error stopping container %s after Conmon exited prematurely: %w", c.ID(), err)
}
}
Member:

I am not aware of any case where conmon can leak the container's main process, since it uses PDEATHSIG, so I would like the comment to make clear that this is likely only relevant in the pid=host case, where processes are not killed because there is no PID namespace doing it for us.
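A small, hypothetical illustration of that case: with --pid=host the container shares the host PID namespace, so there is no per-container PID namespace whose teardown would kill the remaining processes if conmon dies (container name and image are illustrative):

    # No private PID namespace here; nothing reaps the container process for us
    podman run -d --pid=host --name hostpid-ctr alpine sleep 999
    podman inspect --format 'ctr={{.State.Pid}} conmon={{.State.ConmonPid}}' hostpid-ctr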

mheon (Member, Author):

I've heard from support of this happening even with PDEATHSIG set. Not sure of the exact circumstances, but my trust in Conmon's ability to take the container down on being killed is not absolute.

Comment on lines 1493 to 1495
if err := c.runtime.state.AddContainerExitCode(c.ID(), c.state.ExitCode); err != nil {
logrus.Errorf("Error saving container %s exit code after Conmon exited prematurely: %w", c.ID(), err)
}
Member:

we don't have a real exit code so should we actually store this -1 exit code?

mheon (Member, Author):

Needed for certain cases in podman run - otherwise we'd pop up an angry, duplicate error about not having a sensible exit code.

Member:

Mhh, yeah, I guess it keeps things more consistent internally, given the many assumptions about having the exit code available, even when it is sort of a lie.

@mheon force-pushed the handle_stopping_loop branch from 8afca63 to 766764f on May 8, 2024 19:16
@@ -1465,4 +1465,34 @@ search | $IMAGE |
is "$output" "Error.*: $expect" "podman emits useful diagnostic when no entrypoint is set"
}

@test "podman run - stopping loop" {

Comment on lines 1479 to 1493
while :; do
sleep 0.5
run_podman logs testctr
if [[ "$output" =~ READY ]]; then
break
fi
done
Member:

LGTM but since you have to repush anyway, please use wait_for_ready testctr instead of this loop
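For reference, the suggested replacement collapses the loop into a single call to the system-test helper (assuming wait_for_ready polls the container's logs for READY, as the other tests in this suite rely on):

    # helpers.bash: blocks until "READY" appears in the container's logs
    wait_for_ready testctr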

@edsantiago (Member) left a comment:

OMG wait no

run_podman inspect --format '{{ .State.ConmonPid }}' testctr
conmon_pid=$output

$(PODMAN) stop testctr &
Member:

Curly braces, not parens

Comment on lines 1487 to 1488
kill -9 $(stop_pid)
kill -9 $(conmon_pid)
Member:

here too
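For reference, the corrected lines use curly-brace parameter expansion instead of command substitution, matching what landed in the later revision below:

    ${PODMAN} stop testctr &
    stop_pid=$!

    kill -9 ${stop_pid}
    kill -9 ${conmon_pid}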

@mheon force-pushed the handle_stopping_loop branch from 766764f to 5b00546 on May 8, 2024 19:57
@test "podman run - stopping loop" {
skip_if_remote "this doesn't work with the REST service"

run_podman run -t -d --name testctr --stop-timeout 240 $IMAGE sh -c 'trap "echo READY" INT; while:; do sleep 0.1; done'
Member:

Need a space between while and the colon.

You can run this yourself via

$ ./bats --rootless 030:"stopping loop"

(I include --rootless simply to avoid the sudo password thing, because there's no difference here anyway)

Member:

You also need to s/INT/TERM/, and possibly nuke the -t, but I still can't get this test to work.
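Applying those edits (TERM instead of INT, a space before the colon, no -t) would give something like the following sketch; the final version further below took a different approach and dropped the trap entirely:

    run_podman run -d --name testctr --stop-timeout 240 $IMAGE \
        sh -c 'trap "echo READY" TERM; while :; do sleep 0.1; done'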

mheon (Member, Author):

Added another wait_for_ready to make sure sh -c is good to go before we trigger the first stop - maybe that'll do it. Will check if it fixed things tomorrow.

Member:

Oh it's much worse than that. I lied: you can't use wait_for_ready because that function "helpfully" checks container status and bails if != running.

I'm working on something. Will let you know.

Member:

This passes, but I suspect it's not quite exactly what you had hoped for. In particular, the -t0 in the last podman stop. I can't make anything work without that.

@test "podman run - stopping loop" {
    skip_if_remote "this doesn't work with the REST service"

    run_podman run -d --name testctr --stop-timeout 240 $IMAGE sh -c 'echo READY; sleep 999'
    cid="$output"
    wait_for_ready testctr

    run_podman inspect --format '{{ .State.ConmonPid }}' testctr
    conmon_pid=$output

    ${PODMAN} stop testctr &
    stop_pid=$!

    timeout=20
    while :;do
        sleep 0.5
        run_podman container inspect --format '{{.State.Status}}' testctr
        if [[ "$output" = "stopping" ]]; then
            break
        fi
        timeout=$((timeout - 1))
        if [[ $timeout == 0 ]]; then
            run_podman ps -a
            die "Timed out waiting for testctr to acknowledge signal"
        fi
    done

    kill -9 ${stop_pid}
    kill -9 ${conmon_pid}

    PODMAN_TIMEOUT=5 run_podman 125 stop -t0 testctr
    is "$output" "Error: container $cid conmon exited prematurely, exit code could not be retrieved: internal libpod error" "correct error on missing conmon"

    # This should be safe because stop is guaranteed to call cleanup?
    run_podman inspect --format "{{ .State.Status }}" testctr
    is "$output" "stopped" "container has successfully transitioned to exited state after stop"
}

@mheon force-pushed the handle_stopping_loop branch 2 times, most recently from 3b2338b to d474b2d on May 9, 2024 13:55
@mheon (Member, Author) commented May 9, 2024

Oof. The cleanup in ContainerStop() only happens if there was no error, so the error we're expecting prevents it, and prevents container cleanup. This is an actual code bug: we need to trigger cleanup manually in this case, otherwise it won't happen.
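In test terms, the symptom is what the later assertions check: after the failing podman stop, the container would remain in "stopped" (still mounted) rather than "exited" unless cleanup runs. A hypothetical manual recovery, assuming podman container cleanup accepts a container that is already stopped, would have been:

    # Hypothetical workaround prior to this change: trigger cleanup by hand,
    # then confirm the container has reached the exited state
    podman container cleanup testctr
    podman inspect --format '{{.State.Status}}' testctr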

The scenario for inducing this is as follows:
1. Start a container with a long stop timeout and a PID1 that
   ignores SIGTERM
2. Use `podman stop` to stop that container
3. Simultaneously, in another terminal, kill -9 `pidof podman`
   (the container is now in ContainerStateStopping)
4. Now kill that container's Conmon with SIGKILL.
5. No commands are able to move the container from Stopping to
   Stopped now.

The cause is a logic bug in our exit-file handling logic. Conmon
being dead without an exit file causes no change to the state.
Add handling for this case that tries to clean up, including
stopping the container if it still seems to be running.

Fixes containers#19629

Signed-off-by: Matt Heon <[email protected]>
@mheon force-pushed the handle_stopping_loop branch from d474b2d to 3fa8e98 on May 9, 2024 15:17
@mheon (Member, Author) commented May 9, 2024

This is ready now


# This should be safe because stop is guaranteed to call cleanup?
run_podman inspect --format "{{ .State.Status }}" testctr
is "$output" "exited" "container has successfully transitioned to exited state after stop"
Member:

Hmmm... I'm nervous. In my version, this line said stopped. Yours says exited. Is that because of your c.cleanup() change? If it isn't, is it possible that we're hitting that old bug where sometimes podman sets one and sometimes the other? I'm running a test loop now, see if I ever get stopped.

mheon (Member, Author):

Yes. We want exited. That indicates that cleanup has run properly, the container is unmounted, networking is torn down. stopped means we're still mounted, and we don't want to be stuck there.

@mheon (Member, Author) commented May 10, 2024

@containers/podman-maintainers PTAL

@rhatdan (Member) commented May 13, 2024

/lgtm

@openshift-ci bot added the `lgtm` label (PR is ready to be merged) on May 13, 2024
@Luap99 (Member) left a comment:

LGTM
But I must say the stop logic is really not nice to follow/understand. I would not be surprised if there are more corner cases.

The fact that we now have a second place where the exited event is generated and the state is reset doesn't make things better either.


# Unclear why `-t0` is required here, works locally without.
# But it shouldn't hurt and does make the test pass...
PODMAN_TIMEOUT=5 run_podman 125 stop -t0 testctr
Member:

@edsantiago Why PODMAN_TIMEOUT=5?

Member:

Leftover artifact from my testing: I got tired of ^Cing on every single iteration (until I hit upon the -t0 trick). I just copy-pasted what I had, and didn't notice that it included PODMAN_TIMEOUT. When I noticed it in the PR, I thought it probably doesn't hurt anything. Does it?

Member:

It is confusing to readers who ask why this is here.

openshift-ci bot (Contributor) commented May 13, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago, Luap99, mheon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [Luap99,edsantiago,mheon]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot bot merged commit 0c09421 into containers:main on May 13, 2024
89 of 91 checks passed
@stale-locking-app bot added the `locked - please file new issue/PR` label on Aug 12, 2024
@stale-locking-app bot locked as resolved and limited conversation to collaborators on Aug 12, 2024