Adds ability to only SIGTERM the immediate process #10

Merged 2 commits into fpco:master on Oct 12, 2021

Conversation

@denibertovic (Collaborator)

This is useful in certain situations (with multiprocessing apps such as
gunicorn or celery) where we want to signal to the "main" process
only and let it handle gracefully terminating the "worker" processes.

If the "main" process does not gracefully shut down the "worker"
processes within the given timeout, we still make sure to send SIGKILL
to all remaining processes (-1).

Hey @snoyberg, do you mind going over this and letting me know what you think? Did I miss anything / could this be done in a better way?
I'm not crazy about the flag name but couldn't think of anything more suitable. Also, I didn't bump the major version, since this is behind a feature flag and wouldn't impact existing users.

@snoyberg (Member) commented on Jun 4, 2021

Overall this looks good, but I'm wondering if it actually needs a flag. Maybe the best behavior would be:

  1. SIGTERM immediate child
  2. Sleep for timeout
  3. SIGTERM all processes (-1)
  4. Sleep for timeout
  5. SIGKILL all processes

It seems overall better for all use cases. What do you think?
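
For illustration, here is a minimal Haskell sketch of that escalation (the `escalate` helper and its arguments are hypothetical, not the actual pid1 code; the real implementation also has to reap children and return early once everything has exited):

```haskell
import Control.Concurrent (threadDelay)
import System.Posix.Signals (sigKILL, sigTERM, signalProcess)
import System.Posix.Types (ProcessID)

-- The escalation sequence proposed above: TERM the immediate child,
-- wait, TERM everything (-1), wait again, then KILL whatever remains.
escalate :: ProcessID -> Int -> IO ()
escalate child timeoutSecs = do
  let pause = threadDelay (timeoutSecs * 1000000) -- threadDelay takes microseconds
  signalProcess sigTERM child -- 1. SIGTERM the immediate child
  pause                       -- 2. sleep for timeout
  signalProcess sigTERM (-1)  -- 3. SIGTERM all processes
  pause                       -- 4. sleep for timeout
  signalProcess sigKILL (-1)  -- 5. SIGKILL all remaining processes
```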

@denibertovic (Collaborator, Author)

@snoyberg I think I'm inclined to agree with you. The only issue I see is that it might be surprising that, in rare situations, the effective wait will be double the timeout set on the command line, because of the two SIGTERM attempts before the SIGKILL. But I guess that can be documented.

The reason I didn't go with that approach initially is that it's a breaking change, and I wasn't sure if it would impact existing users (although I can't yet see how).
That, and I was basing my implementation on what tini does: I believe tini uses child process groups for this use case. That said, I see no difference between sending a signal to the child process group and sending it to "everything" (-1), since they should cover the same set of processes inside a container. Right?

Let me try and implement the proposed solution.

@denibertovic (Collaborator, Author)

@snoyberg I totally forgot about this. I actually made the change locally, but then vacation came along, it slipped my mind, and I never pushed it.

Anyway, I've pushed a new commit that does what we discussed above: send SIGTERM to the immediate child -> wait for the timeout -> send SIGTERM to all children -> wait for the timeout (again) -> send SIGKILL to all remaining children. I ran some local testing and it seems to work fine for my use case.

I pushed the change as a new commit so we can still choose between the two approaches, and it makes things easier to revert if we decide to. But if you want, let me know and I can squash the two commits.

There's one thing I'm not too sure about. In a scenario where we don't have a well-behaved process (one that ignores the TERM signal), we will essentially be allowing it double the timeout before killing it. I'm not sure whether this is a problem, since the default timeout is 5 seconds, meaning we would wait 10 seconds by default. Perhaps we should change the default timeout to 3 seconds, so that doubling it to 6 is closer to the original behavior. Or would that be even more confusing for users?

Anyway, I know you've got your hands full at the moment, so don't feel pressure to respond to this. :)

@snoyberg (Member) left a review:

Minor comment update, otherwise LGTM

Instead, we will signal TERM to the immediate child process by default.
After that, we wait for the `timeout` period and signal TERM to all children.
Finally, we wait for another `timeout` period and signal KILL to all remaining processes
(in case they didn't exit on TERM by then).

It's very important to set `-t`/`--timeout` appropriately, based on how long you want to allow your processes for graceful shutdown.
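
As a usage sketch, in a Docker image this could look like the following (illustrative only: the `/sbin/pid1` path follows the fpco/pid1 image convention, and the 30-second timeout and gunicorn command are arbitrary examples; only the `-t`/`--timeout` flag comes from the paragraph above):

```dockerfile
# Run pid1 as PID 1 with a 30-second graceful-shutdown window per phase;
# gunicorn receives the initial TERM and shuts down its own workers.
ENTRYPOINT ["/sbin/pid1", "-t", "30"]
CMD ["gunicorn", "app:app"]
```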
@snoyberg (Member) left a review:

LGTM!

@snoyberg merged commit 6f2e150 into fpco:master on Oct 12, 2021
@denibertovic (Collaborator, Author)

@snoyberg awesome, thanks! When you get the chance, can you do a GH release (with the binary) and push the new Docker image?

@snoyberg (Member)

If I made you a maintainer here, would you be up for creating the GH release?

Docker Hub should already be updated: https://hub.docker.com/layers/fpco/pid1/0.1.3.0/images/sha256-05541270069465ac854d1fe79e68cd0a7df64fd42ec42aa5049bed14b82678a4?context=explore

@denibertovic (Collaborator, Author)

@snoyberg sure thing, I can do that.

@denibertovic (Collaborator, Author)

@snoyberg just a reminder about the repo permissions so I can make a release.

@snoyberg (Member)

Sorry, my mistake, I thought I'd added you already! Added now.

@denibertovic (Collaborator, Author)

Hey @snoyberg, forgot about this again... here's the release: https://github.com/fpco/pid1/releases/tag/pid1-0.1.3.0

The process for building a static binary didn't work for me anymore (`static-base/Dockerfile` no longer builds), so I had to use some Nix magic to get it to produce a static binary. I used @nh2's approach suggested here: #1 (comment)

I'll make a separate PR with the changes, as it might be useful; it's certainly more reproducible.
If I find the time, I'll try to add a GitHub Actions pipeline, since Travis doesn't work anymore and it would be useful to have this build/release work automated again.
