Adds ability to only SIGTERM the immediate process #10
This is useful in certain situations (with multiprocessing apps such as `gunicorn` or `celery`) where we want to signal the "main" process only and let *it* handle gracefully terminating the "worker" processes. If the "main" process does not gracefully shut down the "worker" processes within the given "timeout", we still make sure to send SIGKILL to all "(-1)" remaining processes.
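For readers following along, the flag-gated behavior described above can be sketched roughly as follows. This is a Python illustration only, not pid1's actual Haskell code; the names `shut_down`, `all_exited`, `send`, and `ALL_PROCESSES` are hypothetical, and the `send` hook defaults to `os.kill` so the sequence can be exercised without signalling real processes.

```python
import os
import signal
from typing import Callable

# kill(2) target meaning "every process we are permitted to signal".
ALL_PROCESSES = -1


def shut_down(
    child_pid: int,
    timeout: float,
    all_exited: Callable[[float], bool],
    send: Callable[[int, int], None] = os.kill,
) -> None:
    """SIGTERM only the immediate child, then SIGKILL whatever is left.

    `all_exited(timeout)` is an assumed helper: it blocks for at most
    `timeout` seconds and returns True if every process has gone away.
    """
    # Let the "main" process (e.g. a gunicorn master) stop its own workers.
    send(child_pid, signal.SIGTERM)
    if all_exited(timeout):
        return
    # The grace period expired: force-kill everything that remains.
    send(ALL_PROCESSES, signal.SIGKILL)
```

Passing a recording function as `send` and a stub as `all_exited` makes it easy to check the order of signals in a test.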
Overall this looks good, but I'm wondering if it actually needs a flag. Maybe the best behavior would be:
It seems overall better for all use cases. What do you think?
@snoyberg I'm inclined to agree with you. The only issue I see is that it might be surprising that, in very rare situations, the timeout set on the command line will be doubled because of the double SIGTERM attempt followed by SIGKILL. But I guess that can be documented. The reason I didn't go with that approach initially is that it's a breaking change and I wasn't sure whether it would impact existing users (although I can't see how yet). Let me try to implement the proposed solution.
@snoyberg I totally forgot about this. I actually made the change locally but then it slipped my mind and I didn't push it. Anyway, I've pushed a new commit that does what we discussed above: send SIGTERM to the immediate child -> wait for timeout -> send SIGTERM to all children -> wait for timeout (AGAIN) -> send SIGKILL to all children. I ran some local testing and it does seem to work fine for my use case. I pushed the change as a new commit so we can still choose between the two approaches, and so it is easier to revert if we decide to. If you'd like, let me know and I can squash the two commits. There's one thing I'm not too sure about. In a scenario where we don't have a well-behaved process (one that ignores the TERM signal) we will essentially be allowing it the double amount of
Anyway, I know you've got your hands full at the moment, so don't feel pressure to respond to this. :)
Minor comment update, otherwise LGTM
Instead we will signal TERM to the immediate child process by default. After that we wait for `timeout` period and signal TERM to all children. Finally, we wait for another `timeout` period and signal KILL to all remaining processes (in case they didn't exit with the TERM signal by now). It's very important to set -t/--timeout to the appropriate time based on the graceful shutdown you allow your processes to linger for.
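To make the timing in that comment concrete, here is a small Python sketch of the three-stage sequence expressed as data. It is an illustration under assumptions, not pid1's implementation: `Step`, `shutdown_plan`, and `worst_case_seconds` are hypothetical names, and the sketch assumes each wait runs for the full `timeout`.

```python
import signal
from typing import List, NamedTuple


class Step(NamedTuple):
    sig: signal.Signals   # signal to deliver
    target: int           # pid for kill(2); -1 means "all processes"
    wait_seconds: float   # how long to wait after sending it


def shutdown_plan(child_pid: int, timeout: float) -> List[Step]:
    """Return the ordered steps: TERM child, TERM all, KILL all."""
    return [
        Step(signal.SIGTERM, child_pid, timeout),  # polite request to the main process
        Step(signal.SIGTERM, -1, timeout),         # polite request to everyone left
        Step(signal.SIGKILL, -1, 0.0),             # forced termination, no further wait
    ]


def worst_case_seconds(plan: List[Step]) -> float:
    """Total time spent waiting if no process ever exits on its own."""
    return sum(step.wait_seconds for step in plan)
```

With `timeout` set to 5 seconds, `worst_case_seconds(shutdown_plan(42, 5.0))` evaluates to `10.0`, which is the doubling of the command-line timeout discussed earlier in the thread.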
LGTM!
@snoyberg awesome, thanks! When you get the chance, can you do a GH release (with the binary) and push the new Docker image?
If I made you a maintainer here, would you be up for creating the GH release? Docker Hub should already be updated: https://hub.docker.com/layers/fpco/pid1/0.1.3.0/images/sha256-05541270069465ac854d1fe79e68cd0a7df64fd42ec42aa5049bed14b82678a4?context=explore
@snoyberg sure thing, I can do that.
@snoyberg just a reminder about the repo permissions so I can make a release.
Sorry, my mistake, I thought I'd added you already! Added now.
Hey @snoyberg, I forgot about this again... Here's the release: https://github.com/fpco/pid1/releases/tag/pid1-0.1.3.0 The process for building a static binary didn't work for me anymore ( I'll make a separate PR with the changes as it might be useful. Certainly it's more reproducible.
Hey @snoyberg, do you mind going over this and letting me know what you think? Did I miss anything, or could this be done in a better way?
I'm not crazy about the flag name but couldn't think of anything more suitable. Also, I didn't bump the major version since this is behind a feature flag and wouldn't impact existing users.