Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-allow zero blocking threads with warnings, panic in blocking spawner #3367

Closed
wants to merge 2 commits into from

Conversation

dekellum
Copy link
Contributor

@dekellum dekellum commented Jan 3, 2021

Motivation

I'm of the development school favoring a fixed, small number of threads per process. Its the raison d'être for the additional effort of asynchronous code. Such applications are more predictable in terms of memory usage and performance. With tokio 0.2-0.3 I was able to configure a fixed number of "core" worker threads and, with some careful other choices, know that no (zero) "extra" threads were needed; in tokio terms, neither spawn_blocking nor block_in_place were in use.

Note that the existing blocking facilities are really just workarounds for OS gaps which we should expect to become less favored as alternatives become available, e.g. rust-native fully asynchronous DNS resolvers and Linux io_uring.

As of tokio-1.0.0, #3287 disallowed configuring zero (0) "extra" threads (max_blocking_threads) on tokio's threaded runtime pool. Previously, if one configured zero extra threads, testing showed that any application tasks needing extra blocking threads would simply fail to complete. That was too subtle a failure, but on the other hand, #3287 may be too prescriptive as it stands, not matching other parts of tokio, which allows selecting only the features needed for the application.

With #3287 as released, configuring just one "extra" thread (max_blocking_threads(1)) may still be a silent and hard to diagnose performance liability, for example serializing DNS lookups with file reads/writes. Runtime (as opposed to config time) panic on calls to spawn_blocking may be more educational to the end user than simply forcing them to use a small positive number, which may still be inadequate for good performance.

Solution

  • Introduce spawn_core to separate the call path for "core" worker thread launching.

  • Make the pool aware of "core" and "extra" thread capacity. If there is zero "extra" capacity, panic with an informative message when spawn_blocking or block_in_place is called.

  • Improve rustdoc in general for max_blocking_threads

This logically preserves the "fix" for #2802 in that instead of never completing, tokio will panic with an informative runtime error message (and stack trace) as to the mis-configuration:

panicked at '`spawn_blocking` or `block_in_place` called,
but `runtime::Builder::max_blocking_threads(0)` was configured!',

Alternatives

Configurable workaround

With tokio 1.0.0-1.0.1, we can effectively obtain the same panics and backtrace on the first use of a blocking "extra" thread:

    use std::sync::atomic::{AtomicUsize, Ordering};

    let rt = tokio::runtime::Builder::new_multi_thread()
        .worker_threads(2)
        .max_blocking_threads(1)
        .thread_name_fn(|| {
            static CNT: AtomicUsize = AtomicUsize::new(0);
            let c = CNT.fetch_add(1, Ordering::SeqCst);
            if c >= 2 {
                panic!("spawn_blocking/block_in_place must have been used!");
            } else {
                format!("worker-{}", c)
            }
        })
        .build()
        .unwrap();

This isn't terribly bad as a workaround, but consider that many users whom might have initially configured zero (0) blocking threads, thinking they don't need it, might next choose one (1) blocking thread, and not be aware of the significant performance implications to effectively serializing all blocking operations, e.g. DNS lookups and file operations, on one thread.

Feature gates

We could place spawn_blocking and block_in_place functions under spawn-blocking and block-in-place feature gates, respectively. (Previously there was the blocking feature gate.) These would be included by full. Also tokio features like fs could depend on spawn-blocking to ensure continued compatibility.

A start of a prototype for this is here:

master...dekellum:opt-in-blocking-features

An advantage for this alternative is that libraries (not using full) will need to explicitly opt-in to the features or will fail to compile (rather then deferring to a runtime panic).

However, this alternative will frequently not be sufficient without associated down stream changes, and these may be an up hill battle for users. For example, hyper currently includes a client::connect::dns::GaiResolver which uses spawn_blocking. However, currently the only way to disable this resolver is to disable hyper's tcp feature, which disables too much: the entire HttpConnector and all its tokio integration.

Add runtime::Builder::enable_blocking(false)

This adds an additional safety/hurdle to configuring no capacity for blocking, but would result in the same runtime panics as the preferred solution above. Here the user would additionally need to add the following to runtime::Builder configuration:

    let rt = tokio::runtime::Builder::new_multi_thread()
        .enable_blocking(false)
        .max_blocking_threads(0) // optional, but now allowed
        .build()
        .unwrap();

Note that as a user, I would still want to explicitly set max_blocking_threads(0).

Prototype for this is as follows, with explicit warnings in rustdoc:

master...dekellum:allow-disable-blocking

    /// Enable or disable support for spawning of blocking operations.
    ///
    /// The default value is `true` (enabled).
    ///
    /// **Warning: Setting this to `false` (disabled) will result in runtime
    /// panics as described below.** Doing so is only possibly useful as
    /// _either_ a temporary way to find calls to `spawn_blocking` or as a
    /// testing/runtime assertion that an application does not use
    /// `spawn_blocking` either directly or indirectly via tokio or other
    /// libraries.
    ///
    /// # Panic
    ///
    /// If set to `false` (disabled), any calls to [`Runtime::spawn_blocking`]
    /// or [`task::block_in_place`](crate::task::block_in_place), including
    /// internal use such as in `tokio::fs` or via other libraries, will result
    /// in a panic.
    pub fn enable_blocking(&mut self, val: bool) -> &mut Self {
        /*…*/
    }

IMO this alternative amounts to an unnecessary additional hurdle, but I'm willing to add this to the preferred solution of this PR, if it remains desired.

Or disable_blocking

Another alternative interface is to add disable_blocking for the same purpose. In terms of consistency of the builder interface, there is already enable_* methods (e.g. enable_time) so a disable_* method, without a parameter, is more consistent in some sense.

@Darksonn Darksonn added A-tokio Area: The main tokio crate M-blocking Module: tokio/task/blocking labels Jan 3, 2021
@dekellum
Copy link
Contributor Author

dekellum commented Jan 7, 2021

@Darksonn (I know you're not a bot ❤️) could you please add label: C-feature-accepted to this PR as well? Accepted I believe based on below discord comments? Apologizes if this is overly pedantic, but my hope would be that C-feature-accepted has a better chance of being found and reviewed by the next (1.1.0) minor release. If you recall our discord conversations:

https://discord.com/channels/500028886025895936/500336346770964480/789592454348800038

Carl Lerche 12/18/2020
we don't have time to dig into it, so we can take the conservative option leaving open the opportunity to improve later
Alice Ryhl 12/18/2020
Ideally I think it makes more sense for spawn_blocking to panic if it's disabled, than to just let it sleep forever.
Alice Ryhl 12/18/2020
I'm ok with a PR to add it, but we wont block the 1.0.0 release on it because it is not a breaking change to add it in the following
point release.

@dekellum
Copy link
Contributor Author

dekellum commented Jan 22, 2021

Closing this, given the radio silence here despite discord, since its looks unlikely to be merged for 1.1.0, will likely start conflicting with other changes, and because the above described Configurable Workaround alternative has another advantage over the direct support implemented here:

While testing with the workaround on 1.0.0-1.0.2 I found that its possible to hang benchmark runs and it would theoretically be possible to hang a production applications with this in place, and a rarely used/untested spawn_blocking call path. Hangs in CI'd benchmarks/tests or production applications are definitely inferior to a graceful process exit! Its easy enough to avoid the hang with an AbortOnPanic guard to the Configurable Workaround recipe above, like this.

The same could be added to the panic assert of this PRs implementation, but in the vacuum of feedback, I'm doubtful that would help the likelihood of landing this re-instated feature. In any case, a better way of supporting that with tokio's thread pool, would be to offer a ignore_panics(false) flag to the runtime builder, like for example this one. In other words, the feature of this PR has effectively that feature as a dependency, and could be revived with that in place in the future.

Thanks, always a pleasure.

@dekellum dekellum closed this Jan 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-blocking Module: tokio/task/blocking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants