net: add types for named Unix pipes #5351

Merged on Feb 27, 2023 (40 commits)

satakuma (Member) commented Jan 5, 2023

Motivation

This change provides types to asynchronously read from and write to named Unix pipes (FIFOs). Currently, named pipes can only be used through tokio::fs::File, which spawns blocking tasks, or through AsyncFd, which does not implement AsyncRead and AsyncWrite.

See also #5318.

Solution

This change exposes a new module: tokio::net::pipe with Sender and Receiver types, which represent a writing end and a reading end of a Unix pipe, respectively. Both types encapsulate corresponding Sender and Receiver types from mio, which in turn use std::fs::File underneath to handle reads and writes.

Sender is AsyncWrite and Receiver is AsyncRead. Following the pattern of other types from net such as ReadHalf/WriteHalf for socket streams, Receiver additionally has:

  • poll_read_ready
  • readable
  • ready
  • try_read
  • try_read_vectored
  • try_read_buf

and Sender has:

  • poll_write_ready
  • writable
  • ready
  • try_write
  • try_write_vectored

I considered omitting the ready methods, since the communication is already one-way and there are readable/writable functions for waiting for readiness. However, the ready method allows inspecting readiness events and distinguishing READ_CLOSED from READABLE events, which might be the reason why types such as OwnedReadHalf have ready functions in the first place. Therefore, I decided to follow the pattern and include the ready methods as well.

Opening a FIFO

Creating a pipe from a FIFO file is done by opening the file with the proper access mode flag; see fifo(7). The open methods on the Sender/Receiver types open FIFO files for writing/reading in non-blocking mode and register them in the event loop.

However, there is a requirement that the reading end has to be opened before any writing end. If a Sender opens a FIFO with no readers, the open will immediately fail with ENXIO. For that reason, docs for Sender::open include an example presenting how to wait for the reading end by sleeping in a loop.
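A minimal sketch of the wait-for-reader loop described above, using only the standard library. It is a blocking stand-in: `open_with_retry`, its parameters, and the hard-coded `ENXIO` value are assumptions made for illustration, and async code would await a timer (for example `tokio::time::sleep`) instead of calling `thread::sleep`.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// errno for "no such device or address". Assumed to be 6, as on Linux and
/// the BSDs; real code would normally take it from the `libc` crate.
const ENXIO: i32 = 6;

/// Calls `open` until it stops failing with ENXIO or `max_attempts` is reached.
/// Any other outcome (success or a different error) is returned immediately.
fn open_with_retry<T>(
    mut open: impl FnMut() -> io::Result<T>,
    max_attempts: u32,
    delay: Duration,
) -> io::Result<T> {
    let mut attempt = 1;
    loop {
        match open() {
            // No reader yet: wait a little and try again.
            Err(e) if e.raw_os_error() == Some(ENXIO) && attempt < max_attempts => {
                attempt += 1;
                thread::sleep(delay);
            }
            other => return other,
        }
    }
}
```

With a real FIFO, the closure would perform the non-blocking, write-only open; here it is left generic so the retry logic can be exercised without one.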

On Linux, it is also possible to open a FIFO file in an access mode for both reading and writing. Such an open succeeds even when no reader is present, so it can be used as a workaround for this problem. For Linux users, Sender::open_rw opens a FIFO file in the O_RDWR access mode. Note that the reading access won't be used in practice, as Sender has no methods/traits to read from a pipe and is not registered in the event loop with interest for read events.

Sender/Receiver also have from_file methods, which convert a std File holding a file descriptor into a pipe. from_file is necessary for users who are not sure whether a file is a special FIFO file that can be opened with the open method. In that case they can use File methods to inspect the file's metadata and then convert it to a pipe.
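The metadata inspection mentioned above can be sketched with the standard library alone. `is_fifo_file` is a hypothetical helper name; it only covers the file-type check and does not look at the access mode.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileTypeExt;

/// Reports whether an already-open file refers to a FIFO.
/// The metadata is queried on the open handle, not on a path.
fn is_fifo_file(file: &File) -> io::Result<bool> {
    Ok(file.metadata()?.file_type().is_fifo())
}
```

Because the check runs on the open handle, the answer describes the very file that would be handed to from_file, which a path-based check cannot guarantee.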

Next steps

Since on a Unix system anonymous pipes and named pipes (FIFOs) are only different in the way they are created (pipe() syscall vs. mkfifo + open), Sender/Receiver types can provide one abstraction for both cases. Usage of anonymous pipes is almost exclusively restricted to communication with spawned processes, and for that there already is an async interface in tokio::process with types like ChildStdin etc. A user may wish to have one type for pipes used to communicate with child processes and for pipes created from FIFO files in a filesystem. I think it's reasonable to add integration with types from tokio::process. This can be done in the following way:

  1. Construct a single, internal type backing pipe operations, for example in tokio::io::unix. This type can be done the same way as ChildStdio i.e. use SourceFd instead of types from mio::unix::pipe.
  2. Refactor the code to use the same type internally for named pipes in net and for anonymous pipes in process.
  3. Implement From<ChildStdin> for Sender, and From<ChildStdout> and From<ChildStderr> for Receiver. This conversion can be infallible since the pipe is already registered in the event loop.

NobodyXu (Contributor) commented Jan 5, 2023

Thanks for the great work!

> Usage of anonymous pipes is almost exclusively restricted to communication with spawned processes, and for that there already is an async interface in tokio::process with types like ChildStdin etc.

I would definitely love to see anonymous pipe support in tokio, if this PR is merged.
Support for anonymous pipes would be really easy with this PR.

Also, if you check the dependents of tokio-pipe, a crate for anonymous pipes only, you can find quite a few crates that depend on it.

tokio-pipe also provides splice/tee support in addition to write/read.

satakuma (Member, Author) commented Jan 6, 2023

It might be worthwhile to rethink whether net is the right place for types from this PR, especially if tokio plans to support anonymous pipes with them. Unix pipes are quite different from net types such as Unix domain sockets and Windows named pipes, in the sense that there is no connect and no client/server structure. tokio::io::unix::pipe looks reasonable to me, next to the AsyncFd.

Basic support for anonymous pipes probably boils down to just adding the new_pair function like this:

fn new_pair() -> io::Result<(Sender, Receiver)>

Additional support like splice/tee from tokio-pipe can be done later if needed.
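For a feel of what such a pair gives the caller, here is a blocking sketch built on `std::io::pipe`, which is available in the standard library from Rust 1.87. The `new_pair` and `round_trip` names are illustrative, the return types are std's `PipeWriter`/`PipeReader` rather than the Sender/Receiver from this PR, and nothing here is registered with an event loop.

```rust
use std::io::{self, PipeReader, PipeWriter, Read, Write};

/// Blocking stand-in for the proposed `new_pair`:
/// writing end first, reading end second.
fn new_pair() -> io::Result<(PipeWriter, PipeReader)> {
    // std returns the reader first, so swap the order here.
    let (reader, writer) = io::pipe()?;
    Ok((writer, reader))
}

/// Sends `msg` through a fresh pipe and returns what the reader sees.
fn round_trip(msg: &[u8]) -> io::Result<Vec<u8>> {
    let (mut tx, mut rx) = new_pair()?;
    tx.write_all(msg)?;
    // Dropping the only writer lets the reader observe end-of-file.
    drop(tx);
    let mut out = Vec::new();
    rx.read_to_end(&mut out)?;
    Ok(out)
}
```

The message should be kept small in this sketch, since a blocking write larger than the pipe buffer would stall with nobody reading.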

NobodyXu (Contributor) commented Jan 6, 2023

> tokio::io::unix::pipe looks reasonable to me, next to the AsyncFd.

I agree, I think that's the right place for Unix pipes.

If someone wants to provide a portable named pipe over Unix and Windows named pipes, they can simply provide an abstraction in tokio::net later.

> Basic support for anonymous pipes probably boils down to just adding the new_pair function like this:

That's probably enough, but it would be great if we could also set flags such as O_CLOEXEC.

> Additional support like splice/tee from tokio-pipe can be done later if needed.

👍

Noah-Kennedy (Contributor) commented

I need to take some time to go through this.

/// The runtime is usually set implicitly when this function is called
/// from a future driven by a tokio runtime, otherwise runtime can be set
/// explicitly with [`Runtime::enter`](crate::runtime::Runtime::enter) function.
pub fn from_file(file: File) -> io::Result<Sender> {

Contributor:

I suggest adding extra checks here that the fd is writable and actually a pipe, similar to how tokio_pipe::PipeFd::from_raw_fd_checked performs the check.

Contributor:

I'm not really sure that this is necessary.

Contributor:

> I'm not really sure that this is necessary.

While it's not absolutely necessary, the check would rule out some stupid mistakes, e.g. a user opening a regular file and passing it to Sender::from_file, or opening the read end of a pipe and passing it here.

It's still a runtime error, but better than no error.

Contributor:

Also, I think it would be better to use OwnedFd here instead of File:

fn from_owned_fd(fd: impl Into<OwnedFd>) -> io::Result<Sender>;

since impl From<File> for OwnedFd makes it possible to create an OwnedFd from a File.

There are also From<ChildStdin>, From<ChildStdout> and From<ChildStderr> implementations, which would make this function more useful.

Contributor:

I think it would be best to do both File and OwnedFd separately. For now I don't think we can support OwnedFd though due to MSRV.

Regarding the check, I'm hesitant, as I've seen plenty of cases where people use unusual Linux APIs and then wrap the resulting descriptors in tokio types that use the same system calls under the hood. I could totally see people trying to use this to open different things that aren't strictly pipes but are readable or writable only. Do we want to disqualify this use case?

If we decide to do a check now, it will be a breaking change to remove it, so we need to think this through.

Contributor:

> I could totally see people trying to use this to open different things that aren't strictly pipes but are readable or writable only. Do we want to disqualify this use case?

They already have AsyncFd for that, and tokio also provides Unix sockets, TCP/UDP sockets, etc.

I don't see why people would want to create a Sender from a File, since regular files are blocking and most other special files are too.

The only thing that I can think of is people trying to open a named Unix socket, but that is already taken care of by UnixStream::connect, not to mention that Sender lacks several functions that are available on UnixStream, as well as functionality that will be added (e.g. sending an fd over a Unix socket).

Contributor:

> I think it would be best to do both File and OwnedFd separately. For now I don't think we can support OwnedFd though due to MSRV.

Oops, then it makes sense to have two separate functions.
It's a shame we can't just have one from_owned_fd function.

Member, Author:

The same argument about the extra checks can be made against the open methods, since they also don't check if the file is a FIFO file. My problem with such validation is that it costs additional syscalls, which are probably unnecessary most of the time. At first, I wanted to make a pair of open/open_unchecked functions for each type like here, but now I think it would be better to avoid syscall overhead by default. The purpose of adding from_file was to let the user do all sorts of checks with a FIFO file if they need to, and then do the conversion. It's not possible to do those checks independently before calling Sender::open because of TOCTTOU.

Note that the docs for from_file warn users to make sure the access mode is correct etc. Maybe this should be more emphasized, or it should be renamed to from_file_unchecked and then we can add a from_file variant with checks.

Contributor:

> It's not possible to do those checks independently before calling Sender::open because of TOCTTOU.

That's true, it's only possible after the file is opened.

> Maybe this should be more emphasized, or it should be renamed to from_file_unchecked and then we can add a from_file variant with checks.

IMHO renaming it to from_file_unchecked and adding a new from_file would be better.
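The split proposed here could look roughly like the sketch below. `PipeEnd` is a hypothetical stand-in for Sender/Receiver, not the type from this PR; it skips event-loop registration and the access-mode check, and only shows where the extra metadata syscall would sit.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileTypeExt;

/// Hypothetical stand-in for a pipe end; it only owns the file.
#[derive(Debug)]
struct PipeEnd(File);

impl PipeEnd {
    /// Checked constructor: one extra metadata query on the open handle.
    fn from_file(file: File) -> io::Result<PipeEnd> {
        if !file.metadata()?.file_type().is_fifo() {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "not a pipe"));
        }
        Ok(PipeEnd(file))
    }

    /// Unchecked constructor: no extra syscall, the caller vouches for the file.
    fn from_file_unchecked(file: File) -> PipeEnd {
        PipeEnd(file)
    }
}
```

Callers that have already validated the file, or that deliberately pass something other than a FIFO, would reach for the unchecked variant and skip the syscall.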

/// The runtime is usually set implicitly when this function is called
/// from a future driven by a tokio runtime, otherwise runtime can be set
/// explicitly with [`Runtime::enter`](crate::runtime::Runtime::enter) function.
pub fn from_file(file: File) -> io::Result<Receiver> {

Contributor:

Same suggestion as Sender::from_file.

unsafe { &mut *(dst as *mut _ as *mut [std::mem::MaybeUninit<u8>] as *mut [u8]) };

// Safety: We trust `mio_pipe::Receiver::read` to have filled up `n` bytes in the
// buffer.

Contributor:

Note that since dst is [MaybeUninit<u8>], we should also add that we trust mio_pipe::Receiver::read to not read from dst.
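To make the concern concrete, here is a small sketch of the write-only discipline that the safety comment relies on. `fill_uninit` is a hypothetical helper, not code from this PR or from mio: it never reads `dst`, and only exposes the prefix it has initialized.

```rust
use std::mem::MaybeUninit;

/// Copies `src` into the front of `dst` without ever reading `dst`,
/// then returns the initialized prefix as plain bytes.
fn fill_uninit<'a>(dst: &'a mut [MaybeUninit<u8>], src: &[u8]) -> &'a [u8] {
    let n = src.len().min(dst.len());
    for (slot, byte) in dst[..n].iter_mut().zip(src) {
        // Write-only access to a possibly uninitialized slot.
        slot.write(*byte);
    }
    // SAFETY: the loop above initialized exactly the first `n` elements,
    // and `MaybeUninit<u8>` has the same layout as `u8`.
    unsafe { std::slice::from_raw_parts(dst.as_ptr() as *const u8, n) }
}
```

A reader that inspected `dst` before writing would be touching uninitialized memory, which is why the safety comment needs to state both assumptions.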

Noah-Kennedy (Contributor) left a comment

I'm gonna take another look tomorrow before I approve, but this looks great. I think it's probably fine to merge as is.

/// The runtime is usually set implicitly when this function is called
/// from a future driven by a tokio runtime, otherwise runtime can be set
/// explicitly with [`Runtime::enter`](crate::runtime::Runtime::enter) function.
pub fn open_rw<P>(path: P) -> io::Result<Sender>

Contributor:

Why not lock this behind #[cfg(target_os = "linux")]?

satakuma (Member, Author), Jan 6, 2023:

It's probably a good idea, but while only Linux explicitly allows using FIFOs in O_RDWR access mode, it may also be possible on other systems. For example, it seems possible on FreeBSD, but it's either undocumented or I couldn't find the documentation. So an experienced user may want to use open_rw on systems other than Linux.

But I think locking it to Linux is the better choice since it will probably prevent more mistakes; I will change it in the next iteration.

Contributor:

We could also start with Linux and allow this on other platforms as we find out more

satakuma (Member, Author) commented Jan 6, 2023

It may be a good idea to also add the open_rw method to the Receiver. For Sender, the use case for this method is to open a FIFO without waiting for readers in a sleeping loop. For Receiver the use case is something called resilient named pipes, here is a pretty good explanation.

As an example, consider a scenario where a reader wants to receive commands from a named pipe, which are periodically sent by a writer. If it is done like this:

let mut reader = pipe::Receiver::open(path)?;
loop {
    let mut buf = vec![0; 0x100];
    // read_exact comes from tokio::io::AsyncReadExt and returns a future
    reader.read_exact(&mut buf).await?;
    /* handle the command */
}

and the writer closes the file, then the reader will fail in the next iteration with UnexpectedEof because once the writing end has been closed, read operations will keep returning 0 bytes read until the next writer opens the FIFO file. To recover from this the reader has to either re-open the FIFO file or use a sleeping loop to wait for the next writer.
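The failure mode can be reproduced without a FIFO. In this blocking sketch, `std::io::empty()` stands in for a pipe whose writers have all gone away (every read returns 0 bytes); `read_command` is a hypothetical helper mirroring the `read_exact` call in the loop above.

```rust
use std::io::{self, Read};

/// Reads one fixed-size command, failing if the source ends early.
fn read_command<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0; len];
    // read_exact turns "0 bytes read before the buffer is full"
    // into an UnexpectedEof error.
    reader.read_exact(&mut buf)?;
    Ok(buf)
}
```

A caller that wants to survive a writer disconnecting would match on `io::ErrorKind::UnexpectedEof` and then re-open the FIFO or wait, instead of propagating the error with `?`.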

If the reader is opened in the read-write access mode, then it also holds an open writing end, and therefore there won't be any EOF. The reader can asynchronously wait for the next writer to open the file without sleeping loops and without re-opening the file. Note that this is a Linux-specific solution.

Labels: A-tokio (Area: The main tokio crate), M-net (Module: tokio/net)