Output iteration + buffering #33

ehmicky · 2024-08-23T21:48:07Z

We buffer and return the subprocess stdout/stderr as result.stdout and result.stderr.

Users can also iterate over stdout/stderr. It is helpful for getting progressive output (e.g. a progress bar) or transforming it incrementally.

Also, one of the main use cases of iteration is to reduce memory consumption. However, in our case, the memory is unfortunately not reduced since we buffer result.stdout/result.stderr anyway.

The problem is: with the current API, we only know whether the subprocess output will be iterated or not after the subprocess has already started and its output is already buffering. We have to start buffering right away, else we might miss some data.
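
To make the constraint concrete, here is a minimal sketch of eager buffering. It is an illustration under assumptions, not the library's actual code: `bufferOutput` is a hypothetical helper, and `stream` stands for a Node.js Readable such as the subprocess stdout. The `data` listener is attached before we know whether the caller will iterate, and every chunk stays in memory until the stream ends.

```javascript
// Hypothetical helper, for illustration only (not the library's actual code).
// `stream` is assumed to be a Node.js Readable, e.g. the subprocess stdout.
function bufferOutput(stream) {
  const chunks = [];
  stream.setEncoding('utf8');

  // Attached as soon as the subprocess starts, before we know
  // whether the caller will iterate over the stream.
  stream.on('data', chunk => {
    chunks.push(chunk);
  });

  return new Promise((resolve, reject) => {
    // Every chunk is retained until the stream ends,
    // whether or not someone else also consumes the stream.
    stream.once('end', () => resolve(chunks.join('')));
    stream.once('error', reject);
  });
}
```

Calling `bufferOutput(subprocess.stdout)` right after spawning would produce the value exposed as result.stdout in this sketch.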

Some possible solutions:

1. When iteration starts, stop any ongoing buffering, and return undefined for result.stdout/result.stderr.
2. Add a boolean option.
3. Remove the iteration feature altogether.
4. Expose iteration as a separate top-level method.

I am leaning towards 1. because it results in the simplest API, while still keeping the iteration feature. It's somewhat quirky though.

What do you think?

sindresorhus · 2024-08-24T06:13:15Z

> When iteration starts, stop any ongoing buffering, and return undefined for result.stdout/result.stderr.

👍 But maybe keep them empty strings? The main use-case is not iteration, and it would be annoying having to guard them for each usage when they are almost always filled in.

ehmicky · 2024-08-24T18:26:43Z

Yes, you're right. Also, making the type change from string to undefined depending on whether buffer.stdout is iterated is not typable in TypeScript, so it's better to turn it into an empty string. 👍

ehmicky · 2024-08-28T17:42:11Z

I've been trying to implement this, but it is more difficult than expected.

The problem is being able to first remove the data listener, then start the for await (const chunk of stream) loop, while being 100% sure no chunk is dropped in between. I've been digging through the Node.js stream codebase, and it's a little tricky: chunks are buffered internally by Node.js, and some of the data/readable event emission is done asynchronously (typically after one or more ticks). So it's not straightforward to switch from one consumer to another without dropping any chunk.

I am trying to think of implementation solutions right now. 🤔

One possible solution is outlined in #43.
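
For illustration, another way to avoid the handoff is to never switch consumers: keep a single `data` listener for the stream's whole lifetime and only change where it sends chunks. The sketch below is an assumption, not necessarily what #43 or #53 implement. `createOutput`, `result` and `iterate` are hypothetical names, and `stream` stands for a Node.js Readable such as the subprocess stdout. Chunks buffered before iteration starts are handed to the iterator, and the buffered result then resolves to an empty string.

```javascript
// Hypothetical helper, for illustration only (not the library's API).
// `stream` is assumed to be a Node.js Readable, e.g. the subprocess stdout.
function createOutput(stream) {
  stream.setEncoding('utf8');
  let mode = 'buffer'; // 'buffer' -> 'iterate' -> 'discard'
  let chunks = []; // Buffered chunks, or chunks waiting for the iterator
  let ended = false;
  let failure;
  let wake = () => {};

  // The only 'data' listener. It is never removed, so there is no window
  // during which a chunk could be emitted without a consumer.
  stream.on('data', chunk => {
    if (mode === 'discard') {
      return; // Iteration stopped early: drop the remaining output
    }

    chunks.push(chunk);
    if (mode === 'iterate') {
      stream.pause(); // Backpressure: wait for the iterator to catch up
      wake();
    }
  });
  stream.once('end', () => {
    ended = true;
    wake();
  });
  stream.once('error', error => {
    failure = error;
    wake();
  });

  // Full output when not iterated, empty string otherwise.
  const result = new Promise((resolve, reject) => {
    stream.once('end', () => resolve(mode === 'buffer' ? chunks.join('') : ''));
    stream.once('error', reject);
  });

  async function * iterate() {
    // Chunks buffered so far are handed over to the iterator.
    mode = 'iterate';
    try {
      while (true) {
        while (chunks.length > 0) {
          yield chunks.shift();
        }

        if (failure) {
          throw failure;
        }

        if (ended) {
          return;
        }

        stream.resume();
        await new Promise(resolve => {
          wake = resolve;
        });
      }
    } finally {
      // Also runs when the caller breaks out of the loop early.
      mode = 'discard';
      stream.resume();
    }
  }

  return {result, iterate};
}
```

A caller would either `await output.result`, or consume `for await (const chunk of output.iterate())` and get an empty string from `output.result`. The sketch assumes `iterate()` is called at most once.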

This was referenced Aug 28, 2024:

Retrieve the subprocess asynchronously #43 (Closed)

Do not buffer output when iterating it #53 (Merged)

sindresorhus closed this as completed in #53 Sep 2, 2024
