Node 20 recursive file watching is misbehaving on Linux #48437
Comments
Does the issue resolve if you add a small timeout after creating the watcher and before writing the file? I suspect the asynchronous, polling nature of fs causes this issue.
Looks like that did help - https://github.com/cjihrig/recursive-watcher-bug/actions/runs/5247088842/jobs/9476792611. I don't think we can realistically ask every user of recursive file watching to add that to their code though.
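For readers following along, the workaround discussed above can be sketched roughly as follows. This is only an illustration: `watchThenWrite` is a hypothetical helper, the file name `example.txt` is made up, and the 100 ms default delay is an arbitrary guess rather than a value taken from the reproduction repository.

```javascript
const fs = require('node:fs');
const path = require('node:path');
const { setTimeout: sleep } = require('node:timers/promises');

// Hypothetical helper: start a recursive watcher, pause briefly, then write.
// The default delay is an assumption, not a number from the thread.
async function watchThenWrite(dir, delayMs = 100) {
  const seen = [];
  const watcher = fs.watch(dir, { recursive: true }, (eventType, filename) => {
    seen.push({ eventType, filename });
  });
  try {
    // Give the watcher time to finish setting up before touching the tree.
    await sleep(delayMs);
    fs.writeFileSync(path.join(dir, 'example.txt'), 'hello');
    // Leave time for the change event to be delivered before closing.
    await sleep(delayMs);
  } finally {
    watcher.close();
  }
  return seen;
}
```

Calling `watchThenWrite(someEmptyDir)` resolves to the list of events observed. As noted above, a fixed delay only hides the race; it does not remove it.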
@cjihrig I agree. It seems the issue is within these lines: https://github.com/nodejs/node/blob/main/lib/internal/fs/recursive_watch.js#L222 We can avoid using promisify, but I'm wary of blocking the main thread while this is resolving. Unfortunately, I'm on parental leave and have limited access. If this task is not urgent, I can take a look at it in a week and a half.
Can I take on this issue?
Please do. Happy to help if you get stuck on something.
So the traverse function doesn't complete its execution before the files start to be watched, thus causing a race condition? Would introducing a mechanism to 'await' the initialization of the watcher resolve this issue?
That would be a breaking change since it will cause the function to behave differently in different operating systems.
Understood. Would it be feasible to implement a flag/event that is set once the traverse function has completed? This way, we could delay file watching until after the initial traversal.
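The flag/event idea can be illustrated in user land. The sketch below is not Node's internal implementation: `ReadyWatcher`, its `ready` flag, and the `'ready'` and `'change'` event names are all hypothetical. It watches each directory individually with the non-recursive `fs.watch()`, walks the tree asynchronously, and emits `'ready'` only once the whole walk has finished. Directories created after the walk are not picked up.

```javascript
const { EventEmitter } = require('node:events');
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical user-land wrapper; names and events are assumptions.
class ReadyWatcher extends EventEmitter {
  constructor(root) {
    super();
    this.watchers = [];
    this.closed = false;
    this.ready = false;
    this._watchTree(root).then(
      () => {
        if (this.closed) return;
        this.ready = true;
        this.emit('ready'); // the initial traversal has completed
      },
      (err) => this.emit('error', err),
    );
  }

  // Watch `dir`, then descend asynchronously into its subdirectories.
  async _watchTree(dir) {
    if (this.closed) return;
    this.watchers.push(fs.watch(dir, (eventType, filename) => {
      const target = filename === null ? dir : path.join(dir, filename);
      this.emit('change', eventType, target);
    }));
    const handle = await fs.promises.opendir(dir);
    for await (const entry of handle) {
      if (entry.isDirectory()) {
        await this._watchTree(path.join(dir, entry.name));
      }
    }
  }

  close() {
    this.closed = true;
    for (const watcher of this.watchers) watcher.close();
  }
}
```

A caller would attach `'error'` and `'ready'` listeners right after construction and only start writing files inside the `'ready'` handler. The trade-off raised in the thread still applies: callers on every platform would have to wait for an event that only some platforms need.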
Regarding the race condition issue in the code you shared, I've analyzed it and come up with a solution. The problem arises from the I propose the following solution to address this issue: `async function traverse(dir, files = new SafeMap(), symbolicLinks = new SafeSet()) { const filenames = await opendir(dir); for await (const file of filenames) {
} await Promise.all(subdirectories); // Wait for all subdirectory promises to resolve return files; // ... async kFSWatchStart { try {
} catch (error) {
Here are the changes made and their explanations:
1. Modified the traverse function to be an async function and added the await keyword before the recursive call to traverse for subdirectories. This ensures that the function waits for the subdirectory traversal to complete before moving on, resolving the race condition.
2. In the [kFSWatchStart] method, added the await keyword before calling traverse for the initial directory. This ensures that the function waits for the traversal to complete before proceeding with watching the files.
@spacesugam are you still interested in sending a PR? I would go for the first option.
@anonrig @cjihrig I took another look at this issue, and there is a race condition between traversing the file system asynchronously and This is not a problem on Windows or Mac because they have different low-level primitives for watching
The only solution to this bug is to make the traversal of the directory tree synchronous on Linux. @anonrig do you see any specific issue with that?
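To make the proposal concrete, here is a minimal user-land sketch of a synchronous traversal. It is not the code in `lib/internal/fs/recursive_watch.js`; `collectDirsSync` and `watchTreeSync` are hypothetical names. The point it illustrates is that when the walk uses only blocking calls, every per-directory watcher already exists by the time the function returns, so a write issued on the next line cannot slip in ahead of the watcher.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// List `root` and every directory beneath it, using only blocking calls.
function collectDirsSync(root) {
  const dirs = [root];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    // Dirent#isDirectory() is false for symlinks, so links are not followed.
    if (entry.isDirectory()) {
      dirs.push(...collectDirsSync(path.join(root, entry.name)));
    }
  }
  return dirs;
}

// Hypothetical helper: all watchers are installed before this returns.
function watchTreeSync(root, listener) {
  const watchers = collectDirsSync(root).map((dir) =>
    fs.watch(dir, (eventType, filename) => {
      listener(eventType, filename === null ? dir : path.join(dir, filename));
    }));
  return { close: () => watchers.forEach((watcher) => watcher.close()) };
}
```

The cost is the one raised earlier in the thread: the walk blocks the main thread for as long as it takes to read the tree.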
Quite possibly related, calling
Looks good to me!
@novemberborn Can you create a separate issue for this, with a possible reproduction please?
…entsWatcher and fsevents dependency

Summary: Replaces the `FSEventsWatcher` with `fs.watch(dir, {recursive: true})` ("`NativeWatcher`"), allowing us to remove a dependency and open this up to all macOS and Windows users.

Why now? Why is this better? That depends on the platform...

## macOS
When `FSEventsWatcher` was written, Node.js didn't support OS-native recursive watch on macOS (Node.js supported the `recursive` flag, but the implementation inefficiently opened a file descriptor for each watched file - see libuv/libuv#2082). Therefore, we used the `fsevents` library to bind to the native macOS API, and achieved cheap recursive watching at the cost of a slightly awkward optional dependency.

Now, Node.js effectively does the same thing behind the scenes that we've always done, using the [OS FSEvents primitives](https://developer.apple.com/documentation/coreservices/file_system_events), and there's no need for us to use the `fsevents` npm package.

## Windows
Similarly, (this bit of Jest and) Metro pre-dates Node's support of Windows native recursive watching, introduced to libuv [as recently as 2015](libuv/libuv#421). We've never had a good alternative on Windows though, and simply fell back to walking the whole tree and watching each directory individually (`FallbackWatcher`).

## Linux
Unfortunately, Linux has no native primitive for recursive watching, and Node.js's attempts to patch over that are still in flux, [with no great options](nodejs/node#48437). Watchman, or our `FallbackWatcher`, remain our best options on Linux. (That's likely to continue to be the case even if `Node.js` settles on a `'ready'` event or async API, because by controlling the walk we can avoid watching ignored subtrees.)

Changelog:
```
- **[Performance]**: Use fast, recursive watch on Windows, and on macOS regardless of optional dependency installation.
```

Differential Revision: D67636998
…and fsevents dependency (#1420)

Summary:
Pull Request resolved: #1420

Replaces the `FSEventsWatcher` with `fs.watch(dir, {recursive: true})` ("`NativeWatcher`"), allowing us to remove a dependency and open this up to all macOS users, and (shortly) Windows users.

Why now? Why is this better? That depends on the platform...

## macOS
When `FSEventsWatcher` was written, Node.js didn't support OS-native recursive watch on macOS (Node.js supported the `recursive` flag, but the implementation inefficiently opened a file descriptor for each watched file - see libuv/libuv#2082). Therefore, we used the `fsevents` library to bind to the native macOS API, and achieved cheap recursive watching at the cost of a slightly awkward optional dependency.

Now, Node.js effectively does the same thing behind the scenes that we've always done, using the [OS FSEvents primitives](https://developer.apple.com/documentation/coreservices/file_system_events), and there's no need for us to use the `fsevents` npm package.

## Windows
Similarly, (this bit of Jest and) Metro pre-dates Node's support of Windows native recursive watching, introduced to libuv [as recently as 2015](libuv/libuv#421). We've never had a good alternative on Windows though, and simply fell back to walking the whole tree and watching each directory individually (`FallbackWatcher`).

In principle, `NativeWatcher` could mean watching starts up near-instantaneously and cheaply, using `ReadDirectoryChangesW` via `libuv`: https://github.com/libuv/libuv/blob/v1.49.2/src/win/fs-event.c#L46. However **we keep this macOS-only for now because Windows needs more work** - e.g., Windows exhibits races on event-then-stat, emits multiple events for file writes, emits unwanted change events on directories, and may throw EPERM if a file is locked for pending deletion.

## Linux
Unfortunately, Linux has no native primitive for recursive watching, and Node.js's attempts to patch over that are still in flux, [with no great options](nodejs/node#48437). Watchman, or our `FallbackWatcher`, remain our best options on Linux. (That's likely to continue to be the case even if `Node.js` settles on a `'ready'` event or async API, because by controlling the walk we can avoid watching ignored subtrees.)

Changelog:
```
- **[Performance]**: Use fast, recursive watch on macOS regardless of optional dependency installation.
```

Reviewed By: huntie

Differential Revision: D67636998

fbshipit-source-id: 5dfc9b81cbd22640c22695a9414242b929c5581d
Version
20.3.0
Platform
linux
Subsystem
fs
What steps will reproduce the bug?
I put together a repro at https://github.com/cjihrig/recursive-watcher-bug that shows the following passing on Windows and macOS, but failing on Ubuntu. I'm not sure if this is specific to GitHub Actions.
How often does it reproduce? Is there a required condition?
Always reproduces for me.
What is the expected behavior? Why is that the expected behavior?
I expect the test to pass.
What do you see instead?
The test times out.
Additional information
I noticed this while trying to update Platformatic to support Node 20 and created the minimal reproduction linked above.
I also noticed that Platformatic was passing `recursive: true` to the promisified version of `watch()` on Ubuntu on earlier versions of Node. It should have thrown `ERR_FEATURE_UNAVAILABLE_ON_PLATFORM`, but did not. I did see that error with the callback based `watch()` though, which makes me think there is some missing validation on older versions of Node in addition to this bug.

cc: @anonrig who implemented recursive file watching on Linux.
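As a rough illustration of the callback-based behavior described above, the following sketch probes whether `fs.watch()` accepts `recursive: true` on the current platform and Node version by checking for the `ERR_FEATURE_UNAVAILABLE_ON_PLATFORM` error code. `supportsRecursiveWatch` is a hypothetical helper, and it deliberately exercises only the callback API, not the promisified one.

```javascript
const fs = require('node:fs');

// Hypothetical probe: does callback-based fs.watch() accept
// `recursive: true` for `dir` on this platform and Node version?
function supportsRecursiveWatch(dir) {
  try {
    // Open and immediately close a recursive watcher on an existing directory.
    fs.watch(dir, { recursive: true }).close();
    return true;
  } catch (err) {
    if (err.code === 'ERR_FEATURE_UNAVAILABLE_ON_PLATFORM') return false;
    throw err; // any other failure is unrelated to platform support
  }
}
```

Passing a small, existing directory keeps the probe cheap, since on Linux a recursive watcher has to walk everything under it.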