File descriptor leak under high concurrency #145
Hi @kohlschuetter, would it be possible to look into this issue?
Your commands to reproduce may not work on all platforms (e.g., Alpine Linux) and may return the wrong process ("grep" itself on macOS). I'm trying to verify with
and
The commit you recommend to reproduce does not work for me. When I manually remove the monkey-patched files, everything still works OK on macOS, but I see connection errors on Alpine Linux. The monkey-patched code (assuming it's the same as in the PR) does indeed seem to fix that bug, but I'm not seeing the timeout error messages.
I didn't test on Alpine; @kevink-sq, did you reproduce on Linux? Not sure if the macOS version matters in this case:

may actually get the PID of the

Also beware that
I've merged the PR, though I'm not really happy that we don't have a proper test case. That said, I can see these errors in curl go away with the fix.
I'm able to reproduce the leaking junix file descriptor on Linux via multipass against this baseline branch (which is the same as 575469b3453e7b09d9d60f6460cdab8117b8e773 above, but with the debug logging and other monkey-patch files removed). Linux info:
In one terminal, I started monitoring the fd count of
In another terminal, I started the server against the baseline branch
In another terminal, I started calling the junix UDS:
I could verify that the monitored fd count started climbing, but not for the native endpoint. Screenshots of the above process: server started with monitoring, junix UDS not yet being called; started calling the junix UDS; running against the native endpoint does not leak.
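The fd-count monitoring described above can also be approximated from Java itself. This is a minimal sketch, assuming a Linux system where `/proc/self/fd` lists the current process's open descriptors (on macOS, `/proc` does not exist and a tool such as `lsof -p <pid>` would be needed instead); it is not junixsocket code.

```java
import java.io.File;

// Counts the open file descriptors of the current process via /proc (Linux only),
// mirroring the fd-count monitoring loop used in the reproduction above.
public class FdCount {
    /** Returns the number of open fds, or -1 if /proc is unavailable (e.g. macOS). */
    static int openFdCount() {
        String[] entries = new File("/proc/self/fd").list();
        return entries == null ? -1 : entries.length;
    }

    public static void main(String[] args) {
        System.out.println("fd-count=" + openFdCount());
    }
}
```

Polling this in a loop while the load test runs shows whether the count grows without bound, as it did for the junix endpoint.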
Repeating the above with the monkey-patched fix (baseline branch patched) results in junix behaving the same as native.
Can you please verify with the latest 2.8.3-SNAPSHOT, and all monkey-patched code removed? Make sure the following section is in your POM:
I can verify the fix works against the snapshot with all of the monkey-patched code removed: baseline branch 2.8.3-SNAPSHOT
It's definitely possible to stress-test this, so we could write integration tests to that effect. However, I don't see why we couldn't write the code in such a way that these routines are unit-testable; then we could mock selection keys and see how the cleanup code handles them. I could take a pass at a refactor to facilitate unit testing, if you would like?
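One way to make the cleanup routine unit-testable, as proposed above, is to extract it so that it operates on plain collections plus a validity check instead of live SelectionKeys. This is a hypothetical sketch: `removeInvalidKeys` and the `Predicate`-based seam are illustrative names, not junixsocket API.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical refactor sketch: the cleanup logic takes a plain Set and a
// validity predicate, so a unit test can drive it with simple stand-in objects
// instead of mocking real SelectionKeys backed by sockets.
public class KeyCleanup {
    /** Removes every element the validity check rejects; returns how many were removed. */
    static <K> int removeInvalidKeys(Set<K> keys, Predicate<K> isValid) {
        int removed = 0;
        for (Iterator<K> it = keys.iterator(); it.hasNext(); ) {
            if (!isValid.test(it.next())) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Set<String> keys = new HashSet<>(Set.of("valid-1", "invalid-2", "valid-3"));
        int removed = removeInvalidKeys(keys, k -> k.startsWith("valid"));
        System.out.println("removed=" + removed + " remaining=" + keys.size()); // removed=1 remaining=2
    }
}
```

In production code the predicate would simply be `SelectionKey::isValid`; in tests it can be any lambda, which is the whole point of the seam.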
Yes, that would be great, thank you!
@kohlschuetter I think I want to approach it by first making some broader fixes before I break it down into unit-testable functions. The first thing I want to attack is locking, where I noticed some inconsistencies in the way locks are managed. Here is a PR where I do some renaming followed by lock normalization: #146. I would like to follow this up with reworking how
Clients may iterate upon AFSelector.selectedKeys() while the selected keys are modified in another thread. Change the underlying datastructure to be thread-safe (use ConcurrentHashMap), and add all "ready" selectors even if they're marked invalid (they will be removed later). We reimplement the fix as the original patch caused a FileDescriptor leak and TIPC test failures. #142 #145
Before we continue with improvements to AFSelector, please verify that the updated patch (5e8a6e6) resolves all concurrency issues / FileDescriptor leaks for you. I reverted the original patch attempts because we really introduced the regression in 2.8.2, and even with the latest changes, a TIPC test would fail on Linux. Please also verify by running the selftest jar on Linux with the TIPC kernel module loaded, and a bearer enabled:
It seems to be working okay on macOS from a preliminary test. However, this patch doesn't implement the locking in the way that the documentation suggests we should:
I think it would make sense to implement things as similarly to Java's standard library as possible, but I'll leave that up to you. I don't have a Linux test setup handy; maybe @kevink-sq can try to verify the change?
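For reference, the `java.nio.channels.Selector` documentation specifies a lock ordering for implementations: operations that need the key sets synchronize on the selector itself, then on its key set, then on its selected-key set, in that order. A minimal sketch of that nesting follows; the field names are illustrative, not the actual AFSelector members.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the lock-ordering contract documented for java.nio.channels.Selector:
// synchronize on the selector, then its key set, then its selected-key set.
// Acquiring the three locks in one global order prevents deadlocks between
// select() and concurrent key-set iteration.
public class LockOrder {
    private final Object selectorLock = new Object();
    private final Set<Object> keys = new HashSet<>();
    private final Set<Object> selectedKeys = new HashSet<>();

    int selectedCountSafely() {
        synchronized (selectorLock) {          // 1. the selector itself
            synchronized (keys) {              // 2. the key set
                synchronized (selectedKeys) {  // 3. the selected-key set
                    return selectedKeys.size();
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("selected=" + new LockOrder().selectedCountSafely());
    }
}
```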
Yeah, it's really odd. When you replace that ConcurrentHashMap with a HashMap, jetty throws a ConcurrentModificationException, which means that either two threads are accessing the structure concurrently, or modifications happen after the iterator has been created. Maybe there's something else we're missing?
So I think it's the .clear() that's throwing off jetty. In the original SelectorImpl, they check if the key already exists (
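The ConcurrentModificationException behavior discussed above is easy to demonstrate in isolation: a HashMap iterator fails fast once `clear()` bumps the map's modification count, whereas a ConcurrentHashMap iterator is weakly consistent and never throws it. This is a standalone sketch, not junixsocket code.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Demonstrates why swapping the ConcurrentHashMap back to a HashMap surfaces a
// ConcurrentModificationException: HashMap iterators fail fast after clear(),
// while ConcurrentHashMap iterators are weakly consistent and never throw CME.
public class CmeDemo {
    static boolean throwsCme(Map<Integer, Integer> map) {
        for (int i = 0; i < 4; i++) map.put(i, i);
        Iterator<Integer> it = map.keySet().iterator();
        map.clear(); // modification after the iterator was created
        try {
            while (it.hasNext()) it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap throws CME: " + throwsCme(new HashMap<>()));
        System.out.println("ConcurrentHashMap throws CME: " + throwsCme(new ConcurrentHashMap<>()));
    }
}
```

Note this triggers even single-threaded, which matches the observation that the modification (here, `clear()`) merely has to happen after the iterator was created.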
Since we already have a ConcurrentHashMap, let's make use of the value, which now indicates the "selected" state. Introduce MapValueSet, which is a view over elements of the "keysRegistered" map, and precisely only those elements that have a certain value. For each call to select, we increment the expected value, and then set only the actually selected entries to that value, so we don't have to clear the entire map. #145
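The epoch trick described in this commit message can be sketched independently: each call to select bumps a generation counter, actually-selected keys are stamped with the current generation, and "clearing" the selected view costs O(1) because it is just another bump. This is an illustrative sketch under that description, not the actual MapValueSet code; all names are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Sketch of the "selected = value equals current epoch" idea: instead of
// clearing a separate selected-keys set on every select(), bump an epoch
// counter and stamp only the keys that were actually selected. Entries
// stamped in earlier epochs become stale automatically and are ignored.
public class EpochSelectedSet {
    private final ConcurrentHashMap<String, Long> keysRegistered = new ConcurrentHashMap<>();
    private volatile long epoch = 0; // select() assumed single-threaded, as with Selector

    void register(String key)   { keysRegistered.put(key, 0L); } // 0 = never selected
    void beginSelect()          { epoch++; }                     // O(1) "clear"
    void markSelected(String k) { keysRegistered.replace(k, epoch); }

    /** View of precisely those keys stamped during the current epoch. */
    Set<String> selectedKeys() {
        long e = epoch;
        return keysRegistered.entrySet().stream()
                .filter(en -> en.getValue() == e)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        EpochSelectedSet s = new EpochSelectedSet();
        s.register("a"); s.register("b"); s.register("c");
        s.beginSelect(); s.markSelected("a"); s.markSelected("c");
        System.out.println("selected #1: " + s.selectedKeys()); // a and c
        s.beginSelect(); s.markSelected("b");                   // old stamps now stale
        System.out.println("selected #2: " + s.selectedKeys()); // only b
    }
}
```

The memory win over a second map is that no per-select allocation or full-map clear is needed; the cost is one long value per registered key.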
@ThePumpingLemma I was able to remove the second ConcurrentHashMap with 38cda27. Can you please take another look?
Since we already have a ConcurrentHashMap, let's make use of the value, which now indicates the "selected" state. Introduce MapValueSet, which is a view over elements of the "keysRegistered" map, and precisely only those elements that have a certain value. For each call to select, we increment the expected value, and then set only the actually selected entries to that value, so we don't have to clear the entire map. #145
(Fixed checkstyle warning)
Fix errors only occurring with -Drelease #145
Fix errors only occurring with -Drelease #145
@kohlschuetter are the latest changes (including 38cda27) on
@kevink-sq Uploaded just now! 20231109.194633-3
I reviewed the changes and have some thoughts:
All in all, I think it is possible to prove your approach safe, but I'm not personally comfortable with it.
MapValueSet wraps the ConcurrentHashMap, so it is as safe as that, and may throw the ConcurrentModificationException only in certain narrow circumstances with

Once we had the ConcurrentHashMap for registered keys, it was clear that simply ignoring the value was a waste. Adding another ConcurrentHashMap resulted in simpler code but higher memory requirements. I'm happy that we got that out of the way.
As a next step, could you take a look at creating some unit tests around the selected-keys behavior? Once we have good test coverage, we can make more confident optimizations without introducing further regressions. Thanks for your help, @ThePumpingLemma!
Reran the above tests against
I'm also for @ThePumpingLemma's approach of mirroring the reference implementation in the Java standard library. Issues like the concurrency bug and the file descriptor leaks were resolved by following their examples, and straying from them with novel implementations may bring setbacks simply because those haven't been battle-tested as much. Unless there are clear benefits to the approach, I'm not sure what is gained. Furthermore, JEP-380 implements kqueue/epoll under the hood, which gives it a performance edge over junix. If junix were to follow suit, I think it'd be easier to make the changes if references were taken from the Java standard library beforehand.
Once Java supports all sorts of obscure socket types like AF_UNIX, AF_TIPC, AF_VSOCK, etc., there will probably be little need for junixsocket, which would actually be a good thing. Until then, junixsocket remains a standalone project, under a different license, so copying code from the standard library just isn't an easy option. Moreover, unlike commercial endeavors like Square/Cash App, this is my personal project. I'm not being paid to develop it, which means I can deliberately spend some extra time on polishing bits that I would otherwise leave as-is.

I hear your concerns, and I don't want to sound harsh. However, please accept my explanation, and decision. In this instance, removing the second Set is, in my view, the right thing to do, as it removes unnecessary allocation operations in a hot path (polling sockets). It also obviously fixes the issue you initially reported, without introducing any obvious regressions (unlike the earlier attempts).

Since I see you care about improving junixsocket, I hope you continue to contribute by filing tickets and PRs. Abstract, high-level critique such as "not battle tested" or "what about kqueue/epoll" is not very constructive for this present ticket. Please be very specific with your concerns and, ideally, provide solutions for them. As I said, I'm not being paid for this project, so I have to manage my attention wisely. If you find that junixsocket is slower/worse than some other approach, please file a bug with some data/code, ideally with a PR that you accept may end up not the way it started. As I explained earlier, the best way to contribute to the very specific situation of

You're very welcome to use JEP-380, just as you're welcome to help improve junixsocket. For what it's worth, I'm seeing higher throughput with junixsocket compared to JEP-380, and I hope you will find, too, that junixsocket is the better choice, at least for the foreseeable future.
2.8.3 has just been released, including this fix. Please verify and re-open this ticket if necessary. Thanks again for reporting and working with me on this issue. |
Describe the bug
With our PoC at https://github.com/kevink-sq/jetty-concurrency-issue-poc/tree/575469b3453e7b09d9d60f6460cdab8117b8e773, we noticed unbounded growth of file descriptors when running our simple load test:
seq 1 15000 | xargs -P1500 -I{} curl --unix-socket ./junix-ingress.sock http://localhost/process
Leading to timeouts from curl and timeout exceptions in jetty:
To Reproduce
Steps to reproduce the behavior (assuming macOS):
mvn compile exec:java -Dexec.mainClass="org.example.Main"
seq 1 15000 | xargs -P1500 -I{} curl --unix-socket ./junix-ingress.sock http://localhost/process
Expected behavior
All curl calls should succeed and there should be no timeouts in jetty.
Output/Screenshots
N/A
Please make sure to test the problem still occurs on the latest version of junixsocket
Tested on v2.8.2
Notes
Proposed fix: #144