Hang waiting for create_server #39
Comments
The general advice is to create a minimal testcase and then debug the code with gdb (or lldb) and print statements ;) In this particular case, could you please tell me the values of |
I'd also suggest to add prints before each |
Hm, actually, there is only one |
And there's also an option to make a debug build with |
On Thu, Jul 7, 2016 at 5:39 PM, Yury Selivanov [email protected]
The address is almost always Does the loop have any other scheduled tasks? No.
I'm 70% sure this is a resource exhaustion problem of some sort. Also worth noting:
Jim Jim Fulton |
On Thu, Jul 7, 2016 at 6:05 PM, Yury Selivanov [email protected]
|
On Thu, Jul 7, 2016 at 6:45 PM, Yury Selivanov [email protected]
https://github.com/zopefoundation/ZEO/blob/debug-uvloop/src/ZEO/asyncio/server.py#L226 |
The idea is to create a debug task that will periodically print debug info on the screen. This is how I use it in uvloop benchmarks: I'm thinking about adding a manhole or something, that would let you attach to a running event loop from the outside. |
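A minimal sketch of such a periodic debug task, assuming Python 3.7 or later (for `asyncio.all_tasks` and `asyncio.run`). The names `debug_monitor` and `run_demo` are hypothetical and are not taken from the uvloop benchmarks; the pending-task count is just one example of what could be reported.

```python
import asyncio


async def debug_monitor(interval, report):
    """Every `interval` seconds, report how many tasks are still pending."""
    while True:
        await asyncio.sleep(interval)
        pending = [t for t in asyncio.all_tasks() if not t.done()]
        report(f"pending tasks: {len(pending)}")


async def _demo(lines):
    # Start the monitor next to the "real" work, then shut it down cleanly.
    monitor = asyncio.ensure_future(debug_monitor(0.01, lines.append))
    await asyncio.sleep(0.05)  # stand-in for the real server work
    monitor.cancel()
    try:
        await monitor
    except asyncio.CancelledError:
        pass


def run_demo():
    """Run the monitor briefly and return the lines it reported."""
    lines = []
    asyncio.run(_demo(lines))
    return lines
```

In a real server, `report` would more likely be `print` or a logger method, and the monitor would be started right before the loop begins serving. If the monitor itself stops reporting, the loop is blocked rather than idle.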
The program was running out of file descriptors. Not an issue with uvloop at all. Sorry for the noise. |
Thanks a lot for confirming this! |
I was wrong. It has nothing to do with running out of file descriptors. (I was able to make uvloop fail when having too many open files, but that's not what made the tests fail.) Many of the ZEO tests are integration tests. Typically, these tests run a server in a subprocess using multiprocessing. A few tests run the server in a thread, which makes it possible to introspect server state. Mixing thread-based and multiprocessing-based tests leads to hanging in the test runner when using uvloop. Here's a minimal script that demonstrates the problem: https://gist.github.com/jimfulton/59c02a96ebe4720d512b2b0ee426c7ed I'm not sure if this issue is specific to multiprocessing (I suspect so because magic :) ) or would also be an issue with other ways of running a server in a subprocess. |
uvloop is a bit faster than the standard asyncio event loop. Unfortunately, there's an issue with the ZEO tests and using uvloop in the ZEO server: MagicStack/uvloop#39 Fortunately, most of the performance benefit of using uvloop seems to be in the client. For now, we'll just use uvloop on the client side, as the incremental gain from using it in the server isn't worth further wrestling with the tests. (I spent quite a bit of time just narrowing down the cause of the test issue.) With this PR, if uvloop can be imported, it will be used for the client. (uvloop requires Python 3.5 or later and doesn't support Windows.)
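The "use uvloop if it can be imported" behavior can be sketched roughly as follows. This is an illustration of the general pattern, not the actual ZEO client code; the helper name `new_event_loop` is an assumption.

```python
import asyncio


def new_event_loop():
    """Return a uvloop loop when uvloop is importable, else a stock asyncio loop."""
    try:
        import uvloop  # optional dependency; not available on Windows
    except ImportError:
        return asyncio.new_event_loop()
    return uvloop.new_event_loop()
```

Callers get an object with the standard event loop interface either way, so the rest of the client does not need to know which implementation is in use.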
Interesting. Thanks a lot for the script -- will investigate tomorrow. |
Looks like this bug is in libuv. A call to Options:
Not sure how to even estimate which option is more preferable here. cc @saghul |
This is the case, yes. The threadpool has some global state which needs to be adjusted in the child. Since There is, however, a pull request which addresses this: libuv/libuv#846 more eyes are always welcome!
You might want to do this for another reason though: Python's |
I've seen that PR, it looks good to me (I spent quite a bit of time looking and playing with libuv internals). But I'm not a core libuv committer to say the LGTM :)
Yes, but I already have all the code to normalize libuv results into what Python code is expecting to see. All that code would be useful for c-ares or getdns though. @jimfulton How serious is this bug? Does it preclude you from using ZODB with uvloop? |
The bug is mainly an inconvenience at this point. Informal performance measurements indicate that most of the win is on the client, so for the moment, I'm not using uvloop on the server. I could use it on the server if I rearrange some tests, which I think I'll probably do. (If it was easy for you to fix, I'd have waited, but I'll go ahead and work around it.) (Lack of connect_accepted_socket is also preventing using on the server for the main project I'm working on. :) ) |
I'll try to add |
If you don't mind, please leave a comment, it helps.
I'm not sure about that. c-ares does not implement |
@saghul I've left a comment for libuv/libuv#846 -- it solves this particular bug and other bugs we have with |
@1st1 thanks! I can't really give an estimate, it's a (relatively) big change, so we are cautious. |
To accomplish this, it was necessary to move the tests that run servers in threads, rather than using multiprocessing, into their own layer. This is due to a bug currently in uvloop that prevents running uvloop servers both in a process and in subprocesses created with multiprocessing: MagicStack/uvloop#39 To run the tests with uvloop installed, it's necessary to use the ``-j`` option to run layers in separate processes.
@jimfulton FWIW uvloop v0.5.1 has |
Great! I'll try it today. BTW, I've updated the ZEO tests so they pass with uvloop (when run a certain way that runs groups of tests in separate processes). |
connect_accepted_socket works great. :) ZEO is now using it in its multi-threaded-server configuration. Thanks. |
Finally some good news ;) |
Fixed in master. Will be in the next uvloop release (0.9.0) soon. |
Please try uvloop v0.9.0. |
I need some suggestions for debugging server startup hangs.
Startup code looks like:
What's sometimes happening, in the context of the ZEO tests, is that the run_loop_until_complete call above never returns.
This is in a port of ZEO, github.com/zopefoundation/ZEO, to asyncio. The ZEO tests are exhaustive, and I suspect in uvloop's case, exhausting. :) The tests start and stop servers 10s, maybe 100s of times in a test run that typically lasts a few minutes. If I run a smaller subset of the tests I don't get a hang.
Most tests run the server as a subprocess using multiprocessing. Some tests run the server using threading. It appears that the server isn't exiting, but merely hanging.
Do you have any suggestions for how to debug this? I've tried setting PYTHONASYNCIODEBUG=1 and enabling debug logging, but that isn't yielding any additional information.
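One additional, generic way to see where a startup hang is stuck is the standard library's `faulthandler` watchdog, which dumps every thread's Python traceback if a call takes too long. The sketch below is an assumption about how that could be wired up, not something from the ZEO code; the helper name `run_with_hang_watchdog` and the default timeout are made up for illustration. `loop.set_debug(True)` is the programmatic counterpart of `PYTHONASYNCIODEBUG=1`.

```python
import asyncio
import faulthandler
import sys


def run_with_hang_watchdog(loop, coro, timeout=30.0):
    """Run `coro` on `loop`; dump all thread tracebacks if it takes too long."""
    loop.set_debug(True)  # same effect as PYTHONASYNCIODEBUG=1 for this loop
    # If run_until_complete() has not returned after `timeout` seconds,
    # faulthandler writes the traceback of every thread to stderr.
    faulthandler.dump_traceback_later(timeout, file=sys.__stderr__)
    try:
        return loop.run_until_complete(coro)
    finally:
        faulthandler.cancel_dump_traceback_later()
```

Note that the dump shows Python-level frames only. If the hang is inside C code, the traceback will just point at the call into the loop, and a native debugger is still needed.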