Rewrite asyncio/sslproto.py #158
Comments
@1st1 Thanks! I'm trying to quickly profile |
Vanilla asyncio echo server on Python 3.7b3 without SSL: With SSL: I've repainted this in red ( This gives a rough idea about the time portfolio:
Top
|
Just made a very simple PoC, the improvement is not significant. I'll keep trying on something else tomorrow, just posting the results here for now.
Just curious about Trio, which is using the same stdlib With SSL: Without SSL: |
Oh, that's unfortunate. I looked at |
Maybe asyncio's read buffer is too small and we have too much overhead with |
Throughput depends a lot on the chosen cipher. See e.g. https://jbp.io/2018/01/07/rustls-vs-openssl-performance-1.html, the difference can be quite large. |
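For anyone repeating the comparison with different ciphers, here is a minimal sketch of pinning a context to one cipher family using only the stdlib ssl module. The helper names make_context and enabled_cipher_names, and the cipher string "ECDHE+AESGCM", are illustrative assumptions and not what the benchmarks in this thread used. Note that TLS 1.3 suites are not controlled by set_ciphers(), so they may still show up in the list.

```python
import ssl


def make_context(cipher_string="ECDHE+AESGCM"):
    """Build a client context restricted to an OpenSSL cipher string.

    The default cipher string is only an example; pick whatever the
    benchmark should compare.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.set_ciphers(cipher_string)
    return ctx


def enabled_cipher_names(ctx):
    """Return the names of the ciphers the context currently allows."""
    return [cipher["name"] for cipher in ctx.get_ciphers()]


if __name__ == "__main__":
    for name in enabled_cipher_names(make_context()):
        print(name)
```

Passing the same context to both the plain-asyncio and the uvloop server should keep the cipher from being a hidden variable in the numbers.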
All above tests are using cipher Using |
Or you can try uvloop which has 256kb (or something like that) buffer size by default. |
Sorry for the delay! Tweaking buffer size had no luck. Actually:
I'm thinking of these TODOs next, please kindly advise:
(Similarly, quick-and-dirty PoCs first.) |
That's true only for 3.7, but I get your point.
Let's re-sync after (1). (2) and (3) will require us waiting until 3.8, considering your improvements to the |
Got it, thanks! Meanwhile there's also a possibility to make uvloop use pyOpenSSL, libuv-tls or similar libs directly (as a last resort). |
Yes, I was thinking about that too until I looked at |
Christian Heimes also suggests trying to enable http://manpages.org/SSL_CTX_set_read_ahead/3 |
We'll also need to think about offloading SSL encryption to a thread, in case it consumes significant CPU. |
Also need to investigate
I've seen some comments in forums suggesting enabling it for non-blocking IO. |
Fixed two buffer size issue in the PoC, and tested again with:
Results:
|
In the process testing with 100kB requests, I found that Previous commit embeds the loop in We could actually do the same in This has no effect if the payload is small (like 1kB). |
Nice. |
😃 currently working on a relatively more complete |
Great. You can also try to use |
Fully buffered - also supporting client |
It's taking longer than expected, previous commit shows the rough idea - |
Take your time. And... nice diagram! We'll probably want to have a digital version of it to put somewhere in the docs ;) |
Sure thing! Looks like renegotiation is a bit awful - Python didn't expose it, OpenSSL didn't do it fully right, HTTP/2 forbids it, and TLS 1.3 replaced it with something else. I'll try to get it working and tested with minimal effort - perhaps copying some tests from Trio, as Nathaniel seems to have already walked through this.
I eventually figured out how renegotiation works in Python; there are quite a few implications:
Therefore, I came up with this process diagram:
Meanwhile, I'm still trying to fit this into a state transitioning diagram - states probably make more sense with explicit handshake and shutdown included.
Other than the 4 original states, I added 3:
|
Great progress! (GH doesn't send notifications on edit, so this is the first time I see the updates)
Is it so broken that it's impractical to support it? I mean it's a very niche requirement, so if its support comes at the expense of code clarity/maintainability I'd say let's not do it at all. Your call though.
Can you share some details on that? Is that new TLS 1.3 renegotiation protocol easier to implement? Maybe we can get away with supporting just that TLS 1.3 thing? |
Thanks! No worries, I didn't want to make too much noise; I was just about to ask for your review and next steps.
The renegotiation support is actually done in the previous commit. It turns out that as long as we follow the process diagram when calling OpenSSL functions, with those implications considered, the current TLS<=1.2 renegotiation should just work. It's manually tested; a unit test would require pyOpenSSL to be installed, though I'm not sure that is practical in the cpython source code.
For the two renegotiation issues: 1) duplex traffic during renegotiation: it seems that OpenSSL has already fixed the issue by preventing the sending of app data once it has sent the renegotiation Hello. I didn't find the actual fix, but verified the behavior (test code is a bit ugly), so it should just work fine now. 2) TLS 1.3 rekeying is still in OpenSSL 1.1.1 beta, so it is a future thing I think.
Another finding while fixing cpython tests is about canceling the handshake. SSL/TLS has this
A side topic - we may be able to support user-random SSL wrap/unwrap in the same TCP connection without reconnecting (transitioning between
The worry is about being unwrapped by the peer - the application protocol needs to be aware or even in charge of the SSL downgrading, or sensitive data may be sent in plaintext by mistake. For now, an SSL
I think this is getting too far, sorry for the long story! It might be better to just stay with closing the connection immediately, as the first part of RFC 5246 7.2.1 states, and leave the random wrap/unwrap to a new feature. If so, I'll simply clean up the pass through logic (also the one from original |
About the performance thing, I just realized that I’ve been comparing encryption time with loopback network time, which is probably not a practical anchor point. So I’ve compared the absolute throughput of OpenSSL (with this benchmark tool found in Elvis's reference) and
This probably means that our initial problem statement was incomplete - SSL connections are 3-4x slower than regular connections when the network time is about the same as the encryption/decryption time. If the encryption/decryption time were 10% of the network time, the current SSL implementation would be expected to add another 15% ~ 30% overhead. Making the time consumed in the |
Right. This is great. Perhaps we'll be able to squeeze a few more %% when we cythonize the |
What's the latest news here? |
Got a few local commits, will update tomorrow. :)
|
Problems with sslproto.py
The code is very complex and hard to reason about. At this point it's pretty much unmaintainable; many bugs on bugs.python.org linger unresolved just because we don't have the capacity to debug them or review PRs.
It's very slow. SSL connections are 3-4x slower than regular connections.
It has many yet unresolved problems associated with SSL connection shutdown and cleanup:
We don't have enough functional tests for SSL in asyncio. Especially ones that test cancellation & timeouts thoroughly enough.
Requirements for the new implementation
Easy to understand code. Ideally, the state machine should be implemented right in the SSLProtocol (so no SSLPipe abstraction). A good example of a protocol state machine is asyncpg's CoreProtocol <-> Protocol <-> Connection mechanism.
Performance. Ideally, SSL connections shouldn't be more than 10% slower.
Tests. Nathaniel once mentioned that Trio's SSL tests are very robust and that he found bugs in CPython's ssl module with them. We may want to port them to asyncio.
SSL tunneled through SSL should be naturally supported.
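To make the "state machine right in the protocol" requirement a bit more concrete, here is a rough sketch of what an inline, explicitly validated state machine could look like. Everything in it is hypothetical: the state names, the transition table and the SketchProtocol class are made up for illustration and are not the states used by sslproto.py or by any PoC discussed in this issue.

```python
import enum


class State(enum.Enum):
    # Hypothetical state names, for illustration only.
    UNWRAPPED = "unwrapped"
    DO_HANDSHAKE = "do_handshake"
    WRAPPED = "wrapped"
    FLUSHING = "flushing"
    SHUTDOWN = "shutdown"


# Each state maps to the set of states it is allowed to move to next.
ALLOWED = {
    State.UNWRAPPED: {State.DO_HANDSHAKE},
    State.DO_HANDSHAKE: {State.WRAPPED, State.UNWRAPPED},
    State.WRAPPED: {State.FLUSHING},
    State.FLUSHING: {State.SHUTDOWN},
    State.SHUTDOWN: {State.UNWRAPPED},
}


class SketchProtocol:
    """Toy protocol that keeps its state inline, with no helper pipe object."""

    def __init__(self):
        self.state = State.UNWRAPPED
        self.history = [self.state]

    def set_state(self, new_state):
        # Reject any transition that the table does not list.
        if new_state not in ALLOWED[self.state]:
            raise RuntimeError(
                f"invalid transition: {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
        self.history.append(new_state)
```

Keeping the transitions in one table means an unexpected callback order fails loudly at the transition, which is also a convenient hook for cancellation and timeout tests.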
Suggested steps towards the new implementation
Try to benchmark the current asyncio/sslproto.py. Maybe we'll be able to pinpoint the exact bottleneck that makes it so slow; what if there's a bug in the ssl module? Most likely there's no single bottleneck, and the reason for poor performance is just its inefficient code.
Implement a new SSL state machine/protocol in pure Python (I suggest using the new asyncio.BufferedProtocol from the beginning, though). Get it to the point that it passes asyncio's existing SSL tests. Add more tests.
Once we have a working pure Python implementation, we can (a) contribute it to Python 3.8, (b) start optimizing it in Cython for uvloop.
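For reference, a minimal sketch of the asyncio.BufferedProtocol interface (Python 3.7+) mentioned in the steps above. The event loop asks the protocol for a buffer via get_buffer() and reports how many bytes it wrote via buffer_updated(), so incoming data lands in memory the protocol already owns. The class name PreallocatedReader, the default buffer size and the received attribute are assumptions for illustration; a real SSL protocol would feed the data to OpenSSL where this sketch just copies it.

```python
import asyncio


class PreallocatedReader(asyncio.BufferedProtocol):
    """Receives data into one reusable buffer instead of fresh bytes objects."""

    def __init__(self, size=256 * 1024):
        # The size is an arbitrary example value.
        self._buffer = bytearray(size)
        self._view = memoryview(self._buffer)
        self.received = bytearray()

    def get_buffer(self, sizehint):
        # The event loop writes incoming data directly into this view.
        return self._view

    def buffer_updated(self, nbytes):
        # Only the first nbytes of the buffer hold fresh data.
        self.received += self._view[:nbytes]

    def eof_received(self):
        # Returning a false value lets the transport close itself.
        return False
```

An instance can be handed to loop.create_connection() or loop.create_server() through a protocol factory, the same way as a regular asyncio.Protocol.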
cc @fantix