Skip to content

gh-99813: Start using SSL_sendfile when available #99907

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 52 commits into
base: main
Choose a base branch
from

Conversation

illia-v
Copy link
Contributor

@illia-v illia-v commented Nov 30, 2022

New _ssl__SSLSocket_sendfile_impl function is very similar to _ssl__SSLSocket_write_impl with a few adjustments for the SSL_sendfile syscall.

I am glad there was non-SSL socket.socket.sendfile and underlying socket.socket._sendfile_use_sendfile implemented. I reused that code for new ssl.SSLSocket._sendfile_use_ssl_sendfile by moving shared logic to socket.socket._sendfile_zerocopy.

And test.test_ssl.ThreadedTests.test_sendfile needed to have a socket bound before an SSL context wrapped it because this seemed to affect kTLS availability.

@illia-v illia-v changed the title gh-99813: Start using SSL_sendfile when available gh-99813: Start using SSL_sendfile when available Nov 30, 2022
@illia-v illia-v marked this pull request as ready for review March 13, 2023 21:29
@illia-v
Copy link
Contributor Author

illia-v commented Mar 13, 2023

@gpshead could you please review this when you have a chance?

@AlexWaygood AlexWaygood requested a review from gpshead July 26, 2023 17:49
@illia-v
Copy link
Contributor Author

illia-v commented Jul 26, 2023

I've recently pushed some changes after testing the new functionality manually by sending small and large files from Linux with the tls module loaded, it worked well

Copy link
Member

@gpshead gpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't have time to look at this in detail for a while. I'm just dropping a note about one issue seen while quickly skimming it.

@illia-v illia-v requested a review from gpshead August 2, 2023 19:54
@illia-v illia-v requested a review from picnixz April 12, 2025 14:22
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few final comments

with client_context.wrap_socket(socket.socket(),
server_hostname=hostname) as s:
s.connect((HOST, server.port))
# kTLS seems to work only with a connection created before
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check whether it "seems" to work or not? (for a personal project I would accept this but I'm interested in knowing whether this is an issue with Python SSL or OpenSSL)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll post more details about this soon

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can reproduce the same issue with C code using OpenSSL (without Python). Please check my findings here. I mentioned a possible patch, let me know if we can apply it or there can be undesirable consequences.

FYI, ssl.OP_ENABLE_KTLS to enable kTLS has been available since Python 3.12 (and it could be added to options as a simple integer even before). And the issue is not specific to SSL_sendfile, it affects kTLS availability for simple reads and writes.

I added my reproducer to an existing OpenSSL issue openssl/openssl#19676 (comment).

@illia-v
Copy link
Contributor Author

illia-v commented Apr 19, 2025

@picnixz OpenSSL provides SSL_OP_ENABLE_KTLS_TX_ZEROCOPY_SENDFILE which will be good to have in Python together with the change. Should I add it in this PR or separately after this one is merged?

@illia-v illia-v requested a review from picnixz April 19, 2025 15:33
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we need explicit functions for common error messages just to always use the same ones (but it's a separate issue)

@illia-v illia-v requested a review from picnixz April 28, 2025 21:16
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like @ZeroIntensity to have a look for threads safety and @gpshead to confirm the implementation as well. We could add a What's New entry but it's more of an implementation detail so I would prefer not to in the first place.

@@ -74,6 +74,33 @@
#endif


#ifdef BIO_get_ktls_send
# ifdef MS_WINDOWS
typedef long long Py_off_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've seen this construction elsewhere. This is off topic but I think we should add it to the default clinic converters or be able to share code like that across clinic generated code. I need to think about it later.

int uses = BIO_get_ktls_send(SSL_get_wbio(self->ssl));
// BIO_get_ktls_send() returns 1 if kTLS is used and 0 if not.
// Also, it returns -1 for failure before OpenSSL 3.0.4.
return Py_NewRef(uses == 1 ? Py_True : Py_False);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an immortal reference, you can omit Py_NewRef.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally it's better to make it explicit. We had a similar discussion on the discord and on some issues

Copy link
Member

@ZeroIntensity ZeroIntensity May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh? I thought the consensus was to omit immortal reference counting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was no clear consensus but I think it's better to use the macro when possible and use Py_NewRef otherwise, even if the macro does not use Py_NewRef. IIRC, the discussion involved Irit and Serhiy so if you can find it and see what we eventually decided, it would be great! (I don't have the time nor the resources for that). And if I incorrectly remembered the consensus then I apologise in advance!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It started with returning Py_False without the wrapper as it looked like the most clean option, than I was asked to use Py_NewRef and Py_RETURN_FALSE.
"There should be one-- and preferably only one --obvious way to do it." is not applicable to returning booleans in CPython's C code definitely 😄

int has_timeout;

if (sock != NULL) {
if ((PyObject *)sock == Py_None) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a real issue, but thinking out loud for me and Bénédikt: does Py_Is come with _PyObject_CAST? If so, it seems more useful than == here. If not, we should add it.

@illia-v
Copy link
Contributor Author

illia-v commented May 2, 2025

@ZeroIntensity @picnixz thanks for the suggestion! I applied them

@illia-v illia-v requested review from ZeroIntensity and picnixz May 2, 2025 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants