Skip to content

gh-99813: Start using SSL_sendfile when available #99907

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 46 commits into
base: main
Choose a base branch
from

Conversation

illia-v
Copy link
Contributor

@illia-v illia-v commented Nov 30, 2022

New _ssl__SSLSocket_sendfile_impl function is very similar to _ssl__SSLSocket_write_impl with a few adjustments for the SSL_sendfile syscall.

I am glad there was non-SSL socket.socket.sendfile and underlying socket.socket._sendfile_use_sendfile implemented. I reused that code for new ssl.SSLSocket._sendfile_use_ssl_sendfile by moving shared logic to socket.socket._sendfile_zerocopy.

And test.test_ssl.ThreadedTests.test_sendfile needed to have a socket bound before an SSL context wrapped it because this seemed to affect kTLS availability.

@illia-v illia-v changed the title gh-99813: Start using SSL_sendfile when available gh-99813: Start using SSL_sendfile when available Nov 30, 2022
@illia-v illia-v marked this pull request as ready for review March 13, 2023 21:29
@illia-v
Copy link
Contributor Author

illia-v commented Mar 13, 2023

@gpshead could you please review this when you have a chance?

@AlexWaygood AlexWaygood requested a review from gpshead July 26, 2023 17:49
@illia-v
Copy link
Contributor Author

illia-v commented Jul 26, 2023

I've recently pushed some changes after testing the new functionality manually by sending small and large files from Linux with the tls module loaded, it worked well

Copy link
Member

@gpshead gpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't have time to look at this in detail for a while. I'm just dropping a note about one issue seen while quickly skimming it.

@illia-v illia-v requested a review from gpshead August 2, 2023 19:54
@illia-v illia-v requested a review from picnixz as a code owner April 10, 2025 23:17
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments. The if-else if blocks need to follow PEP-7 but you can also use a switch if you want (it can be clearer to read).

@illia-v illia-v requested a review from picnixz April 12, 2025 14:22
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few final comments

with client_context.wrap_socket(socket.socket(),
server_hostname=hostname) as s:
s.connect((HOST, server.port))
# kTLS seems to work only with a connection created before
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check whether it "seems" to work or not? (for a personal project I would accept this but I'm interested in knowing whether this is an issue with Python SSL or OpenSSL)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll post more details about this soon

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can reproduce the same issue with C code using OpenSSL (without Python). Please check my findings here. I mentioned a possible patch, let me know if we can apply it or there can be undesirable consequences.

FYI, ssl.OP_ENABLE_KTLS to enable kTLS has been available since Python 3.12 (and it could be added to options as a simple integer even before). And the issue is not specific to SSL_sendfile, it affects kTLS availability for simple reads and writes.

I added my reproducer to an existing OpenSSL issue openssl/openssl#19676 (comment).

@illia-v
Copy link
Contributor Author

illia-v commented Apr 19, 2025

@picnixz OpenSSL provides SSL_OP_ENABLE_KTLS_TX_ZEROCOPY_SENDFILE which will be good to have in Python together with the change. Should I add it in this PR or separately after this one is merged?

@illia-v illia-v requested a review from picnixz April 19, 2025 15:33
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we need explicit functions for common error messages just to always use the same ones (but it's a separate issue)

@illia-v illia-v requested a review from picnixz April 28, 2025 21:16
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like @ZeroIntensity to have a look for threads safety and @gpshead to confirm the implementation as well. We could add a What's New entry but it's more of an implementation detail so I would prefer not to in the first place.

@@ -74,6 +74,33 @@
#endif


#ifdef BIO_get_ktls_send
# ifdef MS_WINDOWS
typedef long long Py_off_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've seen this construction elsewhere. This is off topic but I think we should add it to the default clinic converters or be able to share code like that across clinic generated code. I need to think about it later.

// Also, it returns -1 for failure before OpenSSL 3.0.4.
return Py_NewRef(uses == 1 ? Py_True : Py_False);
#else
return Py_NewRef(Py_False);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this one, use Py_RETURN_FALSE

[clinic start generated code]*/

static PyObject *
_ssl__SSLSocket_uses_ktls_for_send_impl(PySSLSocket *self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function needs a critical section. Concurrent calls to SSL_get_wbio aren't thread-safe without the lock.

int uses = BIO_get_ktls_send(SSL_get_wbio(self->ssl));
// BIO_get_ktls_send() returns 1 if kTLS is used and 0 if not.
// Also, it returns -1 for failure before OpenSSL 3.0.4.
return Py_NewRef(uses == 1 ? Py_True : Py_False);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an immortal reference, you can omit Py_NewRef.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally it's better to make it explicit. We had a similar discussion on the discord and on some issues

Copy link
Member

@ZeroIntensity ZeroIntensity May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh? I thought the consensus was to omit immortal reference counting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was no clear consensus but I think it's better to use the macro when possible and use Py_NewRef otherwise, even if the macro does not use Py_NewRef. IIRC, the discussion involved Irit and Serhiy so if you can find it and see what we eventually decided, it would be great! (I don't have the time nor the resources for that). And if I incorrectly remembered the consensus then I apologise in advance!

[clinic start generated code]*/

static PyObject *
_ssl__SSLSocket_uses_ktls_for_recv_impl(PySSLSocket *self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, let's add a critical section.

@@ -0,0 +1,4 @@
Python now uses ``SSL_sendfile`` internally when it is possible (see
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We know it's Python, but we don't know it's SSL:

Suggested change
Python now uses ``SSL_sendfile`` internally when it is possible (see
:mod:`ssl` now uses ``SSL_sendfile`` internally when it is possible (see

Copy link
Member

@picnixz picnixz May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the same wording as the previous entry (in the docs)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense in the docs because there's context. It's not immediately clear that we're talking about ssl here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right! let's do it your way

int has_timeout;

if (sock != NULL) {
if ((PyObject *)sock == Py_None) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a real issue, but thinking out loud for me and Bénédikt: does Py_Is come with _PyObject_CAST? If so, it seems more useful than == here. If not, we should add it.

Py_INCREF(sock);
}

if (sock != NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this merged with the case above?

- :meth:`~socket.socket.sendfile` (but :mod:`os.sendfile` will be used
for plain-text sockets only, else :meth:`~socket.socket.send` will be used)
- :meth:`~socket.socket.sendfile` (but it may be high-performant only when
the kernel TLS is enabled (see :data:`~ssl.OP_ENABLE_KTLS`) or when a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what we do about nested parentheses in the docs. I haven't seen them anywhere else, so I'd like to avoid them here. Could we reword this a little to avoid that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants