Port HMAC implementation to new OpenSSL APIs #134531

Open
zooba opened this issue May 22, 2025 · 3 comments
Assignees
Labels
extension-modules C modules in the Modules dir topic-SSL type-feature A feature request or enhancement

Comments

@zooba
Member

zooba commented May 22, 2025

@zooba zooba added type-feature A feature request or enhancement extension-modules C modules in the Modules dir topic-SSL labels May 22, 2025
@picnixz
Member

picnixz commented May 23, 2025

I can take care of this. Note that EVP_MAC-HMAC is only available since OpenSSL 3.0, whereas CPython requires OpenSSL 1.1.1 or later. We therefore still need to maintain the old API (though we're already doing that through the PY_EVP_MD macros), since our build requirements only state that OpenSSL 3.0.9 is the recommended minimum version for the ssl and hashlib extension modules.
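To illustrate the version split, here is a minimal Python sketch of the decision the C code would have to make at build time. The function name `select_hmac_backend` and the constant `EVP_MAC_MIN_VERSION` are hypothetical and exist only for this example; the real module would gate on the OpenSSL version in C, not at runtime.

```python
import ssl

# Assumption for this sketch: EVP_MAC is usable from OpenSSL 3.0 onwards,
# while older releases (1.1.1) only offer the legacy HMAC interface.
EVP_MAC_MIN_VERSION = (3, 0)


def select_hmac_backend(version_info):
    """Return which HMAC code path would apply for the given OpenSSL version.

    `version_info` is a tuple shaped like ssl.OPENSSL_VERSION_INFO,
    of which only the (major, minor) prefix is inspected.
    """
    if tuple(version_info[:2]) >= EVP_MAC_MIN_VERSION:
        return "EVP_MAC"
    return "legacy HMAC"


# Report the code path for the OpenSSL this interpreter is linked against.
print(ssl.OPENSSL_VERSION, "->", select_hmac_backend(ssl.OPENSSL_VERSION_INFO))
```

Both branches have to stay alive for as long as 1.1.1 remains a supported minimum.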

@picnixz picnixz self-assigned this May 24, 2025
@picnixz
Member

picnixz commented May 24, 2025

OK, so I had a quick look at how I would plan this. I'll need multiple PRs because _hashopenssl is a bit messy, and now that we will use EVP_MAC in addition to EVP_MD, we should rename some functions. To that end, I'll first make a "cleanup" PR that prepares everything and does a bit of refactoring, mainly so that functions can be reused later.

Note: I eventually decided against the backports, even if they could ease the life of my future self (see #134626 (comment) and #134703 (comment)).

picnixz added a commit that referenced this issue May 26, 2025
Rename components related to `_hashlib.{HASH,HASHXOF}` objects.

- The `EVPobject` structure is renamed `HASHobject`.
- Non-clinic `HASH` methods are now prefixed by `_hashlib_HASH_*`.
  A similar change is made for non-clinic `HASHXOF` methods.
- Functions extracting information from `EVP_MD` objects and functions
  constructing `EVP_MD` objects now include `openssl_evp_md` in their name.

This change allows us to avoid future ambiguities between the `EVP_MD`
and the `EVP_MAC` APIs (currently, we only use `EVP_MD` for hash functions
and rely on the legacy interface for HMAC instead of using `EVP_MAC`).
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 26, 2025
(cherry picked from commit cb8045e)

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
@picnixz
Member

picnixz commented Jun 7, 2025

Ok, so here's the plan. I've finished writing the implementation, but the PR is huge:

 Modules/_hashopenssl.c | 815 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------
 1 file changed, 582 insertions(+), 233 deletions(-)

So I'll break down the commits, because I also found multiple issues elsewhere. Well, not real issues, but annoying interfaces that should be refactored first.


  • There is one issue that I introduced in #134626 (gh-134531: cleanup _hashopenssl.c to support EVP_MAC): I renamed the structure types but forgot to rename them in the clinic directives. It's harmless because they are never used, but we should fix this first. [fixed by 4372011]

  • Next, we have a way to map EVP_MD objects to NIDs (thank you OpenSSL), digestmods to EVP_MD, and EVP_MD/digestmods to names. That's great. But the names in question are the "preferred" Python names, not the OpenSSL ones (well, sometimes they can be the OpenSSL ones, as we use those as a fallback). Now, for EVP_MAC objects, there is no way to recover the underlying digest. Why? Because MAC objects are not necessarily HMAC, and thus there's not always an underlying "digest" (sometimes it's a cipher). [see #135254, gh-134531: refactor _hashlib logic for mapping NIDs to EVP_MD objects]

    This is extremely annoying. Also, there is no mapping from EVP_MAC to NIDs, and we can only know whether a MAC is a [HPC]MAC etc., not what the parameters of the underlying algorithm were. Now, the good news is: since we provide a way to create HMAC objects using digestmods, we always get an EVP_MD object behind the scenes. From there, we can instantiate HMAC+digest correctly and remember the NID of that EVP_MD. To recover the underlying digest name, we just need to resolve the NID.

  • The last task will be to rely on digestmods <-> NIDs <-> OpenSSL digest names -> HMAC to construct the EVP_MAC_CTX object (we only need one EVP_MAC object because it's always HMAC).
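The "resolve the digest first, then remember it next to the MAC" idea from the list above can be sketched at the Python level. This is only an analogy under stated assumptions: the digest name stands in for the NID, and `resolve_digest_name` and `NamedHMAC` are hypothetical helpers, not part of `_hashlib` or `hmac`.

```python
import hashlib
import hmac


def resolve_digest_name(digestmod):
    """Map a digestmod (name, constructor, or PEP 247 module) to a digest name.

    The returned name plays the role that the NID plays in the C code:
    a stable key from which the digest can be looked up again later.
    """
    if isinstance(digestmod, str):
        return hashlib.new(digestmod).name
    if callable(digestmod):
        return digestmod().name
    # Module-like object exposing a new() factory.
    return digestmod.new().name


class NamedHMAC:
    """HMAC wrapper that records which digest it was built from."""

    def __init__(self, key, digestmod):
        # Resolve once, at construction time, and keep the result:
        # the MAC object itself is not asked for it afterwards.
        self.digest_name = resolve_digest_name(digestmod)
        self._mac = hmac.new(key, digestmod=self.digest_name)

    @property
    def name(self):
        return f"hmac-{self.digest_name}"

    def update(self, data):
        self._mac.update(data)

    def hexdigest(self):
        return self._mac.hexdigest()


mac = NamedHMAC(b"key", hashlib.sha256)
mac.update(b"msg")
print(mac.name, mac.hexdigest())
```

The point of the design is that the name is captured before the MAC is created, so nothing ever needs to be extracted from the MAC context.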


The legacy HMAC API is deprecated because it's actually wrapping an EVP_MD object. The new EVP_MAC API generalizes MACs by delegating everything to providers and callback functions, hiding the implementation of the actual MAC algorithm itself. Now it's the provider that wraps an HMAC_CTX object instead of directly wrapping the EVP_MD objects. Long story short: we still have an EVP_MD somewhere, but it's no longer recoverable from the EVP_MAC objects themselves (and there's no EVP_HMAC).
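As a rough illustration of why the digest becomes unrecoverable, here is a toy Python model of a provider-style MAC interface. It is not OpenSSL's API: `MacCtx`, `_HMACProviderCtx`, and `_PROVIDERS` are made up for this sketch, and the only thing it aims to show is that the generic context forwards calls to a provider and exposes no digest of its own.

```python
import hmac


class _HMACProviderCtx:
    """Toy 'provider' implementation: the only place that knows the digest."""

    def __init__(self, key, params):
        # The digest arrives as an opaque parameter and stays in here.
        self._inner = hmac.new(key, digestmod=params["digest"])

    def update(self, data):
        self._inner.update(data)

    def final(self):
        return self._inner.digest()


# Toy provider table: algorithm name -> implementation.
_PROVIDERS = {"HMAC": _HMACProviderCtx}


class MacCtx:
    """Generic MAC context: knows the algorithm family, nothing about digests."""

    def __init__(self, algorithm, key, params):
        self.algorithm = algorithm
        self._impl = _PROVIDERS[algorithm](key, params)

    def update(self, data):
        self._impl.update(data)

    def final(self):
        return self._impl.final()


ctx = MacCtx("HMAC", b"key", {"digest": "sha256"})
ctx.update(b"msg")
print(ctx.algorithm, ctx.final().hex())
```

A caller holding only `ctx` can tell it is an HMAC, but has to remember separately which digest was requested.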


I have a branch with everything ready, so I'll just make different PRs. At the same time, I'll also refactor _hashopenssl.c because the naming is a bit messy (we don't have a clear convention, and I think it would be better to have one good convention for this module). Later, I'll need to factor everything out so that we have a unified interface that can be used by _hashlib and _hmac for recognizing "digestmods", but for now, we'll keep things separate (this task is #131876).

2 participants