ENH: Streamline and improve the origin and license documentation of third party bundled in wheels #27764

@pombredanne

Proposed new feature or change:

The current wheel builds (as of 2.1.3) may contain incomplete or incorrect license and origin information for bundled third-party components. As a result, the missing information is hard to recover from the wheel alone, and one has to go back to the sdist or a source checkout to get a proper picture of the third-party code, with correct, compliant license notices and actionable origin details.

  • For instance, pocketfft is neither attributed nor referenced in the wheel, although it is part of numpy/fft/_pocketfft_umath.cpython-310-x86_64-linux-gnu.so
  • Similarly, lapack-lite is missing its license, although the wheel does include a license for the full LAPACK used with OpenBLAS

These are just two examples; there are likely several other small pieces of incorrect, missing, or inaccurate data, because numpy is big and it is hard to keep track of all of them.

This matters for two reasons:

  1. It is important to provide proper license notices and credits for all the bundled code
  2. It is even more important to provide accurate origin information, to support the management and reporting of vulnerabilities that may exist in the bundled code.

The proposed enhancement would consist of:

  1. Running a detailed baseline scan to ensure that there is a clear record of every bit of third-party code bundled in the wheels
  2. Updating the package metadata to ensure that it is comprehensive
  3. Automating step 2 in CI to avoid any regression
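
To give a rough idea of what steps 1 and 3 could look like, here is a minimal sketch that uses only the Python standard library. It is not a real license scanner and is not how the aboutcode tools work: the helper names (`inventory_wheel`, `unattributed_components`), the suffix and filename lists, and the idea of passing in a hand-maintained list of expected component names are all assumptions made for illustration. The substring match is deliberately crude and only checks that a component is mentioned somewhere in a bundled license file.

```python
import zipfile
from pathlib import PurePosixPath

# Suffixes treated as compiled binaries (an assumption for this sketch).
BINARY_SUFFIXES = (".so", ".pyd", ".dylib", ".dll")
# Filename fragments treated as license or notice files (also an assumption).
LICENSE_HINTS = ("license", "licence", "copying", "notice")


def inventory_wheel(wheel_path):
    """Return (binaries, license_files) found in a wheel archive."""
    binaries, license_files = [], []
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            base = PurePosixPath(name).name.lower()
            # ".so." catches versioned shared libraries such as libfoo.so.3
            if base.endswith(BINARY_SUFFIXES) or ".so." in base:
                binaries.append(name)
            elif any(hint in base for hint in LICENSE_HINTS):
                license_files.append(name)
    return sorted(binaries), sorted(license_files)


def unattributed_components(wheel_path, expected_components):
    """Return expected component names not mentioned in any license file."""
    _, license_files = inventory_wheel(wheel_path)
    with zipfile.ZipFile(wheel_path) as wheel:
        text = "\n".join(
            wheel.read(name).decode("utf-8", errors="replace").lower()
            for name in license_files
        )
    return [c for c in expected_components if c.lower() not in text]
```

A CI job could then fail the build whenever `unattributed_components("path/to/built.whl", ["pocketfft", "lapack-lite"])` returns a non-empty list, where the wheel path and the component list are placeholders to be maintained alongside the build.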

PS: I maintain popular open source Python tools that do just that (https://github.com/aboutcode-org/ and https://aboutcode.org/), and I can help with this enhancement!
