From 20daf41c4e9225fbbb6e73daffebcc889909399c Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Sun, 14 Nov 2021 16:47:53 -0600 Subject: [PATCH 01/19] PEP 639: Reframe core metadata update as discrete proposal per PEP 643 --- pep-0639.rst | 76 ++++++++++++++++++---------------------------------- 1 file changed, 26 insertions(+), 50 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index 48cfeb7c655..437455dc209 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -1,44 +1,41 @@ PEP: 639 -Title: Metadata for Python Software Packages 2.2 +Title: Improving License Clarity with Better Package Metadata Version: $Revision$ Last-Modified: $Date$ -Author: Philippe Ombredanne +Author: Philippe Ombredanne , + C.A.M. Gerlach Sponsor: Paul Moore -BDFL-Delegate: Paul Moore +PEP-Delegate: Paul Moore Discussions-To: https://discuss.python.org/t/2154 Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 15-Aug-2019 -Python-Version: 3.x Post-History: -Replaces: 566 Resolution: Abstract ======== -This PEP describes the changes between versions 2.1 and 2.2 of the `Core -Metadata Specification` [#cms]_ for Python packages. Version 2.1 is specified in -PEP 566. - -The primary change introduced in this PEP updates how licenses are documented in -core metadata via the ``License`` field with license expression strings using -SPDX license identifiers [#spdxlist]_ such that license documentation is simpler -and less ambiguous: +This PEP defines a specification for how licenses are documented in +core metadata via the ``License`` field, with license expression strings +using SPDX identifiers [#spdxlist]_, such that license documentation +is simpler and less ambiguous: - for package authors to create, - for package users to read and understand, and, - for tools to process package license information mechanically. 
-The other changes include: +The PEP also proposes: - specifying a ``License-File`` field which is already used by ``wheel`` and ``setuptools`` to include license files in built distributions. - defining how tools can validate license expressions and report warnings to users for invalid expressions (but still accept any string as ``License``). +The changes in this PEP will update the core metadata format to version 2.3. + Goals ===== @@ -49,14 +46,14 @@ distribution: - with an improved and structured way to document a license expression, and, - by including license texts in a built package. -The core metadata specification updates that are part of this PEP have been -designed to have minimal impact and to be backward compatible with v2.1. These -changes utilize emerging new ways to document licenses that are already in use -in some tools (e.g. by adding the ``License-File`` field already used in -``wheel`` and ``setuptools``) or by some package authors (e.g. storing an SPDX +The changes to the core metadata specification that this PEP requires have been +designed to have minimal impact and to be backward compatible. +This specification utilizes existing ways to document licenses that are already +in use in some tools (e.g. by adding the ``License-File`` field already used in +``wheel`` and ``setuptools``) and by some package authors (e.g. storing an SPDX license expression in the existing ``License`` field). -In addition to an update to the metadata specification, this PEP contains: +In addition to these proposed changes, this PEP contains: - recommendations for publishing tools on how to validate the ``License`` and ``Classifier`` fields and report informational warnings when a package uses an @@ -168,22 +165,20 @@ publishers improve the clarity of their license documentation to the benefit of package authors, consumers and redistributors. 
-Core Metadata Specification updates -=================================== +Specification +============= The canonical source for the names and semantics of each of the supported metadata fields is the Core Metadata Specification [#cms]_ document. +This PEP proposes modifying the following fields, which will be implemented +in the canonical document once this PEP is approved. -The details of the updates considered to the Core Metadata Specification [#cms]_ -document as part of this PEP are described here and will be added to the -canonical source once this PEP is approved. +As it adds a new field, this PEP updates the core metadata format +to version 2.3. -Added in Version 2.2 --------------------- - License-File (multiple use) -::::::::::::::::::::::::::: +--------------------------- The License-File is a string that is a path, relative to``.dist-info``, to a license file. The license file content MUST be UTF-8 encoded text. @@ -192,11 +187,8 @@ Build tools SHOULD honor this field and include the corresponding license file(s) in the built package. -Changed in Version 2.2 ----------------------- - License (optional) -:::::::::::::::::: +------------------ Text indicating the license covering the distribution. This text can be either a valid license expression as defined here or any free text. @@ -265,7 +257,7 @@ License expression examples:: Classifier (multiple use) -::::::::::::::::::::::::: +------------------------- Each entry is a string giving a single classification value for the distribution. Classifiers are described in PEP 301. @@ -345,14 +337,6 @@ cannot reliably infer a license expression and should suggest that the package author construct a license expression which expresses their intent. -Summary of Differences From PEP 566 -=================================== - -* Metadata-Version is now 2.2. 
-* Added one new field: ``License-File`` -* Updated the documentation of two fields: ``License`` and ``Classifier`` - - Backwards Compatibility ======================= @@ -796,14 +780,6 @@ Conventions used by other ecosystems References ========== -This document specifies version 2.2 of the metadata format. - -- Version 1.0 is specified in PEP 241. -- Version 1.1 is specified in PEP 314. -- Version 1.2 is specified in PEP 345. -- Version 2.0, while not formally accepted, was specified in PEP 426. -- Version 2.1 is specified in PEP 566. - .. [#cms] https://packaging.python.org/specifications/core-metadata .. [#cdstats] https://clearlydefined.io/stats .. [#cd] https://clearlydefined.io From 01c9952fdcd5f10da494733b87dbec4fa12cf990 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Sun, 14 Nov 2021 18:51:29 -0600 Subject: [PATCH 02/19] PEP 639: Clean up formatting, syntax, spelling, headers and lists --- pep-0639.rst | 195 ++++++++++++++++++++++++++------------------------- 1 file changed, 98 insertions(+), 97 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index 437455dc209..dcfc378127d 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -21,17 +21,18 @@ Abstract This PEP defines a specification for how licenses are documented in core metadata via the ``License`` field, with license expression strings using SPDX identifiers [#spdxlist]_, such that license documentation -is simpler and less ambiguous: +is simpler and less ambiguous for: -- for package authors to create, -- for package users to read and understand, and, -- for tools to process package license information mechanically. +- package authors to create, +- package users to read and understand, and, +- tools to process package license information mechanically. 
-The PEP also proposes: +The PEP also proposes to: -- specifying a ``License-File`` field which is already used by ``wheel`` and +- Specify a ``License-File`` field which is already used by ``wheel`` and ``setuptools`` to include license files in built distributions. -- defining how tools can validate license expressions and report warnings to + +- Define how tools can validate license expressions and report warnings to users for invalid expressions (but still accept any string as ``License``). The changes in this PEP will update the core metadata format to version 2.3. @@ -41,10 +42,10 @@ Goals ===== This PEP's scope is limited strictly to how we document the license of a -distribution: +distribution, specifically covering: -- with an improved and structured way to document a license expression, and, -- by including license texts in a built package. +- An improved and structured way to document a license expression. +- A formal mechanism to include license texts in a built package. The changes to the core metadata specification that this PEP requires have been designed to have minimal impact and to be backward compatible. @@ -55,11 +56,11 @@ license expression in the existing ``License`` field). In addition to these proposed changes, this PEP contains: -- recommendations for publishing tools on how to validate the ``License`` and +- Recommendations for publishing tools on how to validate the ``License`` and ``Classifier`` fields and report informational warnings when a package uses an older, non-structured style of license documentation conventions. -- informational appendices that contain surveys of how we document licenses +- Informational appendices that contain surveys of how we document licenses today in Python packages and elsewhere, and a reference Python library to parse, validate and build correct license expressions. 
@@ -90,16 +91,16 @@ This PEP is not about license documentation in files inside packages, even though this is a surveyed topic in the appendix. -Possible future PEPs +Possible Future PEPs -------------------- It is the intention of the authors of this PEP to consider the submission of related but separate PEPs in the future such as: -- make ``License`` and new ``License-File`` fields mandatory including - stricter enforcement in tools and PyPI publishing. +- Making the ``License-Expression`` and ``License-File`` fields mandatory + in tools and PyPI publishing. -- require uploads to PyPI to use only FOSS (Free and Open Source software) +- Requiring uploads to PyPI to use only FOSS (Free and Open Source Software) licenses. @@ -107,7 +108,7 @@ Motivation ========== Software is licensed, and providing accurate licensing information to Python -packages users is an important matter. Today, there are multiple places where +packages users is an important matter. Today, there are multiple places where licenses are documented in package metadata and there are limitations to what can be documented. This is often leading to confusion or a lack of clarity both for package authors and package users. @@ -143,10 +144,10 @@ There are a few takeaways from the survey: - Most package formats use a single ``License`` field. - Many modern package formats use some form of license expression syntax to - optionally combine more than one license identifier together. SPDX and - SPDX-like syntaxes are the most popular in use. + optionally combine more than one license identifier together. + SPDX and SPDX-like syntaxes are the most popular in use. -- SPDX license identifiers are becoming a de facto way to reference common +- SPDX license identifiers are becoming a de-facto way to reference common licenses everywhere, whether or not a license expression syntax is used. 
- Several package formats support documenting both a license expression and the @@ -177,16 +178,6 @@ As it adds a new field, this PEP updates the core metadata format to version 2.3. -License-File (multiple use) ---------------------------- - -The License-File is a string that is a path, relative to``.dist-info``, to a -license file. The license file content MUST be UTF-8 encoded text. - -Build tools SHOULD honor this field and include the corresponding license -file(s) in the built package. - - License (optional) ------------------ @@ -198,7 +189,7 @@ missing, or is not a valid license expression as defined here. Build tools MAY issue a similar warning. -License Expression syntax +License Expression Syntax ''''''''''''''''''''''''' A license expression is a string using the SPDX license expression syntax as @@ -210,12 +201,12 @@ When used in the ``License`` field and as a specialization of the SPDX license expression definition, a license expression can use the following license identifiers: -- any SPDX-listed license short-form identifiers that are published in the SPDX +- Any SPDX-listed license short-form identifiers that are published in the SPDX License List [#spdxlist]_ using either Version 3.10 or any later compatible version. Note that the SPDX working group never removes any license identifiers: instead they may choose to mark an identifier as "deprecated". -- the ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings to +- The ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings to identify licenses that are not included in the SPDX license list. 
When processing the ``License`` field to determine if it contains a valid @@ -224,9 +215,9 @@ license expression, tools: - SHOULD report an informational warning if one or more of the following applies: - - the field does not contain a license expression, + - the field does not contain a license expression - - the license expression syntax is invalid, + - the license expression syntax is invalid - the license expression syntax is valid but some license identifiers are unknown as defined here or the license identifiers have been marked as @@ -256,6 +247,16 @@ License expression examples:: License: LicenseRef-Proprietary AND LicenseRef-Public-Domain +License-File (multiple use) +--------------------------- + +A text string that is a path, relative to ``.dist-info``, to a license file. +The license file content MUST be UTF-8 encoded text. + +Build tools SHOULD honor this field and include the corresponding license +file(s) in the built package. + + Classifier (multiple use) ------------------------- @@ -302,7 +303,6 @@ expression to suggest would be:: License: MIT - Here are mapping guidelines for the legacy classifiers: - Classifier ``License :: Other/Proprietary License`` becomes License: @@ -324,7 +324,7 @@ Here are mapping guidelines for the legacy classifiers: ``License :: Free To Use But Restricted``, and ``License :: Freeware`` are mapped to the generic License: ``LicenseRef-Proprietary`` expression. -- Classifiers ``License :: GUST*`` have no mapping to SPDX license identifierss +- Classifiers ``License :: GUST*`` have no mapping to SPDX license identifiers for now and no package uses them in PyPI as of the writing of this PEP. The remainder of the classifiers using a ``License ::`` prefix map to a simple @@ -359,8 +359,8 @@ string and the License-File(s) are file paths. None of them introduces any new security concern. 
-How to Teach Users to Use License Expressions -============================================= +How to Teach This +================= The simple cases are simple: a single license identifier is a valid license expression and a large majority of packages use a single license. @@ -376,16 +376,16 @@ gently teach users how to provide correct license expressions over time. Tools may also help with the conversion and suggest a license expression in some cases: -1. The section `Mapping Legacy Classifiers to New License expressions`_ provides - tool authors with guidelines on how to suggest a license expression produced - from legacy classifiers. +- The section `Mapping Legacy Classifiers to New License expressions`_ provides + tool authors with guidelines on how to suggest a license expression produced + from legacy classifiers. -2. Tools may also be able to infer and suggest how to update an existing - incorrect ``License`` value and convert that to a correct license expression. - For instance a tool may suggest to correct a ``License`` field from - ``Apache2`` (which is not a valid license expression as defined in this PEP) - to ``Apache-2.0`` (which is a valid license expression using an SPDX license - identifier as defined in this PEP). +- Tools may also be able to infer and suggest how to update an existing + incorrect ``License`` value and convert that to a correct license expression. + For instance a tool may suggest to correct a ``License`` field from + ``Apache2`` (which is not a valid license expression as defined in this PEP) + to ``Apache-2.0`` (which is a valid license expression using an SPDX license + identifier as defined in this PEP). Reference Implementation @@ -403,10 +403,11 @@ such as the SPDX Python tools [#spdxpy]_, the ScanCode toolkit [#scancodetk]_ and the Free Software Foundation Europe (FSFE) Reuse project [#reuse]_. -Rejected ideas +Rejected Ideas ============== -1. Use a new ``License-Expression`` field and deprecate the ``License`` field. 
+Re-Use the License Field +------------------------ Adding a new field would introduce backward incompatible changes when the ``License`` field would be retired later and require having more complex @@ -417,8 +418,8 @@ less likely to start using a new field than make small adjustments to their use of existing fields. -2. Mapping licenses used in the license expression to specific files in the - license files (or vice versa). +Map Identifiers to License Files +-------------------------------- This would require using a mapping (two parallel lists would be too prone to alignment errors) and a mapping would bring extra complication to how license @@ -435,23 +436,23 @@ slight loss of clarity by not specifying which text file is for which license identifier, but you are not forcing the more complex data model (e.g. a mapping) on everyone that may not need it. -We could of course have a data field with multiple possible value types (it’s a -string, it’s a list, it’s a mapping!) but this could be a source of confusion. +We could of course have a data field with multiple possible value types (it's a +string, it's a list, it's a mapping!) but this could be a source of confusion. This is what has been done for instance in npm (historically) and in Rubygems (still today) and as result you need to test the type of the metadata field before using it in code and users are confused about when to use a list or a string. -3. Mapping licenses to specific source files and/or directories of source files - (or vice versa). +Map Identifiers to Source Files +------------------------------- File-level notices are not considered as part of the scope of this PEP and the existing ``SPDX-License-Identifier`` [#spdxids]_ convention can be used and may not need further specification as a PEP. -Appendix 1. License Expression example +Appendix 1. 
License Expression Example ====================================== The current version of ``setuptools`` metadata [#setuptools5030]_ does not use @@ -499,21 +500,20 @@ license and the copyrights used by ``setuptools``, ``appdirs``, ``pyparsing`` an Apache and BSD license, its copyrights and its license choice notice [#packlic]_. -Appendix 2. Surveying how we document licenses today in Python +Appendix 2. Surveying How We Document Licenses Today in Python ============================================================== There are multiple ways used or recommended to document Python package licenses today: -In Core metadata +In Core Metadata ---------------- There are two overlapping core metadata fields to document a license: the license-related ``Classifier`` strings [#classif]_ prefixed with ``License ::`` and the ``License`` field as free text [#licfield]_. - The core metadata documentation ``License`` field documentation is currently:: License (optional) @@ -521,7 +521,7 @@ The core metadata documentation ``License`` field documentation is currently:: Text indicating the license covering the distribution where the license is not a selection from the "License" Trove classifiers. See - "Classifier" below. This field may also be used to specify a + "Classifier" below. This field may also be used to specify a particular version of a license which is named via the ``Classifier`` field, or to indicate a variation or exception to such a license. @@ -540,16 +540,16 @@ not clear if this is a choice or all these apply and which ones. Furthermore, the list of available license-related classifiers is often out-of-date. -In the PyPA ``sampleproject`` ------------------------------ +In the PyPA Sample Project +-------------------------- The latest PyPA ``sampleproject`` recommends only to use classifiers in ``setup.py`` and does not list the ``license`` field in its example ``setup.py`` [#samplesetup]_. 
-The License Files in wheels and setuptools ------------------------------------------- +License Files in Wheels and Setuptools +-------------------------------------- Beyond a license code or qualifier, license text files are documented and included in a built package either implicitly or explicitly and this is another @@ -557,32 +557,32 @@ possible source of confusion: - In wheels [#wheels]_ license files are automatically added to the ``.dist-info`` directory if they match one of a few common license file name patterns (such - as LICENSE*, COPYING*). Alternatively a package author can specify a list of - license file paths to include in the built wheel using in the - ``license_files`` field in the ``[metadata]`` section of the project's - ``setup.cfg``. Previously this was a (singular) ``license_file`` file attribute - that is now deprecated but is still in common use. See [#pipsetup]_ for - instance. + as ``LICENSE*`` and ``COPYING*``). Alternatively a package author can specify + a list of license file paths to include in the built wheel in the ``license_files`` + field in the ``[metadata]`` section of the project's ``setup.cfg``. + Previously this was a (singular) ``license_file`` file attribute that is now + deprecated but is still in common use. See [#pipsetup]_ for instance. - In ``setuptools`` [#setuptoolssdist]_, a ``license_file`` attribute is used to add a single license file to a source distribution. This singular version is still honored by ``wheels`` for backward compatibility. -- Using a LICENSE.txt file is encouraged in the packaging guide [#packaging]_ +- Using a ``LICENSE.txt`` file is encouraged in the packaging guide [#packaging]_ paired with a ``MANIFEST.in`` entry to ensure that the license file is included in a built source distribution (sdist). -Note: the License-File field proposed in this PEP already exists in ``wheel`` and -``setuptools`` with the same behaviour as explained above. 
This PEP is only -recognizing and documenting the existing practice as used in ``wheel`` (with the -``license_file`` and ``license_files`` ``setup.cfg`` ``[metadata]`` entries) and in -``setuptools`` ``license_file`` ``setup()`` argument. +**Note:** the ``License-File`` field proposed in this PEP already exists in +``wheel`` and ``setuptools`` with the same behaviour as explained above. +This PEP is only recognizing and documenting the existing practice as used +in ``wheel`` (with the ``license_file`` and ``license_files`` ``setup.cfg`` +``[metadata]`` entries) and in the ``setuptools`` ``license_file`` +``setup()`` argument. -In Python code files --------------------- +In Python Source Code Files +--------------------------- -(Note: Documenting licenses in source code is not in the scope of this PEP) +**Note:** Documenting licenses in source code is not in the scope of this PEP. Beside using comments and/or ``SPDX-License-Identifier`` conventions, the license is sometimes documented in Python code files using "dunder" variables typically @@ -594,33 +594,34 @@ function and the standard ``pydoc`` module. The dunder variable(s) will show up the ``help()`` DATA section for a module. -In some other Python packaging tools ------------------------------------- +In Other Python Packaging Tools +------------------------------- - Conda package manifest [#conda]_ has support for ``license`` and ``license_file`` fields as well as a ``license_family`` license grouping field. -- ``flit`` [#flit]_ recommends to use classifiers instead of License (as per the - current metadata spec). +- Flit [#flit]_ recommends to use classifiers instead of ``License`` + (as per the current metadata spec). -- ``pbr`` [#pbr]_ uses similar data as setuptools but always stored setup.cfg. +- PBR [#pbr]_ uses similar data as setuptools but always stored in ``setup.cfg``. 
-- ``poetry`` [#poetry]_ specifies the use of the ``license`` field in +- Poetry [#poetry]_ specifies the use of the ``license`` field in ``pyproject.toml`` with SPDX license identifiers. -Appendix 3. Surveying how other package formats document licenses +Appendix 3. Surveying How Other Package Formats Document Licenses ================================================================= Here is a survey of how things are done elsewhere. -License in Linux distribution packages + +Licenses in Linux Distribution Packages --------------------------------------- -Note: in most cases the license texts of the most common licenses are included -globally once in a shared documentation directory (e.g. /usr/share/doc). +**Note:** in most cases the license texts of the most common licenses are included +globally once in a shared documentation directory (e.g. ``/usr/share/doc``). -- Debian document package licenses with machine readable copyright files +- Debian documents package licenses with machine readable copyright files [#dep5]_. This specification defines its own license expression syntax that is very similar to the SDPX syntax and use its own list of license identifiers for common licenses (also closely related to SPDX identifiers). @@ -650,7 +651,7 @@ globally once in a shared documentation directory (e.g. /usr/share/doc). expression syntax. FreeBSD also recommends the use of ``SPDX-License-Identifier`` in source code files. -- Archlinux PKGBUILD [#archinux]_ define its own license identifiers +- Arch Linux PKGBUILD [#archinux]_ define its own license identifiers [#archlinuxlist]_. The value ``'unknown'`` can be used if the license is not defined. @@ -669,8 +670,8 @@ globally once in a shared documentation directory (e.g. /usr/share/doc). license field. 
-License in Language and Application packages --------------------------------------------- +Licenses in Language and Application Packages +--------------------------------------------- - In Java, Maven POM [#maven]_ defines a ``licenses`` XML tag with a list of license items each with a name, URL, comments and "distribution" type. This is not @@ -744,7 +745,7 @@ License in Language and Application packages license expression syntax. -Conventions used by other ecosystems +Conventions Used by Other Ecosystems ------------------------------------ - ``SPDX-License-Identifier`` [#spdxids]_ is a simple convention to document the @@ -857,8 +858,8 @@ This document is placed in the public domain or under the CC0-1.0-Universal license [#cc0]_, whichever is more permissive. -Acknowledgements -================ +Acknowledgments +=============== - Nick Coghlan - Kevin P. Fleming From 801c64df31f83cd0603a282d073e86d0b3c2c7eb Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Mon, 15 Nov 2021 21:39:08 -0600 Subject: [PATCH 03/19] PEP 639: Rewrite to reflect License-Expression field and depr process --- pep-0639.rst | 553 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 407 insertions(+), 146 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index dcfc378127d..fe53c1295ee 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -18,10 +18,11 @@ Resolution: Abstract ======== -This PEP defines a specification for how licenses are documented in -core metadata via the ``License`` field, with license expression strings -using SPDX identifiers [#spdxlist]_, such that license documentation -is simpler and less ambiguous for: +This PEP defines a specification for how licenses are documented in the +core metadata via a new ``License-Expression`` field, with license expression +strings using SPDX identifiers [#spdxlist]_. 
+ +This will make license declarations simpler and less ambiguous for: - package authors to create, - package users to read and understand, and, @@ -29,11 +30,14 @@ is simpler and less ambiguous for: The PEP also proposes to: -- Specify a ``License-File`` field which is already used by ``wheel`` and - ``setuptools`` to include license files in built distributions. +- Deprecate the legacy ``License`` field and ``License ::`` classifiers. + +- Formally specify a new ``License-File`` field, which is already used by + ``wheel`` and ``setuptools`` to include license files in distributions. -- Define how tools can validate license expressions and report warnings to - users for invalid expressions (but still accept any string as ``License``). +- Define how tools can validate license expressions and handle errors and + deprecated fields/classifiers to balance adoption of this PEP with + backwards-compatibility and a smooth transition for package authors. The changes in this PEP will update the core metadata format to version 2.3. @@ -48,17 +52,17 @@ distribution, specifically covering: - A formal mechanism to include license texts in a built package. The changes to the core metadata specification that this PEP requires have been -designed to have minimal impact and to be backward compatible. -This specification utilizes existing ways to document licenses that are already +designed to minimize impact and maximize backward compatibility. +This specification builds off of existing ways to document licenses that are in use in some tools (e.g. by adding the ``License-File`` field already used in ``wheel`` and ``setuptools``) and by some package authors (e.g. storing an SPDX license expression in the existing ``License`` field). 
In addition to these proposed changes, this PEP contains: -- Recommendations for publishing tools on how to validate the ``License`` and - ``Classifier`` fields and report informational warnings when a package uses an - older, non-structured style of license documentation conventions. +- Recommendations for publishing tools on how to validate the new + ``License-Expression`` field and report informational warnings when a package + uses legacy metadata (the ``License`` field and ``License ::`` classifiers). - Informational appendices that contain surveys of how we document licenses today in Python packages and elsewhere, and a reference Python library to @@ -83,11 +87,11 @@ This PEP makes no recommendation for specific licenses and does not require the use of specific license documentation conventions. This PEP also does not impose any restrictions when uploading to PyPI. -Instead, this PEP is intended to document common practices already in use, and -recommends that publishing tools should encourage users via informational -warnings when they do not follow this PEP's recommendations. +Instead, it is intended to document best practices already in use, extend them +to use a new formally-specified and supported mechanism, and provide guidance +for packaging tools on how to handle the transition and inform users accordingly. -This PEP is not about license documentation in files inside packages, even +This PEP is not about license documentation in files inside packages, though this is a surveyed topic in the appendix. 
@@ -95,10 +99,13 @@ Possible Future PEPs -------------------- It is the intention of the authors of this PEP to consider the submission of -related but separate PEPs in the future such as: +related but separate PEPs in the future, which may include: + +- Removing the deprecated ``License`` field and ``License ::`` + classifiers from the Core Metadata specification. - Making the ``License-Expression`` and ``License-File`` fields mandatory - in tools and PyPI publishing. + for publishing tools and PyPI packages. - Requiring uploads to PyPI to use only FOSS (Free and Open Source Software) licenses. @@ -157,13 +164,17 @@ There are a few takeaways from the survey: These considerations have guided the design and recommendations of this PEP. -The reuse of the ``License`` field with license expressions will provide an -intuitive and more structured way to express the license of a distribution using -a well-defined syntax and well-known license identifiers. +The use of a new ``License-Expression`` field will provide an intuitive, +structured and unambiguous way to express the license of a distribution +using a well-defined syntax and well-known license identifiers. +Similarly, a formally-specified ``License-File`` field offers a standardized +way to declare the full text of the license(s) as legally required to be +included with the package when distributed. -Over time, recommending the usage of these expressions will help Python package -publishers improve the clarity of their license documentation to the benefit of -package authors, consumers and redistributors. +Over time, encouraging the use of these fields and deprecating ambiguous, +duplicative legacy alternatives will help Python software publishers improve +the clarity, accuracy and portability of their licensing practices, +to the benefit of package authors, consumers and redistributors alike. 
Specification @@ -171,22 +182,22 @@ Specification The canonical source for the names and semantics of each of the supported metadata fields is the Core Metadata Specification [#cms]_ document. -This PEP proposes modifying the following fields, which will be implemented -in the canonical document once this PEP is approved. +This PEP proposes adding the ``License-Expression`` and ``License-File`` +fields, deprecating the ``License`` field, and deprecating the ``License ::`` +classifiers in the ``Classifier`` field. -As it adds a new field, this PEP updates the core metadata format -to version 2.3. +As it adds new fields, this PEP updates the core metadata to version 2.3. -License (optional) ------------------- +Add License-Expression Field +---------------------------- -Text indicating the license covering the distribution. This text can be either a -valid license expression as defined here or any free text. +The ``License-Expression`` optional field is specified to contain a text string +that is a valid SPDX license expression, defined below. -Publishing tools SHOULD issue an informational warning if this field is empty, -missing, or is not a valid license expression as defined here. Build tools MAY -issue a similar warning. +Publishing tools SHOULD issue an informational warning if this field is +missing, and MAY raise an error. Build tools MAY issue a similar warning, +but MUST NOT raise an error. License Expression Syntax @@ -197,8 +208,8 @@ documented in the SPDX specification [#spdx]_ using either Version 2.2 [#spdx22]_ or a later compatible version. SPDX is a working group at the Linux Foundation that defines a standard way to exchange package information. 
-When used in the ``License`` field and as a specialization of the SPDX license -expression definition, a license expression can use the following license +When used in the ``License-Expression`` field and as a specialization of the SPDX +license expression definition, a license expression can use the following license identifiers: - Any SPDX-listed license short-form identifiers that are published in the SPDX @@ -209,146 +220,209 @@ identifiers: - The ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings to identify licenses that are not included in the SPDX license list. -When processing the ``License`` field to determine if it contains a valid -license expression, tools: +When processing the ``License-Expression`` field to determine if it contains +a valid license expression, build and publishing tools: -- SHOULD report an informational warning if one or more of the following - applies: +- SHOULD halt execution and raise an error if: - - the field does not contain a license expression + - The field does not contain a valid license expression - - the license expression syntax is invalid + - One or more license identifiers are not valid (as defined above) - - the license expression syntax is valid but some license identifiers are - unknown as defined here or the license identifiers have been marked as - deprecated in the SPDX License List [#spdxlist]_ +- SHOULD report an informational warning, and publishing tools MAY raise an + error if one or more license identifiers have been marked as deprecated in + the SPDX License List [#spdxlist]_. -- SHOULD store a case-normalized version of the ``License`` field using the - reference case for each SPDX license identifier and uppercase for the AND, OR - and WITH keywords. +- SHOULD store a case-normalized version of the ``License-Expression`` field + using the reference case for each SPDX license identifier and + uppercase for the ``AND``, ``OR`` and ``WITH`` keywords. 
-- SHOULD report an informational warning if normalization process results in - changes to the ``License`` field contents. +- SHOULD report an informational warning, and MAY raise an error if + the normalization process results in changes to the + ``License-Expression`` field contents. License expression examples:: - License: MIT - - License: BSD-3-Clause + License-Expression: MIT - License: MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause) + License-Expression: BSD-3-Clause - License: GPL-3.0-only WITH Classpath-Exception-2.0 OR BSD-3-Clause + License-Expression: MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause) - License: This software may only be obtained by sending the - author a postcard, and then the user promises not - to redistribute it. + License-Expression: GPL-3.0-only WITH Classpath-Exception-2.0 OR BSD-3-Clause - License: LicenseRef-Proprietary AND LicenseRef-Public-Domain + License-Expression: LicenseRef-Proprietary AND LicenseRef-Public-Domain -License-File (multiple use) ---------------------------- +Add License-File Field +---------------------- -A text string that is a path, relative to ``.dist-info``, to a license file. +The ``License-File`` field is specified to contain the string representation of +the path relative to ``.dist-info`` of license-related files, +such as license text, author/attribution information or legal notices. The license file content MUST be UTF-8 encoded text. +It is an optional, multi-use field which may appear zero or more times, +each instance listing a path to one of the license files to be included in +distributions of the package. + Build tools SHOULD honor this field and include the corresponding license file(s) in the built package. -Classifier (multiple use) -------------------------- +Deprecate License Field +----------------------- + +The legacy unstructured-text ``License`` field is deprecated and replaced by +the new ``License-Expression`` field. 
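As an illustration of the case normalization specified for ``License-Expression`` above, here is a minimal sketch. It assumes a deliberately tiny, hypothetical table of identifiers (``KNOWN_IDS``) in place of the full SPDX License List, and it only normalizes tokens; it does not check that the expression is syntactically valid. The function name ``normalize_expression`` is an illustration, not part of any existing tool.

```python
import re

# Hypothetical excerpt of SPDX identifiers in their reference case.
# A real tool would load the complete SPDX License List instead.
KNOWN_IDS = ["MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "GPL-2.0-or-later"]

_REFERENCE_CASE = {name.lower(): name for name in KNOWN_IDS}
_KEYWORDS = {"and", "or", "with"}
_TOKEN = re.compile(r"[()]|[^\s()]+")


def normalize_expression(expression: str) -> str:
    """Uppercase AND/OR/WITH and restore the reference case of identifiers.

    Raises ValueError for an identifier missing from the (assumed) table.
    Token-level only: operator placement and nesting are not validated.
    """
    out = []
    for token in _TOKEN.findall(expression):
        lowered = token.lower()
        if token in ("(", ")"):
            out.append(token)
        elif lowered in _KEYWORDS:
            out.append(lowered.upper())
        elif lowered in _REFERENCE_CASE:
            out.append(_REFERENCE_CASE[lowered])
        else:
            raise ValueError(f"unknown license identifier: {token!r}")
    # Join with single spaces, then tighten the spacing inside parentheses.
    return " ".join(out).replace("( ", "(").replace(" )", ")")
```

A tool following the rules above could compare the result with the original string and report a warning (or an error) when the two differ.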
-Each entry is a string giving a single classification value for the -distribution. Classifiers are described in PEP 301. +Build and publishing tools MUST raise an error if both fields are present and +their values are not identical, including capitalization and excluding +leading and trailing whitespace. -Examples:: +If only the ``License`` field is present, such tools SHOULD issue a warning +informing users it is deprecated and recommending ``License-Expression`` +instead. - Classifier: Development Status :: 4 - Beta - Classifier: Environment :: Console (Text Based) +Along with license classifiers, the ``License`` field may be removed from a +new version of the specification in a future PEP. -Tools SHOULD issue an informational warning if this field contains a licensing- -related classifier string starting with the ``License ::`` prefix and SHOULD -suggest the use of a license expression in the ``License`` field instead. -If the ``License`` field is present and contains a valid license expression, -publishing tools MUST NOT also provide any licensing-related classifier entries -[#classif]_. +Deprecate License Classifiers +----------------------------- -However, for compatibility with existing publishing and installation processes, -licensing-related classifier entries SHOULD continue to be accepted if the -``License`` field is absent or does not contain a valid license expression. +Including license classifiers [#classif]_ (those beginning with ``License ::``) +in the ``Classifier`` field (described in PEP 301) is deprecated and +replaced by the more precise ``License-Expression`` field. -Publishing tools MAY infer a license expression from the provided classifier -entries if they are able to do so unambiguously. +New license classifiers SHALL NOT be added; users needing them should +use the ``License-Expression`` field instead. +Along with the ``License`` field, license classifiers may be removed from a +new version of the specification in a future PEP.
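The interaction between the deprecated ``License`` field, license classifiers and ``License-Expression`` can be sketched as a small check. This is only an illustration under stated assumptions: the function name ``check_legacy_metadata`` and its return shape are hypothetical, ``None`` stands for an absent field, and the sketch takes the stricter stance this patch recommends for publishing tools (an error when classifiers accompany ``License-Expression``), which build tools are merely permitted to adopt.

```python
from typing import Iterable, List, Optional, Tuple


def check_legacy_metadata(
    license_expression: Optional[str],
    license_text: Optional[str],
    classifiers: Iterable[str] = (),
) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for the legacy license metadata.

    ``None`` means the corresponding field is absent (an assumption of
    this sketch, not something the specification text defines).
    """
    errors: List[str] = []
    warnings: List[str] = []
    has_license_classifier = any(c.startswith("License ::") for c in classifiers)

    if license_expression is not None:
        # Both fields present: values must match exactly, ignoring only
        # leading and trailing whitespace (capitalization is significant).
        if license_text is not None and license_text.strip() != license_expression.strip():
            errors.append("License and License-Expression are both set but differ")
        # Publishing-tool behavior; build tools MAY do the same.
        if has_license_classifier:
            errors.append("license classifiers must not accompany License-Expression")
    else:
        if license_text is not None:
            warnings.append("License is deprecated; use License-Expression instead")
        if has_license_classifier:
            warnings.append("license classifiers are deprecated; use License-Expression instead")
    return errors, warnings
```

Returning both lists, rather than raising on the first problem, lets a caller decide which findings are fatal for its own role (build tool or publishing tool).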
-However, no new licensing related classifiers will be added; anyone -requesting them will be directed to use a license expression in the ``License`` -field instead. Note that the licensing-related classifiers may be deprecated in -a future PEP. +If the ``License-Expression`` field is present, build tools MAY and publishing +tools SHOULD raise an error if one or more license classifiers (as defined +above) is included in a ``Classifier`` field, and not add such classifiers +themselves. +Otherwise, if this field contains a license classifier, build tools MAY +and publishing tools SHOULD issue a warning informing users such classifiers +are deprecated and recommending ``License-Expression`` instead. +For compatibility with existing publishing and installation processes, +the presence of license classifiers SHOULD NOT raise an error unless +``License-Expression`` is also provided. -Mapping Legacy Classifiers to New License Expressions -''''''''''''''''''''''''''''''''''''''''''''''''''''' -Publishing tools MAY infer or suggest an equivalent license expression from the -provided ``License`` or ``Classifier`` information if they are able to do so -unambiguously. For instance, if a package only has this license classifier:: +Conversion of Legacy License Metadata +===================================== + +If the contents of the ``License`` field are a valid SPDX expression containing +solely known, non-deprecated license identifiers, build and publishing tools MAY +use it to fill the ``License-Expression`` field. + +Similarly, if the ``Classifier`` field contains exactly one license classifier +(those beginning with ``License ::``) that unambiguously maps to exactly one +valid, non-deprecated SPDX identifier, tools MAY use it to fill the +``License-Expression`` field. 
+ +If both a non-empty ``License`` field and a single license classifier are +present, the contents of the ``License`` field, including capitalization +(but excluding leading and trailing whitespace), MUST exactly match the SPDX +license identifier mapped to the license classifier to be considered +unambiguous for the purposes of automatically filling the +``License-Expression`` field. + +For instance, if a package only has this license classifier:: Classifier: License :: OSI Approved :: MIT License -Then the corresponding value for a ``License`` field using a valid license -expression to suggest would be:: +And this value for the ``License`` field, if present and non-empty:: License: MIT -Here are mapping guidelines for the legacy classifiers: +Then the suggested value for a ``License-Expression`` field should be:: -- Classifier ``License :: Other/Proprietary License`` becomes License: - ``LicenseRef-Proprietary`` expression. + License-Expression: MIT -- Classifier ``License :: Public Domain`` becomes License: ``LicenseRef-Public-Domain`` - expression, though tools should encourage the use of more explicit and legally - portable license identifiers such as ``CC0-1.0`` [#cc0]_, the ``Unlicense`` - [#unlic]_ since the meaning associated with the term "public domain" is thoroughly - dependent on the specific legal jurisdiction involved and some jurisdictions - have no concept of Public Domain as it exists in the USA. +If tools have filled the ``License-Expression`` field as described above, +they MUST output a prominent, user-visible warning informing package authors +of that fact, including the ``License-Expression`` string they have output, +and recommending that the source metadata be updated accordingly +with the indicated ``License-Expression``. -- The generic and ambiguous classifiers ``License :: OSI Approved`` and - ``License :: DFSG approved`` do not have an equivalent license expression.
+In any other case, tools MUST NOT use the contents of the ``License`` field +or license classifiers to fill the ``License-Expression`` field without +informing the user and requiring unambiguous, affirmative user action to +select and confirm the desired ``License-Expression`` value before proceeding. + + +Mapping License Classifiers to SPDX Identifiers +----------------------------------------------- + +The following defines the mapping between legacy license classifiers +and SPDX license identifiers, in cases other than strict 1:1 +correspondence from a classifier to a defined identifier. + +- Classifier ``License :: Public Domain`` becomes + ``License-Expression: LicenseRef-Public-Domain``. Tools SHOULD + encourage the use of more explicit and legally portable license identifiers + such as ``CC0-1.0`` [#cc0]_ or the ``Unlicense`` [#unlic]_, + since the meaning associated with the term "public domain" is thoroughly + dependent on the specific legal jurisdiction involved, + some of which lack the concept entirely. - The generic and sometimes ambiguous classifiers - ``License :: Free For Educational Use``, ``License :: Free For Home Use``, - ``License :: Free for non-commercial use``, ``License :: Freely Distributable``, - ``License :: Free To Use But Restricted``, and ``License :: Freeware`` are mapped - to the generic License: ``LicenseRef-Proprietary`` expression. + ``License :: Free For Educational Use``, + ``License :: Free For Home Use``, + ``License :: Free for non-commercial use``, + ``License :: Freely Distributable``, + ``License :: Free To Use But Restricted``, + ``License :: Freeware``, and + ``License :: Other/Proprietary License`` MAY be mapped to the generic + ``License-Expression: LicenseRef-Proprietary`` expression, + but tools MUST issue a prominent, informative warning if they do so. + Alternatively, tools MAY choose to treat the above as ambiguous and + require user confirmation to fill ``License-Expression`` in these cases.
+ +- The generic and ambiguous classifiers ``License :: OSI Approved`` and + ``License :: DFSG approved`` do not map to any license expression, + and thus tools MUST treat them as ambiguous and require user intervention + to fill ``License-Expression``. -- Classifiers ``License :: GUST*`` have no mapping to SPDX license identifiers - for now and no package uses them in PyPI as of the writing of this PEP. +- The classifiers ``License :: GUST Font License 1.0*`` and + ``License :: GUST Font License 2006-09-30`` have no mapping to SPDX license + identifiers and no PyPI package uses them, as of the writing of this PEP. + Therefore, tools MUST treat them as ambiguous when attempting to fill + ``License-Expression``. -The remainder of the classifiers using a ``License ::`` prefix map to a simple -single-identifier license expression using the corresponding SPDX license identifiers. +The remainder of the classifiers using a ``License ::`` prefix map to a +simple single-identifier license expression using the corresponding +SPDX license identifiers. When multiple license-related classifiers are used, their relation is ambiguous and it is typically not possible to determine if all the licenses apply or if there is a choice that is possible among the licenses. In this case, tools -cannot reliably infer a license expression and should suggest that the package -author construct a license expression which expresses their intent. +MUST NOT automatically infer a license expression and SHOULD suggest that the +package author construct a license expression which expresses their intent. Backwards Compatibility ======================= -The reuse of the ``License`` field means that we keep backward -compatibility. The specification of the ``License-File`` field is only writing -down the practices of the ``wheel`` and ``setuptools`` tools and is backward -compatible with their support for that field. 
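The conversion and mapping rules above (exactly one license classifier, optionally cross-checked against a non-empty ``License`` field) can be sketched as follows. The table ``CLASSIFIER_TO_SPDX`` is a hypothetical two-entry excerpt taken from the examples in this patch, not a complete mapping, and ``suggest_expression`` is an illustrative name. The sketch returns ``None`` whenever the rules call for user intervention, and it does not cover the separate case of a ``License`` field that is itself a valid SPDX expression with no classifier present.

```python
from typing import Iterable, Optional

# Hypothetical excerpt; a real tool would carry the full classifier mapping.
CLASSIFIER_TO_SPDX = {
    "License :: OSI Approved :: MIT License": "MIT",
    "License :: Public Domain": "LicenseRef-Public-Domain",
}


def suggest_expression(
    classifiers: Iterable[str], license_text: Optional[str] = None
) -> Optional[str]:
    """Return a License-Expression value if it can be inferred unambiguously.

    ``None`` means the tool must not fill the field automatically and
    should ask the user instead.
    """
    license_classifiers = [c for c in classifiers if c.startswith("License ::")]
    # Zero or several license classifiers: their relation is ambiguous.
    if len(license_classifiers) != 1:
        return None
    # Generic classifiers such as "License :: OSI Approved" have no entry.
    spdx_id = CLASSIFIER_TO_SPDX.get(license_classifiers[0])
    if spdx_id is None:
        return None
    # A non-empty License field must match the mapped identifier exactly,
    # including capitalization, ignoring surrounding whitespace.
    if license_text is not None and license_text.strip():
        if license_text.strip() != spdx_id:
            return None
    return spdx_id
```

When a value is returned, the rules above still require the tool to print a prominent warning naming the inferred expression, so the caller should not apply it silently.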
+Adding a new, dedicated ``License-Expression`` field unambiguously signals +support for the new metadata fields and avoids the risk of new tooling +misinterpreting a license expression as a free-form license description, +or vice versa, and raises an error if and only if the user affirmatively +upgrades to the latest metadata version by adding said field. + +The legacy ``License`` field and ``License ::`` classifiers will be deprecated +but not removed, to retain backward compatibility, while gently preparing users +for their future removal. Eventually, they would be removed, but that would be +following a suitable transition period and left to a future PEP and a new +version of the core metadata specification. -The "soft" validation of the ``License`` field when it does not contain a valid -license expression and when the ``Classifier`` field is used with legacy -license-related classifiers means that we can gently prepare users for possible -strict and incompatible validation of these fields in the future. +Formally specifying the ``License-File`` field is only codifying the existing +practice in ``wheel`` and ``setuptools``, and should be fully +backwards-compatible with their existing use of that field. Security Implications @@ -367,32 +441,39 @@ expression and a large majority of packages use a single license. The plan to teach users of packaging tools how to express their package's license with a valid license expression is to have tools issue informative -messages when they detect invalid license expressions or when a license-related -classifier is used in the ``Classifier`` field. - -With a warning message that does not terminate processing, publishing tools will -gently teach users how to provide correct license expressions over time. +messages when they detect invalid license expressions, or when the deprecated +``License`` field or a ``License ::`` classifier is used. 
+ +An immediate, descriptive error message if an invalid ``License-Expression`` +is used will help users understand they need to use valid SPDX identifiers in +this field, and catch them if they make a mistake. +For authors still using the now-deprecated, less precise and more redundant +``License`` field or ``License ::`` classifiers, packaging tools will warn +them and inform them of the modern replacement, ``License-Expression``. +Finally, for users who may have forgotten or may not be aware they need to do so, +publishing tools will gently guide them toward including ``License-Expression`` +and ``License-File`` with their uploaded packages. Tools may also help with the conversion and suggest a license expression in some cases: -- The section `Mapping Legacy Classifiers to New License expressions`_ provides +- The section `Mapping License Classifiers to SPDX Identifiers`_ provides tool authors with guidelines on how to suggest a license expression produced from legacy classifiers. - Tools may also be able to infer and suggest how to update an existing - incorrect ``License`` value and convert that to a correct license expression. - For instance a tool may suggest to correct a ``License`` field from + ``License`` value and convert that to a ``License-Expression``. + For instance, a tool may suggest converting from a ``License`` field with ``Apache2`` (which is not a valid license expression as defined in this PEP) - to ``Apache-2.0`` (which is a valid license expression using an SPDX license - identifier as defined in this PEP). + to a ``License-Expression`` field with ``Apache-2.0`` (which is a valid license + expression using an SPDX license identifier). Reference Implementation ======================== Tools will need to support parsing and validating license expressions in the -``License`` field. +``License-Expression`` field.
The ``license-expression`` library [#licexp]_ is a reference Python implementation of a library that handles license expressions including parsing, @@ -406,20 +487,174 @@ and the Free Software Foundation Europe (FSFE) Reuse project [#reuse]_. Rejected Ideas ============== -Re-Use the License Field ------------------------- +Core Metadata Fields +-------------------- + +Potential alternatives to the structure, content and deprecation of the +core metadata fields specified in this PEP. -Adding a new field would introduce backward incompatible changes when the -``License`` field would be retired later and require having more complex -validation. The use of such a field would further introduce a new concept that -is not seen anywhere else in any other package metadata (e.g. a new field only -for license expression) and possibly be a source of confusion. Also, users are -less likely to start using a new field than make small adjustments to their use -of existing fields. + +Re-Use the License Field +'''''''''''''''''''''''' + +Following initial discussion [#reusediscussion]_, earlier versions of this +PEP proposed to re-use the existing ``License`` field, which tools would +attempt to parse as a SPDX expression with a fall back to treating as free +text. Initially, this would merely cause a warning (or even pass silently), +but would eventually be treated as an error by modern tooling. + +This offered the benefit of greater backwards-compatibility, +easing the community into using SPDX expressions while taking advantage of +packages that already have them (either intentionally or coincidentally), +and avoided adding yet another license-related field. + +However, following substantial discussion, consensus was reached that a +dedicated ``License-Expression`` field was the preferred overall approach. 
+The presence of this field is an unambiguous signal that a package +intends it to be interpreted as a valid SPDX identifier, without the need +for complex and potentially erroneous heuristics, and allows tools to +easily and unambiguously detect invalid content. + +This avoids both false positives (``License`` values that a package author +didn't intend as an explicit SPDX identifier, but that happen +to validate as one), and false negatives (expressions the author intended +to be valid SPDX, but due to a typo or mistake are not), which are otherwise +not clearly distinguishable from true positives and negatives, an ambiguity +at odds with the goals of this PEP. + +Furthermore, it allows both the existing ``License`` field and +the ``License ::`` classifiers to be more easily deprecated, +with tools able to cleanly distinguish between packages intending to +affirmatively conform to the updated specification in this PEP or not, +and adapt their behavior (warnings, errors, etc) accordingly. +Otherwise, tools would either have to allow duplicative and potentially +conflicting ``License`` fields and classifiers, or warn/error on the +substantial number of existing packages that have SPDX identifiers as the +value for the ``License`` field, intentionally or otherwise (e.g. ``MIT``). + +Finally, it avoids changing the behavior of an existing metadata field, +and avoids tools having to guess the ``Metadata-Version`` and field behavior +based on its value rather than merely its presence. + +While this would mean the subset of existing projects containing ``License`` +fields valid as SPDX expressions wouldn't automatically be recognized as such, +this only requires appending a few characters to the key name in the +package's source metadata, and this PEP provides extensive guidance on +how this can be done automatically by tooling. + +Given all this, it was decided to proceed with defining a new, +purpose-created field, ``License-Expression``.
+ + +Re-Use the License Field with a Value Prefix +'''''''''''''''''''''''''''''''''''''''''''' + +As an alternative to the above, it was suggested to reduce the ambiguity +inherent in re-using the ``License`` field by prefixing SPDX expressions +with, e.g. ``spdx:``. However, this effectively amounted to creating a field +within a field, and doesn't address all the downsides of keeping the +``License`` field. Namely, it still changes the behavior of an +existing metadata field, requires tools to parse its value +to determine how to handle its content, and makes the specification and +deprecation process more complex and less clean. + +Yet, it still shares the same main potential downside as just creating a new +field, that projects currently using valid SPDX identifiers in the ``License`` +field, intentionally or not, won't be automatically recognized, and requires +about the same amount of effort to fix, namely changing a line in the +package's source metadata. Therefore, it was rejected in favor of a new field. + + +Don't Make License-Expression Mutually Exclusive +'''''''''''''''''''''''''''''''''''''''''''''''' + +For backwards compatibility, the ``License`` field and/or the license +classifiers could still be allowed together with the new +``License-Expression`` field, presumably with a warning. However, this +could easily lead to inconsistent, and at the very least duplicative +license metadata in no less than *three* different fields, which is +squarely contrary to the goals of this PEP of making the licensing story +simpler and unambiguous. Therefore, and in concert with clear community +consensus otherwise, this idea was soundly rejected.
+ + +Don't Deprecate Existing License Field and Classifiers +'''''''''''''''''''''''''''''''''''''''''''''''''''''' + +Several community members were initially concerned that deprecating the +existing ``License`` field and license classifiers would result in +excessive churn for existing package authors and raise the barrier to +entry for new ones, particularly everyday Python developers seeking to +package and publish their personal projects without necessarily caring +too much about the legal technicalities or being a "license lawyer". +Indeed, every deprecation comes with some non-zero short-term cost, +and should be carefully considered relative to the overall long-term +net benefit. And at the minimum, this change shouldn't make it more +difficult for the average Python developer to share their work under +a license of their choice, and ideally improve the situation. + +Following many rounds of proposals, discussion and refinement, +the general consensus was clearly in favor of deprecating the legacy +means of specifying a license, in favor of "one obvious way to do it", +to improve the currently complex and fragmented story around license +documentation. Not doing so would leave three different un-deprecated ways of +specifying a license for a package, two of them ambiguous, less than +clear/obvious how to use, inconsistently documented and out of date. +This is more complex for all tools in the ecosystem to support +indefinitely (rather than simply installers supporting older packages +implementing previous frozen metadata versions), resulting in a non-trivial +and unbounded maintenance cost. + +Furthermore, it leads to a more complex and confusing landscape for users with +three similar but distinct options to choose from, particularly with older +documentation, answers and articles floating around suggesting different ones.
+Of the three, ``License-Expression`` is the simplest and clearest to use +correctly; users just paste in their desired license identifier, or select it +via a tool, and they're done; no need to learn about Trove classifiers and +dig through the list to figure out which one(s) apply (and be confused +by many ambiguous options), or figure out on their own what should go +in the ``license`` field (anything from nothing, to the license text, +to a free-form description, to the same SPDX identifier they would be +entering in the ``License-Expression`` field anyway, assuming they can +easily find documentation at all about it). In fact, this can be +made even easier thanks to the new field. For example, GitHub's popular +ChooseALicense.com [#choosealicense]_ links to how to add SPDX license +identifiers to the packaging metadata of various languages that support +them right in the sidebar of every license page; the SPDX support in this +PEP enables adding Python to that list. + +For current package maintainers who have specified a ``License`` or license +classifiers, this PEP only recommends warnings and prohibits errors for +all but publishing tools, which are allowed to error if their intended +distribution platform(s) so requires. Once maintainers are ready to +upgrade, for those already using SPDX expressions (accidentally or not) +this only requires appending a few characters to the key name in the +package's source metadata, and for those with license classifiers that +map to a single unambiguous license, or another defined case (public domain, +proprietary), they merely need to drop the classifier and paste in the +corresponding license identifier. This PEP provides extensive guidance and +examples, as will other resources, as well as explicit instructions for +automated tooling to take care of this with no human changes needed. 
+More complex cases where license metadata is currently specified may +need a bit of human intervention, but in most cases tools will be able +to provide a list of options following the mappings in this PEP, and +these are typically the projects most likely to be concerned about +licensing issues in any case, and thus most benefited by this PEP. + +Finally, for unmaintained packages, those using tools supporting older +metadata versions, or those who choose not to provide license metadata, +no changes are required regardless of the deprecation. + + +Other Ideas +----------- + +Miscellaneous proposals, possibilities and discussion points that were +ultimately not adopted. Map Identifiers to License Files --------------------------------- +'''''''''''''''''''''''''''''''' This would require using a mapping (two parallel lists would be too prone to alignment errors) and a mapping would bring extra complication to how license @@ -445,13 +680,36 @@ string. Map Identifiers to Source Files -------------------------------- +''''''''''''''''''''''''''''''' File-level notices are not considered as part of the scope of this PEP and the existing ``SPDX-License-Identifier`` [#spdxids]_ convention can be used and may not need further specification as a PEP. +Don't Require Compatibility with a Specific SPDX Version +'''''''''''''''''''''''''''''''''''''''''''''''''''''''' + +This PEP could omit specifying a specific SPDX specification version, +or one for the list of valid license identifiers, which would allow +more flexible updates as the specification evolves without another +PEP or equivalent. + +However, serious concerns were expressed about a future SPDX update breaking +compatibility with existing expressions and identifiers, leaving current +packages with invalid metadata per the definition in this PEP. 
Requiring +compatibility with a specific version of these specifications here +and requiring a PEP or similar process to update it prevents that from +occurring, and follows the practice of other packaging ecosystems. + +Therefore, it was decided [#spdxversion]_ to specify a minimum version +and require tools to be compatible with it, while still allowing updates +so long as they don't break backward compatibility. This enables +tools to immediately take advantage of improvements and accept new +licenses, but also remain backwards compatible with the version +specified here, balancing flexibility and compatibility. + + Appendix 1. License Expression Example ====================================== @@ -465,7 +723,7 @@ the ``License`` field. It uses instead this license-related information in The simplest migration to this PEP would consist of using this instead:: - license = MIT + license_expression = MIT license_files = LICENSE @@ -489,7 +747,7 @@ Therefore, a comprehensive license expression covering both ``setuptools`` prope and its vendored packages could contain these metadata, combining all the license expressions in one expression:: - license = MIT AND (Apache-2.0 OR BSD-2-Clause) + license_expression = MIT AND (Apache-2.0 OR BSD-2-Clause) license_files = LICENSE.MIT LICENSE.packaging @@ -793,6 +1051,9 @@ References .. [#reuse] https://reuse.software/ .. [#licexp] https://github.com/nexB/license-expression/ .. [#spdxpy] https://github.com/spdx/tools-python/ +.. [#reusediscussion] https://github.com/pombredanne/spdx-pypi-pep/issues/7 +.. [#choosealicense] https://choosealicense.com/ +.. [#spdxversion] https://github.com/pombredanne/spdx-pypi-pep/issues/6 .. [#scancodetk] https://github.com/nexB/scancode-toolkit .. [#licfield] https://packaging.python.org/guides/distributing-packages-using-setuptools/?highlight=MANIFEST.in#license ..
[#samplesetup] https://github.com/pypa/sampleproject/blob/52966defd6a61e97295b0bb82cd3474ac3e11c7a/setup.py#L98 From 5ef0e154c6e8545b416f77a4c5590f1cc849501c Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Wed, 17 Nov 2021 01:17:56 -0600 Subject: [PATCH 04/19] PEP 639: Update examples, current tools survey, links and more --- pep-0639.rst | 209 +++++++++++++++++++++++++++++---------------------- 1 file changed, 118 insertions(+), 91 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index fe53c1295ee..11384762bb0 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -169,7 +169,8 @@ structured and unambiguous way to express the license of a distribution using a well-defined syntax and well-known license identifiers. Similarly, a formally-specified ``License-File`` field offers a standardized way to declare the full text of the license(s) as legally required to be -included with the package when distributed. +included with the package when distributed, and allows other tools consuming +the core metadata to unambiguously locate a distribution's license files. Over time, encouraging the use of these fields and deprecating the ambiguous, duplicative legacy alternatives will help Python software publishers improve @@ -213,9 +214,9 @@ license expression definition, a license expression can use the following licens identifiers: - Any SPDX-listed license short-form identifiers that are published in the SPDX - License List [#spdxlist]_ using either Version 3.10 or any later compatible - version. Note that the SPDX working group never removes any license - identifiers: instead they may choose to mark an identifier as "deprecated". + License List [#spdxlist]_, version 3.15 or any later compatible version. + Note that the SPDX working group never removes any license identifiers; + instead, they may choose to mark an identifier as "deprecated".
- The ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings to identify licenses that are not included in the SPDX license list. @@ -713,73 +714,78 @@ specified here, balancing flexibility and compatibility. Appendix 1. License Expression Example ====================================== -The current version of ``setuptools`` metadata [#setuptools5030]_ does not use -the ``License`` field. It uses instead this license-related information in -``setup.cfg``:: +Setuptools itself, as of version 59.1.1 [#setuptools5911]_, does not use the +``License`` field. Further, ``license_file``/``license_files`` is no longer +explicitly specified, as it was previously, since ``setuptools`` relies on +its automatic inclusion of license-related files matching common patterns, +including the ``LICENSE`` file it uses. + +It only includes the following license-related metadata in its ``setup.cfg``:: - license_file = LICENSE classifiers = License :: OSI Approved :: MIT License The simplest migration to this PEP would consist of using this instead:: license_expression = MIT - license_files = - LICENSE -Another possibility would be to include the licenses of the third-party packages +Suppose Setuptools were to include the licenses of the third-party packages that are vendored in the ``setuptools/_vendor/`` and ``pkg_resources/_vendor`` -directories:: +directories; specifically:: - appdirs==1.4.3 - packaging==20.4 + packaging==21.2 pyparsing==2.2.1 ordered-set==3.1.1 + more_itertools==8.8.0 -These license expressions for these packages are:: +The license expressions for these packages are:: - appdirs: MIT packaging: Apache-2.0 OR BSD-2-Clause pyparsing: MIT ordered-set: MIT + more_itertools: MIT -Therefore, a comprehensive license expression covering both ``setuptools`` proper -and its vendored packages could contain these metadata, combining all the -license expressions in one expression:: +Therefore, a comprehensive license expression covering both ``setuptools`` +proper and 
its vendored dependencies could contain these metadata, combining +all the license expressions in one expression:: license_expression = MIT AND (Apache-2.0 OR BSD-2-Clause) license_files = - LICENSE.MIT - LICENSE.packaging + LICENSE.txt + LICENSE-PACKAGING.txt -Here we would assume that the ``LICENSE.MIT`` file contains the text of the MIT -license and the copyrights used by ``setuptools``, ``appdirs``, ``pyparsing`` and -``ordered-set``, and that the ``LICENSE.packaging`` file contains the texts of the -Apache and BSD license, its copyrights and its license choice notice [#packlic]_. +Here we would assume that the ``LICENSE.txt`` file contains the text of the MIT +license and the copyrights used by ``setuptools``, ``pyparsing``, +``more_itertools`` and ``ordered-set``, and that the ``LICENSE-PACKAGING.txt`` +file contains the texts of the Apache and BSD licenses, and the ``packaging`` +copyright statements and license choice notice [#packlic]_. Appendix 2. Surveying How we Document Licenses Today in Python ============================================================== There are multiple ways used or recommended to document Python package -licenses today: +licenses today. The most common are listed below. In Core Metadata ---------------- There are two overlapping core metadata fields to document a license: the -license-related ``Classifier`` strings [#classif]_ prefixed with ``License ::`` and -the ``License`` field as free text [#licfield]_. +license-related ``Classifier`` strings [#classif]_ prefixed with ``License ::`` +and the ``License`` field as free text [#licfield]_. The core metadata documentation ``License`` field documentation is currently:: - License (optional) - :::::::::::::::::: + License + ======= + + .. versionadded:: 1.0 Text indicating the license covering the distribution where the license is not a selection from the "License" Trove classifiers. See - "Classifier" below. This field may also be used to specify a + :ref:`"Classifier" ` below. 
+ This field may also be used to specify a particular version of a license which is named via the ``Classifier`` field, or to indicate a variation or exception to such a license. @@ -792,49 +798,65 @@ The core metadata documentation ``License`` field documentation is currently:: License: GPL version 3, excluding DRM provisions Even though there are two fields, it is at times difficult to convey anything -but simpler licensing. For instance some classifiers lack accuracy (GPL -without a version) and when you have multiple License-related classifiers it is -not clear if this is a choice or all these apply and which ones. Furthermore, -the list of available license-related classifiers is often out-of-date. - - -In the PyPA Sample Project --------------------------- - -The latest PyPA ``sampleproject`` recommends only to use classifiers in -``setup.py`` and does not list the ``license`` field in its example -``setup.py`` [#samplesetup]_. +but simpler licensing. For instance, some classifiers lack precision +(GPL without a version) and when multiple license-related classifiers are +listed, it is not clear if both licenses must apply, or the user may choose +between them. Furthermore, the list of available license-related classifiers +is often out-of-date. -License Files in Wheels and Setuptools --------------------------------------- +In Wheels and Setuptools +------------------------ Beyond a license code or qualifier, license text files are documented and included in a built package either implicitly or explicitly and this is another possible source of confusion: -- In wheels [#wheels]_ license files are automatically added to the ``.dist-info`` - directory if they match one of a few common license file name patterns (such - as ``LICENSE*`` and ``COPYING*``). Alternatively a package author can specify - a list of license file paths to include in the built wheel in the ``license_files`` - field in the ``[metadata]`` section of the project's ``setup.cfg``. 
- Previously this was a (singular) ``license_file`` file attribute that is now - deprecated but is still in common use. See [#pipsetup]_ for instance. +- In ``setuptools`` [#setuptoolssdist]_ and wheels [#wheels]_, license files + are automatically added to the distribution (at their source location + in a source distribution (sdist), and in the ``.dist-info`` directory + of a built wheel) if they match one of a number of common license file + name patterns (``LICEN[CS]E*``, ``COPYING*``, ``NOTICE*`` and ``AUTHORS*``). + Alternatively, a package author can specify a list of license file paths to + include in the built wheel under the ``license_files`` key in the + ``[metadata]`` section of the project's ``setup.cfg``, or as an argument + to the ``setuptools`` ``setup()`` function. -- In ``setuptools`` [#setuptoolssdist]_, a ``license_file`` attribute is used to add - a single license file to a source distribution. This singular version is - still honored by ``wheels`` for backward compatibility. +- Both tools also support an older, singular ``license_file`` parameter that + allows specifying only one license file to add to the distribution, which + has been deprecated for some time but still sees some use. + See [#pipsetup]_ for instance. + +- Following the publication of an earlier draft of this PEP, ``setuptools`` + added support for ``License-File`` in package metadata as described here. + This allows other tools consuming the resulting metadata to unambiguously + locate the license file(s) for a given package. -- Using a ``LICENSE.txt`` file is encouraged in the packaging guide [#packaging]_ - paired with a ``MANIFEST.in`` entry to ensure that the license file is included - in a built source distribution (sdist). **Note:** the ``License-File`` field proposed in this PEP already exists in ``wheel`` and ``setuptools`` with the same behaviour as explained above. 
This PEP is only recognizing and documenting the existing practice as used -in ``wheel`` (with the ``license_file`` and ``license_files`` ``setup.cfg`` -``[metadata]`` entries) and in the ``setuptools`` ``license_file`` -``setup()`` argument. +in ``wheel`` and ``setuptools`` to add license files to the distribution, +and formally including their paths in core metadata (which has since been +implemented on the basis of a draft of this PEP). + + +In the PyPA Packaging User Guide and Sample Project +--------------------------------------------------- + +Both the PyPA beginner packaging tutorial [#packagingtuttxt]_ and its more +comprehensive packaging guide [#packagingguidetxt]_ state that it is important +that every package include a license file. They point to the ``LICENSE.txt`` +in the official PyPA sample project as an example, which is explicitly listed +under the ``license_files`` key in its ``setup.cfg`` [#samplesetupcfg]_, +following existing practice formally specified by this PEP. + +Both the beginner packaging tutorial [#packagingtutkey]_ and the sample project +[#samplesetuppy]_ only use classifiers to declare a package's license, and do +not include or mention the ``license`` field. The full packaging guide does +mention this field, but states that authors should use the license classifiers +instead, unless the project uses a non-standard license (which the guide +discourages) [#licfield]_. In Python Source Code Files @@ -855,13 +877,15 @@ the ``help()`` DATA section for a module. In Other Python Packaging Tools ------------------------------- -- Conda package manifest [#conda]_ has support for ``license`` and ``license_file`` - fields as well as a ``license_family`` license grouping field. +- Conda package manifests [#conda]_ have support for ``license`` and + ``license_file`` fields, and automatically include license files + following similar naming patterns as ``wheel`` and ``setuptools``. 
-- Flit [#flit]_ recommends to use classifiers instead of ``License`` - (as per the current metadata spec). +- Flit [#flit]_ recommends using classifiers instead of the ``License`` field + (per the current PyPA packaging guide). -- PBR [#pbr]_ uses similar data as setuptools but always stored setup.cfg. +- PBR [#pbr]_ uses similar data as setuptools, but always stored in + ``setup.cfg``. - Poetry [#poetry]_ specifies the use of the ``license`` field in ``pyproject.toml`` with SPDX license identifiers. @@ -1042,12 +1066,12 @@ References .. [#cms] https://packaging.python.org/specifications/core-metadata .. [#cdstats] https://clearlydefined.io/stats .. [#cd] https://clearlydefined.io -.. [#osi] http://opensource.org +.. [#osi] https://opensource.org .. [#classif] https://pypi.org/classifiers -.. [#spdxlist] https://spdx.org/licenses -.. [#spdx] https://spdx.org -.. [#spdx22] https://spdx.github.io/spdx-spec/appendix-IV-SPDX-license-expressions/ -.. [#wheels] https://github.com/pypa/wheel/blob/b8b21a5720df98703716d3cd981d8886393228fa/docs/user_guide.rst#including-license-files-in-the-generated-wheel-file +.. [#spdxlist] https://spdx.org/licenses/ +.. [#spdx] https://spdx.dev/ +.. [#spdx22] https://spdx.github.io/spdx-spec/SPDX-license-expressions/ +.. [#wheels] https://github.com/pypa/wheel/blob/0.37.0/docs/user_guide.rst#including-license-files-in-the-generated-wheel-file .. [#reuse] https://reuse.software/ .. [#licexp] https://github.com/nexB/license-expression/ .. [#spdxpy] https://github.com/spdx/tools-python/ @@ -1055,17 +1079,20 @@ References .. [#choosealicense] https://choosealicense.com/ .. [#spdxversion] https://github.com/pombredanne/spdx-pypi-pep/issues/6 .. [#scancodetk] https://github.com/nexB/scancode-toolkit -.. [#licfield] https://packaging.python.org/guides/distributing-packages-using-setuptools/?highlight=MANIFEST.in#license -.. [#samplesetup] https://github.com/pypa/sampleproject/blob/52966defd6a61e97295b0bb82cd3474ac3e11c7a/setup.py#L98 -.. 
[#pipsetup] https://github.com/pypa/pip/blob/476606425a08c66b9c9d326994ff5cf3f770926a/setup.cfg#L40 -.. [#setuptoolssdist] https://github.com/pypa/setuptools/blob/97e8ad4f5ff7793729e9c8be38e0901e3ad8d09e/setuptools/command/sdist.py#L202 -.. [#packaging] https://packaging.python.org/guides/distributing-packages-using-setuptools/?highlight=MANIFEST.in#license-txt +.. [#licfield] https://packaging.python.org/guides/distributing-packages-using-setuptools/#license +.. [#samplesetuppy] https://github.com/pypa/sampleproject/blob/3a836905fbd687af334db16b16c37cf51dcbc99c/setup.py#L98 +.. [#samplesetupcfg] https://github.com/pypa/sampleproject/blob/3a836905fbd687af334db16b16c37cf51dcbc99c/setup.cfg +.. [#pipsetup] https://github.com/pypa/pip/blob/21.3.1/setup.cfg#L114 +.. [#setuptoolssdist] https://github.com/pypa/setuptools/pull/1767 +.. [#packagingtuttxt] https://packaging.python.org/tutorials/packaging-projects/#creating-a-license +.. [#packagingguidetxt] https://packaging.python.org/guides/distributing-packages-using-setuptools/#license-txt +.. [#packagingtutkey] https://packaging.python.org/tutorials/packaging-projects/#configuring-metadata .. [#pycode] https://github.com/search?l=Python&q=%22__license__%22&type=Code -.. [#setuptools5030] https://github.com/pypa/setuptools/blob/v50.3.0/setup.cfg#L17 -.. [#packlic] https://github.com/pypa/packaging/blob/19.1/LICENSE -.. [#conda] https://docs.conda.io/projects/conda-build/en/latest/resources/define-metadata.html#about-section -.. [#flit] https://github.com/takluyver/flit -.. [#poetry] https://poetry.eustace.io/docs/pyproject/#license +.. [#setuptools5911] https://github.com/pypa/setuptools/blob/v59.1.1/setup.cfg +.. [#packlic] https://github.com/pypa/packaging/blob/21.2/LICENSE +.. [#conda] https://docs.conda.io/projects/conda-build/en/stable/resources/define-metadata.html#about-section +.. [#flit] https://flit.readthedocs.io/en/stable/pyproject_toml.html +.. [#poetry] https://python-poetry.org/docs/pyproject/#license .. 
[#pbr] https://docs.openstack.org/pbr/latest/user/features.html .. [#dep5] https://dep-team.pages.debian.net/deps/dep5/ .. [#fedora] https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/ @@ -1077,16 +1104,16 @@ References .. [#gentoo] https://devmanual.gentoo.org/ebuild-writing/variables/index.html#license .. [#glep23] https://www.gentoo.org/glep/glep-0023.html .. [#gentoodev] https://devmanual.gentoo.org/general-concepts/licenses/index.html -.. [#freebsd] https://www.freebsd.org/doc/en_US.ISO8859-1/books/porters-handbook/licenses.html -.. [#archinux] https://wiki.archlinux.org/index.php/PKGBUILD#license -.. [#archlinuxlist] https://wiki.archlinux.org/index.php/PKGBUILD#license +.. [#freebsd] https://docs.freebsd.org/en/books/porters-handbook/makefiles/#licenses +.. [#archinux] https://wiki.archlinux.org/title/PKGBUILD#license +.. [#archlinuxlist] https://archlinux.org/packages/core/any/licenses/files/ .. [#openwrt] https://openwrt.org/docs/guide-developer/packages#buildpackage_variables -.. [#nixos] https://github.com/NixOS/nixpkgs/blob/master/lib/licenses.nix -.. [#guix] http://git.savannah.gnu.org/cgit/guix.git/tree/guix/licenses.scm +.. [#nixos] https://github.com/NixOS/nixpkgs/blob/21.05/lib/licenses.nix +.. [#guix] https://git.savannah.gnu.org/cgit/guix.git/tree/guix/licenses.scm?h=v1.3.0 .. [#guixlic] https://guix.gnu.org/manual/en/html_node/package-Reference.html#index-license_002c-of-packages .. [#alpine] https://wiki.alpinelinux.org/wiki/Creating_an_Alpine_package#license .. [#maven] https://maven.apache.org/pom.html#Licenses -.. [#npm] https://docs.npmjs.com/files/package.json#license +.. [#npm] https://docs.npmjs.com/cli/v8/configuring-npm/package-json#license .. [#gem] https://guides.rubygems.org/specification-reference/#license= .. [#perl] https://metacpan.org/pod/CPAN::Meta::Spec#license .. [#cargo] https://doc.rust-lang.org/cargo/reference/manifest.html#package-metadata @@ -1094,20 +1121,20 @@ References .. 
[#composer] https://getcomposer.org/doc/04-schema.md#license .. [#nuget] https://docs.microsoft.com/en-us/nuget/reference/nuspec#licenseurl .. [#flutter] https://flutter.dev/docs/development/packages-and-plugins/developing-packages#adding-licenses-to-the-license-file -.. [#bower] https://github.com/bower/spec/blob/master/json.md#license +.. [#bower] https://github.com/bower/spec/blob/b00c4403e22e3f6177c410ed3391b9259687e461/json.md#license .. [#cocoapod] https://guides.cocoapods.org/syntax/podspec.html#license -.. [#cabal] https://cabal.readthedocs.io/en/latest/developing-packages.html#pkg-field-license +.. [#cabal] https://cabal.readthedocs.io/en/3.6/cabal-package.html?highlight=license#pkg-field-license .. [#mix] https://hex.pm/docs/publish .. [#dub] https://dub.pm/package-format-json.html#licenses .. [#cran] https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Licensing -.. [#spdxids] https://spdx.org/using-spdx-license-identifier +.. [#spdxids] https://spdx.dev/resources/use/#identifiers .. [#gnu] https://www.gnu.org/licenses/identify-licenses-clearly.html .. [#fsf] https://www.fsf.org/blogs/rms/rms-article-for-claritys-sake-please-dont-say-licensed-under-gnu-gpl-2 .. [#linux] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/license-rules.rst .. [#uboot] https://www.denx.de/wiki/U-Boot/Licensing .. [#apache] https://svn.apache.org/repos/asf/allura/doap_Allura.rdf .. [#eclipse] https://www.eclipse.org/legal/epl-2.0/faq.php -.. [#android] https://github.com/aosp-mirror/platform_external_tcpdump/blob/master/MODULE_LICENSE_BSD +.. [#android] https://github.com/aosp-mirror/platform_external_tcpdump/blob/android-platform-12.0.0_r1/MODULE_LICENSE_BSD .. [#cc0] https://creativecommons.org/publicdomain/zero/1.0/ .. [#unlic] https://unlicense.org/ From 4df70ca3a7a59f972a448e2fd7700308fb774807 Mon Sep 17 00:00:00 2001 From: "C.A.M. 
Gerlach" Date: Thu, 18 Nov 2021 01:15:14 -0600 Subject: [PATCH 05/19] PEP 639: Add PEP 621, expand/move examples & reorganize/shorten headings --- pep-0639.rst | 277 ++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 210 insertions(+), 67 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index 11384762bb0..8fa3dbc8edd 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -181,17 +181,29 @@ to the benefit of package authors, consumers and redistributors alike. Specification ============= +The changes necessary to implement the improved license handling outlined in +this PEP include those in both author-provided static source metadata, as +specified in PEP 621, and built package metadata, as defined in the Core +Metadata specification [#cms]_. Furthermore, requirements are needed for +tools handling and converting legacy license metadata to license expressions, +to ensure the results are consistent, correct and unambiguous. + + +Core Metadata +------------- + The canonical source for the names and semantics of each of the supported metadata fields is the Core Metadata Specification [#cms]_ document. -This PEP proposes adding the ``License-Expression`` and ``License-File`` -fields, deprecating the ``License`` field, and deprecating the ``License ::`` + +This PEP adds the ``License-Expression`` and ``License-File`` fields, +deprecates the ``License`` field, and deprecates the ``License ::`` classifiers in the ``Classifier`` field. As it adds new fields, this PEP updates the core metadata to version 2.3. Add License-Expression Field ----------------------------- +'''''''''''''''''''''''''''' The ``License-Expression`` optional field is specified to contain a text string that is a valid SPDX license expression, defined below. @@ -200,10 +212,6 @@ Publishing tools SHOULD issue an informational warning if this field is missing, and MAY raise an error. Build tools MAY issue a similar warning, but MUST NOT raise an error. 
- -License Expression Syntax -''''''''''''''''''''''''' - A license expression is a string using the SPDX license expression syntax as documented in the SPDX specification [#spdx]_ using either Version 2.2 [#spdx22]_ or a later compatible version. SPDX is a working group at the Linux @@ -234,7 +242,7 @@ a valid license expression, build and publishing tools: error if one or more license identifiers have been marked as deprecated in the SPDX License List [#spdxlist]_. -- SHOULD store a case-normalized version of the ``License-Expression`` field +- MUST store a case-normalized version of the ``License-Expression`` field using the reference case for each SPDX license identifier and uppercase for the ``AND``, ``OR`` and ``WITH`` keywords. @@ -242,21 +250,9 @@ a valid license expression, build and publishing tools: the normalization process results in changes to the ``License-Expression`` field contents. -License expression examples:: - - License-Expression: MIT - - License-Expression: BSD-3-Clause - - License-Expression: MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause) - - License-Expression: GPL-3.0-only WITH Classpath-Exception-2.0 OR BSD-3-Clause - - License-Expression: LicenseRef-Proprietary AND LicenseRef-Public-Domain - Add License-File Field ----------------------- +'''''''''''''''''''''' The ``License-File`` field is specified to contain the string representation of the path relative to ``.dist-info`` of license-related files, @@ -272,7 +268,7 @@ file(s) in the built package. Deprecate License Field ------------------------ +''''''''''''''''''''''' The legacy unstructured-text ``License`` field is deprecated and replaced by the new ``License-Expression`` field. @@ -286,11 +282,11 @@ informing users it is deprecated and recommending ``License-Expression`` instead. Along with license classifiers, the ``License`` field may be removed from a -new version of specification in a future PEP. +new version of the specification in a future PEP. 
Deprecate License Classifiers ------------------------------ +''''''''''''''''''''''''''''' Including license classifiers [#classif]_ (those beginning with ``License ::``) in the ``Classifier`` field (described in PEP 301) is deprecated and @@ -314,8 +310,96 @@ the presence of license classifiers SHOULD NOT raise an error unless ``License-Expression`` is also provided. -Conversion of Legacy License Metadata -===================================== +PEP 621 Source Metadata +----------------------- + +As currently specified in the canonical PyPA specification [#pypapep621]_, +PEP 621 defines how to declare a project's source metadata in a ``[project]`` +table in the ``pyproject.toml`` file for packaging tools to consume and +output a distribution's core metadata. + +This PEP adds the ``expression`` and ``files`` keys to the ``license`` table +and deprecates the ``file`` and ``text`` keys. + + +Add Expression Key +'''''''''''''''''' + +A new ``expression`` key is added to the ``license`` table, which has a string +value that is a valid SPDX license expression, as defined above. Its value +maps to the ``License-Expression`` field in the core metadata. +Tools MAY also back-fill the ``license`` core metadata field with the +verbatim value of the ``expression`` key, but to do so, ``license`` MUST be +listed as ``dynamic``. + +It is mutually exclusive with both the ``file`` and ``text`` keys; tools +MUST raise an error if both ``expression`` and ``text`` are present, and +SHOULD raise an error if both ``expression`` and ``file`` are specified. +It is not mutually exclusive with the ``files`` key; both can and should +be included. + +Packaging tools SHOULD validate the field as described above, outputting an +error or warning as specified, but MUST NOT perform case normalization unless +the field is listed as ``dynamic``, and if it is not, MUST raise an error if +case-normalization would result in a change to the field contents. 
+ + +Add Files Key +''''''''''''' + +A new ``files`` key is added to the ``license`` table, which has as its +value an array of strings that are paths relative to the ``pyproject.toml`` +to file(s) containing licenses and other legal notices for the project. +Its values map to the ``License-File`` fields in the core metadata. +Packaging tools SHOULD include all specified files in any distribution +artifacts for the project, and MUST assume all files are encoded as UTF-8. + +It is mutually exclusive with both the ``file`` and ``text`` keys; tools +MUST raise an error if both ``files`` and ``file`` are present, and SHOULD +raise an error if both ``files`` and ``text`` are specified. +It is not mutually exclusive with the ``expression`` key; both can and +should be included. + + +Deprecate File Key +'''''''''''''''''' + +The (singular) ``file`` key in the ``license`` table is now deprecated. +However, its value now maps to the ``License-File`` field in the core +metadata; packaging tools SHOULD now use it to fill that field, +and SHOULD include the file in any distribution artifacts for the project. + +The new ``expression`` and ``files`` keys are mutually exclusive with it; +tools MUST raise an error if both ``file`` and ``files`` are present, +and SHOULD raise an error if both ``file`` and ``expression`` are specified. + +If only the ``file`` key is present in the ``license`` table, tools SHOULD +issue a warning informing users it is deprecated and recommending the +``files`` key instead. + +The ``file`` key may be removed from a new version of the specification +in a future PEP. + + +Deprecate Text Key +'''''''''''''''''' + +The ``text`` key in the ``license`` table is now deprecated. + +The new ``expression`` and ``files`` keys are mutually exclusive with it; +tools MUST raise an error if both ``text`` and ``expression`` are present, +and SHOULD raise an error if both ``text`` and ``files`` are specified. 
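The mutual-exclusion rules stated so far for the ``license`` table can be sketched as a small check. This is only an illustration under stated assumptions: ``check_license_table`` and the two constant names are hypothetical and do not come from any real packaging tool, and the input is assumed to be the already-parsed ``[project.license]`` table as a plain ``dict``.

```python
# Hypothetical helper for illustration only; not the API of any real tool.
# Key pairs for which tools MUST raise an error when both are present.
MUST_CONFLICTS = [("expression", "text"), ("files", "file")]
# Key pairs for which tools SHOULD raise an error when both are present.
SHOULD_CONFLICTS = [("expression", "file"), ("files", "text")]


def check_license_table(license_table):
    """Return (must_errors, should_errors) for a parsed ``license`` table.

    ``license_table`` is assumed to be a dict such as
    ``{"expression": "MIT", "files": ["LICENSE"]}``.
    """

    def conflicts(pairs):
        # Report every pair whose two keys both appear in the table.
        return [
            f"'{first}' and '{second}' are mutually exclusive"
            for first, second in pairs
            if first in license_table and second in license_table
        ]

    return conflicts(MUST_CONFLICTS), conflicts(SHOULD_CONFLICTS)
```

A tool following this sketch would abort on any entry in the first list, and would normally also abort on entries in the second list unless it has a reason not to. The ``expression`` and ``files`` keys together produce no conflict, matching the statement that both can and should be included.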
+ +If only the ``text`` key is present in the ``license`` table, tools SHOULD +issue a warning informing users it is deprecated and recommending the +``expression`` key instead. + +The ``text`` key may be removed from a new version of the specification +in a future PEP. + + +Converting Legacy Metadata +-------------------------- If the contents of the ``License`` field are a valid SPDX expression containing solely known, non-deprecated license identifiers, build and publishing tools MAY @@ -330,19 +414,8 @@ If both a non-empty ``License`` field and a single license classifier are present, the contents of the ``License`` field, including capitalization (but excluding leading and trailing whitespace), MUST exactly match the SPDX license identifier mapped to the license classifier to be considered -unambiguous for the purposes of automatically - -For instance, if a package only has this license classifier:: - - Classifier: License :: OSI Approved :: MIT License - -And this value for the ``License`` field, if present and non-empty:: - - License: MIT - -Then the suggested value for a ``License-Expression`` field should be:: - - License-Expression: MIT +unambiguous for the purposes of automatically filling the +``License-Expression`` field. If tools have filled the ``License-Expression`` field as described above, they MUST output a prominent, user-visible warning informing package authors @@ -357,7 +430,7 @@ select and confirm the desired ``License-Expression`` value before proceeding. Mapping License Classifiers to SPDX Identifiers ------------------------------------------------ +''''''''''''''''''''''''''''''''''''''''''''''' The following defines the mapping between legacy license classifiers and SPDX license identifiers, in cases other than strict 1:1 @@ -429,9 +502,9 @@ backwards-compatible with their existing use of that field. 
Security Implications ===================== -This PEP has no foreseen security implications: the License field is a plain -string and the License-File(s) are file paths. None of them introduces any new -security concern. +This PEP has no foreseen security implications: the License-Expression field is +a plain string and the License-File(s) are file paths. None of them introduces +any known new security concerns. How to Teach This @@ -711,8 +784,11 @@ licenses, but also remain backwards compatible with the version specified here, balancing flexibility and compatibility. -Appendix 1. License Expression Example -====================================== +Appendix 1. License Expression Examples +======================================= + +Simple Example +-------------- Setuptools itself, as of version 59.1.1 [#setuptools5911]_, does not use the ``License`` field. Further, ``license_file``/``license_files`` is no longer @@ -729,6 +805,20 @@ The simplest migration to this PEP would consist of using this instead:: license_expression = MIT +Or, in a PEP 621 ``pyproject.toml``:: + + [project.license] + expression = "MIT" + +The output core metadata for the package would then be:: + + License-Expression: MIT + License-File: LICENSE + + +Complex Example +--------------- + Suppose Setuptools were to include the licenses of the third-party packages that are vendored in the ``setuptools/_vendor/`` and ``pkg_resources/_vendor`` directories; specifically:: @@ -747,29 +837,81 @@ The license expressions for these packages are:: Therefore, a comprehensive license expression covering both ``setuptools`` proper and its vendored dependencies could contain these metadata, combining -all the license expressions in one expression:: +all the license expressions in one expression. Our ``setup.cfg`` would then be:: license_expression = MIT AND (Apache-2.0 OR BSD-2-Clause) license_files = LICENSE.txt LICENSE-PACKAGING.txt -Here we would assume that the ``LICENSE.txt`` file contains the text of the MIT 
+Or, in a PEP 621 ``pyproject.toml``, this would look like:: + + [project.license] + expression = "MIT AND (Apache-2.0 OR BSD-2-Clause)" + files = ["LICENSE.txt", "LICENSE-PACKAGING.txt"] + +Here, we assume that the ``LICENSE.txt`` file contains the text of the MIT license and the copyrights used by ``setuptools``, ``pyparsing``, ``more_itertools`` and ``ordered-set``, and that the ``LICENSE-PACKAGING.txt`` file contains the texts of the Apache and BSD licenses, and the ``packaging`` copyright statements and license choice notice [#packlic]_. +With either approach, the resulting core metadata would be:: + + License-Expression: MIT AND (Apache-2.0 OR BSD-2-Clause) + License-File: LICENSE.txt + License-File: LICENSE-PACKAGING.txt + + +Conversion Example +------------------ + +Suppose we were to return to our simple ``setuptools`` case. +Per the specification, given it only has the following license classifier:: + + Classifier: License :: OSI Approved :: MIT License + +And no value for the ``License`` field; or, equivalently, a value of:: + + License: MIT + +Then the suggested value for a ``License-Expression`` field would be:: + + License-Expression: MIT + +For the more complex case, assuming it was currently expressed as multiple +license classifiers, no automatic conversion could be performed due to the +inherent ambiguity, and the user would be prompted on how to handle the +situation themselves. + + +Expression Examples +------------------- + +Some additional examples of valid ``License-Expression`` values:: + + License-Expression: MIT + + License-Expression: BSD-3-Clause + + License-Expression: MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause) + + License-Expression: GPL-3.0-only WITH Classpath-Exception-2.0 OR BSD-3-Clause -Appendix 2. 
Surveying How we Document Licenses Today in Python -============================================================== + License-Expression: LicenseRef-Public-Domain OR CC0-1.0 OR Unlicense + + License-Expression: LicenseRef-Proprietary + + +Appendix 2. License Documentation in Python +=========================================== There are multiple ways used or recommended to document Python package licenses today. The most common are listed below. -In Core Metadata ----------------- +Core Metadata +------------- There are two overlapping core metadata fields to document a license: the license-related ``Classifier`` strings [#classif]_ prefixed with ``License ::`` @@ -805,8 +947,8 @@ between them. Furthermore, the list of available license-related classifiers is often out-of-date. -In Wheels and Setuptools ------------------------- +Setuptools and Wheels +--------------------- Beyond a license code or qualifier, license text files are documented and included in a built package either implicitly or explicitly and this is another @@ -841,8 +983,8 @@ and formally including their paths in core metadata (which has since been implemented on the basis of a draft of this PEP). -In the PyPA Packaging User Guide and Sample Project ---------------------------------------------------- +PyPA Packaging Guide and Sample Project +--------------------------------------- Both the PyPA beginner packaging tutorial [#packagingtuttxt]_ and its more comprehensive packaging guide [#packagingguidetxt]_ state that it is important @@ -859,8 +1001,8 @@ instead, unless the project uses a non-standard license (which the guide discourages) [#licfield]_. -In Python Source Code Files ---------------------------- +Python Source Code Files +------------------------ **Note:** Documenting licenses in source code is not in the scope of this PEP. @@ -874,8 +1016,8 @@ function and the standard ``pydoc`` module. The dunder variable(s) will show up the ``help()`` DATA section for a module. 
-In Other Python Packaging Tools -------------------------------- +Other Python Packaging Tools +---------------------------- - Conda package manifests [#conda]_ have support for ``license`` and ``license_file`` fields, and automatically include license files @@ -891,14 +1033,14 @@ In Other Python Packaging Tools ``pyproject.toml`` with SPDX license identifiers. -Appendix 3. Surveying How Other Package Formats Document Licenses -================================================================= +Appendix 3. License Documentation in Other Projects +=================================================== Here is a survey of how things are done elsewhere. -Licenses in Linux Distribution Packages ---------------------------------------- +Linux Distribution Packages +--------------------------- **Note:** in most cases the license texts of the most common licenses are included globally once in a shared documentation directory (e.g. ``/usr/share/doc``). @@ -952,8 +1094,8 @@ globally once in a shared documentation directory (e.g. ``/usr/share/doc``). license field. -Licenses in Language and Application Packages ---------------------------------------------- +Language and Application Packages +--------------------------------- - In Java, Maven POM [#maven]_ defines a ``licenses`` XML tag with a list of license items each with a name, URL, comments and "distribution" type. This is not @@ -1027,8 +1169,8 @@ Licenses in Language and Application Packages license expression syntax. -Conventions Used by Other Ecosystems ------------------------------------- +Other Ecosystems +---------------- - ``SPDX-License-Identifier`` [#spdxids]_ is a simple convention to document the license inside a file. @@ -1068,6 +1210,7 @@ References .. [#cd] https://clearlydefined.io .. [#osi] https://opensource.org .. [#classif] https://pypi.org/classifiers +.. [#pypapep621] https://packaging.python.org/specifications/declaring-project-metadata/ .. [#spdxlist] https://spdx.org/licenses/ .. 
[#spdx] https://spdx.dev/ .. [#spdx22] https://spdx.github.io/spdx-spec/SPDX-license-expressions/ From cb5be4ff579c273e1fcf4c4bed5f76b570d3546a Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Fri, 19 Nov 2021 22:31:48 -0600 Subject: [PATCH 06/19] PEP 639: Further specify and clarify license file handling --- pep-0639.rst | 73 ++++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 57 insertions(+), 16 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index 8fa3dbc8edd..1f9ae78baeb 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -255,16 +255,25 @@ Add License-File Field '''''''''''''''''''''' The ``License-File`` field is specified to contain the string representation of -the path relative to ``.dist-info`` of license-related files, -such as license text, author/attribution information or legal notices. -The license file content MUST be UTF-8 encoded text. +the path of license-related files relative to the directory containing the +core metadata: ``.dist-info`` (for wheels), ``.egg-info`` (for sdists), +or the equivalent for other distribution formats. It is an optional, multi-use field which may appear zero or more times, each instance listing a path to one of the license files to be included in -distributions of the package. +distributions of the package. Files specified under this field could include +license text, author/attribution information, or other legal notices that +need be distributed with the package. -Build tools SHOULD honor this field and include the corresponding license -file(s) in the built package. +Path separators, if needed, MUST be the forward slash character (``/``), +and parent directory indicators (``..``) MUST NOT be used. +License file content MUST be UTF-8 encoded text. + +If a ``License-File`` is listed in a source or binary distribution's core +metadata, that file MUST be included at the specified path relative to that +distribution's core metadata directory. 
Build tools MAY and publishing tools +SHOULD produce an informative warning if a built package's metadata contains +no ``License-File`` entries, and publishing tools MAY raise an error. Deprecate License Field @@ -350,9 +359,7 @@ Add Files Key A new ``files`` key is added to the ``license`` table, which has as its value an array of strings that are paths relative to the ``pyproject.toml`` to file(s) containing licenses and other legal notices for the project. -Its values map to the ``License-File`` fields in the core metadata. -Packaging tools SHOULD include all specified files in any distribution -artifacts for the project, and MUST assume all files are encoded as UTF-8. +It corresponds to the ``License-File`` fields in the core metadata. It is mutually exclusive with both the ``file`` and ``text`` keys; tools MUST raise an error if both ``files`` and ``file`` are present, and SHOULD @@ -360,22 +367,56 @@ raise an error if both ``files`` and ``text`` are specified. It is not mutually exclusive with the ``expression`` key; both can and should be included. +Each string MUST be either a verbatim file path, or a valid glob pattern, +parsable by the ``glob`` module [#globmodule]_ in the Python standard library. +Packaging tools MUST include in distribution artifacts all files matched by +at least one listed pattern, and MUST list each under a ``License-File`` +field in the core metadata. Tools MAY exclude files matched by glob patterns +that can be unambiguously determined to be backup, temporary, VCS-ignored, +OS or hidden. + +Path separators, if needed, MUST use the forward slash character (``/``), +and parent directory indicators (``..``) MUST NOT be used. +Tools MUST assume that license file content is UTF-8 encoded text, +and SHOULD raise an error if it is not. + +Tools SHOULD issue a warning and MAY raise an error if the ``files`` key +is present and not an empty array, but no files are matched. 
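The glob and path rules for the ``files`` key in this hunk can be sketched roughly as follows. This is only an illustrative reading of the text, not code from any packaging tool: the names ``DEFAULT_PATTERNS``, ``check_pattern`` and ``match_license_files`` are hypothetical, and the sketch leaves out the optional exclusion of backup, temporary or hidden files.

```python
import glob
import os

# Default patterns quoted in the text for the case where ``files`` is absent.
DEFAULT_PATTERNS = ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"]


def check_pattern(pattern):
    """Reject separators and parent references that the text forbids."""
    if "\\" in pattern:
        raise ValueError(f"path separators must be '/': {pattern!r}")
    if ".." in pattern.split("/"):
        raise ValueError(f"parent directory indicators are not allowed: {pattern!r}")


def match_license_files(project_root, patterns=None):
    """Return sorted, '/'-separated relative paths matched by the patterns.

    ``patterns=None`` models an absent ``files`` key (defaults apply);
    an empty list models ``files = []`` (nothing is matched).
    """
    if patterns is None:
        patterns = DEFAULT_PATTERNS
    matched = set()
    # Escape the root so that only the user-supplied part acts as a glob.
    root = glob.escape(project_root)
    for pattern in patterns:
        check_pattern(pattern)
        for path in glob.glob(os.path.join(root, pattern)):
            if os.path.isfile(path):
                relative = os.path.relpath(path, project_root)
                matched.add(relative.replace(os.sep, "/"))
    return sorted(matched)
```

A tool built along these lines would write one ``License-File`` entry per returned path, and could warn when an explicitly given, non-empty pattern list yields no matches.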
+ +Tools MAY issue a warning if any individual explicitly-specified pattern +does not match at least one file, and MAY raise an error if a pattern with +no special glob characters (i.e., one specifying a single verbatim file) +does not correspond to a valid file on disk. + +If the ``files`` key is present, tools MUST NOT match any additional license +file patterns beyond those explicitly specified by the user, and a value of +an empty array (``[]``) for the ``files`` key implies none will be matched. +If the ``files`` key is not present, tools SHOULD assume a default list of +patterns that is inclusive of at least the following:: + + ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"] + +In this case, tools MAY issue a warning if no license files are matched, +but MUST NOT raise an error. + Deprecate File Key '''''''''''''''''' The (singular) ``file`` key in the ``license`` table is now deprecated. -However, its value now maps to the ``License-File`` field in the core -metadata; packaging tools SHOULD now use it to fill that field, -and SHOULD include the file in any distribution artifacts for the project. - The new ``expression`` and ``files`` keys are mutually exclusive with it; tools MUST raise an error if both ``file`` and ``files`` are present, and SHOULD raise an error if both ``file`` and ``expression`` are specified. If only the ``file`` key is present in the ``license`` table, tools SHOULD issue a warning informing users it is deprecated and recommending the -``files`` key instead. +``files`` key instead. However, if the file is present in the source, +packaging tools SHOULD still use it to fill the ``License-File`` field +in the core metadata, and if so, MUST include the specified file in any +distribution artifacts for the project. If the file does not exist at the +specified path, tools SHOULD issue a warning, and MUST NOT fill it in a +``License-File`` field. 
Finally, tools SHOULD still automatically add the +license files corresponding to the default list of patterns specified above. The ``file`` key may be removed from a new version of the specification in a future PEP. @@ -956,7 +997,7 @@ possible source of confusion: - In ``setuptools`` [#setuptoolssdist]_ and wheels [#wheels]_, license files are automatically added to the distribution (at their source location in - in a source distribution (sdist), and in the ``.dist-info`` directory + in a source distribution/sdist, and in the ``.dist-info`` directory of a built wheel) if they match one of a number of common license file name patterns (``LICEN[CS]E*``, ``COPYING*``, ``NOTICE*`` and ``AUTHORS*``). Alternatively, a package author can specify a list of license file paths to @@ -974,7 +1015,6 @@ possible source of confusion: This allows other tools consuming the resulting metadata to unambiguously locate the license file(s) for a given package. - **Note:** the ``License-File`` field proposed in this PEP already exists in ``wheel`` and ``setuptools`` with the same behaviour as explained above. This PEP is only recognizing and documenting the existing practice as used @@ -1211,6 +1251,7 @@ References .. [#osi] https://opensource.org .. [#classif] https://pypi.org/classifiers .. [#pypapep621] https://packaging.python.org/specifications/declaring-project-metadata/ +.. [#globmodule] https://docs.python.org/3/library/glob.html .. [#spdxlist] https://spdx.org/licenses/ .. [#spdx] https://spdx.dev/ .. [#spdx22] https://spdx.github.io/spdx-spec/SPDX-license-expressions/ From 42e318b48a01b7d9d7f9faa38ef60e664053ac1d Mon Sep 17 00:00:00 2001 From: "C.A.M. 
Gerlach" Date: Sat, 20 Nov 2021 00:15:52 -0600 Subject: [PATCH 07/19] PEP 639: Add additional ambigous classifiers and tool recommendations --- pep-0639.rst | 72 +++++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 57 insertions(+), 15 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index 1f9ae78baeb..cfec6244b49 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -301,8 +301,8 @@ Including license classifiers [#classif]_ (those beginning with ``License ::``) in the ``Classifier`` field (described in PEP 301) is deprecated and replaced by the more precise ``License-Expression`` field. -New license classifiers SHALL NOT be added; users needing them should -use the ``License-Expression`` field instead. +New ``License ::`` classifiers MUST NOT be added to PyPI [#classifersrepo]_; +users needing them SHOULD use the ``License-Expression`` field instead. Along with the ``License`` field, license classifiers may be removed from a new version of the specification in a future PEP. @@ -313,7 +313,7 @@ themselves. Otherwise, if this field contains a license classifier, build tools MAY and publishing tools SHOULD issue a warning informing users such classifiers -are deprecated and recommending ``License-Expression`` instead. +are deprecated, and recommending ``License-Expression`` instead. For compatibility with existing publishing and installation processes, the presence of license classifiers SHOULD NOT raise an error unless ``License-Expression`` is also provided. @@ -473,17 +473,60 @@ select and confirm the desired ``License-Expression`` value before proceeding. Mapping License Classifiers to SPDX Identifiers ''''''''''''''''''''''''''''''''''''''''''''''' -The following defines the mapping between legacy license classifiers -and SPDX license identifiers, in cases other than strict 1:1 -correspondence from a classifier to a defined identifier. - -- Classifier ``License :: Public Domain`` becomes - ``License-Expression: LicenseRef-Public-Domain``. 
Tools SHOULD - encourage the use of more explicit and legally portable license identifiers +Most single license classifiers (namely, all those not mentioned below) +map to a single valid SPDX license identifier, allowing tools to insert them +into the ``License-Expression`` field following the specification above. + +Many legacy license classifiers intend to specify a particular license, +but do not specify the particular version or variant, leading to critical +ambiguity as to their terms, compatibility and acceptability [#issue17]_. +Tools MUST NOT attempt to automatically infer a ``License-Expression`` +when one of these classifiers is used, and SHOULD instead prompt the user +to affirmatively select and confirm their intended license choice. + +These classifiers are the following: + +- ``License :: OSI Approved :: Academic Free License (AFL)`` +- ``License :: OSI Approved :: Apache Software License`` +- ``License :: OSI Approved :: Apple Public Source License`` +- ``License :: OSI Approved :: Artistic License`` +- ``License :: OSI Approved :: BSD License`` +- ``License :: OSI Approved :: GNU Affero General Public License v3`` +- ``License :: OSI Approved :: GNU Free Documentation License (FDL)`` +- ``License :: OSI Approved :: GNU General Public License (GPL)`` +- ``License :: OSI Approved :: GNU General Public License v2 (GPLv2)`` +- ``License :: OSI Approved :: GNU General Public License v3 (GPLv3)`` +- ``License :: OSI Approved :: GNU Lesser General Public License v2 (LGPLv2)`` +- ``License :: OSI Approved :: GNU Lesser General Public License v2 or later (LGPLv2+)`` +- ``License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)`` +- ``License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)`` + +A comprehensive mapping of these classifiers to their possible specific +identifiers was assembled by Dustin Ingram [#badclassifiers]_, which tools +MAY use as a reference for the identifier selection options to offer users +when 
prompting the user to explicitly select the license identifier +they intended for their project. + +**Note**: A couple additional classifiers, namely the "or later" variants of +the AGPLv3, GPLv2, GPLv3 and LGPLv3, are also listed in the aforementioned +mapping, but as they were merely proposed for textual harmonization and +still unambiguously map to their respective licenses, +they were not included here; LGPLv2 is, however, as it could ambiguously +refer to either the distinct v2.0 or v2.1 variants of that license. + +In addition, for the various special cases, the following mappings are +considered canonical and normative for the purposes of this specification: + +- Classifier ``License :: Public Domain`` MAY be mapped to the generic + ``License-Expression: LicenseRef-Public-Domain``. + If tools do so, they SHOULD issue an informational warning encouraging + the use of more explicit and legally portable license identifiers such as ``CC0-1.0`` [#cc0]_ or the ``Unlicense`` [#unlic]_, since the meaning associated with the term "public domain" is thoroughly dependent on the specific legal jurisdiction involved, some of which lack the concept entirely. + Alternatively, tools MAY choose to treat the above as ambiguous and + require user confirmation to fill ``License-Expression`` in these cases. - The generic and sometimes ambiguous classifiers ``License :: Free For Educational Use``, @@ -493,7 +536,7 @@ correspondence from a classifier to a defined identifier. ``License :: Free To Use But Restricted``, ``License :: Freeware``, and ``License :: Other/Proprietary License`` MAY be mapped to the generic - ``License-Expression: ``LicenseRef-Proprietary`` expression, + ``License-Expression: LicenseRef-Proprietary``, but tools MUST issue a prominent, informative warning if they do so. Alternatively, tools MAY choose to treat the above as ambiguous and require user confirmation to fill ``License-Expression`` in these cases.
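As a rough illustration of the classifier handling described in this hunk, a tool might refuse to guess for the listed ambiguous classifiers and warn on the generic mappings. Everything here is a sketch under stated assumptions: the names ``suggest_license_expression`` and ``AmbiguousLicenseError`` are hypothetical, the ``UNAMBIGUOUS`` table contains only the one MIT pair that appears in the PEP's own conversion example, and the proprietary-like set lists only the classifiers visible in this diff.

```python
import warnings

# Classifiers the text lists as not identifying a specific license version.
AMBIGUOUS_CLASSIFIERS = frozenset({
    "License :: OSI Approved :: Academic Free License (AFL)",
    "License :: OSI Approved :: Apache Software License",
    "License :: OSI Approved :: Apple Public Source License",
    "License :: OSI Approved :: Artistic License",
    "License :: OSI Approved :: BSD License",
    "License :: OSI Approved :: GNU Affero General Public License v3",
    "License :: OSI Approved :: GNU Free Documentation License (FDL)",
    "License :: OSI Approved :: GNU General Public License (GPL)",
    "License :: OSI Approved :: GNU General Public License v2 (GPLv2)",
    "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
    "License :: OSI Approved :: GNU Lesser General Public License v2 (LGPLv2)",
    "License :: OSI Approved :: GNU Lesser General Public License v2 or later (LGPLv2+)",
    "License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)",
    "License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)",
})

# Only the generic classifiers visible in this diff; the full list is longer.
PROPRIETARY_LIKE = frozenset({
    "License :: Free For Educational Use",
    "License :: Free To Use But Restricted",
    "License :: Freeware",
    "License :: Other/Proprietary License",
})

# Illustrative only: a real tool would carry the complete 1:1 mapping.
UNAMBIGUOUS = {"License :: OSI Approved :: MIT License": "MIT"}


class AmbiguousLicenseError(ValueError):
    """Raised when the user must choose the expression themselves."""


def suggest_license_expression(classifiers):
    """Suggest a License-Expression value, or None without license classifiers."""
    found = [c for c in classifiers if c.startswith("License ::")]
    if not found:
        return None
    if len(found) > 1:
        # The relation between several license classifiers is ambiguous.
        raise AmbiguousLicenseError("multiple license classifiers present")
    classifier = found[0]
    if classifier in AMBIGUOUS_CLASSIFIERS:
        raise AmbiguousLicenseError(f"version or variant unclear: {classifier}")
    if classifier == "License :: Public Domain":
        warnings.warn("consider an explicit identifier such as CC0-1.0 or Unlicense")
        return "LicenseRef-Public-Domain"
    if classifier in PROPRIETARY_LIKE:
        warnings.warn("mapping a generic classifier to LicenseRef-Proprietary")
        return "LicenseRef-Proprietary"
    if classifier in UNAMBIGUOUS:
        return UNAMBIGUOUS[classifier]
    raise AmbiguousLicenseError(f"not covered by this sketch: {classifier}")
```

Raising instead of returning a best guess mirrors the MUST NOT on automatic inference; an interactive tool would catch the error and prompt the user to pick an identifier.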
@@ -509,10 +552,6 @@ correspondence from a classifier to a defined identifier. Therefore, tools MUST treat them as ambiguous when attempting to fill ``License-Expression``. -The remainder of the classifiers using a ``License ::`` prefix map to a -simple single-identifier license expression using the corresponding -SPDX license identifiers. - When multiple license-related classifiers are used, their relation is ambiguous and it is typically not possible to determine if all the licenses apply or if there is a choice that is possible among the licenses. In this case, tools @@ -1250,6 +1289,9 @@ References .. [#cd] https://clearlydefined.io .. [#osi] https://opensource.org .. [#classif] https://pypi.org/classifiers +.. [#classifersrepo] https://github.com/pypa/trove-classifiers +.. [#issue17] https://github.com/pypa/trove-classifiers/issues/17 +.. [#badclassifiers] https://github.com/pypa/trove-classifiers/issues/17#issuecomment-385027197 .. [#pypapep621] https://packaging.python.org/specifications/declaring-project-metadata/ .. [#globmodule] https://docs.python.org/3/library/glob.html .. [#spdxlist] https://spdx.org/licenses/ From eb2e8740b96e2e47ec53188b2935d084c5370ef0 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Mon, 22 Nov 2021 01:39:08 -0600 Subject: [PATCH 08/19] PEP 639: Rework PEP 621 keys, refine dynamic/globs & add rejected ideas --- pep-0639.rst | 621 +++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 522 insertions(+), 99 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index cfec6244b49..9e58c4c8db4 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -327,115 +327,138 @@ PEP 621 defines how to declare a project's source metadata in a ``[project]`` table in the ``pyproject.toml`` file for packaging tools to consume and output a distribution's core metadata. -This PEP adds the ``expression`` and ``files`` keys to the ``license`` table -and deprecates the ``file`` and ``text`` keys. 
+This PEP adds the ``license-expression`` and ``license-files`` keys and +deprecates the ``license`` key. -Add Expression Key -'''''''''''''''''' +Add license-expression Key +'''''''''''''''''''''''''' -A new ``expression`` key is added to the ``license`` table, which has a string -value that is a valid SPDX license expression, as defined above. Its value -maps to the ``License-Expression`` field in the core metadata. -Tools MAY also back-fill the ``license`` core metadata field with the -verbatim value of the ``expression`` key, but to do so, ``license`` MUST be -listed as ``dynamic``. +A new ``license-expression`` key is added to the ``project`` table, which has +a string value that is a valid SPDX license expression, as defined previously. +Its value maps to the ``License-Expression`` field in the core metadata. -It is mutually exclusive with both the ``file`` and ``text`` keys; tools -MUST raise an error if both ``expression`` and ``text`` are present, and -SHOULD raise an error if both ``expression`` and ``file`` are specified. -It is not mutually exclusive with the ``files`` key; both can and should -be included. +Packaging tools SHOULD validate the expression as described above, outputting +an error or warning as specified. When generating the core metadata, tools +MUST perform case normalization. -Packaging tools SHOULD validate the field as described above, outputting an -error or warning as specified, but MUST NOT perform case normalization unless -the field is listed as ``dynamic``, and if it is not, MUST raise an error if -case-normalization would result in a change to the field contents. +If and only if the ``license-expression`` key is listed as ``dynamic`` +(and is not specified), tools MAY infer a value for this field if they can do +so unambiguously, but MUST follow the provisions in the +`Converting Legacy Metadata`_ section. 
+If the ``license-expression`` key is present and valid (and the ``license`` +key is not specified), for purposes of backward compatibility, tools MAY +back-fill the ``License`` core metadata field with the case-normalized value +of the ``license-expression`` key. -Add Files Key -''''''''''''' -A new ``files`` key is added to the ``license`` table, which has as its -value an array of strings that are paths relative to the ``pyproject.toml`` -to file(s) containing licenses and other legal notices for the project. +Add license-files Key +''''''''''''''''''''' + +A new ``license-files`` key is added to the ``project`` table for specifying +paths in the project source relative to ``pyproject.toml`` to file(s) +containing licenses and other legal notices to be distributed with the package. It corresponds to the ``License-File`` fields in the core metadata. -It is mutually exclusive with both the ``file`` and ``text`` keys; tools -MUST raise an error if both ``files`` and ``file`` are present, and SHOULD -raise an error if both ``files`` and ``text`` are specified. -It is not mutually exclusive with the ``expression`` key; both can and -should be included. - -Each string MUST be either a verbatim file path, or a valid glob pattern, -parsable by the ``glob`` module [#globmodule]_ in the Python standard library. -Packaging tools MUST include in distribution artifacts all files matched by -at least one listed pattern, and MUST list each under a ``License-File`` -field in the core metadata. Tools MAY exclude files matched by glob patterns -that can be unambiguously determined to be backup, temporary, VCS-ignored, -OS or hidden. - -Path separators, if needed, MUST use the forward slash character (``/``), +Its value may either be a table or an array of strings. If a table, it may +contain one of two optional, mutually exclusive keys, ``paths`` and ``globs``; +both arrays of strings. If both are specified, tools MUST raise an error. 
+The ``paths`` subkey contains verbatim file paths, and the ``globs`` subkey +valid glob patterns, parsable by the ``glob`` module [#globmodule]_ in the +Python standard library. + +**Note**: To avoid ambiguity, confusion and (per PEP 20, the Zen of Python) +"more than one (obvious) way to do it", a flat array of strings value for the +``license-files`` key has been left out for now. + +Path separators, if used, MUST be the forward slash character (``/``), and parent directory indicators (``..``) MUST NOT be used. -Tools MUST assume that license file content is UTF-8 encoded text, -and SHOULD raise an error if it is not. +Tools MUST assume that license file content is valid UTF-8 encoded text, +and SHOULD validate this and raise an error if it is not. -Tools SHOULD issue a warning and MAY raise an error if the ``files`` key -is present and not an empty array, but no files are matched. +If the ``paths`` subkey is a non-empty array, packaging tools: -Tools MAY issue a warning if any individual explicitly-specified pattern -does not match at least one file, and MAY raise an error if a pattern with -no special glob characters (i.e., one specifying a single verbatim file) -does not correspond to a valid file on disk. +- MUST treat each value as a verbatim, literal file path, and + MUST NOT treat them as glob patterns. -If the ``files`` key is present, tools MUST NOT match any additional license -file patterns beyond those explicitly specified by the user, and a value of -an empty array (``[]``) for the ``files`` key implies none will be matched. -If the ``files`` key is not present, tools SHOULD assume a default list of -patterns that is inclusive of at least the following:: +- MUST include each listed file in distribution artifacts. - ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"] +- MUST NOT match any additional license files beyond those explicitly + statically specified by the user under the ``paths`` key. 
-In this case, tools MAY issue a warning if no license files are matched, -but MUST NOT raise an error. +- MUST list each file path under a ``License-File`` field in the core metadata. +- MUST raise an error if one or more paths do not correspond to a valid file + in the package source that can be copied into the built distribution. -Deprecate File Key -'''''''''''''''''' +If the ``globs`` subkey is a non-empty array, packaging tools: -The (singular) ``file`` key in the ``license`` table is now deprecated. -The new ``expression`` and ``files`` keys are mutually exclusive with it; -tools MUST raise an error if both ``file`` and ``files`` are present, -and SHOULD raise an error if both ``file`` and ``expression`` are specified. +- MUST treat each value as a glob pattern, and MUST raise an error if the + pattern contains invalid glob syntax. -If only the ``file`` key is present in the ``license`` table, tools SHOULD -issue a warning informing users it is deprecated and recommending the -``files`` key instead. However, if the file is present in the source, -packaging tools SHOULD still use it to fill the ``License-File`` field -in the core metadata, and if so, MUST include the specified file in any -distribution artifacts for the project. If the file does not exist at the -specified path, tools SHOULD issue a warning, and MUST NOT fill it in a -``License-File`` field. Finally, tools SHOULD still automatically add the -license files corresponding to the default list of patterns specified above. +- MUST include all files matched by at least one listed pattern in + distribution artifacts. -The ``file`` key may be removed from a new version of the specification -in a future PEP. +- MAY exclude files matched by glob patterns that can be unambiguously + determined to be backup, temporary, hidden, OS-generated or VCS-ignored. + +- MUST list each matched file path under a ``License-File`` field in the + core metadata. 
+ +- SHOULD issue a warning and MAY raise an error if no files are matched. + +- MAY issue a warning if any individual user-specified pattern + does not match at least one file. +If the ``license-files`` key is present, and the ``paths`` or ``globs`` subkey +is set to a value of an empty array, then tools MUST NOT include any +license files and MUST NOT raise an error. -Deprecate Text Key -'''''''''''''''''' +If the ``license-files`` key is not present and not explicitly marked as +``dynamic``, tools MUST assume a default value of the following:: - The ``text`` key in the ``license`` table is now deprecated. + license-files.globs = ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"] -The new ``expression`` and ``files`` keys are mutually exclusive with it; -tools MUST raise an error if both ``text`` and ``expression`` are present, -and SHOULD raise an error if both ``text`` and ``files`` are specified. +In this case, tools MAY issue a warning if no license files are matched, +but MUST NOT raise an error. + +If the ``license-files`` key is marked as ``dynamic`` (and not present), +to preserve consistent behavior with current tools and help ensure the packages +they create are legally distributable, packaging tools SHOULD default to +including at least the license files matching the above patterns, unless the +user has explicitly specified their own. + + +Deprecate license Key +''''''''''''''''''''' + +The ``license`` key in the ``project`` table is now deprecated. +It MUST NOT be used if either of the new ``license-expression`` or +``license-files`` keys are defined, nor should it be listed as ``dynamic``, +and packaging tools MUST raise an error if either is the case. + +Otherwise, if the ``text`` key is present in the ``license`` table, tools +SHOULD issue a warning informing users it is deprecated and recommending the +``license-expression`` key instead.
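The interaction between the deprecated ``license`` key and the new ``license-expression`` and ``license-files`` keys in this hunk could be checked along the following lines. This is a minimal sketch that assumes the ``[project]`` table has already been parsed into a plain dict; ``check_license_keys`` and ``DEFAULT_LICENSE_FILES`` are hypothetical names, and treating ``license`` in ``dynamic`` as an unconditional error is one possible reading of the text rather than a settled interpretation.

```python
# Default value quoted in the text for an absent ``license-files`` key.
DEFAULT_LICENSE_FILES = {"globs": ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"]}


def check_license_keys(project):
    """Validate license-related keys of a parsed ``[project]`` table.

    Returns ``(license_files, messages)``: the effective ``license-files``
    table and a list of deprecation warnings to show the user.
    """
    dynamic = project.get("dynamic", [])
    new_keys = [key for key in ("license-expression", "license-files") if key in project]
    messages = []

    # Assumption: listing the deprecated key as dynamic is always an error.
    if "license" in dynamic:
        raise ValueError("the deprecated 'license' key must not be listed as dynamic")

    if "license" in project:
        if new_keys:
            raise ValueError(f"'license' cannot be combined with {', '.join(new_keys)}")
        legacy = project["license"]
        if "text" in legacy:
            messages.append("license.text is deprecated; use license-expression")
        if "file" in legacy:
            messages.append("license.file is deprecated; use license-files")

    license_files = project.get("license-files")
    if license_files is None:
        # Absent key: the default is a MUST when not dynamic and a SHOULD
        # when dynamic; this sketch applies it in both cases.
        license_files = dict(DEFAULT_LICENSE_FILES)
    elif "paths" in license_files and "globs" in license_files:
        raise ValueError("'paths' and 'globs' are mutually exclusive")
    return license_files, messages
```

Returning the warnings as data rather than emitting them leaves the choice between logging, printing or failing to the calling tool.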
+ +Likewise, if the ``file`` key is present in the ``license`` table, tools SHOULD +issue a warning informing users it is deprecated and recommending +the ``license-files`` key instead. However, if the file is present in the +source, packaging tools SHOULD still use it to fill the ``License-File`` field +in the core metadata, and if so, MUST include the specified file in any +distribution artifacts for the project. If the file does not exist at the +specified path, tools SHOULD issue a warning, and MUST NOT fill it in a +``License-File`` field. -If only the ``text`` key is present in the ``license`` table, tools SHOULD -issue a warning informing users it is deprecated and recommending the -``expression`` key instead. +For backwards compatibility, to preserve consistent behavior with current tools +and ensure that users do not unknowingly create packages that are not legally +distributable, tools MUST assume the above default value for the +``license-files`` key and also include, in addition to the license file +specified under this ``file`` subkey, any license files that match the +corresponding list of patterns. -The ``text`` key may be removed from a new version of the specification +The ``license`` key may be removed from a new version of the specification in a future PEP. @@ -562,21 +585,28 @@ package author construct a license expression which expresses their intent. Backwards Compatibility ======================= -Adding a new, dedicated ``License-Expression`` field unambiguously signals -support for the new metadata fields and avoids the risk of new tooling -misinterpreting a license expression as a free-form license description, +Adding a new, dedicated ``License-Expression`` core metadata field and +``license-expression`` PEP 621 source metadata key unambiguously signals +support for the specification in this PEP. 
This avoids the risk of new tooling +misinterpreting a license expression as a free-form license description or vice versa, and raises an error if and only if the user affirmatively -upgrades to the latest metadata version by adding said field. +upgrades to the latest metadata version by adding said field/key. -The legacy ``License`` field and ``License ::`` classifiers will be deprecated -but not removed, to retain backward compatibility, while gently preparing users -for their future removal. Eventually, they would be removed, but that would be -following a suitable transition period and left to a future PEP and a new -version of the core metadata specification. +The legacy ``License`` core metadata field and ``license`` PEP 621 source +metadata key will be deprecated along with the ``License ::`` classifiers, +retaining backwards compatibility while gently preparing users for their +future removal. Such a removal would follow a suitable transition period, and +be left to a future PEP and a new version of the core metadata specification. -Formally specifying the ``License-File`` field is only codifying the existing -practice in ``wheel`` and ``setuptools``, and should be fully -backwards-compatible with their existing use of that field. +Formally specifying the new ``License-File`` core metadata field and the +inclusion of the listed files in the distribution merely codifies and +refines the existing practices in popular packaging tools, including +``wheel`` and ``setuptools``, and is designed to be backwards-compatible +with their existing use of that field. Likewise, the new ``license-files`` +PEP 621 source metadata key standardizes statically specifying the files +to include, as well as the default behavior, and allows other tools to +make use of them, while only having an effect once users and tools expressly +adopt it. 
Security Implications @@ -800,6 +830,399 @@ metadata versions, or those who choose not to provide license metadata, no changes are required regardless of the deprecation. +PEP 621 License Key +------------------- + +Alternate possibilities related to the ``license`` key in the +``pyproject.toml`` project source metadata specified in PEP 621. + + +Add Expression and Files Subkeys to Table +''''''''''''''''''''''''''''''''''''''''' + +A previous working draft of this PEP added ``expression`` and ``files`` subkeys +to the existing ``license`` table in the PEP 621 source metadata, to parallel +the existing ``file`` and ``text`` subkeys. While this seemed perhaps the +most obvious approach at first, it had several serious drawbacks relative to +that ultimately taken here. + +Most saliently, this means two very different types of metadata are being +specified under the same top-level key that require very different handling, +and furthermore, unlike the previous arrangement, the keys are not mutually +exclusive and can both be specified at once, and with some subkeys potentially +being dynamic and others static, and mapping to different core metadata fields. +This also breaks from the consensus for the core metadata fields, namely to +separate the license expression into its own explicit field. + +Furthermore, this leads to a conflict with marking the field as ``dynamic`` +(assuming that is intended to specify PEP 621 keys, as that PEP seems to rather +imprecisely imply, rather than core metadata fields), as either both would have +to be treated as ``dynamic``, or neither. A user may want to specify the ``expression`` +key as ``dynamic``, if they intend their tooling to generate it automatically; +conversely, they may rely on their build tool to dynamically detect license +files via means outside of that strictly specified here. And indeed, current +users may mark the present ``license`` key as ``dynamic`` to automatically +fill it in the metadata.
Grouping all these uses under the same key forces an +"all or nothing" approach, and creates ambiguity as to user intent. + +There are further downsides to this as well. Both users and tools would need to +keep track of which fields are mutually exclusive with which of the others, +greatly increasing cognitive and code complexity, and in turn the probability +of errors. Conceptually, juxtaposing so many different fields under the +same key is rather jarring, and leads to a much more complex mapping between +PEP 621 keys and core metadata fields, not in keeping with PEP 621. +This causes the PEP 621 naming and structure to diverge further from +both the core metadata and native formats of the various popular packaging +tools that use it. Finally, this results in the spec being significantly more +complex and convoluted to understand and implement than the alternatives. + +The approach this PEP now takes, adding distinct ``license-expression`` and +``license-file`` keys and simply deprecating the whole ``license`` key, avoids +all the issues identified above, and results in a much clearer and cleaner +design overall. It allows ``license`` and ``license-files`` to be tagged +``dynamic`` independently, separates two independent types of metadata +(syntactically and semantically), restores a closer to 1:1 mapping of +PEP 621 keys to core metadata fields, automatically makes +``License-Expression`` exclusive of the deprecated and conflicting +``file`` and ``text`` subkeys, and reduces nesting by a level for both. +Other than adding two extra keys to the file, there was no real apparent +downside to this latter approach, so it was adopted for this PEP. 
+ + +Define License Expression as String Value +''''''''''''''''''''''''''''''''''''''''' + +A compromise approach between adding two new top-level keys for license +expressions and files would be to add a separate ``license-files`` key, +but re-using the ``license`` key for the license expression, either by +defining it as the (previously reserved) string value for the ``license`` +key, retaining the ``expression`` sub-key in the ``license`` table, or +allowing both. Indeed, this would seem to have been envisioned by PEP 621 +itself with this PEP in mind, in particular the first approach:: + + A practical string value for the license key has been purposefully left out + to allow for a future PEP to specify support for SPDX [6] expressions. + +However, while a working draft temporarily explored this solution, it was +ultimately rejected, as it shared most of the downsides identified with +adding new subkeys under the existing ``license`` table, as well as several +of its own, with again minimal advantage over separating both. + +In particular, it means the top-level ``license`` key still maps to multiple +core metadata fields with different purposes and interpretation (``License`` +and ``License-Expression``), one deprecated and one new, and still prevents +them from being separately marked as dynamic, and conflates the same with +an existing mark. This further exhibits the same divergence from both +PEP 621, core metadata, tool file formats and the consensus in the discussion +in not making the new license expression map to a corresponding new field, +none of which was the case at the time PEP 621 was drafted. +Finally, this would deny a clear separation from the old behavior by not +cleanly deprecating the entire ``license`` field, and increases the complexity +of the specification and implementation. 
+ +In addition to the aforementioned issues, this also requires deciding between +the three individual approaches (``expression`` subkey, top-level string or +allowing both), all of which have further significant downsides and none of +which are clearly superior or more obvious, leading to needless bikeshedding. + +If the license expression was made the string value of the ``license`` key, +as reserved by PEP 621, it would be slightly shorter for users to type and +more obviously the preferred approach. However, it is far *less* obvious that +it is a license expression at all, to authors and those viewing the files, +and this lack of clarity, explicitness, ambiguity and potential for user +confusion is exactly what this PEP seeks to avoid, all to save a few characters +over other approaches. + +If an ``expression`` key was added to the ``license`` table, it would retain +the clarity of a new top-level field, but add additional complexity for no +real benefit, with an extra level of nesting, and users and tools needing to +deal with the mutual exclusivity of the keys, as before. And allowing both +(as a table key *and* the string value) would inherit both's downsides, +while adding even more spec and tool complexity and making there more than +"one obvious way to do it", further potentially confusing users. + +Therefore, a separate top-level ``license-expression`` key was adopted to avoid +all these issues, with relatively minimal downside aside from adding a single +additional top-level key and (versus some approaches) a few extra characters +to type. + + +Add a Type Key to Treat as Expression +''''''''''''''''''''''''''''''''''''' + +Instead of creating a new top-level ``license-expression`` key in the +PEP 621 source metadata, we could add a ``type`` key to the existing +``license`` table to control whether ``text`` (or a string value) +is interpreted as free-text or a license expression. 
This could make +backward compatibility a little more seamless, as older tools could ignore +it and always treat ``text`` as ``license``, while newer tools would +know to treat it as a license expression, if ``type`` was set appropriately. +Indeed, PEP 621 suggests something of this sort as a possible alternative +way that SPDX expressions could be implemented. + +However, all the same downsides as in the previous item apply here, +including greater complexity, a more complex mapping between the project +source metadata and core metadata and inconsistency between the presentation +in tool config, PEP 621 and core metadata, a much less clean deprecation, +further bikeshedding over what to name it, and inability to mark one but +not the other as dynamic, among others. + +In addition, while potentially a little easier in the short +term, in the long term it would mean users would always have to remember +to specify the correct ``type`` to ensure their license expression is +interpreted correctly, which adds work and potential for error; we could +never safely change the default while being confident that users +understand that what they are entering is unambiguously a license expression, +with all the false positive and false negative issues as above. + +Therefore, for these as well as the same reasons this approach was rejected +for the core metadata in favor of a distinct ``License-Expression`` field, +we similarly reject this here. + + +Must be Marked Dynamic to Back-Fill +''''''''''''''''''''''''''''''''''' + +The ``license`` key in the ``pyproject.toml`` could be required to be +explicitly set to dynamic in order for the ``License`` core metadata field +to be automatically back-filled from the value of the ``license-expression`` +key. This would be more explicit that the filling will be done, as strictly +speaking the ``license`` key is not (and cannot be) specified in +``pyproject.toml``.
+ +However, this isn't seen to be necessary, because it is simply using the +static, verbatim literal value of the ``license-expression`` key, as specified +strictly in this PEP. Therefore, any conforming tool can trivially, +deterministically and unambiguously derive this using only the static data +in the ``pyproject.toml`` file itself. + +Furthermore, this actually adds significant ambiguity, as it means the value +could get filled arbitrarily by other tools, which would in turn compromise +and conflict with the value of the new ``License-Expression`` field, which is +why such is explicitly prohibited by this PEP. Therefore, not marking it as +``dynamic`` will ensure it is only handled in accordance with this PEP's +requirements. + +Finally, users explicitly being told to mark it as ``dynamic``, or not, to +control filling behavior is both a mis-use of the ``dynamic`` field as +apparently intended, and prevents tools from adapting to best practices +(fill, don't fill, etc) as they develop and evolve over time. + + +PEP 621 License-Files Key +------------------------- + +Alternatives considered for the ``License-Files`` key in the +``pyproject.toml`` project source metadata, primarily related to the +path/glob type handling. + + +Add a Type Key to Control Path/Glob +''''''''''''''''''''''''''''''''''' + +Instead of defining mutually exclusive ``paths`` and ``globs`` subkeys +of the ``license-files`` PEP 621 project metadata key, we could +achieve the same effect with a ``files`` key for the list and +a ``type`` key for how to interpret it. However, the latter offers no +real advantage over the former, in exchange for requiring more keystrokes, +verbosity and complexity, as well as less flexibility in allowing both, +or another additional key in the future, as well as the need to bikeshed +over the key name. Therefore, it was summarily rejected. 
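As a rough illustration of the mutually exclusive ``paths`` and ``globs`` subkeys discussed here, a build tool might resolve the ``license-files`` table along the following lines. This is a minimal sketch and not part of the PEP: ``resolve_license_files`` and ``_glob_match`` are hypothetical helper names, the project's files are assumed to be supplied as a list of ``/``-separated paths relative to the project root, and glob matching is simplified to per-component ``fnmatch`` patterns (no ``**`` support).

```python
import warnings
from fnmatch import fnmatchcase


def _glob_match(pattern, path):
    """Simplified glob: compare component by component, so '*' never spans '/'."""
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    return len(pattern_parts) == len(path_parts) and all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts)
    )


def resolve_license_files(table, source_files):
    """Return the sorted license paths selected by a ``license-files`` table.

    ``table`` is the parsed table (hypothetical input shape) and
    ``source_files`` lists project files relative to the project root.
    """
    if "paths" in table and "globs" in table:
        raise ValueError("'paths' and 'globs' are mutually exclusive")

    if "paths" in table:
        # Verbatim paths: every listed file must exist, so fail early if not.
        missing = [p for p in table["paths"] if p not in source_files]
        if missing:
            raise FileNotFoundError(f"license files not found: {missing}")
        return sorted(table["paths"])

    matched = set()
    for pattern in table.get("globs", []):
        hits = [f for f in source_files if _glob_match(pattern, f)]
        if not hits:
            # Tools are allowed, not required, to warn about unmatched globs.
            warnings.warn(f"glob {pattern!r} matched no files")
        matched.update(hits)
    return sorted(matched)
```

Keeping the two modes in separate branches mirrors the rationale in the text: the ``paths`` branch can guarantee that every named file is present, while the ``globs`` branch only needs the list of file names and never opens a file.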
+ + +Only Accept Verbatim Paths +'''''''''''''''''''''''''' + +Globs could be disallowed completely as values to the ``license-files`` +key in ``pyproject.toml`` and only verbatim literal paths allowed. +This would ensure that all license files are explicitly specified, all +specified license files are found and included, and the source metadata +is completely static in the strictest sense of the term, without tools +having to inspect the rest of the package files to determine exactly +what license files will be included and what the ``License-Files`` values +will be. This would also modestly simplify the spec and tool implementation. + +However, practicality once again beats purity here. Globs are supported and +used by many existing tools for finding license files, and explicitly +specifying the full path to every license file would be unnecessarily tedious +for more complex projects with vendored code and dependencies. More +critically, it would make it much easier to accidentally miss a required +legal file, silently rendering the package illegal to distribute. + +Tools can still statically and consistently determine the files to be included, +based only on those glob patterns the user explicitly specified and the +filenames in the package, without installing it, executing its code or even +examining its files. Furthermore, tools are still explicitly allowed to warn +if specified glob patterns (including full paths) don't match any files. +And, of course, sdists, wheels and others will have the +full static list of files specified in their core metadata. + +Perhaps most importantly, this would also preclude the currently specified +default value, as widely used by the current most popular tools, and thus +be a major break to backward compatibility, tool consistency, and safe +and sane default functionality to avoid unintentional license violations. 
+And of course, authors are welcome and encouraged to specify their license +files explicitly via the ``files`` table key, once they are aware of it and +if it is suitable for their project and workflow. + + +Only Accept Glob Patterns +''''''''''''''''''''''''' + +Conversely, all ``License-Files`` strings could be treated as glob patterns. +This would slightly simplify the spec and implementation, avoid an extra level +of nesting, and more closely match the configuration format of existing tools. + +However, for the cost of a few characters, it ensures users are aware +whether they are entering globs or verbatim paths. Furthermore, allowing +license files to be specified as literal paths avoids edge cases, such as those +containing glob or other special characters (or those confusingly or even +maliciously similar to them, as described in PEP 672). + +Including an explicit ``paths`` value guarantees that the resulting +``License-File`` metadata is correct, complete and purely static in the +strictest sense of the term, with all license paths explicitly specified +in the ``pyproject.toml`` file, guaranteed to be included and with an early +error should any be missing. + +This allows tools to locate them and know the exact values of the +``License-File`` core metadata fields without having to traverse the +source files of the project and match globs, potentially allowing easier, +more efficient and reliable inspection by tools. + +Therefore, given the relatively small cost and the significant benefits, +this approach was not adopted. + + +Infer Whether Paths or Globs +'''''''''''''''''''''''''''' + +It was considered whether to simply allow specifying an array of strings +directly for the ``license-file`` key, rather than making it a table with +explicit ``paths`` and ``globs``. This would be somewhat simpler and avoid +an extra level of nesting, and more closely match the configuration format +of existing tools. 
However, it was ultimately rejected in favor of separate, +mutually exclusive ``paths`` and ``globs`` table keys. + +In practice, it only saves six extra characters in the ``pyproject.toml`` +(``license-files = [...]`` vs ``license-files.globs = [...]``), but allows +the user to more explicitly declare their intent, ensures they understand how +the field is going to be interpreted, and serves as an unambiguous indicator +for tools to parse them as globs rather than verbatim path literals. + +This, in turn, allows for more appropriate, clearly specified tool +behaviors for each case, many of which would be unreliable or impossible +without it, to avoid common traps, provide more helpful feedback and +behave more sensibly and intuitively overall. These include, with ``files``, +guaranteeing that each and every specified file is included and immediately +raising an error if one is missing, and with ``globs``, checking glob syntax, +excluding unwanted backup, temporary, or other such files (as current tools +already do), and optionally warning if a glob doesn't match any files. +This also avoids edge cases (e.g. paths that contain glob characters) and +reliance on heuristics to determine interpretation—the very thing this PEP +seeks to avoid. + + +Also Allow a Flat Array Value +''''''''''''''''''''''''''''' + +Initially, after deciding to define ``license-files`` as a table of ``paths`` +and ``globs``, thought was given to making a top-level string array under the +``license-files`` key mean one or the other (probably ``globs``, to match most +current tools). This is slightly shorter and simpler, would allow gently +nudging users toward a preferred one, and allow a slightly cleaner handling of +the empty case (which, at present, is treated identically for either). 
+ +However, this again only saves six characters in the best case, and there +isn't an obvious choice; whether from a perspective of preference (both had +clear use cases and benefits), nor as to which one users would naturally +assume. + +Flat may be better than nested, but in the face of ambiguity, users +may not resist the temptation to guess. Requiring users to explicitly specify +one or the other ensures they are aware of how their inputs will be handled, +and is more readable for others, both human and machine alike. It also makes +the spec and tool implementation slightly more complicated, and it can always +be added in the future, but not removed without breaking backward +compatibility. And finally, for the "preferred" option, it means there is +more than one obvious way to do it. + +Therefore, per PEP 20, the Zen of Python, this approach is hereby rejected. + + +Allow Both Paths and Globs Keys +''''''''''''''''''''''''''''''' + +Allowing both ``paths`` and ``globs`` keys to be specified under the +``license-files`` table was considered, as it could potentially allow +more flexible handling for particularly complex projects, and specify on a +per-pattern rather than overall basis whether ``license-files`` entries +should be treated as ``paths`` or ``globs``. + +However, given the existing proposed approach already matches or exceeds the +power and capabilities of those offered in tools' config files, there isn't +clear demand for this and few likely cases that would benefit, it adds a large +amount of complexity for relatively minimal gain, in terms of the +specification, in tool implementations and in ``pyproject.toml`` itself. 
+ +There would be many more edge cases to deal with, such as how to handle files +matched by both lists, and it conflicts in multiple places with the current +specification for how tools should behave with one or the other, such as when +no files match, guarantees of all files being included and of the file paths +being explicitly, statically specified, and others. + +As with the previous idea, if there is a clear need for it, it can always be allowed +in the future in a backward-compatible manner (to the extent it is possible +at all), while the same is not true of disallowing it. Therefore, it was +decided to require the two keys to be mutually exclusive. + + +Rename Paths Subkey to Files +'''''''''''''''''''''''''''' + +Initially, it was considered whether to name the ``paths`` subkey of the +``license-files`` table ``files`` instead. However, ``paths`` was ultimately +chosen, as calling the table key ``files`` resulted in duplication between +the table name (``license-files``) and the subkey name (``files``), i.e. +``license-files.files = ["LICENSE.txt"]``, made it seem like the preferred/ +default subkey when it was not, and lacked the same parallelism with ``globs`` +in describing the format of the string entry rather than what was being +pointed to. + + +Must be Marked Dynamic to Use Defaults +'''''''''''''''''''''''''''''''''''''' + +It may seem outwardly sensible, at least with a particularly restrictive +interpretation of PEP 621's description of the ``dynamic`` field, to +consider requiring the ``license-files`` key to be explicitly marked as +``dynamic`` in order for the default glob patterns to be used, or alternatively +for license files to be matched and included at all.
+ +However, this is merely declaring a static, strictly-specified default value +for this particular key, required to be used exactly by all conforming tools +(so long as it is not marked ``dynamic``, negating this argument entirely), +and is no less static than any other set of glob patterns. Furthermore, the +resulting ``License-File`` core metadata values can still be determined with +only a list of files in the source, without installing or executing any of the +code, or even inspecting file contents. + +Moreover, even if this were not so, practicality would trump purity, as this +interpretation would be strictly backwards-incompatible with the existing +format, as it would trigger inconsistent behavior with the existing tools. +Further, this would create a very serious and likely risk of a large number of +projects unknowingly no longer including legally mandatory license files, +making their distribution illegal, and is thus not a sane, much less sensible +default. + +Finally, aside from adding an additional line of virtually-required boilerplate +to the file, not defining the default as dynamic allows authors to clearly +and unambiguously indicate when their build/packaging tools are going to be +handling the inclusion of license files themselves rather than strictly +conforming to the PEP 621 portions of this PEP; to do otherwise would defeat +the primary purpose of the ``dynamic`` field as a marker and escape hatch.
+ + Other Ideas ----------- @@ -887,8 +1310,8 @@ The simplest migration to this PEP would consist of using this instead:: Or, in a PEP 621 ``pyproject.toml``:: - [project.license] - expression = "MIT" + [project] + license-expression = "MIT" The output core metadata for the package would then be:: @@ -926,9 +1349,9 @@ all the license expressions in one expression, our ``setup.cfg`` would be:: Or, in a PEP 621 ``pyproject.toml``, this would look like:: - [project.license] - expression = "MIT AND (Apache-2.0 OR BSD-2-Clause)" - files = ["LICENSE.txt", "LICENSE-PACKAGING.txt"] + [project] + license-expression = "MIT AND (Apache-2.0 OR BSD-2-Clause)" + license-files.paths = ["LICENSE.txt", "LICENSE-PACKAGING.txt"] Here, we assume that the ``LICENSE.txt`` file contains the text of the MIT license and the copyrights used by ``setuptools``, ``pyparsing``, From 820c99f3f92d405985b613af697f5fa93ff77514 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Tue, 23 Nov 2021 20:12:52 -0600 Subject: [PATCH 09/19] PEP 639: Specify license path root & subdirs, & expand examples --- pep-0639.rst | 237 +++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 200 insertions(+), 37 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index 9e58c4c8db4..d192ad50667 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -254,26 +254,55 @@ a valid license expression, build and publishing tools: Add License-File Field '''''''''''''''''''''' -The ``License-File`` field is specified to contain the string representation of -the path of license-related files relative to the directory containing the -core metadata: ``.dist-info`` (for wheels), ``.egg-info`` (for sdists), -or the equivalent for other distribution formats. +The ``License-File`` optional field is specified to contain the +string representation of the path to a license-related file, relative to +the primary canonical directory containing the core metadata. 
+Files specified under this field could include license text, author/attribution +information, or other legal notices that need be distributed with the package. -It is an optional, multi-use field which may appear zero or more times, -each instance listing a path to one of the license files to be included in -distributions of the package. Files specified under this field could include -license text, author/attribution information, or other legal notices that -need be distributed with the package. +If a ``License-File`` is listed in a source or binary distribution's core +metadata, that file MUST be included in the distribution at the specified path +relative to that distribution's core metadata directory, and MUST be installed +with the distribution. It is a multi-use field that may appear zero or more +times, each instance listing the path to one such file. Path separators, if needed, MUST be the forward slash character (``/``), and parent directory indicators (``..``) MUST NOT be used. License file content MUST be UTF-8 encoded text. -If a ``License-File`` is listed in a source or binary distribution's core -metadata, that file MUST be included at the specified path relative to that -distribution's core metadata directory. Build tools MAY and publishing tools -SHOULD produce an informative warning if a built package's metadata contains -no ``License-File`` entries, and publishing tools MAY raise an error. +This relative path MUST be consistent between project source trees, +source distributions (sdists), binary distributions (wheels) and installed +projects. Therefore, inside the metadata parent directory of their +distribution formats, packaging tools MUST reproduce the directory structure +under which the source license files are located relative to the project root. + +This means: + +- In project source trees, license paths MUST be relative to the project root + directory; i.e. 
the directory containing the ``pyproject.toml`` (which in + turn contains the PEP 621 source metadata in the ``[project]`` table, if + present), or equivalent legacy project configuration, + e.g. ``setup.py``, ``setup.cfg``, ``flit.ini``, etc. + +- In source distributions (sdists) [#sdistspec]_, license paths MUST be + relative to the root directory of the sdist, which contains the + ``pyproject.toml`` and the ``PKG-INFO`` core metadata. + +- In binary distributions (wheels) [#wheelspec]_, license paths MUST be + relative to the ``.dist-info`` directory of the wheel, which contains the + core metadata in the ``METADATA`` file. + +- In installed projects [#installedspec]_, license paths MUST be + relative to the ``.dist-info`` directory of the installed project, + which contains the core metadata in the ``METADATA`` file. + +In other distribution formats, license files MUST be included and installed +in some form, and SHOULD be relative to the directory (or equivalent) +storing the core metadata. + +Build tools MAY and publishing tools SHOULD produce an informative warning +if a built package's metadata contains no ``License-File`` entries. +In that case, publishing tools MAY raise an error, but build tools MUST NOT. Deprecate License Field @@ -1223,6 +1252,64 @@ conforming to the PEP 621 portions of this PEP; to do otherwise would defeat the primary purpose of the ``dynamic`` field as a marker and escape hatch. +License File Paths +------------------ + +Alternatives related to the paths and locations of license files in the source +and built distributions. + + +Flatten License Files in Subdirectories +''''''''''''''''''''''''''''''''''''''' + +Previous drafts of this PEP were silent on the issue of handling license files +in subdirectories. Currently, Wheel [#wheelfiles]_ and (following its example) +Setuptools [#setuptoolsfiles]_ flatten all license files into the +``.dist-info`` directory, without preserving the source +subdirectory hierarchy.
+ +While this is the simplest approach and matches existing ad hoc practice, +this can result in name conflicts and license files clobbering others, +with no obvious defined behavior for how to resolve them, and leaving the +package legally un-distributable without any clear indication to users that +their specified license files have not been included. + +Furthermore, this leads to inconsistent relative file paths for non-root +license files between the source, sdist and wheel, and prevents the paths +given in the PEP 621 "static" metadata from being truly static, as they need +to be flattened, and may potentially overwrite one another. Finally, +the source directory structure often implies valuable information about +what the licenses apply to, and where to find them in the source, +which is lost when flattening them and far from trivial to reconstruct. + +To resolve this, the PEP now proposes, as did contributors on both of the +above issues, reproducing the source directory structure of the original +license files inside the ``.dist-info`` directory. This would fully resolve the +concerns above, with the only downside being a more nested, cluttered +``.dist-info`` directory. There is still a risk of filename collision with +edge-case custom filenames (e.g. ``RECORD``, ``METADATA``), but that is the +case currently, and in fact with fewer files flattened into the root, +this would actually reduce the risk, while a related proposal to root the +license files under a ``license`` subdirectory or similar would eliminate both +it and the clutter problem entirely. 
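The clobbering problem described in this subsection can be made concrete with a short sketch. It is illustrative only: ``destinations`` and ``clobbered`` are hypothetical helper names, and the input paths are borrowed from the examples appendix of this PEP rather than from any tool's actual behavior.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def destinations(license_paths, flatten):
    """Group source license paths by their destination in the metadata directory."""
    grouped = defaultdict(list)
    for source in license_paths:
        # Flattening keeps only the file name; preserving keeps the full path.
        dest = PurePosixPath(source).name if flatten else source
        grouped[dest].append(source)
    return dict(grouped)


def clobbered(license_paths, flatten):
    """Return destinations that more than one source file would be written to."""
    return sorted(
        dest
        for dest, sources in destinations(license_paths, flatten).items()
        if len(sources) > 1
    )


paths = [
    "LICENSE",
    "setuptools/_vendor/packaging/LICENSE",
    "setuptools/_vendor/packaging/LICENSE.APACHE",
]
print(clobbered(paths, flatten=True))
print(clobbered(paths, flatten=False))
```

With flattening, the root ``LICENSE`` and the vendored ``LICENSE`` compete for the same destination; with the source structure preserved, each unique source path maps to its own unique destination.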
+ + +Resolve Name Conflicts Differently +'''''''''''''''''''''''''''''''''' + +Rather than preserving the source directory structure for license files +inside the ``.dist-info`` directory, we could specify some other mechanism +for conflict resolution, such as pre- or appending the parent directory name +to the license filename, traversing up the tree until the name was unique, +to avoid excessively nested directories. + +However, this would not address the path consistency issues, would require +much more discussion, coordination and bikeshedding, and further complicate +the specification and the implementations. Therefore, it was rejected in +favor of the simpler and more obvious solution of just preserving the +source subdirectory layout, as many stakeholders have already advocated for. + + Other Ideas ----------- @@ -1290,8 +1377,8 @@ specified here, balancing flexibility and compatibility. Appendix 1. License Expression Examples ======================================= -Simple Example --------------- +Basic Example +------------- Setuptools itself, as of version 59.1.1 [#setuptools5911]_, does not use the ``License`` field. Further, ``license_file``/``license_files`` is no longer @@ -1301,11 +1388,13 @@ including the ``LICENSE`` file it uses. It only includes the following license-related metadata in its ``setup.cfg``:: + [metadata] classifiers = License :: OSI Approved :: MIT License The simplest migration to this PEP would consist of using this instead:: + [metadata] license_expression = MIT Or, in a PEP 621 ``pyproject.toml``:: @@ -1318,9 +1407,14 @@ The output core metadata for the package would then be:: License-Expression: MIT License-File: LICENSE +The ``LICENSE`` file would be stored at ``/LICENSE`` in the sdist and +``/setuptools-{version}.dist-info/LICENSE`` in the wheel, where ``/`` is the +respective sdist and wheel root and ``{version}`` is the setuptools +package version in the core metadata. 
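The sdist and wheel locations just described follow mechanically from the ``License-File`` value. The sketch below assumes the layout in this revision of the PEP, with license files placed directly under the wheel's ``.dist-info`` directory; ``license_locations`` is a hypothetical helper name, not an API of any packaging tool.

```python
from pathlib import PurePosixPath


def license_locations(license_file, name, version):
    """Map a ``License-File`` value to its location in the sdist and the wheel.

    Paths are shown relative to the respective archive root, written as "/".
    """
    path = PurePosixPath(license_file)
    # The field holds a relative path and must not use parent directory indicators.
    if path.is_absolute() or ".." in path.parts:
        raise ValueError(f"invalid License-File path: {license_file!r}")
    return {
        "sdist": f"/{license_file}",
        "wheel": f"/{name}-{version}.dist-info/{license_file}",
    }


print(license_locations("LICENSE", "setuptools", "59.1.1"))
```

Because the same relative path is reused in both formats, a consumer holding only the core metadata can locate each license file without searching the archive.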
-Complex Example ---------------- + +Advanced Example +---------------- Suppose Setuptools were to include the licenses of the third-party packages that are vendored in the ``setuptools/_vendor/`` and ``pkg_resources/_vendor`` @@ -1338,32 +1432,92 @@ The license expressions for these packages are:: ordered-set: MIT more_itertools: MIT -Therefore, a comprehensive license expression covering both ``setuptools`` -proper and its vendored dependencies could contain these metadata, combining -all the license expressions in one expression, our ``setup.cfg`` would be:: +A comprehensive license expression covering both ``setuptools`` +proper and its vendored dependencies would contain these metadata, +combining all the license expressions into one. Such an expression might be:: + MIT AND (Apache-2.0 OR BSD-2-Clause) + +In addition, per the requirements of the licenses, the relevant license files +must be included in the package. Suppose the ``LICENSE`` file contains the text +of the MIT license and the copyrights used by ``setuptools``, ``pyparsing``, +``more_itertools`` and ``ordered-set``; and the ``LICENSE`` files in the +``setuptools/_vendor/packaging/`` directory contain the Apache 2.0 and +2-clause BSD license text, and the Packaging copyright statement and +license choice notice [#packlic]_. 
+ +Therefore, we assume the license files are located at the following +paths in the project source tree (relative to the project root and +``pyproject.toml``):: + + LICENSE + setuptools/_vendor/packaging/LICENSE + setuptools/_vendor/packaging/LICENSE.APACHE + setuptools/_vendor/packaging/LICENSE.BSD + +Putting it all together, our ``setup.cfg`` would be:: + + [metadata] license_expression = MIT AND (Apache-2.0 OR BSD-2-Clause) license_files = - LICENSE.txt - LICENSE-PACKAGING.txt + LICENSE + setuptools/_vendor/packaging/LICENSE + setuptools/_vendor/packaging/LICENSE.APACHE + setuptools/_vendor/packaging/LICENSE.BSD -Or, in a PEP 621 ``pyproject.toml``, this would look like:: +In a PEP 621 ``pyproject.toml``, with license files specified explicitly +via the ``paths`` key, this would look like:: [project] license-expression = "MIT AND (Apache-2.0 OR BSD-2-Clause)" - license-files.paths = ["LICENSE.txt", "LICENSE-PACKAGING.txt"] + license-files.paths = [ + "LICENSE", + "setuptools/_vendor/packaging/LICENSE", + "setuptools/_vendor/packaging/LICENSE.APACHE", + "setuptools/_vendor/packaging/LICENSE.BSD", + ] + +Or alternatively, matched via glob patterns, this would be:: -Here, we assume that the ``LICENSE.txt`` file contains the text of the MIT -license and the copyrights used by ``setuptools``, ``pyparsing``, -``more_itertools`` and ``ordered-set``, and that the ``LICENSE-PACKAGING.txt`` -file contains the texts of the Apache and BSD licenses, and the ``packaging`` -copyright statements and license choice notice [#packlic]_.
+ [project] + license-expression = "MIT AND (Apache-2.0 OR BSD-2-Clause)" + license-files.globs = [ + "LICENSE*", + "setuptools/_vendor/packaging/LICENSE*", + ] With either approach, the resulting core metadata would be:: License-Expression: MIT AND (Apache-2.0 OR BSD-2-Clause) - License-File: LICENSE.txt - License-File: LICENSE-PACKAGING.txt + License-File: LICENSE + License-File: setuptools/_vendor/packaging/LICENSE + License-File: setuptools/_vendor/packaging/LICENSE.APACHE + License-File: setuptools/_vendor/packaging/LICENSE.BSD + +In the resulting sdist, with ``/`` as the root of the sdist, the license files +would be located at the paths:: + + /LICENSE + /setuptools/_vendor/packaging/LICENSE + /setuptools/_vendor/packaging/LICENSE.APACHE + /setuptools/_vendor/packaging/LICENSE.BSD + +If Setuptools were to decide to also locate them under its ``.egg-info`` +directory, they would additionally be found at the paths:: + + /setuptools.egg-info/LICENSE + /setuptools.egg-info/setuptools/_vendor/packaging/LICENSE + /setuptools.egg-info/setuptools/_vendor/packaging/LICENSE.APACHE + /setuptools.egg-info/setuptools/_vendor/packaging/LICENSE.BSD + +Finally, in the built wheel with ``/`` as the wheel root and ``{version}`` +as the canonical Setuptools package version in the core metadata, +the license files would be stored at:: + + /setuptools-{version}.dist-info/LICENSE + /setuptools-{version}.dist-info/setuptools/_vendor/packaging/LICENSE + /setuptools-{version}.dist-info/setuptools/_vendor/packaging/LICENSE.APACHE + /setuptools-{version}.dist-info/setuptools/_vendor/packaging/LICENSE.BSD Conversion Example @@ -1457,7 +1611,7 @@ Beyond a license code or qualifier, license text files are documented and included in a built package either implicitly or explicitly and this is another possible source of confusion: -- In ``setuptools`` [#setuptoolssdist]_ and wheels [#wheels]_, license files +- In Setuptools [#setuptoolssdist]_ and wheels [#wheels]_, license files are automatically
added to the distribution (at their source location in in a source distribution/sdist, and in the ``.dist-info`` directory of a built wheel) if they match one of a number of common license file @@ -1465,7 +1619,10 @@ possible source of confusion: Alternatively, a package author can specify a list of license file paths to include in the built wheel under the ``license_files`` key in the ``[metadata]`` section of the project's ``setup.cfg``, or as an argument - to the ``setuptools`` ``setup()`` function. + to the ``setuptools`` ``setup()`` function. At present, following wheel's + lead, Setuptools flattens the collected license files into the metadata + directory, clobbering files with the same name, but there is a desire to + resolve this, contingent on this PEP being accepted [#setuptoolsfiles]_. - Both tools also support an older, singular ``license_file`` parameter that allows specifying only one license file to add to the distribution, which @@ -1473,9 +1630,9 @@ possible source of confusion: See [#pipsetup]_ for instance. - Following the publication of an earlier draft of this PEP, ``setuptools`` - added support for ``License-File`` in package metadata as described here. - This allows other tools consuming the resulting metadata to unambiguously - locate the license file(s) for a given package. + added support for ``License-File`` in package metadata as described herein + [#setuptoolspep639]_. This allows other tools consuming the resulting + metadata to unambiguously locate the license file(s) for a given package. **Note:** the ``License-File`` field proposed in this PEP already exists in ``wheel`` and ``setuptools`` with the same behaviour as explained above. @@ -1708,6 +1865,9 @@ References ========== .. [#cms] https://packaging.python.org/specifications/core-metadata +.. [#sdistspec] https://packaging.python.org/specifications/source-distribution-format/ +.. [#wheelspec] https://packaging.python.org/specifications/binary-distribution-format/ +.. 
[#installedspec] https://packaging.python.org/specifications/recording-installed-packages/ .. [#cdstats] https://clearlydefined.io/stats .. [#cd] https://clearlydefined.io .. [#osi] https://opensource.org @@ -1716,6 +1876,9 @@ References .. [#issue17] https://github.com/pypa/trove-classifiers/issues/17 .. [#badclassifiers] https://github.com/pypa/trove-classifiers/issues/17#issuecomment-385027197 .. [#pypapep621] https://packaging.python.org/specifications/declaring-project-metadata/ +.. [#setuptoolspep639] https://github.com/pypa/setuptools/pull/2645 +.. [#wheelfiles] https://github.com/pypa/wheel/issues/138 +.. [#setuptoolsfiles] https://github.com/pypa/setuptools/issues/2739 .. [#globmodule] https://docs.python.org/3/library/glob.html .. [#spdxlist] https://spdx.org/licenses/ .. [#spdx] https://spdx.dev/ From abf5bee87609bab0a6c8739ecdd37c6c7ccc9fb9 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Thu, 25 Nov 2021 03:54:10 -0600 Subject: [PATCH 10/19] PEP 639: Add license_files dir, rejected ideas & sdist/wheel/installed --- pep-0639.rst | 281 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 204 insertions(+), 77 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index d192ad50667..1d196282722 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -254,55 +254,38 @@ a valid license expression, build and publishing tools: Add License-File Field '''''''''''''''''''''' -The ``License-File`` optional field is specified to contain the -string representation of the path to a license-related file, relative to -the primary canonical directory containing the core metadata. -Files specified under this field could include license text, author/attribution -information, or other legal notices that need be distributed with the package. +The ``License-File`` optional field is specified to contain the string +representation of the path to a license-related file, relative to the +root license directory. 
Files specified under this field +could include license text, author/attribution information, or other +legal notices that need to be distributed with the package. +It is a multi-use field that may appear zero or more times, +each instance listing the path to one such file. If a ``License-File`` is listed in a source or binary distribution's core metadata, that file MUST be included in the distribution at the specified path -relative to that distribution's core metadata directory, and MUST be installed -with the distribution. It is a multi-use field that may appear zero or more -times, each instance listing the path to one such file. +relative to the root license directory, and MUST be installed with the +distribution at that same path. -Path separators, if needed, MUST be the forward slash character (``/``), -and parent directory indicators (``..``) MUST NOT be used. -License file content MUST be UTF-8 encoded text. +The root license directory is defined to be the project root directory +for source trees and source distributions, and the ``license_files`` +subdirectory of the directory containing the core metadata (i.e. the +``.dist-info`` directory containing the ``METADATA`` file), for built +distributions and installed projects. -This relative path MUST be consistent between project source trees, +The specified relative path MUST be consistent between project source trees, source distributions (sdists), binary distributions (wheels) and installed -projects. Therefore, inside the metadata parent directory of their -distribution formats, packaging tools MUST reproduce the directory structure -under which the source license files are located relative to the project root. - -This means: - -- In project source trees, license paths MUST be relative to the project root - directory; i.e. 
the directory containing the ``pyproject.toml`` (which in - turn contains the PEP 621 source metadata in the ``[project]`` table, if - present), or equivalently other legacy project configuration, - e.g. ``setup.py``, ``setup.cfg``, ``flit.ini``, etc. - -- In source distributions (sdists) [#sdistspec]_, license paths MUST be - relative to the root directory of the sdist, which contains the - ``pyproject.toml`` and the ``PKG-INFO`` core metadata. +projects. Therefore, inside the root license directory, packaging tools +MUST reproduce the directory structure under which the +source license files are located relative to the project root. -- In binary distributions (wheels) [#wheelspec]_, license paths MUST be - relative to the ``.dist-info`` directory of the wheel, which contains the - core metadata in the ``METADATA`` file. - -- In installed projects [#installedspec]_, license paths MUST be - relative to the ``.dist-info`` directory of the installed project, - which contains the core metadata in the ``METADATA`` file. - -In other distribution formats, license files MUST be included and installed -in some form, and SHOULD be relative to the directory (or equivalent) -storing the core metadata. +Path separators MUST be the forward slash character (``/``), +and parent directory indicators (``..``) MUST NOT be used. +License file content MUST be UTF-8 encoded text. Build tools MAY and publishing tools SHOULD produce an informative warning if a built package's metadata contains no ``License-File`` entries, -and publishing tools MAY and build tools MUST NOT raise an error. +and publishing tools MAY but build tools MUST NOT raise an error. 
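As an illustration only (not part of this patch or of any existing tool), the path rules stated above for ``License-File`` values could be checked along the following lines. The function name ``check_license_file_path`` and its messages are hypothetical, and treating any backslash as an invalid separator is an assumption made for this sketch:

```python
from pathlib import PurePosixPath


def check_license_file_path(path: str) -> list[str]:
    """Return a list of problems found in a License-File value (empty if none).

    Hypothetical helper: it only covers the path rules quoted above
    (relative path, forward slashes, no parent directory indicators).
    """
    problems = []
    # Assumption: a backslash is treated as a misused path separator.
    if "\\" in path:
        problems.append("path separators must be forward slashes")
    parsed = PurePosixPath(path)
    # The value must be relative to the root license directory.
    if parsed.is_absolute():
        problems.append("path must be relative to the root license directory")
    # PurePosixPath does not collapse "..", so it remains visible in .parts.
    if ".." in parsed.parts:
        problems.append("parent directory indicators ('..') must not be used")
    return problems
```

A value such as ``setuptools/_vendor/packaging/LICENSE`` passes all three checks, while ``../LICENSE`` or ``/LICENSE`` each produce one problem message.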
Deprecate License Field @@ -351,7 +334,7 @@ the presence of license classifiers SHOULD NOT raise an error unless PEP 621 Source Metadata ----------------------- -As currently specified in the canonical PyPA specification [#pypapep621]_, +As currently specified in the canonical PyPA specification [#projectspec]_, PEP 621 defines how to declare a project's source metadata in a ``[project]`` table in the ``pyproject.toml`` file for packaging tools to consume and output a distribution's core metadata. @@ -491,6 +474,47 @@ The ``license`` key may be removed from a new version of the specification in a future PEP. +License Files In Project Formats +-------------------------------- + +A few minor additions will be made to the relevant existing specifications +to document, standardize and clarify what is already currently supported, +allowed and implemented behavior, as well as explicitly mention the directory +location the license file tree is rooted in for each format, per the +specification above. + +Project source trees + As described above, the project source metadata specification [#projectspec]_ + will be updated to reflect that license file paths MUST be relative to the + project root directory; i.e. the directory containing the ``pyproject.toml`` + (or equivalently, other legacy project configuration, + e.g. ``setup.py``, ``setup.cfg``, etc). + +Source distributions (sdists) + The sdist specification [#sdistspec]_ will be updated to reflect that for + metadata version 2.3, the sdist MUST contain any license files specified + by ``License-File`` in the ``PKG-INFO`` at their respective paths relative + to the top-level directory of the sdist + (containing the ``pyproject.toml`` and the ``PKG-INFO`` core metadata). 
+ +Built distributions (wheels) + The wheel specification [#wheelspec]_ will be updated to reflect that if + the ``METADATA`` version is 2.3 or greater and one or more ``License-File`` + fields is specified, the ``.dist-info`` directory MUST contain a + ``license_files`` subdirectory which MUST contain the files listed in the + ``License-File`` fields in the ``METADATA`` file at their respective paths + relative to the ``license_files`` directory. + +Installed projects + The recording installed projects specification [#installedspec]_ will be + updated to reflect that if the ``METADATA`` version is 2.3 or greater + and one or more ``License-File`` fields is specified, the ``.dist-info`` + directory MUST contain a ``license_files`` subdirectory which MUST contain + the files listed in the ``License-File`` fields in the ``METADATA`` file + at their respective paths relative to the ``license_files`` directory, + and that any files in this directory MUST be copied from installed wheels. + + Converting Legacy Metadata -------------------------- @@ -1285,13 +1309,13 @@ which is lost when flattening them and far from trivial to reconstruct. To resolve this, the PEP now proposes, as did contributors on both of the above issues, reproducing the source directory structure of the original license files inside the ``.dist-info`` directory. This would fully resolve the -concerns above, with the only downside being a more nested, cluttered -``.dist-info`` directory. There is still a risk of filename collision with -edge-case custom filenames (e.g. ``RECORD``, ``METADATA``), but that is the -case currently, and in fact with fewer files flattened into the root, -this would actually reduce the risk, while a related proposal to root the -license files under a ``license`` subdirectory or similar would eliminate both -it and the clutter problem entirely. +concerns above, with the only downside being a more nested ``.dist-info`` +directory. 
There is still a risk of filename collision with +edge-case custom filenames (e.g. ``RECORD``, ``METADATA``), but that is also +the case with the previous approach, and in fact with fewer files flattened +into the root, this would actually reduce the risk. Furthermore, +a followup proposal rooting the license files under a ``license_files`` +subdirectory eliminates both collisions and the clutter problem entirely. Resolve Name Conflicts Differently @@ -1310,6 +1334,107 @@ favor of the simpler and more obvious solution of just preserving the source subdirectory layout, as many stakeholders have already advocated for. +Dump Directly in Dist-Info +'''''''''''''''''''''''''' + +Previously, the included license files were stored directly in the top-level +``.dist-info`` directory of built wheels and installed projects. This followed +existing ad hoc practice, ensured most existing wheels currently using this +feature will match new ones (i.e. those projects built with Wheel versions +that include license files but don't specify license files in subdirectories), +and kept the specification simpler, with the license files always being +stored in the same location relative to the core metadata regardless of +distribution type. + +However, this leads to a more cluttered ``.dist-info`` directory, littered +with arbitrary license files and subdirectories, as opposed to separating +licenses into their own namespace (which per the Zen of Python, PEP 20, are +"one honking great idea"). While currently small, there is still a +risk of collision with specific custom license filenames +(e.g. ``RECORD``, ``METADATA``) in the ``.dist-info`` directory, which +would only increase if and when additional files were specified here, and +would require carefully limiting the potential filenames used to avoid +likely conflicts with those of license-related files. 
Finally, +putting licenses into their own specified subdirectory would allow +humans and tools to quickly, easily and correctly list, copy and manipulate +all of them at once (such as in distro packaging, legal checks, etc) +without having to reference each of their paths from the core metadata. + +Therefore, now is a prudent time to specify an alternate approach. +The simplest and most obvious solution, as suggested by several on the Wheel +and Setuptools implementation issues, is to simply root the license files +relative to a ``license_files`` subdirectory of ``.dist-info``. This is simple +to implement and solves all the problems noted above, without clear significant +drawbacks relative to other more complex options. + +It does make the specification a bit more complex and less elegant, but +implementation should remain equally simple. It does mean that wheels +produced following this change will have differently-located licenses +than those prior, but as this was already true for those in subdirectories, +and until this PEP there was no way of discovering these files or +accessing them programmatically, this doesn't seem likely to pose +significant problems in practice. Given this will be much harder if not +impossible to change later, once the status quo is standardized, tools are +relying on the current behavior and there is much greater uptake of not +only simply including license files but potentially accessing them as well +using the core metadata, if we're going to change it, now would be the time +(particularly since we're already introducing an edge-case change with how +license files in subdirs are handled, as well as other things). + +Therefore, the latter has been incorporated into current drafts of this PEP. 
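For illustration, the layout this change settles on can be sketched as below. The helper ``license_file_locations`` is hypothetical, and the sketch assumes the project name is already in the normalized form used for archive and ``.dist-info`` directory names:

```python
from pathlib import PurePosixPath


def license_file_locations(name: str, version: str, license_file: str) -> dict[str, str]:
    """Map one License-File value to its location in each archive.

    Hypothetical helper: paths are given relative to the archive root.
    """
    # In an sdist, the root license directory is the top-level project directory.
    sdist_root = PurePosixPath(f"{name}-{version}")
    # In a wheel, it is the license_files subdirectory of .dist-info.
    wheel_root = PurePosixPath(f"{name}-{version}.dist-info") / "license_files"
    return {
        "sdist": str(sdist_root / license_file),
        "wheel": str(wheel_root / license_file),
    }
```

The source subdirectory structure is preserved under both roots, which is what avoids the name clashes that flattening caused.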
+ + +Add New Licenses Category to Wheel +'''''''''''''''''''''''''''''''''' + +Instead of defining a root license directory (``license_files``) inside +the core metadata directory (``.dist-info``) for wheels, we could +instead define a new category (and, presumably, a corresponding install scheme), +similar to the others currently included under ``.data`` in the wheel archive, +specifically for license files, called (e.g.) ``licenses``. This was mentioned +by the wheel creator, and would allow installing licenses somewhere more +platform-appropriate and flexible than just the ``.dist-info`` directory +in the site path, and potentially be conceptually cleaner than including +them there. + +However, at present, this PEP does not implement this idea, and it is +deferred to a future one. It would add significant complexity and friction +to this PEP, being primarily concerned with standardizing existing practice +and updating the core metadata specification. Furthermore, doing so would +likely require modifying ``sysconfig`` and the install schemes specified +therein, alongside Wheel, Installer and other tools, which would be a +non-trivial undertaking. While potentially slightly more complex for +repackagers (such as those for Linux distributions) the current proposal +ensuring all license files are included, and in a single dedicated directory +(which can easily be copied or relocated downstream), should still greatly +improve the status quo in this regard without the attendant complexity. + +In addition, this approach is not fully backwards compatible (since it +isn't transparent to tools that simply extract the wheel), is a greater +departure from existing practice and would lead to more inconsistent +license install locations from wheels of different versions. 
Finally, +this would mean licenses were not installed as proximately to their +associated code, there would be more variability in the license root path +across platforms and between built and installed packages, accessing +installed licenses programmatically would be more difficult, and a +suitable install location and method would need to be created, discussed +and decided that would avoid name clashes. + +Therefore, to keep this PEP in scope, the current approach was retained. + + +Name the Subdirectory Licenses +'''''''''''''''''''''''''''''' + +Both ``licenses`` and ``license_files`` have been suggested as potential +names for the root license directory inside ``.dist-info`` of wheels and +installed projects. The former is slightly shorter, but the latter is +more clear and unambiguous regarding its contents, and is consistent with +the name of the core metadata field (``License-File``) and the PEP 621 +project source metadata key (``license-files``). Therefore, the latter +was chosen instead. + + Other Ideas ----------- @@ -1380,11 +1505,12 @@ Appendix 1. License Expression Examples Basic Example ------------- -Setuptools itself, as of version 59.1.1 [#setuptools5911]_, does not use the -``License`` field. Further, ``license_file``/``license_files`` is no longer -explicitly specified, as it was previously, since ``setuptools`` relies on -its automatic inclusion of license-related files matching common patterns, -including the ``LICENSE`` file it uses. +The Setuptools project itself, as of version 59.1.1 [#setuptools5911]_, +does not use the ``License`` field in its own project metadata. +Further, it no longer explicitly specifies ``license_file``/``license_files`` +as it did previously, since ``setuptools`` relies on its own automatic +inclusion of license-related files matching common patterns, +such as the ``LICENSE`` file it uses. 
It only includes the following license-related metadata in its ``setup.cfg``:: @@ -1407,10 +1533,11 @@ The output core metadata for the package would then be:: License-Expression: MIT License-File: LICENSE -The ``LICENSE`` file would be stored at ``/LICENSE`` in the sdist and -``/setuptools-{version}.dist-info/LICENSE`` in the wheel, where ``/`` is the -respective sdist and wheel root and ``{version}`` is the setuptools -package version in the core metadata. +The ``LICENSE`` file would be stored at ``/setuptools-{version}/LICENSE`` +in the sdist and ``/setuptools-{version}.dist-info/license_files/LICENSE`` +in the wheel, and unpacked from there into the site directory (e.g. +``site-packages``) on installation; ``/`` is the root of the respective archive +and ``{version}`` the version of the Setuptools project in the core metadata. Advanced Example @@ -1477,7 +1604,7 @@ via the ``paths`` key, this would look like:: "setuptools/_vendor/LICENSE.BSD", ] -Or alternatively, matched via glob patterns, this would be:: +Or alternatively, matched via glob patterns, this could be:: [project] license-expression = "MIT AND (Apache-2.0 OR BSD-2-Clause)" @@ -1494,30 +1621,30 @@ With either approach, the resulting core metadata would be:: License-File: setuptools/_vendor/packaging/LICENSE.APACHE License-File: setuptools/_vendor/packaging/LICENSE.BSD -In the resulting sdist, with ``/`` as the root of the sdist, the license files -would be located at the paths:: +In the resulting sdist, with ``/`` as the root of the archive and ``{version}`` +the version of the Setuptools project specified in the core metadata, +the license files would be located at the paths:: - /LICENSE - /setuptools/_vendor/packaging/LICENSE - /setuptools/_vendor/packaging/LICENSE.APACHE - /setuptools/_vendor/packaging/LICENSE.BSD + /setuptools-{version}/LICENSE + /setuptools-{version}/setuptools/_vendor/packaging/LICENSE + /setuptools-{version}/setuptools/_vendor/packaging/LICENSE.APACHE + 
/setuptools-{version}/setuptools/_vendor/packaging/LICENSE.BSD -If Setuptools were to decide to also locate them under its ``.egg-info`` -directory, they would additionally be found at the paths:: +In the built wheel, with ``/`` being the root of the archive and +``{version}`` as above, the license files would be stored at:: - /setuptools.egg-info/LICENSE - /setuptools.egg-info/setuptools/_vendor/packaging/LICENSE - /setuptools.egg-info/setuptools/_vendor/packaging/LICENSE.APACHE - /setuptools.egg-info/setuptools/_vendor/packaging/LICENSE.BSD + /setuptools-{version}.dist-info/license_files/LICENSE + /setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE + /setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE.APACHE + /setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE.BSD -Finally, in the built wheel with ``/`` as the wheel root and ``{version}`` -as the canonical Setuptools package version in the core metadata, -the license files would be stored at:: +Finally, in the installed project, with ``site-packages`` being the site dir +and ``{version}`` as above, the license files would be installed to:: - /setuptools-{version}.dist-info/LICENSE - /setuptools-{version}.dist-info/setuptools/_vendor/packaging/LICENSE - /setuptools-{version}.dist-info/setuptools/_vendor/packaging/LICENSE.APACHE - /setuptools-{version}.dist-info/setuptools/_vendor/packaging/LICENSE.BSD + site-packages/setuptools-{version}.dist-info/license_files/LICENSE + site-packages/setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE + site-packages/setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE.APACHE + site-packages/setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE.BSD Conversion Example @@ -1865,6 +1992,7 @@ References ========== .. [#cms] https://packaging.python.org/specifications/core-metadata +.. 
[#projectspec] https://packaging.python.org/specifications/declaring-project-metadata/ .. [#sdistspec] https://packaging.python.org/specifications/source-distribution-format/ .. [#wheelspec] https://packaging.python.org/specifications/binary-distribution-format/ .. [#installedspec] https://packaging.python.org/specifications/recording-installed-packages/ @@ -1875,7 +2003,6 @@ References .. [#classifersrepo] https://github.com/pypa/trove-classifiers .. [#issue17] https://github.com/pypa/trove-classifiers/issues/17 .. [#badclassifiers] https://github.com/pypa/trove-classifiers/issues/17#issuecomment-385027197 -.. [#pypapep621] https://packaging.python.org/specifications/declaring-project-metadata/ .. [#setuptoolspep639] https://github.com/pypa/setuptools/pull/2645 .. [#wheelfiles] https://github.com/pypa/wheel/issues/138 .. [#setuptoolsfiles] https://github.com/pypa/setuptools/issues/2739 From 7c1765c0214bcc9d0ce1324d3ecbbe72a092fc84 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Thu, 25 Nov 2021 18:24:11 -0600 Subject: [PATCH 11/19] PEP 639: Add PyPI validation of new fields for newly-uploaded files --- pep-0639.rst | 77 +++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 67 insertions(+), 10 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index 1d196282722..f7d84590b26 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -85,7 +85,8 @@ such licenses. This PEP makes no recommendation for specific licenses and does not require the use of specific license documentation conventions. This PEP also does not impose -any restrictions when uploading to PyPI. +any additional restrictions when uploading to PyPI, unless projects choose to +make use of the new fields. 
Instead, it is intended to document best practices already in use, extend them to use a new formally-specified and supported mechanism, and provide guidance @@ -250,6 +251,13 @@ a valid license expression, build and publishing tools: the normalization process results in changes to the ``License-Expression`` field contents. +For all newly-uploaded distribution packages that include a +``License-Expression`` field, the Python Package Index (PyPI) [#pypi]_ MUST +validate that it contains a valid, case-normalized license expression with +valid identifiers (as defined above) and MUST reject uploads that do not +validate. PyPI MAY reject an upload for using a deprecated license identifier, +so long as it was deprecated as of the above SPDX License List version. + Add License-File Field '''''''''''''''''''''' @@ -287,6 +295,11 @@ Build tools MAY and publishing tools SHOULD produce an informative warning if a built package's metadata contains no ``License-File`` entries, and publishing tools MAY but build tools MUST NOT raise an error. +For all newly-uploaded distribution packages that include one or more +``License-File`` fields and declare a ``Metadata-Version`` of ``2.3`` or +higher, PyPI SHOULD validate that the specified files are present in all +distribution packages, and MUST reject uploads that do not validate. + Deprecate License Field ''''''''''''''''''''''' @@ -302,6 +315,11 @@ If only the ``License`` field is present, such tools SHOULD issue a warning informing users it is deprecated and recommending ``License-Expression`` instead. +For all newly-uploaded distribution packages that include a +``License-Expression`` field, the Python Package Index (PyPI) [#pypi]_ MUST +reject any that specify a ``License`` field whose text is not +identical to that of ``License-Expression``, as defined above. + Along with license classifiers, the ``License`` field may be removed from a new version of the specification in a future PEP. 
@@ -313,15 +331,10 @@ Including license classifiers [#classif]_ (those beginning with ``License ::``) in the ``Classifier`` field (described in PEP 301) is deprecated and replaced by the more precise ``License-Expression`` field. -New ``License ::`` classifiers MUST NOT be added to PyPI [#classifersrepo]_; -users needing them SHOULD use the ``License-Expression`` field instead. -Along with the ``License`` field, license classifiers may be removed from a -new version of the specification in a future PEP. - -If the ``License-Expression`` field is present, build tools MAY and publishing -tools SHOULD raise an error if one or more license classifiers (as defined -above) is included in a ``Classifier`` field, and not add such classifiers -themselves. +If the ``License-Expression`` field is present, build tools SHOULD and +publishing tools MUST raise an error if one or more license classifiers +(as defined above) is included in a ``Classifier`` field, and MUST NOT add +such classifiers themselves. Otherwise, if this field contains a license classifier, build tools MAY and publishing tools SHOULD issue a warning informing users such classifiers @@ -330,6 +343,15 @@ For compatibility with existing publishing and installation processes, the presence of license classifiers SHOULD NOT raise an error unless ``License-Expression`` is also provided. +For all newly-uploaded distribution packages that include a +``License-Expression`` field, the Python Package Index (PyPI) [#pypi]_ MUST +reject any that also specify any license classifiers. + +New ``License ::`` classifiers MUST NOT be added to PyPI [#classifersrepo]_; +users needing them SHOULD use the ``License-Expression`` field instead. +Along with the ``License`` field, license classifiers may be removed from a +new version of the specification in a future PEP. 
+ PEP 621 Source Metadata ----------------------- @@ -883,6 +905,40 @@ metadata versions, or those who choose not to provide license metadata, no changes are required regardless of the deprecation. +Don't Mandate Validating New Fields on PyPI +''''''''''''''''''''''''''''''''''''''''''' + +Previously, while this PEP did include normative guidelines for packaging +publishing tools (such as Twine), it did not provide specific guidance +for PyPI (or other package indices) as to whether and how they +should validate the ``License-Expression`` or ``License-File`` fields, +nor how they should handle using them in combination with the deprecated +``License`` field or license classifiers. This simplifies the specification +and either defers implementation on PyPI to a later PEP, or gives +discretion to PyPI to enforce the stated invariants, to minimize +disruption to package authors. + +However, this had been left unstated from before the ``License-Expression`` +field was separate from the existing ``License``, which would make +validation much more challenging and backwards-incompatible, breaking +existing packages. With that change, there was a clear consensus that +the new field should be validated from the start, guaranteeing that all +packages uploaded to PyPI that declare adherence to core metadata version 2.3 +or higher and have the ``License-Expression`` field will have a valid +expression that PyPI and consumers of its packages and metadata can rely upon +to follow the specification here. + +The same can be extended to the ``License-File`` field, as also specified +here, to ensure that it is valid and the legally required license files are +present, and thus it is lawful for PyPI, users and downstream consumers +to distribute the package (of course, this makes no *guarantee* of such +as it is ultimately reliant on authors to declare such, but it improves +assurance of this and allows doing so in the future if the community so +decides). 
To be clear, this would not require that any uploaded package +have such metadata, only that if they choose to declare it per the new +specification in this PEP, it is assured to be valid. + + PEP 621 License Key ------------------- @@ -1999,6 +2055,7 @@ References .. [#cdstats] https://clearlydefined.io/stats .. [#cd] https://clearlydefined.io .. [#osi] https://opensource.org +.. [#pypi] https://pypi.org/ .. [#classif] https://pypi.org/classifiers .. [#classifersrepo] https://github.com/pypa/trove-classifiers .. [#issue17] https://github.com/pypa/trove-classifiers/issues/17 From dd5f73610427bf5a00b852ac8ef0e0d0350bed82 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Thu, 25 Nov 2021 19:51:13 -0600 Subject: [PATCH 12/19] PEP 639: Add open isssue for back-filling & reject per-dist licenses --- pep-0639.rst | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 57 insertions(+), 1 deletion(-) diff --git a/pep-0639.rst b/pep-0639.rst index f7d84590b26..71021954903 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -93,7 +93,9 @@ to use a new formally-specified and supported mechanism, and provide guidance for packaging tools on how to hand the transition and inform users accordingly. This PEP is not about license documentation in files inside packages, -though this is a surveyed topic in the appendix. +though this is a surveyed topic in the appendix, and nor does it currently +intend to cover the rare cases where the source and binary distribution +packages have different licenses. Possible Future PEPs @@ -1555,6 +1557,57 @@ licenses, but also remain backwards compatible with the version specified here, balancing flexibility and compatibility. 
+Different Licenses for Source and Binary Distributions +'''''''''''''''''''''''''''''''''''''''''''''''''''''' + +As an additional use case, it was asked whether it was in scope for this +PEP to handle cases where the license expression for a binary distribution +(wheel) is different from that for a source distribution (sdist), such +as in cases of non-pure-Python packages that compile and bundle binaries +under different licenses than the package itself. An example cited was +PyTorch [#pytorch]_, which contains CUDA from Nvidia, which is freely +distributable but not open source. NumPy [#numpyissue]_ and SciPy +[#scipyissue]_ also had similar issues, as reported by the original author +of this PEP and now resolved for those cases. + +However, given the inherent complexity here and a lack of an obvious +mechanism to do so, the fact that each wheel would need its own license +information, lack of support on PyPI for exposing license info on a +per-distribution basis, and the relatively niche use case, it was +determined to be out of scope for this PEP, and left to a future PEP +to resolve if sufficient need and interest exists, and an appropriate +mechanism can be found. + + +Open Issues +=========== + +Should the License Field be Back-Filled, or Mutually Exclusive? +--------------------------------------------------------------- + +At present, this PEP explicitly allows, but does not formally recommend or +require, tools to back-fill the ``License`` core metadata field with +the verbatim text from the ``License-Expression`` field. This would +presumably improve backwards compatibility and was suggested +by some on the Discourse thread. On the other hand, allowing it does +increase complexity and is less of a clean, consistent separation, +preventing the ``License`` field from being completely mutually exclusive +with the new ``License-Expression`` field and requiring that their values +match. 
+ +As such, it would be very useful to have a more concrete and specific +rationale and use cases for the back-filled data, and give fuller +consideration to any potential benefits or drawbacks of this approach, +in order to come to a final consensus on this matter that can be appropriately +justified here. + +Therefore, is the status quo expressed here acceptable, allowing tools +leeway to decide this for themselves? Should this PEP formally recommend, +or even require, that tools back-fill this metadata (which would presumably +be reversed once a breaking revision of the metadata spec is issued)? +Or should this not be explicitly allowed, discouraged or even prohibited? + + Appendix 1. License Expression Examples ======================================= @@ -2074,6 +2127,9 @@ References .. [#reusediscussion] https://github.com/pombredanne/spdx-pypi-pep/issues/7 .. [#choosealicense] https://choosealicense.com/ .. [#spdxversion] https://github.com/pombredanne/spdx-pypi-pep/issues/6 +.. [#pytorch] https://pypi.org/project/torch/ +.. [#numpyissue] https://github.com/numpy/numpy/issues/8689 +.. [#scipyissue] https://github.com/scipy/scipy/issues/7093 .. [#scancodetk] https://github.com/nexB/scancode-toolkit .. [#licfield] https://packaging.python.org/guides/distributing-packages-using-setuptools/#license .. [#samplesetuppy] https://github.com/pypa/sampleproject/blob/3a836905fbd687af334db16b16c37cf51dcbc99c/setup.py#L98 From 3a884da5b91501705fe407b645e0d9a355b3a45c Mon Sep 17 00:00:00 2001 From: "C.A.M. 
Gerlach" Date: Fri, 26 Nov 2021 22:48:44 -0600 Subject: [PATCH 13/19] PEP 639: Add User Scenarios to provide guidance for common use cases --- pep-0639.rst | 146 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 144 insertions(+), 2 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index 71021954903..ec91a880402 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -659,6 +659,143 @@ MUST NOT automatically infer a license expression and SHOULD suggest that the package author construct a license expression which expresses their intent. +User Scenarios +============== + +The following covers the range of common use cases from a user perspective, +providing straightforward guidance for each. Do note that the following +should **not** be considered legal advice, and you should consult a licensed +attorney if you are unsure about the specifics for your situation. + + +I have a private package that won't be distributed +-------------------------------------------------- + +If your package isn't shared publicly, i.e. outside your company, +organization or household, it *usually* isn't necessary to include a formal +license, so you wouldn't have to do anything extra here. + +To be more explicit, it is still a good idea to include +``LicenseRef-Proprietary`` as a license expression in your package +configuration, and/or a copyright statement and any legal notices in a +``LICENSE.txt`` file in the root of your project directory, which will be +automatically included by packaging tools. + + +I just want to share my own work without legal restrictions +----------------------------------------------------------- + +While you aren't required to include a license, if you don't, no one has +*any* permission to download, use or improve your work [#dontchoosealicense]_, +so that's probably the *opposite* of what you actually want. 
+The MIT license [#mitlicense]_ is a great choice for this, as it's simple, +widely used and allows anyone to do whatever they want with your work +(other than sue you, which you probably also don't want). + +To apply it, just paste the text [#chooseamitlicense]_ into a file named +``LICENSE.txt`` at the root of your repo, and add the year and your name to +the copyright line. Then, just add ``license-expression = "MIT"`` under +``[project]`` in your ``pyproject.toml`` if your packaging tool supports it, +or in its config file/section (e.g. Setuptools ``license_expression = MIT`` +under ``[metadata]`` in ``setup.cfg``). You're done! + + +I want to distribute my project under a specific license +-------------------------------------------------------- + +To use a particular license, simply paste its text into a ``LICENSE.txt`` +file at the root of your repo (if you don't have it in a file starting with +``LICENSE`` or ``COPYING`` already), and add +``license-expression = "LICENSE-ID"`` under ``[project]`` in your +``pyproject.toml`` if your packaging tool supports it, or in its config +file (e.g. for Setuptools, ``license_expression = LICENSE-ID`` +under ``[metadata]`` in ``setup.cfg``). You can find the ``LICENSE-ID`` +and copyable license text on sites like ChooseALicense [#choosealicenselist]_ +or SPDX [#spdxlist]_. + +Many popular code hosts, project templates and packaging tools can add the +license file for you, and may support the expression as well in the future. + + +I maintain an existing package that's already licensed +------------------------------------------------------ + +If you already have license files and metadata in your project, you +should only need to make a couple of tweaks to take advantage of the new +functionality.
+ +In your project config file, enter your license expression under +``license-expression`` (PEP 621 ``pyproject.toml``), ``license_expression`` +(Setuptools ``setup.cfg``/``setup.py``), or the equivalent for your +packaging tool, and make sure to remove any legacy ``license`` value or +``License ::`` classifiers. Your existing ``license`` value may already +be valid as one (e.g. ``MIT``, ``Apache-2.0 OR BSD-2-Clause``, etc); +otherwise, check the SPDX license list [#spdxlist]_ for the identifier +that matches the license used in your project. + +If your license files begin with ``LICENSE``, ``COPYING``, ``NOTICE`` or +``AUTHORS``, or you've already configured your packaging tool to add them +(e.g. ``license_files`` in ``setup.cfg``), you should already be good to go. +If not, make sure to list them under ``license-files.paths`` +or ``license-files.globs`` under ``[project]`` in ``pyproject.toml`` +(if your tool supports it), or in your tool's configuration file +(e.g. ``license_files`` in ``setup.cfg`` for Setuptools). + +See the `Basic Example`_ for a simple but complete real-world demo of how +this works in practice, including some additional technical details. +Packaging tools may support automatically converting legacy licensing +metadata; check your tool's documentation for details. + + +My package includes other code under different licenses +------------------------------------------------------- + +If your project includes code from others covered by different licenses, +such as vendored dependencies or files copied from other open source +software, you can construct a license expression (or have a tool +help you do so) to describe the licenses involved and the relationship +between them. 
+ +In short, ``License-1 AND License-2`` means that *both* licenses apply +to your project, or parts of it (for example, you included a file +under another license), and ``License-1 OR License-2`` means that +*either* of the licenses can be used, at the user's option (for example, +you want to allow users a choice of multiple licenses). You can use +parentheses (``()``) for grouping to form expressions to cover even the most +complex situations. + +In your project config file, enter your license expression under +``license-expression`` (PEP 621 ``pyproject.toml``), ``license_expression`` +(Setuptools ``setup.cfg``/``setup.py``), or the equivalent for your +packaging tool, and make sure to remove any legacy ``license`` value or +``License ::`` classifiers. + +Also, make sure you add the full license text of all the licenses as files +somewhere in your project repository. If all of them are in the root directory +and begin with ``LICENSE``, ``COPYING``, ``NOTICE`` or ``AUTHORS``, +they will be included automatically. Otherwise, you'll need to list the +relative path or glob patterns to each of them under ``license-files.paths`` +or ``license-files.globs`` under ``[project]`` in ``pyproject.toml`` +(if your tool supports it), or in your tool's configuration file +(e.g. ``license_files`` in ``setup.cfg`` for Setuptools). + +As an example, if your project was licensed MIT but incorporated +a vendored dependency (say, ``packaging``) that was licensed under +either Apache 2.0 or the 2-clause BSD, your license expression would +be ``MIT AND (Apache-2.0 OR BSD-2-Clause)``. You might have a +``LICENSE.txt`` in your repo root, and a ``LICENSE-APACHE.txt`` and +``LICENSE-BSD.txt`` in the ``_vendor/packaging`` subdirectory, so to include +all of them, you'd specify ``["LICENSE.txt", "_vendor/packaging/LICENSE*"]`` +as glob patterns, or +``["LICENSE.txt", "_vendor/packaging/LICENSE-APACHE.txt", "_vendor/packaging/LICENSE-BSD.txt"]`` +as literal file paths.
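As a rough illustration of how glob patterns like those in the example above might select license files, here is a minimal sketch. It assumes the vendored license files live under ``_vendor/packaging/``; the file list and the ``select_license_files`` helper are hypothetical and are not part of any packaging tool. Note that ``fnmatch`` lets ``*`` match path separators, which is looser than what a real build backend would likely do.

```python
from fnmatch import fnmatchcase

# Hypothetical project layout, given as POSIX-style paths relative to the repo root.
PROJECT_FILES = [
    "LICENSE.txt",
    "README.rst",
    "src/mypkg/__init__.py",
    "_vendor/packaging/LICENSE-APACHE.txt",
    "_vendor/packaging/LICENSE-BSD.txt",
    "_vendor/packaging/version.py",
]


def select_license_files(files, globs):
    """Return the files matching any of the glob patterns, sorted and de-duplicated."""
    matched = {
        path
        for path in files
        for pattern in globs
        if fnmatchcase(path, pattern)
    }
    return sorted(matched)


# Glob patterns taken from the example in the text above.
license_files = select_license_files(
    PROJECT_FILES, ["LICENSE.txt", "_vendor/packaging/LICENSE*"]
)
print(license_files)
```

With this layout, the two patterns pick up the root license plus both vendored license texts, while leaving ordinary source files such as ``version.py`` alone.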
+ +See a fully worked out `Advanced Example`_ for a comprehensive end-to-end +application of this to a real-world complex project, with copious technical +details, and consult a tutorial [#spdxtutorial]_ for more help and examples +on using SPDX identifiers and expressions. + + Backwards Compatibility ======================= @@ -715,8 +852,8 @@ Finally, for users who may have forgot or not be aware they need to do so, publishing tools will gently guide them toward including ``License-Expression`` and ``License-Files`` with their uploaded packages. -Tools may also help with the conversion and suggest a license expression in some -cases: +Tools may also help with the conversion and suggest a license expression in +many, if not most common cases: - The section `Mapping License Classifiers to SPDX Identifiers`_ provides tool authors with guidelines on how to suggest a license expression produced @@ -2126,6 +2263,11 @@ References .. [#spdxpy] https://github.com/spdx/tools-python/ .. [#reusediscussion] https://github.com/pombredanne/spdx-pypi-pep/issues/7 .. [#choosealicense] https://choosealicense.com/ +.. [#choosealicenselist] https://choosealicense.com/licenses/ +.. [#dontchoosealicense] https://choosealicense.com/no-permission/ +.. [#chooseamitlicense] https://choosealicense.com/licenses/mit/ +.. [#mitlicense] https://opensource.org/licenses/MIT +.. [#spdxtutorial] https://github.com/david-a-wheeler/spdx-tutorial .. [#spdxversion] https://github.com/pombredanne/spdx-pypi-pep/issues/6 .. [#pytorch] https://pypi.org/project/torch/ .. [#numpyissue] https://github.com/numpy/numpy/issues/8689 From 43962c051dcb840961bf6126ecb2d359858f609a Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Fri, 26 Nov 2021 23:55:25 -0600 Subject: [PATCH 14/19] PEP 639: Update Abstract, Goals, Rationale, Compat. 
& other sections --- pep-0639.rst | 123 ++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 102 insertions(+), 21 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index ec91a880402..f3a93cec478 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -28,18 +28,23 @@ This will make license declarations simpler and less ambiguous for: - package users to read and understand, and, - tools to process package license information mechanically. -The PEP also proposes to: +The PEP also: -- Deprecate the legacy ``License`` field and ``license ::`` classifiers. +- Formally specifies a new ``License-File`` field and how license files + should be included in distributions, as already used by Wheel and Setuptools. -- Formally specify a new ``License-File`` field, which is already used by - ``wheel`` and ``setuptools`` to include license files in distributions. +- Deprecates the legacy ``License`` field and ``license ::`` classifiers. -- Define how tools can validate license expressions and handle errors and - deprecated fields/classifiers to balance adoption of this PEP with - backwards-compatibility and a smooth transition for package authors. +- Provides clear guidance for authors and tools converting legacy license + metadata, adding license files and validating license expressions. -The changes in this PEP will update the core metadata format to version 2.3. +- Adds and deprecates corresponding fields in the ``pyproject.toml`` + project source metadata format. + +The changes in this PEP will update the core metadata to version 2.3, +modify the PEP 621 project metadata format, and make minor additions to +the source distribution (sdist), binary distribution (wheel) and installed +project standards. Goals @@ -54,22 +59,30 @@ distribution, specifically covering: The changes to the core metadata specification that this PEP requires have been designed to minimize impact and maximize backward compatibility. 
This specification builds off of existing ways to document licenses that are -in use in some tools (e.g. by adding the ``License-File`` field already used in -``wheel`` and ``setuptools``) and by some package authors (e.g. storing an SPDX -license expression in the existing ``License`` field). +already in use in popular tools (e.g. adding support to core metadata for +the ``License-File`` field already used in Wheel and Setuptools) +and by some package authors (e.g. storing an SPDX license expression +in the existing ``License`` field). In addition to these proposed changes, this PEP contains: - Recommendations for publishing tools on how to validate the new - ``License-Expression`` field and report informational warnings when a package - uses legacy metadata (the ``License`` field and ``License ::`` classifers). + ``License-Expression`` field, add license files to packages, and and + warn on and convert legacy metadata + (the ``License`` field and ``License ::`` classifiers). + +- Simplified guidance for package authors on how to handle license files + and expressions for various common situations. + +- A detailed summary of related proposed ideas and alternate approaches, and + why they were or were not incorporated into the current version of this PEP - Informational appendices that contain surveys of how we document licenses today in Python packages and elsewhere, and a reference Python library to parse, validate and build correct license expressions. -It is the intent of the PEP authors to work closely with tool authors to -implement the recommendations for validation and warnings specified in this PEP. +It is the intent of the PEP authors to work closely with tool maintainers to +implement the recommendations for validation and warnings specified here. Non-Goals @@ -94,8 +107,8 @@ for packaging tools on how to hand the transition and inform users accordingly. 
This PEP is not about license documentation in files inside packages, though this is a surveyed topic in the appendix, and nor does it currently -intend to cover the rare cases where the source and binary distribution -packages have different licenses. +intend to cover cases where the source and binary distribution packages +have different licenses. Possible Future PEPs @@ -167,6 +180,29 @@ There are a few takeaways from the survey: These considerations have guided the design and recommendations of this PEP. +The current license classifiers cover some common cases, and could +theoretically be extended to include the full range of current SPDX +identifiers while deprecating the many ambiguous classifiers (including some +extremely popular and particularly problematic ones, such as +``License :: OSI Approved :: BSD License``). However, this both requires a +substantial amount of effort to duplicate the SPDX license list and keep +it in sync, and is effectively a hard break in backward compatibility, +forcing a huge proportion of package authors to immediately update to new +classifiers (in most cases, with many possible choices that require closely +examining the project's license) immediately when PyPI deprecates the old ones. + +Furthermore, this only covers simple packages entirely under a single license; +it doesn't address the substantial fraction of common packages that vendor +dependencies (e.g. Setuptools), offer a choice of licenses (e.g. Packaging) +or were relicensed, adapt code from other projects or contain fonts, images, +examples, binaries or other assets under other licenses. It also requires +both authors and tools understand and implement the PyPI-specific bespoke +classifier system, rather than using short, easy to add and standardized +SPDX identifiers in a simple text field, as increasingly widely adopted by +most other packaging systems, reducing the overall burden on the ecosystem. 
+Finally, this does not provide as clear an indicator that a package +has adopted the new system, and should be treated accordingly. + The use of a new ``License-Expression`` field will provide an intuitive, structured and unambiguous way to express the license of a distribution using a well-defined syntax and well-known license identifiers. @@ -189,7 +225,10 @@ this PEP include those in both author-provided static source metadata, as specified in PEP 621, and built package metadata, as defined in the Core Metadata specification [#cms]_. Furthermore, requirements are needed for tools handling and converting legacy license metadata to license expressions, -to ensure the results are consistent, correct and unambiguous. +to ensure the results are consistent, correct and unambiguous. Finally, minor +additions to the source distribution (sdist), binary distribution (wheel) +and installed project specifications will help document and clarify the +already allowed, now formally standardized behavior in these respects. Core Metadata @@ -804,7 +843,7 @@ Adding a new, dedicated ``License-Expression`` core metadata field and support for the specification in this PEP. This avoids the risk of new tooling misinterpreting a license expression as a free-form license description or vice versa, and raises an error if and only if the user affirmatively -upgrades to the latest metadata version by adding said field/key. +upgrades to the latest metadata version and adds the new field. The legacy ``License`` core metadata field and ``license`` PEP 621 source metadata key will be deprecated along with the ``License ::`` classifiers, @@ -822,6 +861,46 @@ to include, as well as the default behavior, and allows other tools to make use of them, while only having an effect once users and tools expressly adopt it. 
+Due to requiring license files not be flattened into ``.dist-info`` and +specifying that they should be placed in a dedicated ``license_files`` subdir, +wheels produced following this change will have differently-located +licenses relative to those produced via the previous unspecified, +installer-specific behavior, but as until this PEP there was no way of +discovering these files or accessing them programmatically, and this will +be further discriminated by a new metadata version, there isn't any foreseen +mechanism for this to pose a practical issue. + +Furthermore, this resolves existing compatibility issues with the current +ad hoc behavior, namely license files being silently clobbered if they have +the same names as others at different paths, unknowingly rendering the wheel +undistributable, and conflicting with the names of other metadata files in +the same directory. Formally specifying otherwise would in fact block full +forward compatibility with additional standard or installer-specified files +and directories added to ``.dist-info``, as they too could conflict with +the names of existing licenses. + +While minor additions will be made to the source distribution (sdist), +binary distribution (wheel) and installed project specifications, all of these +are merely documenting, clarifying and formally specifying behaviors explicitly +allowed under their current respective specifications, and already implemented +in practice, and gating them behind the explicit presence of both the new +metadata versions and the new fields. In particular, sdists may contain +arbitrary files following the source tree layout, and formally mentioning that +these must include the license files listed in the metadata merely documents +and codifies existing Setuptools practice.
Likewise, arbitrary +installer-specific files are allowed in the ``.dist-info`` directory of wheels and +copied to installed projects, and again this PEP just formally clarifies +and standardizes what is already being done. + +Finally, while this PEP does propose PyPI implement validation of the new +license expressions and license files fields, this has no effect on existing +packages, and no effect on any new packages uploaded unless they explicitly +choose to include these new fields while unintentionally not following the +requirements in the specification. Therefore, this does not have a backward +compatibility impact, and in fact ensures forward compatibility with any +future changes by ensuring all packages uploaded to PyPI with the new fields +are valid and conform to the specification. + Security Implications ===================== @@ -1994,8 +2073,10 @@ possible source of confusion: ``[metadata]`` section of the project's ``setup.cfg``, or as an argument to the ``setuptools`` ``setup()`` function. At present, following wheel's lead, Setuptools flattens the collected license files into the metadata - directory, clobbering files with the same name, but there is a desire to - resolve this, contingent on the this PEP being accepted [#setuptoolsfiles]_. + directory, clobbering files with the same name, and dumps license files + directly into the top-level ``.dist-info`` directory, but there is a desire + to resolve both these issues, contingent on this PEP being accepted + [#setuptoolsfiles]_. - Both tools also support an older, singular ``license_file`` parameter that allows specifying only one license file to add to the distribution, which From d8a7b05590aaa52fbcbf4a609a5e38cd29cfd0f5 Mon Sep 17 00:00:00 2001 From: "C.A.M.
Gerlach" Date: Sat, 27 Nov 2021 23:33:57 -0600 Subject: [PATCH 15/19] PEP 639: Add terminology section & use consistant, clear & correct terms --- pep-0639.rst | 742 ++++++++++++++++++++++++++++++--------------------- 1 file changed, 443 insertions(+), 299 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index f3a93cec478..d229cd3e9c0 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -38,12 +38,12 @@ The PEP also: - Provides clear guidance for authors and tools converting legacy license metadata, adding license files and validating license expressions. -- Adds and deprecates corresponding fields in the ``pyproject.toml`` +- Adds and deprecates corresponding keys in the PEP 621 project source metadata format. The changes in this PEP will update the core metadata to version 2.3, -modify the PEP 621 project metadata format, and make minor additions to -the source distribution (sdist), binary distribution (wheel) and installed +modify the PEP 621 project metadata specification, and make minor additions to +the source distribution (sdist), built distribution (wheel) and installed project standards. @@ -51,10 +51,10 @@ Goals ===== This PEP's scope is limited strictly to how we document the license of a -distribution, specifically covering: +distribution package, specifically covering: -- An improved and structured way to document a license expression. -- A formal mechanism to include license texts in a built package. +- An improved and structured way to include a license expression. +- A formal mechanism to include license texts in a built distribution (wheel). The changes to the core metadata specification that this PEP requires have been designed to minimize impact and maximize backward compatibility. @@ -67,9 +67,9 @@ in the existing ``License`` field). 
In addition to these proposed changes, this PEP contains: - Recommendations for publishing tools on how to validate the new - ``License-Expression`` field, add license files to packages, and and + ``License-Expression`` field, add license files to distributions, and warn on and convert legacy metadata - (the ``License`` field and ``License ::`` classifiers). + (the ``License`` field and license classifiers). - Simplified guidance for package authors on how to handle license files and expressions for various common situations. @@ -105,20 +105,20 @@ Instead, it is intended to document best practices already in use, extend them to use a new formally-specified and supported mechanism, and provide guidance for packaging tools on how to hand the transition and inform users accordingly. -This PEP is not about license documentation in files inside packages, +This PEP is not about license documentation in files inside projects, though this is a surveyed topic in the appendix, and nor does it currently intend to cover cases where the source and binary distribution packages have different licenses. -Possible Future PEPs +Possible future PEPs -------------------- It is the intention of the authors of this PEP to consider the submission of related but separate PEPs in the future, which may include: -- Removing the deprecated ``License`` field and ``License ::`` - classifiers from the Core Metadata specification +- Removing the deprecated ``License`` field and license classifiers + from the Core Metadata specification. - Making the ``License-Expression`` and ``License-File`` fields mandatory for publishing tools and PyPI packages. @@ -131,14 +131,14 @@ Motivation ========== Software is licensed, and providing accurate licensing information to Python -packages users is an important matter. +package users is an important matter.
Today, there are multiple places where +licenses are documented in core metadata and there are limitations to what can be documented. This is often leading to confusion or a lack of clarity both -for package authors and package users. +for package authors and users. Several package authors have expressed difficulty and/or frustrations due to the -limited capabilities to express licensing in package metadata. This also applies -to Linux and BSD* distribution packagers. This has triggered several +limited capabilities to express licensing in project metadata. This also applies +to Linux and BSD distribution packagers. This has triggered several license-related discussions and issues, in particular: - https://github.com/pypa/trove-classifiers/issues/17 @@ -192,7 +192,7 @@ classifiers (in most cases, with many possible choices that require closely examining the project's license) immediately when PyPI deprecates the old ones. Furthermore, this only covers simple packages entirely under a single license; -it doesn't address the substantial fraction of common packages that vendor +it doesn't address the substantial fraction of common projects that vendor dependencies (e.g. Setuptools), offer a choice of licenses (e.g. Packaging) or were relicensed, adapt code from other projects or contain fonts, images, examples, binaries or other assets under other licenses. It also requires @@ -204,9 +204,9 @@ Finally, this does not provide as clear an indicator that a package has adopted the new system, and should be treated accordingly. The use of a new ``License-Expression`` field will provide an intuitive, -structured and unambiguous way to express the license of a distribution -using a well-defined syntax and well-known license identifiers. -Similarly, a formally-specified ``License-Files`` field offers a standardized +structured and unambiguous way to express the license of a +package using a well-defined syntax and well-known license identifiers. 
Similarly, a formally-specified ``License-File`` field offers a standardized way to declare the full text of the license(s) as legally required to be included with the package when distributed, and allows other tools consuming the core metadata to unambiguously locate a distribution's license files. @@ -217,35 +217,175 @@ the clarity, accuracy and portability of their licensing practices, to the benefit of package authors, consumers and redistributors alike. +Terminology +=========== + +This PEP seeks to clearly define the terms it uses, specifically those that: + +- Have multiple established meanings (import vs. distribution package, + wheel *format* vs. Wheel *package*). + +- Are related and often used interchangeably, but have critical + distinctions in meaning (PEP 621 *key* vs. core metadata *field*, + a point of apparent confusion in PEP 621 with significant effects on this + PEP). + +- Are existing concepts that don't have formal terms/definitions + (project/source metadata vs. distribution/built metadata, + build vs. publishing tools). + +- Are new concepts introduced here (license expression/identifier). + +Whenever available, definitions are excerpted from the PyPA PyPUG Glossary +[#pypugglossary]_ and SPDX [#spdx]_. Terms are listed here in their full +versions; related words (``Rel:``) are in parentheses, including short forms +(``Short:``), sub-terms (``Sub:``) and common synonyms for the purposes of +this PEP (``Syn:``). + +**Built Distribution** *(Syn: Binary Distribution/Wheel)* + A Distribution format containing files and metadata that only need to be + moved to the correct location on the target system, to be installed. + Wheel is such a format, whereas distutil's *[sic]* Source Distribution + is not. + *(PyPUG Glossary)* + + For the purposes of this PEP, except where noted, this is synonymous + with **binary distribution** (a built distribution containing compiled code) + and **wheel** (the format).
+ +**Core Metadata** *(Syn: Package Metadata, Sub: Distribution Metadata)* + The PyPA specification [#cms]_ and the set of metadata fields it defines that + describe key static attributes of distribution packages and installed + projects. + + **Distribution metadata** refers more specifically to the concrete form + core metadata takes when included inside a distribution package + (``PKG-INFO`` in a sdist and ``METADATA`` in a wheel) or installed project + (``METADATA``). + +**Core Metadata Field** *(Short: Metadata Field/Field)* + A single key-value pair, or sequence of such with the same key, as defined + by the core metadata specification. Notably, *not* a PEP 621 project + metadata format key. + +**Distribution Package** *(Sub: Package, Distribution Archive)* + A versioned archive file that contains Python packages, modules, and other + resource files that are used to distribute a Release. + *(PyPUG Glossary)* + + In this PEP, **package** is used to refer to the abstract concept of a + distributable form of a Python project, while **distribution** more + specifically references the physical **distribution archive**. + +**License Classifier** + A PyPI Trove classifier [#classif]_ as originally defined in PEP 301 which + begins with ``License ::``, currently used to indicate a project's + license status by including it as a ``Classifier`` in the core metadata. + +**License Expression** *(Syn: SPDX Expression)* + A string with valid SPDX license expression syntax [#spdxpression]_ + including any SPDX license identifiers as defined here, which describes + a project's license(s) and how they relate to one another.
Examples: + ``GPL-3.0-or-later``, ``MIT AND (Apache-2.0 OR BSD-2-clause)`` + +**License Identifier** *(Syn: License ID/SPDX Identifier)* + A valid SPDX short-form license identifier [#spdxid]_, as described in the + `Add License-Expression field`_ section of this PEP; briefly, + this includes all valid SPDX identifiers and the ``LicenseRef-Public-Domain`` + and ``LicenseRef-Proprietary`` strings. Examples: ``MIT``, ``GPL-3.0-only`` + +**Project** *(Sub: Project Source Tree, Installed Project)* + A library, framework, script, plugin, application, or collection of data + or other resources, or some combination thereof that is intended to be + packaged into a Distribution. I.E., contains a ``pyproject.toml``, + ``setup.py``, or ``setup.cfg`` file at the root of the project source + directory. + *(PyPUG Glossary)* + + Here, a **project source tree** refers to the on-disk format of + a project used for development, while an **installed project** is the form a + project takes once installed from a distribution, as specified by PyPA + [#installedspec]_. + +**Project Source Metadata** *(Sub: PEP 621 Metadata, Key, Subkey)* + Core metadata defined by the package author in the project source tree, + as top-level keys in the ``[project]`` table of a PEP 621 ``pyproject.toml``, + in the ``[metadata]`` table of ``setup.cfg``, or the equivalent for other + build tools. + + **PEP 621 metadata** refers specifically to the former, as defined by the + PyPA Declaring Project Metadata specification [#projectspec]_. + A **PEP 621 metadata key**, or an unqualified *key* refers specifically to + a top-level ``[project]`` key (notably, *not* a core metadata *field*), + while a **subkey** refers to a second-level key in a table-valued + PEP 621 key. + +**Source Distribution** *(Short: sdist)* + Here, specifically refers to a source distribution (**sdist**) as specified + by PyPA [#sdistspec]_. 
+ +**Tool** *(Sub: Packaging Tool, Build Tool, Install Tool, Publishing Tool)* + A program, script or service executed by the user or automatically that + seeks to conform to the specification defined in this PEP. + + A **packaging tool** refers to a tool used to build, publish, + install, or otherwise directly interact with Python packages. + + A **build tool** is a packaging tool used to generate a source or built + distribution from a project source tree or sdist, when directly invoked + as such (as opposed to by end-user-facing install tools). + Examples: Wheel project, PEP 517 backends via ``build`` or other + package-developer-facing frontends, calling ``setup.py`` directly. + + An **install tool** is a packaging tool used to install a source or built + distribution in a target environment. Examples include the PyPA pip and + ``installer`` projects. + + A **publishing tool** is a packaging tool used to upload distribution + archives to a package index, such as Twine for PyPI. + +**Wheel Format** *(Short: wheel, Rel: Wheel project)* + Here, **wheel**, the standard built distribution format introduced in PEP 427 + and specified by PyPA [#wheelspec]_, will be referred to in lowercase, + while the **Wheel project** [#wheelproject]_, its reference implementation, + will be referred to as "Wheel" in Title Case. + + Specification ============= The changes necessary to implement the improved license handling outlined in -this PEP include those in both author-provided static source metadata, as -specified in PEP 621, and built package metadata, as defined in the Core -Metadata specification [#cms]_. Furthermore, requirements are needed for +this PEP include those in both author-provided project source metadata, as +specified in PEP 621, and distribution package metadata, as defined in the core +metadata specification [#cms]_.
Furthermore, requirements are needed for tools handling and converting legacy license metadata to license expressions, to ensure the results are consistent, correct and unambiguous. Finally, minor -additions to the source distribution (sdist), binary distribution (wheel) +additions to the source distribution (sdist), built distribution (wheel) and installed project specifications will help document and clarify the already allowed, now formally standardized behavior in these respects. -Core Metadata +Core metadata ------------- -The canonical source for the names and semantics of each of the supported -metadata fields is the Core Metadata Specification [#cms]_ document. +The PyPA Core Metadata specification [#cms]_ defines the names and +semantics of each of the supported fields in the distribution metadata of +Python distribution packages and installed projects. This PEP adds the ``License-Expression`` and ``License-File`` fields, -deprecates the ``License`` field, and deprecates the ``License ::`` -classifiers in the ``Classifier`` field. +deprecates the ``License`` field, and deprecates the license classifiers +in the ``Classifier`` field. + +The error and warning guidance in this section applies to build and +publishing tools; user-facing install tools MAY be more lenient than +mentioned here when encountering malformed metadata +that does not conform to this specification. As it adds new fields, this PEP updates the core metadata to version 2.3. -Add License-Expression Field -'''''''''''''''''''''''''''' +Add ``License-Expression`` field +'''''''''''''''''''''''''''''''' The ``License-Expression`` optional field is specified to contain a text string that is a valid SPDX license expression, defined below. @@ -255,9 +395,8 @@ missing, and MAY raise an error. Build tools MAY issue a similar warning, but MUST NOT raise an error. 
A license expression is a string using the SPDX license expression syntax as -documented in the SPDX specification [#spdx]_ using either Version 2.2 -[#spdx22]_ or a later compatible version. SPDX is a working group at the Linux -Foundation that defines a standard way to exchange package information. +documented in the SPDX specification [#spdxpression]_ using either Version 2.2 +or a later compatible version. When used in the ``License-Expression`` field and as a specialization of the SPDX license expression definition, a license expression can use the following license @@ -292,7 +431,7 @@ a valid license expression, build and publishing tools: the normalization process results in changes to the ``License-Expression`` field contents. -For all newly-upload distribution packages that include a +For all newly-uploaded distributions that include a ``License-Expression`` field, the Python Package Index (PyPI) [#pypi]_ MUST validate that it contains a valid, case-normalized license expression with valid identifiers (as defined above) and MUST reject uploads that do not @@ -300,8 +439,8 @@ validate. PyPI MAY reject an upload for using a deprecated license identifier, so long as it was deprecated as of the above SPDX License List version. -Add License-File Field -'''''''''''''''''''''' +Add ``License-File`` field +'''''''''''''''''''''''''' The ``License-File`` optional field is specified to contain the string representation of the path to a license-related file, relative to the @@ -311,10 +450,10 @@ legal notices that need to be distributed with the package. It is a multi-use field that may appear zero or more times, each instance listing the path to one such file.
-If a ``License-File`` is listed in a source or binary distribution's core +If a ``License-File`` is listed in a source or built distribution's core metadata, that file MUST be included in the distribution at the specified path relative to the root license directory, and MUST be installed with the -distribution at that same path. +distribution at that same relative path. The root license directory is defined to be the project root directory for source trees and source distributions, and the ``license_files`` @@ -323,7 +462,7 @@ subdirectory of the directory containing the core metadata (i.e. the distributions and installed projects. The specified relative path MUST be consistent between project source trees, -source distributions (sdists), binary distributions (wheels) and installed +source distributions (sdists), built distributions (wheels) and installed projects. Therefore, inside the root license directory, packaging tools MUST reproduce the directory structure under which the source license files are located relative to the project root. @@ -333,17 +472,17 @@ and parent directory indicators (``..``) MUST NOT be used. License file content MUST be UTF-8 encoded text. Build tools MAY and publishing tools SHOULD produce an informative warning -if a built package's metadata contains no ``License-File`` entries, +if a built distribution's metadata contains no ``License-File`` entries, and publishing tools MAY but build tools MUST NOT raise an error. For all newly-uploaded distribution packages that include one or more ``License-File`` fields and declare a ``Metadata-Version`` of ``2.3`` or higher, PyPI SHOULD validate that the specified files are present in all -distribution packages, and MUST reject uploads that do not validate. +uploaded distributions, and MUST reject uploads that do not validate. 
-Deprecate License Field -''''''''''''''''''''''' +Deprecate ``License`` field +''''''''''''''''''''''''''' The legacy unstructured-text ``License`` field is deprecated and replaced by the new ``License-Expression`` field. @@ -356,7 +495,7 @@ If only the ``License`` field is present, such tools SHOULD issue a warning informing users it is deprecated and recommending ``License-Expression`` instead. -For all newly-uploaded distribution packages that include a +For all newly-uploaded distributions that include a ``License-Expression`` field, the Python Package Index (PyPI) [#pypi]_ MUST reject any that specify a ``License`` field and the text of which is not identical to that of ``License-Expression``, as defined above. @@ -365,12 +504,12 @@ Along with license classifiers, the ``License`` field may be removed from a new version of the specification in a future PEP. -Deprecate License Classifiers +Deprecate license classifiers ''''''''''''''''''''''''''''' -Including license classifiers [#classif]_ (those beginning with ``License ::``) -in the ``Classifier`` field (described in PEP 301) is deprecated and -replaced by the more precise ``License-Expression`` field. +Including license classifiers [#classif]_ in the ``Classifier`` field +(described in PEP 301) is deprecated and replaced by the more precise +``License-Expression`` field. If the ``License-Expression`` field is present, build tools SHOULD and publishing tools MUST raise an error if one or more license classifiers @@ -384,43 +523,43 @@ For compatibility with existing publishing and installation processes, the presence of license classifiers SHOULD NOT raise an error unless ``License-Expression`` is also provided. -For all newly-uploaded distribution packages that include a +For all newly-uploaded distributions that include a ``License-Expression`` field, the Python Package Index (PyPI) [#pypi]_ MUST reject any that also specify any license classifiers. 
-New ``License ::`` classifiers MUST NOT be added to PyPI [#classifersrepo]_; +New license classifiers MUST NOT be added to PyPI [#classifersrepo]_; users needing them SHOULD use the ``License-Expression`` field instead. Along with the ``License`` field, license classifiers may be removed from a new version of the specification in a future PEP. -PEP 621 Source Metadata +Project source metadata ----------------------- -As currently specified in the canonical PyPA specification [#projectspec]_, -PEP 621 defines how to declare a project's source metadata in a ``[project]`` -table in the ``pyproject.toml`` file for packaging tools to consume and -output a distribution's core metadata. +As originally introduced in PEP 621, the PyPA Declaring Project Metadata +specification [#projectspec]_ defines how to declare a project's source +metadata in a ``[project]`` table in the ``pyproject.toml`` file for +build tools to consume and output distribution core metadata. This PEP adds the ``license-expression`` and ``license-files`` keys and deprecates the ``license`` key. -Add license-expression Key -'''''''''''''''''''''''''' +Add ``license-expression`` key +'''''''''''''''''''''''''''''' A new ``license-expression`` key is added to the ``project`` table, which has a string value that is a valid SPDX license expression, as defined previously. Its value maps to the ``License-Expression`` field in the core metadata. -Packaging tools SHOULD validate the expression as described above, outputting +Build tools SHOULD validate the expression as described above, outputting an error or warning as specified. When generating the core metadata, tools MUST perform case normalization. If and only if the ``license-expression`` key is listed as ``dynamic`` -(and is not specified), tools MAY infer a value for this field if they can do -so unambiguously, but MUST follow the provisions in the -`Converting Legacy Metadata`_ section. 
+(and is not specified), tools MAY infer a value for the ``License-Expression`` +field if they can do so unambiguously, but MUST follow the provisions in the +`Converting legacy metadata`_ section. If the ``license-expression`` key is present and valid (and the ``license`` key is not specified), for purposes of backward compatibility, tools MAY @@ -428,17 +567,17 @@ back-fill the ``License`` core metadata field with the case-normalized value of the ``license-expression`` key. -Add license-files Key -''''''''''''''''''''' +Add ``license-files`` key +''''''''''''''''''''''''' A new ``license-files`` key is added to the ``project`` table for specifying paths in the project source relative to ``pyproject.toml`` to file(s) containing licenses and other legal notices to be distributed with the package. It corresponds to the ``License-File`` fields in the core metadata. -Its value may either be a table or an array of strings. If a table, it may -contain one of two optional, mutually exclusive keys, ``paths`` and ``globs``; -both arrays of strings. If both are specified, tools MUST raise an error. +Its value is a table, which if present MUST contain one of two optional, +mutually exclusive subkeys, ``paths`` and ``globs``; both arrays of strings. +If both are specified, tools MUST raise an error. The ``paths`` subkey contains verbatim file paths, and the ``globs`` subkey valid glob patterns, parsable by the ``glob`` module [#globmodule]_ in the Python standard library. @@ -452,28 +591,28 @@ and parent directory indicators (``..``) MUST NOT be used. Tools MUST assume that license file content is valid UTF-8 encoded text, and SHOULD validate this and raise an error if it is not. -If the ``paths`` subkey is a non-empty array, packaging tools: +If the ``paths`` subkey is a non-empty array, build tools: - MUST treat each value as a verbatim, literal file path, and MUST NOT treat them as glob patterns. -- MUST include each listed file in distribution artifacts. 
+- MUST include each listed file in all distribution archives. - MUST NOT match any additional license files beyond those explicitly - statically specified by the user under the ``paths`` key. + statically specified by the user under the ``paths`` subkey. - MUST list each file path under a ``License-File`` field in the core metadata. - MUST raise an error if one or more paths do not correspond to a valid file - in the package source that can be copied into the built distribution. + in the project source that can be copied into the distribution archive. -If the ``globs`` subkey is a non-empty array, packaging tools: +If the ``globs`` subkey is a non-empty array, build tools: - MUST treat each value as a glob pattern, and MUST raise an error if the pattern contains invalid glob syntax. -- MUST include all files matched by at least one listed pattern in - distribution artifacts. +- MUST include all files matched by at least one listed pattern in all + distribution archives. - MAY exclude files matched by glob patterns that can be unambiguously determined to be backup, temporary, hidden, OS-generated or VCS-ignored. @@ -500,29 +639,29 @@ but MUST NOT raise an error. If the ``license-files`` key is marked as ``dynamic`` (and not present), to preserve consistent behavior with current tools and help ensure the packages -they create are legally distributable, packaging tools SHOULD default to +they create are legally distributable, build tools SHOULD default to including at least the license files matching the above patterns, unless the user has explicitly specified their own. -Deprecate license Key -''''''''''''''''''''' +Deprecate ``license`` key +''''''''''''''''''''''''' The ``license`` key in the ``project`` table is now deprecated. It MUST not be used if either of the new ``license-expression`` or ``license-files`` keys are defined, nor should it be listed as ``dynamic``, -and packaging tools MUST raise an error if either is the case. 
+and build tools MUST raise an error if either is the case. -Otherwise, if the ``text`` key is present in the ``license`` table, tools +Otherwise, if the ``text`` subkey is present in the ``license`` table, tools SHOULD issue a warning informing users it is deprecated and recommending the ``license-expression`` key instead. -Likewise, if the ``file`` key is present in the ``license`` table, tools SHOULD -issue a warning informing users it is deprecated and recommending +Likewise, if the ``file`` subkey is present in the ``license`` table, tools +SHOULD issue a warning informing users it is deprecated and recommending the ``license-files`` key instead. However, if the file is present in the -source, packaging tools SHOULD still use it to fill the ``License-File`` field +source, build tools SHOULD still use it to fill the ``License-File`` field in the core metadata, and if so, MUST include the specified file in any -distribution artifacts for the project. If the file does not exist at the +distribution archives for the project. If the file does not exist at the specified path, tools SHOULD issue a warning, and MUST NOT fill it in a ``License-File`` field. @@ -537,7 +676,7 @@ The ``license`` key may be removed from a new version of the specification in a future PEP. -License Files In Project Formats +License files in project formats -------------------------------- A few minor additions will be made to the relevant existing specifications @@ -546,49 +685,49 @@ allowed and implemented behavior, as well as explicitly mention the directory location the license file tree is rooted in for each format, per the specification above. -Project source trees +**Project source trees** As described above, the project source metadata specification [#projectspec]_ will be updated to reflect that license file paths MUST be relative to the project root directory; i.e. the directory containing the ``pyproject.toml`` (or equivalently, other legacy project configuration, e.g. 
``setup.py``, ``setup.cfg``, etc). -Source distributions (sdists) +**Source distributions** *(sdists)* The sdist specification [#sdistspec]_ will be updated to reflect that for - metadata version 2.3, the sdist MUST contain any license files specified - by ``License-Files`` in the ``PKG-INFO`` at their respective paths relative - to the top-level directory of the sdist + ``Metadata-Version`` ``2.3`` or greater, the sdist MUST contain any + license files specified by ``License-File`` in the ``PKG-INFO`` at their + respective paths relative to the top-level directory of the sdist (containing the ``pyproject.toml`` and the ``PKG-INFO`` core metadata). -Built distributions (wheels) +**Built distributions** *(wheels)* The wheel specification [#wheelspec]_ will be updated to reflect that if - the ``METADATA`` version is 2.3 or greater and one or more ``License-File`` - fields is specified, the ``.dist-info`` directory MUST contain a - ``license_files`` subdirectory which MUST contain the files listed in the - ``License-File`` fields in the ``METADATA`` file at their respective paths - relative to the ``license_files`` directory. - -Installed projects - The recording installed projects specification [#installedspec]_ will be - updated to reflect that if the ``METADATA`` version is 2.3 or greater + the ``Metadata-Version`` is ``2.3`` or greater and one or more + ``License-File`` fields is specified, the ``.dist-info`` directory MUST + contain a ``license_files`` subdirectory which MUST contain the files listed + in the ``License-File`` fields in the ``METADATA`` file at their respective + paths relative to the ``license_files`` directory.
+ +**Installed projects** + The Recording Installed Projects specification [#installedspec]_ will be + updated to reflect that if the ``Metadata-Version`` is ``2.3`` or greater and one or more ``License-File`` fields is specified, the ``.dist-info`` directory MUST contain a ``license_files`` subdirectory which MUST contain the files listed in the ``License-File`` fields in the ``METADATA`` file at their respective paths relative to the ``license_files`` directory, - and that any files in this directory MUST be copied from installed wheels. + and that any files in this directory MUST be copied from installed wheels + by install tools. -Converting Legacy Metadata +Converting legacy metadata -------------------------- -If the contents of the ``License`` field are a valid SPDX expression containing -solely known, non-deprecated license identifiers, build and publishing tools MAY -use it to fill the ``License-Expression`` field. +If the contents of the ``License`` field are a valid license expression +containing solely known, non-deprecated license identifiers, build tools +MAY use it to fill the ``License-Expression`` field. Similarly, if the ``Classifier`` field contains exactly one license classifier -(those beginning with ``License ::``) that unambiguously maps to exactly one -valid, non-deprecated SPDX identifier, tools MAY use it to fill the -``License-Expression`` field. +that unambiguously maps to exactly one valid, non-deprecated SPDX identifier, +tools MAY fill the ``License-Expression`` field with the latter. If both a non-empty ``License`` field and a single license classifier are present, the contents of the ``License`` field, including capitalization @@ -609,7 +748,7 @@ informing the user and requiring unambiguous, affirmative user action to select and confirm the desired ``License-Expression`` value before proceeding. 
-Mapping License Classifiers to SPDX Identifiers +Mapping license classifiers to SPDX identifiers ''''''''''''''''''''''''''''''''''''''''''''''' Most single license classifiers (namely, all those not mentioned below) @@ -691,7 +830,7 @@ considered canonical and normative for the purposes of this specification: Therefore, tools MUST treat them as ambiguous when attempting to fill ``License-Expression``. -When multiple license-related classifiers are used, their relation is ambiguous +When multiple license classifiers are used, their relation is ambiguous and it is typically not possible to determine if all the licenses apply or if there is a choice that is possible among the licenses. In this case, tools MUST NOT automatically infer a license expression and SHOULD suggest that the @@ -765,7 +904,7 @@ functionality. In your project config file, enter your license expression under ``license-expression`` (PEP 621 ``pyproject.toml``), ``license_expression`` -(Setuptools ``setup.cfg``/``setup.py``), or the equivalent for your +(Setuptools ``setup.cfg`` / ``setup.py``), or the equivalent for your packaging tool, and make sure to remove any legacy ``license`` value or ``License ::`` classifiers. Your existing ``license`` value may already be valid as one (e.g. ``MIT``, ``Apache-2.0 OR BSD-2-Clause``, etc); @@ -805,7 +944,7 @@ complex situations. In your project config file, enter your license expression under ``license-expression`` (PEP 621 ``pyproject.toml``), ``license_expression`` -(Setuptools ``setup.cfg``/``setup.py``), or the equivalent for your +(Setuptools ``setup.cfg`` / ``setup.py``), or the equivalent for your packaging tool, and make sure to remove any legacy ``license`` value or ``License ::`` classifiers. @@ -843,10 +982,10 @@ Adding a new, dedicated ``License-Expression`` core metadata field and support for the specification in this PEP. 
This avoids the risk of new tooling misinterpreting a license expression as a free-form license description or vice versa, and raises an error if and only if the user affirmatively -upgrades to the latest metadata version and adds the new field. +upgrades to the latest metadata version and adds the new field/key. The legacy ``License`` core metadata field and ``license`` PEP 621 source -metadata key will be deprecated along with the ``License ::`` classifiers, +metadata key will be deprecated along with the license classifiers, retaining backwards compatibility while gently preparing users for their future removal. Such a removal would follow a suitable transition period, and be left to a future PEP and a new version of the core metadata specification. @@ -854,7 +993,7 @@ be left to a future PEP and a new version of the core metadata specification. Formally specifying the new ``License-File`` core metadata field and the inclusion of the listed files in the distribution merely codifies and refines the existing practices in popular packaging tools, including -``wheel`` and ``setuptools``, and is designed to be backwards-compatible +Wheel and Setuptools, and is designed to be largely backwards-compatible with their existing use of that field. Likewise, the new ``license-files`` PEP 621 source metadata key standardizes statically specifying the files to include, as well as the default behavior, and allows other tools to @@ -880,34 +1019,34 @@ and directories added to ``.dist-info``, as they too could conflict with the names of existing licenses. 
While minor additions will be made to the source distribution (sdist) -binary distribution (wheel) and installed project specifications, all of these +built distribution (wheel) and installed project specifications, all of these are merely documenting, clarifing and formally specifying behaviors explicitly allowed under their current respective specifications, and already implemented in practice, and gating them behind the explicit presence of both the new metadata versions and the new fields. In particular, sdsts may contain -arbitrary files following the source tree layout, and formally mentioning that -these must include the license files listed in the metadata merely documents -and codifies existing Setuptools practice. Likewise, arbitrary installer- -specific files are allowed in the ``.dist-info`` directory of wheels and -copied to installed projects, and again this PEP just formally clarifies +arbitrary files following the project source tree layout, and formally +mentioning that these must include the license files listed in the metadata +merely documents and codifies existing Setuptools practice. Likewise, arbitrary +installer-specific files are allowed in the ``.dist-info`` directory of wheels +and copied to installed projects, and again this PEP just formally clarifies and standardizes what is already being done. Finally, while this PEP does propose PyPI implement validation of the new -license expressions and license files fields, this has no effect on existing -packages, and no effect on any new packages uploaded unless they explicitly -choose to include these new fields while unintentionally not following the -requirements in the specification. Therefore, this does not have a backward -compatibility impact, and in fact ensures forward compatibility with any -future changes by ensuring all packages uploaded to PyPI with the new fields -are valid and conform to the specification. 
+``License-Expression`` and ``License-File`` fields, this has no effect on +existing packages, nor any effect on any new distributions uploaded unless they +explicitly choose to include these new fields while unintentionally not +following the requirements in the specification. Therefore, this does not have +a backward compatibility impact, and in fact ensures forward compatibility with +any future changes by ensuring all distributions uploaded to PyPI with the new +fields are valid and conform to the specification. Security Implications ===================== -This PEP has no foreseen security implications: the License-Expression field is -a plain string and the License-File(s) are file paths. None of them introduces -any known new security concerns. +This PEP has no foreseen security implications: the ``License-Expression`` +field is a plain string and the License-File(s) are file paths. +None of them introduces any known new security concerns. How to Teach This @@ -919,22 +1058,22 @@ expression and a large majority of packages use a single license. The plan to teach users of packaging tools how to express their package's license with a valid license expression is to have tools issue informative messages when they detect invalid license expressions, or when the deprecated -``License`` field or a ``License ::`` classifier is used. +``License`` field or a license classifier is used. An immediate, descriptive error message if an invalid ``License-Expression`` is used will help users understand they need to use valid SPDX identifiers in this field, and catch them if they make a mistake. For authors still using the now-deprecated, less precise and more redundant -``License`` field or ``License ::`` classifiers, packaging tools will warn +``License`` field or license classifiers, packaging tools will warn them and inform them of the modern replacement, ``License-Expression``. 
Finally, for users who may have forgot or not be aware they need to do so, -publishing tools will gently guide them toward including ``License-Expression`` -and ``License-Files`` with their uploaded packages. +publishing tools will gently guide them toward including ``license-expression`` +and ``license-files`` in their project source metadata. Tools may also help with the conversion and suggest a license expression in many, if not most common cases: -- The section `Mapping License Classifiers to SPDX Identifiers`_ provides +- The section `Mapping license classifiers to SPDX identifiers`_ provides tool authors with guidelines on how to suggest a license expression produced from legacy classifiers. @@ -952,37 +1091,37 @@ Reference Implementation Tools will need to support parsing and validating license expressions in the ``License-Expression`` field. -The ``license-expression`` library [#licexp]_ is a reference Python +The license-expression library [#licexp]_ is a reference Python implementation of a library that handles license expressions including parsing, validating and formatting license expressions using flexible lists of license symbols (including SPDX license identifiers and any extra identifiers referenced here). It is licensed under the Apache-2.0 license and is used in a few projects -such as the SPDX Python tools [#spdxpy]_, the ScanCode toolkit [#scancodetk]_ +such as the SPDX Python Tools [#spdxpy]_, the ScanCode toolkit [#scancodetk]_ and the Free Software Foundation Europe (FSFE) Reuse project [#reuse]_. Rejected Ideas ============== -Core Metadata Fields +Core metadata fields -------------------- Potential alternatives to the structure, content and deprecation of the core metadata fields specified in this PEP. 
-Re-Use the License Field -'''''''''''''''''''''''' +Re-use the ``License`` field +'''''''''''''''''''''''''''' Following initial discussion [#reusediscussion]_, earlier versions of this PEP proposed to re-use the existing ``License`` field, which tools would -attempt to parse as a SPDX expression with a fall back to treating as free -text. Initially, this would merely cause a warning (or even pass silently), -but would eventually be treated as an error by modern tooling. +attempt to parse as a SPDX license expression with a fall back to treating +as free text. Initially, this would merely cause a warning (or even pass +silently), but would eventually be treated as an error by modern tooling. This offered the benefit of greater backwards-compatibility, -easing the community into using SPDX expressions while taking advantage of -packages that already have them (either intentionally or coincidentally), +easing the community into using SPDX license expressions while taking advantage +of packages that already have them (either intentionally or coincidentally), and avoided adding yet another license-related field. However, following substantial discussion, consensus was reached that a @@ -1000,7 +1139,7 @@ not clearly distinguishable from true positives and negatives, an ambiguity at odds with the goals of this PEP. Furthermore, it allows both the existing ``License`` field and -the ``License::`` classifiers to be more easily deprecated, +the license classifiers to be more easily deprecated, with tools able to cleanly distinguish between packages intending to affirmatively conform to the updated specification in this PEP or not, and adapt their behavior (warnings, errors, etc) accordingly. @@ -1013,24 +1152,24 @@ Finally, it avoids changing the behavior of an existing metadata field, and avoids tools having to guess the ``Metadata-Version`` and field behavior based on its value rather than merely its presence. 
-While this would mean the subset of existing projects containing ``License`` -fields valid as SPDX expressions wouldn't automatically be recognized as such, -this only requires appending a few characters to the key name in the -package's source metadata, and this PEP provides extensive guidance on -how this can be done automatically by tooling. +While this would mean the subset of existing distributions containing +``License`` fields valid as SPDX license expressions wouldn't automatically be +recognized as such, this only requires appending a few characters to the key +name in the project's source metadata, and this PEP provides extensive +guidance on how this can be done automatically by tooling. Given all this, it was decided to proceed with defining a new, purpose-created field, ``License-Expression``. -Re-Use the License Field with a Value Prefix -'''''''''''''''''''''''''''''''''''''''''''' +Re-use the ``License`` field with a value prefix +'''''''''''''''''''''''''''''''''''''''''''''''' As an alternative to the above, it was suggested to reduce the ambiguity -inherent in re-using the ``License`` field by prefixing SPDX expressions -with, e.g. ``spdx:``. However, this effectively amounted to creating a field -within a field, and doesn't address all the downsides of keeping the -``License`` field. Namely, it still changes the behavior of an +inherent in re-using the ``License`` field by prefixing SPDX license +expressions with, e.g. ``spdx:``. However, this effectively amounted to +creating a field within a field, and doesn't address all the downsides of +keeping the ``License`` field. Namely, it still changes the behavior of an existing metadata field, requires tools to parse its value to determine how to handle its content, and makes the specification and deprecation process more complex and less clean.
@@ -1039,11 +1178,11 @@ Yet, it still shares a same main potential downside as just creating a new field, that projects currently using valid SPDX identifiers in the ``License`` field, intentionally or not, won't be automatically recognized, and requires about the same amount of effort to fix, namely changing a line in the -package's source metadata. Therefore, it was rejected in favor of a new field. +project's source metadata. Therefore, it was rejected in favor of a new field. -Don't Make License-Expression Mutually Exclusive -'''''''''''''''''''''''''''''''''''''''''''''''' +Don't make ``License-Expression`` mutually exclusive +'''''''''''''''''''''''''''''''''''''''''''''''''''' For backwards compatibility, the ``License`` field and/or the license classifiers could still be allowed together with the new @@ -1055,8 +1194,8 @@ simpler and unambiguous. Therefore, and in concert with clear community consensus otherwise, this idea was soundly rejected. -Don't Deprecate Existing License Field and Classifiers -'''''''''''''''''''''''''''''''''''''''''''''''''''''' +Don't deprecate existing ``License`` field and classifiers +'''''''''''''''''''''''''''''''''''''''''''''''''''''''''' Several community members were initially concerned that deprecating the existing ``License`` field and license classifiers would result in @@ -1090,13 +1229,13 @@ correctly; users just paste in their desired license identifier, or select it via a tool, and they're done; no need to learn about Trove classifiers and dig through the list to figure out which one(s) apply (and be confused by many ambiguous options), or figure out on their own what should go -in the ``license`` field (anything from nothing, to the license text, +in the ``license`` key (anything from nothing, to the license text, to a free-form description, to the same SPDX identifier they would be -entering in the ``License-Expression`` field anyway, assuming they can +entering in the ``license-expression`` key anyway, assuming 
they can easily find documentation at all about it). In fact, this can be made even easier thanks to the new field. For example, GitHub's popular ChooseALicense.com [#choosealicense]_ links to how to add SPDX license -identifiers to the packaging metadata of various languages that support +identifiers to the project source metadata of various languages that support them right in the sidebar of every license page; the SPDX support in this PEP enables adding Python to that list. @@ -1104,9 +1243,9 @@ For current package maintainers who have specified a ``License`` or license classifiers, this PEP only recommends warnings and prohibits errors for all but publishing tools, which are allowed to error if their intended distribution platform(s) so requires. Once maintainers are ready to -upgrade, for those already using SPDX expressions (accidentally or not) +upgrade, for those already using SPDX license expressions (accidentally or not) this only requires appending a few characters to the key name in the -package's source metadata, and for those with license classifiers that +project's source metadata, and for those with license classifiers that map to a single unambiguous license, or another defined case (public domain, proprietary), they merely need to drop the classifier and paste in the corresponding license identifier. This PEP provides extensive guidance and @@ -1123,13 +1262,13 @@ metadata versions, or those who choose not to provide license metadata, no changes are required regardless of the deprecation. 
-Don't Mandate Validating New Fields on PyPI +Don't mandate validating new fields on PyPI ''''''''''''''''''''''''''''''''''''''''''' Previously, while this PEP did include normative guidelines for packaging publishing tools (such as Twine), it did not provide specific guidance for PyPI (or other package indices) as to whether and how they -should validate the ``License-Expression`` or ``License-Files`` fields, +should validate the ``License-Expression`` or ``License-File`` fields, nor how they should handle using them in combination with the deprecated ``License`` field or license classifiers. This simplifies the specification and either defers implementation on PyPI to a later PEP, or gives @@ -1141,31 +1280,31 @@ field was separate from the existing ``License``, which would make validation much more challenging and backwards-incompatible, breaking existing packages. With that change, there was a clear consensus that the new field should be validated from the start, guaranteeing that all -packages uploaded to PyPI that declare adhere to core metadata version 2.3 +distributions uploaded to PyPI that declare adherence to core metadata version 2.3 or higher and have the ``License-Expression`` field will have a valid expression that PyPI and consumers of its packages and metadata can rely upon to follow the specification here. -The same can be extended to the ``License-Files`` field, as also specified +The same can be extended to the ``License-File`` field, as also specified here, to ensure that it is valid and the legally required license files present, and thus it is lawful for PyPI, users and downstream consumers to distribute the package (of course, this makes no *guarantee* of such as it is ultimately reliant on authors to declare such, but it improves assurance of this and allows doing so in the future if the community so -decides).
To be clear, this would not require that any uploaded distribution have such metadata, only that if they choose to declare it per the new specification in this PEP, it is assured to be valid. -PEP 621 License Key -------------------- +Source metadata ``license`` key +------------------------------- -Alternate possibilities related to the ``License`` key in the +Alternate possibilities related to the ``license`` key in the ``pyproject.toml`` project source metadata specified in PEP 621. -Add Expression and Files Subkeys to Table -''''''''''''''''''''''''''''''''''''''''' +Add ``expression`` and ``files`` subkeys to table +''''''''''''''''''''''''''''''''''''''''''''''''' A previous working draft of this PEP added ``expression`` and ``files`` subkeys to the existing ``license`` table in the PEP 621 source metadata, to parallel @@ -1175,13 +1314,13 @@ that ultimately taken here. Most saliently, this means two very different types of metadata are being specified under the same top-level key that require very different handling, -and furthermore, unlike the previous arrangement, the keys were not mutually +and furthermore, unlike the previous arrangement, the subkeys were not mutually exclusive and can both be specified at once, and with some subkeys potentially being dynamic and others static, and mapping to different core metadata fields. This also breaks from the consensus for the core metadata fields, namely to separate the license expression into its own explicit field. -Furthermore, this leads to a conflict with marking the field as ``dynamic`` +Furthermore, this leads to a conflict with marking the key as ``dynamic`` (assuming that is intended to specify PEP 621 keys, as that PEP seems to rather imprecisely imply, rather than core metadata fields), as either both would have to be treated as ``dynamic``. A user may want to specify the ``expression`` @@ -1204,31 +1343,29 @@ tools that use it. 
Finally, this results in the spec being significantly more complex and convoluted to understand and implement than the alternatives. The approach this PEP now takes, adding distinct ``license-expression`` and -``license-file`` keys and simply deprecating the whole ``license`` key, avoids +``license-files`` keys and simply deprecating the whole ``license`` key, avoids all the issues identified above, and results in a much clearer and cleaner design overall. It allows ``license`` and ``license-files`` to be tagged ``dynamic`` independently, separates two independent types of metadata (syntactically and semantically), restores a closer to 1:1 mapping of -PEP 621 keys to core metadata fields, automatically makes -``License-Expression`` exclusive of the deprecated and conflicting -``file`` and ``text`` subkeys, and reduces nesting by a level for both. +PEP 621 keys to core metadata fields, and reduces nesting by a level for both. Other than adding two extra keys to the file, there was no real apparent downside to this latter approach, so it was adopted for this PEP. -Define License Expression as String Value +Define license expression as string value ''''''''''''''''''''''''''''''''''''''''' A compromise approach between adding two new top-level keys for license expressions and files would be to add a separate ``license-files`` key, but re-using the ``license`` key for the license expression, either by defining it as the (previously reserved) string value for the ``license`` -key, retaining the ``expression`` sub-key in the ``license`` table, or +key, retaining the ``expression`` subkey in the ``license`` table, or allowing both. Indeed, this would seem to have been envisioned by PEP 621 itself with this PEP in mind, in particular the first approach:: A practical string value for the license key has been purposefully left out - to allow for a future PEP to specify support for SPDX [6] expressions. + to allow for a future PEP to specify support for SPDX expressions. 
However, while a working draft temporarily explored this solution, it was ultimately rejected, as it shared most of the downsides identified with @@ -1244,7 +1381,7 @@ PEP 621, core metadata, tool file formats and the consensus in the discussion in not making the new license expression map to a corresponding new field, none of which was the case at the time PEP 621 was drafted. Finally, this would deny a clear separation from the old behavior by not -cleanly deprecating the entire ``license`` field, and increases the complexity +cleanly deprecating the entire ``license`` key, and increases the complexity of the specification and implementation. In addition to the aforementioned issues, this also requires deciding between @@ -1260,11 +1397,11 @@ and this lack of clarity, explicitness, ambiguity and potential for user confusion is exactly what this PEP seeks to avoid, all to save a few characters over other approaches. -If an ``expression`` key was added to the ``license`` table, it would retain -the clarity of a new top-level field, but add additional complexity for no +If an ``expression`` subkey was added to the ``license`` table, it would retain +the clarity of a new top-level key, but add additional complexity for no real benefit, with an extra level of nesting, and users and tools needing to -deal with the mutual exclusivity of the keys, as before. And allowing both -(as a table key *and* the string value) would inherit both's downsides, +deal with the mutual exclusivity of the subkeys, as before. And allowing both +(as a table subkey *and* the string value) would inherit both's downsides, while adding even more spec and tool complexity and making there more than "one obvious way to do it", further potentially confusing users. @@ -1274,18 +1411,18 @@ additional top-level key and (versus some approaches) a few extra characters to type. 
-Add a Type Key to Treat as Expression -''''''''''''''''''''''''''''''''''''' +Add a ``type`` key to treat as expression +''''''''''''''''''''''''''''''''''''''''' Instead of creating a new top-level ``license-expression`` key in the -PEP 621 source metadata, we could add a ``type`` key to the existing +PEP 621 source metadata, we could add a ``type`` subkey to the existing ``license`` table to control whether ``text`` (or a string value) is interpreted as free-text or a license expression. This could make backward compatibility a little more seamless, as older tools could ignore it and always treat ``text`` as ``license``, while newer tools would know to treat it as a license expression, if ``type`` was set appropriately. Indeed, PEP 621 suggests something of this sort as a possible alternative -way that SPDX expressions could be implemented. +way that SPDX license expressions could be implemented. However, all the same downsides as in the previous item apply here, including greater complexity, a more complex mapping between the project @@ -1300,14 +1437,14 @@ to specify the correct ``type`` to ensure their license expression is interpreted correctly, which adds work and potential for error; we could never safely change the default while being confident that users understand that what they are entering is unambiguously a license expression, -with all the false positive and fales negative issues as above. +with all the false positive and false negative issues as above. Therefore, for these as well as the same reasons this approach was rejected for the core metadata in favor of a distinct ``License-Expression`` field, we similarly reject this here.
-Must be Marked Dynamic to Back-Fill +Must be marked dynamic to back-fill ''''''''''''''''''''''''''''''''''' The ``license`` key in the ``pyproject.toml`` could be required to be @@ -1336,28 +1473,28 @@ apparently intended, and prevents tools from adapting to best practices (fill, don't fill, etc) as they develop and evolve over time. -PEP 621 License-Files Key -------------------------- +Source metadata ``license-files`` key +------------------------------------- -Alternatives considered for the ``License-Files`` key in the +Alternatives considered for the ``license-files`` key in the ``pyproject.toml`` project source metadata, primarily related to the path/glob type handling. -Add a Type Key to Control Path/Glob -''''''''''''''''''''''''''''''''''' +Add a ``type`` subkey to ``license-files`` +'''''''''''''''''''''''''''''''''''''''''' Instead of defining mutually exclusive ``paths`` and ``globs`` subkeys of the ``license-files`` PEP 621 project metadata key, we could -achieve the same effect with a ``files`` key for the list and -a ``type`` key for how to interpret it. However, the latter offers no +achieve the same effect with a ``files`` subkey for the list and +a ``type`` subkey for how to interpret it. However, the latter offers no real advantage over the former, in exchange for requiring more keystrokes, verbosity and complexity, as well as less flexibility in allowing both, -or another additional key in the future, as well as the need to bikeshed -over the key name. Therefore, it was summarily rejected. +or another additional subkey in the future, as well as the need to bikeshed +over the subkey name. Therefore, it was summarily rejected. -Only Accept Verbatim Paths +Only accept verbatim paths '''''''''''''''''''''''''' Globs could be disallowed completely as values to the ``license-files`` @@ -1365,8 +1502,8 @@ key in ``pyproject.toml`` and only verbatim literal paths allowed. 
This would ensure that all license files are explicitly specified, all specified license files are found and included, and the source metadata is completely static in the strictest sense of the term, without tools -having to inspect the rest of the package files to determine exactly -what license files will be included and what the ``License-Files`` values +having to inspect the rest of the project source files to determine exactly +what license files will be included and what the ``License-File`` values will be. This would also modestly simplify the spec and tool implementation. However, practicality once again beats purity here. Globs are supported and @@ -1389,14 +1526,14 @@ default value, as widely used by the current most popular tools, and thus be a major break to backward compatibility, tool consistency, and safe and sane default functionality to avoid unintentional license violations. And of course, authors are welcome and encouraged to specify their license -files explicitly via the ``files`` table key, once they are aware of it and +files explicitly via the ``paths`` table subkey, once they are aware of it and if it is suitable for their project and workflow. -Only Accept Glob Patterns +Only accept glob patterns ''''''''''''''''''''''''' -Conversely, all ``License-Files`` strings could be treated as glob patterns. +Conversely, all ``license-files`` strings could be treated as glob patterns. This would slightly simplify the spec and implementation, avoid an extra level of nesting, and more closely match the configuration format of existing tools. @@ -1414,33 +1551,33 @@ error should any be missing. This allows tools to locate them and know the exact values of the ``License-File`` core metadata fields without having to traverse the -source files of the project and match globs, potentially allowing easier, +source tree of the project and match globs, potentially allowing easier, more efficient and reliable inspection by tools. 
Therefore, given the relatively small cost and the significant benefits, this approach was not adopted. -Infer Whether Paths or Globs +Infer whether paths or globs '''''''''''''''''''''''''''' It was considered whether to simply allow specifying an array of strings -directly for the ``license-file`` key, rather than making it a table with +directly for the ``license-files`` key, rather than making it a table with explicit ``paths`` and ``globs``. This would be somewhat simpler and avoid an extra level of nesting, and more closely match the configuration format of existing tools. However, it was ultimately rejected in favor of separate, -mutually exclusive ``paths`` and ``globs`` table keys. +mutually exclusive ``paths`` and ``globs`` table subkeys. In practice, it only saves six extra characters in the ``pyproject.toml`` (``license-files = [...]`` vs ``license-files.globs = [...]``), but allows the user to more explicitly declare their intent, ensures they understand how -the field is going to be interpreted, and serves as an unambiguous indicator +the values are going to be interpreted, and serves as an unambiguous indicator for tools to parse them as globs rather than verbatim path literals. This, in turn, allows for more appropriate, clearly specified tool behaviors for each case, many of which would be unreliable or impossible without it, to avoid common traps, provide more helpful feedback and -behave more sensibly and intuitively overall. These include, with ``files``, +behave more sensibly and intuitively overall. These include, with ``paths``, guaranteeing that each and every specified file is included and immediately raising an error if one is missing, and with ``globs``, checking glob syntax, excluding unwanted backup, temporary, or other such files (as current tools @@ -1450,7 +1587,7 @@ reliance on heuristics to determine interpretation—the very thing this PEP seeks to avoid. 
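The different tool behaviors described for ``paths`` and ``globs`` can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the behavior of any existing build backend: ``resolve_license_files`` and ``EXCLUDED_SUFFIXES`` are hypothetical names, the list of excluded suffixes is made up, and the input is assumed to be the already-parsed ``license-files`` table from ``pyproject.toml``.

```python
import glob
import os
import tempfile

# Hypothetical suffixes for backup/temporary files; not defined by the PEP.
EXCLUDED_SUFFIXES = ("~", ".bak", ".tmp")


def resolve_license_files(project_root, license_files):
    """Resolve a parsed ``license-files`` table to project-relative paths."""
    has_paths = "paths" in license_files
    has_globs = "globs" in license_files
    if has_paths and has_globs:
        raise ValueError("'paths' and 'globs' are mutually exclusive")

    if has_paths:
        # Verbatim paths: every listed file must exist, else fail right away.
        missing = [
            path for path in license_files["paths"]
            if not os.path.isfile(os.path.join(project_root, path))
        ]
        if missing:
            raise FileNotFoundError(f"license files not found: {missing}")
        return list(license_files["paths"])

    # Glob patterns: match against the source tree and skip unwanted files.
    matched = set()
    root = glob.escape(project_root)
    for pattern in license_files.get("globs", []):
        for found in glob.glob(os.path.join(root, pattern)):
            if os.path.isfile(found) and not found.endswith(EXCLUDED_SUFFIXES):
                relative = os.path.relpath(found, project_root)
                matched.add(relative.replace(os.sep, "/"))
    return sorted(matched)


# Usage on a throwaway project tree.
with tempfile.TemporaryDirectory() as project:
    for name in ("LICENSE", "LICENSE~", "README.rst"):
        with open(os.path.join(project, name), "w") as handle:
            handle.write("placeholder\n")
    from_globs = resolve_license_files(project, {"globs": ["LICEN[CS]E*"]})
    from_paths = resolve_license_files(project, {"paths": ["LICENSE"]})
    try:
        resolve_license_files(project, {"paths": ["COPYING"]})
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```

Because the subkey states the intent explicitly, the function never has to guess whether a string such as ``LICENSE*`` is a literal file name or a pattern.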
-Also Allow a Flat Array Value +Also allow a flat array value ''''''''''''''''''''''''''''' Initially, after deciding to define ``license-files`` as a table of ``paths`` @@ -1477,10 +1614,10 @@ more than one obvious way to do it. Therefore, per PEP 20, the Zen of Python, this approach is hereby rejected. -Allow Both Paths and Globs Keys -''''''''''''''''''''''''''''''' +Allow both ``paths`` and ``globs`` subkeys +'''''''''''''''''''''''''''''''''''''''''' -Allowing both ``paths`` and ``globs`` keys to be specified under the +Allowing both ``paths`` and ``globs`` subkeys to be specified under the ``license-files`` table was considered, as it could potentially allow more flexible handling for particularly complex projects, and specify on a per-pattern rather than overall basis whether ``license-files`` entries @@ -1501,15 +1638,15 @@ being explicitly, statically specified, and others. Like the previous, if there is a clear need for it, it can be always allowed in the future in a backward-compatible manner (to the extent it is possible at all), while the same is not true of disallowing it. Therefore, it was -decided to require the two keys to be mutually exclusive. +decided to require the two subkeys to be mutually exclusive. -Rename Paths Subkey to Files -'''''''''''''''''''''''''''' +Rename ``paths`` subkey to ``files`` +'''''''''''''''''''''''''''''''''''' Initially, it was considered whether to name the ``paths`` subkey of the ``license-files`` table ``files`` instead. However, ``paths`` was ultimately -chosen, as calling the table key ``files`` resulted in duplication between +chosen, as calling the table subkey ``files`` resulted in duplication between the table name (``license-files``) and the subkey name (``files``), i.e. 
``license-files.files = ["LICENSE.txt"]``, made it seem like the preferred/ default subkey when it was not, and lacked the same parallelism with ``globs`` @@ -1517,11 +1654,11 @@ in describing the format of the string entry rather than what was being pointed to. -Must be Marked Dynamic to Use Defaults +Must be marked dynamic to use defaults '''''''''''''''''''''''''''''''''''''' It may seem outwardly sensible, at least with a particularly restrictive -interpretation of PEP 621 's description of the ``dynamic`` field, to +interpretation of PEP 621 's description of the ``dynamic`` list, to consider requiring the ``license-files`` key to be explicitly marked as ``dynamic`` in order for the default glob patterns to be used, or alternatively for license files to be matched and included at all. @@ -1547,17 +1684,17 @@ to the file, not defining the default as dynamic allows authors to clearly and unambiguously indicate when their build/packaging tools are going to be handling the inclusion of license files themselves rather than strictly conforming to the PEP 621 portions of this PEP; to do otherwise would defeat -the primary purpose of the ``dynamic`` field as a marker and escape hatch. +the primary purpose of the ``dynamic`` list as a marker and escape hatch. -License File Paths +License file paths ------------------ Alternatives related to the paths and locations of license files in the source and built distributions. -Flatten License Files in Subdirectories +Flatten license files in subdirectories ''''''''''''''''''''''''''''''''''''''' Previous drafts of this PEP were silent on the issue of handling license files @@ -1592,7 +1729,7 @@ a followup proposal rooting the license files under a ``license_files`` subdirectory eliminates both collisions and the clutter problem entirely. 
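To make the layout concrete, the mapping from a ``License-File`` value to its location inside a wheel, with the source subdirectory structure preserved under a ``license_files`` root in ``.dist-info``, could be sketched as below. The ``wheel_license_path`` helper is hypothetical, it assumes the project name and version are already in their normalized form, and the rejection of absolute paths and ``..`` components is an added assumption rather than something taken from the text above.

```python
import posixpath


def wheel_license_path(name, version, license_file):
    """Return the archive path of a license file inside a wheel.

    Hypothetical helper: ``license_file`` is a project-relative path using
    forward slashes, as it would appear in a ``License-File`` field.
    """
    # Assumption: refuse paths that could escape the license_files root.
    if posixpath.isabs(license_file) or ".." in license_file.split("/"):
        raise ValueError(f"not a project-relative path: {license_file!r}")
    dist_info = f"{name}-{version}.dist-info"
    # The source subdirectory layout is kept, so equal file names in
    # different directories cannot collide.
    return posixpath.join(dist_info, "license_files", license_file)


top_level = wheel_license_path("setuptools", "59.1.1", "LICENSE")
vendored = wheel_license_path(
    "setuptools", "59.1.1", "setuptools/_vendor/packaging/LICENSE.BSD"
)
```

Two files that are both named ``LICENSE`` but live in different source directories end up at different archive paths, which is the collision-free property the rooted layout is meant to give.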
-Resolve Name Conflicts Differently +Resolve name conflicts differently '''''''''''''''''''''''''''''''''' Rather than preserving the source directory structure for license files @@ -1608,8 +1745,8 @@ favor of the simpler and more obvious solution of just preserving the source subdirectory layout, as many stakeholders have already advocated for. -Dump Directly in Dist-Info -'''''''''''''''''''''''''' +Dump directly in ``.dist-info`` +''''''''''''''''''''''''''''''' Previously, the included license files were stored directly in the top-level ``.dist-info`` directory of built wheels and installed projects. This followed @@ -1658,8 +1795,8 @@ license files in subdirs are handled, as well as other things). Therefore, the latter has been incorporated into current drafts of this PEP. -Add New Licenses Category to Wheel -'''''''''''''''''''''''''''''''''' +Add new ``licenses`` category to wheel +'''''''''''''''''''''''''''''''''''''' Instead of defining a root license directory (``license_files``) inside the core metadata directory (``.dist-info``) for wheels, we could @@ -1689,34 +1826,34 @@ departure from existing practice and would lead to more inconsistent license install locations from wheels of different versions. Finally, this would mean licenses were not installed as proximately to their associated code, there would be more variability in the license root path -across platforms and between built and installed packages, accessing -installed licenses pro grammatically would be more non-trivial, and a +across platforms and between built distributions and installed projects, +accessing installed licenses programmatically would be more non-trivial, and a suitable install location and method would need to be created, discussed and decided that would avoid name clashes. Therefore, to keep this PEP in scope, the current approach was retained.
-Name the Subdirectory Licenses -'''''''''''''''''''''''''''''' +Name the subdirectory ``licenses`` +'''''''''''''''''''''''''''''''''' Both ``licenses`` and ``license_files`` have been suggested as potential names for the root license directory inside ``.dist-info`` of wheels and installed projects. The former is slightly shorter, but the latter is more clear and unambiguous regarding its contents, and is consistent with the name of the core metadata field (``License-File``) and the PEP 621 -project source metadata key (``License-Files``). Therefore, the latter +project source metadata key (``license-files``). Therefore, the latter was chosen instead. -Other Ideas +Other ideas ----------- Miscellaneous proposals, possibilities and discussion points that were ultimately not adopted. -Map Identifiers to License Files +Map identifiers to license files '''''''''''''''''''''''''''''''' This would require using a mapping (two parallel lists would be too prone to @@ -1742,7 +1879,7 @@ before using it in code and users are confused about when to use a list or a string. -Map Identifiers to Source Files +Map identifiers to source files ''''''''''''''''''''''''''''''' File-level notices are not considered as part of the scope of this PEP and the @@ -1750,8 +1887,8 @@ existing ``SPDX-License-Identifier`` [#spdxids]_ convention can be used and may not need further specification as a PEP. -Don't Require Compatibility with a Specific SPDX Version -'''''''''''''''''''''''''''''''''''''''''''''''''''''''' +Don't freeze compatibility with a specific SPDX version +''''''''''''''''''''''''''''''''''''''''''''''''''''''' This PEP could omit specifying a specific SPDX specification version, or one for the list of valid license identifiers, which would allow @@ -1773,14 +1910,14 @@ licenses, but also remain backwards compatible with the version specified here, balancing flexibility and compatibility. 
-Different Licenses for Source and Binary Distributions +Different licenses for source and binary distributions '''''''''''''''''''''''''''''''''''''''''''''''''''''' As an additional use case, it was asked whether it was in scope for this PEP to handle cases where the license expression for a binary distribution (wheel) is different from that for a source distribution (sdist), such as in cases of non-pure-Python packages that compile and bundle binaries -under different licenses than the package itself. An example cited was +under different licenses than the project itself. An example cited was PyTorch [#pytorch]_, which contains CUDA from Nvidia, which is freely distributable but not open source. NumPy [#numpyissue]_ and SciPy [#scipyissue]_ also had similar issues, as reported by the original author @@ -1789,7 +1926,7 @@ of this PEP and now resolved for those cases. However, given the inherent complexity here and a lack of an obvious mechanism to do so, the fact that each wheel would need its own license information, lack of support on PyPI for exposing license info on a -per-distribution basis, and the relatively niche use case, it was +per-distribution archive basis, and the relatively niche use case, it was determined to be out of scope for this PEP, and left to a future PEP to resolve if sufficient need and interest exists, and an appropriate mechanism can be found. @@ -1798,11 +1935,11 @@ mechanism can be found. Open Issues =========== -Should the License Field be Back-Filled, or Mutually Exclusive? ---------------------------------------------------------------- +Should the ``License`` field be back-filled, or mutually exclusive? 
+------------------------------------------------------------------- At present, this PEP explicitly allows, but does not formally recommend or -require, tools to back-fill the ``License`` core metadata field with +require, build tools to back-fill the ``License`` core metadata field with the verbatim text from the ``License-Expression`` field. This would presumably improve backwards compatibility and was suggested by some on the Discourse thread. On the other hand, allowing it does @@ -1827,17 +1964,17 @@ Or should this not be explicitly allowed, discouraged or even prohibited? Appendix 1. License Expression Examples ======================================= -Basic Example +Basic example ------------- The Setuptools project itself, as of version 59.1.1 [#setuptools5911]_, -does not use the ``License`` field in its own project metadata. -Further, it not longer explictly specifies ``license_file``/``license_files`` -as it did previously, since ``setuptools`` relies on its own automatic +does not use the ``License`` field in its own project source metadata. +Further, it no longer explicitly specifies ``license_file``/``license_files`` +as it did previously, since Setuptools relies on its own automatic inclusion of license-related files matching common patterns, such as the ``LICENSE`` file it uses. 
-It only includes the following license-related metadata in its ``setup.cfg``:: +It includes the following license-related metadata in its ``setup.cfg``:: [metadata] classifiers = @@ -1853,7 +1990,7 @@ Or, in a PEP 621 ``pyproject.toml``:: [project] license-expression = "MIT" -The output core metadata for the package would then be:: +The output core metadata for the distribution packages would then be:: License-Expression: MIT License-File: LICENSE @@ -1862,13 +1999,13 @@ The ``LICENSE`` file would be stored at ``/setuptools-{version}/LICENSE`` in the sdist and ``/setuptools-{version}.dist-info/license_files/LICENSE`` in the wheel, and unpacked from there into the site directory (e.g. ``site-packages``) on installation; ``/`` is the root of the respective archive -and ``{version}`` the version of the Setuptools project in the core metadata. +and ``{version}`` the version of the Setuptools release in the core metadata. -Advanced Example +Advanced example ---------------- -Suppose Setuptools were to include the licenses of the third-party packages +Suppose Setuptools were to include the licenses of the third-party projects that are vendored in the ``setuptools/_vendor/`` and ``pkg_resources/_vendor`` directories; specifically:: @@ -1877,14 +2014,14 @@ directories; specifically:: ordered-set==3.1.1 more_itertools==8.8.0 -The license expressions for these packages are:: +The license expressions for these projects are:: packaging: Apache-2.0 OR BSD-2-Clause pyparsing: MIT ordered-set: MIT more_itertools: MIT -A comprehensive license expression covering both ``setuptools`` +A comprehensive license expression covering both Setuptools proper and its vendored dependencies would contain these metadata, combining all the license expressions into one.
Such an expression might be:: In addition, per the requirements of the licenses, the relevant license files must be included in the package. Suppose the ``LICENSE`` file contains the text -of the MIT license and the copyrights used by ``setuptools``, ``pyparsing``, +of the MIT license and the copyrights used by Setuptools, ``pyparsing``, ``more_itertools`` and ``ordered-set``; and the ``LICENSE`` files in the ``setuptools/_vendor/packaging/`` directory contain the Apache 2.0 and 2-clause BSD license text, and the Packaging copyright statement and @@ -1918,7 +2055,7 @@ Putting it all together, our ``setup.cfg`` would be:: setuptools/_vendor/packaging/LICENSE.BSD In a PEP 621 ``pyproject.toml``, with license files specified explicitly -via the ``paths`` key, this would look like:: +via the ``paths`` subkey, this would look like:: [project] license-expression = "MIT AND (Apache-2.0 OR BSD-2-Clause)" @@ -1938,7 +2075,8 @@ Or alternatively, matched via glob patterns, this could be:: "setuptools/_vendor/LICENSE*", ] -With either approach, the resulting core metadata would be:: +With either approach, the output core metadata in the distribution +would be:: License-Expression: MIT AND (Apache-2.0 OR BSD-2-Clause) License-File: LICENSE @@ -1947,7 +2085,7 @@ With either approach, the resulting core metadata would be:: License-File: setuptools/_vendor/packaging/LICENSE.BSD In the resulting sdist, with ``/`` as the root of the archive and ``{version}`` -the version of the Setuptools project specified in the core metadata, +the version of the Setuptools release specified in the core metadata, the license files would be located at the paths:: /setuptools-{version}/LICENSE @@ -1972,19 +2110,20 @@ and ``{version}`` as above, the license files would be installed to:: site-packages/setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE.BSD -Conversion Example +Conversion example ------------------ -Suppose we were to return to our simple ``setuptools`` 
case. +Suppose we were to return to our simple Setuptools case. Per the specification, given it only has the following license classifier:: Classifier: License :: OSI Approved :: MIT License -And no value for the ``License`` field; or, equivalently, a value of:: +And no value for the ``License`` field, or equivalently, if it had a +value of:: License: MIT -Then the suggested value for a ``License-Expression`` field would be:: +Then the suggested value for the ``License-Expression`` field would be:: License-Expression: MIT @@ -1994,7 +2133,7 @@ inherent ambiguity, and the user would be prompted on how to handle the situation themselves. -Expression Examples +Expression examples ------------------- Some additional examples of valid ``License-Expression`` values:: @@ -2003,6 +2142,8 @@ Some additional examples of valid ``License-Expression`` values:: License-Expression: BSD-3-Clause + License-Expression: MIT AND (Apache-2.0 OR BSD-2-clause) + License-Expression: MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause) License-Expression: GPL-3.0-only WITH Classpath-Exception-2.0 OR BSD-3-Clause @@ -2015,18 +2156,18 @@ Some additional examples of valid ``License-Expression`` values:: Appendix 2. License Documentation in Python =========================================== -There are multiple ways used or recommended to document Python package +There are multiple ways used or recommended to document Python project licenses today. The most common are listed below. -Core Metadata +Core metadata ------------- There are two overlapping core metadata fields to document a license: the -license-related ``Classifier`` strings [#classif]_ prefixed with ``License ::`` +license ``Classifier`` strings [#classif]_ prefixed with ``License ::`` and the ``License`` field as free text [#licfield]_. 
-The core metadata documentation ``License`` field documentation is currently:: +The core metadata ``License`` field documentation is currently:: License ======= @@ -2050,20 +2191,20 @@ The core metadata documentation ``License`` field documentation is currently:: Even though there are two fields, it is at times difficult to convey anything but simpler licensing. For instance, some classifiers lack precision -(GPL without a version) and when multiple license-related classifiers are +(GPL without a version) and when multiple license classifiers are listed, it is not clear if both licenses must apply, or the user may choose -between them. Furthermore, the list of available license-related classifiers +between them. Furthermore, the list of available license classifiers is often out-of-date. -Setuptools and Wheels ---------------------- +Setuptools and Wheel +-------------------- Beyond a license code or qualifier, license text files are documented and included in a built package either implicitly or explicitly and this is another possible source of confusion: -- In Setuptools [#setuptoolssdist]_ and wheels [#wheels]_, license files +- In Setuptools [#setuptoolssdist]_ and Wheel [#wheels]_, license files are automatically added to the distribution (at their source location in a source distribution/sdist, and in the ``.dist-info`` directory of a built wheel) if they match one of a number of common license file @@ -2071,7 +2212,7 @@ possible source of confusion: Alternatively, a package author can specify a list of license file paths to include in the built wheel under the ``license_files`` key in the ``[metadata]`` section of the project's ``setup.cfg``, or as an argument - to the ``setuptools`` ``setup()`` function. At present, following wheel's + to the ``setuptools.setup()`` function. 
At present, following Wheel's lead, Setuptools flattens the collected license files into the metadata directory, clobbering files with the same name, and dumps license files directly into the top-level ``.dist-info`` directory, but there is a desire @@ -2083,15 +2224,15 @@ possible source of confusion: has been deprecated for some time but still sees some use. See [#pipsetup]_ for instance. -- Following the publication of an earlier draft of this PEP, ``setuptools`` - added support for ``License-File`` in package metadata as described herein - [#setuptoolspep639]_. This allows other tools consuming the resulting +- Following the publication of an earlier draft of this PEP, Setuptools + added support for ``License-File`` in distribution metadata as described + herein [#setuptoolspep639]_. This allows other tools consuming the resulting metadata to unambiguously locate the license file(s) for a given package. -**Note:** the ``License-File`` field proposed in this PEP already exists in -``wheel`` and ``setuptools`` with the same behaviour as explained above. +**Note:** The ``License-File`` field proposed in this PEP already exists in +Wheel and Setuptools with the same behaviour as explained above. This PEP is only recognizing and documenting the existing practice as used -in ``wheel`` and ``setuptools`` to add license files to the distribution, +in Wheel and Setuptools to add license files to the distribution, and formally including their paths in core metadata (which has since been implemented on the basis of a draft of this PEP). @@ -2108,20 +2249,20 @@ following existing practice formally specified by this PEP. Both the beginner packaging tutorial [#packagingtutkey]_ and the sample project [#samplesetuppy]_ only use classifiers to declare a package's license, and do -not include or mention the ``license`` field. The full packaging guide does +not include or mention the ``License`` field. 
The full packaging guide does mention this field, but states that authors should use the license classifiers instead, unless the project uses a non-standard license (which the guide discourages) [#licfield]_. -Python Source Code Files +Python source code files ------------------------ **Note:** Documenting licenses in source code is not in the scope of this PEP. Beside using comments and/or ``SPDX-License-Identifier`` conventions, the license is sometimes documented in Python code files using "dunder" variables typically -named after one of the lower cased Core Metadata fields such as ``__license__`` +named after one of the lower cased core metadata fields such as ``__license__`` [#pycode]_. This convention (dunder global variables) is recognized by the built-in ``help()`` @@ -2129,17 +2270,17 @@ function and the standard ``pydoc`` module. The dunder variable(s) will show up the ``help()`` DATA section for a module. -Other Python Packaging Tools +Other Python packaging tools ---------------------------- - Conda package manifests [#conda]_ have support for ``license`` and ``license_file`` fields, and automatically include license files - following similar naming patterns as ``wheel`` and ``setuptools``. + following similar naming patterns as Wheel and Setuptools. - Flit [#flit]_ recommends using classifiers instead of the ``License`` field (per the current PyPA packaging guide). -- PBR [#pbr]_ uses similar data as setuptools, but always stored in +- PBR [#pbr]_ uses similar data as Setuptools, but always stored in ``setup.cfg``. - Poetry [#poetry]_ specifies the use of the ``license`` field in @@ -2152,7 +2293,7 @@ Appendix 3. License Documentation in Other Projects Here is a survey of how things are done elsewhere. -Linux Distribution Packages +Linux distribution packages --------------------------- **Note:** in most cases the license texts of the most common licenses are included @@ -2207,7 +2348,7 @@ globally once in a shared documentation directory (e.g. 
``/usr/share/doc``). license field. -Language and Application Packages +Language and application packages --------------------------------- - In Java, Maven POM [#maven]_ defines a ``licenses`` XML tag with a list of license @@ -2282,7 +2423,7 @@ Language and Application Packages license expression syntax. -Other Ecosystems +Other ecosystems ---------------- - ``SPDX-License-Identifier`` [#spdxids]_ is a simple convention to document the @@ -2318,11 +2459,13 @@ Other Ecosystems References ========== +.. [#pypugglossary] https://packaging.python.org/glossary/ .. [#cms] https://packaging.python.org/specifications/core-metadata .. [#projectspec] https://packaging.python.org/specifications/declaring-project-metadata/ .. [#sdistspec] https://packaging.python.org/specifications/source-distribution-format/ .. [#wheelspec] https://packaging.python.org/specifications/binary-distribution-format/ .. [#installedspec] https://packaging.python.org/specifications/recording-installed-packages/ +.. [#wheelproject] https://wheel.readthedocs.io/en/stable/ .. [#cdstats] https://clearlydefined.io/stats .. [#cd] https://clearlydefined.io .. [#osi] https://opensource.org @@ -2337,7 +2480,8 @@ References .. [#globmodule] https://docs.python.org/3/library/glob.html .. [#spdxlist] https://spdx.org/licenses/ .. [#spdx] https://spdx.dev/ -.. [#spdx22] https://spdx.github.io/spdx-spec/SPDX-license-expressions/ +.. [#spdxid] https://spdx.dev/ids/ +.. [#spdxpression] https://spdx.github.io/spdx-spec/SPDX-license-expressions/ .. [#wheels] https://github.com/pypa/wheel/blob/0.37.0/docs/user_guide.rst#including-license-files-in-the-generated-wheel-file .. [#reuse] https://reuse.software/ .. [#licexp] https://github.com/nexB/license-expression/ From 1bec8c2084d8ffee157cc2f7f9d6d38c45a747c3 Mon Sep 17 00:00:00 2001 From: "C.A.M. 
Gerlach" Date: Sun, 28 Nov 2021 18:05:23 -0600 Subject: [PATCH 16/19] PEP 639: Make refs inline per #2130, add links, fix others & refine text --- pep-0639.rst | 773 ++++++++++++++++++++++++++------------------------- 1 file changed, 400 insertions(+), 373 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index d229cd3e9c0..32fe94696e4 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -19,8 +19,9 @@ Abstract ======== This PEP defines a specification for how licenses are documented in the -core metadata via a new ``License-Expression`` field, with license expression -strings using SPDX identifiers [#spdxlist]_. +`core metadata <#coremetadataspec_>`_ via a new ``License-Expression`` field, +with `license expression strings `_ using +`SPDX identifiers <#spdxid_>`_. This will make license declarations simpler and less ambiguous for: @@ -30,21 +31,35 @@ This will make license declarations simpler and less ambiguous for: The PEP also: -- Formally specifies a new ``License-File`` field and how license files - should be included in distributions, as already used by Wheel and Setuptools. +- `Formally specifies `_ a new ``License-File`` field + and how license files should be + `included in distributions `_, + as already used by Wheel and Setuptools. -- Deprecates the legacy ``License`` field and ``license ::`` classifiers. +- `Deprecates `_ the legacy ``License`` field + and ``license ::`` classifiers. -- Provides clear guidance for authors and tools converting legacy license - metadata, adding license files and validating license expressions. +- `Adds and deprecates `_ corresponding keys in the + PEP 621 project source metadata format. -- Adds and deprecates corresponding keys in the PEP 621 - project source metadata format. +- `Provides clear guidance `_ for authors and + tools converting legacy license metadata, adding license files and + validating license expressions. 
-The changes in this PEP will update the core metadata to version 2.3, -modify the PEP 621 project metadata specification, and make minor additions to -the source distribution (sdist), built distribution (wheel) and installed -project standards. +- Discusses `user scenarios `_, + describes a `reference implementation `_, + analyzes numerous `potential alternatives `_, + includes `detailed examples `_ and + surveys license documentation + `in Python packaging `_ and + `other ecosystems `_. + +The changes in this PEP will update the +`core metadata <#coremetadataspec_>`_ to version 2.3, modify the +`PEP 621 project metadata specification <#pep621spec_>`_, +and make minor additions to the `source distribution (sdist) <#sdistspec_>`_, +`built distribution (wheel) <#wheelspec_>`_ and +`installed project <#installedspec_>`_ standards. Goals @@ -60,26 +75,15 @@ The changes to the core metadata specification that this PEP requires have been designed to minimize impact and maximize backward compatibility. This specification builds off of existing ways to document licenses that are already in use in popular tools (e.g. adding support to core metadata for -the ``License-File`` field already used in Wheel and Setuptools) -and by some package authors (e.g. storing an SPDX license expression -in the existing ``License`` field). - -In addition to these proposed changes, this PEP contains: - -- Recommendations for publishing tools on how to validate the new - ``License-Expression`` field, add license files to distributions, and and - warn on and convert legacy metadata - (the ``License`` field and license classifiers). - -- Simplified guidance for package authors on how to handle license files - and expressions for various common situations. +the ``License-File`` field `already used `_ in +Wheel and Setuptools) and by some package authors (e.g. storing an +SPDX license expression in the existing ``License`` field). 
-- A detailed summary of related proposed ideas and alternate approaches, and - why they were or were not incorporated into the current version of this PEP - -- Informational appendices that contain surveys of how we document licenses - today in Python packages and elsewhere, and a reference Python library to - parse, validate and build correct license expressions. +In addition to these proposed changes, this PEP contains guidance for tools +handling and converting these metadata, a tutorial for package authors +covering various common use cases, detailed examples of them in use, +and a comprehensive survey of license documentation in Python and other +languages. It is the intent of the PEP authors to work closely with tool maintainers to implement the recommendations for validation and warnings specified here. @@ -106,9 +110,10 @@ to use a new formally-specified and supported mechanism, and provide guidance for packaging tools on how to handle the transition and inform users accordingly. This PEP is not about license documentation in files inside projects, -though this is a surveyed topic in the appendix, and nor does it currently -intend to cover cases where the source and binary distribution packages -have different licenses. +though this is a `surveyed topic `_ in the appendix, +and nor does it intend to cover cases where the source and +binary distribution packages don't have +`the same licenses `_. Possible future PEPs @@ -118,7 +123,7 @@ It is the intention of the authors of this PEP to consider the submission of related but separate PEPs in the future, which may include: - Removing the deprecated ``License`` field and license classifiers - from the Core Metadata specification. + from the core metadata specification. - Making the ``License-Expression`` and ``License-File`` fields mandatory for publishing tools and PyPI packages. @@ -139,28 +144,31 @@ for package authors and users. 
Several package authors have expressed difficulty and/or frustrations due to the limited capabilities to express licensing in project metadata. This also applies to Linux and BSD distribution packagers. This has triggered several -license-related discussions and issues, in particular: - -- https://github.com/pypa/trove-classifiers/issues/17 -- https://github.com/pypa/interoperability-peps/issues/46 -- https://github.com/pypa/packaging-problems/issues/41 -- https://github.com/pypa/wheel/issues/138 -- https://github.com/pombredanne/spdx-pypi-pep/issues/1 +license-related discussions and issues, including on +`outdated and ambiguous PyPI classifiers <#classifierissue_>`_, +`license interoperability with other ecosystems <#interopissue_>`_, +`too many confusing/limited license metadata options <#packagingissue_>`_, +`limited Wheel support for license files <#wheelfiles_>`_, and +`the lack of clear, precise and standardized license metadata <#pepissue_>`_. On average, Python packages tend to have more ambiguous, or missing, license information than other common application package formats (such as npm, Maven or -Gem) as can be seen in the statistics [#cdstats]_ page of the ClearlyDefined -[#cd]_ project that cover all packages from PyPI, Maven, npm and Rubygems. -ClearlyDefined is an open source project to help improve clarity of other open -source projects that is incubating at the OSI (Open Source Initiative) [#osi]_. +Gem) as can be seen in the `statistics page <#cdstats_>`_ of the +`ClearlyDefined project <#clearlydefined_>`_ that cover all packages from +PyPI, Maven, npm and Rubygems. ClearlyDefined is an open source project +to help improve clarity of other open source projects that is incubating at +the `Open Source Initiative <#osi_>`_. 
Rationale ========= -A mini-survey of existing license metadata definitions in use in the Python -ecosystem today and documented in several other system/distro and application -package formats is provided in Appendix 2 of this PEP. +A survey of existing license metadata definitions in use in the Python +ecosystem today is provided in +`Appendix 2 `_ of this PEP, +and license documentation in a variety of other packaging systems, +Linux distros, language ecosystems and applications is surveyed in +`Appendix 3 `_. There are a few takeaways from the survey: @@ -236,11 +244,11 @@ This PEP seeks to clearly define the terms it uses, specifically those that: - Are new concepts introduced here (license expression/identifier). -Whenever available, definitions are excepted from the PyPA PyPUG Glossary -[#pypugglossary]_ and SPDX [#spdx]_. Terms are listed here in their full -versions; related words (``Rel:``) are in parenthesis, including short forms -(``Short:``), sub-terms (``Sub:``) and common synonyms for the purposes of -this PEP (``Syn:``). +Whenever available, definitions are excerpted from the +`PyPA PyPUG Glossary <#pypugglossary_>`_ and `SPDX <#spdx_>`_. Terms are listed +here in their full versions; related words (``Rel:``) are in parentheses, +including short forms (``Short:``), sub-terms (``Sub:``) and common synonyms +for the purposes of this PEP (``Syn:``). **Built Distribution** *(Syn: Binary Distribution/Wheel)* A Distribution format containing files and metadata that only need to be @@ -254,12 +262,12 @@ this PEP (``Syn:``). and **wheel** (the format). **Core Metadata** *(Syn: Package Metadata, Sub: Distribution Metadata)* - The PyPA specification [#cms]_ and the set of metadata fields it defines that - describe key static attributes of distribution packages and installed - projects. 
+ The `PyPA specification <#coremetadataspec_>`_ and the set of metadata fields + it defines that describe key static attributes of distribution packages + and installed projects. **Distribution metadata** refers to, more specifically, the concrete form - core metadata takes when included inside a distribution package + core metadata takes when included inside a distribution archive (``PKG-INFO`` in a sdist and ``METADATA`` in a wheel) or installed project (``METADATA``). @@ -278,18 +286,18 @@ this PEP (``Syn:``). specifically references the physical **distribution archive**. **License Classifier** - A PyPI Trove classifier [#classif]_ as originally defined in PEP 301 which - begins with ``License ::``, currently used to indicate a project's + A `PyPI Trove classifier <#classifiers_>`_ (as originally defined in PEP 301) + which begins with ``License ::``, currently used to indicate a project's license status by including it as a ``Classifier`` in the core metadata. **License Expression** *(Syn: SPDX Expression)* - A string with valid SPDX license expression syntax [#spdxpression]_ + A string with valid `SPDX license expression syntax <#spdxpression_>`_ including any SPDX license identifiers as defined here, which describes a project's license(s) and how they relate to one another. Examples: ``GPL-3.0-or-later``, ``MIT AND (Apache-2.0 OR BSD-2-clause)`` **License Identifier** *(Syn: License ID/SPDX Identifier)* - A valid SPDX short-form license identifier [#spdxid]_, as described in the + A valid `SPDX short-form license identifier <#spdxid_>`_, as described in the `Add License-Expression field`_ section of this PEP; briefly, this includes all valid SPDX identifiers and the ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings. Examples: ``MIT``, ``GPL-3.0-only`` @@ -304,8 +312,8 @@ this PEP (``Syn:``). 
Here, a **project source tree** refers to the on-disk format of a project used for development, while an **installed project** is the form a - project takes once installed from a distribution, as specified by PyPA - [#installedspec]_. + project takes once installed from a distribution, as + `specified by PyPA <#installedspec_>`_. **Project Source Metadata** *(Sub: PEP 621 Metadata, Key, Subkey)* Core metadata defined by the package author in the project source tree, @@ -314,15 +322,15 @@ this PEP (``Syn:``). build tools. **PEP 621 metadata** refers specifically to the former, as defined by the - PyPA Declaring Project Metadata specification [#projectspec]_. + `PyPA Declaring Project Metadata specification <#pep621spec_>`_. A **PEP 621 metadata key**, or an unqualified *key* refers specifically to a top-level ``[project]`` key (notably, *not* a core metadata *field*), while a **subkey** refers to a second-level key in a table-valued PEP 621 key. **Source Distribution** *(Short: sdist)* - Here, specifically refers to a source distribution (**sdist**) as specified - by PyPA [#sdistspec]_. + Here, specifically refers to a source distribution (**sdist**) as + `specified by PyPA <#sdistspec_>`_. **Tool** *(Sub: Packaging Tool, Build Tool, Install Tool, Publishing Tool)* A program, script or service executed by the user or automatically that @@ -346,34 +354,42 @@ this PEP (``Syn:``). **Wheel Format** *(Short: wheel, Rel: Wheel project)* Here, **wheel**, the standard built distribution format introduced in PEP 427 - and specified by PyPA [#wheelspec]_, will be referred to in lowercase, - while the **Wheel project** [#wheelproject]_, its reference implementation, - which will be referred to as "Wheel" in Title Case. + and `specified by PyPA <#wheelspec_>`_, will be referred to in lowercase, + while the `Wheel project <#wheelproject_>`_, its reference implementation, + which will be referred to as **Wheel** in Title Case. 
Specification ============= The changes necessary to implement the improved license handling outlined in -this PEP include those in both author-provided project source metadata, as -specified in PEP 621, and distribution package metadata, as defined in the core -metadata specification [#cms]_. Furthermore, requirements are needed for -tools handling and converting legacy license metadata to license expressions, -to ensure the results are consistent, correct and unambiguous. Finally, minor -additions to the source distribution (sdist), built distribution (wheel) -and installed project specifications will help document and clarify the -already allowed, now formally standardized behavior in these respects. +this PEP include those in both +`distribution package metadata `_, as defined in the +`core metadata specification <#coremetadataspec_>`_, and +`author-provided project source metadata `_, as +originally defined in PEP 621. + +Further, `minor additions `_ to the +source distribution (sdist), built distribution (wheel) and installed project +specifications will help document and clarify the already allowed, +now formally standardized behavior in these respects. +Finally, `guidance is established `_ +for tools handling and converting legacy license metadata to license +expressions, to ensure the results are consistent, correct and unambiguous. Core metadata ------------- -The PyPA Core Metadata specification [#cms]_ defines the names and -semantics of each of the supported fields in the distribution metadata of +The `PyPA Core Metadata specification <#coremetadataspec_>`_ defines the names +and semantics of each of the supported fields in the distribution metadata of Python distribution packages and installed projects. 
-This PEP adds the ``License-Expression`` and ``License-File`` fields, -deprecates the ``License`` field, and deprecates the license classifiers +This PEP `adds `_ the +``License-Expression`` field, +`adds `_ the ``License-File`` field, +`deprecates `_ the ``License`` field, +and `deprecates `_ the license classifiers in the ``Classifier`` field. The error and warning guidance in this section applies to build and @@ -388,24 +404,24 @@ Add ``License-Expression`` field '''''''''''''''''''''''''''''''' The ``License-Expression`` optional field is specified to contain a text string -that is a valid SPDX license expression, defined below. +that is a valid SPDX license expression, as defined herein. Publishing tools SHOULD issue an informational warning if this field is missing, and MAY raise an error. Build tools MAY issue a similar warning, but MUST NOT raise an error. A license expression is a string using the SPDX license expression syntax as -documented in the SPDX specification [#spdxpression]_ using either Version 2.2 -or a later compatible version. +documented in the `SPDX specification <#spdxpression_>`_ using either +Version 2.2 or a later compatible version. When used in the ``License-Expression`` field and as a specialization of the SPDX license expression definition, a license expression can use the following license identifiers: -- Any SPDX-listed license short-form identifiers that are published in the SPDX - License List [#spdxlist]_, version 3.15 or any later compatible version. - Note that the SPDX working group never removes any license identifiers; - instead, they may choose to mark an identifier as "deprecated". +- Any SPDX-listed license short-form identifiers that are published in the + `SPDX License List <#spdxlist_>`_, version 3.15 or any later compatible + version. Note that the SPDX working group never removes any license + identifiers; instead, they may choose to mark an identifier as "deprecated". 
- The ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings to identify licenses that are not included in the SPDX license list. @@ -421,7 +437,7 @@ a valid license expression, build and publishing tools: - SHOULD report an informational warning, and publishing tools MAY raise an error if one or more license identifiers have been marked as deprecated in - the SPDX License List [#spdxlist]_. + the `SPDX License List <#spdxlist_>`_. - MUST store a case-normalized version of the ``License-Expression`` field using the reference case for each SPDX license identifier and @@ -432,11 +448,12 @@ a valid license expression, build and publishing tools: ``License-Expression`` field contents. For all newly-uploaded distributions that include a -``License-Expression`` field, the Python Package Index (PyPI) [#pypi]_ MUST +``License-Expression`` field, the `Python Package Index (PyPI) <#pypi_>`_ MUST validate that it contains a valid, case-normalized license expression with -valid identifiers (as defined above) and MUST reject uploads that do not +valid identifiers (as defined here) and MUST reject uploads that do not validate. PyPI MAY reject an upload for using a deprecated license identifier, -so long as it was deprecated as of the above SPDX License List version. +so long as it was deprecated as of the above-mentioned SPDX License List +version. Add ``License-File`` field @@ -455,7 +472,7 @@ metadata, that file MUST be included in the distribution at the specified path relative to the root license directory, and MUST be installed with the distribution at that same relative path. -The root license directory is defined to be the project root directory +The **root license directory** is defined to be the project root directory for source trees and source distributions, and the ``license_files`` subdirectory of the directory containing the core metadata (i.e. 
the ``.dist-info`` directory containing the ``METADATA`` file), for built @@ -496,9 +513,10 @@ informing users it is deprecated and recommending ``License-Expression`` instead. For all newly-uploaded distributions that include a -``License-Expression`` field, the Python Package Index (PyPI) [#pypi]_ MUST +``License-Expression`` field, the `Python Package Index (PyPI) <#pypi_>`_ MUST reject any that specify a ``License`` field and the text of which is not -identical to that of ``License-Expression``, as defined above. +identical to that of ``License-Expression``, as +`defined above `_. Along with license classifiers, the ``License`` field may be removed from a new version of the specification in a future PEP. @@ -507,13 +525,13 @@ new version of the specification in a future PEP. Deprecate license classifiers ''''''''''''''''''''''''''''' -Including license classifiers [#classif]_ in the ``Classifier`` field +Including license `classifiers <#classifiers_>`_ in the ``Classifier`` field (described in PEP 301) is deprecated and replaced by the more precise ``License-Expression`` field. If the ``License-Expression`` field is present, build tools SHOULD and publishing tools MUST raise an error if one or more license classifiers -(as defined above) is included in a ``Classifier`` field, and MUST NOT add +is included in a ``Classifier`` field, and MUST NOT add such classifiers themselves. Otherwise, if this field contains a license classifier, build tools MAY @@ -524,10 +542,10 @@ the presence of license classifiers SHOULD NOT raise an error unless ``License-Expression`` is also provided. For all newly-uploaded distributions that include a -``License-Expression`` field, the Python Package Index (PyPI) [#pypi]_ MUST +``License-Expression`` field, the `Python Package Index (PyPI) <#pypi_>`_ MUST reject any that also specify any license classifiers. 
-New license classifiers MUST NOT be added to PyPI [#classifersrepo]_; +New license classifiers MUST NOT be `added to PyPI <#classifiersrepo_>`_; users needing them SHOULD use the ``License-Expression`` field instead. Along with the ``License`` field, license classifiers may be removed from a new version of the specification in a future PEP. @@ -536,13 +554,15 @@ new version of the specification in a future PEP. Project source metadata ----------------------- -As originally introduced in PEP 621, the PyPA Declaring Project Metadata -specification [#projectspec]_ defines how to declare a project's source +As originally introduced in PEP 621, the +`PyPA Declaring Project Metadata specification <#pep621spec_>`_ +defines how to declare a project's source metadata in a ``[project]`` table in the ``pyproject.toml`` file for build tools to consume and output distribution core metadata. -This PEP adds the ``license-expression`` and ``license-files`` keys and -deprecates the ``license`` key. +This PEP `adds `_ the ``license-expression``, +`adds `_ the ``license-files`` key and +`deprecates `_ the ``license`` key. Add ``license-expression`` key @@ -552,7 +572,8 @@ A new ``license-expression`` key is added to the ``project`` table, which has a string value that is a valid SPDX license expression, as defined previously. Its value maps to the ``License-Expression`` field in the core metadata. -Build tools SHOULD validate the expression as described above, outputting +Build tools SHOULD validate the expression as described +`above `_, outputting an error or warning as specified. When generating the core metadata, tools MUST perform case normalization. @@ -579,12 +600,13 @@ Its value is a table, which if present MUST contain one of two optional, mutually exclusive subkeys, ``paths`` and ``globs``; both arrays of strings. If both are specified, tools MUST raise an error. 
The ``paths`` subkey contains verbatim file paths, and the ``globs`` subkey -valid glob patterns, parsable by the ``glob`` module [#globmodule]_ in the +valid glob patterns, parsable by the ``glob`` `module <#globmodule_>`_ in the Python standard library. **Note**: To avoid ambiguity, confusion and (per PEP 20, the Zen of Python) "more than one (obvious) way to do it", a flat array of strings value for the -``license-files`` key has been left out for now. +``license-files`` key has been +`left out for now `_. Path separators, if used, MUST be the forward slash character (``/``), and parent directory indicators (``..``) MUST NOT be used. @@ -683,24 +705,25 @@ A few minor additions will be made to the relevant existing specifications to document, standardize and clarify what is already currently supported, allowed and implemented behavior, as well as explicitly mention the directory location the license file tree is rooted in for each format, per the -specification above. +`specification above `_. **Project source trees** - As described above, the project source metadata specification [#projectspec]_ + As `described above `_, the + `Declaring Project Metadata specification <#pep621spec_>`_ will be updated to reflect that license file paths MUST be relative to the project root directory; i.e. the directory containing the ``pyproject.toml`` (or equivalently, other legacy project configuration, e.g. ``setup.py``, ``setup.cfg``, etc). **Source distributions** *(sdists)* - The sdist specification [#sdistspec]_ will be updated to reflect that for + The `sdist specification <#sdistspec_>`_ will be updated to reflect that if the ``Metadata-Version`` is ``2.3`` or greater, the sdist MUST contain any license files specified by ``License-File`` in the ``PKG-INFO`` at their respective paths relative to the top-level directory of the sdist (containing the ``pyproject.toml`` and the ``PKG-INFO`` core metadata). 
**Built distributions** *(wheels)* - The wheel specification [#wheelspec]_ will be updated to reflect that if + The `wheel specification <#wheelspec_>`_ will be updated to reflect that if the ``Metadata-Version`` is ``2.3`` or greater and one or more ``License-File`` fields is specified, the ``.dist-info`` directory MUST contain a ``license_files`` subdirectory which MUST contain the files listed @@ -708,7 +731,7 @@ specification above. paths relative to the ``license_files`` directory. **Installed projects** - The Recording Installed Projects specification [#installedspec]_ will be + The `Recording Installed Projects specification <#installedspec_>`_ will be updated to reflect that if the ``Metadata-Version`` is ``2.3`` or greater and one or more ``License-File`` fields is specified, the ``.dist-info`` directory MUST contain a ``license_files`` subdirectory which MUST contain @@ -736,7 +759,7 @@ license identifier mapped to the license classifier to be considered unambiguous for the purposes of automatically filling the ``License-Expression`` field. -If tools have filled the ``License-Expression`` field as described above, +If tools have filled the ``License-Expression`` field as described here, they MUST output a prominent, user-visible warning informing package authors of that fact, including the ``License-Expression`` string they have output, and recommending that the source metadata be updated accordingly @@ -753,14 +776,16 @@ Mapping license classifiers to SPDX identifiers Most single license classifiers (namely, all those not mentioned below) map to a single valid SPDX license identifier, allowing tools to insert them -into the ``License-Expression`` field following the specification above. +into the ``License-Expression`` field following the +`specification above `_. 
Many legacy license classifiers intend to specify a particular license, -but do not specify the particular version or variant, leading to critical -ambiguity as to their terms, compatibility and acceptability [#issue17]_. -Tools MUST NOT attempt to automatically infer a ``License-Expression`` -when one of these classifiers is used, and SHOULD instead prompt the user -to affirmatively select and confirm their intended license choice. +but do not specify the particular version or variant, leading to +`critical ambiguity <#classifierissue_>`_ as to their terms, compatibility +and acceptability. Tools MUST NOT attempt to automatically infer a +``License-Expression`` when one of these classifiers is used, and SHOULD +instead prompt the user to affirmatively select and confirm their intended +license choice. These classifiers are the following: @@ -780,7 +805,7 @@ These classifiers are the following: - ``License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)`` A comprehensive mapping of these classifiers to their possible specific -identifiers was assembled by Dustin Ingram [#badclassifiers]_, which tools +identifiers was `assembled by Dustin Ingram <#badclassifiers_>`_, which tools MAY use as a reference for the identifier selection options to offer users when prompting the user to explicitly select the license identifier they intended for their project. @@ -799,11 +824,13 @@ considered canonical and normative for the purposes of this specification: ``License-Expression: LicenseRef-Public-Domain``. 
If tools do so, they SHOULD issue an informational warning encouraging the use of more explicit and legally portable license identifiers - such as ``CC0-1.0`` [#cc0]_ or the ``Unlicense`` [#unlic]_, + such as those for the `CC0 1.0 license <#cc0_>`_ (``CC0-1.0``), + the `Unlicense <#unlicense_>`_ (``Unlicense``), + or the `MIT license <#mitlicense_>`_ (``MIT``), since the meaning associated with the term "public domain" is thoroughly dependent on the specific legal jurisdiction involved, some of which lack the concept entirely. - Alternatively, tools MAY choose to treat the above as ambiguous and + Alternatively, tools MAY choose to treat these classifiers as ambiguous and require user confirmation to fill ``License-Expression`` in these cases. - The generic and sometimes ambiguous classifiers @@ -816,7 +843,7 @@ considered canonical and normative for the purposes of this specification: ``License :: Other/Proprietary License`` MAY be mapped to the generic ``License-Expression: LicenseRef-Proprietary``, but tools MUST issue a prominent, informative warning if they do so. - Alternatively, tools MAY choose to treat the above as ambiguous and + Alternatively, tools MAY choose to treat these classifiers as ambiguous and require user confirmation to fill ``License-Expression`` in these cases. - The generic and ambiguous classifiers ``License :: OSI Approved`` and @@ -824,7 +851,7 @@ considered canonical and normative for the purposes of this specification: and thus tools MUST treat them as ambiguous and require user intervention to fill ``License-Expression``. -- The classifiers ``License :: GUST Font License 1.0*`` and +- The classifiers ``License :: GUST Font License 1.0`` and ``License :: GUST Font License 2006-09-30`` have no mapping to SPDX license identifiers and no PyPI package uses them, as of the writing of this PEP. 
Therefore, tools MUST treat them as ambiguous when attempting to fill @@ -864,13 +891,13 @@ I just want to share my own work without legal restrictions ----------------------------------------------------------- While you aren't required to include a license, if you don't, no one has -*any* permission to download, use or improve your work [#dontchoosealicense]_, +`any permission to download, use or improve your work <#dontchoosealicense_>`_, so that's probably the *opposite* of what you actually want. -The MIT license [#mitlicense]_ is a great choice for this, as its simple, +The `MIT license <#mitlicense_>`_ is a great choice for this, as it's simple, widely used and allows anyone to do whatever they want with your work (other than sue you, which you probably also don't want). -To apply it, just paste the text [#chooseamitlicense]_ into a file named +To apply it, just paste `the text <#chooseamitlicense_>`_ into a file named ``LICENSE.txt`` at the root of your repo, and add the year and your name to the copyright line. Then, just add ``license-expression = "MIT"`` under ``[project]`` in your ``pyproject.toml`` if your packaging tool supports it, @@ -888,8 +915,8 @@ file at the root of your repo (if you don't have it in a file starting with ``pyproject.toml`` if your packaging tool supports it, or in its config file (e.g. for Setuptools, ``license_expression = LICENSE-ID`` under ``[metadata]`` in ``setup.cfg``). You can find the ``LICENSE-ID`` -and copyable license text on sites like ChooseALicense [#choosealicenselist]_ -or SPDX [#spdxlist]_. +and copyable license text on sites like +`ChooseALicense <#choosealicenselist_>`_ or `SPDX <#spdxlist_>`_. Many popular code hosts, project templates and packaging tools can add the license file for you, and may support the expression as well in the future. 
@@ -908,7 +935,7 @@ In your project config file, enter your license expression under packaging tool, and make sure to remove any legacy ``license`` value or ``License ::`` classifiers. Your existing ``license`` value may already be valid as one (e.g. ``MIT``, ``Apache-2.0 OR BSD-2-Clause``, etc); -otherwise, check the SPDX license list [#spdxlist]_ for the identifier +otherwise, check the `SPDX license list <#spdxlist_>`_ for the identifier that matches the license used in your project. If your license files begin with ``LICENSE``, ``COPYING``, ``NOTICE`` or @@ -919,7 +946,7 @@ or ``license-files.globs`` under ``[project]`` in ``pyproject.toml`` (if your tool supports it), or in your tool's configuration file (e.g. ``license_files`` in ``setup.cfg`` for Setuptools). -See the `Basic Example`_ for a simple but complete real-world demo of how +See the `basic example`_ for a simple but complete real-world demo of how this works in practice, including some additional technical details. Packaging tools may support automatically converting legacy licensing metadata; check your tool's documentation for details. @@ -968,9 +995,9 @@ as glob patterns, or ``["LICENSE.txt", "_vendor/LICENSE-APACHE.txt", "_vendor/LICENSE-BSD.txt"]`` as literal file paths. -See a fully worked out `Advanced Example`_ for a comprehensive end-to-end +See a fully worked out `advanced example`_ for a comprehensive end-to-end application of this to a real-world complex project, with copious technical -details, and consult a tutorial [#spdxtutorial]_ for more help and examples +details, and consult a `tutorial <#spdxtutorial_>`_ for more help and examples on using SPDX identifiers and expressions. @@ -1020,7 +1047,7 @@ the names of existing licenses. 
While minor additions will be made to the source distribution (sdist) built distribution (wheel) and installed project specifications, all of these -are merely documenting, clarifing and formally specifying behaviors explicitly +are merely documenting, clarifying and formally specifying behaviors explicitly allowed under their current respective specifications, and already implemented in practice, and gating them behind the explicit presence of both the new metadata versions and the new fields. In particular, sdsts may contain @@ -1091,13 +1118,14 @@ Reference Implementation Tools will need to support parsing and validating license expressions in the ``License-Expression`` field. -The license-expression library [#licexp]_ is a reference Python +The `license-expression library <#licenseexplib_>`_ is a reference Python implementation of a library that handles license expressions including parsing, validating and formatting license expressions using flexible lists of license symbols (including SPDX license identifiers and any extra identifiers referenced here). It is licensed under the Apache-2.0 license and is used in a few projects -such as the SPDX Python Tools [#spdxpy]_, the ScanCode toolkit [#scancodetk]_ -and the Free Software Foundation Europe (FSFE) Reuse project [#reuse]_. +such as the `SPDX Python Tools <#spdxpy_>`_, +the `ScanCode toolkit <#scancodetk_>`_ +and the Free Software Foundation Europe (FSFE) `Reuse project <#reuse_>`_. Rejected Ideas @@ -1113,7 +1141,7 @@ core metadata fields specified in this PEP. Re-use the ``License`` field '''''''''''''''''''''''''''' -Following initial discussion [#reusediscussion]_, earlier versions of this +Following `initial discussion <#reusediscussion_>`_, earlier versions of this PEP proposed to re-use the existing ``License`` field, which tools would attempt to parse as a SPDX license expression with a fall back to treating as free text. 
Initially, this would merely cause a warning (or even pass @@ -1234,7 +1262,7 @@ to a free-form description, to the same SPDX identifier they would be entering in the ``license-expression`` key anyway, assuming they can easily find documentation at all about it). In fact, this can be made even easier thanks to the new field. For example, GitHub's popular -ChooseALicense.com [#choosealicense]_ links to how to add SPDX license +`ChooseALicense.com <#choosealicense_>`_ links to how to add SPDX license identifiers to the project source metadata of various languages that support them right in the sidebar of every license page; the SPDX support in this PEP enables adding Python to that list. @@ -1698,10 +1726,9 @@ Flatten license files in subdirectories ''''''''''''''''''''''''''''''''''''''' Previous drafts of this PEP were silent on the issue of handling license files -in subdirectories. Currently, Wheel [#wheelfiles]_ and (following its example) -Setuptools [#setuptoolsfiles]_ flattens all license files into the -``.dist-info`` directory [#setuptoolsfiles]_, without preserving the source -subdirectory hierarchy. +in subdirectories. Currently, `Wheel <#wheelfiles_>`_ and (following its +example) `Setuptools <#setuptoolsfiles_>`_ flattens all license files into the +``.dist-info`` directory without preserving the source subdirectory hierarchy. While this is the simplest approach and matches existing ad hoc practice, this can result in name conflicts and license files clobbering others, @@ -1775,7 +1802,7 @@ Therefore, now is a prudent time to specify an alternate approach. The simplest and most obvious solution, as suggested by several on the Wheel and Setuptools implementation issues, is to simply root the license files relative to a ``license_files`` subdirectory of ``.dist-info``. 
This is simple -to implement and solves all the problems noted above, without clear significant +to implement and solves all the problems noted here, without clear significant drawbacks relative to other more complex options. It does make the specification a bit more complex and less elegant, but @@ -1883,8 +1910,8 @@ Map identifiers to source files ''''''''''''''''''''''''''''''' File-level notices are not considered as part of the scope of this PEP and the -existing ``SPDX-License-Identifier`` [#spdxids]_ convention can be used and -may not need further specification as a PEP. +existing ``SPDX-License-Identifier`` `convention <#spdxid_>`_ can +be used and may not need further specification as a PEP. Don't freeze compatibility with a specific SPDX version @@ -1902,7 +1929,7 @@ compatibility with a specific version of these specifications here and requiring a PEP or similar process to update it avoids that from occurring, and follows the practice of other packaging ecosystems. -Therefore, it was decided [#spdxversion]_ to specify a minimum version +Therefore, it was `decided <#spdxversion_>`_ to specify a minimum version and requires tools to be compatible with it, while still allowing updates so long as they don't break backward compatibility. This enables tools to immediate take advantage of improvements and accept new @@ -1918,10 +1945,10 @@ PEP to handle cases where the license expression for a binary distribution (wheel) is different from that for a source distribution (sdist), such as in cases of non-pure-Python packages that compile and bundle binaries under different licenses than the project itself. An example cited was -PyTorch [#pytorch]_, which contains CUDA from Nvidia, which is freely -distributable but not open source. NumPy [#numpyissue]_ and SciPy -[#scipyissue]_ also had similar issues, as reported by the original author -of this PEP and now resolved for those cases. 
+`PyTorch <#pytorch_>`_, which contains CUDA from Nvidia, which is freely +distributable but not open source. `NumPy <#numpyissue_>`_ and +`SciPy <#scipyissue_>`_ also had similar issues, as reported by the +original author of this PEP and now resolved for those cases. However, given the inherent complexity here and a lack of an obvious mechanism to do so, the fact that each wheel would need its own license @@ -1967,7 +1994,7 @@ Appendix 1. License Expression Examples Basic example ------------- -The Setuptools project itself, as of version 59.1.1 [#setuptools5911]_, +The Setuptools project itself, as of `version 59.1.1 <#setuptools5911_>`_, does not use the ``License`` field in its own project source metadata. Further, it no longer explicitly specifies ``license_file``/``license_files`` as it did previously, since Setuptools relies on its own automatic @@ -1998,7 +2025,7 @@ The output core metadata for the distribution packages would then be:: The ``LICENSE`` file would be stored at ``/setuptools-{version}/LICENSE`` in the sdist and ``/setuptools-{version}.dist-info/license_files/LICENSE`` in the wheel, and unpacked from there into the site directory (e.g. -``site-packages) on installation; ``/`` is the root of the respective archive +``site-packages``) on installation; ``/`` is the root of the respective archive and ``{version}`` the version of the Setuptools release in the core metadata. @@ -2033,7 +2060,7 @@ of the MIT license and the copyrights used by Setuptools, ``pyparsing``, ``more_itertools`` and ``ordered-set``; and the ``LICENSE`` files in the ``setuptools/_vendor/packaging/`` directory contain the Apache 2.0 and 2-clause BSD license text, and the Packaging copyright statement and -license choice notice [#packlic]_. +`license choice notice <#packaginglicense_>`_. 
Therefore, we assume the license files are located at the following paths in the project source tree (relative to the project root and @@ -2094,7 +2121,7 @@ the license files would be located at the paths:: /setuptools-{version}/setuptools/_vendor/packaging/LICENSE.BSD In the built wheel, with ``/`` being the root of the archive and -``{version}`` as above, the license files would be stored at:: +``{version}`` as before, the license files would be stored at:: /setuptools-{version}.dist-info/license_files/LICENSE /setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE @@ -2102,7 +2129,7 @@ In the built wheel, with ``/`` being the root of the archive and /setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE.BSD Finally, in the installed project, with ``site-packages`` being the site dir -and ``{version}`` as above, the license files would be installed to:: +and ``{version}`` as before, the license files would be installed to:: site-packages/setuptools-{version}.dist-info/license_files/LICENSE site-packages/setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE @@ -2164,8 +2191,8 @@ Core metadata ------------- There are two overlapping core metadata fields to document a license: the -license ``Classifier`` strings [#classif]_ prefixed with ``License ::`` -and the ``License`` field as free text [#licfield]_. +license ``Classifier`` `strings <#classifiers_>`_ prefixed with ``License ::`` +and the ``License`` `field <#licensefield_>`_ as free text. 
The core metadata ``License`` field documentation is currently:: @@ -2204,7 +2231,7 @@ Beyond a license code or qualifier, license text files are documented and included in a built package either implicitly or explicitly and this is another possible source of confusion: -- In Setuptools [#setuptoolssdist]_ and Wheel [#wheels]_, license files +- In `Setuptools <#setuptoolssdist_>`_ and `Wheel <#wheels_>`_, license files are automatically added to the distribution (at their source location in in a source distribution/sdist, and in the ``.dist-info`` directory of a built wheel) if they match one of a number of common license file @@ -2215,44 +2242,37 @@ possible source of confusion: to the ``setuptools.setup()`` function. At present, following Wheel's lead, Setuptools flattens the collected license files into the metadata directory, clobbering files with the same name, and dump license files - directly into the top-level ``.dist-info`` directory, but there is a desire - to resolve both these issues, contingent on the this PEP being accepted - [#setuptoolsfiles]_. + directly into the top-level ``.dist-info`` directory, but there is a + `desire to resolve both these issues <#setuptoolsfiles_>`_, + contingent on this PEP being accepted. - Both tools also support an older, singular ``license_file`` parameter that allows specifying only one license file to add to the distribution, which - has been deprecated for some time but still sees some use. - See [#pipsetup]_ for instance. + has been deprecated for some time but still sees `some use <#pipsetup_>`_. - Following the publication of an earlier draft of this PEP, Setuptools - added support for ``License-File`` in distribution metadata as described - herein [#setuptoolspep639]_. This allows other tools consuming the resulting - metadata to unambiguously locate the license file(s) for a given package. 
- -**Note:** The ``License-File`` field proposed in this PEP already exists in -Wheel and Setuptools with the same behaviour as explained above. -This PEP is only recognizing and documenting the existing practice as used -in Wheel and Setuptools to add license files to the distribution, -and formally including their paths in core metadata (which has since been -implemented on the basis of a draft of this PEP). + `added support <#setuptoolspep639_>`_ for ``License-File`` in distribution + metadata as described in this specification. This allows other tools + consuming the resulting metadata to unambiguously locate the license file(s) + for a given package. PyPA Packaging Guide and Sample Project --------------------------------------- -Both the PyPA beginner packaging tutorial [#packagingtuttxt]_ and its more -comprehensive packaging guide [#packagingguidetxt]_ state that it is important -that every package include a license file. They point to the ``LICENSE.txt`` -in the official PyPA sample project as an example, which is explicitly listed -under the ``license_files`` key in its ``setup.cfg`` [#samplesetupcfg]_, -following existing practice formally specified by this PEP. +Both the `PyPA beginner packaging tutorial <#packagingtuttxt_>`_ and its more +comprehensive `packaging guide <#packagingguidetxt_>`_ state that it is +important that every package include a license file. They point to the +``LICENSE.txt`` in the official PyPA sample project as an example, which is +`explicitly listed <#samplesetupcfg_>`_ under the ``license_files`` key in +its ``setup.cfg``, following existing practice formally specified by this PEP. -Both the beginner packaging tutorial [#packagingtutkey]_ and the sample project -[#samplesetuppy]_ only use classifiers to declare a package's license, and do -not include or mention the ``License`` field. 
The full packaging guide does -mention this field, but states that authors should use the license classifiers -instead, unless the project uses a non-standard license (which the guide -discourages) [#licfield]_. +Both the `beginner packaging tutorial <#packagingtutkey_>`_ and the +`sample project <#samplesetuppy_>`_ only use classifiers to declare a +package's license, and do not include or mention the ``License`` field. +The `full packaging guide <#licensefield_>`_ does mention this field, but +states that authors should use the license classifiers instead, unless the +project uses a non-standard license (which the guide discourages). Python source code files @@ -2260,10 +2280,9 @@ Python source code files **Note:** Documenting licenses in source code is not in the scope of this PEP. -Beside using comments and/or ``SPDX-License-Identifier`` conventions, the license -is sometimes documented in Python code files using "dunder" variables typically -named after one of the lower cased core metadata fields such as ``__license__`` -[#pycode]_. +Beside using comments and/or ``SPDX-License-Identifier`` conventions, the +license is `sometimes <#pycode_>`_ documented in Python code files using +a "dunder" variable, typically named ``__license__``. This convention (dunder global variables) is recognized by the built-in ``help()`` function and the standard ``pydoc`` module. The dunder variable(s) will show up in @@ -2273,17 +2292,17 @@ the ``help()`` DATA section for a module. Other Python packaging tools ---------------------------- -- Conda package manifests [#conda]_ have support for ``license`` and +- `Conda package manifests <#conda_>`_ have support for ``license`` and ``license_file`` fields, and automatically include license files following similar naming patterns as Wheel and Setuptools. 
-- Flit [#flit]_ recommends using classifiers instead of the ``License`` field +- `Flit <#flit_>`_ recommends using classifiers instead of the ``License`` field (per the current PyPA packaging guide). -- PBR [#pbr]_ uses similar data as Setuptools, but always stored in +- `PBR <#pbr_>`_ uses similar data as Setuptools, but always stored in ``setup.cfg``. -- Poetry [#poetry]_ specifies the use of the ``license`` field in +- `Poetry <#poetry_>`_ specifies the use of the ``license`` field in ``pyproject.toml`` with SPDX license identifiers. @@ -2299,27 +2318,30 @@ Linux distribution packages **Note:** in most cases the license texts of the most common licenses are included globally once in a shared documentation directory (e.g. ``/usr/share/doc``). -- Debian documents package licenses with machine readable copyright files - [#dep5]_. This specification defines its own license expression syntax that is +- Debian documents package licenses with + `machine readable copyright files <#dep5_>`_. + This specification defines its own license expression syntax that is very similar to the SDPX syntax and use its own list of license identifiers for common licenses (also closely related to SPDX identifiers). -- Fedora packages [#fedora]_ specify how to include ``License Texts`` - [#fedoratext]_ and how use a ``License`` field [#fedoralic]_ that must be filled +- `Fedora packages <#fedora_>`_ specify how to include + `License Texts <#fedoratext_>`_ and how to use a + `License field <#fedoralicense_>`_ that must be filled with an appropriate license Short License identifier(s) from an extensive list - of "Good Licenses" identifiers [#fedoralist]_. Fedora also defines its own + of `"Good License" identifiers <#fedoralist_>`_. Fedora also defines its own license expression syntax very similar to the SDPX syntax. -- openSUSE packages [#opensuse]_ use SPDX license expressions with - SPDX license identifiers and a list of extra license identifiers - [#opensuselist]_. 
+- `openSUSE packages <#opensuse_>`_ use SPDX license expressions with + SPDX license identifiers and a + `list of extra license identifiers <#opensuselist_>`_. -- Gentoo ebuild uses a ``LICENSE`` variable [#gentoo]_. This field is specified - in GLEP-0023 [#glep23]_ and in the Gentoo development manual [#gentoodev]_. +- `Gentoo ebuild <#gentoo_>`_ uses a ``LICENSE`` variable. This field is + specified in `GLEP-0023 <#glep23_>`_ and in the + `Gentoo development manual <#gentoodev_>`_. Gentoo also defines a license expression syntax and a list of allowed licenses. The expression syntax is rather different from SPDX. -- FreeBSD package Makefile [#freebsd]_ provides ``LICENSE`` and +- The `FreeBSD package Makefile <#freebsd_>`_ provides ``LICENSE`` and ``LICENSE_FILE`` fields with a list of custom license symbols. For non-standard licenses, FreeBSD recommend to use ``LICENSE=UNKNOWN`` and add ``LICENSE_NAME`` and ``LICENSE_TEXT`` fields, as well as sophisticated @@ -2329,95 +2351,97 @@ globally once in a shared documentation directory (e.g. ``/usr/share/doc``). expression syntax. FreeBSD also recommends the use of ``SPDX-License-Identifier`` in source code files. -- Arch Linux PKGBUILD [#archinux]_ define its own license identifiers - [#archlinuxlist]_. The value ``'unknown'`` can be used if the license is not - defined. +- `Arch Linux PKGBUILD <#archinux_>`_ defines its + `own license identifiers <#archlinuxlist_>`_. + The value ``'unknown'`` can be used if the license is not defined. -- OpenWRT ipk packages [#openwrt]_ use the ``PKG_LICENSE`` and +- `OpenWRT ipk packages <#openwrt_>`_ use the ``PKG_LICENSE`` and ``PKG_LICENSE_FILES`` variables and recommend the use of SPDX License identifiers. -- NixOS uses SPDX identifiers [#nixos]_ and some extra license identifiers in - its license field. +- `NixOS uses SPDX identifiers <#nixos_>`_ and some extra license identifiers + in its license field. 
-- GNU Guix (based on NixOS) has a single License field, uses its own license - symbols list [#guix]_ and specifies to use one license or a list of licenses - [#guixlic]_. +- GNU Guix (based on NixOS) has a single License field, uses its own + `license symbols list <#guix_>`_ and specifies to use one license or a + `list of licenses <#guixlicense_>`_. -- Alpine Linux packages [#alpine]_ recommend using SPDX identifiers in the +- `Alpine Linux packages <#alpine_>`_ recommend using SPDX identifiers in the license field. Language and application packages --------------------------------- -- In Java, Maven POM [#maven]_ defines a ``licenses`` XML tag with a list of license - items each with a name, URL, comments and "distribution" type. This is not - mandatory and the content of each field is not specified. +- In Java, `Maven POM <#maven_>`_ defines a ``licenses`` XML tag with a list + of license items each with a name, URL, comments and "distribution" type. + This is not mandatory and the content of each field is not specified. -- JavaScript npm package.json [#npm]_ use a single license field with SPDX +- `JavaScript NPM package.json <#npm_>`_ use a single license field with SPDX license expression or the ``UNLICENSED`` id if no license is specified. A license file can be referenced as an alternative using "SEE LICENSE IN " in the single ``license`` field. -- Rubygems gemspec [#gem]_ specifies either a singular license string or a list - of license strings. The relationship between multiple licenses in a list is - not specified. They recommend using SPDX license identifiers. +- `Rubygems gemspec <#gem_>`_ specifies either a singular license string or + a list of license strings. The relationship between multiple licenses in a + list is not specified. They recommend using SPDX license identifiers. -- CPAN Perl modules [#perl]_ use a single license field which is either a single - string or a list of strings. 
The relationship between the licenses in a list - is not specified. There is a list of custom license identifiers plus +- `CPAN Perl modules <#perl_>`_ use a single license field which is either a + single string or a list of strings. The relationship between the licenses in + a list is not specified. There is a list of custom license identifiers plus these generic identifiers: ``open_source``, ``restricted``, ``unrestricted``, ``unknown``. -- Rust Cargo [#cargo]_ specifies the use of an SPDX license expression (v2.1) in - the ``license`` field. It also supports an alternative expression syntax using - slash-separated SPDX license identifiers. There is also a ``license_file`` - field. The crates.io package registry [#cratesio]_ requires that either - ``license`` or ``license_file`` fields are set when you upload a package. +- `Rust Cargo <#cargo_>`_ specifies the use of an SPDX license expression + (v2.1) in the ``license`` field. It also supports an alternative expression + syntax using slash-separated SPDX license identifiers. There is also a + ``license_file`` field. The `crates.io package registry <#cratesio_>`_ + requires that either ``license`` or ``license_file`` fields are set when + you upload a package. -- PHP Composer composer.json [#composer]_ uses a ``license`` field with an SPDX - license id or "proprietary". The ``license`` field is either a single string - that can use something which resembles the SPDX license expression syntax with - "and" and "or" keywords; or is a list of strings if there is a choice of - licenses (aka. a "disjunctive" choice of license). +- `PHP Composer composer.json <#composer_>`_ uses a ``license`` field with + an SPDX license id or "proprietary". The ``license`` field is either a + single string that can use something which resembles the SPDX license + expression syntax with "and" and "or" keywords; or is a list of strings + if there is a choice of licenses (aka. a "disjunctive" choice of license). 
-- NuGet packages [#nuget]_ were using only a simple license URL and are now +- `NuGet packages <#nuget_>`_ were using only a simple license URL and are now specifying to use an SPDX License expression and/or the path to a license file within the package. The NuGet.org repository states that they only - accepts license expressions that are `approved by the Open Source Initiative - or the Free Software Foundation.` + accepts license expressions that are "approved by the Open Source Initiative + or the Free Software Foundation." - Go language modules ``go.mod`` have no provision for any metadata beyond dependencies. Licensing information is left for code authors and other community package managers to document. -- Dart/Flutter spec [#flutter]_ recommends to use a single ``LICENSE`` file +- `Dart/Flutter spec <#flutter_>`_ recommends to use a single ``LICENSE`` file that should contain all the license texts each separated by a line with 80 hyphens. -- JavaScript Bower [#bower]_ ``license`` field is either a single string or a list - of strings using either SPDX license identifiers, or a path or a URL to a - license file. +- `JavaScript Bower <#bower_>`_ ``license`` field is either a single string + or a list of strings using either SPDX license identifiers, or a path or + a URL to a license file. -- Cocoapods podspec [#cocoapod]_ ``license`` field is either a single string or a - mapping with attributes of type, file and text keys. This is mandatory unless - there is a LICENSE or LICENCE file provided. +- `Cocoapods podspec <#cocoapod_>`_ ``license`` field is either a single + string or a mapping with attributes of type, file and text keys. + This is mandatory unless there is a ``LICENSE`` or ``LICENCE`` file provided. -- Haskell Cabal [#cabal]_ accepts an SPDX license expression since version 2.2. - The version of the SPDX license list used is a function of the ``cabal`` version. 
- The specification also provides a mapping between pre-SPDX Legacy license - Identifiers and SPDX identifiers. Cabal also specifies a ``license-file(s)`` - field that lists license files that will be installed with the package. +- `Haskell Cabal <#cabal_>`_ accepts an SPDX license expression since + version 2.2. The version of the SPDX license list used is a function of + the ``cabal`` version. The specification also provides a mapping between + pre-SPDX Legacy license Identifiers and SPDX identifiers. + Cabal also specifies a ``license-file(s)`` field that lists license files + that will be installed with the package. -- Erlang/Elixir mix/hex package [#mix]_ specifies a ``licenses`` field as a +- `Erlang/Elixir mix/hex package <#mix_>`_ specifies a ``licenses`` field as a required list of license strings and recommends to use SPDX license identifiers. -- D lang dub package [#dub]_ defines its own list of license identifiers and +- `D lang dub package <#dub_>`_ defines its own list of license identifiers and its own license expression syntax and both are similar to the SPDX conventions. -- R Package DESCRIPTION [#cran]_ defines its own sophisticated license +- `R Package DESCRIPTION <#cran_>`_ defines its own sophisticated license expression syntax and list of licenses identifiers. R has a unique way to support specifiers for license versions such as ``LGPL (>= 2.0, < 3)`` in its license expression syntax. @@ -2426,143 +2450,146 @@ Language and application packages Other ecosystems ---------------- -- ``SPDX-License-Identifier`` [#spdxids]_ is a simple convention to document the - license inside a file. +- The ``SPDX-License-Identifier`` `header <#spdxid_>`_ is a simple + convention to document the license inside a file. -- The Free Software Foundation (FSF) promotes the use of SPDX license identifiers - for clarity in the GPL and other versioned free software licenses [#gnu]_ - [#fsf]_. 
+- The `Free Software Foundation (FSF) <#fsf_>`_ promotes the use of + SPDX license identifiers for clarity in the `GPL <#gnu_>`_ and other + versioned free software licenses. -- The Free Software Foundation Europe (FSFE) REUSE project [#reuse]_ promotes - using ``SPDX-License-Identifier``. +- The Free Software Foundation Europe (FSFE) `REUSE project <#reuse_>`_ + promotes using ``SPDX-License-Identifier``. -- The Linux kernel uses ``SPDX-License-Identifier`` and parts of the FSFE REUSE - conventions to document its licenses [#linux]_. +- The `Linux kernel <#linux_>`_ uses ``SPDX-License-Identifier`` + and parts of the FSFE REUSE conventions to document its licenses. -- U-Boot spearheaded using ``SPDX-License-Identifier`` in code and now follows the - Linux ways [#uboot]_. +- `U-Boot <#uboot_>`_ spearheaded using ``SPDX-License-Identifier`` in code + and now follows the Linux ways. -- The Apache Software Foundation projects use RDF DOAP [#apache]_ with a single - license field pointing to SPDX license identifiers. +- The Apache Software Foundation projects use `RDF DOAP <#apache_>`_ with + a single license field pointing to SPDX license identifiers. -- The Eclipse Foundation promotes using ``SPDX-license-Identifiers`` [#eclipse]_ +- The `Eclipse Foundation <#eclipse_>`_ promotes using + ``SPDX-license-Identifiers``. -- The ClearlyDefined project [#cd]_ promotes using SPDX license identifiers and - expressions to improve license clarity. +- The `ClearlyDefined project <#clearlydefined_>`_ promotes using SPDX + license identifiers and expressions to improve license clarity. -- The Android Open Source Project [#android]_ use ``MODULE_LICENSE_XXX`` empty - tag files where ``XXX`` is a license code such as BSD, APACHE, GPL, etc. And - side by side with this ``MODULE_LICENSE`` file there is a ``NOTICE`` file - that contains license and notices texts. 
+- The `Android Open Source Project <#android_>`_ use ``MODULE_LICENSE_XXX`` + empty tag files, where ``XXX`` is a license code such as BSD, APACHE, GPL, + etc. And side by side with this ``MODULE_LICENSE`` file there is a + ``NOTICE`` file that contains license and notices texts. References ========== -.. [#pypugglossary] https://packaging.python.org/glossary/ -.. [#cms] https://packaging.python.org/specifications/core-metadata -.. [#projectspec] https://packaging.python.org/specifications/declaring-project-metadata/ -.. [#sdistspec] https://packaging.python.org/specifications/source-distribution-format/ -.. [#wheelspec] https://packaging.python.org/specifications/binary-distribution-format/ -.. [#installedspec] https://packaging.python.org/specifications/recording-installed-packages/ -.. [#wheelproject] https://wheel.readthedocs.io/en/stable/ -.. [#cdstats] https://clearlydefined.io/stats -.. [#cd] https://clearlydefined.io -.. [#osi] https://opensource.org -.. [#pypi] https://pypi.org/ -.. [#classif] https://pypi.org/classifiers -.. [#classifersrepo] https://github.com/pypa/trove-classifiers -.. [#issue17] https://github.com/pypa/trove-classifiers/issues/17 -.. [#badclassifiers] https://github.com/pypa/trove-classifiers/issues/17#issuecomment-385027197 -.. [#setuptoolspep639] https://github.com/pypa/setuptools/pull/2645 -.. [#wheelfiles] https://github.com/pypa/wheel/issues/138 -.. [#setuptoolsfiles] https://github.com/pypa/setuptools/issues/2739 -.. [#globmodule] https://docs.python.org/3/library/glob.html -.. [#spdxlist] https://spdx.org/licenses/ -.. [#spdx] https://spdx.dev/ -.. [#spdxid] https://spdx.dev/ids/ -.. [#spdxpression] https://spdx.github.io/spdx-spec/SPDX-license-expressions/ -.. [#wheels] https://github.com/pypa/wheel/blob/0.37.0/docs/user_guide.rst#including-license-files-in-the-generated-wheel-file -.. [#reuse] https://reuse.software/ -.. [#licexp] https://github.com/nexB/license-expression/ -.. 
[#spdxpy] https://github.com/spdx/tools-python/ -.. [#reusediscussion] https://github.com/pombredanne/spdx-pypi-pep/issues/7 -.. [#choosealicense] https://choosealicense.com/ -.. [#choosealicenselist] https://choosealicense.com/licenses/ -.. [#dontchoosealicense] https://choosealicense.com/no-permission/ -.. [#chooseamitlicense] https://choosealicense.com/licenses/mit/ -.. [#mitlicense] https://opensource.org/licenses/MIT -.. [#spdxtutorial] https://github.com/david-a-wheeler/spdx-tutorial -.. [#spdxversion] https://github.com/pombredanne/spdx-pypi-pep/issues/6 -.. [#pytorch] https://pypi.org/project/torch/ -.. [#numpyissue] https://github.com/numpy/numpy/issues/8689 -.. [#scipyissue] https://github.com/scipy/scipy/issues/7093 -.. [#scancodetk] https://github.com/nexB/scancode-toolkit -.. [#licfield] https://packaging.python.org/guides/distributing-packages-using-setuptools/#license -.. [#samplesetuppy] https://github.com/pypa/sampleproject/blob/3a836905fbd687af334db16b16c37cf51dcbc99c/setup.py#L98 -.. [#samplesetupcfg] https://github.com/pypa/sampleproject/blob/3a836905fbd687af334db16b16c37cf51dcbc99c/setup.cfg -.. [#pipsetup] https://github.com/pypa/pip/blob/21.3.1/setup.cfg#L114 -.. [#setuptoolssdist] https://github.com/pypa/setuptools/pull/1767 -.. [#packagingtuttxt] https://packaging.python.org/tutorials/packaging-projects/#creating-a-license -.. [#packagingguidetxt] https://packaging.python.org/guides/distributing-packages-using-setuptools/#license-txt -.. [#packagingtutkey] https://packaging.python.org/tutorials/packaging-projects/#configuring-metadata -.. [#pycode] https://github.com/search?l=Python&q=%22__license__%22&type=Code -.. [#setuptools5911] https://github.com/pypa/setuptools/blob/v59.1.1/setup.cfg -.. [#packlic] https://github.com/pypa/packaging/blob/21.2/LICENSE -.. [#conda] https://docs.conda.io/projects/conda-build/en/stable/resources/define-metadata.html#about-section -.. [#flit] https://flit.readthedocs.io/en/stable/pyproject_toml.html -.. 
[#poetry] https://python-poetry.org/docs/pyproject/#license -.. [#pbr] https://docs.openstack.org/pbr/latest/user/features.html -.. [#dep5] https://dep-team.pages.debian.net/deps/dep5/ -.. [#fedora] https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/ -.. [#fedoratext] https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/#_license_text -.. [#fedoralic] https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/#_valid_license_short_names -.. [#fedoralist] https://fedoraproject.org/wiki/Licensing:Main?rd=Licensing#Good_Licenses -.. [#opensuse] https://en.opensuse.org/openSUSE:Packaging_guidelines#Licensing -.. [#opensuselist] https://docs.google.com/spreadsheets/d/14AdaJ6cmU0kvQ4ulq9pWpjdZL5tkR03exRSYJmPGdfs/pub -.. [#gentoo] https://devmanual.gentoo.org/ebuild-writing/variables/index.html#license -.. [#glep23] https://www.gentoo.org/glep/glep-0023.html -.. [#gentoodev] https://devmanual.gentoo.org/general-concepts/licenses/index.html -.. [#freebsd] https://docs.freebsd.org/en/books/porters-handbook/makefiles/#licenses -.. [#archinux] https://wiki.archlinux.org/title/PKGBUILD#license -.. [#archlinuxlist] https://archlinux.org/packages/core/any/licenses/files/ -.. [#openwrt] https://openwrt.org/docs/guide-developer/packages#buildpackage_variables -.. [#nixos] https://github.com/NixOS/nixpkgs/blob/21.05/lib/licenses.nix -.. [#guix] https://git.savannah.gnu.org/cgit/guix.git/tree/guix/licenses.scm?h=v1.3.0 -.. [#guixlic] https://guix.gnu.org/manual/en/html_node/package-Reference.html#index-license_002c-of-packages -.. [#alpine] https://wiki.alpinelinux.org/wiki/Creating_an_Alpine_package#license -.. [#maven] https://maven.apache.org/pom.html#Licenses -.. [#npm] https://docs.npmjs.com/cli/v8/configuring-npm/package-json#license -.. [#gem] https://guides.rubygems.org/specification-reference/#license= -.. [#perl] https://metacpan.org/pod/CPAN::Meta::Spec#license -.. 
[#cargo] https://doc.rust-lang.org/cargo/reference/manifest.html#package-metadata -.. [#cratesio] https://doc.rust-lang.org/cargo/reference/registries.html#publish -.. [#composer] https://getcomposer.org/doc/04-schema.md#license -.. [#nuget] https://docs.microsoft.com/en-us/nuget/reference/nuspec#licenseurl -.. [#flutter] https://flutter.dev/docs/development/packages-and-plugins/developing-packages#adding-licenses-to-the-license-file -.. [#bower] https://github.com/bower/spec/blob/b00c4403e22e3f6177c410ed3391b9259687e461/json.md#license -.. [#cocoapod] https://guides.cocoapods.org/syntax/podspec.html#license -.. [#cabal] https://cabal.readthedocs.io/en/3.6/cabal-package.html?highlight=license#pkg-field-license -.. [#mix] https://hex.pm/docs/publish -.. [#dub] https://dub.pm/package-format-json.html#licenses -.. [#cran] https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Licensing -.. [#spdxids] https://spdx.dev/resources/use/#identifiers -.. [#gnu] https://www.gnu.org/licenses/identify-licenses-clearly.html -.. [#fsf] https://www.fsf.org/blogs/rms/rms-article-for-claritys-sake-please-dont-say-licensed-under-gnu-gpl-2 -.. [#linux] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/license-rules.rst -.. [#uboot] https://www.denx.de/wiki/U-Boot/Licensing -.. [#apache] https://svn.apache.org/repos/asf/allura/doap_Allura.rdf -.. [#eclipse] https://www.eclipse.org/legal/epl-2.0/faq.php -.. [#android] https://github.com/aosp-mirror/platform_external_tcpdump/blob/android-platform-12.0.0_r1/MODULE_LICENSE_BSD -.. [#cc0] https://creativecommons.org/publicdomain/zero/1.0/ -.. [#unlic] https://unlicense.org/ +.. _#alpine: https://wiki.alpinelinux.org/wiki/Creating_an_Alpine_package#license +.. _#android: https://github.com/aosp-mirror/platform_external_tcpdump/blob/android-platform-12.0.0_r1/MODULE_LICENSE_BSD +.. _#apache: https://svn.apache.org/repos/asf/allura/doap_Allura.rdf +.. 
_#archinux: https://wiki.archlinux.org/title/PKGBUILD#license +.. _#archlinuxlist: https://archlinux.org/packages/core/any/licenses/files/ +.. _#badclassifiers: https://github.com/pypa/trove-classifiers/issues/17#issuecomment-385027197 +.. _#bower: https://github.com/bower/spec/blob/b00c4403e22e3f6177c410ed3391b9259687e461/json.md#license +.. _#cabal: https://cabal.readthedocs.io/en/3.6/cabal-package.html?highlight=license#pkg-field-license +.. _#cargo: https://doc.rust-lang.org/cargo/reference/manifest.html#package-metadata +.. _#cc0: https://creativecommons.org/publicdomain/zero/1.0/ +.. _#cdstats: https://clearlydefined.io/stats +.. _#choosealicense: https://choosealicense.com/ +.. _#choosealicenselist: https://choosealicense.com/licenses/ +.. _#chooseamitlicense: https://choosealicense.com/licenses/mit/ +.. _#classifierissue: https://github.com/pypa/trove-classifiers/issues/17 +.. _#classifiers: https://pypi.org/classifiers +.. _#classifiersrepo: https://github.com/pypa/trove-classifiers +.. _#clearlydefined: https://clearlydefined.io +.. _#cocoapod: https://guides.cocoapods.org/syntax/podspec.html#license +.. _#composer: https://getcomposer.org/doc/04-schema.md#license +.. _#conda: https://docs.conda.io/projects/conda-build/en/stable/resources/define-metadata.html#about-section +.. _#coremetadataspec: https://packaging.python.org/specifications/core-metadata +.. _#cran: https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Licensing +.. _#cratesio: https://doc.rust-lang.org/cargo/reference/registries.html#publish +.. _#dep5: https://dep-team.pages.debian.net/deps/dep5/ +.. _#dontchoosealicense: https://choosealicense.com/no-permission/ +.. _#dub: https://dub.pm/package-format-json.html#licenses +.. _#eclipse: https://www.eclipse.org/legal/epl-2.0/faq.php +.. _#fedora: https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/ +.. 
_#fedoralicense: https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/#_valid_license_short_names +.. _#fedoralist: https://fedoraproject.org/wiki/Licensing:Main?rd=Licensing#Good_Licenses +.. _#fedoratext: https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/#_license_text +.. _#flit: https://flit.readthedocs.io/en/stable/pyproject_toml.html +.. _#flutter: https://flutter.dev/docs/development/packages-and-plugins/developing-packages#adding-licenses-to-the-license-file +.. _#freebsd: https://docs.freebsd.org/en/books/porters-handbook/makefiles/#licenses +.. _#fsf: https://www.fsf.org/blogs/rms/rms-article-for-claritys-sake-please-dont-say-licensed-under-gnu-gpl-2 +.. _#gem: https://guides.rubygems.org/specification-reference/#license= +.. _#gentoo: https://devmanual.gentoo.org/ebuild-writing/variables/index.html#license +.. _#gentoodev: https://devmanual.gentoo.org/general-concepts/licenses/index.html +.. _#glep23: https://www.gentoo.org/glep/glep-0023.html +.. _#globmodule: https://docs.python.org/3/library/glob.html +.. _#gnu: https://www.gnu.org/licenses/identify-licenses-clearly.html +.. _#guix: https://git.savannah.gnu.org/cgit/guix.git/tree/guix/licenses.scm?h=v1.3.0 +.. _#guixlicense: https://guix.gnu.org/manual/en/html_node/package-Reference.html#index-license_002c-of-packages +.. _#installedspec: https://packaging.python.org/specifications/recording-installed-packages/ +.. _#interopissue: https://github.com/pypa/interoperability-peps/issues/46 +.. _#licenseexplib: https://github.com/nexB/license-expression/ +.. _#licensefield: https://packaging.python.org/guides/distributing-packages-using-setuptools/#license +.. _#linux: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/license-rules.rst +.. _#maven: https://maven.apache.org/pom.html#Licenses +.. _#mitlicense: https://opensource.org/licenses/MIT +.. _#mix: https://hex.pm/docs/publish +.. 
_#nixos: https://github.com/NixOS/nixpkgs/blob/21.05/lib/licenses.nix +.. _#npm: https://docs.npmjs.com/cli/v8/configuring-npm/package-json#license +.. _#nuget: https://docs.microsoft.com/en-us/nuget/reference/nuspec#licenseurl +.. _#numpyissue: https://github.com/numpy/numpy/issues/8689 +.. _#opensuse: https://en.opensuse.org/openSUSE:Packaging_guidelines#Licensing +.. _#opensuselist: https://docs.google.com/spreadsheets/d/14AdaJ6cmU0kvQ4ulq9pWpjdZL5tkR03exRSYJmPGdfs/pub +.. _#openwrt: https://openwrt.org/docs/guide-developer/packages#buildpackage_variables +.. _#osi: https://opensource.org +.. _#packagingguidetxt: https://packaging.python.org/guides/distributing-packages-using-setuptools/#license-txt +.. _#packagingissue: https://github.com/pypa/packaging-problems/issues/41 +.. _#packaginglicense: https://github.com/pypa/packaging/blob/21.2/LICENSE +.. _#packagingtutkey: https://packaging.python.org/tutorials/packaging-projects/#configuring-metadata +.. _#packagingtuttxt: https://packaging.python.org/tutorials/packaging-projects/#creating-a-license +.. _#pbr: https://docs.openstack.org/pbr/latest/user/features.html +.. _#pep621spec: https://packaging.python.org/specifications/declaring-project-metadata/ +.. _#pepissue: https://github.com/pombredanne/spdx-pypi-pep/issues/1 +.. _#perl: https://metacpan.org/pod/CPAN::Meta::Spec#license +.. _#pipsetup: https://github.com/pypa/pip/blob/21.3.1/setup.cfg#L114 +.. _#poetry: https://python-poetry.org/docs/pyproject/#license +.. _#pycode: https://github.com/search?l=Python&q=%22__license__%22&type=Code +.. _#pypi: https://pypi.org/ +.. _#pypugglossary: https://packaging.python.org/glossary/ +.. _#pytorch: https://pypi.org/project/torch/ +.. _#reuse: https://reuse.software/ +.. _#reusediscussion: https://github.com/pombredanne/spdx-pypi-pep/issues/7 +.. _#samplesetupcfg: https://github.com/pypa/sampleproject/blob/3a836905fbd687af334db16b16c37cf51dcbc99c/setup.cfg +.. 
_#samplesetuppy: https://github.com/pypa/sampleproject/blob/3a836905fbd687af334db16b16c37cf51dcbc99c/setup.py#L98 +.. _#scancodetk: https://github.com/nexB/scancode-toolkit +.. _#scipyissue: https://github.com/scipy/scipy/issues/7093 +.. _#sdistspec: https://packaging.python.org/specifications/source-distribution-format/ +.. _#setuptools5911: https://github.com/pypa/setuptools/blob/v59.1.1/setup.cfg +.. _#setuptoolsfiles: https://github.com/pypa/setuptools/issues/2739 +.. _#setuptoolspep639: https://github.com/pypa/setuptools/pull/2645 +.. _#setuptoolssdist: https://github.com/pypa/setuptools/pull/1767 +.. _#spdx: https://spdx.dev/ +.. _#spdxid: https://spdx.dev/ids/ +.. _#spdxlist: https://spdx.org/licenses/ +.. _#spdxpression: https://spdx.github.io/spdx-spec/SPDX-license-expressions/ +.. _#spdxpy: https://github.com/spdx/tools-python/ +.. _#spdxtutorial: https://github.com/david-a-wheeler/spdx-tutorial +.. _#spdxversion: https://github.com/pombredanne/spdx-pypi-pep/issues/6 +.. _#uboot: https://www.denx.de/wiki/U-Boot/Licensing +.. _#unlicense: https://unlicense.org/ +.. _#wheelfiles: https://github.com/pypa/wheel/issues/138 +.. _#wheelproject: https://wheel.readthedocs.io/en/stable/ +.. _#wheels: https://github.com/pypa/wheel/blob/0.37.0/docs/user_guide.rst#including-license-files-in-the-generated-wheel-file +.. _#wheelspec: https://packaging.python.org/specifications/binary-distribution-format/ Copyright ========= -This document is placed in the public domain or under the CC0-1.0-Universal -license [#cc0]_, whichever is more permissive. +This document is placed in the public domain or under the +`CC0-1.0-Universal license <#cc0_>`_, whichever is more permissive. Acknowledgments From 7d71f170e524468cc060b70c46ebc857a582cd16 Mon Sep 17 00:00:00 2001 From: "C.A.M. 
Gerlach" Date: Sun, 28 Nov 2021 21:04:45 -0600 Subject: [PATCH 17/19] PEP 639: Copyedit and proofread for grammar, phrasing, clarity & tone --- pep-0639.rst | 637 +++++++++++++++++++++++++-------------------------- 1 file changed, 318 insertions(+), 319 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index 32fe94696e4..ef3385f41f6 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -19,28 +19,25 @@ Abstract ======== This PEP defines a specification for how licenses are documented in the -`core metadata <#coremetadataspec_>`_ via a new ``License-Expression`` field, +`core metadata <#coremetadataspec_>`_, with `license expression strings `_ using -`SPDX identifiers <#spdxid_>`_. - -This will make license declarations simpler and less ambiguous for: - -- package authors to create, -- package users to read and understand, and, -- tools to process package license information mechanically. +`SPDX identifiers <#spdxid_>`_ in a new ``License-Expression`` field. +This will make license declarations simpler and less ambiguous for +package authors to create, end users to read and understand, and +tools to programmatically process. The PEP also: -- `Formally specifies `_ a new ``License-File`` field - and how license files should be +- `Formally specifies `_ a new ``License-File`` field, + and defines how license files should be `included in distributions `_, as already used by Wheel and Setuptools. - `Deprecates `_ the legacy ``License`` field and ``license ::`` classifiers. -- `Adds and deprecates `_ corresponding keys in the - PEP 621 project source metadata format. +- `Adds and deprecates `_ the corresponding keys + in the PEP 621 project source metadata format. 
- `Provides clear guidance `_ for authors and tools converting legacy license metadata, adding license files and @@ -65,7 +62,7 @@ and make minor additions to the `source distribution (sdist) <#sdistspec_>`_, Goals ===== -This PEP's scope is limited strictly to how we document the license of a +This PEP's scope is limited to how we document the license of a distribution package, specifically covering: - An improved and structured way to include a license expression. @@ -92,24 +89,23 @@ implement the recommendations for validation and warnings specified here. Non-Goals ========= -This PEP is neutral regarding the choice of license by any package author. +This PEP is neutral regarding the choice of license by any particular +package author. This PEP makes no recommendation for specific licenses, +and does not require the use of a particular license documentation convention. -In particular, the SPDX license expression syntax proposed in this PEP provides -simpler and more expressive conventions to document accurately any kind of -license that applies to a Python package, whether it is an open source license, -a free or libre software license, a proprietary license, or a combination of -such licenses. +Rather, the SPDX license expression syntax proposed in this PEP provides a +simpler and more expressive mechanism to accurately document any kind of +license that applies to a Python package, whether it is open source, +free/libre, proprietary, or a combination of such. -This PEP makes no recommendation for specific licenses and does not require the -use of specific license documentation conventions. This PEP also does not impose -any additional restrictions when uploading to PyPI, unless projects choose to -make use of the new fields. +This PEP also does not impose any additional restrictions when uploading to +PyPI, unless projects choose to make use of the new fields. 
Instead, it is intended to document best practices already in use, extend them to use a new formally-specified and supported mechanism, and provide guidance for packaging tools on how to handle the transition and inform users accordingly. -This PEP is not about license documentation in files inside projects, +This PEP also is not about license documentation in files inside projects, though this is a `surveyed topic `_ in the appendix, and nor does it intend to cover cases where the source and binary distribution packages don't have @@ -128,36 +124,36 @@ related but separate PEPs in the future, which may include: - Making the ``License-Expression`` and ``License-File`` fields mandatory for publishing tools and PyPI packages. -- Requiring uploads to PyPI to use only FOSS (Free and Open Source Software) +- Requiring uploads to PyPI to use only Free and Open Source Software (FOSS) licenses. Motivation ========== -Software is licensed, and providing accurate licensing information to Python -package users is an important matter. Today, there are multiple places where -licenses are documented in core metadata and there are limitations to what -can be documented. This is often leading to confusion or a lack of clarity both -for package authors and users. - -Several package authors have expressed difficulty and/or frustrations due to the -limited capabilities to express licensing in project metadata. This also applies -to Linux and BSD distribution packagers. This has triggered several -license-related discussions and issues, including on -`outdated and ambiguous PyPI classifiers <#classifierissue_>`_, +All software is licensed, and providing accurate license information to Python +package users is an important matter. Today, there are multiple fields where +licenses are documented in core metadata, and there are limitations to what +can be expressed in each of them. This often leads to confusion and a lack of +clarity, both for package authors and end users. 
+ +Many package authors have expressed difficulty and frustrations due to the +limited capabilities to express licensing in project metadata, and this +creates further trouble for Linux and BSD distribution re-packagers. +This has triggered a number of license-related discussions and issues, +including on `outdated and ambiguous PyPI classifiers <#classifierissue_>`_, `license interoperability with other ecosystems <#interopissue_>`_, -`too many confusing/limited license metadata options <#packagingissue_>`_, +`too many confusing license metadata options <#packagingissue_>`_, `limited Wheel support for license files <#wheelfiles_>`_, and `the lack of clear, precise and standardized license metadata <#pepissue_>`_. -On average, Python packages tend to have more ambiguous, or missing, license -information than other common application package formats (such as npm, Maven or -Gem) as can be seen in the `statistics page <#cdstats_>`_ of the -`ClearlyDefined project <#clearlydefined_>`_ that cover all packages from -PyPI, Maven, npm and Rubygems. ClearlyDefined is an open source project -to help improve clarity of other open source projects that is incubating at -the `Open Source Initiative <#osi_>`_. +On average, Python packages tend to have more ambiguous and missing license +information than other common ecosystems (such as npm, Maven or +Gem). This is supported by the `statistics page <#cdstats_>`_ of the +`ClearlyDefined project <#clearlydefined_>`_, an +`Open Source Initiative <#osi_>`_ incubated effort to help +improve licensing clarity of other FOSS projects, covering all packages +from PyPI, Maven, npm and Rubygems. Rationale @@ -174,16 +170,16 @@ There are a few takeaways from the survey: - Most package formats use a single ``License`` field. -- Many modern package formats use some form of license expression syntax to +- Many modern package systems use some form of license expression syntax to optionally combine more than one license identifier together. 
SPDX and SPDX-like syntaxes are the most popular in use. -- SPDX license identifiers are becoming a de-facto way to reference common - licenses everywhere, whether or not a license expression syntax is used. +- SPDX license identifiers are becoming the de-facto way to reference common + licenses everywhere, whether or not a full license expression syntax is used. - Several package formats support documenting both a license expression and the - paths of the corresponding files that contain the license text. Most free and - open source software licenses require package authors to include their full + paths of the corresponding files that contain the license text. Most Free and + Open Source Software licenses require package authors to include their full text in a distribution. These considerations have guided the design and recommendations of this PEP. @@ -207,7 +203,7 @@ examples, binaries or other assets under other licenses. It also requires both authors and tools understand and implement the PyPI-specific bespoke classifier system, rather than using short, easy to add and standardized SPDX identifiers in a simple text field, as increasingly widely adopted by -most other packaging systems, reducing the overall burden on the ecosystem. +most other packaging systems to reduce the overall burden on the ecosystem. Finally, this does not provide as clear an indicator that a package has adopted the new system, and should be treated accordingly. @@ -215,14 +211,15 @@ The use of a new ``License-Expression`` field will provide an intuitive, structured and unambiguous way to express the license of a package using a well-defined syntax and well-known license identifiers. 
Similarly, a formally-specified ``License-File`` field offers a standardized -way to declare the full text of the license(s) as legally required to be -included with the package when distributed, and allows other tools consuming +way to ensure that the full text of the license(s) are included with the +package when distributed, as legally required, and allows other tools consuming the core metadata to unambiguously locate a distribution's license files. -Over time, encouraging the use of these fields and deprecating and ambiguous, -duplicative legacy alternatives will help Python software publishers improve -the clarity, accuracy and portability of their licensing practices, -to the benefit of package authors, consumers and redistributors alike. +Over time, encouraging the use of these fields and deprecating the ambiguous, +duplicative and confusing legacy alternatives will help Python software +publishers improve the clarity, accuracy and portability of their licensing +practices, to the benefit of package authors, consumers and redistributors +alike. Terminology @@ -230,19 +227,19 @@ Terminology This PEP seeks to clearly define the terms it uses, specifically those that: -- Have multiple established meanings (import vs. distribution package, - wheel *format* vs. Wheel *package*). +- Have multiple established meanings (e.g. import vs. distribution package, + wheel *format* vs. Wheel *project*). - Are related and often used interchangeably, but have critical - distinctions in meaning (PEP 621 *key* vs. core metadata *field*, + distinctions in meaning (e.g. PEP 621 *key* vs. core metadata *field*, a point of apparent confusion in PEP 621 with significant effects on this PEP). - Are existing concepts that don't have formal terms/definitions - (project/source metadata vs. distribution/built metadata, + (e.g. project/source metadata vs. distribution/built metadata, build vs. publishing tools). -- Are new concepts introduced here (license expression/identifier). 
+- Are new concepts introduced here (e.g. license expression/identifier). Whenever available, definitions are excerpted from the `PyPA PyPUG Glossary <#pypugglossary_>`_ and `SPDX <#spdx_>`_. Terms are listed @@ -252,7 +249,7 @@ for the purposes of this PEP (``Syn:``). **Built Distribution** *(Syn: Binary Distribution/Wheel)* A Distribution format containing files and metadata that only need to be - moved to the correct location on the target system, to be installed. + moved to the correct location on the target system to be installed. Wheel is such a format, whereas distutil's *[sic]* Source Distribution is not. *(PyPUG Glossary)* @@ -303,9 +300,9 @@ for the purposes of this PEP (``Syn:``). and ``LicenseRef-Proprietary`` strings. Examples: ``MIT``, ``GPL-3.0-only`` **Project** *(Sub: Project Source Tree, Installed Project)* - A library, framework, script, plugin, application, or collection of data + A library, framework, script, plugin, application, collection of data or other resources, or some combination thereof that is intended to be - packaged into a Distribution. I.E., contains a ``pyproject.toml``, + packaged into a Distribution. Generally contains a ``pyproject.toml``, ``setup.py``, or ``setup.cfg`` file at the root of the project source directory. *(PyPUG Glossary)* @@ -356,7 +353,7 @@ for the purposes of this PEP (``Syn:``). Here, **wheel**, the standard built distribution format introduced in PEP 427 and `specified by PyPA <#wheelspec_>`_, will be referred to in lowercase, while the `Wheel project <#wheelproject_>`_, its reference implementation, - which will be referred to as **Wheel** in Title Case. + will be referred to as **Wheel** in Title Case. Specification @@ -411,19 +408,19 @@ missing, and MAY raise an error. Build tools MAY issue a similar warning, but MUST NOT raise an error. 
A license expression is a string using the SPDX license expression syntax as -documented in the `SPDX specification <#spdxpression_>`_ using either +documented in the `SPDX specification <#spdxpression_>`_, either Version 2.2 or a later compatible version. -When used in the ``License-Expression`` field and as a specialization of the SPDX -license expression definition, a license expression can use the following license -identifiers: +When used in the ``License-Expression`` field and as a specialization of +the SPDX license expression definition, a license expression can use the +following license identifiers: - Any SPDX-listed license short-form identifiers that are published in the `SPDX License List <#spdxlist_>`_, version 3.15 or any later compatible version. Note that the SPDX working group never removes any license identifiers; instead, they may choose to mark an identifier as "deprecated". -- The ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings to +- The ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings, to identify licenses that are not included in the SPDX license list. When processing the ``License-Expression`` field to determine if it contains @@ -436,7 +433,7 @@ a valid license expression, build and publishing tools: - One or more license identifiers are not valid (as defined above) - SHOULD report an informational warning, and publishing tools MAY raise an - error if one or more license identifiers have been marked as deprecated in + error, if one or more license identifiers have been marked as deprecated in the `SPDX License List <#spdxlist_>`_. 
- MUST store a case-normalized version of the ``License-Expression`` field @@ -450,8 +447,8 @@ a valid license expression, build and publishing tools: For all newly-upload distributions that include a ``License-Expression`` field, the `Python Package Index (PyPI) <#pypi_>`_ MUST validate that it contains a valid, case-normalized license expression with -valid identifiers (as defined here) and MUST reject uploads that do not -validate. PyPI MAY reject an upload for using a deprecated license identifier, +valid identifiers (as defined here) and MUST reject uploads that do not. +PyPI MAY reject an upload for using a deprecated license identifier, so long as it was deprecated as of the above-mentioned SPDX License List version. @@ -461,11 +458,10 @@ Add ``License-File`` field The ``License-File`` optional field is specified to contain the string representation of the path to a license-related file, relative to the -root license directory. Files specified under this field -could include license text, author/attribution information, or other -legal notices that need to be distributed with the package. -It is a multi-use field that may appear zero or more times, -each instance listing the path to one such file. +root license directory. It is a multi-use field that may appear zero or +more times, each instance listing the path to one such file. Files specified +under this field could include license text, author/attribution information, +or other legal notices that need to be distributed with the package. If a ``License-File`` is listed in a source or built distribution's core metadata, that file MUST be included in the distribution at the specified path @@ -475,7 +471,7 @@ distribution at that same relative path. The **root license directory** is defined to be the project root directory for source trees and source distributions, and the ``license_files`` subdirectory of the directory containing the core metadata (i.e. 
the -``.dist-info`` directory containing the ``METADATA`` file), for built +``.dist-info`` directory containing the ``METADATA`` file) for built distributions and installed projects. The specified relative path MUST be consistent between project source trees, @@ -515,8 +511,7 @@ instead. For all newly-uploaded distributions that include a ``License-Expression`` field, the `Python Package Index (PyPI) <#pypi_>`_ MUST reject any that specify a ``License`` field and the text of which is not -identical to that of ``License-Expression``, as -`defined above `_. +identical to that of ``License-Expression``, as defined in this section. Along with license classifiers, the ``License`` field may be removed from a new version of the specification in a future PEP. @@ -525,7 +520,7 @@ new version of the specification in a future PEP. Deprecate license classifiers ''''''''''''''''''''''''''''' -Including license `classifiers <#classifiers_>`_ in the ``Classifier`` field +Using license `classifiers <#classifiers_>`_ in the ``Classifier`` field (described in PEP 301) is deprecated and replaced by the more precise ``License-Expression`` field. @@ -560,7 +555,7 @@ defines how to declare a project's source metadata in a ``[project]`` table in the ``pyproject.toml`` file for build tools to consume and output distribution core metadata. -This PEP `adds `_ the ``license-expression``, +This PEP `adds `_ the ``license-expression`` key, `adds `_ the ``license-files`` key and `deprecates `_ the ``license`` key. @@ -569,7 +564,8 @@ Add ``license-expression`` key '''''''''''''''''''''''''''''' A new ``license-expression`` key is added to the ``project`` table, which has -a string value that is a valid SPDX license expression, as defined previously. +a string value that is a valid SPDX license expression, as +`defined previously `_. Its value maps to the ``License-Expression`` field in the core metadata. 
Build tools SHOULD validate the expression as described @@ -597,10 +593,10 @@ containing licenses and other legal notices to be distributed with the package. It corresponds to the ``License-File`` fields in the core metadata. Its value is a table, which if present MUST contain one of two optional, -mutually exclusive subkeys, ``paths`` and ``globs``; both arrays of strings. -If both are specified, tools MUST raise an error. -The ``paths`` subkey contains verbatim file paths, and the ``globs`` subkey -valid glob patterns, parsable by the ``glob`` `module <#globmodule_>`_ in the +mutually exclusive subkeys, ``paths`` and ``globs``; if both are specified, +tools MUST raise an error. Both are arrays of strings; the ``paths`` subkey +contains verbatim file paths, and the ``globs`` subkey valid glob patterns, +which MUST be parsable by the ``glob`` `module <#globmodule_>`_ in the Python standard library. **Note**: To avoid ambiguity, confusion and (per PEP 20, the Zen of Python) @@ -670,8 +666,8 @@ Deprecate ``license`` key ''''''''''''''''''''''''' The ``license`` key in the ``project`` table is now deprecated. -It MUST not be used if either of the new ``license-expression`` or -``license-files`` keys are defined, nor should it be listed as ``dynamic``, +It MUST NOT be used or listed as ``dynamic`` if either of the new +``license-expression`` or ``license-files`` keys are defined, and build tools MUST raise an error if either is the case. Otherwise, if the ``text`` subkey is present in the ``license`` table, tools @@ -692,7 +688,7 @@ and ensure that users do not unknowingly create packages that are not legally distributable, tools MUST assume the above default value for the ``license-files`` key and also include, in addition to the license file specified under this ``file`` subkey, any license files that match the -corresponding list of patterns. +specified list of patterns. The ``license`` key may be removed from a new version of the specification in a future PEP. 
@@ -737,7 +733,7 @@ location the license file tree is rooted in for each format, per the directory MUST contain a ``license_files`` subdirectory which MUST contain the files listed in the ``License-File`` fields in the ``METADATA`` file at their respective paths relative to the ``license_files`` directory, - and that any files in this directory MUST be copied from installed wheels + and that any files in this directory MUST be copied from wheels by install tools. @@ -780,7 +776,7 @@ into the ``License-Expression`` field following the `specification above `_. Many legacy license classifiers intend to specify a particular license, -but do not specify the particular version or variant, leading to +but do not specify the particular version or variant, leading to a `critical ambiguity <#classifierissue_>`_ as to their terms, compatibility and acceptability. Tools MUST NOT attempt to automatically infer a ``License-Expression`` when one of these classifiers is used, and SHOULD @@ -810,7 +806,7 @@ MAY use as a reference for the identifier selection options to offer users when prompting the user to explicitly select the license identifier they intended for their project. -**Note**: A couple additional classifiers, namely the "or later" variants of +**Note**: Several additional classifiers, namely the "or later" variants of the AGPLv3, GPLv2, GPLv3 and LGPLv3, are also listed in the aforementioned mapping, but as they were merely proposed for textual harmonization and still unambiguously map to their respective respective licenses, @@ -823,7 +819,7 @@ considered canonical and normative for the purposes of this specification: - Classifier ``License :: Public Domain`` MAY be mapped to the generic ``License-Expression: LicenseRef-Public-Domain``. 
If tools do so, they SHOULD issue an informational warning encouraging - the use of more explicit and legally portable license identifiers + the use of more explicit and legally portable license identifiers, such as those for the `CC0 1.0 license <#cc0_>`_ (``CC0-1.0``), the `Unlicense <#unlicense_>`_ (``Unlicense``), or the `MIT license <#mitlicense_>`_ (``MIT``), @@ -857,11 +853,11 @@ considered canonical and normative for the purposes of this specification: Therefore, tools MUST treat them as ambiguous when attempting to fill ``License-Expression``. -When multiple license classifiers are used, their relation is ambiguous +When multiple license classifiers are used, their relationship is ambiguous, and it is typically not possible to determine if all the licenses apply or if there is a choice that is possible among the licenses. In this case, tools -MUST NOT automatically infer a license expression and SHOULD suggest that the -package author construct a license expression which expresses their intent. +MUST NOT automatically infer a license expression, and SHOULD suggest that the +package author construct one which expresses their intent. User Scenarios @@ -869,22 +865,22 @@ User Scenarios The following covers the range of common use cases from a user perspective, providing straightforward guidance for each. Do note that the following -should **not** be considered legal advice, and you should consult a licensed -attorney if you are unsure about the specifics for your situation. +should **not** be considered legal advice, and readers should consult a +licensed attorney if they are unsure about the specifics for their situation. I have a private package that won't be distributed -------------------------------------------------- If your package isn't shared publicly, i.e. outside your company, -organization or household, it *usually* isn't necessary to include a formal -license, so you wouldn't have to do anything extra here.
+organization or household, it *usually* isn't strictly necessary to include +a formal license, so you wouldn't necessarily have to do anything extra here. -To be more explicit, it is still a good idea to include -``LicenseRef-Proprietary`` as a license expression in your package -configuration, and/or a copyright statement and any legal notices in a -``LICENSE.txt`` file in the root of your project directory, which will be -automatically included by packaging tools. +However, it is still a good idea to include ``LicenseRef-Proprietary`` +as a license expression in your package configuration, and/or a +copyright statement and any legal notices in a ``LICENSE.txt`` file +in the root of your project directory, which will be automatically +included by packaging tools. I just want to share my own work without legal restrictions @@ -893,7 +889,7 @@ I just want to share my own work without legal restrictions While you aren't required to include a license, if you don't, no one has `any permission to download, use or improve your work <#dontchoosealicense_>`_, so that's probably the *opposite* of what you actually want. -The `MIT license <#mitlicense_>`_ is a great choice for this, as its simple, +The `MIT license <#mitlicense_>`_ is a great choice instead, as it's simple, widely used and allows anyone to do whatever they want with your work (other than sue you, which you probably also don't want). 
@@ -909,11 +905,11 @@ I want to distribute my project under a specific license -------------------------------------------------------- To use a particular license, simply paste its text into a ``LICENSE.txt`` -file at the root of your repo (if you don't have it in a file starting with -``LICENSE`` or ``COPYING`` already), and add +file at the root of your repo, if you don't have it in a file starting with +``LICENSE`` or ``COPYING`` already, and add ``license-expression = "LICENSE-ID"`` under ``[project]`` in your -``pyproject.toml`` if your packaging tool supports it, or in its config -file (e.g. for Setuptools, ``license_expression = LICENSE-ID`` +``pyproject.toml`` if your packaging tool supports it, or else in its +config file (e.g. for Setuptools, ``license_expression = LICENSE-ID`` under ``[metadata]`` in ``setup.cfg``). You can find the ``LICENSE-ID`` and copyable license text on sites like `ChooseALicense <#choosealicenselist_>`_ or `SPDX <#spdxlist_>`_. @@ -943,13 +939,13 @@ If your license files begin with ``LICENSE``, ``COPYING``, ``NOTICE`` or (e.g. ``license_files`` in ``setup.cfg``), you should already be good to go. If not, make sure to list them under ``license-files.paths`` or ``license-files.globs`` under ``[project]`` in ``pyproject.toml`` -(if your tool supports it), or in your tool's configuration file +(if your tool supports it), or else in your tool's configuration file (e.g. ``license_files`` in ``setup.cfg`` for Setuptools). See the `basic example`_ for a simple but complete real-world demo of how this works in practice, including some additional technical details. Packaging tools may support automatically converting legacy licensing -metadata; check your tool's documentation for details. +metadata; check your tool's documentation for more information. 
My package includes other code under different licenses @@ -966,7 +962,7 @@ to your project, or parts of it (for example, you included a file under another license), and ``License-1 OR License-2`` means that *either* of the licenses can be used, at the user's option (for example, you want to allow users a choice of multiple licenses). You can use -parenthesis (``()``) for grouping to form expressions to cover even the most +parentheses (``()``) for grouping to form expressions that cover even the most complex situations. In your project config file, enter your license expression under @@ -981,7 +977,7 @@ and begin with ``LICENSE``, ``COPYING``, ``NOTICE`` or ``AUTHORS``, they will be included automatically. Otherwise, you'll need to list the relative path or glob patterns to each of them under ``license-files.paths`` or ``license-files.globs`` under ``[project]`` in ``pyproject.toml`` -(if your tool supports it), or in your tool's configuration file +(if your tool supports it), or else in your tool's configuration file (e.g. ``license_files`` in ``setup.cfg`` for Setuptools). As an example, if your project was licensed MIT but incorporated @@ -998,7 +994,7 @@ as literal file paths. See a fully worked out `advanced example`_ for a comprehensive end-to-end application of this to a real-world complex project, with copious technical details, and consult a `tutorial <#spdxtutorial_>`_ for more help and examples -on using SPDX identifiers and expressions. +using SPDX identifiers and expressions. Backwards Compatibility @@ -1029,7 +1025,7 @@ adopt it.
Due to requiring license files not be flattened into ``.dist-info`` and specifying that they should be placed in a dedicated ``license_files`` subdir, -wheels produced with following this change will have differently-located +wheels produced following this change will have differently-located licenses relative to those produced via the previous unspecified, installer-specific behavior, but as until this PEP there was no way of discovering these files or accessing them programmatically, and this will @@ -1061,7 +1057,7 @@ and standardizes what is already being done. Finally, while this PEP does propose PyPI implement validation of the new ``License-Expression`` and ``License-File`` fields, this has no effect on existing packages, nor any effect on any new distributions uploaded unless they -explicitly choose to include these new fields while unintentionally not +explicitly choose to opt in to using these new fields while not following the requirements in the specification. Therefore, this does not have a backward compatibility impact, and in fact ensures forward compatibility with any future changes by ensuring all distributions uploaded to PyPI with the new @@ -1072,23 +1068,23 @@ Security Implications ===================== This PEP has no foreseen security implications: the ``License-Expression`` -field is a plain string and the License-File(s) are file paths. -None of them introduces any known new security concerns. +field is a plain string and the ``License-File`` fields are file paths. +Neither introduces any known new security concerns. How to Teach This ================= The simple cases are simple: a single license identifier is a valid license -expression and a large majority of packages use a single license. +expression, and a large majority of packages use a single license. 
The plan to teach users of packaging tools how to express their package's license with a valid license expression is to have tools issue informative messages when they detect invalid license expressions, or when the deprecated -``License`` field or a license classifier is used. +``License`` field or license classifiers are used. An immediate, descriptive error message if an invalid ``License-Expression`` -is used will help users understand they need to use valid SPDX identifiers in +is used will help users understand they need to use SPDX identifiers in this field, and catch them if they make a mistake. For authors still using the now-deprecated, less precise and more redundant ``License`` field or license classifiers, packaging tools will warn @@ -1108,8 +1104,8 @@ many, if not most common cases: ``License`` value and convert that to a ``License-Expression``. For instance, a tool may suggest converting from a ``License`` field with ``Apache2`` (which is not a valid license expression as defined in this PEP) - to a ``License-Expression`` field with ``Apache-2.0`` (which is a valid license - expression using an SPDX license identifier). + to a ``License-Expression`` field with ``Apache-2.0`` (which is a valid + license expression using an SPDX license identifier). Reference Implementation @@ -1119,13 +1115,13 @@ Tools will need to support parsing and validating license expressions in the ``License-Expression`` field. The `license-expression library <#licenseexplib_>`_ is a reference Python -implementation of a library that handles license expressions including parsing, -validating and formatting license expressions using flexible lists of license -symbols (including SPDX license identifiers and any extra identifiers referenced -here). 
It is licensed under the Apache-2.0 license and is used in a few projects -such as the `SPDX Python Tools <#spdxpy_>`_, +implementation that handles license expressions including parsing, +formatting and validation, using flexible lists of license symbols +(including SPDX license IDs and any extra identifiers included here). +It is licensed under Apache-2.0 and is already used in several projects, +including the `SPDX Python Tools <#spdxpy_>`_, the `ScanCode toolkit <#scancodetk_>`_ -and the Free Software Foundation Europe (FSFE) `Reuse project <#reuse_>`_. +and the Free Software Foundation Europe (FSFE) `REUSE project <#reuse_>`_. Rejected Ideas @@ -1142,12 +1138,12 @@ Re-use the ``License`` field '''''''''''''''''''''''''''' Following `initial discussion <#reusediscussion_>`_, earlier versions of this -PEP proposed to re-use the existing ``License`` field, which tools would -attempt to parse as a SPDX license expression with a fall back to treating -as free text. Initially, this would merely cause a warning (or even pass -silently), but would eventually be treated as an error by modern tooling. +PEP proposed re-using the existing ``License`` field, which tools would +attempt to parse as an SPDX license expression with a fallback to free text. +Initially, this would merely cause a warning (or even pass silently), +but would eventually be treated as an error by modern tooling. -This offered the benefit of greater backwards-compatibility, +This offered the potential benefit of greater backwards-compatibility, easing the community into using SPDX license expressions while taking advantage of packages that already have them (either intentionally or coincidentally), and avoided adding yet another license-related field. @@ -1162,7 +1158,7 @@ easily and unambiguously detect invalid content.
This avoids both false positive (``License`` values that a package author didn't explicitly intend as an explicit SPDX identifier, but that happen to validate as one), and false negatives (expressions the author intended -to be valid SPDX, but due to a typo or mistake is not), which are otherwise +to be valid SPDX, but due to a typo or mistake are not), which are otherwise not clearly distinguishable from true positives and negatives, an ambiguity at odds with the goals of this PEP. @@ -1193,17 +1189,17 @@ purpose-created field, ``License-Expression``. Re-Use the ``License`` field with a value prefix '''''''''''''''''''''''''''''''''''''''''''''''' -As an alternative to the above, it was suggested to reduce the ambiguity -inherent in re-using the ``License`` field by prefixing SPDX license -expressions with, e.g. ``spdx:``. However, this effectively amounted to -creating a field within a field, and doesn't address all the downsides of +As an alternative to the above, prefixing SPDX license expressions with, +e.g. ``spdx:`` was suggested to reduce the ambiguity inherent in re-using +the ``License`` field. However, this effectively amounted to creating +a field within a field, and doesn't address all the downsides of keeping the ``License`` field. Namely, it still changes the behavior of an existing metadata field, requires tools to parse its value to determine how to handle its content, and makes the specification and deprecation process more complex and less clean. Yet, it still shares a same main potential downside as just creating a new -field, that projects currently using valid SPDX identifiers in the ``License`` +field: projects currently using valid SPDX identifiers in the ``License`` field, intentionally or not, won't be automatically recognized, and requires about the same amount of effort to fix, namely changing a line in the project's source metadata. Therefore, it was rejected in favor of a new field. 
@@ -1226,7 +1222,7 @@ Don't deprecate existing ``License`` field and classifiers '''''''''''''''''''''''''''''''''''''''''''''''''''''''''' Several community members were initially concerned that deprecating the -existing ``License`` field and license classifiers would result in +existing ``License`` field and classifiers would result in excessive churn for existing package authors and raise the barrier to entry for new ones, particularly everyday Python developers seeking to package and publish their personal projects without necessarily caring @@ -1282,8 +1278,9 @@ automated tooling to take care of this with no human changes needed. More complex cases where license metadata is currently specified may need a bit of human intervention, but in most cases tools will be able to provide a list of options following the mappings in this PEP, and -these are typically the projects most likely to be concerned about -licensing issues in any case, and thus most benefited by this PEP. +these are typically the projects most likely to be constrained by the +limitations of the existing license metadata, and thus most benefited +by the new fields in this PEP. Finally, for unmaintained packages, those using tools supporting older metadata versions, or those who choose not to provide license metadata, @@ -1299,7 +1296,7 @@ for PyPI (or other package indicies) as to whether and how they should validate the ``License-Expression`` or ``License-File`` fields, nor how they should handle using them in combination with the deprecated ``License`` field or license classifiers. This simplifies the specification -and either defers implementation on PyPI to a later PEP, or gives +and either defers implementation on PyPI to a later PEP, and gives discretion to PyPI to enforce the stated invariants, to minimize disruption to package authors. 
@@ -1308,16 +1305,16 @@ field was separate from the existing ``License``, which would make validation much more challenging and backwards-incompatible, breaking existing packages. With that change, there was a clear consensus that the new field should be validated from the start, guaranteeing that all -distributions uploaded to PyPI that declare adhere to core metadata version 2.3 +distributions uploaded to PyPI that declare core metadata version 2.3 or higher and have the ``License-Expression`` field will have a valid -expression that PyPI and consumers of its packages and metadata can rely upon -to follow the specification here. +expression, such that PyPI and consumers of its packages and metadata +can rely upon it to follow the specification here. The same can be extended to the ``License-File`` field, as also specified here, to ensure that it is valid and the legally required license files present, and thus it is lawful for PyPI, users and downstream consumers -to distribute the package (of course, this makes no _guarentee_ of such -as it is ultimately reliant on authors to declare such, but it improves +to distribute the package (of course, this makes no *guarantee* of such +as it is ultimately reliant on authors to declare them, but it improves assurance of this and allows doing so in the future if the community so decides). To be clear, this would not require that any uploaded distribution have such metadata, only that if they choose to declare it per the new @@ -1337,8 +1334,8 @@ Add ``expression`` and ``files`` subkeys to table A previous working draft of this PEP added ``expression`` and ``files`` subkeys to the existing ``license`` table in the PEP 621 source metadata, to parallel the existing ``file`` and ``text`` subkeys. While this seemed perhaps the -most obvious approach at first, it had several serious drawbacks relative to -that ultimately taken here.
+most obvious approach at first glance, it had several serious drawbacks +relative to that ultimately taken here. Most saliently, this means two very different types of metadata are being specified under the same top-level key that require very different handling, @@ -1377,15 +1374,15 @@ design overall. It allows ``license`` and ``license-files`` to be tagged ``dynamic`` independently, separates two independent types of metadata (syntactically and semantically), restores a closer to 1:1 mapping of PEP 621 keys to core metadata fields, and reduces nesting by a level for both. -Other than adding two extra keys to the file, there was no real apparent -downside to this latter approach, so it was adopted for this PEP. +Other than adding two extra keys to the file, there was no significant +apparent downside to this latter approach, so it was adopted for this PEP. Define license expression as string value ''''''''''''''''''''''''''''''''''''''''' A compromise approach between adding two new top-level keys for license -expressions and files would be to add a separate ``license-files`` key, +expressions and files would be adding a separate ``license-files`` key, but re-using the ``license`` key for the license expression, either by defining it as the (previously reserved) string value for the ``license`` key, retaining the ``expression`` subkey in the ``license`` table, or @@ -1443,14 +1440,14 @@ Add a ``type`` key to treat as expression ''''''''''''''''''''''''''''''''''''''''' Instead of creating a new top-level ``license-expression`` key in the -PEP 621 source metadata, we could add a ``type`` subkey to the existing +PEP 621 source metadata, one could add a ``type`` subkey to the existing ``license`` table to control whether ``text`` (or a string value) is interpreted as free-text or a license expression. 
This could make backward compatibility a little more seamless, as older tools could ignore it and always treat ``text`` as ``license``, while newer tools would know to treat it as a license expression, if ``type`` was set appropriately. -Indeed, PEP 621 suggests something of this sort as a possible alternative -way that SPDX license expressions could be implemented. +Indeed, PEP 621 seems to suggest something of this sort as a possible +alternative way that SPDX license expressions could be implemented. However, all the same downsides as in the previous item apply here, including greater complexity, a more complex mapping between the project @@ -1496,16 +1493,16 @@ why such is explicitly prohibited by this PEP. Therefore, not marking it as requirements. Finally, users explicitly being told to mark it as ``dynamic``, or not, to -control filling behavior is both a mis-use of the ``dynamic`` field as -apparently intended, and prevents tools from adapting to best practices -(fill, don't fill, etc) as they develop and evolve over time. +control filling behavior seems to be a bit of a mis-use of the ``dynamic`` +field as apparently intended, and prevents tools from adapting to best +practices (fill, don't fill, etc) as they develop and evolve over time. Source metadata ``license-files`` key ------------------------------------- Alternatives considered for the ``license-files`` key in the -``pyproject.toml`` project source metadata, primarily related to the +PEP 621 project source metadata, primarily related to the path/glob type handling. @@ -1546,8 +1543,8 @@ based only on those glob patterns the user explicitly specified and the filenames in the package, without installing it, executing its code or even examining its files. Furthermore, tools are still explicitly allowed to warn if specified glob patterns (including full paths) don't match any files. -And, of course, sdists, wheels and others will have the -full static list of files specified in their core metadata. 
+And, of course, sdists, wheels and others will have the full static list +of files specified in their distribution metadata. Perhaps most importantly, this would also preclude the currently specified default value, as widely used by the current most popular tools, and thus @@ -1568,19 +1565,21 @@ of nesting, and more closely match the configuration format of existing tools. However, for the cost of a few characters, it ensures users are aware whether they are entering globs or verbatim paths. Furthermore, allowing license files to be specified as literal paths avoids edge cases, such as those -containing glob or other special characters (or those confusingly or even -maliciously similar to them, as described in PEP 672). +containing glob characters (or those confusingly or even maliciously similar +to them, as described in PEP 672). -Including an explicit ``paths`` value guarantees that the resulting +Including an explicit ``paths`` value ensures that the resulting ``License-File`` metadata is correct, complete and purely static in the strictest sense of the term, with all license paths explicitly specified in the ``pyproject.toml`` file, guaranteed to be included and with an early -error should any be missing. +error should any be missing. This is not practical to do, at least without +serious limitations for many workflows, if we must assume the items +are glob patterns rather than literal paths. This allows tools to locate them and know the exact values of the ``License-File`` core metadata fields without having to traverse the source tree of the project and match globs, potentially allowing easier, -more efficient and reliable inspection by tools. +more efficient and reliable programmatic inspection and processing. Therefore, given the relatively small cost and the significant benefits, this approach was not adopted. @@ -1665,8 +1664,8 @@ being explicitly, statically specified, and others. 
Like the previous, if there is a clear need for it, it can be always allowed in the future in a backward-compatible manner (to the extent it is possible -at all), while the same is not true of disallowing it. Therefore, it was -decided to require the two subkeys to be mutually exclusive. +in the first place), while the same is not true of disallowing it. +Therefore, it was decided to require the two subkeys to be mutually exclusive. Rename ``paths`` subkey to ``files`` @@ -1694,20 +1693,20 @@ for license files to be matched and included at all. However, this is merely declaring a static, strictly-specified default value for this particular key, required to be used exactly by all conforming tools (so long as it is not marked ``dynamic``, negating this argument entirely), -and is no less static than any other set of glob patterns. Furthermore, the -resulting ``License-File`` core metadata values can still be determined with -only a list of files in the source, without installing or executing any of the -code, or even inspecting file contents. +and is no less static than any other set of glob patterns the user themself +may specify. Furthermore, the resulting ``License-File`` core metadata values +can still be determined with only a list of files in the source, without +installing or executing any of the code, or even inspecting file contents. Moreover, even if this were not so, practicality would trump purity, as this interpretation would be strictly backwards-incompatible with the existing -format, as it would trigger inconstant behavior with the existing tools. +format, and be inconsistent with the behavior with the existing tools. Further, this would create a very serious and likely risk of a large number of projects unknowingly no longer including legally mandatory license files, -making their distribution illegal, and is thus not a sane, much less sensible -default. 
+making their distribution technically illegal, and is thus not a sane, +much less sensible default. -Finally, aside from adding an additional line of virtually-required boilerplate +Finally, aside from adding an additional line of default-required boilerplate to the file, not defining the default as dynamic allows authors to clearly and unambiguously indicate when their build/packaging tools are going to be handling the inclusion of license files themselves rather than strictly @@ -1748,11 +1747,11 @@ To resolve this, the PEP now proposes, as did contributors on both of the above issues, reproducing the source directory structure of the original license files inside the ``.dist-info`` directory. This would fully resolve the concerns above, with the only downside being a more nested ``.dist-info`` -directory. There is still a risk of filename collision with -edge-case custom filenames (e.g. ``RECORD``, ``METADATA``), but that is also -the case with the previous approach, and in fact with fewer files flattened +directory. There is still a risk of collision with edge-case custom +filenames (e.g. ``RECORD``, ``METADATA``), but that is also the case +with the previous approach, and in fact with fewer files flattened into the root, this would actually reduce the risk. Furthermore, -a followup proposal rooting the license files under a ``license_files`` +the following proposal rooting the license files under a ``license_files`` subdirectory eliminates both collisions and the clutter problem entirely. @@ -1778,11 +1777,9 @@ Dump directly in ``.dist-info`` Previously, the included license files were stored directly in the top-level ``.dist-info`` directory of built wheels and installed projects. This followed existing ad hoc practice, ensured most existing wheels currently using this -feature will match new ones (i.e. 
those projects built with Wheel versions -that include license files but don't specify license files in subdirectories), -and kept the specification simpler, with the license files always being -stored in the same location relative to the core metadata regardless of -distribution type. +feature will match new ones, and kept the specification simpler, with the +license files always being stored in the same location relative to the core +metadata regardless of distribution type. However, this leads to a more cluttered ``.dist-info`` directory, littered with arbitrary license files and subdirectories, as opposed to separating @@ -1817,7 +1814,7 @@ relying on the current behavior and there is much greater uptake of not only simply including license files but potentially accessing them as well using the core metadata, if we're going to change it, now would be the time (particularly since we're already introducing an edge-case change with how -license files in subdirs are handled, as well as other things). +license files in subdirs are handled, along with other refinements). Therefore, the latter has been incorporated into current drafts of this PEP. @@ -1826,8 +1823,8 @@ Add new ``licenses`` category to wheel '''''''''''''''''''''''''''''''''''''' Instead of defining a root license directory (``license_files``) inside -the core metadata directory (``.dist-info``) for wheels, we could -instead define a new category (and, presumably, a corresponding install scheme), +the core metadata directory (``.dist-info``) for wheels, we could instead +define a new category (and, presumably, a corresponding install scheme), similar to the others currently included under ``.data`` in the wheel archive, specifically for license files, called (e.g.) ``licenses``. This was mentioned by the wheel creator, and would allow installing licenses somewhere more @@ -1842,10 +1839,10 @@ and updating the core metadata specification. 
Furthermore, doing so would likely require modifying ``sysconfig`` and the install schemes specified therein, alongside Wheel, Installer and other tools, which would be a non-trivial undertaking. While potentially slightly more complex for -repackagers (such as those for Linux distributions) the current proposal -ensuring all license files are included, and in a single dedicated directory -(which can easily be copied or relocated downstream), should still greatly -improve the status quo in this regard without the attendant complexity. +repackagers (such as those for Linux distributions), the current proposal still +ensures all license files are included, and in a single dedicated directory +(which can easily be copied or relocated downstream), and thus should still +greatly improve the status quo in this regard without the attendant complexity. In addition, this approach is not fully backwards compatible (since it isn't transparent to tools that simply extract the wheel), is a greater @@ -1854,7 +1851,7 @@ license install locations from wheels of different versions. Finally, this would mean licenses were not installed as proximately to their associated code, there would be more variability in the license root path across platforms and between built distributions and installed projects, -accessing installed licenses pro grammatically would be more non-trivial, and a +accessing installed licenses programmatically would be more non-trivial, and a suitable install location and method would need to be created, discussed and decided that would avoid name clashes. @@ -1883,35 +1880,38 @@ ultimately not adopted. Map identifiers to license files '''''''''''''''''''''''''''''''' -This would require using a mapping (two parallel lists would be too prone to -alignment errors) and a mapping would bring extra complication to how license -are documented by adding an additional nesting level. - -A mapping would be needed as you cannot guarantee that all expressions (e.g. 
-GPL with an exception may be in a single file) or all the license keys have a -single license file and that any expression does not have more than one. (e.g. -an Apache license ``LICENSE`` and its ``NOTICE`` file for instance are two -distinct files). Yet in most cases, there is a simpler "one license", "one or -more license files". In the rarer and more complex cases where there are many -licenses involved you can still use the proposed conventions at the cost of a -slight loss of clarity by not specifying which text file is for which license -identifier, but you are not forcing the more complex data model (e.g. a mapping) -on everyone that may not need it. +This would require using a mapping (as two parallel lists would be too prone to +alignment errors), which would add extra complexity to how licenses +are documented and add an additional nesting level. + +A mapping would be needed, as it cannot be guaranteed that all expressions +(keys) have a single license file associated with them (e.g. +GPL with an exception may be in a single file) and that any expression +does not have more than one. (e.g. an Apache license ``LICENSE`` and +its ``NOTICE`` file, for instance, are two distinct files). +For most common cases, a single license expression and one or more license +files would be perfectly adequate. In the rarer and more complex cases where +there are many licenses involved, authors can still safely use the fields +specified here, just with a slight loss of clarity by not specifying which +text file(s) map to which license identifier (though this should be clear in +practice given each license identifier has corresponding SPDX-registered +full license text), while not forcing the more complex data model +(a mapping) on the large majority of users who do not need or want it. We could of course have a data field with multiple possible value types (it's a string, it's a list, it's a mapping!) but this could be a source of confusion.
-This is what has been done for instance in npm (historically) and in Rubygems -(still today) and as result you need to test the type of the metadata field -before using it in code and users are confused about when to use a list or a -string. +This is what has been done, for instance, in npm (historically) and in Rubygems +(still today), and as a result tools need to test the type of the metadata field +before using it in code, while users are confused about when to use a list or a +string. Therefore, this approach is rejected. Map identifiers to source files ''''''''''''''''''''''''''''''' -File-level notices are not considered as part of the scope of this PEP and the +As discussed previously, file-level notices are out of scope for this PEP, and the existing ``SPDX-License-Identifier`` `convention <#spdxid_>`_ can -be used and may not need further specification as a PEP. +already be used if this is needed without further specification here. Don't freeze compatibility with a specific SPDX version @@ -1926,8 +1926,8 @@ However, serious concerns were expressed about a future SPDX update breaking compatibility with existing expressions and identifiers, leaving current packages with invalid metadata per the definition in this PEP. Requiring compatibility with a specific version of these specifications here -and requiring a PEP or similar process to update it avoids that from -occurring, and follows the practice of other packaging ecosystems. +and a PEP or similar process to update it avoids this contingency, +and follows the practice of other packaging ecosystems.
Therefore, it was `decided <#spdxversion_>`_ to specify a minimum version and requires tools to be compatible with it, while still allowing updates @@ -1955,7 +1955,7 @@ mechanism to do so, the fact that each wheel would need its own license information, lack of support on PyPI for exposing license info on a per-distribution archive basis, and the relatively niche use case, it was determined to be out of scope for this PEP, and left to a future PEP -to resolve if sufficient need and interest exists, and an appropriate +to resolve if sufficient need and interest exists and an appropriate mechanism can be found. @@ -2057,12 +2057,12 @@ combining all the license expressions into one. Such an expression might be:: In addition, per the requirements of the licenses, the relevant license files must be included in the package. Suppose the ``LICENSE`` file contains the text of the MIT license and the copyrights used by Setuptools, ``pyparsing``, -``more_itertools`` and ``ordered-set``; and the ``LICENSE`` files in the +``more_itertools`` and ``ordered-set``; and the ``LICENSE*`` files in the ``setuptools/_vendor/packaging/`` directory contain the Apache 2.0 and 2-clause BSD license text, and the Packaging copyright statement and `license choice notice <#packaginglicense_>`_. -Therefore, we assume the license files are located at the following +Specifically, we assume the license files are located at the following paths in the project source tree (relative to the project root and ``pyproject.toml``):: @@ -2221,30 +2221,30 @@ but simpler licensing. For instance, some classifiers lack precision (GPL without a version) and when multiple license classifiers are listed, it is not clear if both licenses must apply, or the user may choose between them. Furthermore, the list of available license classifiers -is often out-of-date. +is rather limited and out-of-date. 
Setuptools and Wheel -------------------- Beyond a license code or qualifier, license text files are documented and -included in a built package either implicitly or explicitly and this is another -possible source of confusion: - -- In `Setuptools <#setuptoolssdist_>`_ and `Wheel <#wheels_>`_, license files - are automatically added to the distribution (at their source location in - in a source distribution/sdist, and in the ``.dist-info`` directory - of a built wheel) if they match one of a number of common license file - name patterns (``LICEN[CS]E*``, ``COPYING*``, ``NOTICE*`` and ``AUTHORS*``). - Alternatively, a package author can specify a list of license file paths to - include in the built wheel under the ``license_files`` key in the - ``[metadata]`` section of the project's ``setup.cfg``, or as an argument - to the ``setuptools.setup()`` function. At present, following Wheel's - lead, Setuptools flattens the collected license files into the metadata - directory, clobbering files with the same name, and dump license files - directly into the top-level ``.dist-info`` directory, but there is a +included in a built package either implicitly or explicitly, +and this is another possible source of confusion: + +- In the `Setuptools <#setuptoolssdist_>`_ and `Wheel <#wheels_>`_ projects, + license files are automatically added to the distribution (at their source + location in a source distribution/sdist, and in the ``.dist-info`` + directory of a built wheel) if they match one of a number of common license + file name patterns (``LICEN[CS]E*``, ``COPYING*``, ``NOTICE*`` and + ``AUTHORS*``). Alternatively, a package author can specify a list of license + file paths to include in the built wheel under the ``license_files`` key in + the ``[metadata]`` section of the project's ``setup.cfg``, or as an argument + to the ``setuptools.setup()`` function.
At present, following the Wheel + project's lead, Setuptools flattens the collected license files into the + metadata directory, clobbering files with the same name, and dumps license + files directly into the top-level ``.dist-info`` directory, but there is a `desire to resolve both these issues <#setuptoolsfiles_>`_, - contingent on the this PEP being accepted. + contingent on this PEP being accepted. - Both tools also support an older, singular ``license_file`` parameter that allows specifying only one license file to add to the distribution, which @@ -2282,11 +2282,11 @@ Python source code files Beside using comments and/or ``SPDX-License-Identifier`` conventions, the license is `sometimes <#pycode_>`_ documented in Python code files using -a "dunder" variable, typically named ``__license__``. +a "dunder" module-level constant, typically named ``__license__``. -This convention (dunder global variables) is recognized by the built-in ``help()`` -function and the standard ``pydoc`` module. The dunder variable(s) will show up in -the ``help()`` DATA section for a module. +This convention, while perhaps somewhat antiquated, is recognized by the +built-in ``help()`` function and the standard ``pydoc`` module. +The dunder variable will show up in the ``help()`` DATA section for a module. Other Python packaging tools @@ -2296,8 +2296,8 @@ Other Python packaging tools ``license_file`` fields, and automatically include license files following similar naming patterns as Wheel and Setuptools. -- `Flit <#flit_>`_ recommends using classifiers instead of the ``License`` field - (per the current PyPA packaging guide). +- `Flit <#flit_>`_ recommends using classifiers instead of the ``License`` + field (per the current PyPA packaging guide). - `PBR <#pbr_>`_ uses similar data as Setuptools, but always stored in ``setup.cfg``. @@ -2315,43 +2315,42 @@ Here is a survey of how things are done elsewhere. 
Linux distribution packages --------------------------- -**Note:** in most cases the license texts of the most common licenses are included -globally once in a shared documentation directory (e.g. ``/usr/share/doc``). +**Note:** in most cases, the texts of the most common licenses are included +globally in a shared documentation directory (e.g. ``/usr/share/doc``). - Debian documents package licenses with `machine readable copyright files <#dep5_>`_. - This specification defines its own license expression syntax that is - very similar to the SDPX syntax and use its own list of license identifiers - for common licenses (also closely related to SPDX identifiers). + It defines its own license expression syntax and list of identifiers for + common licenses, both of which are closely related to those of SPDX. - `Fedora packages <#fedora_>`_ specify how to include - `License Texts <#fedoratext_>`_ and how use a + `License Texts <#fedoratext_>`_ and use a `License field <#fedoralicense_>`_ that must be filled - with an appropriate license Short License identifier(s) from an extensive list - of `"Good License" identifiers <#fedoralist_>`_. Fedora also defines its own - license expression syntax very similar to the SDPX syntax. + with appropriate short license identifier(s) from an extensive list + of `"Good Licenses" <#fedoralist_>`_. Fedora also defines its own + license expression syntax, similar to that of SPDX. - `OpenSUSE packages <#opensuse_>`_ use SPDX license expressions with - SPDX license identifiers and a - `list of extra license identifiers <#opensuselist_>`_. + SPDX license IDs and a + `list of additional license identifiers <#opensuselist_>`_. -- `Gentoo ebuild <#pycode_>`_ uses a ``LICENSE`` variable. This field is - specified in `GLEP-0023 <#glep23_>`_ and in the +- `Gentoo ebuild <#pycode_>`_ uses a ``LICENSE`` variable. + This field is specified in `GLEP-0023 <#glep23_>`_ and in the `Gentoo development manual <#gentoodev_>`_. 
- Gentoo also defines a license expression syntax and a list of allowed - licenses. The expression syntax is rather different from SPDX. + Gentoo also defines a list of allowed licenses and a license expression + syntax, which is rather different from SPDX. - The `FreeBSD package Makefile <#freebsd_>`_ provides ``LICENSE`` and ``LICENSE_FILE`` fields with a list of custom license symbols. For - non-standard licenses, FreeBSD recommend to use ``LICENSE=UNKNOWN`` and add - ``LICENSE_NAME`` and ``LICENSE_TEXT`` fields, as well as sophisticated + non-standard licenses, FreeBSD recommends using ``LICENSE=UNKNOWN`` and + adding ``LICENSE_NAME`` and ``LICENSE_TEXT`` fields, as well as sophisticated ``LICENSE_PERMS`` to qualify the license permissions and ``LICENSE_GROUPS`` - to document a license grouping. The ``LICENSE_COMB`` allows to document more + to document a license grouping. The ``LICENSE_COMB`` allows documenting more than one license and how they apply together, forming a custom license expression syntax. FreeBSD also recommends the use of ``SPDX-License-Identifier`` in source code files. -- `Arch Linux PKGBUILD <#archinux_>`_ define its +- `Arch Linux PKGBUILD <#archinux_>`_ defines its `own license identifiers <#archlinuxlist_>`_. The value ``'unknown'`` can be used if the license is not defined. @@ -2359,12 +2358,12 @@ globally once in a shared documentation directory (e.g. ``/usr/share/doc``). ``PKG_LICENSE_FILES`` variables and recommend the use of SPDX License identifiers. -- `NixOS uses SPDX identifiers <#nixos_>`_ and some extra license identifiers +- `NixOS uses SPDX identifiers <#nixos_>`_ and some extra license IDs in its license field. - GNU Guix (based on NixOS) has a single License field, uses its own - `license symbols list <#guix_>`_ and specifies to use one license or a - `list of licenses <#guixlicense_>`_. + `license symbols list <#guix_>`_ and specifies how to use one license or a + `list of them <#guixlicense_>`_. 
- `Alpine Linux packages <#alpine_>`_ recommend using SPDX identifiers in the license field. @@ -2374,77 +2373,77 @@ Language and application packages --------------------------------- - In Java, `Maven POM <#maven_>`_ defines a ``licenses`` XML tag with a list - of license items each with a name, URL, comments and "distribution" type. - This is not mandatory and the content of each field is not specified. + of licenses, each with a name, URL, comments and "distribution" type. + This is not mandatory, and the content of each field is not specified. -- `JavaScript NPM package.json <#npm_>`_ use a single license field with SPDX - license expression or the ``UNLICENSED`` id if no license is specified. - A license file can be referenced as an alternative using "SEE LICENSE IN - " in the single ``license`` field. +- The `JavaScript NPM package.json <#npm_>`_ uses a single license field with + a SPDX license expression, or the ``UNLICENSED`` ID if none is specified. + A license file can be referenced as an alternative using + ``SEE LICENSE IN `` in the single ``license`` field. -- `Rubygems gemspec <#gem_>`_ specifies either a singular license string or - a list of license strings. The relationship between multiple licenses in a +- `Rubygems gemspec <#gem_>`_ specifies either a single or list of license + strings. The relationship between multiple licenses in a list is not specified. They recommend using SPDX license identifiers. -- `CPAN Perl modules <#perl_>`_ use a single license field which is either a - single string or a list of strings. The relationship between the licenses in +- `CPAN Perl modules <#perl_>`_ use a single license field, which is either a + single or a list of strings. The relationship between the licenses in a list is not specified. There is a list of custom license identifiers plus these generic identifiers: ``open_source``, ``restricted``, ``unrestricted``, ``unknown``. 
- `Rust Cargo <#cargo_>`_ specifies the use of an SPDX license expression (v2.1) in the ``license`` field. It also supports an alternative expression - syntax using slash-separated SPDX license identifiers. There is also a + syntax using slash-separated SPDX license identifiers, and there is also a ``license_file`` field. The `crates.io package registry <#cratesio_>`_ requires that either ``license`` or ``license_file`` fields are set when - you upload a package. + uploading a package. - `PHP Composer composer.json <#composer_>`_ uses a ``license`` field with - an SPDX license id or "proprietary". The ``license`` field is either a - single string that can use something which resembles the SPDX license - expression syntax with "and" and "or" keywords; or is a list of strings - if there is a choice of licenses (aka. a "disjunctive" choice of license). + an SPDX license ID or ``proprietary``. The ``license`` field is either a + single string resembling the SPDX license expression syntax with + ``and`` and ``or`` keywords; or is a list of strings if there is a + (disjunctive) choice of licenses. -- `NuGet packages <#nuget_>`_ were using only a simple license URL and are now - specifying to use an SPDX License expression and/or the path to a license +- `NuGet packages <#nuget_>`_ previously used only a simple license URL, but + now specify using an SPDX license expression and/or the path to a license file within the package. The NuGet.org repository states that they only - accepts license expressions that are "approved by the Open Source Initiative + accept license expressions that are "approved by the Open Source Initiative or the Free Software Foundation." - Go language modules ``go.mod`` have no provision for any metadata beyond dependencies. Licensing information is left for code authors and other community package managers to document.
-- `Dart/Flutter spec <#flutter_>`_ recommends to use a single ``LICENSE`` file - that should contain all the license texts each separated by a line with 80 - hyphens. +- The `Dart/Flutter spec <#flutter_>`_ recommends using a single ``LICENSE`` + file that should contain all the license texts, each separated by a line + with 80 hyphens. -- `JavaScript Bower <#bower_>`_ ``license`` field is either a single string - or a list of strings using either SPDX license identifiers, or a path or - a URL to a license file. +- The `JavaScript Bower <#bower_>`_ ``license`` field is either a single string + or a list of strings using either SPDX license identifiers, or a path/URL + to a license file. -- `Cocoapods podspec <#cocoapod_>`_ ``license`` field is either a single - string or a mapping with attributes of type, file and text keys. - This is mandatory unless there is a ``LICENSE`` or ``LICENCE`` file provided. +- The `Cocoapods podspec <#cocoapod_>`_ ``license`` field is either a single + string, or a mapping with ``type``, ``file`` and ``text`` keys. + This is mandatory unless there is a ``LICENSE``/``LICENCE`` file provided. - `Haskell Cabal <#cabal_>`_ accepts an SPDX license expression since version 2.2. The version of the SPDX license list used is a function of - the ``cabal`` version. The specification also provides a mapping between - pre-SPDX Legacy license Identifiers and SPDX identifiers. - Cabal also specifies a ``license-file(s)`` field that lists license files - that will be installed with the package. + the Cabal version. The specification also provides a mapping between + legacy (pre-SPDX) and SPDX license identifiers. Cabal also specifies a + ``license-file(s)`` field that lists license files to be installed with + the package. - `Erlang/Elixir mix/hex package <#mix_>`_ specifies a ``licenses`` field as a - required list of license strings and recommends to use SPDX license + required list of license strings, and recommends using SPDX license identifiers.
-- `D lang dub package <#dub_>`_ defines its own list of license identifiers and - its own license expression syntax and both are similar to the SPDX conventions. +- `D Language dub packages <#dub_>`_ define their own list of license + identifiers and license expression syntax, similar to the SPDX standard. -- `R Package DESCRIPTION <#cran_>`_ defines its own sophisticated license - expression syntax and list of licenses identifiers. R has a unique way to - support specifiers for license versions such as ``LGPL (>= 2.0, < 3)`` in its - license expression syntax. +- The `R Package DESCRIPTION <#cran_>`_ defines its own sophisticated license + expression syntax and list of license identifiers. R has a unique way of + supporting specifiers for license versions (such as ``LGPL (>= 2.0, < 3)``) + in its license expression syntax. Other ecosystems @@ -2464,7 +2463,7 @@ Other ecosystems and parts of the FSFE REUSE conventions to document its licenses. - `U-Boot <#uboot_>`_ spearheaded using ``SPDX-License-Identifier`` in code - and now follows the Linux ways. + and now follows the Linux approach. - The Apache Software Foundation projects use `RDF DOAP <#apache_>`_ with a single license field pointing to SPDX license identifiers. @@ -2475,10 +2474,10 @@ Other ecosystems - The `ClearlyDefined project <#clearlydefined_>`_ promotes using SPDX license identifiers and expressions to improve license clarity. -- The `Android Open Source Project <#android_>`_ use ``MODULE_LICENSE_XXX`` - empty tag files, where ``XXX`` is a license code such as BSD, APACHE, GPL, - etc. And side by side with this ``MODULE_LICENSE`` file there is a - ``NOTICE`` file that contains license and notices texts. +- The `Android Open Source Project <#android_>`_ uses ``MODULE_LICENSE_XXX`` + empty tag files, where ``XXX`` is a license code such as ``BSD``, ``APACHE``, + ``GPL``, etc. It also uses a ``NOTICE`` file that contains license and + notice texts. References @@ -2585,13 +2584,6 @@ References ..
_#wheelspec: https://packaging.python.org/specifications/binary-distribution-format/ -Copyright -========= - -This document is placed in the public domain or under the -`CC0-1.0-Universal license <#cc0_>`_, whichever is more permissive. - - Acknowledgments =============== @@ -2605,6 +2597,13 @@ Acknowledgments - Luis Villa +Copyright +========= + +This document is placed in the public domain or under the +`CC0-1.0-Universal license <#cc0_>`_, whichever is more permissive. + + .. Local Variables: From c617f2ef89c4efb370cec5e6c0b189034ed44ec0 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Mon, 29 Nov 2021 19:07:50 -0600 Subject: [PATCH 18/19] PEP 639: Address reviewer and community feedback --- pep-0639.rst | 147 +++++++++++++++++++++++++++------------------------ 1 file changed, 79 insertions(+), 68 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index ef3385f41f6..ac48c77dac9 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -28,10 +28,10 @@ tools to programatically process. The PEP also: -- `Formally specifies `_ a new ``License-File`` field, +- `Formally specifies `_ a new ``License-File`` field, and defines how license files should be `included in distributions `_, - as already used by Wheel and Setuptools. + as already used by the Wheel and Setuptools projects. - `Deprecates `_ the legacy ``License`` field and ``license ::`` classifiers. @@ -52,7 +52,7 @@ The PEP also: `other ecosystems `_. The changes in this PEP will update the -`core metadata <#coremetadataspec_>`_ to version 2.3, modify the +`core metadata <#coremetadataspec>`_ to version 2.3, modify the `PEP 621 project metadata specification <#pep621spec_>`_, and make minor additions to the `source distribution (sdist) <#sdistspec_>`_, `built distribution (wheel) <#wheelspec_>`_ and @@ -73,7 +73,7 @@ designed to minimize impact and maximize backward compatibility. This specification builds off of existing ways to document licenses that are already in use in popular tools (e.g. 
adding support to core metadata for the ``License-File`` field `already used `_ in -Wheel and Setuptools) and by some package authors (e.g. storing an +the Wheel and Setuptools projects) and by some package authors (e.g. storing an SPDX license expression in the existing ``License`` field). In addition to these proposed changes, this PEP contains guidance for tools @@ -109,7 +109,7 @@ This PEP also is not about license documentation in files inside projects, though this is a `surveyed topic `_ in the appendix, and nor does it intend to cover cases where the source and binary distribution packages don't have -`the same licenses `_ +`the same licenses `_. Possible future PEPs @@ -144,7 +144,7 @@ This has triggered a number of license-related discussions and issues, including on `outdated and ambiguous PyPI classifiers <#classifierissue_>`_, `license interoperability with other ecosystems <#interopissue_>`_, `too many confusing license metadata options <#packagingissue_>`_, -`limited Wheel support for license files <#wheelfiles_>`_, and +`limited support for license files in the Wheel project <#wheelfiles_>`_, and `the lack of clear, precise and standardized license metadata <#pepissue_>`_. On average, Python packages tend to have more ambiguous and missing license @@ -174,7 +174,7 @@ There are a few takeaways from the survey: optionally combine more than one license identifier together. SPDX and SPDX-like syntaxes are the most popular in use. -- SPDX license identifiers are becoming the de-facto way to reference common +- SPDX license identifiers are becoming the de facto way to reference common licenses everywhere, whether or not a full license expression syntax is used. - Several package formats support documenting both a license expression and the @@ -241,7 +241,7 @@ This PEP seeks to clearly define the terms it uses, specifically those that: - Are new concepts introduced here (e.g. license expression/identifier). 
-Whenever available, definitions are excepted from the +Whenever available, definitions are excerpted from the `PyPA PyPUG Glossary <#pypugglossary_>`_ and `SPDX <#spdx_>`_. Terms are listed here in their full versions; related words (``Rel:``) are in parenthesis, including short forms (``Short:``), sub-terms (``Sub:``) and common synonyms @@ -285,12 +285,12 @@ for the purposes of this PEP (``Syn:``). **License Classifier** A `PyPI Trove classifier <#classifiers_>`_ (as originally defined in PEP 301) which begins with ``License ::``, currently used to indicate a project's - license status by including it as a ``Classifer`` in the core metadata. + license status by including it as a ``Classifier`` in the core metadata. **License Expression** *(Syn: SPDX Expression)* A string with valid `SPDX license expression syntax <#spdxpression_>`_ including any SPDX license identifiers as defined here, which describes - a project's license(s) and how they related to one another. Examples: + a project's license(s) and how they relate to one another. Examples: ``GPL-3.0-or-later``, ``MIT AND (Apache-2.0 OR BSD-2-clause)`` **License Identifier** *(Syn: License ID/SPDX Identifier)* @@ -318,13 +318,22 @@ for the purposes of this PEP (``Syn:``). in the ``[metadata]`` table of ``setup.cfg``, or the equivalent for other build tools. - **PEP 621 metadata** refers specifically to the former, as defined by the + The **PEP 621 metadata** refers specifically to the former, as defined by the `PyPA Declaring Project Metadata specification <#pep621spec_>`_. A **PEP 621 metadata key**, or an unqualified *key* refers specifically to a top-level ``[project]`` key (notably, *not* a core metadata *field*), while a **subkey** refers to a second-level key in a table-valued PEP 621 key. 
+**Root License Directory** *(Short: License Directory)* + The directory under which license files are stored in a project/distribution + and the root directory that their paths, as recorded under the + ``License-File`` core metadata fields, are relative to. + Defined here to be the project root directory for source trees and source + distributions, and a subdirectory named ``license_files`` of the directory + containing the core metadata (i.e., the ``.dist-info/license_files`` + directory) for built distributions and installed projects. + **Source Distribution** *(Short: sdist)* Here, specifically refers to a source distribution (**sdist**) as `specified by PyPA <#sdistspec_>`_. @@ -374,6 +383,10 @@ Finally, `guidance is established `_ for tools handling and converting legacy license metadata to license expressions, to ensure the results are consistent, correct and unambiguous. +Note that the guidance on errors and warnings is for tools' default behavior; +they MAY operate more strictly if users explicitly configure them to do so, +such as by a CLI flag or a configuration option. + Core metadata ------------- @@ -390,7 +403,7 @@ and `deprecates `_ the license classifiers in the ``Classifier`` field. The error and warning guidance in this section applies to build and -publishing tools; user-facing install tools MAY be more lenient than +publishing tools; end-user-facing install tools MAY be more lenient than mentioned here when encountering malformed metadata that does not conform to this specification. @@ -468,12 +481,6 @@ metadata, that file MUST be included in the distribution at the specified path relative to the root license directory, and MUST be installed with the distribution at that same relative path. -The **root license directory** is defined to be the project root directory -for source trees and source distributions, and the ``license_files`` -subdirectory of the directory containing the core metadata (i.e. 
the -``.dist-info`` directory containing the ``METADATA`` file) for built -distributions and installed projects. - The specified relative path MUST be consistent between project source trees, source distributions (sdists), built distributions (wheels) and installed projects. Therefore, inside the root license directory, packaging tools @@ -498,11 +505,9 @@ Deprecate ``License`` field ''''''''''''''''''''''''''' The legacy unstructured-text ``License`` field is deprecated and replaced by -the new ``License-Expression`` field. - -Build and publishing tools MUST raise an error if both fields are present and -their values are not identical, including capitalization and excluding -leading and trailing whitespace. +the new ``License-Expression`` field. Build and publishing tools MUST raise +an error if both these fields are present and their values are not identical, +including capitalization and excluding leading and trailing whitespace. If only the ``License`` field is present, such tools SHOULD issue a warning informing users it is deprecated and recommending ``License-Expression`` @@ -600,8 +605,8 @@ which MUST be parsable by the ``glob`` `module <#globmodule_>`_ in the Python standard library. **Note**: To avoid ambiguity, confusion and (per PEP 20, the Zen of Python) -"more than one (obvious) way to do it", a flat array of strings value for the -``license-files`` key has been +"more than one (obvious) way to do it", allowing a flat array of strings +as the value for the ``license-files`` key has been `left out for now `_. Path separators, if used, MUST be the forward slash character (``/``), @@ -699,9 +704,9 @@ License files in project formats A few minor additions will be made to the relevant existing specifications to document, standardize and clarify what is already currently supported, -allowed and implemented behavior, as well as explicitly mention the directory -location the license file tree is rooted in for each format, per the -`specification above `_. 
+allowed and implemented behavior, as well as explicitly mention the root +license directory the license files are located in and relative to for +each format, per the `specification above `_. **Project source trees** As `described above `_, the @@ -740,16 +745,19 @@ location the license file tree is rooted in for each format, per the Converting legacy metadata -------------------------- -If the contents of the ``License`` field are a valid license expression -containing solely known, non-deprecated license identifiers, build tools -MAY use it to fill the ``License-Expression`` field. +If the contents of the ``license.text`` PEP 621 source metadata key +(or equivalent for tool-specific config formats) is a valid license expression +containing solely known, non-deprecated license identifiers, and, if +PEP 621 metadata are defined, the ``license-expression`` key is listed as +``dynamic``, build tools MAY use it to fill the ``License-Expression`` field. -Similarly, if the ``Classifier`` field contains exactly one license classifier -that unambiguously maps to exactly one valid, non-deprecated SPDX identifier, -tools MAY fill the ``License-Expression`` field with the latter. +Similarly, if the ``classifiers`` PEP 621 source metadata key (or equivalent +for tool-specific config formats) contains exactly one license classifier +that unambiguously maps to exactly one valid, non-deprecated SPDX license +identifier, tools MAY fill the ``License-Expression`` field with the latter. 
-If both a non-empty ``License`` field and a single license classifier are -present, the contents of the ``License`` field, including capitalization +If both a ``license.text`` or equivalent value and a single license classifier +are present, the contents of the former, including capitalization (but excluding leading and trailing whitespace), MUST exactly match the SPDX license identifier mapped to the license classifier to be considered unambiguous for the purposes of automatically filling the @@ -758,13 +766,14 @@ unambiguous for the purposes of automatically filling the If tools have filled the ``License-Expression`` field as described here, they MUST output a prominent, user-visible warning informing package authors of that fact, including the ``License-Expression`` string they have output, -and recommending that the source metadata be updated accordingly -with the indicated ``License-Expression``. +and recommending that the project source metadata be updated accordingly +with the indicated license expression. -In any other case, tools MUST NOT use the contents of the ``License`` field -or license classifiers to fill the ``License-Expression`` field without -informing the user and requiring unambiguous, affirmative user action to -select and confirm the desired ``License-Expression`` value before proceeding. +In any other case, tools MUST NOT use the contents of the ``license.text`` +key (or equivalent) or license classifiers to fill the +``License-Expression`` field without informing the user and requiring +unambiguous, affirmative user action to select and confirm the desired +``License-Expression`` value before proceeding. Mapping license classifiers to SPDX identifiers @@ -809,7 +818,7 @@ they intended for their project. 
**Note**: Several additional classifiers, namely the "or later" variants of the AGPLv3, GPLv2, GPLv3 and LGPLv3, are also listed in the aforementioned mapping, but as they were merely proposed for textual harmonization and -still unambiguously map to their respective respective licenses, +still unambiguously map to their respective licenses, they were not included here; LGPLv2 is, however, as it could ambiguously refer to either the distinct v2.0 or v2.1 variants of that license. @@ -857,7 +866,7 @@ When multiple license classifiers are used, their relationship is ambiguous, and it is typically not possible to determine if all the licenses apply or if there is a choice that is possible among the licenses. In this case, tools MUST NOT automatically infer a license expression, and SHOULD suggest that the -package author construct a one which expresses their intent. +package author construct one which expresses their intent. User Scenarios @@ -922,7 +931,7 @@ I maintain an existing package that's already licensed ------------------------------------------------------ If you already have license files and metadata in your project, you -should only need to make a couple tweaks to take advantage of the new +should only need to make a couple of tweaks to take advantage of the new functionality. In your project config file, enter your license expression under @@ -1015,8 +1024,8 @@ be left to a future PEP and a new version of the core metadata specification. Formally specifying the new ``License-File`` core metadata field and the inclusion of the listed files in the distribution merely codifies and -refines the existing practices in popular packaging tools, including -Wheel and Setuptools, and is designed to be largely backwards-compatible +refines the existing practices in popular packaging tools, including the Wheel +and Setuptools projects, and is designed to be largely backwards-compatible with their existing use of that field. 
Likewise, the new ``license-files`` PEP 621 source metadata key standardizes statically specifying the files to include, as well as the default behavior, and allows other tools to @@ -1041,12 +1050,12 @@ forward compatibility with additional standard or installer-specified files and directories added to ``.dist-info``, as they too could conflict with the names of existing licenses. -While minor additions will be made to the source distribution (sdist) +While minor additions will be made to the source distribution (sdist), built distribution (wheel) and installed project specifications, all of these are merely documenting, clarifying and formally specifying behaviors explicitly allowed under their current respective specifications, and already implemented in practice, and gating them behind the explicit presence of both the new -metadata versions and the new fields. In particular, sdsts may contain +metadata versions and the new fields. In particular, sdists may contain arbitrary files following the project source tree layout, and formally mentioning that these must include the license files listed in the metadata merely documents and codifies existing Setuptools practice. Likewise, arbitrary @@ -1089,7 +1098,7 @@ this field, and catch them if they make a mistake. For authors still using the now-deprecated, less precise and more redundant ``License`` field or license classifiers, packaging tools will warn them and inform them of the modern replacement, ``License-Expression``. -Finally, for users who may have forgot or not be aware they need to do so, +Finally, for users who may have forgotten or not be aware they need to do so, publishing tools will gently guide them toward including ``license-expression`` and ``license-files`` in their project source metadata. @@ -1240,7 +1249,7 @@ to improve the currently complex and fragmented story around license documentation. 
Not doing so would leave three different un-deprecated ways of specifying a license for a package, two of them ambiguous, less than clear/obvious how to use, inconsistently documented and out of date. -This is more complex for for all tools in the ecosystem to support +This is more complex for all tools in the ecosystem to support indefinitely (rather than simply installers supporting older packages implementing previous frozen metadata versions), resulting in a non-trivial and unbounded maintenance cost. @@ -1296,7 +1305,7 @@ for PyPI (or other package indicies) as to whether and how they should validate the ``License-Expression`` or ``License-File`` fields, nor how they should handle using them in combination with the deprecated ``License`` field or license classifiers. This simplifies the specification -and either defers implementation on PyPI to a later PEP, and gives +and either defers implementation on PyPI to a later PEP, or gives discretion to PyPI to enforce the stated invariants, to minimize disruption to package authors. @@ -1310,13 +1319,13 @@ or higher and have the ``License-Expression`` field will have a valid expression, such that PyPI and consumers of its packages and metadata can rely upon to follow the specification here. -The same can be extended to the ``License-File`` field, as also specified -here, to ensure that it is valid and the legally required license files +The same can be extended to the new ``License-File`` field as well, +to ensure that it is valid and the legally required license files are present, and thus it is lawful for PyPI, users and downstream consumers -to distribute the package (of course, this makes no *guarantee* of such +to distribute the package. (Of course, this makes no *guarantee* of such as it is ultimately reliant on authors to declare them, but it improves assurance of this and allows doing so in the future if the community so -decides). To be clear, this would not require that any uploaded distribution +decides.) 
To be clear, this would not require that any uploaded distribution have such metadata, only that if they choose to declare it per the new specification in this PEP, it is assured to be valid. @@ -1477,7 +1486,8 @@ explicitly set to dynamic in order for the ``License`` core metadata field to be automatically back-filled from the value of the ``license-expression`` key. This would be more explicit that the filling will be done, as strictly speaking the ``license`` key is not (and cannot be) specified in -``pyproject.toml``. +``pyproject.toml``, and satisfies a stricter interpretation of the letter +of the current PEP 621 specification that this PEP revises. However, this isn't seen to be necessary, because it is simply using the static, verbatim literal value of the ``license-expression`` key, as specified @@ -1725,9 +1735,10 @@ Flatten license files in subdirectories ''''''''''''''''''''''''''''''''''''''' Previous drafts of this PEP were silent on the issue of handling license files -in subdirectories. Currently, `Wheel <#wheelfiles_>`_ and (following its -example) `Setuptools <#setuptoolsfiles_>`_ flattens all license files into the -``.dist-info`` directory without preserving the source subdirectory hierarchy. +in subdirectories. Currently, the `Wheel <#wheelfiles_>`_ and (following its +example) `Setuptools <#setuptoolsfiles_>`_ projects flatten all license files +into the ``.dist-info`` directory without preserving the source subdirectory +hierarchy. While this is the simplest approach and matches existing ad hoc practice, this can result in name conflicts and license files clobbering others, @@ -1848,10 +1859,10 @@ In addition, this approach is not fully backwards compatible (since it isn't transparent to tools that simply extract the wheel), is a greater departure from existing practice and would lead to more inconsistent license install locations from wheels of different versions. 
Finally, -this would mean licenses were not installed as proximately to their +this would mean licenses would not be installed as proximately to their associated code, there would be more variability in the license root path across platforms and between built distributions and installed projects, -accessing installed licenses programmatically would be more non-trivial, and a +accessing installed licenses programmatically would be more difficult, and a suitable install location and method would need to be created, discussed and decided that would avoid name clashes. @@ -1909,8 +1920,8 @@ string. Therefore, this approach is rejected. Map identifiers to source files ''''''''''''''''''''''''''''''' -As discussed previously, file-level notices out of scope for this PEP, and the -existing ``SPDX-License-Identifier`` `convention <#spdxid_>`_ can +As discussed previously, file-level notices are out of scope for this PEP, +and the existing ``SPDX-License-Identifier`` `convention <#spdxid_>`_ can already be used if this is needed without further specification here. @@ -2233,7 +2244,7 @@ and this is another possible source of confusion: - In the `Setuptools <#setuptoolssdist_>`_ and `Wheel <#wheels_>`_ projects, license files are automatically added to the distribution (at their source - location in in a source distribution/sdist, and in the ``.dist-info`` + location in a source distribution/sdist, and in the ``.dist-info`` directory of a built wheel) if they match one of a number of common license file name patterns (``LICEN[CS]E*``, ``COPYING*``, ``NOTICE*`` and ``AUTHORS*``). Alternatively, a package author can specify a list of license @@ -2294,7 +2305,7 @@ Other Python packaging tools - `Conda package manifests <#conda_>`_ have support for ``license`` and ``license_file`` fields, and automatically include license files - following similar naming patterns as Wheel and Setuptools. + following similar naming patterns as the Wheel and Setuptools projects. 
- `Flit <#flit_>`_ recommends using classifiers instead of the ``License`` field (per the current PyPA packaging guide). @@ -2398,7 +2409,7 @@ Language and application packages requires that either ``license`` or ``license_file`` fields are set when uploading a package. -- `PHP Composer composer.json <#composer_>`_ uses a ``license`` field with +- `PHP composer.json <#composer_>`_ uses a ``license`` field with an SPDX license ID or ``proprietary``. The ``license`` field is either a single string with resembling the SPDX license expression syntax with ``and`` and ``or`` keywords; or is a list of strings if there is a @@ -2418,7 +2429,7 @@ Language and application packages file that should contain all the license texts, each separated by a line with 80 hyphens. -- The `JavaScript Bower <#bower_>`_ ``license`` field is either a single or +- The `JavaScript Bower <#bower_>`_ ``license`` field is either a single string or list of strings using either SPDX license identifiers, or a path/URL to a license file. From b02b234ac972e2ec4790f6c493428c472bdf50c8 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Fri, 17 Dec 2021 15:41:58 -0600 Subject: [PATCH 19/19] PEP 639: Add custom IDs issue & clarify rejected license key str value --- pep-0639.rst | 131 +++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 116 insertions(+), 15 deletions(-) diff --git a/pep-0639.rst b/pep-0639.rst index ac48c77dac9..b781a5cecf0 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -6,7 +6,7 @@ Author: Philippe Ombredanne , C.A.M. Gerlach Sponsor: Paul Moore PEP-Delegate: Paul Moore -Discussions-To: https://discuss.python.org/t/2154 +Discussions-To: https://discuss.python.org/t/12622 Status: Draft Type: Standards Track Content-Type: text/x-rst @@ -1398,25 +1398,58 @@ key, retaining the ``expression`` subkey in the ``license`` table, or allowing both. 
Indeed, this would seem to have been envisioned by PEP 621 itself with this PEP in mind, in particular the first approach:: - A practical string value for the license key has been purposefully left out - to allow for a future PEP to specify support for SPDX expressions. + A practical string value for the license key has been purposefully left + out to allow for a future PEP to specify support for SPDX expressions + (the same logic applies to any sort of "type" field specifying what + license the file or text represents). However, while a working draft temporarily explored this solution, it was ultimately rejected, as it shared most of the downsides identified with adding new subkeys under the existing ``license`` table, as well as several of its own, with again minimal advantage over separating both. -In particular, it means the top-level ``license`` key still maps to multiple +Most importantly, it still means that per PEP 621, it is not possible to +separately mark the ``[project]`` keys corresponding to the ``License`` and +``License-Expression`` metadata fields as dynamic. This, in turn, still +renders specifying metadata following that standard incompatible with +conversion of legacy metadata, as specified in this PEP's +`Converting legacy metadata`_ section, as PEP 621 strictly prohibits the +``license`` key from being both present (to define the existing value of +the ``License`` field, or the path to a license file, and thus able to be +converted), and specified as ``dynamic`` (which would allow tools to +use the generated value for the ``License-Expression`` field). + +For the same reasons, this would make it impossible to back-fill the +``License`` field from the ``License-Expression`` field as this PEP +currently allows (without making an exception from strict +``dynamic`` behavior in this case), as again, marking ``license`` as dynamic +would mean it cannot be specified in the ``project`` table at all.
+ +Furthermore, this would mean existing project source metadata specifying +``license`` as ``dynamic`` would be ambiguous, as it would be impossible for +tools to statically determine if they are intended to conform to previous +metadata versions specifying ``License``, or this version specifying +``License-Expression``. Tools would have no way of determining which field, +if either, might be filled in the resulting distribution's core metadata. +By contrast, the present approach makes clear what the author intended, +allows tools to unambiguously determine which field(s) may be dynamically +inserted, and ensures backward compatibility such that current project +source metadata do not unknowingly specify both the old and the new field +as dynamic, and instead must do so explicitly per PEP 621's intent. + +Additionally, while differences from existing tool formats (and core metadata +field names) have precedent in PEP 621 (though they are best avoided if practical), +using a key with an identical name as in all current tools (and of an existing +core metadata field) to mean something different (and map to a different +core metadata field), with distinct and incompatible syntax and semantics, +does not, and is likely to create substantial confusion and ambiguity +for readers and authors, contrary to the fundamental goals of this PEP. + +Finally, this means that the top-level ``license`` key still maps to multiple core metadata fields with different purposes and interpretation (``License`` -and ``License-Expression``), one deprecated and one new, and still prevents -them from being separately marked as dynamic, and conflates the same with -an existing mark. This further exhibits the same divergence from both -PEP 621, core metadata, tool file formats and the consensus in the discussion -in not making the new license expression map to a corresponding new field, -none of which was the case at the time PEP 621 was drafted.
-Finally, this would deny a clear separation from the old behavior by not -cleanly deprecating the entire ``license`` key, and increases the complexity -of the specification and implementation. +and ``License-Expression``), which would deny a clear separation from the +old behavior by not cleanly deprecating the ``license`` key, and +increase the complexity of the specification and implementation. In addition to the aforementioned issues, this also requires deciding between the three individual approaches (``expression`` subkey, top-level string or @@ -1441,8 +1474,7 @@ while adding even more spec and tool complexity and making there more than Therefore, a separate top-level ``license-expression`` key was adopted to avoid all these issues, with relatively minimal downside aside from adding a single -additional top-level key and (versus some approaches) a few extra characters -to type. +additional key and (versus some approaches) a few extra characters to type. Add a ``type`` key to treat as expression @@ -1999,6 +2031,75 @@ be reversed once a breaking revision of the metadata spec is issued)? Or should this not be explicitly allowed, discouraged or even prohibited? +Should custom license identifiers be allowed? +--------------------------------------------- + +The current version of this PEP retains the behavior of only specifying +the use of SPDX-defined license identifiers, as well as the explicitly defined +custom identifiers ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` +to handle the two common cases where projects have a license, but it is not +one that has a recognized SPDX license identifier.
+ +For maximum flexibility, custom ``LicenseRef-`` license +identifiers could be allowed, which could potentially be useful for niche +cases or corporate environments where ``LicenseRef-Proprietary`` is not +appropriate or insufficiently specific, but relying on mainstream Python +build tooling and the ``License-Expression`` metadata field is still +desirable to use for this purpose. + +This has the downsides, however, of not catching misspellings of the +canonically defined license identifiers and thus producing license metadata +that is not a valid match for what the author intended, as well as users +potentially thinking they have to prepend ``LicenseRef`` in front of valid +license identifiers, about which there seems to be some previous confusion. +Furthermore, this encourages the proliferation of bespoke license identifiers, +which defeats the purpose of enabling clear, unambiguous and well +understood license metadata for which this PEP was created. + +Indeed, for niche cases that need specific, proprietary custom licenses, +they could always simply specify ``LicenseRef-Proprietary``, and then +include the actual license files needed to unambiguously identify the license +regardless (if not using SPDX license identifiers) under the ``License-File`` +fields. Requiring standards-conforming tools to allow custom license +identifiers does not seem very useful, since standard tools will not recognize +bespoke ones or know how to treat them. By contrast, bespoke tools, which +would be required in any case to understand and act on custom identifiers, +are explicitly allowed, with good reason (thus the ``SHOULD`` keyword) +to not require that license identifiers conform to those listed here. +Therefore, this specification still allows such use in private corporate +environments or specific ecosystems, while avoiding the disadvantages of +imposing them on all mainstream packaging tools.
+ +As an alternative, a literal ``LicenseRef-Custom`` identifier could be +defined, which would more explicitly indicate that the license cannot be +expressed with defined identifiers and the license text should be referenced +for details, without carrying the negative and potentially inappropriate +implications of ``LicenseRef-Proprietary``. This would avoid the main +mentioned downsides (misspellings, confusion, license proliferation) of +the above approach of allowing an arbitrary ``LicenseRef``, while +addressing several of the potential theoretical scenarios cited for it. + +On the other hand, as SPDX aims to (and generally does) encompass all +FSF-recognized "Free" and OSI-approved "Open Source" licenses, +and those sources are kept closely in sync and are now relatively stable, +anything outside those bounds would generally be covered by +``LicenseRef-Proprietary``, thus making ``LicenseRef-Custom`` less specific +in that regard, and somewhat redundant to it. Furthermore, it may mislead +authors of projects with complex/multiple licenses into thinking that they should use it +over specifying a license expression. + +At present, the PEP retains the existing approach over either of these, given +the use cases and benefits were judged to be sufficiently marginal based +on the current understanding of the packaging landscape. For both these +proposals, however, if more concrete use cases emerge, this can certainly +be reconsidered, either for this current PEP or a future one (before or +in tandem with actually removing the legacy unstructured ``License`` +metadata field). Not defining this now enables allowing it later +(or still now, with custom packaging tools), without affecting backward +compatibility, while the same is not so if they are allowed now and later +determined to be unnecessary or too problematic in practice. + + Appendix 1. License Expression Examples =======================================