-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
PEP 639: Make requested additions and changes, add additional helpful info, reformat, update and copyedit #2164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This PEP `adds <Add license-expression key_>`_ the ``license-expression`` key, | ||
`adds <Add license-files key_>`_ the ``license-files`` key and | ||
`deprecates <Deprecate license key_>`_ the ``license`` key. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP 621 specifically left room in the license
key, so we don’t need to deprecate it. We only need to deprecate license = {text = ...}
and license = {file = ...}
, and define meaning to license = "MIT"
. No new [project]
keys are needed.
To allow multiple licenses, we add
license = { names = ... }
Also, I think we should always glob to avoid confusion. So we define
license = { files = ... }
where each string is a glob pattern, and if someone really really need stars in the path (extremely unlikely), they can escape them manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP 621 specifically left room in the license key, so we don’t need to deprecate it. We only need to deprecate license = {text = ...} and license = {file = ...}, and define meaning to license = "MIT". No new [project] keys are needed.
This was exactly the original approach I took here (see 4871854 ), later revised to split the license-files
out into a top-level key while keeping the License-Expression
field mapped to the license
key, and further revised to split them both (see 88cba0c ). In practice, when considered more carefully, the first, and to a somewhat lesser extent the second approach provided to be distinctly suboptimal for a number of reasons, and quite possibly unworkable (due to the interaction with the dynamic
key, depending on exactly how one interprets a rather imprecisely specified passage in PEP 621), while the only downside of the present approach was 1-2 extra top-level keys.
I spent a substantial amount of time carefully considering the alternatives here, and thoroughly address both of these in great detail in the Rejected Ideas
section; see the Add expression and files subkeys to table
and Define license expression as string value
items under the Source metadata license key
subsection. If you still have articulatable concerns with this approach, or can find ways around the various issues I've cited with the previous one, it might be better to discuss this on the Discourse thread, where we can all come to a clear consensus before I spend several more hours of my free time changing everything back (and us all having to bikeshed over exactly what to change it back to, heh).
To allow multiple licenses, we add
license = { names = ... }
Perhaps I'm grossly misunderstanding your suggestion, but this proposal appears to contradict the underlying goal and basic objective of this PEP—adding SPDX license expressions to Python package metadata—which has been true its conception, has not been (at least as far as I could find) brought up throughout the course of the years-long relevant discussions, and is intrinsically incompatible with the core metadata field License-Expression
that it is supposed to map to. Could you explain a little more about what you're suggesting here? Given it seems to be a fairly substantial shift in what this PEP has been proposing to do from the start, and which seemed to have a fairly strong consensus behind it, if you're really sure about it, you might want to bring it up on Discourse first.
Also, I think we should always glob to avoid confusion.
I think I'm the one that's a little confused, heh—could you explain a little more how users having to implicitly know that license.files
is a glob is less potentially confusing than them explicitly specifying license-files.glob
(or license-files.paths
)? Furthermore, per the spec, for the latter tools will unambiguously know if the user mistakenly specify a glob instead of a literal path under the paths
key, because it won't match an actual file.
In any case, like the previous, this was the approach I initially implemented here but after a substantial amount of careful thought and experimentation, this ran into significant conceptual and practical issues with, for a few extra characters, the more explicit alternative neatly avoids. I address this issue at length under the Only accept glob patterns
and Infer whether paths or globs
items in the Source metadata license-files key
subsection of the Rejected Ideas
section.
if someone really really need stars in the path (extremely unlikely), they can escape them manually.
Actually, this is not the main motivation for this choice, or even one of the primary drivers. The Only accept glob patterns
and Infer whether paths or globs
items in the Source metadata license-files key
subsection of the Rejected Ideas
section describes the rationale in detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I'm grossly misunderstanding your suggestion, but this proposal appears to contradict the underlying goal and basic objective of this PEP—adding SPDX license expressions to Python package metadata—which has been true its conception, has not been (at least as far as I could find) brought up throughout the course of the years-long relevant discussions, and is intrinsically incompatible with the core metadata field
License-Expression
that it is supposed to map to.
No, you’re right, I was misunderstanding how License-Expression
works (I thought it could be specified multiple times to automatically mean “pick one of these”).
I think I'm the one that's a little confused, heh—could you explain a little more how users having to implicitly know that
license.files
is a glob is less potentially confusing than them explicitly specifyinglicense-files.glob
(orlicense-files.paths
)?
It’s not, but there’s no harm for that :) It is extremely uncommon for license files to be put in paths that resemble a glob pattern, so the user does not actually care if this triggers globbing when they only want to specify one file. And if they want globbing, now they have only one key to remember instead of two.
Perhaps this would be a better approach?
- Add
license-files
to do glob (no nested keys). - Use
license = "<spdx-expression>"
and deprecate nested keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, you’re right,
No worries, sorry for the confusion. Seems like the mixup might have been over a license identifier vs. a license expression; I do define these in the Terminology section, but is there anything in the PEP that I should clarify here?
It is extremely uncommon for license files to be put in paths that resemble a glob pattern, so the user does not actually care if this triggers globbing when they only want to specify one file.
Indeed; however, as discussed this isn't the (or even a) primary reason for this; sorry for not linking it when I referred to it previously, but I discuss the rationale in detail in the Only accept glob patterns and Infer whether paths or globs items in the Rejected Ideas, and see next item for a summary.
Add license-files to do glob (no nested keys).
Actually, as mentioned, this was also my initial approach, as "flat is better than nested", but it ended up conflicting with several of the behaviors specified so that "errors should never pass silently", especially if "in the face of ambiguity we must resist the temptation to guess", and in the end, I decided "explicit is better than implicit" :)
Use
license = "<spdx-expression>"
and deprecate nested keys.
In fact, following that which you previously proposed, this was my second approach here too :) However, upon implementing it, it still ran into less serious versions of some of the same issues as the previous, including mapping two distinct metadata types to the same key, inability to specify which was marked as dynamic, and creating additional inconsistency between core metadata, PEP 621 and setup.cfg
/other tool formats. On top of that, it creates potential for serious user confusion and ambiguity, since it means license
maps to completely syntactically and semantically different fields depending on the config format, and is much less explicit to authors, readers and consumers that it is intended to be a SPDX identifier/expression. See Define license expression as string value in the rejected ideas for more details.
b68a1dd
to
6aab97e
Compare
we similarly reject this here. | ||
|
||
|
||
Must be marked dynamic to back-fill |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't PEP 621 mandate that the field be marked as dynamic to be backfilled?
Only when a field is marked as dynamic may a tool provide a "new" value.
- https://www.python.org/dev/peps/pep-0621/#specification
- https://www.python.org/dev/peps/pep-0621/#have-the-dynamic-field-only-require-specifying-missing-required-fields
Or am I misunderstanding what you're saying here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially, depending on how strictly it is interpreted, but I address this explicitly in some detail in this very section. Given it is simply using the verbatim value of another specific, mutually-exclusive key as defined by this specification, such that any tool reading the pyproject.toml
can "trivially, deterministically and unambiguously determine this using only the static data in the pyproject.toml
file itself," it meets the spirit and intent, if (arguably) not the exact letter of PEP 621.
While I'd normally argue for explicitness, marking the field as dynamic has serious practical danger, as it means that any tool (including those not yet unaware of this PEP) could fill it with anything it wanted, directly violating the specification in this PEP, which is why the specification here explicitly prohibits marking license
as dynamic
if license-expression
is also defined.
Also, as a side note, requiring it to be dynamic is unavoidably mutually exclusive with @uranusjr and @ofek 's proposal to move license-expression
back under the license
key, since it would be impossible to mark one one as dynamic without the other, and thus prohibited to define a license expression value there in the first place.
In any case, as this PEP updates the specification originally defined in PEP 621, it does not irrevocably constrain this PEP (in fact, for license-files
, I also had to make what was arguably another mild exception in order to maintain safe, sane and backward-compatible default behavior, as also discussed here), and this section of the canonical project metadata spec will be updated accordingly to clarify these edge cases.
That said, I'm not completely wed to the necessity of allowing back-filling in the first place, as I haven't yet seen a 100% compelling rationale either way other than general backward compat and people on Discourse asking for it.
fba4263
to
82a0695
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @CAM-Gerlach! This is quite a comprehensive rewrite. I hope with that we're able to get this PR to the finish line. Only my opinion but I would say it's fine if you add yourself as coauthor.
When used in the ``License-Expression`` field and as a specialization of | ||
the SPDX license expression definition, a license expression can use the | ||
following license identifiers: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from LicenseRef-Proprietary
there is no real way to specific a custom (proprietary) license type. That isn't an issue for most public packages, but since Python is also used in corporate environments with custom package indexes, I could imaging they might want some more flexibility. Especially as we deprecate the license
field.
Does it make sense to add something like LicenseRef-Custom:<some custom license name>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good point. SPDX already specifies the use of LicenseRef-<CUSTOM-LICENSE-NAME>
, which is what we are taking advantage of here, so if we did allow other custom identifiers, they would presumably want to follow that pattern rather than making up something bespoke. For niche cases that need very specific, proprietary custom licenses, making up an arbitrary bespoke license identifier is not terribly useful, I'd think, since people and tools won't know recognize it unless they are already aware of it anyway, and would need to reference the License-File
s anyway for the actual terms, which License Ref-Proprietary
makes explicit. If they do have their own custom private infra, that can still allow it, and they can also still use it in their individual files and in other standardized SPDX interchange formats. I'm just not sure it adds value allowing arbitrary license IDs for all Python packages given the room it creates for ambiguity and mistakes relative to the seemingly small benefit for such specific cases.
We could add an explicit LicenseRef-Custom
for theoretical cases where a license is FSF- or OSI-approved and thus not technically LicenseRef-Proprietary
but doesn't yet have a SPDX identifier, but as these sources are generally kept closely in sync (I've been a small part of that), this would only be true for a tiny number of edge cases packages, as far as I'm aware. There could also hypothetically be cases where the author has a strong good faith belief that their bespoke license is not proprietary, but it hasn't (yet) been reviewed by FSF or OSI...but not sure if that does more harm than good to allow specifying that, given the increased ambiguity. And, in any case, in terms of actively maintained modern packages it seems rather unlikely this is more than a handful of cases. As such, my inclination would be to stick with @pombredanne 's original language here (which this PR essentially leaves unchanged) unless there is a clear consensus otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't though of LicenseRef-<CUSTOM-LICENSE-NAME>
. You are right that LicenseRef-Proprietary
should cover it. I'm just wondering though if we should at least allow the former for build tools. (Maybe it would even make sense if only PyPI would reject them.)
When processing the ``License-Expression`` field to determine if it contains
a valid license expression, build and publishing tools:
- SHOULD halt execution and raise an error if:
- The field does not contain a valid license expression
- One or more license identifiers are not valid (as defined above) # <-- !
--
I don't plan on using this personally, it's just something I though about that could be an issue if the license
field is eventually removed. This could lower the barrier in that area.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's worth thinking about. Currently, given its a SHOULD
, custom build and publishing tools and those oriented toward this sort of use case are allowed to not raise an error if the license identifier is unknown, if they have reason to do so (as they would, if there was a need for this). We could carve out an exception "...unless they begin with LicenseRef-
, in which case build tools SHOULD issue a warning instead.".
However, that does complicate the spec a bit, and means that build tools won't catch misspellings of LicenseRef-Proprietary
or LicenseRef-Public-Domain
, or users who prepend LicenseRef
to their SPDX identifiers (as some mistakenly do), or users who go for a custom license without carefully considering the implications...unless users see the warning, which at present could be quite buried within the very verbose build output and be very difficult to spot, even for Python core devs, a point made on the Discourse thread. At least anecdotally, this seems more likely than the number of authors who genuinely need to make up their own bespoke license identifiers and use them in the packaging metadata with open source tools.
Since custom and enterprise-oriented build tooling is already free to allow such, which seems likely to be the case if custom internal-only IDs are actually going to be useful for other tools to consume, and the other ways that custom IDs can still be used (SPDX manifest, individual file SPDX tags, REUSE project standard, DEP5 standard, etc, plus actual license file detection, which this PEP makes much easier), on top of the fact that such organizations would have to be using the License
field for that very specific purpose already for it to be a regression, I'm still not sure its necessary in this PEP.
To note, when NPM implemented SPDX identifiers, they initially had a (very poorly named) UNLICENSED
identifier and one that was See license in FILE_PATH
(akin to our License-File
),, otherwise it didn't allow any LicenseRef
custom ones at all, including LicenseRef-Proprietary
and LicenseRef-Public-Domain
, nor allow a package to not have a license expression unless private: true
was manually set. They later removed even that, and only then did users complain in droves, after which they later re-implemented UNLICENSED
following a lot of complaints (see discussion in npm/npm#8918 ). So we'd still be offering clearer and more diverse tags than they do, and there didn't appear to be demand for custom LicenseRef
s as opposed to what this PEP has, LicenseRef-Proprietary
.
My feeling is given the deprecation process will presumably last several years, if there are actual necessary use cases where specifying a custom license more specific than License Ref-Proprietary
in the License-Expression
field packaging metadata, using open source tools, for proprietary projects, currently using license
for this, and for which other options aren't sufficient, we'll hear about it at some point before license
is actually removed, and we can always loosen the spec later but can't tighten it without breaking backward compat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some good points you mentioned. Especially "we can change it later if necessary" is quite convincing.
Thanks for taking the time to discuss these issues. I appreciate that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, we could; the guidance as it stands now is based on yours, and I didn't really touch that section. I just worry about catching unintentional mispellings of the two canonically defined ones, or users thinking they have to prepend LicenseRef
in front of license identifiers, as there unfortunately still seems to be some confusion over. That would also create the need for more cross-field validation, and if license-files.globs
was used or the default was relied upon, there would be no guarantee that the package would actually contain the named license files unless tools checked the source tree/sdist contents. We'd also have to consider how we'd want PyPI to handle that. I'm not sure the relatively niche use cases where something more specific than LicenseRef-Proprietary
was needed outweighs that, but I'd ultimately trust your judgement, assuming others are okay with requiring and cross-validating that at least one License-File
is matched (and additionally, the package may just have a AUTHORS.md
which it would match, which would not really be a license, so its not necessarily reliable even then).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pombredanne As I understand it, the main reason against allowing LicenseRef-xyz
is that it will make implementation and validation of License-Expressions much more difficult.
Maybe it's worth to consider a middle ground here. I think we agree that theoretically there might be instances where LicenseRef-Proprietary
, LicenseRef-Public-Domain
, and the other valid SPDX identifiers aren't enough. What do you guys think about allowing LicenseRef-Custom
for these cases?
I would not recommend to enforce that a License-File
must be provided, though. It would just get too complicated. For instance, what do we do if someone chooses a different name? Instead, anyone who encounters LicenseRef-Custom
should assume it's equivalent to LicenseRef-Proprietary
unless stated otherwise in a License-File
. That's consistent with how it would have been handled anyway if the License-Expression
was missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually initially attracted to the LicenseRef-Custom
approach as well, having initially suggested something like that. However, I'm not sure how much it really adds over LicenseRef-Proprietary
, since AFAIK (correct me if I'm wrong, @pombredanne ) all of the (FSF-recognized) Free and (OSI-recognized) Open Source licenses (i.e., non-proprietary) have SPDX identifiers. Thus, other licenses would thus implicitly qualify as both non-free and non-open, thus are proprietary, aside from the potential for extremely niche cases of licenses that appear to meet the OSS definition and/or the Free Software definition, but are not yet officially recognized by those organizations, and don't otherwise already have a SPDX identifier. So I'm not really sure if there's a clear use case for LicenseRef-Custom
that wouldn't be covered by a license expression including LicenseRef-Proprietary
, LicenseRef-Public-Domain
and the allowed license identifiers, whereas while allowing custom LicenseRefs comes with more downsides, it also has more practical benefits. If there's interest, though, we could think about allowing them in build tools, warning in publishing tools and not allowing on PyPI?
I also thought about proposing distinguishing All Rights Reserved
from more permissive but still proprietary licensing, but if a package was really explicitly licensed All Rights Reserved (which is essentially equivalent to not having a license at all), technically speaking outside of any rights granted by PyPI's ToS, no one would be allowed to download or use it, so there doesn't seem to be a clear motivation to package and share it in the first place. But perhaps LicenseRef-All-Rights-Reserved
could be allowed by packaging tools, but not by publishing tools/PyPI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pombredanne @cdce8p Regardless of what we decide, I think the outcome of our fruitful and insightful discussion so far should be reflected in the PEP. Should I add it as an open issue, or should I document the current approach more completely in the "rejected ideas" section, leaving the door open for further revision on that point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned, for now, I added an Open Issue that summarized the points made thus far in our thread above, so they can be included in the PEP and can serve as a basis for further discussion on the Discourse thread, as needed.
fd9dcf7
to
34d2330
Compare
@CAM-Gerlach Thank you ++ for reviving this! I apologize for going awol and I am grateful for your effort.
Please do. You deserve it! Let me provide some feedback too in this PR too. |
34d2330
to
bfc4ed0
Compare
Thanks so much for your awfully kind words, @pombredanne , and really glad to have you here! I went ahead and added myself, and looking forward to your feedback. Cheers! |
@CAM-Gerlach I finished support (release is still not public) Example: [build-system]
requires = ['hatchling']
build-backend = 'hatchling.build'
[project]
name = 'foo'
license-expression = 'MIT'
dynamic = ['version']
[tool.hatch.version]
path = 'foo/__about__.py'
[tool.hatch.build]
packages = ['foo']
[tool.hatch.build.targets.sdist]
include = ['/tests']
[tool.hatch.build.targets.wheel]
core-metadata-version = '2.3' |
Great news @ofek , thanks! Does that include support for |
Both |
|
||
Classifier: Development Status :: 4 - Beta | ||
Classifier: Environment :: Console (Text Based) | ||
A new ``license-expression`` key is added to the ``project`` table, which has |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really strongly prefer that we use the existing license
key instead.
The plain text form for the license
key was intentionally kept unused for this specific usecase. From https://python.github.io/peps/pep-0621/#license:
A practical string value for the license key has been purposefully left out to allow for a future PEP to specify support for SPDX [6] expressions (the same logic applies to any sort of “type” field specifying what license the file or text represents).
There's absolutely no reason to introduce a new key here; and certainly no reason to mirror key names under project
with what's in the METADATA
file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree - this is what the license
key was intended for, and we should use it as defined.
But this discussion should happen on Discourse, as noted below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plain text form for the license key was intentionally kept unused for this specific usecase. From https://python.github.io/peps/pep-0621/#license:
Just to note once again, I'm well aware of this and it was my initial preference too, and what I initially implemented in the PEP, as I discuss in detail under the Rejected Ideas section.
There's absolutely no reason to introduce a new key here; and certainly no reason to mirror key names under project with what's in the METADATA file.
Actually, there are, as I took the time to describe in detail under the Rejected Ideas section, and as recently discussed at length on this PEP's Discourse thread (the most important one being the hard conflict with the use of the dynamic
key, and reader and author confusion and ambiguity with other tool formats). I revised it now to address the points raised more explicitly and focus more on those core issues while minimizing the rest.
I'm happy to discuss this further and hear your ideas for other ways around these issues on the Discourse thread.
I agree - this is what the license key was intended for, and we should use it as defined.
While reserving the string value for potential not-yet-specified future use is normative, the multiple suggestions for how more precise license metadata might be implemented does not appear to be, being contained in an explanatory note, specified as an envisioned future use, and with multiple options specified; and it does not consider the practical implementation complexities and interactions with the dynamic
field that I've spent much time analyzing in detail here, and which we can discuss further on the thread.
But this discussion should happen on Discourse, as noted below.
Agreed; for the record, I tried to politely steer previous substantive discussion to the Discourse thread the first time this was brought up, but didn't entirely succeed, sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, here's the thread: https://discuss.python.org/t/12622
As an aside, can we merge this PR and move the discussion of the current iteration of the PEP to a new discuss.python.org thread? It's significantly more work to try and keep up with discussion happening on a GitHub PR for reviewing a design document (IMO), and it would be generally nicer to move the discussion to the Packaging category on d.p.o instead; since that has the right audience for reviewing the PEP's content + proposal. |
+1. On a personal note, I am not following the discussions here, and once the PR is merged, I will not review them. So when it comes to approving the PEP, any substantial discussions here will be lost. That will result in extra work for everyone, if there's a point that needs clarification. On a procedural note, PyPA processes and PEP 1 state that discussions should happen on Discourse (technically, they say distutils-sig, but we're in the process of fixing that...). Only formatting issues should be discussed on the tracker here. |
FWIW, @pfmoore I think you have the ability to approve this PR; and poke the PEP editors to merge this (or, maybe you can merge this yourself too. :P) |
I do have the ability to approve/merge, yes. But TBH the diff is too big for me to reasonably review, and as I say, I'm not following the discussions here. I don't know what view the PEP editors would have (but unless someone shouts, I'll assume they don't have an issue with it). I'm happy to merge unreviewed as (in effect) a rewrite of the PEP, with a new discussion of the rewrite to then follow on Discourse. Is that OK? @CAM-Gerlach are you happy if I take that approach? I'm assuming @pombredanne would be OK with that approach, based on his review here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read over the whole PR and have a few minor suggestions, but otherwise I'd also suggest merging and moving substantive discussion back to Discourse.
pep-0639.rst
Outdated
provided ``License`` or ``Classifier`` information if they are able to do so | ||
unambiguously. For instance, if a package only has this license classifier:: | ||
**Note**: To avoid ambiguity, confusion and (per PEP 20, the Zen of Python) | ||
"more than one (obvious) way to do it", a flat array of strings value for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"flat array of strings value" sounds a little awkward, but I see what you mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this is not only awkward but potentially confusing; I really struggled with wording this in a way that was both precise, clear and concise. I thought about it some more and rephrased it as "allowing a flat array of strings as the value for the license-files
key has been left out for now", which isn't perfect but somewhat better, I hope.
bc9804f
to
4032072
Compare
@pfmoore @pradyunsg Thanks and agreed; for the record I tried to steer earlier reviewers toward the Discourse thread for discussion substantive issues like that, but unfortunately was not entirely successful. I'm happy to see this merged for now and will create a new Discourse thread where we can discuss this further. I also revised the relevant Rejected Idea for the @pombredanne @cdce8p For now, I added an Open Issue that summarized the points made thus far in our thread above, so they can be included in the PEP and can serve as a basis for further discussion on the Discourse thread, as needed. @JelleZijlstra Thanks for your review and catching those issues! Unfortunately, for strange reason, GitHub is showing your suggestions as
|
Unfortunately rebase and merge is not supported for this repo. I probably don't have the permissions to change that setting, and I'm not sure we should—this PR is basically just a complete rewrite of the PEP, and I'd expect future readers to just read the result in its entirety. I'm happy to merge this (or let someone else do it) whenever you feel it's ready. |
I created the new thread, left a comment on the old one linking to it, and updated the discussions-to link to point to it in the PEP, so this should be ready to go pending @JelleZijlstra 's review of my final textual additions summarizing the discussion points here.
Got it, understood @JelleZijlstra —thanks for checking! Its ready, minus your review of the open issue I added and the rejected idea I modified to summarize the discussion here, as a more focused starting point for the further Discourse discussion (see the last commit). Once that's done, feel free to squash merge whenever you're ready, then. Thanks again! |
Also, I just want to note that with this PR, PEP 639 (3019 lines) would be the longest PEP by some margin, ahead of PEP 484 (typing; 2511 lines), PEP 622 (pattern matching; 2345), PEP 3156 (asyncio; 2117), and PEP 642 (pattern matching again; 2116), all of which introduced complex new concepts to the language. It's good to be thorough about all the options, but I'm worried that this PEP's sheer size will make it difficult to assess. Anyway, that's mostly up to the packaging community. |
pep-0639.rst
Outdated
build tooling and the ``License-Expression`` metadata field is still | ||
desirable to use for this purpose. | ||
|
||
This has the downsides, however, of not catching mispellings of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has the downsides, however, of not catching mispellings of the | |
This has the downsides, however, of not catching misspellings of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, the irony 😆 Funny enough, when I was doing a quick proofread, I actually caught this same word mispelled [sic] somewhere else, and thought about how embarrassing it would be if that had made it through...heh. Fixed, thanks.
pep-0639.rst
Outdated
|
||
On the other hand, as SPDX aims to (and generally does) encompass all | ||
FSF-recognized "Free" and OSI-approved "Open Source" licenses, | ||
and and those sources are kept closely in sync and are now relatively stable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and and those sources are kept closely in sync and are now relatively stable, | |
and those sources are kept closely in sync and are now relatively stable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, another one of those...I really should have a regex for that. Fixed, thanks.
4032072
to
b02b234
Compare
I think this is a reasonable concern. I know I haven't been reading the new version at least in part because of the sheer size (of the diffs, but the same applies to the doc itself). I'd strongly suggest trying to trim down the text. |
The revised version of the PEP is now live on the Python website. Thanks so much to everyone here who contributed feedback, fixes, ideas and support.
@JelleZijlstra I only count 2726 total, only just ahead of PEP 484 (2511), while my other numbers all match yours. Regardless, you have a fair point. Doing a brief inventory, in order, with conceptually similar categories grouped:
In summary, only about 20% (500 lines) of those lines is the actual core specification, and the rest is reference material intended, variously, for general PEP readers and reviewers (goals/motivation/rationale, terminology, security), package authors (scenarios, examples, how to teach), tool authors (examples, backward compat, reference implementation), those considering alternate approaches (open issues, rejected ideas) and those curious to learn more about the existing license ecossytem (appendix 2 and 3). To note, once #2155 is merged, the References section (perhaps the longest in any PEP) will no longer be rendered separately, so that knocks it down by 100 lines (including the non-printing header and footer). Also, the PEP extensively uses nearly 200 inline internal and external links for ease of navigation and reference, as well as other markup, making the rendered product easier to jump around in and navigate while increasing the source line count by perhaps another 100 lines. Also, in total, 600 lines (22%) is references or appendices rather than body text. By far the biggest single element was the rejected ideas/open issues sections, where I document the approaches taken by previous iterations of this PEP and suggested by others, and summarize why they were accepted or rejected, as required by PEP 1. This section is intended only for reference, for readers who may have such ideas to consider (or be pointed to) before raising them again, as well as clearly documenting the rationale for the future, so only the relevant subsections need be read by those interested. I wanted to give each ideas its due and fairly summarize its merits and shortcomings, rather than just a cursory dismissal leaving the reader guessing or uncertain. There have been a large number of such ideas over the long course of this PEP, and I had my doubts over whether including all them in sufficient detail was justified, due to the resulting length. Given many of them were brought up multiple times, including a number (especially those related to PEP 621 project source metadata, which is well over half the total length) on this PR, it would seem that at least a substantial number of them were. However, if this is seen as excessive, perhaps they can be significantly cut down if and when we find consensus on these topics, and those that others see as not really relevant/viable drastically reduced to a couple sentences or even eliminated. Other than that, some options for addressing this, in order from least to most drastic:
@pfmoore Okay; do you have any suggestions for which sections you feel don't belong here, could be moved somewhere else, or have content cut out of them? Or any opinion on which option(s) above you'd prefer? Personally, given all the time, thought, effort and analysis I've poured into trying to add helpful information for package authors, tool developers, spec readers and those with an interest in the subject matter, its naturally kinda hard to just throw a bunch of that away completely now, so I'd prefer the options that allow preserving as much as practical in one form or another (whether that's in the appendices, in a superseded version of the PEP, somewhere in the PyPA/packaging guide sites, etc). However, if it being too long and unweidely directly or indirectly results in the PEP being rejected, the negative consequences are obviously much greater and more far reaching, so I'll do whatever you think is best for that. |
One of the problems here is that I don't have the time to read this proposal as it stands. So it's hard for me to suggest how to shorten it, because I've not even been able to read and internalise what it's saying yet. But as a general point, will anyone read all of this before commenting? I spent some time this morning, and I feel like I barely got started. I don't have the free time to spend hours reading this proposal. To me, the only things that need to be made clear are:
And that's it. There's stuff I saw about shipping license files, but I was bogged down in too many words before I got into details there. I'd be hoping for something about this concise for that as well.
I understand how hard it is to shorten something you've worked on. What I write is always way too long, and I'm really bad at trimming it. But is it worth keeping all that writing if no-one ends up reading it, because they gave up after the first 2 sections? To try to provide a bit more help, I picked on a few sections, jumped straight to them (so no wider context) and skimmed them (so I will have missed details). Here's my immediate impressions - but please understand that they are immediate impressions, and my comments are extremely abbreviated and shouldn't be taken to imply there's nothing of value, just that I couldn't spot it on a quick skim.
That's probably enough. I don't want to make this feel too much like a hatchet job on all the work you've done. The work is much appreciated and extremely valuable. It's just how you present the results of all the work that needs some tidying up, IMO. Footnotes
|
As requested and discussed at length on the Discourse thread, this is a thorough rewrite of PEP 639, to reflect the numerous additions (PEP 621 metadata, deprecation guidance), changes (use
License-Expression
field instead ofLicense
, license file handling), updates (to many tools, specs, links, best practices and docs it references in the two+ years since its creation) and reformatting (as a standalone PEP rather than a spec update, inline links per #2130 , headings, various formatting/PEP 12 issues).Additionally, adds examples, user scenarios and a terminology section (as mentioned there), comprehensively documents the many ultimately rejected ideas suggested during the course of that thread, expands several of the other sections, resolves many issues and inconsistencies in the text, adds a number of internal links for easier navigation and copyedits for grammar, phrasing, clarity, tone and correctness.
To note, the diffs look much messier and more extensive than they actually are, due to the reorganization and, particularly, the rather antiquated hard-wrapping (rather than, e.g., sentence wrapping) that leads to small changes cascading the appearing to "change" an entire paragraph.
Note to committers: Once this is ready, a rebase-merge would be appreciated, to preserve the carefully arranged commit history organizing the changes into digestible, logical units, while still keeping a linear git history, but I understand if repo policy prevents that.
@pombredanne , if you see this, would you prefer if I added myself to the authors list, so I can continue to help champion, refine and implement this PEP? We all haven't heard from you about it in many months, so I'm happy to take care of your baby for you!
Pinging involved parties (also posted on the thread, ofc): @pombredanne @pfmoore @ofek @hroncok @kpfleming @di @pradyunsg @brettcannon @uranusjr
High-level summary of changes:
All the requested/needed changes have been implemented, including:
License-Expression
field instead of theLicense
field, as agreed hereIn addition, I've addressed the following issues I found during the process:
license_files
should match the subdir layout of the source as requested on wheel and Setuptools issueslicense_file
subdirectory as suggested on wheel and setuptools issuesFinally, I've added some additional non-normative content that should be useful here:
Full detailed chronological description of changes:
Restructure PEP to reflect form used in PEP 643, framing it as a standalone specification rather than a canonical incremental update of the actual core metadata spec itself, as proposed on Discuss thread and confirmed by @pfmoore
Implement basic formatting cleanup and fixes
Rewrite PEP to reflect adoption of a standalone
License-Expression
field instead of theLicense
field, as agreed on the Discuss threadLicense
field to refer to theLicense-Expression
field, where appropriateLicense-Expression
field with that of re-using theLicense
field in the Rejected ideas sectionExpand and update the PEP to reflect tool, spec and documentation changes since its initial drafting, fix link rot and follow current best practices
license_files
being supported by both tools and both wheels and sdists and Include the additional license file names I added in Add a few additional common patterns for legal and attribution files pypa/wheel#251License-File
support in package metadata added in AddLicense-File
field to package metadata pypa/setuptools#2645 based on this PEPAdd changes to PEP 621 metadata, as suggested by @pfmoore , restructure accordingly and greatly expand examples
expression
andfiles
, deprecatelicense
andfile
, and describe tool recommendations and examples thereofFurther specify and clarify license-file handling, in PEP 621 and core metadata
.dist-info
vs.egg-info
/ sdists vs wheelsAdd many additional missing ambiguous license classifiers and further tool guidance on handling them
Specify
license-expression
andlicense-files
as separate PEP 621 fields, greatly refine dynamic field handling and add many rejected ideasSpecify that
license_files
should match the subdir layout of the source as suggested on issuesAdd proposed
license_file
subdirectory as suggested on issuesdata
or new sys.prefixAdd requirements for PyPI validation of new fields, per consensus on Discourse
Add additional open issues and rejected ideas from Discourse/issues
Add User Scenarios section for ordinary package authors
Update general sections to reflect changes
Add definitions section (and use consistently): https://packaging.python.org/glossary/
Add, fix and improve links, references and more
Copyedit PEP for minor grammar, phrasing and style issues and proofread for typos and errors