Skip to content

Add linting and validation of all YAML files #27698

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 23, 2024
Merged

Conversation

QuLogic
Copy link
Member

@QuLogic QuLogic commented Jan 25, 2024

PR summary

This adds stylistic linting and schema validation to all YAML files that we use. This will prevent errors like #27642 and #27694 before merging.

Because these files tend to have fairly nested entries, I chose a rather long line length of 111. This is long enough to fit the longest action reference without wrapping. Note also that azure-pipelines.yml was almost entirely incorrectly indented, so appears as if it's changed entirely, but it's mostly just indent.

I also changed some newline-preserving blocks (|) to newline-folding (>-) as they don't need to keep newlines.

At the moment, some of the schemas do not pass even though the CI works, so I'm just checking whether those are outdated or just a bit stricter about things.

PR checklist

@QuLogic QuLogic added the CI: testing CI configuration and testing label Jan 25, 2024
@QuLogic QuLogic added this to the v3.9.0 milestone Jan 25, 2024
@QuLogic
Copy link
Member Author

QuLogic commented Feb 7, 2024

In the interest of getting this in without having to deal with many conflicts, I've commented out the Azure Pipelines check, which we can enable once it's fixed.

@QuLogic QuLogic marked this pull request as ready for review February 7, 2024 03:36
@QuLogic
Copy link
Member Author

QuLogic commented Feb 7, 2024

Ah, it looks like pre-commit.ci doesn't have network access, so we can't reference schemas from SchemaStore with URLs. Would we be fine with dropping those schemas in the repo somewhere?

@ksunden
Copy link
Member

ksunden commented Feb 7, 2024

personally, unless the schemas are too large (I'm really thinking like megabytes, which would be relatively impressive for plain text files) I'd be okay with adding them, probably under the ci folder (especially as they are used by non-GHA ci... if it was GHA, maybe I'd say inside of .github)

@QuLogic
Copy link
Member Author

QuLogic commented Feb 13, 2024

They are, at this time, 160K, so not much of a burden to ship.

@@ -146,39 +147,39 @@ jobs:
path: dist/

- name: Build wheels for CPython 3.12
uses: pypa/cibuildwheel@ce3fb7832089eb3e723a0a99cab7f3eaccf074fd # v2.16.5
uses: pypa/cibuildwheel@ce3fb7832089eb3e723a0a99cab7f3eaccf074fd # v2.16.5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to double check that this change does not interfere with dependabot...

my guess is that it is not a problem for dependabot directly... but that it is possible/likely that dependabot will not place two spaces when it updates, and thus cause linting failures as a result.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the implementation commit, the comment regex is (?<comment>\s+#.*), so it should be fine with any number of leading spaces/tabs. Then updating the version is only replacing old with new.

There's also a test case with multiple spaces before and after the #.

Comment on lines 90 to 122
- id: check-jsonschema
files: ^\.appveyor\.yml$
args: ["--verbose", "--schemafile", "ci/schemas/appveyor.json"]
- id: check-jsonschema
files: ^\.circleci/config\.yml$
args: ["--verbose", "--schemafile", "ci/schemas/circleciconfig.json"]
- id: check-jsonschema
files: ^\.github/FUNDING\.yml$
args: ["--verbose", "--schemafile", "ci/schemas/github-funding.json"]
- id: check-jsonschema
files: ^\.github/ISSUE_TEMPLATE/config\.yml$
args: ["--verbose", "--schemafile", "ci/schemas/github-issue-config.json"]
- id: check-jsonschema
files: ^\.github/ISSUE_TEMPLATE/.*\.yml$
exclude: ^\.github/ISSUE_TEMPLATE/config\.yml$
args: ["--verbose", "--schemafile", "ci/schemas/github-issue-forms.json"]
- id: check-jsonschema
files: ^\.github/codecov\.yml$
args: ["--verbose", "--schemafile", "ci/schemas/codecov.json"]
- id: check-jsonschema
files: ^\.github/labeler\.yml$
args: ["--verbose", "--schemafile", "ci/schemas/pull-request-labeler-5.json"]
- id: check-jsonschema
files: ^environment\.yml$
args: ["--verbose", "--schemafile", "ci/schemas/conda-environment.json"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if each of these could be given a unique name that would make it easier to tell what to look for when it fails (I don't actually know what all is printed on failure, but the wall of 8 generic Validate files with jsonschema doesn't actually tell me what it has validated (granted, mostly will be skipped when running locally since they all have file filters, but still)

(Also that it says jsonschema is mildly confusing since these are yaml files... I understand that the schema is specified in json, but not sure I would remember(/expect if I didn't know this PR) that when the check fails)

Copy link
Member

@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put a README in schemas stating

Vendored YAML Schemas for linting and validation.
Since pre-commit CI doesn't have Internet access, we need to bundle these files
in the repo. The schemas can be updated using vendor_schema.py.

Also, consider putting vendor_schema.py in the schema folder. It's slightly better to have everything in one place rather than the generating script in the parent folder.

Otherwise LGTM.

The `if:` expressions don't need to preserve newlines, for example.
This is necessary because pre-commit CI doesn't have Internet access and
thus cannot download schemas itself.
@timhoffm
Copy link
Member

Not sure whether CI failures are unrelated. You may self-merge if you are sure, they are unrelated.

@QuLogic
Copy link
Member Author

QuLogic commented Feb 23, 2024

I think (other than Cygwin) they're all unfortunately flaky errors, but I've re-run them for now.

@QuLogic
Copy link
Member Author

QuLogic commented Feb 23, 2024

Everything looks good now.

@QuLogic QuLogic merged commit a6da814 into matplotlib:main Feb 23, 2024
@QuLogic QuLogic deleted the yaml-lint branch February 23, 2024 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Run cibuildwheel Run wheel building tests on a PR CI: Run cygwin Run cygwin tests on a PR CI: testing CI configuration and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants