Skip to content

chore: add logic to validate all modules #11

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 30 commits into from
May 2, 2025
Merged

Conversation

Parkreiner
Copy link
Member

@Parkreiner Parkreiner commented Apr 19, 2025

Closes coder/internal#531

Changes made

  • Added functionality to validate the structure of module README files (frontmatter and the README body)
  • Added a really basic snapshot-ish test for the module README body validation
  • Updated README files that were previously violating README requirements (the old modules validation logic wasn't catching these)
  • Changed ValidationPhase from an int to a string

Notes

  • I've brought this up before, but I do think that it should be safe to remove the maintainer_github field, because that information can be intuited by the directory path. I didn't add any checks for it, because the new Registry site build process shouldn't need them
  • With how the current logic is set up (and assuming that templates and modules maintain the same README structure for a while), it'll be really easy to add
  • This setup works for now, but sometime down the line, I would like to refactor the code to use concurrency. Not for performance, but for "correctness" reasons – the structure of a concurrent program naturally lines up with the needs of a data pipeline. It'll make it easy to improve the granularity of the error messages, and help limit the amount of back-and-forth contributors will need to do with CI if they're editing multiple README files at once
  • The one test I added was to help make development easier for me, but I am planning to flesh that test suite out down the line

@Parkreiner Parkreiner self-assigned this Apr 19, 2025
Comment on lines 38 to 43
type coderResourceReadmes map[string]coderResourceReadme

func (crr coderResourceReadmes) Get(filePath string) (coderResourceReadme, bool) {
rm, ok := crr[filePath]
return rm, ok
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't sure if this pattern made sense in some cases. The idea is that if you have a map, the type signature doesn't communicate how the keys are formatted. With a custom type, the method makes it clear that you need a filePath to index the value by

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the type and method are unnecessary, you can leave a comment at the usage points informing the map structure since it doesn't travel far in the code when it is used.

Comment on lines 121 to 122
var errs []error
errs = append(errs, validateReadmeBody(trimmed)...)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't sure about whether Go has any rules of thumb for when to treat data as immutable, but I treated the results of validateReadme as if it were, even though there's no other owner for the slice after the function ends, and I could've just mutated the result directly

// would be a big improvement.
var terraformVersionRe = regexp.MustCompile("^\\s*\\bversion\\s+=")

func validateCoderResourceReadmeBody(body string) []error {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is convoluted, but at least with it seeming to work (from all the input files that exist right now), I'd rather get unit tests in place for it, and then use those to guide refactoring the function to do AST parsing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't want to spend too much time on tests for this iteration, but needed to isolate the README-parsing logic to figure out how to make sure it worked faster

Comment on lines 71 to 73
// Todo: This seems to work okay for now, but the really proper way of doing
// this is by parsing this as an AST, and then checking the resulting nodes
func validateReadmeBody(body string) []error {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would also like to get tests in place, and then use those to refactor this

@Parkreiner Parkreiner requested a review from bcpeinhardt April 28, 2025 20:30
Copy link
Collaborator

@bcpeinhardt bcpeinhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😎
We can talk about cleaning this and #10 up a bit and possibly moving it into coder/coder-registry post Profile launch.

Copy link
Collaborator

@f0ssel f0ssel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I'm fine without tests because the existing readmes act as one for now and we can circle back with other work.

Comment on lines 38 to 43
type coderResourceReadmes map[string]coderResourceReadme

func (crr coderResourceReadmes) Get(filePath string) (coderResourceReadme, bool) {
rm, ok := crr[filePath]
return rm, ok
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the type and method are unnecessary, you can leave a comment at the usage points informing the map structure since it doesn't travel far in the code when it is used.

Comment on lines 91 to 94
func validateCoderResourceTags(tags []string) error {
if len(tags) == 0 {
return nil
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tags can be nil here, right? Is that valid, or should we check for it? len will return 0 either way, but if you want to enforce that people put it in the frontmatter with an empty array we should fail on nil. If we are cool with omitting it there's nothing to do here, but we may want to put a comment saying that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good call. I didn't think about it before, but I'll go ahead and add a check, even though I don't think it'll matter much in practice. The YAML library should guarantee that the tags field gets parsed as an allocated array, but since the function can still be called separately, there's a risk of edge cases down the line

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, so even if the key is completely ommited it still parses to a slice with length 0? If so they it's safe, but I wasn't sure.

Copy link
Member Author

@Parkreiner Parkreiner May 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be nil if the field were optional, but it's required. So that gives us some guarantees, because parser errors out on omitted keys, instead of initializing them with nil values. If a user forgets the key, CI should ding them

@@ -27,7 +27,7 @@ type contributorProfileFrontmatter struct {
ContributorStatus *string `yaml:"status"`
}

type contributorProfile struct {
type contributorProfileReadme struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

strings.NewReader(strings.TrimSpace(readmeText)),
)

lineScanner := bufio.NewScanner(strings.NewReader(strings.TrimSpace(readmeText)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@f0ssel
Copy link
Collaborator

f0ssel commented Apr 30, 2025

Oh and let's get rid of the camelCase, we do all lowercase for all go files like repostructure.go over repoStructure.go. If needed we can drop to snake_casing but I prefer to try to have simpler filenames first.

@Parkreiner Parkreiner changed the base branch from mes/module-validation to main May 2, 2025 15:32
@Parkreiner Parkreiner merged commit ba6fea8 into main May 2, 2025
3 checks passed
@Parkreiner Parkreiner deleted the mes/module-validation-2 branch May 2, 2025 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate modules from coder/modules to new hub repo
3 participants