Skip to content

compiler: support arbitrary value js struct tag values #779

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 16, 2018

Conversation

myitcv
Copy link
Member

@myitcv myitcv commented Mar 24, 2018

Javascript allows for arbitrary strings to be used as index values on an
Object. A good example of where this comes up is React where "data-*"
and "aria-*" indexes are used on objects
(https://reactjs.org/docs/dom-elements.html).

By switching from property selectors to the index operator (where required) we can support arbitrary string values in js-tagged fields for *js.Object special structs.

Fixes #778

Copy link
Member

@dmitshur dmitshur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sending this.

Can you check how much of an impact this change makes on the generated file size of a moderately sized program compiled with GopherJS? With minification, before and after gzip compression.

If it's not trivial, we'll probably want to only use this longer form when the string needs it.

js/js_test.go Outdated

if want != got {
t.Errorf("Value via non-identifier js tag field gave %q, want %q", got, want)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add coverage for js.Object.Get and js.Object.Set methods? Are they equally affected?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@myitcv myitcv force-pushed the arbitrary_string_js_tags branch from 89aa286 to 082df8a Compare March 25, 2018 12:45
@myitcv
Copy link
Member Author

myitcv commented Mar 25, 2018

Can you check how much of an impact this change makes on the generated file size of a moderately sized program compiled with GopherJS? With minification, before and after gzip compression.

Good point. I think it's probably harder to come up with a definition for a representative "moderately sized program" than to make the change, so I made the change to implement this fallback logic. Please take a look.

@myitcv myitcv force-pushed the arbitrary_string_js_tags branch from 082df8a to ca48d94 Compare March 25, 2018 12:51
@myitcv
Copy link
Member Author

myitcv commented Apr 1, 2018

Gentle ping @shurcooL - thanks.

Copy link
Contributor

@lologarithm lologarithm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the change to only switch to the more complicated implementation when needed.

//
// Uses definition of an identifier from
// https://developer.mozilla.org/en-US/docs/Glossary/Identifier
func jsTagFormat(jsTag string) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is entirely a nitpick, so ignore it if you wish... But I wonder if formatJSTag would be a better name, and more consistent with the other func names here, which use the <verb><object> convention (i.e. formatExpr and encodeIdent etc)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@myitcv myitcv force-pushed the arbitrary_string_js_tags branch from ca48d94 to 36bc620 Compare April 12, 2018 21:57
@myitcv
Copy link
Member Author

myitcv commented Apr 12, 2018

@hajimehoshi - any thoughts?

Gentle ping @shurcooL

@hajimehoshi
Copy link
Member

I'll look at this.

js tags

struct tags for js?

//
// Uses definition of an identifier from
// https://developer.mozilla.org/en-US/docs/Glossary/Identifier
func formatJSTag(jsTag string) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the word "js tag" is confusing. Isn't this actually a struct tag for js? How about formatStructTagForJS? (I'm not an English native speaker so my suggestion might not sound natural)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in the latest PR - I chose a slightly different version to keep more consistency with https://github.com/gopherjs/gopherjs/wiki/JavaScript-Tips-and-Gotchas

js/js_test.go Outdated

type StructWithNonIdentifierJsTags struct {
object *js.Object
Name string `js:"my name"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding more test cases like starting with a char that is not for IDs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect we already have coverage for that elsewhere. It's good to avoid adding same tests in different places.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there are such tests, the results should be changed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the test more "exciting" - I don't think any more are required because if we were doing the wrong thing then the generated Javascript would not parse.

func formatJSTag(jsTag string) string {
useDot := true

RuneRange:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trivial style nitpick, maybe use Outer label. It'd be more consistent with:

Outer:

(In general, RuneRange is not a label I see often, but Outer is pretty common.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@myitcv myitcv changed the title compiler: support arbitrary string js tags compiler: support arbitrary value js struct tag values Apr 14, 2018
@myitcv myitcv force-pushed the arbitrary_string_js_tags branch 2 times, most recently from d7271ca to 8798d09 Compare April 14, 2018 09:46
@myitcv
Copy link
Member Author

myitcv commented Apr 14, 2018

@hajimehoshi I've tweaked the wording of the description (and commit) to be more precise and inline with https://github.com/gopherjs/gopherjs/wiki/JavaScript-Tips-and-Gotchas

Revised commit now pushed up. PTAL

return "." + jsTag
}

return "[\"" + jsTag + "\"]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need escaping some chars like backslash? text/template.JSEscapeString is needed here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's a good catch; but we need strconv.Quote in this situation.

Also updated the test to include more special characters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in latest commit.

Copy link
Member

@hajimehoshi hajimehoshi Apr 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, there are very subtle differences between strconv.Quote and template.JSEscapeString...

  1. < and > are treated as special chars in JSEscapeString
  2. strconv.Quote can produce \U000XXXXX for non-BMP and non-printable char, which is not valid escaping in ES5

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right - thank you for correcting me!

I've just added a test in the latest commit that does a .Get in order to ensure the property that has been set via the js struct tag is using the expected key.

PTAL

@myitcv myitcv force-pushed the arbitrary_string_js_tags branch 3 times, most recently from 7b1c965 to 20b1930 Compare April 14, 2018 10:54
Copy link
Member

@hajimehoshi hajimehoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thank you!

@shurcooL are you happy with this PR?

Copy link
Member

@dmitshur dmitshur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions on how I think this PR can be improved.

@@ -643,3 +645,38 @@ func endsWithReturn(stmts []ast.Stmt) bool {
func encodeIdent(name string) string {
return strings.Replace(url.QueryEscape(name), "%", "$", -1)
}

// formatJSStructTagVal returns a string of Javascript code appropriate for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JavaScript is spelled with a capital S, not "Javascript". See https://en.wikipedia.org/wiki/JavaScript, https://www.javascript.com/.

Also below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

// Uses definition of an identifier from
// https://developer.mozilla.org/en-US/docs/Glossary/Identifier
func formatJSStructTagVal(jsTag string) string {
useDot := true
Copy link
Member

@dmitshur dmitshur Apr 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think needsEscaping would be a more readable name.

That way it can start out with zero value:

var needsEscaping bool

And become set to true when encountering a character in the loop that needs escaping:

needsEscaping = true
break Outer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I agree. We want the default behaviour to be to use the dot notation because, as noted before, this is more efficient in output terms. "Dot notation" is taken directly from the MDN. To my mind it's better to have the variable name reflect the intended semantics, hence useDot.

Copy link
Member

@dmitshur dmitshur Apr 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want the default behaviour to be to use the dot notation because, as noted before, this is more efficient in output terms.

My suggestion doesn't contradict that. needsEscaping controls whether non-default behavior of using escaping is used. If its value remains false, the default behavior (using dot notation) is used.

It's similar to when you handle error conditions (non-default behaviors), you typically write:

var err error // (controls whether non-default path will be used)
f, err = os.Open("file.txt")
if err != nil {
    // less common (non-default) path
    return fmt.Errorf("couldn't open file.txt: %v", err)
}
// normal (default) path

Instead of:

successfullyOpened := true
f, err := os.Open("file.txt")
if err != nil {
    successfullyOpened = false
}
if successfullyOpened {
    // normal (default) path
}
// less common (non-default) path
return fmt.Errorf("couldn't open file.txt: %v", err)

It's not a big deal, I just wanted to explain the rationale behind my suggestion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion doesn't contradict that. needsEscaping controls whether non-default behavior of using escaping is used. If its value remains false, the default behavior (using dot notation) is used.

No I understand what you were getting at, it's just the name needsEscaping that doesn't make sense to me. I'm quite happy having the inverse logic, but if we were to do so then useBracket would probably be a better name (it's the alternative form).

useDot = false
break Outer
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this whole switch statement can be expressed more simply as:

ok := unicode.IsLetter(r) || (i != 0 && unicode.IsNumber(r)) || r == '$' || r == '_'
if !ok {
    needsEscaping = true
    break
}

Then no need for Outer: label either.

Inspired by https://github.com/gopherjs/snippet-store/blob/09ea5d200e03052ed3cc332e9c6e96ffbb6d6eb3/id.go#L37-L40.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice; yes that reads much better

return "." + jsTag
}

return "[\"" + template.JSEscapeString(jsTag) + "\"]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should indent the less frequent case, and keep the more common/normal case at normal indentation. So:

if needsEscaping {
    return "[\"" + template.JSEscapeString(jsTag) + "\"]"
}
return "." + jsTag

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, per my comment above, I think useDot is clearer.

Copy link
Member

@flimzy flimzy Apr 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just eliminate the variable entirely, and further shorten the function?

func formatJSStructTagVal(jsTag string) string {
    for i, r := range jsTag {
        ok := unicode.IsLetter(r) || (i != 0 && unicode.IsNumber(r)) || r == '$' || r == '_'
        if !ok {
            return "[\"" + template.JSEscapeString(jsTag) + "\"]"
        }
    }

    return "." + jsTag
}

To me this is clearer than either variable name alternative.


"github.com/alecthomas/template"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant to import html/template from the Go standard library rather than a new 3rd party package, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed that... Thanks. I think text/template is enough (html/template is also fine and does the same thing).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

goimports 😢 - thank you, good catch

Copy link
Member

@dmitshur dmitshur Apr 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you using the latest version of goimports? I think it's expected to prefer std lib packages over 3rd party ones when it matches both. If not, that's a bug and worth reporting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a look, thanks for the tip.

Javascript allows for arbitrary strings to be used as index values on an
Object. A good example of where this comes up is React where `"data-*"`
and `"aria-*"` indexes are used on objects
(https://reactjs.org/docs/dom-elements.html).

By switching from property selectors to the index operator (where
required) we can support arbitrary string values in js-tagged fields for
`*js.Object` special structs.

Fixes gopherjs#778
@myitcv myitcv force-pushed the arbitrary_string_js_tags branch from 20b1930 to e78570d Compare April 14, 2018 20:20
@myitcv
Copy link
Member Author

myitcv commented Apr 14, 2018

@shurcooL - thanks for reviewing. I've just pushed up some changes, comments against one of your bits of feedback left above.

@dmitshur
Copy link
Member

dmitshur commented Apr 14, 2018

I've just pushed up some changes

I saw the comments, but not seeing changes. This PR currently has 1 commit, so it's hard to tell what has changed (if anything) from last time.

If you don't mind, please avoid squashing and force pushing to the PR branch, but rather feel free to push additional commits. We can easily squash them into one commit when merging the PR. It makes reviewing easier, because the diffs show only what has changed since last time. (Unfortunately, GitHub isn't as good as Gerrit in this sense; Gerrit makes it possible to see the entire history of a change at all times.)

@myitcv
Copy link
Member Author

myitcv commented Apr 15, 2018

I saw the comments, but not seeing changes. This PR currently has 1 commit, so it's hard to tell what has changed (if anything) from last time.

Unfortunately, GitHub isn't as good as Gerrit in this sense; Gerrit makes it possible to see the entire history of a change at all times.)

Sorry, yes. I keep forgetting that. Old habits die hard. Will try to change that.

myitcv added a commit to myitcv/react that referenced this pull request Apr 15, 2018
Syntax viewer is an incredibly simple application for viewing the syntax
tree representations of different languages.

As part of this we vendor mvdan.cc/sh/syntax and its deps.

We also re-enable AriaSet for now until we get
gopherjs/gopherjs#779 merged.
myitcv added a commit to myitcv/react that referenced this pull request Apr 15, 2018
Syntax viewer is an incredibly simple application for viewing the syntax
tree representations of different languages.

As part of this we vendor mvdan.cc/sh/syntax and its deps.

We also re-enable AriaSet for now until we get
gopherjs/gopherjs#779 merged.

We also add the float CSS attribute.

And fix some of the JSX parsing.
Add examples to show what kind of output formatJSStructTagVal returns.

Simplify and shorten test.

Remove unhelpful blank lines, they're best used sparingly.
Copy link
Member

@dmitshur dmitshur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed some style/documentation adjustments in commit 5ade898 (to skip the extra latency introduced by roundtripping via code review comments). Let me know if everything looks okay to you @myitcv.

After that, this LGTM. Thanks a lot for your work on this @myitcv.

@myitcv
Copy link
Member Author

myitcv commented Apr 16, 2018

Thanks @shurcooL - looks fine. Let's get this merged!

@dmitshur
Copy link
Member

Thanks again @myitcv!

SeniorDev807 added a commit to SeniorDev807/itcv that referenced this pull request Jul 21, 2019
Syntax viewer is an incredibly simple application for viewing the syntax
tree representations of different languages.

As part of this we vendor mvdan.cc/sh/syntax and its deps.

We also re-enable AriaSet for now until we get
gopherjs/gopherjs#779 merged.

We also add the float CSS attribute.

And fix some of the JSX parsing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants