Skip to content

Issue 351: update text about selection matching #358

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 0 commits into from

Conversation

aphillips
Copy link
Member

No description provided.

@aphillips aphillips requested review from stasm and eemeli February 26, 2023 16:41
Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mixing in changes to spec/syntax.md along with adding a version of the supporting document earlier linked to in #351. These changes need to be untangled for this to be properly reviewable.

@macchiati
Copy link
Member

Added general comment in #351

+ Can visually inspect match order.
+ May be more efficient when perfoming match (??)

**Cons**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment to issue #271 (comment) where I try to show that the first match is really not possible to implement properly.

Because the programmers / translators / tooling might not have enough information to make those decisions about sorting.


Because each _selector_ might produce different rankings, the whole list must be stack ranked. Any _variant_ that produces a "no match" can be eliminated from the candidate list.

We could specify that for-each _key_ the _selector_ must produce a ranked value (say between 0 and 1) or `0` for no match. **Non-matching items (that is, any _key_ value that scores `0`) are eliminated.** The value `*` **must** produce at least the minimal non-zero (matching) value, but ***may*** return a higher value. If no _key_ values match, throw an error (this should never happen as it is a syntax error to omit `*`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EM proposal contains a section on best matching algorithm s:
"Annex 2: Multi-selector matching algorithm alternatives"

And the code here implements a selection using something similar to the score proposal here.


A few differences from my thinking and this proposal (to consider, none of the blockers from my side):

  1. Nitpicking: a score between 0 and 100 or similar, using integers (always easier and safer than fractional values)

  2. The * is 0, and not under the control of the function. The function is not even called, * matches everything.
    I think that allowing functions to "mock" with the value of * is confusing, open for abuse. There might be some pros, but I don't see any.

  3. The score for no match is negative. If a function returns a negative score for one item in the tuple, the tuple does not match. We can drop the iteration on the tuple items and go to the next tuple.

  4. The score for the whole tuple is the sum of all the item scores, SQUARED
    This matches the intuitive idea of distance in the real world.
    The distance from origin to a point:

  • in 1D space is x (same as √(x²))
  • in 2D is √(x² + y²)
  • in 3D is √(x² + y² + z²)
  • in n dimensions is √(d₁² + d₂² + ... + dₙ²)
    There is no need to extract the root, as we compare the scores.

The pseudocode is something like this:

bestMatch = null
bestMatchScore = -1
for each tuple // iterate vertically, to select the best matching `when` entry.
    for each key in the keys of the current tuple // iterate horizontally
        tupleScore = 0
        if key == '*'
            itemScore = 0
        else
            itemScore = function.invoke( value_to_select_on, key ) // -1 or 0-100
        if itemScore < 0
            tupleScore = -1
            break from the inner cycle (on items)
        tupleScore += itemScore * itemScore // item score squared
    if tupleScore > bestMatchScore
        bestMatchScore = tupleScore
        bestMatch = tuple
if bestMatchScore == -1 // we didn't find any match, not even `* * *` (which would be zero)
    error, ultimate fallback option is missing
return bestMatch

We can even optimize a bit.
If the tupleScore is the max possible value (100*100 * number of items), we can return, as we found the best possible match (all items have a score of 100)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a score between 0 and 100 or similar, using integers (always easier and safer than fractional values)

I agree. This is an illustration.

The * is 0, and not under the control of the function. The function is not even called, * matches everything.
I think that allowing functions to "mock" with the value of * is confusing, open for abuse. There might be some pros, but I don't see any.

* is never zero, but you can't think of "matching vs. non-matching". For best match, there are cases where * matches better for a given selector. I illustrate this with * == other:

when * * * 1 1 0 0.1 + 0.1 + 0.1 = 0.3 N * here is default
when * * * 11 11 42.0 0.5 + 0.5 + 0.5 = 1.5 Y * here is like other

The key thing here is that that we need to decide between best and first match. If we choose best match, we can then describe best match using one of the mechanisms such as you describe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants