-
-
Notifications
You must be signed in to change notification settings - Fork 36
Issue 351: update text about selection matching #358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mixing in changes to spec/syntax.md
along with adding a version of the supporting document earlier linked to in #351. These changes need to be untangled for this to be properly reviewable.
Added general comment in #351 |
+ Can visually inspect match order. | ||
+ May be more efficient when perfoming match (??) | ||
|
||
**Cons** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment to issue #271 (comment) where I try to show that the first match is really not possible to implement properly.
Because the programmers / translators / tooling might not have enough information to make those decisions about sorting.
|
||
Because each _selector_ might produce different rankings, the whole list must be stack ranked. Any _variant_ that produces a "no match" can be eliminated from the candidate list. | ||
|
||
We could specify that for-each _key_ the _selector_ must produce a ranked value (say between 0 and 1) or `0` for no match. **Non-matching items (that is, any _key_ value that scores `0`) are eliminated.** The value `*` **must** produce at least the minimal non-zero (matching) value, but ***may*** return a higher value. If no _key_ values match, throw an error (this should never happen as it is a syntax error to omit `*`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The EM proposal contains a section on best matching algorithm s:
"Annex 2: Multi-selector matching algorithm alternatives"
And the code here implements a selection using something similar to the score proposal here.
A few differences from my thinking and this proposal (to consider, none of the blockers from my side):
-
Nitpicking: a score between 0 and 100 or similar, using integers (always easier and safer than fractional values)
-
The
*
is 0, and not under the control of the function. The function is not even called, * matches everything.
I think that allowing functions to "mock" with the value of*
is confusing, open for abuse. There might be some pros, but I don't see any. -
The score for no match is negative. If a function returns a negative score for one item in the tuple, the tuple does not match. We can drop the iteration on the tuple items and go to the next tuple.
-
The score for the whole tuple is the sum of all the item scores, SQUARED
This matches the intuitive idea of distance in the real world.
The distance from origin to a point:
- in 1D space is x (same as √(x²))
- in 2D is √(x² + y²)
- in 3D is √(x² + y² + z²)
- in
n
dimensions is √(d₁² + d₂² + ... + dₙ²)
There is no need to extract the root, as we compare the scores.
The pseudocode is something like this:
bestMatch = null
bestMatchScore = -1
for each tuple // iterate vertically, to select the best matching `when` entry.
for each key in the keys of the current tuple // iterate horizontally
tupleScore = 0
if key == '*'
itemScore = 0
else
itemScore = function.invoke( value_to_select_on, key ) // -1 or 0-100
if itemScore < 0
tupleScore = -1
break from the inner cycle (on items)
tupleScore += itemScore * itemScore // item score squared
if tupleScore > bestMatchScore
bestMatchScore = tupleScore
bestMatch = tuple
if bestMatchScore == -1 // we didn't find any match, not even `* * *` (which would be zero)
error, ultimate fallback option is missing
return bestMatch
We can even optimize a bit.
If the tupleScore is the max possible value (100*100 * number of items), we can return, as we found the best possible match (all items have a score of 100)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a score between 0 and 100 or similar, using integers (always easier and safer than fractional values)
I agree. This is an illustration.
The * is 0, and not under the control of the function. The function is not even called, * matches everything.
I think that allowing functions to "mock" with the value of * is confusing, open for abuse. There might be some pros, but I don't see any.
*
is never zero, but you can't think of "matching vs. non-matching". For best match, there are cases where *
matches better for a given selector. I illustrate this with *
== other
:
when * * * | 1 | 1 | 0 | 0.1 + 0.1 + 0.1 = 0.3 | N | * here is default |
---|---|---|---|---|---|---|
when * * * | 11 | 11 | 42.0 | 0.5 + 0.5 + 0.5 = 1.5 | Y | * here is like other |
The key thing here is that that we need to decide between best and first match. If we choose best match, we can then describe best match using one of the mechanisms such as you describe.
No description provided.