Issue 351: update text about selection matching #358

aphillips · 2023-02-26T16:41:21Z

No description provided.

eemeli

This is mixing in changes to spec/syntax.md along with adding a version of the supporting document earlier linked to in #351. These changes need to be untangled for this to be properly reviewable.

macchiati · 2023-02-28T23:10:29Z

Added general comment in #351

mihnita · 2023-03-03T20:22:42Z

exploration/selection-matching-options.md

+ Can visually inspect match order.
+ May be more efficient when perfoming match (??)
+
+**Cons**


I've added a comment to issue #271 where I try to show that first match is really not possible to implement properly, because programmers, translators, and tooling might not have enough information to make those sorting decisions.

mihnita · 2023-03-03T21:08:51Z

exploration/selection-matching-options.md

+
+Because each _selector_ might produce different rankings, the whole list must be stack ranked. Any _variant_ that produces a "no match" can be eliminated from the candidate list.
+
+We could specify that for-each _key_ the _selector_ must produce a ranked value (say between 0 and 1) or `0` for no match. **Non-matching items (that is, any _key_ value that scores `0`) are eliminated.** The value `*` **must** produce at least the minimal non-zero (matching) value, but ***may*** return a higher value. If no _key_ values match, throw an error (this should never happen as it is a syntax error to omit `*`)


The EM proposal contains a section on best-matching algorithms:
"Annex 2: Multi-selector matching algorithm alternatives"

And the code here implements a selection using something similar to the score proposal here.

A few differences between my thinking and this proposal (to consider, none of them blockers from my side):

Nitpicking: a score between 0 and 100 or similar, using integers (always easier and safer than fractional values)

The `*` is 0, and not under the control of the function. The function is not even called, `*` matches everything.
I think that allowing functions to "mock" with the value of `*` is confusing, open for abuse. There might be some pros, but I don't see any.

The score for no match is negative. If a function returns a negative score for one item in the tuple, the tuple does not match. We can drop the iteration on the tuple items and go to the next tuple.

The score for the whole tuple is the sum of the squares of the item scores.
This matches the intuitive idea of distance in the real world.
The distance from the origin to a point:

in 1D space is x (same as √(x²))

in 2D is √(x² + y²)

in 3D is √(x² + y² + z²)

in n dimensions is √(d₁² + d₂² + ... + dₙ²)
There is no need to extract the root: the square root is monotonic, so comparing the sums of squares gives the same ordering.
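As a quick check of the claim that the root can be skipped, here is a minimal Python sketch. The helper names and the sample score tuples are made up for illustration; they are not taken from the proposal.

```python
import math

def squared_distance(scores):
    """Sum of squared item scores (no square root taken)."""
    return sum(s * s for s in scores)

def distance(scores):
    """Euclidean distance from the origin, for comparison only."""
    return math.sqrt(squared_distance(scores))

# Hypothetical per-item scores for four candidate tuples.
candidates = [(100, 0, 0), (50, 50, 50), (0, 0, 0), (100, 100, 0)]

# Ranking by the squared sum and ranking by the true distance agree,
# because sqrt is monotonically increasing on non-negative numbers.
by_squared = sorted(candidates, key=squared_distance)
by_root = sorted(candidates, key=distance)
```

Both orderings come out the same, so the cheaper integer-only comparison is enough for picking a best match.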

The pseudocode is something like this:

    bestMatch = null
    bestMatchScore = -1
    for each tuple // iterate vertically, to select the best matching `when` entry
        tupleScore = 0 // reset once per tuple, before scanning its keys
        for each key in the keys of the current tuple // iterate horizontally
            if key == '*'
                itemScore = 0
            else
                itemScore = function.invoke(value_to_select_on, key) // -1 or 0-100
            if itemScore < 0
                tupleScore = -1
                break from the inner cycle (on items)
            tupleScore += itemScore * itemScore // item score squared
        if tupleScore > bestMatchScore
            bestMatchScore = tupleScore
            bestMatch = tuple
    if bestMatchScore == -1 // we didn't find any match, not even `* * *` (which would be zero)
        error, ultimate fallback option is missing
    return bestMatch

We can even optimize a bit.
If the `tupleScore` is the maximum possible value (100 * 100 * number of items), we can return immediately, as we have found the best possible match (all items have a score of 100).
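The pseudocode and the early-exit optimization could look roughly like this in Python. This is only a sketch under assumptions: `best_match`, `toy_score`, and the English-like one/other rule are hypothetical stand-ins, not part of the proposal or of any MessageFormat implementation.

```python
from typing import Callable, Optional, Sequence, Tuple

NO_MATCH = -1          # negative item score: the whole tuple does not match
MAX_ITEM_SCORE = 100   # integer scores in the range 0-100

def best_match(
    variants: Sequence[Tuple[str, ...]],
    operands: Sequence[object],
    score_key: Callable[[object, str], int],
) -> Tuple[str, ...]:
    best: Optional[Tuple[str, ...]] = None
    best_score = NO_MATCH
    for keys in variants:                      # iterate vertically
        tuple_score = 0
        for operand, key in zip(operands, keys):  # iterate horizontally
            # `*` always scores 0 and the function is not called for it.
            item_score = 0 if key == "*" else score_key(operand, key)
            if item_score < 0:
                tuple_score = NO_MATCH
                break
            tuple_score += item_score * item_score
        if tuple_score > best_score:
            best, best_score = keys, tuple_score
            # Early exit: every item scored the maximum.
            if best_score == MAX_ITEM_SCORE ** 2 * len(keys):
                break
    if best is None:
        raise ValueError("no variant matched; the `*` fallback is missing")
    return best

def toy_score(operand: object, key: str) -> int:
    """Hypothetical scorer: exact match 100, category match 50, else no match."""
    if key == str(operand):
        return MAX_ITEM_SCORE
    category = "one" if operand == 1 else "other"  # assumed English-like rule
    return 50 if key == category else NO_MATCH

variants = [("*", "*"), ("one", "*"), ("1", "other"), ("other", "other")]
chosen = best_match(variants, (1, 5), toy_score)
```

With the operands `(1, 5)` the exact key `1` outranks the category key `one`, so the order in which the variants are written does not matter.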

aphillips · 2023-03-04

> a score between 0 and 100 or similar, using integers (always easier and safer than fractional values)

I agree. This is an illustration.

> The `*` is 0, and not under the control of the function. The function is not even called, `*` matches everything.
> I think that allowing functions to "mock" with the value of `*` is confusing, open for abuse. There might be some pros, but I don't see any.

`*` is never zero, but you can't think of "matching vs. non-matching". For best match, there are cases where `*` matches better for a given selector. I illustrate this with `* == other`:

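| Variant | Operand values | Score | Selected? | Note |
|---|---|---|---|---|
| `when * * *` | 1 1 0 | 0.1 + 0.1 + 0.1 = 0.3 | N | `*` here is default |
| `when * * *` | 11 11 42.0 | 0.5 + 0.5 + 0.5 = 1.5 | Y | `*` here is like `other` |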

The key thing here is that we need to decide between best match and first match. If we choose best match, we can then describe it using one of the mechanisms such as the one you describe.
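To make the two illustrated rows concrete, here is a minimal Python sketch in which the selector, not the matcher, decides what `*` is worth. The function names and the category rule (0 and 1 count as "one", everything else as "other") are assumptions chosen only so that the numbers match the illustration; the 0.1 and 0.5 values come from the rows above.

```python
import math

def star_score(operand: float) -> float:
    """Hypothetical selector score for the `*` key.

    `*` gets the minimal matching score when it is only a default, and a
    higher score when it behaves like `other` for this operand.
    """
    category = "one" if operand in (0, 1) else "other"  # assumed rule
    return 0.5 if category == "other" else 0.1

def variant_score(operands) -> float:
    """Score of a `when * * *` variant: the sum of the per-key scores."""
    return sum(star_score(value) for value in operands)

default_case = variant_score((1, 1, 0))          # `*` is only the default
other_case = variant_score((11, 11, 42.0))       # `*` acts like `other`
```

Under these assumptions the same `when * * *` variant scores low for `1 1 0` and high for `11 11 42.0`, which is the behavior a fixed `*` score of 0 cannot express.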

aphillips requested review from stasm and eemeli February 26, 2023 16:41

eemeli requested changes Feb 28, 2023


mihnita reviewed Mar 3, 2023


mihnita mentioned this pull request Mar 3, 2023

proposal: replace first-match with best-match #351

Closed

aphillips closed this Mar 4, 2023

aphillips force-pushed the issue-351 branch from e9f050f to d18d64f Compare March 4, 2023 19:16

aphillips deleted the issue-351 branch March 4, 2023 19:26
