Skip to content

Replace the JS object given to jl.Class by primitive IR operations. (multi-dimensions edition) #5031

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

sjrd
Copy link
Member

@sjrd sjrd commented Sep 1, 2024

Alternative to #4998 where Class_newArray takes an Array[Int] instead of a single Int.

sjrd added 3 commits September 1, 2024 16:52
In the next commit, we will add `jl.Class.data` as a synthetic
property, which we will have to handle in the same way as
`ArrayClass` properties. We therefore rename the concept to be more
general and include any synthetic property on Scala classes.
`java.lang.Class` requires primitive capabilities that only the
linker can provide for most of its core methods. This commit changes
the fundamental way to provide those.

---

Previously, the contract was that the linker provided a JS object
with specified properties and methods to the constructor of
`jl.Class`. For example, it contained a `"name"` property in order
to implement `jl.Class.getName()`. We happened to reuse the
`$TypeData` instance as that object, for convenience, and to avoid
more indirections.

This design grew organically over the years, and was eventually
specified in Scala.js IR 1.0.0. However, it always had some
shortcomings that I was not happy about.

First, the `getSuperclass()` method had to be special-cased in the
Analyzer and Emitter: its very presence requires additional metadata.
When it is not reachable, we do not actually provide the supposedly
specified member `"getSuperclass"()` on the JS object. That assumes
that only `jl.Class.getSuperclass()` actually calls that method.
In turn, that assumes it never gets inlined (!), since doing so
would make it unreachable, but the feature would still be required.

Second, for optimization purposes, the emitter needs direct access
to the `$TypeData` stored inside `jl.Class` (for some of its
Transients). There again, that assumed a field with a particular
name be present in `jl.Class` in which the JS object would be
stored. This implicit requirement was already leaking in the way
`jl.Class` was written, in order to prevent scalac from renaming
the field.

---

In this commit, we completely change the approach, and therefore
the IR spec. We remove the JS object passed to `jl.Class`; its
constructor now takes no arguments. Instead, we add primitives in
the IR, in the form of new `UnaryOp`s and `BinaryOp`s. These new
primitives operate on an argument of type `jl.Class!`, and possibly
a second argument.

For example, `UnaryOp.Class_name` replaces the functionality
previously offered by the `"name"` field. `BinaryOp.Class_superClass`
replaces `"getSuperclass"()`.

It is up to the backend to ensure it can, somehow, implement those
operations given an instance of `jl.Class`. We choose to add a
magic field to `jl.Class` at the Emitter level to store the
corresponding instance of `$TypeData`, and implement the operations
in terms of that.

The approach has several benefits:

* It solves the two fundamental issues of the JS object, mentioned
  above.
* It does not require "public" (unminifiable) property names for
  the features; since there is no JS object spec, we can consistently
  minify those members.
* Several methods that were given an *intrinsic* treatment by the
  optimizer now fall, more naturally, under regular folding of
  operations.

---

The change of approach comes with a non-negligible cost for backward
compatibility. It is entirely done using deserialization hacks.

In this commit, we do not change the library/compiler yet, so that
we can exercise the deserialization hacks.
We use the same technique as for `String_length` and `String_charAt`,
for which we hijack the method body from the compiler and replace
it with the appropriate operation.

We only enable the deserialization hack when reading old IR.
@sjrd sjrd force-pushed the primitive-jl-class-operations-multi-dims branch from 13baee0 to ddd32cc Compare September 1, 2024 22:12
Comment on lines +4973 to +4979
case Class_newArray =>
/* We could do something better here if we had virtualized Arrays.
* If the lhs is a literal and the rhs is a virtualized array, we could
* replace the node by a `NewArray`. If/when we do that, we can remove
* the `ArrayNewInstance` intrinsic.
*/
default
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of this, this PR keeps an intrinsic for ArrayNewInstance. The other PR removes it.

@sjrd
Copy link
Member Author

sjrd commented Sep 2, 2024

(Currently broken when Closure is enabled.)

@sjrd
Copy link
Member Author

sjrd commented Sep 20, 2024

Superseded by #4998.

@sjrd sjrd closed this Sep 20, 2024
@sjrd sjrd deleted the primitive-jl-class-operations-multi-dims branch September 20, 2024 01:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant