Skip to content

Check restrictions on how this values can be used in constructors. #5019

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 19, 2024

Conversation

sjrd
Copy link
Member

@sjrd sjrd commented Aug 23, 2024

Essentially, this cannot be used before the super constructor call, except to assign fields (but not reading).

These restrictions match those enforced by the JVM for bytecode.

While we never intended this values to be restricted in this way, in practice we never produced IR that would violate those restrictions.

In the future, we may leverage those restrictions to validate strict field initialization. If a field needs to be initialized before it is ever read (for example, because it has a non-nullable reference type), then we can do so: we only have to check that it is definitely initialized before the super constructor call. The restrictions implemented in this commit ensure that nobody can read the field before the super constructor call starts.

@sjrd sjrd requested a review from gzm0 August 23, 2024 15:23
@sjrd sjrd force-pushed the no-this-before-super-ctor-call branch from 407962a to c13cf9a Compare August 23, 2024 19:18
@sjrd sjrd marked this pull request as draft August 23, 2024 19:19
@sjrd
Copy link
Member Author

sjrd commented Aug 23, 2024

So ... it turns out that we did (and still do) generate IR that does not follow those restrictions. There are two sources.

Super ctor call to DynamicImportThunk

Edit: this was addressed in #5020.

DynamicImportThunk is a class, but we generate a super constructor call to jl.Object. Arguably this was never intended, and can be considered broken. I fixed the codegen in a commit to check whether more tests pass. We could easily write a deserialization hack for that one.

litedTree methods as args to super ctor calls

This was caught by the partest run/t8733.scala. If scalac needs to generate a liftedTree method to extract a try..catch, we end up with an internal call to an instance method. This used to cause a VerifyError in Scala/JVM, but apparently progressed in 2.12.0. I suspect it's because the JVM backend aggressively turns private methods that do not refer to this into static methods.

To fix this for newly compiled code, we can likely to do the same "optimization". That shouldn't be too difficult.

As a deserialization hack, however, it is particularly problematic, since it is difficult to detect. We would have to spend some computing time detecting whether we need to do any transformation. Could we get away with pretending such code was never valid in the first place, and it has been a compiler bug all along (I'm fantasising here; we don't have any precedent to justify that, AFAICT).

Funnily enough, in Scala 3 that is not a problem by construction because Scala 3 doesn't do liftedTrees anymore. Instead it handles problematic try..catches in the JVM backend. Ironically the problem being solved by liftedTrees is a JVM idiosynchrasy, and it does not exist in Scala.js in the first place (nor in Scala Native).

@sjrd
Copy link
Member Author

sjrd commented Oct 7, 2024

@gzm0 So I could indeed fix the liftedTree problem by making lifted methods static when possible. This is mostly the same analysis that the JVM backend does in Delambdafy, although it is simpler. They make differences between isLiftedMethod methods and isDelambdafyTarget methods, for some reason.

It's still quite problematic because it does not retroactively fix code that was compiled with previous versions of the compiler. I don't know what to do about that. It's super rare; one almost has to write contrived code to run into the issue. But it's still IR that was considered valid before, that may be published somewhere in some library, and that we would "abandon" as being invalid now.

@gzm0
Copy link
Contributor

gzm0 commented Oct 12, 2024

But it's still IR that was considered valid before, that may be published somewhere in some library, and that we would "abandon" as being invalid now.

Do we have an understanding of whether such IR actually works? (Especially in contrived downstream usage scenarios).

Or in other words: Can we construct an end-to-end example of usage of a library that has such IR causes some form of user visible problem (linker or code crashing)?

If yes, we have a good case to declare this a bug and essentially say the more restrictive class def checking catches more bugs in libraries (even if they were caused by Scala.js itself).
If no, the reason why not hopefully gives us more insight in what specifically makes this OK (unfortunately, no guarantee though, that it gives us a path forward 🤷).

@gzm0
Copy link
Contributor

gzm0 commented Oct 12, 2024

As an alternative, we could investigate options to turn the restrictions down. For example, we could apply the restrictions only if there exists a non-nullable field in the ancestry chain.

I guess the issues with many of these is going to be:

  • They rely on infos of the ancestry chain.
  • Adding a non-nullable field to a class is going to be a binary incompatible change, making it essentially impossible to do it as an optimization in the compiler.

@sjrd
Copy link
Member Author

sjrd commented Oct 12, 2024

Do we have an understanding of whether such IR actually works? (Especially in contrived downstream usage scenarios).

Or in other words: Can we construct an end-to-end example of usage of a library that has such IR causes some form of user visible problem (linker or code crashing)?

I don't see why such IR wouldn't work. I always thought this to be valid. We never made the assumption that this was not fully usable before the super constructor. Here I am definitely proposing to make invalid something that was considered valid before (not just that we did not check it). I'm only thinking about this because I saw that it was a requirement on the JVM, and that safety of non-defaultable fields must rely on that property.

Adding a non-nullable field to a class is going to be a binary incompatible change, making it essentially impossible to do it as an optimization in the compiler.

That's probably reason enough not to design the system this way, IMO.

@gzm0
Copy link
Contributor

gzm0 commented Oct 13, 2024

That's probably reason enough not to design the system this way, IMO.

I initially was going to agree with this, but maybe there can still be value. For example, changing to non-nullable would be a binary compatible change for final classes (granted, final is not in the IR). So for example, data classes could profit from this.

Two more ideas I had:

  • Investigate different means of representing field initialization / construction (e.g. allow an rhs to field defs, but in which context that would be evaluated, who knows).
  • Introduce the restrictions only for the next version of IR and deal with it in the entire linker pipeline (e.g. we could widen non-nullable fields to nullable and introduce runtime checks at usage sites).

Both of these seem rather hard to do if feasible at all. Otherwise I think our options boil down to:

  • Don't do this (abandon PR)
  • Break backwards compat (complete and merge PR)
  • Intricate scheme with flags and opt-in to ensure backwards compat.
  • SJS v2

@sjrd
Copy link
Member Author

sjrd commented Oct 14, 2024

I ran an "audit" of all the .sjsir files found in all the published Scala.js libraries (on Maven Central). I used Scaladex to get a list of all the artifacts with their latest version. For each, I used the Coursier API to fetch their jar, then I ran the ClassDefChecker of this PR on all the .sjsir files in them.

There were 7029 artifacts, and none of them yielded a "restricted this" error.

Script and results here:
https://gist.github.com/sjrd/6bafd71bbc1fe8f99ee80271c8a42553

IMO that should be enough evidence that we can just abandon IR that has restricted this. WDYT?


Ironically, there were 3 errors about the constructor call discipline in Akka.js. But that's in IR that they manually patch. There were also two errors in synthetic magic traits of the Scala 3 library. Those are .sjsir files that we never actually load, AFAICT.

@gzm0
Copy link
Contributor

gzm0 commented Oct 14, 2024

IMO that should be enough evidence that we can just abandon IR that has restricted this. WDYT?

Agreed.

@sjrd sjrd force-pushed the no-this-before-super-ctor-call branch from 2ab881f to 576bcf1 Compare October 14, 2024 13:24
@sjrd sjrd marked this pull request as ready for review October 14, 2024 13:25
@sjrd
Copy link
Member Author

sjrd commented Oct 14, 2024

All right. I switched the order of the commits, and cleaned up comments and commit messages. Ready for an actual review now.

@ekrich
Copy link
Contributor

ekrich commented Oct 14, 2024

The audit is very nice as it pulled all the Scala.js 1.x artifacts including all the older versions even published for Scala 2.11.

Comment on lines 338 to 344
for {
referrers <- methodReferences.get(symbol)
referrer <- referrers
} {
// referrer calls `symbol`, so it transitively references `this`
rec(referrer)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for {
referrers <- methodReferences.get(symbol)
referrer <- referrers
} {
// referrer calls `symbol`, so it transitively references `this`
rec(referrer)
}
methodReferences.getOrElse(symbol, Nil).foreach(rec(_))

For readability? (IMO the nested comment is not necessary).

@@ -915,6 +932,8 @@ private final class ClassDefChecker(classDef: ClassDef,
reportError(i"Cannot find `this` in scope")
else if (tree.tpe != env.thisType)
reportError(i"`this` of type ${env.thisType} typed as ${tree.tpe}")
else if (env.isThisRestricted)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the else? Wouldn't we want to report errors exhaustively (if sensible)?

reportError(i"Cannot find `this` in scope for StoreModule()")
else if (env.isThisRestricted)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change to else? Wouldn't we want to report errors exhaustively (if sensible)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured some of these errors might lead to other errors. But I reverted to not having elses. Same below.

sjrd added 2 commits October 19, 2024 11:58
For all lifted methods and methods that contain the body of
lambdas, we analyze whether they (transitively) refer to `this`.
If not, we emit them as static.

The JVM backend does a similar analysis in the `delambdafy` phase,
which comes after our backend.

As a side effect, this solves an issue with lifted local methods
coming from arguments to a super constructor call. They would
previously result in references to `this` from the super constructor
call arguments, which we will not consider as valid IR anymore.
Since Scala lexical scoping rules ensure that they do not actually
refer to `this`, they will now always be compiled as static.
Essentially, `this` cannot be used before the super constructor
call, except to assign fields (but not reading).

These restrictions match those enforced by the JVM for bytecode.

While we never intended `this` values to be restricted in this way,
in practice we never produced IR that would violate those
restrictions.

There was one exception before the parent commit: we could have
lifted methods coming from the arguments to a super constructor
call. Since they were lifted as instance methods, they caused the
constructor to refer to `this` before the super constructor call.

This is in theory a backward breaking change in the IR! However,
we audited the entire ecosystem of libraries published on Maven
Central. None of them contained IR that violates the new
restriction.

In the future, we may leverage those restrictions to validate
strict field initialization. If a field *needs* to be initialized
before it is ever read (for example, because it has a non-nullable
reference type), then we can do so: we only have to check that it
is definitely initialized before the super constructor call. The
restrictions implemented in this commit ensure that nobody can
read the field before the super constructor call starts.
@sjrd sjrd force-pushed the no-this-before-super-ctor-call branch from 576bcf1 to 274713e Compare October 19, 2024 09:58
Copy link
Member Author

@sjrd sjrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

reportError(i"Cannot find `this` in scope for StoreModule()")
else if (env.isThisRestricted)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured some of these errors might lead to other errors. But I reverted to not having elses. Same below.

@sjrd sjrd requested a review from gzm0 October 19, 2024 09:58
@sjrd sjrd merged commit 5f98d91 into scala-js:main Oct 19, 2024
3 checks passed
@sjrd sjrd deleted the no-this-before-super-ctor-call branch October 19, 2024 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants