Skip to content

More DCE wrt case object equality test #2396

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
japgolly opened this issue May 17, 2016 · 8 comments
Open

More DCE wrt case object equality test #2396

japgolly opened this issue May 17, 2016 · 8 comments
Labels
optimization Optimization only. Does not affect semantics or correctness.

Comments

@japgolly
Copy link
Contributor

japgolly commented May 17, 2016

I was pleasantly surprised to see that entire case objects are eliminated when unused.

DCE doesn't reduce this snippet however:

sealed abstract class Blah
case object ABC extends Blah
case object XYZ extends Blah

def blahName(b: Blah): String =
  b match {
    case ABC => "abc"
    case XYZ => "xyz"
  }

def main(): Unit =
  println(blahName(ABC))

Because we know that XYZ is unused ignoring the following exceptions of use...

  1. XYZ.unapply in a case clause, given XYZ.unapply isn't overridden or overloaded.
  2. _: XYZ.type in a case clause.

...I believe DCE could eliminate:

  • The XYZ obejct itself.
  • All case clauses referencing XYZ.

Thereby reducing the above snippet to:

sealed abstract class Blah
case object ABC extends Blah
// case object XYZ extends Blah

def blahName(b: Blah): String =
  b match {
    case ABC => "abc"
    // case XYZ => "xyz"
  }

def main(): Unit =
  println(blahName(ABC))
@japgolly
Copy link
Contributor Author

Another instance of use that could be eliminated is _: XYZ.type in a case clause.

Will update head comment.

@gzm0
Copy link
Contributor

gzm0 commented May 17, 2016

What happens if you mark blahName as @inline?

@gzm0
Copy link
Contributor

gzm0 commented May 17, 2016

@sjrd I guess we could teach the optimizer to check for existence of instances when doing isInstanceOf checks, which should fold the if branch. The question is how much impact this will have on invalidation.

@gzm0 gzm0 added the optimization Optimization only. Does not affect semantics or correctness. label May 17, 2016
@sjrd
Copy link
Member

sjrd commented May 17, 2016

Well here the thing the optimizer sees is if (XYZ == b). Collapsing that to false is hard.

@japgolly
Copy link
Contributor Author

What happens if you mark blahName as @inline?

Just gave it a try with @inline; it doesn't eliminate XYZ.

Well here the thing the optimizer sees is if (XYZ == b). Collapsing that to false is hard.

I don't know if this is a naive view but I imagine this optimisation could be implemented by something like:

  • change usage analysis of case objects so that references (usages) are recorded in two distinct buckets: one for value inspection usage like case clauses, equality or reference comparison, instanceOf checking; and a bucket for everything else.
  • DCE₁: remove objects without any references in usage-bucket-2.
  • DCE₂: for all references in all removed objects' usage-bucket-1s, remove case clauses, collapse equality tests to constant false.
  • Today's DCE runs and removes dead branches which now includes those introduced above.

Would that work? Is the reason you foresee this being hard due to logic (in that above is too naive) or in implementation?

@sjrd
Copy link
Member

sjrd commented May 18, 2016

Not really. What we want to get rid of is Obj == x (Scala's equality) for pattern matching. But == is not always reference equality. For example Nil == Vector.empty.

After a round of optimization, == simplifies to === (aka eq) when that's the actual implementation. But after the optimizer we only have one round of dce left, which is not enough to implement the rest of your "plan" (steps 2 and following). In particular what you call "DCE 2" actually needs a round of optimization, not dce.

@sjrd sjrd changed the title More DCE wrt case object unapply More DCE wrt case object equality test Mar 23, 2017
@sjrd sjrd added this to the Post-v1.0.0 milestone Oct 11, 2017
@gzm0 gzm0 removed this from the Post-v1.0.0 milestone Apr 8, 2020
@gzm0
Copy link
Contributor

gzm0 commented Sep 16, 2023

Note to self: It seems like

  • Ignoring _ === LoadModule(_) trees in reachability analysis and
  • Folding the above to false if the module isn't instantiated

Will give us the optimization we'd want. But the second step comes too late (Emitter would be our earliest phase). We can optimize the expression, but we cannot eliminate code that is unused because of it, providing limited value.

(Yes, this is what sjrd has been saying all along)

@gzm0
Copy link
Contributor

gzm0 commented Jan 19, 2025

I've had another (unsuccessful) pass at this. Sharing a negative result.

IIUC, as an approximation, the IR trees we are interested in are:

x.equals;Ljava.lang.Object;Z(y)

where x is a value of

  • module class type,
  • without an override of equals

My thought was to special case this in the Analyzer (in callMethod) to
extract the fact that we are checking module class identity.

This seems to be possible; however, it does not solve the problem:
we cannot trace back what x is in the analyzer phase, so we cannot
eliminate / flag the LoadModule call (which we assigned to x) as "readonly".

It seems like the only other option (short of adding another optimizer-like phase)
would be to investigate compiler support (compile to a linker friendlier tree).
But IIUC that would not work in terms of separate compilation guarantees we need to offer (a module class needs to be able to override equals without re-compilation of usage sites).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimization Optimization only. Does not affect semantics or correctness.
Projects
None yet
Development

No branches or pull requests

3 participants