Skip to content

Wasm: Implement ju.ArrayList without js.Array for Wasm backend #5162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 8, 2025

Conversation

tanishiking
Copy link
Contributor

@tanishiking tanishiking commented May 5, 2025

The original implementation of ju.ArrayList uses js.Array as its internal data structure. When compiling to Wasm, operations on ju.ArrayList require JS interop calls to access the underlying js.Array, which causes a slow performance.
This commit introduces an implementation of ju.ArrayList for the Wasm backend. This version uses Scala's Array instead of js.Array for better performance.

The second commit fixes JS implementation of removeRange to throw an IndexOutOfBoundsException if fromIndex or toIndex is out of range.

Upstreamed from scala-wasm#39

Comment on lines 196 to 198
val newArr = new Array[Any](newCapacity)
System.arraycopy(innerWasm, 0, newArr, 0, size())
innerWasm = newArr
Copy link
Contributor Author

@tanishiking tanishiking May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't use Arrays.copyOf since innerWasm needs to be casted to Array[AnyRef] for filling it with null, which fails if E was AnyVal.

Copy link
Member

@sjrd sjrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This looks good overall.

I mostly have coding style comments, as well as comments related to using an Array[AnyRef].

Comment on lines 122 to 128
@Test def isEmpty(): Unit = {
val coll = factory.empty[Int]
assertTrue(coll.size() == 0)
assertTrue(coll.isEmpty())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add another case with a non-empty collection (created with fromElements) to test the false case.

Comment on lines 173 to 175
if (fromIndex < 0 || toIndex > size() || toIndex < fromIndex) {
throw new IndexOutOfBoundsException()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider switching the commits. In the first commit, add the new tests, and fix this missing IndexOutOfBoundsException. In the second commit, introduce the Wasm rewriting. This will remove the need for the churn with the platform-dependent assumption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Switched the commit order, and added bounds check for addAll to the first commit as well :)

Comment on lines 38 to 40
private var innerWasm =
if (!isWebAssembly) null
else innerInit.asInstanceOf[Array[Any]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest creating this array as an Array[AnyRef]. They erase to the same thing, but Array[AnyRef] will allow to use the methods of Arrays. It's common practice even inside the Scala standard library to use an Array[AnyRef] when we need a performant "array of anything".

Note that it is always valid to cast to AnyRef, even for values that "are AnyVals". AnyVal is a Scala compiler fiction. It erases to AnyRef as well. Casting to AnyRef is a no-op in the IR/bytecode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it is always valid to cast to AnyRef, even for values that "are AnyVals". AnyVal is a Scala compiler fiction. It erases to AnyRef as well. Casting to AnyRef is a no-op in the IR/bytecode.

Interesting, I'll play around with it, but for now we cast it to Array[AnyRef] and can use the Arrays functions. Thanks!

Comment on lines 44 to 46
if (isWebAssembly)
new Array[Any](Math.max(initialCapacity, 0))
else new js.Array[E],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (isWebAssembly)
new Array[Any](Math.max(initialCapacity, 0))
else new js.Array[E],
if (isWebAssembly) new Array[Any](Math.max(initialCapacity, 0))
else new js.Array[E],

To avoid the Math.max, though, we can move the initialCapacity < 0 check before the super call, like this:

  def this(initialCapacity: Int) = {
    this(
      {
        if (initialCapacity < 0)
          throw new IllegalArgumentException
        if (isWebAssembly) new Array[Any](initialCapacity)
        else new js.Array[E],
      },
      0
  }

addAll(c)
}

def trimToSize(): Unit = {
// We ignore this as js.Array doesn't support explicit pre-allocation
if (isWebAssembly) resizeTo(size())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coding style: Always use a separate line for side-effecting branches of control structures (see here)

}

override def clear(): Unit =
inner.length = 0
if (isWebAssembly) {
removeRange(0, size())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This never needs to check bounds nor move elements. Consider using a direct Arrays.fill:

Suggested change
removeRange(0, size())
Arrays.fill(innerWasm, null) // free references for GC


override def addAll(index: Int, c: Collection[_ <: E]): Boolean = {
c match {
case other: ArrayList[_] =>
inner.splice(index, 0, other.inner.toSeq: _*)
if (isWebAssembly) {
checkIndexOnBounds(index)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like another bounds check that was missing from the JS implementation, and that should be applied to both cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, added the bound check in the first commit (and check in both JS and Wasm in second) 👍

Comment on lines 179 to 182
for (i <- (size() - toIndex + fromIndex) until size()) {
innerWasm(i) = null
}
_size -= (toIndex - fromIndex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider computing val newSize = _size - (toIndex - fromIndex) first, to use it also as the start index for the null-filling. Currently this code performs that computation twice, but in slightly different ways, which is confusing.

Also, with an Array[AnyRef] the loop is an Arrays.fill.


// Wasm only
private def expand(): Unit = {
resizeTo(Math.max(innerWasm.length * 2, 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be better to use 16 instead of 1 as lower bound here.

If we need to resize once, we probably need to resize more later, so we should directly go with the default 16 that we use when no initial capacity is provided.

Comment on lines 54 to 55
assertTrue(Array[Int](1, 2, 100) sameElements al1.toArray())
assertTrue(Array[Int](1, 2, 200) sameElements al2.toArray())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coding style: We don't use the infix syntax for alphanumeric method calls anymore (except for eq, ne, to and until):

Suggested change
assertTrue(Array[Int](1, 2, 100) sameElements al1.toArray())
assertTrue(Array[Int](1, 2, 200) sameElements al2.toArray())
assertTrue(Array[Int](1, 2, 100).sameElements(al1.toArray()))
assertTrue(Array[Int](1, 2, 200).sameElements(al2.toArray()))

Same several times below.

@tanishiking tanishiking force-pushed the arraylist-wasm branch 2 times, most recently from a7d756d to 443c741 Compare May 5, 2025 15:09
@tanishiking
Copy link
Contributor Author

Thanks! I think addressed all the feedbacks.

@tanishiking tanishiking requested a review from sjrd May 5, 2025 15:17
Copy link
Contributor

@gzm0 gzm0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional drive-by suggestions

if (isWebAssembly) new Array[AnyRef](16)
else new js.Array[E],
0
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just this(16)? Let the other ctor handle wasm v.s. JS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, nice catch, thanks!

if (isWebAssembly) new Array[AnyRef](c.size())
else new js.Array[E],
0
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also here this(c.size())?

// We ignore this as js.Array doesn't support explicit pre-allocation
if (isWebAssembly) {
if (innerWasm.length < minCapacity)
resizeTo(minCapacity)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we round up to the next power of 2 here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tanishiking tanishiking force-pushed the arraylist-wasm branch 3 times, most recently from 2010a9c to 7a0e0c6 Compare May 7, 2025 07:46
Copy link
Member

@sjrd sjrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few last details, but otherwise looks good to me.

removeRange should throws an `IndexOutOfBoundsException` if
fromIndex or toIndex is out of range. addAll should also throw
when the index is not within bounds.
The original implementation of ju.ArrayList uses js.Array as its
internal data structure. When compiling to Wasm, operations on
ju.ArrayList require JS interop calls to access the underlying
js.Array, which causes a slow performance.

This commit introduces an implementation of ju.ArrayList for the
Wasm backend. This version uses Scala's Array instead of js.Array
for better performance.
@sjrd sjrd force-pushed the arraylist-wasm branch from 99cd41e to a088ec4 Compare May 7, 2025 15:46
Copy link
Member

@sjrd sjrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the liberty to move the addAll test to the right commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants