Posted to issues@spark.apache.org by "Yang Jie (Jira)" <ji...@apache.org> on 2022/07/13 14:20:00 UTC
[jira] [Created] (SPARK-39766) For the `arrayOfAnyAsSeq` scenario in `GenericArrayDataBenchmark`, using Scala 2.13 is slower than Scala 2.12
Yang Jie created SPARK-39766:
--------------------------------
Summary: For the `arrayOfAnyAsSeq` scenario in `GenericArrayDataBenchmark`, using Scala 2.13 is slower than Scala 2.12
Key: SPARK-39766
URL: https://issues.apache.org/jira/browse/SPARK-39766
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.0
Reporter: Yang Jie
Running `GenericArrayDataBenchmark` with Scala 2.12 and Scala 2.13 shows that the `arrayOfAnyAsSeq` scenario is roughly ten times slower with Scala 2.13 (best time 241 ms versus 25 ms):
Scala 2.12
{code:java}
OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
constructor:                             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
arrayOfAnyAsSeq                                     25             29           2        395.1           2.5       0.1X
{code}
Scala 2.13
{code:java}
OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1031-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
constructor:                             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
arrayOfAnyAsSeq                                    241            243           1         41.4          24.1       0.0X
{code}
The test code is as follows:
{code:java}
benchmark.addCase("arrayOfAnyAsSeq") { _ =>
  val arr: Seq[Any] = new Array[Any](arraySize)
  var n = 0
  while (n < valuesPerIteration) {
    new GenericArrayData(arr)
    n += 1
  }
}
{code}
The constructor of `GenericArrayData` is as follows:
{code:java}
def this(seq: scala.collection.Seq[Any]) = this(seq.toArray) {code}
The performance difference has the following cause:
When using Scala 2.12:
The runtime class of `arr` is `s.c.mutable.WrappedArray$ofRef`, whose `toArray` returns `array.asInstanceOf[Array[U]]`, so there is no memory copy.
When using Scala 2.13:
The runtime class of `arr` is `s.c.immutable.ArraySeq$ofRef`, whose `toArray` falls through to `IterableOnceOps#toArray`, and that implementation copies the elements into a new array.
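The copy behaviour described above can be checked outside the benchmark with a small standalone program. This is only a sketch: `ToArrayCopyCheck` and `returnsSameArray` are hypothetical names that are not part of Spark, and the check assumes that a `toArray` which exposes the backing array returns the same instance on every call, while a copying `toArray` returns a fresh instance each time.
{code:java}
// Hypothetical helper, not part of Spark.
object ToArrayCopyCheck {

  // If `toArray` hands back the backing array without copying, two calls
  // yield the same reference; if it allocates and copies, they differ.
  def returnsSameArray(seq: scala.collection.Seq[Any]): Boolean = {
    val first = seq.toArray[Any]
    val second = seq.toArray[Any]
    first eq second
  }

  def main(args: Array[String]): Unit = {
    // Same implicit Array-to-Seq conversion as in the benchmark case.
    // Scala 2.13 reports a deprecation warning for this conversion.
    val arr: Seq[Any] = new Array[Any](4)
    println(s"runtime class: ${arr.getClass.getName}")
    println(s"toArray returns same array: ${returnsSameArray(arr)}")
  }
}
{code}
If the analysis above holds, the program should report a `WrappedArray$ofRef` class and `true` when compiled with Scala 2.12, and an `ArraySeq$ofRef` class and `false` when compiled with Scala 2.13.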