Posted to commits@spark.apache.org by sr...@apache.org on 2022/06/22 23:39:19 UTC
[spark] branch master updated: [SPARK-39545][SQL] Override `concat` method for `ExpressionSet` in Scala 2.13 to improve the performance
This is an automated email from the ASF dual-hosted git repository.
srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new a4a83a31ed3 [SPARK-39545][SQL] Override `concat` method for `ExpressionSet` in Scala 2.13 to improve the performance
a4a83a31ed3 is described below
commit a4a83a31ed355c85097bce284eac05dbfd06d039
Author: yangjie01 <ya...@baidu.com>
AuthorDate: Wed Jun 22 18:39:07 2022 -0500
[SPARK-39545][SQL] Override `concat` method for `ExpressionSet` in Scala 2.13 to improve the performance
### What changes were proposed in this pull request?
With Scala 2.13, the `ExpressionSet ++` method on the master branch is slightly slower than on branch-3.3, so this PR overrides the `concat` method for `ExpressionSet` in Scala 2.13.
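As a rough illustration of the clone-and-add pattern, here is a minimal standalone sketch. `NormalizedSet` and its case-insensitive key are hypothetical stand-ins and are not Spark's `ExpressionSet`. In the Scala 2.13 collections, `++` on a set forwards to `concat`; the sketch mimics that with an explicit alias.

```scala
import scala.collection.mutable

// Hypothetical stand-in: deduplicates strings by a normalized (lower-case) key
// while remembering the original values in insertion order.
final class NormalizedSet private (
    private val keys: mutable.Set[String],
    private val originals: mutable.ArrayBuffer[String]) {

  // Adds `s` only if its normalized key has not been seen before.
  def add(s: String): Unit = {
    if (keys.add(s.toLowerCase)) originals += s
  }

  // Copies both backing collections so the copy can be mutated independently.
  override def clone(): NormalizedSet =
    new NormalizedSet(keys.clone(), originals.clone())

  // Clone once, then add each incoming element to the copy in place.
  def concat(that: IterableOnce[String]): NormalizedSet = {
    val newSet = clone()
    that.iterator.foreach(newSet.add)
    newSet
  }

  // Alias, mirroring how `++` forwards to `concat` in Scala 2.13.
  def ++(that: IterableOnce[String]): NormalizedSet = concat(that)

  def toList: List[String] = originals.toList
}

object NormalizedSet {
  def apply(items: String*): NormalizedSet = {
    val s = new NormalizedSet(mutable.HashSet.empty[String], mutable.ArrayBuffer.empty[String])
    items.foreach(s.add)
    s
  }
}

val a = NormalizedSet("A", "b")
val c = a ++ List("a", "C") // "a" collides with "A" and is skipped
```

The receiver is left unchanged because every mutation happens on the clone.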
### Why are the changes needed?
To improve the performance of `ExpressionSet ++` with Scala 2.13.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- Pass GitHub Actions
- Manual test 1:
Run the following microbenchmark with Scala 2.13:
```scala
val valuesPerIteration = 100000
val benchmark = new Benchmark("Test ExpressionSet ++ ", valuesPerIteration, output = output)
val aUpper = AttributeReference("A", IntegerType)(exprId = ExprId(1))
val initialSet = ExpressionSet(aUpper + 1 :: Rand(0) :: Nil)
val setToAddWithSameDeterministicExpression = ExpressionSet(aUpper + 1 :: Rand(0) :: Nil)

benchmark.addCase("Test ++") { _: Int =>
  for (_ <- 0L until valuesPerIteration) {
    initialSet ++ setToAddWithSameDeterministicExpression
  }
}
benchmark.run()
```
**branch-3.3 result:**
```
OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6XXXC CPU 2.60GHz
Test ExpressionSet ++ :                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test ++                                              14             16           4          7.2         139.1       1.0X
```
**master result before this PR:**
```
OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6XXXC CPU 2.60GHz
Test ExpressionSet ++ :                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test ++                                              16             19           5          6.1         163.9       1.0X
```
**master result after this PR:**
```
OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6XXXC CPU 2.60GHz
Test ExpressionSet ++ :                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test ++                                              12             13           3          8.6         115.7       1.0X
```
- Manual test 2:
```
dev/change-scala-version.sh 2.13
mvn clean install -pl sql/core -am -DskipTests -Pscala-2.13
mvn test -pl sql/catalyst -Pscala-2.13
mvn test -pl sql/core -Pscala-2.13
```
```
Run completed in 10 minutes, 40 seconds.
Total number of tests run: 6584
Suites: completed 285, aborted 0
Tests: succeeded 6584, failed 0, canceled 0, ignored 5, pending 0
All tests passed.
```
```
Run completed in 1 hour, 27 minutes, 16 seconds.
Total number of tests run: 11745
Suites: completed 520, aborted 0
Tests: succeeded 11745, failed 0, canceled 7, ignored 57, pending 0
All tests passed.
```
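For a rough sense of scale, the per-row times reported by the microbenchmark in manual test 1 can be compared directly. This is only arithmetic on the single runs shown there, whose standard deviations are a few milliseconds, so the percentages are indicative at best. The `reduction` helper is introduced here purely for illustration.

```scala
// Per-row times (ns) copied from the three benchmark outputs in manual test 1.
val branch33 = 139.1
val masterBefore = 163.9
val masterAfter = 115.7

// Fractional reduction in per-row time when going from `from` to `to`.
def reduction(from: Double, to: Double): Double = (from - to) / from

// Roughly 29.4% less time per row than master before this change.
println(f"vs. master before: ${reduction(masterBefore, masterAfter) * 100}%.1f%%")
// Roughly 16.8% less time per row than branch-3.3.
println(f"vs. branch-3.3:    ${reduction(branch33, masterAfter) * 100}%.1f%%")
```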
Closes #36942 from LuciferYang/ExpressionSet.
Authored-by: yangjie01 <ya...@baidu.com>
Signed-off-by: Sean Owen <sr...@gmail.com>
---
.../org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala b/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala
index e38deedec6d..a615223ef79 100644
--- a/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala
+++ b/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala
@@ -132,6 +132,12 @@ class ExpressionSet protected(
     newSet
   }
 
+  override def concat(that: IterableOnce[Expression]): ExpressionSet = {
+    val newSet = clone()
+    that.iterator.foreach(newSet.add)
+    newSet
+  }
+
   override def --(that: IterableOnce[Expression]): ExpressionSet = {
     val newSet = clone()
     that.iterator.foreach(newSet.remove)