Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/12/07 06:37:32 UTC
[GitHub] [spark] wankunde commented on a diff in pull request #38672: [SPARK-41159][SQL] Optimize like any and like all expressions
wankunde commented on code in PR #38672:
URL: https://github.com/apache/spark/pull/38672#discussion_r1041811602
##########
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/LikeAnyBenchmark.scala:
##########
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.benchmark
+
+import java.io.File
+
+import scala.util.Random
+
+import org.apache.spark.benchmark.Benchmark
+import org.apache.spark.sql.DataFrame
+import org.apache.spark.sql.internal.SQLConf
+
+/**
+ * Benchmark to measure like any expressions performance.
+ *
+ * To run this benchmark:
+ * {{{
+ * 1. without sbt: bin/spark-submit --class <this class>
+ * --jars <spark core test jar>,<spark catalyst test jar> <spark sql test jar>
+ * 2. build/sbt "sql/Test/runMain <this class>"
+ * 3. generate result: SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/Test/runMain <this class>"
+ * Results will be written to "benchmarks/LikeAnyBenchmark-results.txt".
+ * }}}
+ */
+object LikeAnyBenchmark extends SqlBasedBenchmark {
Review Comment:
Before this PR:
```
[info] Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
[info] Multi like query:                         Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Query with multi like                              1393          1469        119        0.0    1392586.7      1.0X
[info] Query with LikeAny simplification                  1244          1309         97        0.0    1244382.5      1.1X
[info] Query without LikeAny simplification                400           407          8        0.0     399924.3      3.5X

[info] Multi like query:                         Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Query with multi like                              1476          1576        149        0.0    1475710.1      1.0X
[info] Query with LikeAny simplification                  1387          1429         37        0.0    1386669.1      1.1X
[info] Query without LikeAny simplification                430           470         35        0.0     430435.8      3.4X
```
After this PR:
```
[info] Multi like query:                         Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Query with multi like                              1441          1516         78        0.0    1441335.8      1.0X
[info] Query with LikeAny simplification                  1401          1431         44        0.0    1400743.9      1.0X
[info] Query without LikeAny simplification                357           369         10        0.0     357419.8      4.0X

[info] Multi like query:                         Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Query with multi like                              1524          1628        117        0.0    1524119.6      1.0X
[info] Query with LikeAny simplification                  1405          1418         18        0.0    1405258.7      1.1X
[info] Query without LikeAny simplification                362           372         12        0.0     361654.4      4.2X
```
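As a rough cross-check of the Relative column, the sketch below recomputes it from the Best Time(ms) values of the first table in each run. It assumes Relative is the baseline's best time divided by each case's best time, which is consistent with the printed values. The object and method names are hypothetical and not part of this PR.

```scala
import java.util.Locale

// Hypothetical helper, not part of the PR: recomputes the "Relative" column
// from the rounded Best Time(ms) values printed in the tables above.
object RelativeSpeedup {
  // Assumption: Relative = baseline best time / case best time, one decimal place.
  def relative(baselineMs: Double, caseMs: Double): String =
    "%.1fX".formatLocal(Locale.ROOT, baselineMs / caseMs)

  // Best Time(ms) from the first table of each run; the first entry is the baseline.
  val before: Seq[(String, Double)] = Seq(
    "Query with multi like" -> 1393.0,
    "Query with LikeAny simplification" -> 1244.0,
    "Query without LikeAny simplification" -> 400.0
  )
  val after: Seq[(String, Double)] = Seq(
    "Query with multi like" -> 1441.0,
    "Query with LikeAny simplification" -> 1401.0,
    "Query without LikeAny simplification" -> 357.0
  )

  // Pairs each case name with its speedup relative to the first (baseline) case.
  def relatives(rows: Seq[(String, Double)]): Seq[(String, String)] = {
    val baselineMs = rows.head._2
    rows.map { case (name, bestMs) => name -> relative(baselineMs, bestMs) }
  }

  def main(args: Array[String]): Unit = {
    println("Before this PR:")
    relatives(before).foreach { case (name, rel) => println(s"  $name: $rel") }
    println("After this PR:")
    relatives(after).foreach { case (name, rel) => println(s"  $name: $rel") }
  }
}
```

Because the inputs are the rounded Best Time(ms) figures rather than the raw timings, a recomputed ratio may occasionally differ from the printed one in the last digit.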
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org