Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/04/27 06:50:55 UTC
[GitHub] [spark] cloud-fan commented on a change in pull request #32357: [SPARK-35235][SQL] Add row-based hash map into aggregate benchmark

cloud-fan commented on a change in pull request #32357:
URL: https://github.com/apache/spark/pull/32357#discussion_r620914345



##########
File path: sql/core/benchmarks/AggregateBenchmark-jdk11-results.txt
##########
@@ -2,142 +2,147 @@
 aggregate without grouping
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 agg w/o group:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-agg w/o group wholestage off                      63666          64021         502         32.9          30.4       1.0X
-agg w/o group wholestage on                         882            912          37       2376.9           0.4      72.2X
+agg w/o group wholestage off                      82274          82877         853         25.5          39.2       1.0X
+agg w/o group wholestage on                        1322           1358          37       1586.7           0.6      62.2X
 
 
 ================================================================================================
 stat functions
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 stddev:                                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-stddev wholestage off                              7370           7688         450         14.2          70.3       1.0X
-stddev wholestage on                                931            997          50        112.6           8.9       7.9X
+stddev wholestage off                              8975           9129         219         11.7          85.6       1.0X
+stddev wholestage on                               1424           1444          34         73.6          13.6       6.3X
 
-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 kurtosis:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-kurtosis wholestage off                           30901          31209         436          3.4         294.7       1.0X
-kurtosis wholestage on                              950            996          33        110.4           9.1      32.5X
+kurtosis wholestage off                           42273          42424         213          2.5         403.1       1.0X
+kurtosis wholestage on                             1492           1528          27         70.3          14.2      28.3X
 
 
 ================================================================================================
 aggregate with linear keys
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 Aggregate w keys:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                        8845           8874          41          9.5         105.4       1.0X
-codegen = T hashmap = F                            5804           5854          47         14.5          69.2       1.5X
-codegen = T hashmap = T                             954           1001          35         87.9          11.4       9.3X
+codegen = F                                       10873          10998         176          7.7         129.6       1.0X
+codegen = T, hashmap = F                           5906           6005          95         14.2          70.4       1.8X
+codegen = T, row-based hashmap = T                 2325           2410          94         36.1          27.7       4.7X
+codegen = T, vectorized hashmap = T                1185           1259          78         70.8          14.1       9.2X

Review comment:
       Interesting. We should probably pick the vectorized hashmap under certain conditions.
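
For context, the gap this comment points at can be read off the "aggregate with linear keys" rows in the diff above. The following is only a rough sketch of that arithmetic, using the Best Time(ms) values copied from the new (+) rows; the names `best_time_ms` and `relative_speedup` are made up for illustration and are not part of Spark's benchmark code.

```python
# Best Time(ms) values copied from the new (+) rows of the
# "aggregate with linear keys" table in the diff above.
best_time_ms = {
    "codegen = F": 10873,
    "codegen = T, hashmap = F": 5906,
    "codegen = T, row-based hashmap = T": 2325,
    "codegen = T, vectorized hashmap = T": 1185,
}


def relative_speedup(times, baseline):
    """Return baseline_time / case_time per case, rounded to one decimal
    in the same way as the table's Relative column."""
    base = times[baseline]
    return {name: round(base / t, 1) for name, t in times.items()}


# Speedup of every case relative to the non-codegen baseline.
relative = relative_speedup(best_time_ms, "codegen = F")

# Direct comparison of the two hash map variants in this single run.
row_vs_vectorized = (
    best_time_ms["codegen = T, row-based hashmap = T"]
    / best_time_ms["codegen = T, vectorized hashmap = T"]
)

for name, ratio in relative.items():
    print(f"{name:40s} {ratio:.1f}X")
print(f"row-based / vectorized best time: {row_vs_vectorized:.2f}")
```

In this one run the vectorized hash map's best time is roughly half that of the row-based one. The numbers come from a single machine and a single key layout, so they say nothing on their own about which conditions should trigger the vectorized path.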



