Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2018/09/15 09:16:47 UTC
[GitHub] spark pull request #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchm...
GitHub user dongjoon-hyun opened a pull request:
https://github.com/apache/spark/pull/22427
[SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to use the same memory assumption
## What changes were proposed in this pull request?
This PR aims to fix three things in `FilterPushdownBenchmark`.
**1. Use the same memory assumption.**
The following configurations are used in ORC and Parquet.
- Memory buffer for writing
  - parquet.block.size (default: 128MB)
  - orc.stripe.size (default: 64MB)
- Compression chunk size
  - parquet.page.size (default: 1MB)
  - orc.compress.size (default: 256KB)
SPARK-24692 used 1MB, the default value of `parquet.page.size`, for `parquet.block.size` and `orc.stripe.size`, but it did not set `orc.compress.size` to match. As a result, the current benchmark compares ORC with a 256KB compression chunk against Parquet with a 1MB one. For a fair comparison, the two need to be consistent.
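The mismatch can be illustrated with a minimal sketch. The four keys and their defaults are the ones listed above. The helper names (`writer_options`, `effective`, `compression_chunk_sizes`) and the use of plain dictionaries are assumptions made for illustration, not the benchmark's actual code, and the sketch assumes the fix is to set `orc.compress.size` to the same 1MB value.

```python
# Hypothetical sketch: option maps for writing the benchmark tables.
MB = 1024 * 1024
KB = 1024

# Defaults as listed in the PR description.
DEFAULTS = {
    "parquet.block.size": 128 * MB,
    "parquet.page.size": 1 * MB,
    "orc.stripe.size": 64 * MB,
    "orc.compress.size": 256 * KB,
}


def writer_options(block_size, match_orc_compress_size):
    """Return (parquet_options, orc_options) holding explicit overrides only."""
    parquet = {"parquet.block.size": block_size}
    orc = {"orc.stripe.size": block_size}
    if match_orc_compress_size:
        # Assumed fix: give ORC the same compression chunk size as Parquet.
        orc["orc.compress.size"] = block_size
    return parquet, orc


def effective(options):
    """Merge explicit overrides over the defaults."""
    merged = dict(DEFAULTS)
    merged.update(options)
    return merged


def compression_chunk_sizes(block_size, match_orc_compress_size):
    """Return the (Parquet, ORC) compression chunk sizes actually in effect."""
    parquet, orc = writer_options(block_size, match_orc_compress_size)
    return (effective(parquet)["parquet.page.size"],
            effective(orc)["orc.compress.size"])


# Without the override, ORC silently keeps its 256KB default.
before = compression_chunk_sizes(1 * MB, match_orc_compress_size=False)
# With the override, both formats compress in 1MB chunks.
after = compression_chunk_sizes(1 * MB, match_orc_compress_size=True)
print(before, after)
```

In PySpark, each entry of such a map would typically be passed through `DataFrameWriter.option(key, value)` before writing the ORC or Parquet files.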
**2. Dictionary encoding should not be enforced for all cases.**
SPARK-24206 enforced dictionary encoding for all test cases. This PR restores the default behavior in general and enforces dictionary encoding only for `prepareStringDictTable`.
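A rough sketch of the "opt in per table" idea follows. `orc.dictionary.key.threshold` is a real ORC writer setting, but whether the benchmark uses exactly this key and value is an assumption here. The function name `dictionary_options` and the `otherTables` entry are hypothetical placeholders.

```python
# Hypothetical sketch: force dictionary encoding only when a table asks for it.
def dictionary_options(use_dictionary):
    """Return extra ORC writer options; an empty dict keeps library defaults."""
    if not use_dictionary:
        return {}
    # In ORC, a threshold of 1.0 keeps dictionary encoding on regardless of
    # the ratio of distinct keys to non-null rows.
    return {"orc.dictionary.key.threshold": "1.0"}


# Only the string-dictionary table opts in; "otherTables" is a placeholder
# standing in for every other table the benchmark prepares.
TABLES = {
    "otherTables": False,
    "prepareStringDictTable": True,
}

for name, use_dictionary in TABLES.items():
    print(name, dictionary_options(use_dictionary))
```

Returning an empty map for the default case, rather than writing out the default value, leaves the choice to the file format library, which is what "default behavior" means in this context.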
**3. Generate test result on AWS r3.xlarge**
SPARK-24206 generated the result on AWS so that it can be reproduced and compared easily. For the same reason, this PR regenerates the result on the same machine. Specifically, AWS r3.xlarge with Instance Store is used.
## How was this patch tested?
Manual. Enable the test cases and run `FilterPushdownBenchmark` on `AWS r3.xlarge`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dongjoon-hyun/spark SPARK-25438
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22427.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22427
----
commit fb14cd5829f431593db71b1b5ec06dd0957791ad
Author: Dongjoon Hyun <do...@...>
Date: 2018-09-15T04:21:54Z
[SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to use the same memory assumption
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22427
**[Test build #96092 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96092/testReport)** for PR 22427 at commit [`fb14cd5`](https://github.com/apache/spark/commit/fb14cd5829f431593db71b1b5ec06dd0957791ad).
---
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/22427
Thanks for the explanation! The change looks good to me.
---
[GitHub] spark pull request #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchm...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22427#discussion_r217923482
--- Diff: sql/core/benchmarks/FilterPushdownBenchmark-results.txt ---
@@ -2,737 +2,669 @@
Pushdown for many distinct value case
================================================================================================
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
-
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 string row (value IS NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8970 / 9122 1.8 570.3 1.0X
-Parquet Vectorized (Pushdown) 471 / 491 33.4 30.0 19.0X
-Native ORC Vectorized 7661 / 7853 2.1 487.0 1.2X
-Native ORC Vectorized (Pushdown) 1134 / 1161 13.9 72.1 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11405 / 11485 1.4 725.1 1.0X
+Parquet Vectorized (Pushdown) 675 / 690 23.3 42.9 16.9X
+Native ORC Vectorized 7127 / 7170 2.2 453.1 1.6X
+Native ORC Vectorized (Pushdown) 519 / 541 30.3 33.0 22.0X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 string row ('7864320' < value < '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9246 / 9297 1.7 587.8 1.0X
-Parquet Vectorized (Pushdown) 480 / 488 32.8 30.5 19.3X
-Native ORC Vectorized 7838 / 7850 2.0 498.3 1.2X
-Native ORC Vectorized (Pushdown) 1054 / 1118 14.9 67.0 8.8X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11457 / 11473 1.4 728.4 1.0X
+Parquet Vectorized (Pushdown) 656 / 686 24.0 41.7 17.5X
+Native ORC Vectorized 7328 / 7342 2.1 465.9 1.6X
+Native ORC Vectorized (Pushdown) 539 / 565 29.2 34.2 21.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 string row (value = '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8989 / 9100 1.7 571.5 1.0X
-Parquet Vectorized (Pushdown) 448 / 467 35.1 28.5 20.1X
-Native ORC Vectorized 7680 / 7768 2.0 488.3 1.2X
-Native ORC Vectorized (Pushdown) 1067 / 1118 14.7 67.8 8.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11878 / 11888 1.3 755.2 1.0X
+Parquet Vectorized (Pushdown) 630 / 654 25.0 40.1 18.9X
+Native ORC Vectorized 7342 / 7362 2.1 466.8 1.6X
+Native ORC Vectorized (Pushdown) 519 / 537 30.3 33.0 22.9X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 string row (value <=> '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9115 / 9266 1.7 579.5 1.0X
-Parquet Vectorized (Pushdown) 466 / 492 33.7 29.7 19.5X
-Native ORC Vectorized 7800 / 7914 2.0 495.9 1.2X
-Native ORC Vectorized (Pushdown) 1075 / 1102 14.6 68.4 8.5X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11423 / 11440 1.4 726.2 1.0X
+Parquet Vectorized (Pushdown) 625 / 643 25.2 39.7 18.3X
+Native ORC Vectorized 7315 / 7335 2.2 465.1 1.6X
+Native ORC Vectorized (Pushdown) 507 / 520 31.0 32.2 22.5X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 string row ('7864320' <= value <= '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9099 / 9237 1.7 578.5 1.0X
-Parquet Vectorized (Pushdown) 462 / 475 34.1 29.3 19.7X
-Native ORC Vectorized 7847 / 7925 2.0 498.9 1.2X
-Native ORC Vectorized (Pushdown) 1078 / 1114 14.6 68.5 8.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11440 / 11478 1.4 727.3 1.0X
+Parquet Vectorized (Pushdown) 634 / 652 24.8 40.3 18.0X
+Native ORC Vectorized 7311 / 7324 2.2 464.8 1.6X
+Native ORC Vectorized (Pushdown) 517 / 548 30.4 32.8 22.1X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all string rows (value IS NOT NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 19303 / 19547 0.8 1227.3 1.0X
-Parquet Vectorized (Pushdown) 19924 / 20089 0.8 1266.7 1.0X
-Native ORC Vectorized 18725 / 19079 0.8 1190.5 1.0X
-Native ORC Vectorized (Pushdown) 19310 / 19492 0.8 1227.7 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 20750 / 20872 0.8 1319.3 1.0X
+Parquet Vectorized (Pushdown) 21002 / 21032 0.7 1335.3 1.0X
+Native ORC Vectorized 16714 / 16742 0.9 1062.6 1.2X
+Native ORC Vectorized (Pushdown) 16926 / 16965 0.9 1076.1 1.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 int row (value IS NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8117 / 8323 1.9 516.1 1.0X
-Parquet Vectorized (Pushdown) 484 / 494 32.5 30.8 16.8X
-Native ORC Vectorized 6811 / 7036 2.3 433.0 1.2X
-Native ORC Vectorized (Pushdown) 1061 / 1082 14.8 67.5 7.6X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10510 / 10532 1.5 668.2 1.0X
+Parquet Vectorized (Pushdown) 642 / 665 24.5 40.8 16.4X
+Native ORC Vectorized 6609 / 6618 2.4 420.2 1.6X
+Native ORC Vectorized (Pushdown) 502 / 512 31.4 31.9 21.0X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 int row (7864320 < value < 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8105 / 8140 1.9 515.3 1.0X
-Parquet Vectorized (Pushdown) 478 / 505 32.9 30.4 17.0X
-Native ORC Vectorized 6914 / 7211 2.3 439.6 1.2X
-Native ORC Vectorized (Pushdown) 1044 / 1064 15.1 66.4 7.8X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10505 / 10514 1.5 667.9 1.0X
+Parquet Vectorized (Pushdown) 659 / 673 23.9 41.9 15.9X
+Native ORC Vectorized 6634 / 6641 2.4 421.8 1.6X
+Native ORC Vectorized (Pushdown) 513 / 526 30.7 32.6 20.5X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (value = 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7983 / 8116 2.0 507.6 1.0X
-Parquet Vectorized (Pushdown) 464 / 487 33.9 29.5 17.2X
-Native ORC Vectorized 6703 / 6774 2.3 426.1 1.2X
-Native ORC Vectorized (Pushdown) 1017 / 1058 15.5 64.6 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10555 / 10570 1.5 671.1 1.0X
+Parquet Vectorized (Pushdown) 651 / 668 24.2 41.4 16.2X
+Native ORC Vectorized 6721 / 6728 2.3 427.3 1.6X
+Native ORC Vectorized (Pushdown) 508 / 519 31.0 32.3 20.8X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (value <=> 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7942 / 7983 2.0 504.9 1.0X
-Parquet Vectorized (Pushdown) 468 / 479 33.6 29.7 17.0X
-Native ORC Vectorized 6677 / 6779 2.4 424.5 1.2X
-Native ORC Vectorized (Pushdown) 1021 / 1068 15.4 64.9 7.8X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10556 / 10566 1.5 671.1 1.0X
+Parquet Vectorized (Pushdown) 647 / 654 24.3 41.1 16.3X
+Native ORC Vectorized 6716 / 6728 2.3 427.0 1.6X
+Native ORC Vectorized (Pushdown) 510 / 521 30.9 32.4 20.7X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (7864320 <= value <= 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7909 / 7958 2.0 502.8 1.0X
-Parquet Vectorized (Pushdown) 485 / 494 32.4 30.8 16.3X
-Native ORC Vectorized 6751 / 6846 2.3 429.2 1.2X
-Native ORC Vectorized (Pushdown) 1043 / 1077 15.1 66.3 7.6X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10556 / 10565 1.5 671.1 1.0X
+Parquet Vectorized (Pushdown) 649 / 654 24.2 41.3 16.3X
+Native ORC Vectorized 6700 / 6712 2.3 426.0 1.6X
+Native ORC Vectorized (Pushdown) 509 / 520 30.9 32.3 20.8X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (7864319 < value < 7864321): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8010 / 8033 2.0 509.2 1.0X
-Parquet Vectorized (Pushdown) 472 / 489 33.3 30.0 17.0X
-Native ORC Vectorized 6655 / 6808 2.4 423.1 1.2X
-Native ORC Vectorized (Pushdown) 1015 / 1067 15.5 64.5 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10547 / 10566 1.5 670.5 1.0X
+Parquet Vectorized (Pushdown) 649 / 653 24.2 41.3 16.3X
+Native ORC Vectorized 6703 / 6713 2.3 426.2 1.6X
+Native ORC Vectorized (Pushdown) 510 / 520 30.8 32.5 20.7X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 10% int rows (value < 1572864): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8983 / 9035 1.8 571.1 1.0X
-Parquet Vectorized (Pushdown) 2204 / 2231 7.1 140.1 4.1X
-Native ORC Vectorized 7864 / 8011 2.0 500.0 1.1X
-Native ORC Vectorized (Pushdown) 2674 / 2789 5.9 170.0 3.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11478 / 11525 1.4 729.7 1.0X
+Parquet Vectorized (Pushdown) 2576 / 2587 6.1 163.8 4.5X
+Native ORC Vectorized 7633 / 7657 2.1 485.3 1.5X
+Native ORC Vectorized (Pushdown) 2076 / 2096 7.6 132.0 5.5X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 50% int rows (value < 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 12723 / 12903 1.2 808.9 1.0X
-Parquet Vectorized (Pushdown) 9112 / 9282 1.7 579.3 1.4X
-Native ORC Vectorized 12090 / 12230 1.3 768.7 1.1X
-Native ORC Vectorized (Pushdown) 9242 / 9372 1.7 587.6 1.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 14785 / 14802 1.1 940.0 1.0X
+Parquet Vectorized (Pushdown) 9971 / 9977 1.6 633.9 1.5X
+Native ORC Vectorized 11082 / 11107 1.4 704.6 1.3X
+Native ORC Vectorized (Pushdown) 8061 / 8073 2.0 512.5 1.8X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 90% int rows (value < 14155776): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 16453 / 16678 1.0 1046.1 1.0X
-Parquet Vectorized (Pushdown) 15997 / 16262 1.0 1017.0 1.0X
-Native ORC Vectorized 16652 / 17070 0.9 1058.7 1.0X
-Native ORC Vectorized (Pushdown) 15843 / 16112 1.0 1007.2 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 18174 / 18214 0.9 1155.5 1.0X
+Parquet Vectorized (Pushdown) 17387 / 17403 0.9 1105.5 1.0X
+Native ORC Vectorized 14465 / 14492 1.1 919.7 1.3X
+Native ORC Vectorized (Pushdown) 14024 / 14041 1.1 891.6 1.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all int rows (value IS NOT NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 17098 / 17254 0.9 1087.1 1.0X
-Parquet Vectorized (Pushdown) 17302 / 17529 0.9 1100.1 1.0X
-Native ORC Vectorized 16790 / 17098 0.9 1067.5 1.0X
-Native ORC Vectorized (Pushdown) 17329 / 17914 0.9 1101.7 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 19004 / 19014 0.8 1208.2 1.0X
+Parquet Vectorized (Pushdown) 19219 / 19232 0.8 1221.9 1.0X
+Native ORC Vectorized 15266 / 15290 1.0 970.6 1.2X
+Native ORC Vectorized (Pushdown) 15469 / 15482 1.0 983.5 1.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all int rows (value > -1): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 17088 / 17392 0.9 1086.4 1.0X
-Parquet Vectorized (Pushdown) 17609 / 17863 0.9 1119.5 1.0X
-Native ORC Vectorized 18334 / 69831 0.9 1165.7 0.9X
-Native ORC Vectorized (Pushdown) 17465 / 17629 0.9 1110.4 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 19036 / 19052 0.8 1210.3 1.0X
+Parquet Vectorized (Pushdown) 19287 / 19306 0.8 1226.2 1.0X
+Native ORC Vectorized 15311 / 15371 1.0 973.5 1.2X
+Native ORC Vectorized (Pushdown) 15517 / 15590 1.0 986.5 1.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all int rows (value != -1): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 16903 / 17233 0.9 1074.6 1.0X
-Parquet Vectorized (Pushdown) 16945 / 17032 0.9 1077.3 1.0X
-Native ORC Vectorized 16377 / 16762 1.0 1041.2 1.0X
-Native ORC Vectorized (Pushdown) 16950 / 17212 0.9 1077.7 1.0X
+Parquet Vectorized 19072 / 19102 0.8 1212.6 1.0X
+Parquet Vectorized (Pushdown) 19288 / 19318 0.8 1226.3 1.0X
+Native ORC Vectorized 15277 / 15293 1.0 971.3 1.2X
+Native ORC Vectorized (Pushdown) 15479 / 15499 1.0 984.1 1.2X
================================================================================================
Pushdown for few distinct value case (use dictionary encoding)
================================================================================================
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
-
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 distinct string row (value IS NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7245 / 7322 2.2 460.7 1.0X
-Parquet Vectorized (Pushdown) 378 / 389 41.6 24.0 19.2X
-Native ORC Vectorized 6720 / 6778 2.3 427.2 1.1X
-Native ORC Vectorized (Pushdown) 1009 / 1032 15.6 64.2 7.2X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10250 / 10274 1.5 651.7 1.0X
+Parquet Vectorized (Pushdown) 571 / 576 27.5 36.3 17.9X
+Native ORC Vectorized 8651 / 8660 1.8 550.0 1.2X
+Native ORC Vectorized (Pushdown) 909 / 933 17.3 57.8 11.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 distinct string row ('100' < value < '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7627 / 7795 2.1 484.9 1.0X
-Parquet Vectorized (Pushdown) 384 / 406 41.0 24.4 19.9X
-Native ORC Vectorized 6724 / 7824 2.3 427.5 1.1X
-Native ORC Vectorized (Pushdown) 968 / 986 16.3 61.5 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10420 / 10426 1.5 662.5 1.0X
+Parquet Vectorized (Pushdown) 574 / 579 27.4 36.5 18.2X
+Native ORC Vectorized 8973 / 8982 1.8 570.5 1.2X
+Native ORC Vectorized (Pushdown) 916 / 955 17.2 58.2 11.4X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 distinct string row (value = '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7157 / 7534 2.2 455.0 1.0X
-Parquet Vectorized (Pushdown) 542 / 565 29.0 34.5 13.2X
-Native ORC Vectorized 6716 / 7214 2.3 427.0 1.1X
-Native ORC Vectorized (Pushdown) 1212 / 1288 13.0 77.0 5.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10428 / 10441 1.5 663.0 1.0X
+Parquet Vectorized (Pushdown) 789 / 809 19.9 50.2 13.2X
+Native ORC Vectorized 9042 / 9055 1.7 574.9 1.2X
+Native ORC Vectorized (Pushdown) 1130 / 1145 13.9 71.8 9.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 distinct string row (value <=> '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7368 / 7552 2.1 468.4 1.0X
-Parquet Vectorized (Pushdown) 544 / 556 28.9 34.6 13.5X
-Native ORC Vectorized 6740 / 6867 2.3 428.5 1.1X
-Native ORC Vectorized (Pushdown) 1230 / 1426 12.8 78.2 6.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10402 / 10416 1.5 661.3 1.0X
+Parquet Vectorized (Pushdown) 791 / 806 19.9 50.3 13.2X
+Native ORC Vectorized 9042 / 9055 1.7 574.9 1.2X
+Native ORC Vectorized (Pushdown) 1112 / 1145 14.1 70.7 9.4X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 distinct string row ('100' <= value <= '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7427 / 7734 2.1 472.2 1.0X
-Parquet Vectorized (Pushdown) 556 / 568 28.3 35.4 13.3X
-Native ORC Vectorized 6847 / 7059 2.3 435.3 1.1X
-Native ORC Vectorized (Pushdown) 1226 / 1230 12.8 77.9 6.1X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10548 / 10563 1.5 670.6 1.0X
+Parquet Vectorized (Pushdown) 790 / 796 19.9 50.2 13.4X
+Native ORC Vectorized 9144 / 9153 1.7 581.3 1.2X
+Native ORC Vectorized (Pushdown) 1117 / 1148 14.1 71.0 9.4X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all distinct string rows (value IS NOT NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 16998 / 17311 0.9 1080.7 1.0X
-Parquet Vectorized (Pushdown) 16977 / 17250 0.9 1079.4 1.0X
-Native ORC Vectorized 18447 / 19852 0.9 1172.8 0.9X
-Native ORC Vectorized (Pushdown) 16614 / 17102 0.9 1056.3 1.0X
+Parquet Vectorized 20445 / 20469 0.8 1299.8 1.0X
+Parquet Vectorized (Pushdown) 20686 / 20699 0.8 1315.2 1.0X
+Native ORC Vectorized 18851 / 18953 0.8 1198.5 1.1X
+Native ORC Vectorized (Pushdown) 19255 / 19268 0.8 1224.2 1.1X
================================================================================================
Pushdown benchmark for StringStartsWith
================================================================================================
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
-
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
StringStartsWith filter: (value like '10%'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9705 / 10814 1.6 617.0 1.0X
-Parquet Vectorized (Pushdown) 3086 / 3574 5.1 196.2 3.1X
-Native ORC Vectorized 10094 / 10695 1.6 641.8 1.0X
-Native ORC Vectorized (Pushdown) 9611 / 9999 1.6 611.0 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 14265 / 15213 1.1 907.0 1.0X
+Parquet Vectorized (Pushdown) 4228 / 4870 3.7 268.8 3.4X
+Native ORC Vectorized 10116 / 10977 1.6 643.2 1.4X
+Native ORC Vectorized (Pushdown) 10653 / 11376 1.5 677.3 1.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
StringStartsWith filter: (value like '1000%'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8016 / 8183 2.0 509.7 1.0X
-Parquet Vectorized (Pushdown) 444 / 457 35.4 28.2 18.0X
-Native ORC Vectorized 6970 / 7169 2.3 443.2 1.2X
-Native ORC Vectorized (Pushdown) 7447 / 7503 2.1 473.5 1.1X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11499 / 11539 1.4 731.1 1.0X
+Parquet Vectorized (Pushdown) 669 / 672 23.5 42.5 17.2X
+Native ORC Vectorized 7343 / 7363 2.1 466.8 1.6X
+Native ORC Vectorized (Pushdown) 7559 / 7568 2.1 480.6 1.5X
--- End diff --
ORC doesn't support custom filter pushdown yet. This is expected and consistent with the previous result, @cloud-fan. :) Also, thank you for bringing up my previous comment, @wangyum.
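To read this off the numbers, the sketch below recomputes the Relative column for the `value like '1000%'` case from the best times in the `+` rows of the diff. It assumes that Relative is the baseline row's best time divided by each row's best time, which matches the printed values but is not stated in this thread. The function names are hypothetical.

```python
# Best times (ms) copied from the '+' rows of
# "StringStartsWith filter: (value like '1000%')".
best_ms = {
    "Parquet Vectorized": 11499,
    "Parquet Vectorized (Pushdown)": 669,
    "Native ORC Vectorized": 7343,
    "Native ORC Vectorized (Pushdown)": 7559,
}


def relative(best, baseline):
    """Speedup over the baseline row (assumed formula for the Relative column)."""
    return round(baseline / best, 1)


# The first row of each table serves as the 1.0X baseline.
baseline = best_ms["Parquet Vectorized"]
relative_column = {name: relative(ms, baseline) for name, ms in best_ms.items()}


def pushdown_speedup(fmt):
    """Speedup of the pushdown run over the plain run of the same format."""
    return round(best_ms[fmt] / best_ms[fmt + " (Pushdown)"], 2)


print(relative_column)
print(pushdown_speedup("Parquet Vectorized"),
      pushdown_speedup("Native ORC Vectorized"))
```

Comparing each format against its own non-pushdown run, rather than against the Parquet baseline, makes the point of the comment visible: the Parquet pushdown run is roughly 17 times faster, while the ORC pushdown run is marginally slower than the plain ORC run.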
---
[GitHub] spark pull request #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchm...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/22427
---
[GitHub] spark pull request #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchm...
Posted by wangyum <gi...@git.apache.org>.
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/22427#discussion_r217918733
--- Diff: sql/core/benchmarks/FilterPushdownBenchmark-results.txt ---
@@ -2,737 +2,669 @@
Pushdown for many distinct value case
================================================================================================
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
-
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 string row (value IS NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8970 / 9122 1.8 570.3 1.0X
-Parquet Vectorized (Pushdown) 471 / 491 33.4 30.0 19.0X
-Native ORC Vectorized 7661 / 7853 2.1 487.0 1.2X
-Native ORC Vectorized (Pushdown) 1134 / 1161 13.9 72.1 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11405 / 11485 1.4 725.1 1.0X
+Parquet Vectorized (Pushdown) 675 / 690 23.3 42.9 16.9X
+Native ORC Vectorized 7127 / 7170 2.2 453.1 1.6X
+Native ORC Vectorized (Pushdown) 519 / 541 30.3 33.0 22.0X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 string row ('7864320' < value < '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9246 / 9297 1.7 587.8 1.0X
-Parquet Vectorized (Pushdown) 480 / 488 32.8 30.5 19.3X
-Native ORC Vectorized 7838 / 7850 2.0 498.3 1.2X
-Native ORC Vectorized (Pushdown) 1054 / 1118 14.9 67.0 8.8X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11457 / 11473 1.4 728.4 1.0X
+Parquet Vectorized (Pushdown) 656 / 686 24.0 41.7 17.5X
+Native ORC Vectorized 7328 / 7342 2.1 465.9 1.6X
+Native ORC Vectorized (Pushdown) 539 / 565 29.2 34.2 21.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 string row (value = '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8989 / 9100 1.7 571.5 1.0X
-Parquet Vectorized (Pushdown) 448 / 467 35.1 28.5 20.1X
-Native ORC Vectorized 7680 / 7768 2.0 488.3 1.2X
-Native ORC Vectorized (Pushdown) 1067 / 1118 14.7 67.8 8.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11878 / 11888 1.3 755.2 1.0X
+Parquet Vectorized (Pushdown) 630 / 654 25.0 40.1 18.9X
+Native ORC Vectorized 7342 / 7362 2.1 466.8 1.6X
+Native ORC Vectorized (Pushdown) 519 / 537 30.3 33.0 22.9X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 string row (value <=> '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9115 / 9266 1.7 579.5 1.0X
-Parquet Vectorized (Pushdown) 466 / 492 33.7 29.7 19.5X
-Native ORC Vectorized 7800 / 7914 2.0 495.9 1.2X
-Native ORC Vectorized (Pushdown) 1075 / 1102 14.6 68.4 8.5X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11423 / 11440 1.4 726.2 1.0X
+Parquet Vectorized (Pushdown) 625 / 643 25.2 39.7 18.3X
+Native ORC Vectorized 7315 / 7335 2.2 465.1 1.6X
+Native ORC Vectorized (Pushdown) 507 / 520 31.0 32.2 22.5X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 string row ('7864320' <= value <= '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9099 / 9237 1.7 578.5 1.0X
-Parquet Vectorized (Pushdown) 462 / 475 34.1 29.3 19.7X
-Native ORC Vectorized 7847 / 7925 2.0 498.9 1.2X
-Native ORC Vectorized (Pushdown) 1078 / 1114 14.6 68.5 8.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11440 / 11478 1.4 727.3 1.0X
+Parquet Vectorized (Pushdown) 634 / 652 24.8 40.3 18.0X
+Native ORC Vectorized 7311 / 7324 2.2 464.8 1.6X
+Native ORC Vectorized (Pushdown) 517 / 548 30.4 32.8 22.1X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all string rows (value IS NOT NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 19303 / 19547 0.8 1227.3 1.0X
-Parquet Vectorized (Pushdown) 19924 / 20089 0.8 1266.7 1.0X
-Native ORC Vectorized 18725 / 19079 0.8 1190.5 1.0X
-Native ORC Vectorized (Pushdown) 19310 / 19492 0.8 1227.7 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 20750 / 20872 0.8 1319.3 1.0X
+Parquet Vectorized (Pushdown) 21002 / 21032 0.7 1335.3 1.0X
+Native ORC Vectorized 16714 / 16742 0.9 1062.6 1.2X
+Native ORC Vectorized (Pushdown) 16926 / 16965 0.9 1076.1 1.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 int row (value IS NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8117 / 8323 1.9 516.1 1.0X
-Parquet Vectorized (Pushdown) 484 / 494 32.5 30.8 16.8X
-Native ORC Vectorized 6811 / 7036 2.3 433.0 1.2X
-Native ORC Vectorized (Pushdown) 1061 / 1082 14.8 67.5 7.6X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10510 / 10532 1.5 668.2 1.0X
+Parquet Vectorized (Pushdown) 642 / 665 24.5 40.8 16.4X
+Native ORC Vectorized 6609 / 6618 2.4 420.2 1.6X
+Native ORC Vectorized (Pushdown) 502 / 512 31.4 31.9 21.0X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 int row (7864320 < value < 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8105 / 8140 1.9 515.3 1.0X
-Parquet Vectorized (Pushdown) 478 / 505 32.9 30.4 17.0X
-Native ORC Vectorized 6914 / 7211 2.3 439.6 1.2X
-Native ORC Vectorized (Pushdown) 1044 / 1064 15.1 66.4 7.8X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10505 / 10514 1.5 667.9 1.0X
+Parquet Vectorized (Pushdown) 659 / 673 23.9 41.9 15.9X
+Native ORC Vectorized 6634 / 6641 2.4 421.8 1.6X
+Native ORC Vectorized (Pushdown) 513 / 526 30.7 32.6 20.5X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (value = 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7983 / 8116 2.0 507.6 1.0X
-Parquet Vectorized (Pushdown) 464 / 487 33.9 29.5 17.2X
-Native ORC Vectorized 6703 / 6774 2.3 426.1 1.2X
-Native ORC Vectorized (Pushdown) 1017 / 1058 15.5 64.6 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10555 / 10570 1.5 671.1 1.0X
+Parquet Vectorized (Pushdown) 651 / 668 24.2 41.4 16.2X
+Native ORC Vectorized 6721 / 6728 2.3 427.3 1.6X
+Native ORC Vectorized (Pushdown) 508 / 519 31.0 32.3 20.8X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (value <=> 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7942 / 7983 2.0 504.9 1.0X
-Parquet Vectorized (Pushdown) 468 / 479 33.6 29.7 17.0X
-Native ORC Vectorized 6677 / 6779 2.4 424.5 1.2X
-Native ORC Vectorized (Pushdown) 1021 / 1068 15.4 64.9 7.8X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10556 / 10566 1.5 671.1 1.0X
+Parquet Vectorized (Pushdown) 647 / 654 24.3 41.1 16.3X
+Native ORC Vectorized 6716 / 6728 2.3 427.0 1.6X
+Native ORC Vectorized (Pushdown) 510 / 521 30.9 32.4 20.7X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (7864320 <= value <= 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7909 / 7958 2.0 502.8 1.0X
-Parquet Vectorized (Pushdown) 485 / 494 32.4 30.8 16.3X
-Native ORC Vectorized 6751 / 6846 2.3 429.2 1.2X
-Native ORC Vectorized (Pushdown) 1043 / 1077 15.1 66.3 7.6X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10556 / 10565 1.5 671.1 1.0X
+Parquet Vectorized (Pushdown) 649 / 654 24.2 41.3 16.3X
+Native ORC Vectorized 6700 / 6712 2.3 426.0 1.6X
+Native ORC Vectorized (Pushdown) 509 / 520 30.9 32.3 20.8X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (7864319 < value < 7864321): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8010 / 8033 2.0 509.2 1.0X
-Parquet Vectorized (Pushdown) 472 / 489 33.3 30.0 17.0X
-Native ORC Vectorized 6655 / 6808 2.4 423.1 1.2X
-Native ORC Vectorized (Pushdown) 1015 / 1067 15.5 64.5 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10547 / 10566 1.5 670.5 1.0X
+Parquet Vectorized (Pushdown) 649 / 653 24.2 41.3 16.3X
+Native ORC Vectorized 6703 / 6713 2.3 426.2 1.6X
+Native ORC Vectorized (Pushdown) 510 / 520 30.8 32.5 20.7X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 10% int rows (value < 1572864): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8983 / 9035 1.8 571.1 1.0X
-Parquet Vectorized (Pushdown) 2204 / 2231 7.1 140.1 4.1X
-Native ORC Vectorized 7864 / 8011 2.0 500.0 1.1X
-Native ORC Vectorized (Pushdown) 2674 / 2789 5.9 170.0 3.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11478 / 11525 1.4 729.7 1.0X
+Parquet Vectorized (Pushdown) 2576 / 2587 6.1 163.8 4.5X
+Native ORC Vectorized 7633 / 7657 2.1 485.3 1.5X
+Native ORC Vectorized (Pushdown) 2076 / 2096 7.6 132.0 5.5X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 50% int rows (value < 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 12723 / 12903 1.2 808.9 1.0X
-Parquet Vectorized (Pushdown) 9112 / 9282 1.7 579.3 1.4X
-Native ORC Vectorized 12090 / 12230 1.3 768.7 1.1X
-Native ORC Vectorized (Pushdown) 9242 / 9372 1.7 587.6 1.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 14785 / 14802 1.1 940.0 1.0X
+Parquet Vectorized (Pushdown) 9971 / 9977 1.6 633.9 1.5X
+Native ORC Vectorized 11082 / 11107 1.4 704.6 1.3X
+Native ORC Vectorized (Pushdown) 8061 / 8073 2.0 512.5 1.8X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 90% int rows (value < 14155776): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 16453 / 16678 1.0 1046.1 1.0X
-Parquet Vectorized (Pushdown) 15997 / 16262 1.0 1017.0 1.0X
-Native ORC Vectorized 16652 / 17070 0.9 1058.7 1.0X
-Native ORC Vectorized (Pushdown) 15843 / 16112 1.0 1007.2 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 18174 / 18214 0.9 1155.5 1.0X
+Parquet Vectorized (Pushdown) 17387 / 17403 0.9 1105.5 1.0X
+Native ORC Vectorized 14465 / 14492 1.1 919.7 1.3X
+Native ORC Vectorized (Pushdown) 14024 / 14041 1.1 891.6 1.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all int rows (value IS NOT NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 17098 / 17254 0.9 1087.1 1.0X
-Parquet Vectorized (Pushdown) 17302 / 17529 0.9 1100.1 1.0X
-Native ORC Vectorized 16790 / 17098 0.9 1067.5 1.0X
-Native ORC Vectorized (Pushdown) 17329 / 17914 0.9 1101.7 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 19004 / 19014 0.8 1208.2 1.0X
+Parquet Vectorized (Pushdown) 19219 / 19232 0.8 1221.9 1.0X
+Native ORC Vectorized 15266 / 15290 1.0 970.6 1.2X
+Native ORC Vectorized (Pushdown) 15469 / 15482 1.0 983.5 1.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all int rows (value > -1): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 17088 / 17392 0.9 1086.4 1.0X
-Parquet Vectorized (Pushdown) 17609 / 17863 0.9 1119.5 1.0X
-Native ORC Vectorized 18334 / 69831 0.9 1165.7 0.9X
-Native ORC Vectorized (Pushdown) 17465 / 17629 0.9 1110.4 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 19036 / 19052 0.8 1210.3 1.0X
+Parquet Vectorized (Pushdown) 19287 / 19306 0.8 1226.2 1.0X
+Native ORC Vectorized 15311 / 15371 1.0 973.5 1.2X
+Native ORC Vectorized (Pushdown) 15517 / 15590 1.0 986.5 1.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all int rows (value != -1): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 16903 / 17233 0.9 1074.6 1.0X
-Parquet Vectorized (Pushdown) 16945 / 17032 0.9 1077.3 1.0X
-Native ORC Vectorized 16377 / 16762 1.0 1041.2 1.0X
-Native ORC Vectorized (Pushdown) 16950 / 17212 0.9 1077.7 1.0X
+Parquet Vectorized 19072 / 19102 0.8 1212.6 1.0X
+Parquet Vectorized (Pushdown) 19288 / 19318 0.8 1226.3 1.0X
+Native ORC Vectorized 15277 / 15293 1.0 971.3 1.2X
+Native ORC Vectorized (Pushdown) 15479 / 15499 1.0 984.1 1.2X
================================================================================================
Pushdown for few distinct value case (use dictionary encoding)
================================================================================================
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
-
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 distinct string row (value IS NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7245 / 7322 2.2 460.7 1.0X
-Parquet Vectorized (Pushdown) 378 / 389 41.6 24.0 19.2X
-Native ORC Vectorized 6720 / 6778 2.3 427.2 1.1X
-Native ORC Vectorized (Pushdown) 1009 / 1032 15.6 64.2 7.2X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10250 / 10274 1.5 651.7 1.0X
+Parquet Vectorized (Pushdown) 571 / 576 27.5 36.3 17.9X
+Native ORC Vectorized 8651 / 8660 1.8 550.0 1.2X
+Native ORC Vectorized (Pushdown) 909 / 933 17.3 57.8 11.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 distinct string row ('100' < value < '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7627 / 7795 2.1 484.9 1.0X
-Parquet Vectorized (Pushdown) 384 / 406 41.0 24.4 19.9X
-Native ORC Vectorized 6724 / 7824 2.3 427.5 1.1X
-Native ORC Vectorized (Pushdown) 968 / 986 16.3 61.5 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10420 / 10426 1.5 662.5 1.0X
+Parquet Vectorized (Pushdown) 574 / 579 27.4 36.5 18.2X
+Native ORC Vectorized 8973 / 8982 1.8 570.5 1.2X
+Native ORC Vectorized (Pushdown) 916 / 955 17.2 58.2 11.4X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 distinct string row (value = '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7157 / 7534 2.2 455.0 1.0X
-Parquet Vectorized (Pushdown) 542 / 565 29.0 34.5 13.2X
-Native ORC Vectorized 6716 / 7214 2.3 427.0 1.1X
-Native ORC Vectorized (Pushdown) 1212 / 1288 13.0 77.0 5.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10428 / 10441 1.5 663.0 1.0X
+Parquet Vectorized (Pushdown) 789 / 809 19.9 50.2 13.2X
+Native ORC Vectorized 9042 / 9055 1.7 574.9 1.2X
+Native ORC Vectorized (Pushdown) 1130 / 1145 13.9 71.8 9.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 distinct string row (value <=> '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7368 / 7552 2.1 468.4 1.0X
-Parquet Vectorized (Pushdown) 544 / 556 28.9 34.6 13.5X
-Native ORC Vectorized 6740 / 6867 2.3 428.5 1.1X
-Native ORC Vectorized (Pushdown) 1230 / 1426 12.8 78.2 6.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10402 / 10416 1.5 661.3 1.0X
+Parquet Vectorized (Pushdown) 791 / 806 19.9 50.3 13.2X
+Native ORC Vectorized 9042 / 9055 1.7 574.9 1.2X
+Native ORC Vectorized (Pushdown) 1112 / 1145 14.1 70.7 9.4X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 distinct string row ('100' <= value <= '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7427 / 7734 2.1 472.2 1.0X
-Parquet Vectorized (Pushdown) 556 / 568 28.3 35.4 13.3X
-Native ORC Vectorized 6847 / 7059 2.3 435.3 1.1X
-Native ORC Vectorized (Pushdown) 1226 / 1230 12.8 77.9 6.1X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10548 / 10563 1.5 670.6 1.0X
+Parquet Vectorized (Pushdown) 790 / 796 19.9 50.2 13.4X
+Native ORC Vectorized 9144 / 9153 1.7 581.3 1.2X
+Native ORC Vectorized (Pushdown) 1117 / 1148 14.1 71.0 9.4X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all distinct string rows (value IS NOT NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 16998 / 17311 0.9 1080.7 1.0X
-Parquet Vectorized (Pushdown) 16977 / 17250 0.9 1079.4 1.0X
-Native ORC Vectorized 18447 / 19852 0.9 1172.8 0.9X
-Native ORC Vectorized (Pushdown) 16614 / 17102 0.9 1056.3 1.0X
+Parquet Vectorized 20445 / 20469 0.8 1299.8 1.0X
+Parquet Vectorized (Pushdown) 20686 / 20699 0.8 1315.2 1.0X
+Native ORC Vectorized 18851 / 18953 0.8 1198.5 1.1X
+Native ORC Vectorized (Pushdown) 19255 / 19268 0.8 1224.2 1.1X
================================================================================================
Pushdown benchmark for StringStartsWith
================================================================================================
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
-
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
StringStartsWith filter: (value like '10%'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9705 / 10814 1.6 617.0 1.0X
-Parquet Vectorized (Pushdown) 3086 / 3574 5.1 196.2 3.1X
-Native ORC Vectorized 10094 / 10695 1.6 641.8 1.0X
-Native ORC Vectorized (Pushdown) 9611 / 9999 1.6 611.0 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 14265 / 15213 1.1 907.0 1.0X
+Parquet Vectorized (Pushdown) 4228 / 4870 3.7 268.8 3.4X
+Native ORC Vectorized 10116 / 10977 1.6 643.2 1.4X
+Native ORC Vectorized (Pushdown) 10653 / 11376 1.5 677.3 1.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
StringStartsWith filter: (value like '1000%'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8016 / 8183 2.0 509.7 1.0X
-Parquet Vectorized (Pushdown) 444 / 457 35.4 28.2 18.0X
-Native ORC Vectorized 6970 / 7169 2.3 443.2 1.2X
-Native ORC Vectorized (Pushdown) 7447 / 7503 2.1 473.5 1.1X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11499 / 11539 1.4 731.1 1.0X
+Parquet Vectorized (Pushdown) 669 / 672 23.5 42.5 17.2X
+Native ORC Vectorized 7343 / 7363 2.1 466.8 1.6X
+Native ORC Vectorized (Pushdown) 7559 / 7568 2.1 480.6 1.5X
--- End diff --
It seems ORC doesn't support custom filters yet: https://github.com/apache/spark/pull/21623#issuecomment-401558357
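To read the StringStartsWith rows above in that light, here is a minimal Python sketch (not part of the PR) of how the derived columns appear to follow from the best time. The row count of 15 * 1024 * 1024 is an assumption inferred from the "Select 50% int rows (value < 7864320)" case, and `derive_columns` is a hypothetical helper name.

```python
# Assumption: the benchmark table has 15 * 1024 * 1024 rows. This is inferred
# from "Select 50% int rows (value < 7864320)"; the thread does not state it.
NUM_ROWS = 15 * 1024 * 1024


def derive_columns(best_ms, baseline_ms, num_rows=NUM_ROWS):
    """Return (rate in M rows/s, per-row time in ns, relative speedup)."""
    rate_m_per_s = num_rows / (best_ms / 1000.0) / 1e6
    per_row_ns = best_ms * 1e6 / num_rows
    relative = baseline_ms / best_ms  # baseline is the first row of the table
    return rate_m_per_s, per_row_ns, relative


# Best times (ms) copied from the new "value like '1000%'" rows in the diff.
best_times = {
    "Parquet Vectorized": 11499,
    "Parquet Vectorized (Pushdown)": 669,
    "Native ORC Vectorized": 7343,
    "Native ORC Vectorized (Pushdown)": 7559,
}

baseline = best_times["Parquet Vectorized"]
for name, best in best_times.items():
    rate, per_row, rel = derive_columns(best, baseline)
    print(f"{name:<35} {rate:5.1f} {per_row:7.1f} {rel:5.1f}X")
```

Under that assumption the printed columns should agree with the table to within rounding of the reported best times. The two ORC rows stay close to each other (about 1.6X and 1.5X against the Parquet baseline), which is consistent with the filter not being pushed down for ORC.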
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22427
**[Test build #96092 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96092/testReport)** for PR 22427 at commit [`fb14cd5`](https://github.com/apache/spark/commit/fb14cd5829f431593db71b1b5ec06dd0957791ad).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22427
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96092/
---
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22427
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22427
Merged to master/2.4.
---
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/22427
Just a question (I'm not familiar with the internal logic of either format): are these parameters (`Memory buffer for writing` and `Compression chunk size`) treated in the same manner internally? Also, are they performance-sensitive parameters?
---
[GitHub] spark pull request #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchm...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22427#discussion_r217916656
--- Diff: sql/core/benchmarks/FilterPushdownBenchmark-results.txt ---
@@ -2,737 +2,669 @@
Pushdown for many distinct value case
================================================================================================
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
-
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 string row (value IS NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8970 / 9122 1.8 570.3 1.0X
-Parquet Vectorized (Pushdown) 471 / 491 33.4 30.0 19.0X
-Native ORC Vectorized 7661 / 7853 2.1 487.0 1.2X
-Native ORC Vectorized (Pushdown) 1134 / 1161 13.9 72.1 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11405 / 11485 1.4 725.1 1.0X
+Parquet Vectorized (Pushdown) 675 / 690 23.3 42.9 16.9X
+Native ORC Vectorized 7127 / 7170 2.2 453.1 1.6X
+Native ORC Vectorized (Pushdown) 519 / 541 30.3 33.0 22.0X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 string row ('7864320' < value < '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9246 / 9297 1.7 587.8 1.0X
-Parquet Vectorized (Pushdown) 480 / 488 32.8 30.5 19.3X
-Native ORC Vectorized 7838 / 7850 2.0 498.3 1.2X
-Native ORC Vectorized (Pushdown) 1054 / 1118 14.9 67.0 8.8X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11457 / 11473 1.4 728.4 1.0X
+Parquet Vectorized (Pushdown) 656 / 686 24.0 41.7 17.5X
+Native ORC Vectorized 7328 / 7342 2.1 465.9 1.6X
+Native ORC Vectorized (Pushdown) 539 / 565 29.2 34.2 21.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 string row (value = '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8989 / 9100 1.7 571.5 1.0X
-Parquet Vectorized (Pushdown) 448 / 467 35.1 28.5 20.1X
-Native ORC Vectorized 7680 / 7768 2.0 488.3 1.2X
-Native ORC Vectorized (Pushdown) 1067 / 1118 14.7 67.8 8.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11878 / 11888 1.3 755.2 1.0X
+Parquet Vectorized (Pushdown) 630 / 654 25.0 40.1 18.9X
+Native ORC Vectorized 7342 / 7362 2.1 466.8 1.6X
+Native ORC Vectorized (Pushdown) 519 / 537 30.3 33.0 22.9X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 string row (value <=> '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9115 / 9266 1.7 579.5 1.0X
-Parquet Vectorized (Pushdown) 466 / 492 33.7 29.7 19.5X
-Native ORC Vectorized 7800 / 7914 2.0 495.9 1.2X
-Native ORC Vectorized (Pushdown) 1075 / 1102 14.6 68.4 8.5X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11423 / 11440 1.4 726.2 1.0X
+Parquet Vectorized (Pushdown) 625 / 643 25.2 39.7 18.3X
+Native ORC Vectorized 7315 / 7335 2.2 465.1 1.6X
+Native ORC Vectorized (Pushdown) 507 / 520 31.0 32.2 22.5X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 string row ('7864320' <= value <= '7864320'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9099 / 9237 1.7 578.5 1.0X
-Parquet Vectorized (Pushdown) 462 / 475 34.1 29.3 19.7X
-Native ORC Vectorized 7847 / 7925 2.0 498.9 1.2X
-Native ORC Vectorized (Pushdown) 1078 / 1114 14.6 68.5 8.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11440 / 11478 1.4 727.3 1.0X
+Parquet Vectorized (Pushdown) 634 / 652 24.8 40.3 18.0X
+Native ORC Vectorized 7311 / 7324 2.2 464.8 1.6X
+Native ORC Vectorized (Pushdown) 517 / 548 30.4 32.8 22.1X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all string rows (value IS NOT NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 19303 / 19547 0.8 1227.3 1.0X
-Parquet Vectorized (Pushdown) 19924 / 20089 0.8 1266.7 1.0X
-Native ORC Vectorized 18725 / 19079 0.8 1190.5 1.0X
-Native ORC Vectorized (Pushdown) 19310 / 19492 0.8 1227.7 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 20750 / 20872 0.8 1319.3 1.0X
+Parquet Vectorized (Pushdown) 21002 / 21032 0.7 1335.3 1.0X
+Native ORC Vectorized 16714 / 16742 0.9 1062.6 1.2X
+Native ORC Vectorized (Pushdown) 16926 / 16965 0.9 1076.1 1.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 int row (value IS NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8117 / 8323 1.9 516.1 1.0X
-Parquet Vectorized (Pushdown) 484 / 494 32.5 30.8 16.8X
-Native ORC Vectorized 6811 / 7036 2.3 433.0 1.2X
-Native ORC Vectorized (Pushdown) 1061 / 1082 14.8 67.5 7.6X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10510 / 10532 1.5 668.2 1.0X
+Parquet Vectorized (Pushdown) 642 / 665 24.5 40.8 16.4X
+Native ORC Vectorized 6609 / 6618 2.4 420.2 1.6X
+Native ORC Vectorized (Pushdown) 502 / 512 31.4 31.9 21.0X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 int row (7864320 < value < 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8105 / 8140 1.9 515.3 1.0X
-Parquet Vectorized (Pushdown) 478 / 505 32.9 30.4 17.0X
-Native ORC Vectorized 6914 / 7211 2.3 439.6 1.2X
-Native ORC Vectorized (Pushdown) 1044 / 1064 15.1 66.4 7.8X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10505 / 10514 1.5 667.9 1.0X
+Parquet Vectorized (Pushdown) 659 / 673 23.9 41.9 15.9X
+Native ORC Vectorized 6634 / 6641 2.4 421.8 1.6X
+Native ORC Vectorized (Pushdown) 513 / 526 30.7 32.6 20.5X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (value = 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7983 / 8116 2.0 507.6 1.0X
-Parquet Vectorized (Pushdown) 464 / 487 33.9 29.5 17.2X
-Native ORC Vectorized 6703 / 6774 2.3 426.1 1.2X
-Native ORC Vectorized (Pushdown) 1017 / 1058 15.5 64.6 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10555 / 10570 1.5 671.1 1.0X
+Parquet Vectorized (Pushdown) 651 / 668 24.2 41.4 16.2X
+Native ORC Vectorized 6721 / 6728 2.3 427.3 1.6X
+Native ORC Vectorized (Pushdown) 508 / 519 31.0 32.3 20.8X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (value <=> 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7942 / 7983 2.0 504.9 1.0X
-Parquet Vectorized (Pushdown) 468 / 479 33.6 29.7 17.0X
-Native ORC Vectorized 6677 / 6779 2.4 424.5 1.2X
-Native ORC Vectorized (Pushdown) 1021 / 1068 15.4 64.9 7.8X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10556 / 10566 1.5 671.1 1.0X
+Parquet Vectorized (Pushdown) 647 / 654 24.3 41.1 16.3X
+Native ORC Vectorized 6716 / 6728 2.3 427.0 1.6X
+Native ORC Vectorized (Pushdown) 510 / 521 30.9 32.4 20.7X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (7864320 <= value <= 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7909 / 7958 2.0 502.8 1.0X
-Parquet Vectorized (Pushdown) 485 / 494 32.4 30.8 16.3X
-Native ORC Vectorized 6751 / 6846 2.3 429.2 1.2X
-Native ORC Vectorized (Pushdown) 1043 / 1077 15.1 66.3 7.6X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10556 / 10565 1.5 671.1 1.0X
+Parquet Vectorized (Pushdown) 649 / 654 24.2 41.3 16.3X
+Native ORC Vectorized 6700 / 6712 2.3 426.0 1.6X
+Native ORC Vectorized (Pushdown) 509 / 520 30.9 32.3 20.8X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 int row (7864319 < value < 7864321): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8010 / 8033 2.0 509.2 1.0X
-Parquet Vectorized (Pushdown) 472 / 489 33.3 30.0 17.0X
-Native ORC Vectorized 6655 / 6808 2.4 423.1 1.2X
-Native ORC Vectorized (Pushdown) 1015 / 1067 15.5 64.5 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10547 / 10566 1.5 670.5 1.0X
+Parquet Vectorized (Pushdown) 649 / 653 24.2 41.3 16.3X
+Native ORC Vectorized 6703 / 6713 2.3 426.2 1.6X
+Native ORC Vectorized (Pushdown) 510 / 520 30.8 32.5 20.7X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 10% int rows (value < 1572864): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8983 / 9035 1.8 571.1 1.0X
-Parquet Vectorized (Pushdown) 2204 / 2231 7.1 140.1 4.1X
-Native ORC Vectorized 7864 / 8011 2.0 500.0 1.1X
-Native ORC Vectorized (Pushdown) 2674 / 2789 5.9 170.0 3.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11478 / 11525 1.4 729.7 1.0X
+Parquet Vectorized (Pushdown) 2576 / 2587 6.1 163.8 4.5X
+Native ORC Vectorized 7633 / 7657 2.1 485.3 1.5X
+Native ORC Vectorized (Pushdown) 2076 / 2096 7.6 132.0 5.5X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 50% int rows (value < 7864320): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 12723 / 12903 1.2 808.9 1.0X
-Parquet Vectorized (Pushdown) 9112 / 9282 1.7 579.3 1.4X
-Native ORC Vectorized 12090 / 12230 1.3 768.7 1.1X
-Native ORC Vectorized (Pushdown) 9242 / 9372 1.7 587.6 1.4X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 14785 / 14802 1.1 940.0 1.0X
+Parquet Vectorized (Pushdown) 9971 / 9977 1.6 633.9 1.5X
+Native ORC Vectorized 11082 / 11107 1.4 704.6 1.3X
+Native ORC Vectorized (Pushdown) 8061 / 8073 2.0 512.5 1.8X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 90% int rows (value < 14155776): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 16453 / 16678 1.0 1046.1 1.0X
-Parquet Vectorized (Pushdown) 15997 / 16262 1.0 1017.0 1.0X
-Native ORC Vectorized 16652 / 17070 0.9 1058.7 1.0X
-Native ORC Vectorized (Pushdown) 15843 / 16112 1.0 1007.2 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 18174 / 18214 0.9 1155.5 1.0X
+Parquet Vectorized (Pushdown) 17387 / 17403 0.9 1105.5 1.0X
+Native ORC Vectorized 14465 / 14492 1.1 919.7 1.3X
+Native ORC Vectorized (Pushdown) 14024 / 14041 1.1 891.6 1.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all int rows (value IS NOT NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 17098 / 17254 0.9 1087.1 1.0X
-Parquet Vectorized (Pushdown) 17302 / 17529 0.9 1100.1 1.0X
-Native ORC Vectorized 16790 / 17098 0.9 1067.5 1.0X
-Native ORC Vectorized (Pushdown) 17329 / 17914 0.9 1101.7 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 19004 / 19014 0.8 1208.2 1.0X
+Parquet Vectorized (Pushdown) 19219 / 19232 0.8 1221.9 1.0X
+Native ORC Vectorized 15266 / 15290 1.0 970.6 1.2X
+Native ORC Vectorized (Pushdown) 15469 / 15482 1.0 983.5 1.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all int rows (value > -1): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 17088 / 17392 0.9 1086.4 1.0X
-Parquet Vectorized (Pushdown) 17609 / 17863 0.9 1119.5 1.0X
-Native ORC Vectorized 18334 / 69831 0.9 1165.7 0.9X
-Native ORC Vectorized (Pushdown) 17465 / 17629 0.9 1110.4 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 19036 / 19052 0.8 1210.3 1.0X
+Parquet Vectorized (Pushdown) 19287 / 19306 0.8 1226.2 1.0X
+Native ORC Vectorized 15311 / 15371 1.0 973.5 1.2X
+Native ORC Vectorized (Pushdown) 15517 / 15590 1.0 986.5 1.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all int rows (value != -1): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 16903 / 17233 0.9 1074.6 1.0X
-Parquet Vectorized (Pushdown) 16945 / 17032 0.9 1077.3 1.0X
-Native ORC Vectorized 16377 / 16762 1.0 1041.2 1.0X
-Native ORC Vectorized (Pushdown) 16950 / 17212 0.9 1077.7 1.0X
+Parquet Vectorized 19072 / 19102 0.8 1212.6 1.0X
+Parquet Vectorized (Pushdown) 19288 / 19318 0.8 1226.3 1.0X
+Native ORC Vectorized 15277 / 15293 1.0 971.3 1.2X
+Native ORC Vectorized (Pushdown) 15479 / 15499 1.0 984.1 1.2X
================================================================================================
Pushdown for few distinct value case (use dictionary encoding)
================================================================================================
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
-
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 distinct string row (value IS NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7245 / 7322 2.2 460.7 1.0X
-Parquet Vectorized (Pushdown) 378 / 389 41.6 24.0 19.2X
-Native ORC Vectorized 6720 / 6778 2.3 427.2 1.1X
-Native ORC Vectorized (Pushdown) 1009 / 1032 15.6 64.2 7.2X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10250 / 10274 1.5 651.7 1.0X
+Parquet Vectorized (Pushdown) 571 / 576 27.5 36.3 17.9X
+Native ORC Vectorized 8651 / 8660 1.8 550.0 1.2X
+Native ORC Vectorized (Pushdown) 909 / 933 17.3 57.8 11.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 0 distinct string row ('100' < value < '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7627 / 7795 2.1 484.9 1.0X
-Parquet Vectorized (Pushdown) 384 / 406 41.0 24.4 19.9X
-Native ORC Vectorized 6724 / 7824 2.3 427.5 1.1X
-Native ORC Vectorized (Pushdown) 968 / 986 16.3 61.5 7.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10420 / 10426 1.5 662.5 1.0X
+Parquet Vectorized (Pushdown) 574 / 579 27.4 36.5 18.2X
+Native ORC Vectorized 8973 / 8982 1.8 570.5 1.2X
+Native ORC Vectorized (Pushdown) 916 / 955 17.2 58.2 11.4X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 distinct string row (value = '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7157 / 7534 2.2 455.0 1.0X
-Parquet Vectorized (Pushdown) 542 / 565 29.0 34.5 13.2X
-Native ORC Vectorized 6716 / 7214 2.3 427.0 1.1X
-Native ORC Vectorized (Pushdown) 1212 / 1288 13.0 77.0 5.9X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10428 / 10441 1.5 663.0 1.0X
+Parquet Vectorized (Pushdown) 789 / 809 19.9 50.2 13.2X
+Native ORC Vectorized 9042 / 9055 1.7 574.9 1.2X
+Native ORC Vectorized (Pushdown) 1130 / 1145 13.9 71.8 9.2X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 distinct string row (value <=> '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7368 / 7552 2.1 468.4 1.0X
-Parquet Vectorized (Pushdown) 544 / 556 28.9 34.6 13.5X
-Native ORC Vectorized 6740 / 6867 2.3 428.5 1.1X
-Native ORC Vectorized (Pushdown) 1230 / 1426 12.8 78.2 6.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10402 / 10416 1.5 661.3 1.0X
+Parquet Vectorized (Pushdown) 791 / 806 19.9 50.3 13.2X
+Native ORC Vectorized 9042 / 9055 1.7 574.9 1.2X
+Native ORC Vectorized (Pushdown) 1112 / 1145 14.1 70.7 9.4X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select 1 distinct string row ('100' <= value <= '100'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 7427 / 7734 2.1 472.2 1.0X
-Parquet Vectorized (Pushdown) 556 / 568 28.3 35.4 13.3X
-Native ORC Vectorized 6847 / 7059 2.3 435.3 1.1X
-Native ORC Vectorized (Pushdown) 1226 / 1230 12.8 77.9 6.1X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 10548 / 10563 1.5 670.6 1.0X
+Parquet Vectorized (Pushdown) 790 / 796 19.9 50.2 13.4X
+Native ORC Vectorized 9144 / 9153 1.7 581.3 1.2X
+Native ORC Vectorized (Pushdown) 1117 / 1148 14.1 71.0 9.4X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select all distinct string rows (value IS NOT NULL): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 16998 / 17311 0.9 1080.7 1.0X
-Parquet Vectorized (Pushdown) 16977 / 17250 0.9 1079.4 1.0X
-Native ORC Vectorized 18447 / 19852 0.9 1172.8 0.9X
-Native ORC Vectorized (Pushdown) 16614 / 17102 0.9 1056.3 1.0X
+Parquet Vectorized 20445 / 20469 0.8 1299.8 1.0X
+Parquet Vectorized (Pushdown) 20686 / 20699 0.8 1315.2 1.0X
+Native ORC Vectorized 18851 / 18953 0.8 1198.5 1.1X
+Native ORC Vectorized (Pushdown) 19255 / 19268 0.8 1224.2 1.1X
================================================================================================
Pushdown benchmark for StringStartsWith
================================================================================================
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
-
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
StringStartsWith filter: (value like '10%'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 9705 / 10814 1.6 617.0 1.0X
-Parquet Vectorized (Pushdown) 3086 / 3574 5.1 196.2 3.1X
-Native ORC Vectorized 10094 / 10695 1.6 641.8 1.0X
-Native ORC Vectorized (Pushdown) 9611 / 9999 1.6 611.0 1.0X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 14265 / 15213 1.1 907.0 1.0X
+Parquet Vectorized (Pushdown) 4228 / 4870 3.7 268.8 3.4X
+Native ORC Vectorized 10116 / 10977 1.6 643.2 1.4X
+Native ORC Vectorized (Pushdown) 10653 / 11376 1.5 677.3 1.3X
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
StringStartsWith filter: (value like '1000%'): Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
-Parquet Vectorized 8016 / 8183 2.0 509.7 1.0X
-Parquet Vectorized (Pushdown) 444 / 457 35.4 28.2 18.0X
-Native ORC Vectorized 6970 / 7169 2.3 443.2 1.2X
-Native ORC Vectorized (Pushdown) 7447 / 7503 2.1 473.5 1.1X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
-Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Parquet Vectorized 11499 / 11539 1.4 731.1 1.0X
+Parquet Vectorized (Pushdown) 669 / 672 23.5 42.5 17.2X
+Native ORC Vectorized 7343 / 7363 2.1 466.8 1.6X
+Native ORC Vectorized (Pushdown) 7559 / 7568 2.1 480.6 1.5X
--- End diff --
Does ORC support `StringStartsWith` pushdown?
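For reference when reading the `value like '1000%'` rows in the diff, the derived columns can be recomputed from the best time alone. The following is a minimal Python sketch, not part of the benchmark. It assumes the table holds 2 * 7864320 = 15,728,640 rows (inferred from the "Select 50% int rows (value < 7864320)" case, not stated in this excerpt) and that Relative is the first row's best time divided by each row's best time. The function name `derived_columns` is hypothetical.

```python
# Assumption: row count inferred from the "Select 50% int rows (value < 7864320)" case.
NUM_ROWS = 2 * 7864320  # 15,728,640


def derived_columns(best_ms, baseline_best_ms, num_rows=NUM_ROWS):
    """Return (rate in M rows/s, ns per row, relative speedup) for one benchmark row."""
    rate_m_per_s = num_rows / (best_ms / 1000.0) / 1e6
    per_row_ns = best_ms * 1e6 / num_rows
    relative = baseline_best_ms / best_ms
    return round(rate_m_per_s, 1), round(per_row_ns, 1), round(relative, 1)


# Best times (ms) copied from the updated `value like '1000%'` results.
results = {
    "Parquet Vectorized": 11499,
    "Parquet Vectorized (Pushdown)": 669,
    "Native ORC Vectorized": 7343,
    "Native ORC Vectorized (Pushdown)": 7559,
}

baseline = results["Parquet Vectorized"]
for name, best_ms in results.items():
    print(name, derived_columns(best_ms, baseline))
```

Recomputed values can differ from the table in the last digit, presumably because the table is derived from unrounded timings. In these numbers, ORC with and without pushdown is within about 3% (7559 ms vs. 7343 ms) while Parquet improves roughly 17x, which is consistent with the filter not being pushed down for ORC in this benchmark; the excerpt itself does not confirm the cause.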
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22427
Thank you for the review, @maropu.
1. Yes. It's the same. The first one limits the memory usage for the write operation. The second one limits the memory usage for the compression operation.
2. Yes. As you can see in this PR, it's performance-sensitive. Actually, all Parquet/ORC parameters are performance-sensitive.
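The two kinds of limits in the first answer can be sketched as follows. This is a minimal Python illustration that does not call Spark. The option keys are the ones listed in the PR description; the 1MB values are an assumption based on that description (the `parquet.page.size` default applied to all four settings), and the helper names are hypothetical.

```python
MB = 1024 * 1024
KB = 1024

# Keys come from the PR description; values are assumed benchmark settings.
PARQUET_OPTIONS = {
    "parquet.block.size": 1 * MB,  # memory buffer for writing
    "parquet.page.size": 1 * MB,   # compression chunk size (Parquet default)
}
ORC_OPTIONS = {
    "orc.stripe.size": 1 * MB,     # memory buffer for writing
    "orc.compress.size": 1 * MB,   # compression chunk size (ORC default is 256KB)
}


def memory_assumption(options, write_key, compress_key):
    """Return the (write buffer, compression chunk) sizes in bytes."""
    return options[write_key], options[compress_key]


def same_memory_assumption(parquet, orc):
    """True when both formats use equal write-buffer and compression-chunk sizes."""
    return (
        memory_assumption(parquet, "parquet.block.size", "parquet.page.size")
        == memory_assumption(orc, "orc.stripe.size", "orc.compress.size")
    )


print(same_memory_assumption(PARQUET_OPTIONS, ORC_OPTIONS))
```

Leaving `orc.compress.size` at its 256KB default while the other three settings are 1MB would make `same_memory_assumption` return `False`, which is the mismatch the PR description says the earlier benchmark had.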
---
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22427
Could you review this, @gatorsmile , @cloud-fan , @dbtsai , @maropu and @wangyum ?
---
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22427
Thank you, @maropu !
---
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22427
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3127/
---
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22427
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22427: [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to ...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22427
cc @rdblue
---