You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by peter-toth <gi...@git.apache.org> on 2018/10/02 20:59:58 UTC
[GitHub] spark pull request #22617: [SPARK-25484][TEST] Refactor ExternalAppendOnlyUn...
GitHub user peter-toth opened a pull request:
https://github.com/apache/spark/pull/22617
[SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRowArrayBenchmark
## What changes were proposed in this pull request?
1. Refactor ExternalAppendOnlyUnsafeRowArrayBenchmark
2. Fixes issue in `ExternalAppendOnlyUnsafeRowArray` creation
## How was this patch tested?
Manually tested and regenerated results.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/peter-toth/spark SPARK-25484
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22617.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22617
----
commit b9991cac48b6e28c2dfdbdd66e4f26008283643f
Author: Peter Toth <pe...@...>
Date: 2018-10-02T14:31:44Z
[SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRowArrayBenchmark
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22617
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97348/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22617
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendO...
Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on a diff in the pull request:
https://github.com/apache/spark/pull/22617#discussion_r229480460
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArrayBenchmark.scala ---
@@ -68,9 +100,7 @@ object ExternalAppendOnlyUnsafeRowArrayBenchmark {
benchmark.addCase("ExternalAppendOnlyUnsafeRowArray") { _: Int =>
var sum = 0L
for (_ <- 0L until iterations) {
- val array = new ExternalAppendOnlyUnsafeRowArray(
- ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer,
- numSpillThreshold)
+ val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
--- End diff --
Ok. Reverted the change.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22617
**[Test build #97348 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97348/testReport)** for PR 22617 at commit [`f016e20`](https://github.com/apache/spark/commit/f016e20cbb221d4cdf8eeb7bbdc9a4ab6a6cde24).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22617
Retest this please.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendO...
Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on a diff in the pull request:
https://github.com/apache/spark/pull/22617#discussion_r228889856
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArrayBenchmark.scala ---
@@ -68,9 +100,7 @@ object ExternalAppendOnlyUnsafeRowArrayBenchmark {
benchmark.addCase("ExternalAppendOnlyUnsafeRowArray") { _: Int =>
var sum = 0L
for (_ <- 0L until iterations) {
- val array = new ExternalAppendOnlyUnsafeRowArray(
- ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer,
- numSpillThreshold)
+ val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
--- End diff --
Thanks for the feedback @dongjoon-hyun. I added some details to the description.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22617
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22617
**[Test build #97504 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97504/testReport)** for PR 22617 at commit [`1c30755`](https://github.com/apache/spark/commit/1c307553d5aa64c2365b14084bb79569f6c14be1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRow...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22617
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/22617
Retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRow...
Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on the issue:
https://github.com/apache/spark/pull/22617
cc @dongjoon-hyun @seancxmao
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22617
**[Test build #97438 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97438/testReport)** for PR 22617 at commit [`1c30755`](https://github.com/apache/spark/commit/1c307553d5aa64c2365b14084bb79569f6c14be1).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22617
**[Test build #97438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97438/testReport)** for PR 22617 at commit [`1c30755`](https://github.com/apache/spark/commit/1c307553d5aa64c2365b14084bb79569f6c14be1).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22617
**[Test build #97348 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97348/testReport)** for PR 22617 at commit [`f016e20`](https://github.com/apache/spark/commit/f016e20cbb221d4cdf8eeb7bbdc9a4ab6a6cde24).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22617
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22617
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97504/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRow...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22617
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendO...
Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on a diff in the pull request:
https://github.com/apache/spark/pull/22617#discussion_r228944791
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArrayBenchmark.scala ---
@@ -68,9 +100,7 @@ object ExternalAppendOnlyUnsafeRowArrayBenchmark {
benchmark.addCase("ExternalAppendOnlyUnsafeRowArray") { _: Int =>
var sum = 0L
for (_ <- 0L until iterations) {
- val array = new ExternalAppendOnlyUnsafeRowArray(
- ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer,
- numSpillThreshold)
+ val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
--- End diff --
Actually, following my logic in the description, I think
```
val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
```
should be changed to
```
val array = new ExternalAppendOnlyUnsafeRowArray(0, numSpillThreshold)
```
in `testAgainstRawUnsafeExternalSorter` in "WITH SPILL" cases so as to compare `ExternalAppendOnlyUnsafeRowArray` to `UnsafeExternalSorter` when it behaves so.
But would be great if someone could confirm this idea.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on the issue:
https://github.com/apache/spark/pull/22617
@dongjoon-hyun , @kiszk could you please help me how take a step forward with this PR?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by wangyum <gi...@git.apache.org>.
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/22617
Retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendO...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22617#discussion_r229419433
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArrayBenchmark.scala ---
@@ -68,9 +100,7 @@ object ExternalAppendOnlyUnsafeRowArrayBenchmark {
benchmark.addCase("ExternalAppendOnlyUnsafeRowArray") { _: Int =>
var sum = 0L
for (_ <- 0L until iterations) {
- val array = new ExternalAppendOnlyUnsafeRowArray(
- ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer,
- numSpillThreshold)
+ val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
--- End diff --
In that case, we need two-step comparision.
1. Refactoring only to ensure no regression.
2. Change that value to check the performance value difference.
Could you rollback this line and let us finish `Step 1` first?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRow...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22617
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22617
Retest this please.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22617
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97438/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22617
**[Test build #97504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97504/testReport)** for PR 22617 at commit [`1c30755`](https://github.com/apache/spark/commit/1c307553d5aa64c2365b14084bb79569f6c14be1).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendO...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22617#discussion_r228768273
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArrayBenchmark.scala ---
@@ -68,9 +100,7 @@ object ExternalAppendOnlyUnsafeRowArrayBenchmark {
benchmark.addCase("ExternalAppendOnlyUnsafeRowArray") { _: Int =>
var sum = 0L
for (_ <- 0L until iterations) {
- val array = new ExternalAppendOnlyUnsafeRowArray(
- ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer,
- numSpillThreshold)
+ val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
--- End diff --
Hi, @peter-toth . Could you explain why we need to replace `ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer` with `numSpillThreshold` here? Actually, this is not a obvious refactoring.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org