You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by peter-toth <gi...@git.apache.org> on 2018/10/02 20:59:58 UTC

[GitHub] spark pull request #22617: [SPARK-25484][TEST] Refactor ExternalAppendOnlyUn...

GitHub user peter-toth opened a pull request:

    https://github.com/apache/spark/pull/22617

    [SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRowArrayBenchmark

    ## What changes were proposed in this pull request?
    
    1. Refactor ExternalAppendOnlyUnsafeRowArrayBenchmark
    2. Fixes issue in `ExternalAppendOnlyUnsafeRowArray` creation
    
    ## How was this patch tested?
    
    Manually tested and regenerated results.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/peter-toth/spark SPARK-25484

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22617.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22617
    
----
commit b9991cac48b6e28c2dfdbdd66e4f26008283643f
Author: Peter Toth <pe...@...>
Date:   2018-10-02T14:31:44Z

    [SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRowArrayBenchmark

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97348/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendO...

Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22617#discussion_r229480460
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArrayBenchmark.scala ---
    @@ -68,9 +100,7 @@ object ExternalAppendOnlyUnsafeRowArrayBenchmark {
         benchmark.addCase("ExternalAppendOnlyUnsafeRowArray") { _: Int =>
           var sum = 0L
           for (_ <- 0L until iterations) {
    -        val array = new ExternalAppendOnlyUnsafeRowArray(
    -          ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer,
    -          numSpillThreshold)
    +        val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
    --- End diff --
    
    Ok. Reverted the change.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    **[Test build #97348 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97348/testReport)** for PR 22617 at commit [`f016e20`](https://github.com/apache/spark/commit/f016e20cbb221d4cdf8eeb7bbdc9a4ab6a6cde24).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendO...

Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22617#discussion_r228889856
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArrayBenchmark.scala ---
    @@ -68,9 +100,7 @@ object ExternalAppendOnlyUnsafeRowArrayBenchmark {
         benchmark.addCase("ExternalAppendOnlyUnsafeRowArray") { _: Int =>
           var sum = 0L
           for (_ <- 0L until iterations) {
    -        val array = new ExternalAppendOnlyUnsafeRowArray(
    -          ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer,
    -          numSpillThreshold)
    +        val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
    --- End diff --
    
    Thanks for the feedback @dongjoon-hyun. I added some details to the description.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    **[Test build #97504 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97504/testReport)** for PR 22617 at commit [`1c30755`](https://github.com/apache/spark/commit/1c307553d5aa64c2365b14084bb79569f6c14be1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRow...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRow...

Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    cc @dongjoon-hyun @seancxmao 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    **[Test build #97438 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97438/testReport)** for PR 22617 at commit [`1c30755`](https://github.com/apache/spark/commit/1c307553d5aa64c2365b14084bb79569f6c14be1).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    **[Test build #97438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97438/testReport)** for PR 22617 at commit [`1c30755`](https://github.com/apache/spark/commit/1c307553d5aa64c2365b14084bb79569f6c14be1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    **[Test build #97348 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97348/testReport)** for PR 22617 at commit [`f016e20`](https://github.com/apache/spark/commit/f016e20cbb221d4cdf8eeb7bbdc9a4ab6a6cde24).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97504/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRow...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendO...

Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22617#discussion_r228944791
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArrayBenchmark.scala ---
    @@ -68,9 +100,7 @@ object ExternalAppendOnlyUnsafeRowArrayBenchmark {
         benchmark.addCase("ExternalAppendOnlyUnsafeRowArray") { _: Int =>
           var sum = 0L
           for (_ <- 0L until iterations) {
    -        val array = new ExternalAppendOnlyUnsafeRowArray(
    -          ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer,
    -          numSpillThreshold)
    +        val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
    --- End diff --
    
    Actually, following my logic in the description, I think 
    ```
    val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
    ```
    should be changed to
    ```
    val array = new ExternalAppendOnlyUnsafeRowArray(0, numSpillThreshold)
    ```
    in `testAgainstRawUnsafeExternalSorter` in "WITH SPILL" cases so as to compare `ExternalAppendOnlyUnsafeRowArray` to `UnsafeExternalSorter` when it behaves so.
    
    But would be great if someone could confirm this idea.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    @dongjoon-hyun , @kiszk could you please help me how take a step forward with this PR?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by wangyum <gi...@git.apache.org>.
Github user wangyum commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendO...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22617#discussion_r229419433
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArrayBenchmark.scala ---
    @@ -68,9 +100,7 @@ object ExternalAppendOnlyUnsafeRowArrayBenchmark {
         benchmark.addCase("ExternalAppendOnlyUnsafeRowArray") { _: Int =>
           var sum = 0L
           for (_ <- 0L until iterations) {
    -        val array = new ExternalAppendOnlyUnsafeRowArray(
    -          ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer,
    -          numSpillThreshold)
    +        val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
    --- End diff --
    
    In that case, we need two-step comparision.
    1. Refactoring only to ensure no regression.
    2. Change that value to check the performance value difference.
    
    Could you rollback this line and let us finish `Step 1` first?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][TEST] Refactor ExternalAppendOnlyUnsafeRow...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97438/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22617
  
    **[Test build #97504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97504/testReport)** for PR 22617 at commit [`1c30755`](https://github.com/apache/spark/commit/1c307553d5aa64c2365b14084bb79569f6c14be1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendO...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22617#discussion_r228768273
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArrayBenchmark.scala ---
    @@ -68,9 +100,7 @@ object ExternalAppendOnlyUnsafeRowArrayBenchmark {
         benchmark.addCase("ExternalAppendOnlyUnsafeRowArray") { _: Int =>
           var sum = 0L
           for (_ <- 0L until iterations) {
    -        val array = new ExternalAppendOnlyUnsafeRowArray(
    -          ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer,
    -          numSpillThreshold)
    +        val array = new ExternalAppendOnlyUnsafeRowArray(numSpillThreshold, numSpillThreshold)
    --- End diff --
    
    Hi, @peter-toth . Could you explain why we need to replace `ExternalAppendOnlyUnsafeRowArray.DefaultInitialSizeOfInMemoryBuffer` with `numSpillThreshold` here? Actually, this is not a obvious refactoring.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org