Posted to reviews@spark.apache.org by gaborgsomogyi <gi...@git.apache.org> on 2017/12/19 16:38:05 UTC

[GitHub] spark pull request #20022: Add unit test for Window spilling

GitHub user gaborgsomogyi opened a pull request:

    https://github.com/apache/spark/pull/20022

    Add unit test for Window spilling

    ## What changes were proposed in this pull request?
    
    There is already a test that uses window spilling, but the test coverage is not ideal.
    
    This PR fixes the existing test and adds additional cases.
    
    ## How was this patch tested?
    
    Automated: passes Jenkins.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gaborgsomogyi/spark SPARK-22363

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20022.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20022
    
----
commit db4c71105066dc2dbe423bb3bf5beaef2179dc82
Author: Gabor Somogyi <ga...@gmail.com>
Date:   2017-12-19T09:42:02Z

    Add unit test for Window spilling

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85540/
    Test FAILed.


---



[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...

Posted by gaborgsomogyi <gi...@git.apache.org>.
Github user gaborgsomogyi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20022#discussion_r158525031
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
    @@ -518,9 +519,46 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
           Seq(Row(3, "1", null, 3.0, 4.0, 3.0), Row(5, "1", false, 4.0, 5.0, 5.0)))
       }
     
    +  test("Window spill with less than the inMemoryThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "2",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold but less than the spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold and spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "1") {
    +      assertSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
       test("SPARK-21258: complex object in combination with spilling") {
         // Make sure we trigger the spilling path.
    -    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "17") {
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "0",
    --- End diff --
    
    WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD controls how many items are guaranteed to be kept in memory. If this limit is not hit, spilling is not considered.
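
To make the interplay of the two thresholds concrete, here is a minimal, self-contained Scala sketch. It is a simplified model, not Spark's implementation: `TwoThresholdBuffer`, `TwoThresholdDemo`, and `spillsFor` are hypothetical names, and the boundary rule (spill before an insert once the spill-capable store already holds `spillThreshold` rows) is an assumption chosen to agree with the three test cases in the diff above, each of which has two rows per partition.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical model of a row buffer governed by two thresholds.
// Assumption: this only mirrors the behavior described in the comment above;
// it does not reproduce Spark's actual buffer or sorter classes.
final class TwoThresholdBuffer(inMemoryThreshold: Int, spillThreshold: Int) {
  private val inMemory = ArrayBuffer.empty[Int]  // rows guaranteed to stay in memory
  private val spillable = ArrayBuffer.empty[Int] // rows held by the spill-capable store
  private var switched = false                   // true once the in-memory limit was hit
  private var spills = 0

  // Assumed boundary rule: spill first if the store is already full, then insert.
  private def insertSpillable(row: Int): Unit = {
    if (spillable.size >= spillThreshold) {
      spills += 1
      spillable.clear()
    }
    spillable += row
  }

  def add(row: Int): Unit = {
    if (!switched && inMemory.size < inMemoryThreshold) {
      // Below the in-memory threshold: spilling is not considered at all.
      inMemory += row
    } else {
      if (!switched) {
        // First row past the limit: move the buffered rows to the spill-capable store.
        switched = true
        inMemory.foreach(insertSpillable)
        inMemory.clear()
      }
      insertSpillable(row)
    }
  }

  def spillCount: Int = spills
}

object TwoThresholdDemo {
  // Adds `rows` rows to a fresh buffer and reports how many spills occurred.
  def spillsFor(rows: Int, inMemoryThreshold: Int, spillThreshold: Int): Int = {
    val buffer = new TwoThresholdBuffer(inMemoryThreshold, spillThreshold)
    (1 to rows).foreach(buffer.add)
    buffer.spillCount
  }

  def main(args: Array[String]): Unit = {
    println(spillsFor(rows = 2, inMemoryThreshold = 2, spillThreshold = 2)) // 0
    println(spillsFor(rows = 2, inMemoryThreshold = 1, spillThreshold = 2)) // 0
    println(spillsFor(rows = 2, inMemoryThreshold = 1, spillThreshold = 1)) // 1
  }
}
```

Under this model, an in-memory threshold of 0 sends every row straight to the spill-capable store, so whether a spill happens is decided by the spill threshold alone.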


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    LGTM


---



[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20022#discussion_r158584088
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
    @@ -518,9 +519,46 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
           Seq(Row(3, "1", null, 3.0, 4.0, 3.0), Row(5, "1", false, 4.0, 5.0, 5.0)))
       }
     
    +  test("Window spill with less than the inMemoryThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "2",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold but less than the spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold and spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "1") {
    +      assertSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
       test("SPARK-21258: complex object in combination with spilling") {
         // Make sure we trigger the spilling path.
    -    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "17") {
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "0",
    --- End diff --
    
    Yeah, I mean, how about setting it to 1 instead of 0?


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    ok to test


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    **[Test build #85540 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85540/testReport)** for PR 20022 at commit [`a4a0cfc`](https://github.com/apache/spark/commit/a4a0cfc9f8825257114a0fc778a93a91c6f0455f).


---



[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20022#discussion_r159134493
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
    @@ -518,9 +519,46 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
           Seq(Row(3, "1", null, 3.0, 4.0, 3.0), Row(5, "1", false, 4.0, 5.0, 5.0)))
       }
     
    +  test("Window spill with less than the inMemoryThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "2",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold but less than the spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold and spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "1") {
    +      assertSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
       test("SPARK-21258: complex object in combination with spilling") {
         // Make sure we trigger the spilling path.
    -    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "17") {
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "0",
    --- End diff --
    
    We can accept any value; there is no limit. Both are fine.


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by gaborgsomogyi <gi...@git.apache.org>.
Github user gaborgsomogyi commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    @gatorsmile @cloud-fan @jiangxb1987 Thanks for the help!


---



[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20022#discussion_r159134530
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
    @@ -518,9 +519,46 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
           Seq(Row(3, "1", null, 3.0, 4.0, 3.0), Row(5, "1", false, 4.0, 5.0, 5.0)))
       }
     
    +  test("Window spill with less than the inMemoryThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "2",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold but less than the spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold and spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "1") {
    +      assertSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    --- End diff --
    
    Normally, we would create a helper function to avoid the duplicated code. Since the test cases are pretty small, this is also fine.
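
For illustration, the helper suggested here might look like the sketch below. `checkWindowSpill` is a hypothetical name that is not part of this PR. The fragment is not standalone: it assumes it is placed inside `DataFrameWindowFunctionsSuite`, so that `withSQLConf`, `assertSpilled`, `assertNotSpilled`, `sparkContext`, and the implicits used in the diff are in scope.

```scala
// Hypothetical helper (not in the PR); assumed to live inside
// DataFrameWindowFunctionsSuite so the suite's utilities are in scope.
private def checkWindowSpill(
    inMemoryThreshold: Int,
    spillThreshold: Int,
    expectSpill: Boolean): Unit = {
  // Same data and window definition as the three tests in the diff.
  val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
  val window = Window.partitionBy($"key").orderBy($"value")

  withSQLConf(
    SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> inMemoryThreshold.toString,
    SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> spillThreshold.toString) {
    // The query is identical in all cases; only the expectation differs.
    def runQuery(): Unit = df.select($"key", sum("value").over(window)).collect()
    if (expectSpill) {
      assertSpilled(sparkContext, "select")(runQuery())
    } else {
      assertNotSpilled(sparkContext, "select")(runQuery())
    }
  }
}

test("Window spill with less than the inMemoryThreshold") {
  checkWindowSpill(inMemoryThreshold = 2, spillThreshold = 2, expectSpill = false)
}

test("Window spill with more than the inMemoryThreshold and spillThreshold") {
  checkWindowSpill(inMemoryThreshold = 1, spillThreshold = 1, expectSpill = true)
}
```

The trade-off is the one noted in the comment: with only three short cases, the explicit tests are arguably just as readable as a parameterized helper.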


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    **[Test build #85547 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85547/testReport)** for PR 20022 at commit [`a4a0cfc`](https://github.com/apache/spark/commit/a4a0cfc9f8825257114a0fc778a93a91c6f0455f).


---



[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20022#discussion_r158516114
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
    @@ -518,9 +519,46 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
           Seq(Row(3, "1", null, 3.0, 4.0, 3.0), Row(5, "1", false, 4.0, 5.0, 5.0)))
       }
     
    +  test("Window spill with less than the inMemoryThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "2",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold but less than the spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold and spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "1") {
    +      assertSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
       test("SPARK-21258: complex object in combination with spilling") {
         // Make sure we trigger the spilling path.
    -    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "17") {
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "0",
    --- End diff --
    
    Why should we set this value to 0?


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    Thanks! Merged to master.


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    retest this please


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    **[Test build #85547 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85547/testReport)** for PR 20022 at commit [`a4a0cfc`](https://github.com/apache/spark/commit/a4a0cfc9f8825257114a0fc778a93a91c6f0455f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20022


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    **[Test build #85540 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85540/testReport)** for PR 20022 at commit [`a4a0cfc`](https://github.com/apache/spark/commit/a4a0cfc9f8825257114a0fc778a93a91c6f0455f).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by gaborgsomogyi <gi...@git.apache.org>.
Github user gaborgsomogyi commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    cc @jiangxb1987 @gatorsmile @hvanhovell 


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85547/
    Test PASSed.


---



[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...

Posted by gaborgsomogyi <gi...@git.apache.org>.
Github user gaborgsomogyi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20022#discussion_r158585754
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
    @@ -518,9 +519,46 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
           Seq(Row(3, "1", null, 3.0, 4.0, 3.0), Row(5, "1", false, 4.0, 5.0, 5.0)))
       }
     
    +  test("Window spill with less than the inMemoryThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "2",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold but less than the spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
    +      assertNotSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
    +  test("Window spill with more than the inMemoryThreshold and spillThreshold") {
    +    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
    +    val window = Window.partitionBy($"key").orderBy($"value")
    +
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
    +      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "1") {
    +      assertSpilled(sparkContext, "select") {
    +        df.select($"key", sum("value").over(window)).collect()
    +      }
    +    }
    +  }
    +
       test("SPARK-21258: complex object in combination with spilling") {
         // Make sure we trigger the spilling path.
    -    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "17") {
    +    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "0",
    --- End diff --
    
    Ahh, now I see 🙂 Sure, I'll set it soon.


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20022
  
    OK to test


---
