You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jiangxb1987 <gi...@git.apache.org> on 2017/08/02 07:40:18 UTC

[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

GitHub user jiangxb1987 opened a pull request:

    https://github.com/apache/spark/pull/18814

    [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API should allow literal boundary

    ## What changes were proposed in this pull request?
    
    Window rangeBetween() API should allow literal boundary, that means, the window range frame can calculate frame of double/date/timestamp.
    
    Example of the use case can be:
    ```
    SELECT
    	val_timestamp,
    	cate,
    	avg(val_timestamp) OVER(PARTITION BY cate ORDER BY val_timestamp RANGE BETWEEN CURRENT ROW AND interval 23 days 4 hours FOLLOWING)
    FROM testData
    ```
    
    This PR refactors the Window `rangeBetween` and `rowsBetween` API, while the legacy user code should still be valid.
    
    ## How was this patch tested?
    
    Add new test cases both in `DataFrameWindowFunctionsSuite` and in `window.sql`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiangxb1987/spark literal-boundary

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18814.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18814
    
----
commit a6d111b656320a06f197f18ab149803ec6b0a913
Author: Xingbo Jiang <xi...@databricks.com>
Date:   2017-08-02T06:16:02Z

    refactor Window rangeBetween API

commit 7893e5dd7aa37daa6c176e7228a2a71938d78514
Author: Xingbo Jiang <xi...@databricks.com>
Date:   2017-08-02T07:33:57Z

    code cleanup

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80404 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80404/testReport)** for PR 18814 at commit [`cec519b`](https://github.com/apache/spark/commit/cec519b8cfbf1ed2a3107056ef5281a5be75ec54).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80404/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131595661
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala ---
    @@ -139,7 +141,12 @@ case class WindowExec(
             }
     
             // Create the projection which returns the current 'value' modified by adding the offset.
    -        val boundExpr = Add(expr, Cast(boundOffset, expr.dataType))
    +        val boundExpr = (expr.dataType, boundOffset.dataType) match {
    +          case (DateType, IntegerType) => DateAdd(expr, boundOffset)
    +          case (TimestampType, CalendarIntervalType) =>
    +            TimeAdd(expr, boundOffset, Some(DateTimeUtils.defaultTimeZone.getID))
    --- End diff --
    
    shall we use session timezone here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80340 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80340/testReport)** for PR 18814 at commit [`2cf3d78`](https://github.com/apache/spark/commit/2cf3d787df12603a37a3b5635fe097c4b96f440d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80354 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80354/testReport)** for PR 18814 at commit [`3b3d822`](https://github.com/apache/spark/commit/3b3d8228242a768de1cf2ceb6c194791308867cc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80343 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80343/testReport)** for PR 18814 at commit [`c2c69ef`](https://github.com/apache/spark/commit/c2c69ef5fd847f3cea208341db9b28e37478b7e2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131598096
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -777,6 +778,29 @@ object functions {
       //////////////////////////////////////////////////////////////////////////////////////////////
       // Window functions
       //////////////////////////////////////////////////////////////////////////////////////////////
    +  /**
    +   * Window function: returns the value that represents the first row in the window partition.
    +   *
    +   * @group window_funcs
    +   * @since 2.3.0
    +   */
    +  def unboundedPreceding(): Column = lit(Window.unboundedPreceding)
    --- End diff --
    
    can't we just return `Column(UnboundedPreceding)`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r130913456
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala ---
    @@ -84,7 +84,7 @@ object Window {
        *
        * @since 2.1.0
        */
    -  def unboundedPreceding: Long = Long.MinValue
    --- End diff --
    
    We are unable to change the public interface. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131931907
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -210,6 +210,59 @@ class WindowSpec private[sql](
       }
     
       /**
    +   * Defines the frame boundaries, from `start` (inclusive) to `end` (inclusive).
    +   *
    +   * Both `start` and `end` are relative to the current row. For example, "lit(0)" means
    +   * "current row", while "lit(-1)" means one off before the current row, and "lit(5)" means the
    +   * five off after the current row.
    +   *
    +   * Users should use `unboundedPreceding()`, `unboundedFollowing()`, and `currentRow()` from
    +   * [[org.apache.spark.sql.functions]] to specify special boundary values, literals are not
    +   * transformed to [[org.apache.spark.sql.catalyst.expressions.SpecialFrameBoundary]]s.
    +   *
    +   * A range-based boundary is based on the actual value of the ORDER BY
    +   * expression(s). An offset is used to alter the value of the ORDER BY expression, for
    +   * instance if the current order by expression has a value of 10 and the lower bound offset
    +   * is -3, the resulting lower bound for the current row will be 10 - 3 = 7. This however puts a
    +   * number of constraints on the ORDER BY expressions: there can be only one expression and this
    +   * expression must have a numerical/date/timestamp data type. An exception can be made when the
    +   * offset is unbounded, because no value modification is needed, in this case multiple and
    +   * non-literal ORDER BY expression are allowed.
    --- End diff --
    
    ditto


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80211/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80338 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80338/testReport)** for PR 18814 at commit [`2cf3d78`](https://github.com/apache/spark/commit/2cf3d787df12603a37a3b5635fe097c4b96f440d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131701948
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala ---
    @@ -210,6 +210,57 @@ object Window {
         spec.rangeBetween(start, end)
       }
     
    +  /**
    +   * Creates a [[WindowSpec]] with the frame boundaries defined,
    +   * from `start` (inclusive) to `end` (inclusive).
    +   *
    +   * Both `start` and `end` are relative to the current row. For example, "lit(0)" means
    +   * "current row", while "lit(-1)" means one off before the current row, and "lit(5)" means the
    +   * five off after the current row.
    +   *
    +   * Users should use `unboundedPreceding()`, `unboundedFollowing()`, and `currentRow()` to specify
    --- End diff --
    
    let's mention that these 3 functions are from `object functions`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    thanks, merging to master!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80354 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80354/testReport)** for PR 18814 at commit [`3b3d822`](https://github.com/apache/spark/commit/3b3d8228242a768de1cf2ceb6c194791308867cc).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80338 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80338/testReport)** for PR 18814 at commit [`2cf3d78`](https://github.com/apache/spark/commit/2cf3d787df12603a37a3b5635fe097c4b96f440d).
     * This patch **fails to generate documentation**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131931115
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala ---
    @@ -167,17 +167,17 @@ object Window {
        * current row.
        *
        * We recommend users use `Window.unboundedPreceding`, `Window.unboundedFollowing`,
    -   * and `Window.currentRow` to specify special boundary values, rather than using integral
    -   * values directly.
    +   * and `Window.currentRow` to specify special boundary values, rather than using long values
    +   * directly.
        *
        * A range-based boundary is based on the actual value of the ORDER BY
        * expression(s). An offset is used to alter the value of the ORDER BY expression, for
        * instance if the current order by expression has a value of 10 and the lower bound offset
        * is -3, the resulting lower bound for the current row will be 10 - 3 = 7. This however puts a
        * number of constraints on the ORDER BY expressions: there can be only one expression and this
    -   * expression must have a numerical data type. An exception can be made when the offset is 0,
    -   * because no value modification is needed, in this case multiple and non-numeric ORDER BY
    -   * expression are allowed.
    +   * expression must have a numerical data type. An exception can be made when the offset is
    +   * unbounded, because no value modification is needed, in this case multiple and non-literal
    --- End diff --
    
    `non-literal` -> `non-numeric`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80386/testReport)** for PR 18814 at commit [`3e3b58c`](https://github.com/apache/spark/commit/3e3b58c8911e266b2af985da3dec53418a608d2b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80340 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80340/testReport)** for PR 18814 at commit [`2cf3d78`](https://github.com/apache/spark/commit/2cf3d787df12603a37a3b5635fe097c4b96f440d).
     * This patch **fails to generate documentation**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131858708
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -146,22 +146,22 @@ class WindowSpec private[sql](
       /**
        * Defines the frame boundaries, from `start` (inclusive) to `end` (inclusive).
        *
    -   * Both `start` and `end` are relative from the current row. For example, "0" means "current row",
    -   * while "-1" means one off before the current row, and "5" means the five off after the
    -   * current row.
    +   * Both `start` and `end` are relative from the current row. For example, "0" means
    +   * "current row", while "-1" means one off before the current row, and "5" means the five off
    +   * after the current row.
    --- End diff --
    
    The original line length exceeds the 100 charactors limit...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131678887
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala ---
    @@ -210,6 +210,17 @@ object Window {
         spec.rangeBetween(start, end)
       }
     
    +  /**
    +   * @param start boundary start, inclusive. The frame is unbounded if the expression is
    --- End diff --
    
    we should copy-paste the document from `rangeBetween(start: Long, end: Long)` to here, and replace `long values` with `literal values`. We should also update the examle to show how to use special range boundary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131679361
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -210,6 +210,20 @@ class WindowSpec private[sql](
       }
     
       /**
    +   * @param start boundary start, inclusive. The frame is unbounded if the expression is
    --- End diff --
    
    ditto


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80338/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131702902
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -89,7 +89,11 @@ case class WindowSpecDefinition(
         elements.mkString("(", " ", ")")
       }
     
    -  private def isValidFrameType(ft: DataType): Boolean = orderSpec.head.dataType == ft
    +  private def isValidFrameType(ft: DataType): Boolean = (orderSpec.head.dataType, ft) match {
    +    case (DateType, IntegerType) => true
    --- End diff --
    
    can we check with other databases? seems `RANGE BETWEEN CURRENT ROW AND interval 2 days FOLLOWING` is reasonable for date type column.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80386/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80154 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80154/testReport)** for PR 18814 at commit [`7893e5d`](https://github.com/apache/spark/commit/7893e5dd7aa37daa6c176e7228a2a71938d78514).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18814


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    
    (test comment to test PR dashboard linking)
    --
    
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80343/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131931843
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -146,22 +146,22 @@ class WindowSpec private[sql](
       /**
        * Defines the frame boundaries, from `start` (inclusive) to `end` (inclusive).
        *
    -   * Both `start` and `end` are relative from the current row. For example, "0" means "current row",
    -   * while "-1" means one off before the current row, and "5" means the five off after the
    -   * current row.
    +   * Both `start` and `end` are relative from the current row. For example, "0" means
    +   * "current row", while "-1" means one off before the current row, and "5" means the five off
    +   * after the current row.
        *
        * We recommend users use `Window.unboundedPreceding`, `Window.unboundedFollowing`,
    -   * and `Window.currentRow` to specify special boundary values, rather than using integral
    -   * values directly.
    +   * and `Window.currentRow` to specify special boundary values, rather than using long values
    +   * directly.
        *
        * A range-based boundary is based on the actual value of the ORDER BY
        * expression(s). An offset is used to alter the value of the ORDER BY expression, for
        * instance if the current order by expression has a value of 10 and the lower bound offset
        * is -3, the resulting lower bound for the current row will be 10 - 3 = 7. This however puts a
        * number of constraints on the ORDER BY expressions: there can be only one expression and this
    -   * expression must have a numerical data type. An exception can be made when the offset is 0,
    -   * because no value modification is needed, in this case multiple and non-numeric ORDER BY
    -   * expression are allowed.
    +   * expression must have a numerical data type. An exception can be made when the offset is
    +   * unbounded, because no value modification is needed, in this case multiple and non-literal
    --- End diff --
    
    ditto


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131613111
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -821,9 +821,12 @@ object TypeCoercion {
         }
     
         private def createBoundaryCast(boundary: Expression, dt: DataType): Expression = {
    -      boundary match {
    -        case e: SpecialFrameBoundary => e
    -        case e: Expression if e.dataType != dt && Cast.canCast(e.dataType, dt) => Cast(e, dt)
    +      (boundary, dt) match {
    +        case (e: SpecialFrameBoundary, _) => e
    +        case (e: Expression, _: DateType) => e
    +        case (e: Expression, _: TimestampType) => e
    +        case (e: Expression, t: DataType) if e.dataType != t && Cast.canCast(e.dataType, t) =>
    --- End diff --
    
    `Cast.canCast()` only accepts `DataType` as input param.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80340/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131595259
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -821,9 +821,12 @@ object TypeCoercion {
         }
     
         private def createBoundaryCast(boundary: Expression, dt: DataType): Expression = {
    -      boundary match {
    -        case e: SpecialFrameBoundary => e
    -        case e: Expression if e.dataType != dt && Cast.canCast(e.dataType, dt) => Cast(e, dt)
    +      (boundary, dt) match {
    +        case (e: SpecialFrameBoundary, _) => e
    +        case (e: Expression, _: DateType) => e
    +        case (e: Expression, _: TimestampType) => e
    +        case (e: Expression, t: DataType) if e.dataType != t && Cast.canCast(e.dataType, t) =>
    --- End diff --
    
    nit: just `t` instead of `t: DataType`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131702233
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -146,22 +146,22 @@ class WindowSpec private[sql](
       /**
        * Defines the frame boundaries, from `start` (inclusive) to `end` (inclusive).
        *
    -   * Both `start` and `end` are relative from the current row. For example, "0" means "current row",
    -   * while "-1" means one off before the current row, and "5" means the five off after the
    -   * current row.
    +   * Both `start` and `end` are relative from the current row. For example, "0" means
    +   * "current row", while "-1" means one off before the current row, and "5" means the five off
    +   * after the current row.
    --- End diff --
    
    why this change? seems you only change the line length?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131614060
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -89,7 +89,11 @@ case class WindowSpecDefinition(
         elements.mkString("(", " ", ")")
       }
     
    -  private def isValidFrameType(ft: DataType): Boolean = orderSpec.head.dataType == ft
    +  private def isValidFrameType(ft: DataType): Boolean = (orderSpec.head.dataType, ft) match {
    +    case (DateType, IntegerType) => true
    --- End diff --
    
    I think `CalendarIntervalType` is only used with `TimestampType`, cc @hvanhovell @ueshin 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131596386
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala ---
    @@ -159,6 +159,17 @@ object Window {
       }
     
       /**
    +   * @param start boundary start, inclusive. The frame is unbounded if this evals to
    +   *              the minimum long value (`Window.unboundedPreceding`).
    --- End diff --
    
    `.. the minimum long value ...` this is not true now, we use a special value `functions.unboundedPreceding`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131864180
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -89,7 +89,11 @@ case class WindowSpecDefinition(
         elements.mkString("(", " ", ")")
       }
     
    -  private def isValidFrameType(ft: DataType): Boolean = orderSpec.head.dataType == ft
    +  private def isValidFrameType(ft: DataType): Boolean = (orderSpec.head.dataType, ft) match {
    +    case (DateType, IntegerType) => true
    --- End diff --
    
    I checked and HIVE support that, so let's also allow this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    (test comment to test PR dashboard linking)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80404 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80404/testReport)** for PR 18814 at commit [`cec519b`](https://github.com/apache/spark/commit/cec519b8cfbf1ed2a3107056ef5281a5be75ec54).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80211 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80211/testReport)** for PR 18814 at commit [`f247191`](https://github.com/apache/spark/commit/f247191215b93cb2472aaffe98fbaed6ea5d3a59).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80343 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80343/testReport)** for PR 18814 at commit [`c2c69ef`](https://github.com/apache/spark/commit/c2c69ef5fd847f3cea208341db9b28e37478b7e2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80386/testReport)** for PR 18814 at commit [`3e3b58c`](https://github.com/apache/spark/commit/3e3b58c8911e266b2af985da3dec53418a608d2b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131678499
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala ---
    @@ -167,17 +167,17 @@ object Window {
        * current row.
        *
        * We recommend users use `Window.unboundedPreceding`, `Window.unboundedFollowing`,
    -   * and `Window.currentRow` to specify special boundary values, rather than using integral
    +   * and `Window.currentRow` to specify special boundary values, rather than using literal
    --- End diff --
    
    long values?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80211 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80211/testReport)** for PR 18814 at commit [`f247191`](https://github.com/apache/spark/commit/f247191215b93cb2472aaffe98fbaed6ea5d3a59).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131597007
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala ---
    @@ -159,6 +159,17 @@ object Window {
       }
     
       /**
    +   * @param start boundary start, inclusive. The frame is unbounded if this evals to
    +   *              the minimum long value (`Window.unboundedPreceding`).
    +   * @param end boundary end, inclusive. The frame is unbounded if this evals to the
    +   *            maximum long value (`Window.unboundedFollowing`).
    +   * @since 2.3.0
    +   */
    +  def rowsBetween(start: Column, end: Column): WindowSpec = {
    --- End diff --
    
    hmmm, how useful is this API? `rowsBetween` only accept integer and max/min long, and it's ok if we only have `rowsBetween(start: Long, end: Long)`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131701648
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala ---
    @@ -167,17 +167,17 @@ object Window {
        * current row.
        *
        * We recommend users use `Window.unboundedPreceding`, `Window.unboundedFollowing`,
    -   * and `Window.currentRow` to specify special boundary values, rather than using integral
    -   * values directly.
    +   * and `Window.currentRow` to specify special boundary values, rather than using long values
    +   * directly.
        *
        * A range-based boundary is based on the actual value of the ORDER BY
        * expression(s). An offset is used to alter the value of the ORDER BY expression, for
        * instance if the current order by expression has a value of 10 and the lower bound offset
        * is -3, the resulting lower bound for the current row will be 10 - 3 = 7. This however puts a
        * number of constraints on the ORDER BY expressions: there can be only one expression and this
    -   * expression must have a numerical data type. An exception can be made when the offset is 0,
    -   * because no value modification is needed, in this case multiple and non-numeric ORDER BY
    -   * expression are allowed.
    +   * expression must have a literal data type. An exception can be made when the offset is
    --- End diff --
    
    literal -> numerical or date or timestamp data type


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131931683
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/Window.scala ---
    @@ -210,6 +210,57 @@ object Window {
         spec.rangeBetween(start, end)
       }
     
    +  /**
    +   * Creates a [[WindowSpec]] with the frame boundaries defined,
    +   * from `start` (inclusive) to `end` (inclusive).
    +   *
    +   * Both `start` and `end` are relative to the current row. For example, "lit(0)" means
    +   * "current row", while "lit(-1)" means one off before the current row, and "lit(5)" means the
    +   * five off after the current row.
    +   *
    +   * Users should use `unboundedPreceding()`, `unboundedFollowing()`, and `currentRow()` from
    +   * [[org.apache.spark.sql.functions]] to specify special boundary values, literals are not
    +   * transformed to [[org.apache.spark.sql.catalyst.expressions.SpecialFrameBoundary]]s.
    +   *
    +   * A range-based boundary is based on the actual value of the ORDER BY
    +   * expression(s). An offset is used to alter the value of the ORDER BY expression, for
    +   * instance if the current order by expression has a value of 10 and the lower bound offset
    +   * is -3, the resulting lower bound for the current row will be 10 - 3 = 7. This however puts a
    +   * number of constraints on the ORDER BY expressions: there can be only one expression and this
    +   * expression must have a numerical/date/timestamp data type. An exception can be made when the
    +   * offset is unbounded, because no value modification is needed, in this case multiple and
    +   * non-literal ORDER BY expression are allowed.
    --- End diff --
    
    non-numerical/date/timestamp data type


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80354/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131595450
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -89,7 +89,11 @@ case class WindowSpecDefinition(
         elements.mkString("(", " ", ")")
       }
     
    -  private def isValidFrameType(ft: DataType): Boolean = orderSpec.head.dataType == ft
    +  private def isValidFrameType(ft: DataType): Boolean = (orderSpec.head.dataType, ft) match {
    +    case (DateType, IntegerType) => true
    --- End diff --
    
    don't we need `CalendarIntervalType` for `DateType` too?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131676246
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -821,9 +821,12 @@ object TypeCoercion {
         }
     
         private def createBoundaryCast(boundary: Expression, dt: DataType): Expression = {
    -      boundary match {
    -        case e: SpecialFrameBoundary => e
    -        case e: Expression if e.dataType != dt && Cast.canCast(e.dataType, dt) => Cast(e, dt)
    +      (boundary, dt) match {
    +        case (e: SpecialFrameBoundary, _) => e
    +        case (e, _: DateType) => e
    +        case (e, _: TimestampType) => e
    +        case (e: Expression, t: DataType) if e.dataType != t && Cast.canCast(e.dataType, t) =>
    --- End diff --
    
    `t` is always `DataType`, so we don't need to write `t: DataType`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131595197
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -821,9 +821,12 @@ object TypeCoercion {
         }
     
         private def createBoundaryCast(boundary: Expression, dt: DataType): Expression = {
    -      boundary match {
    -        case e: SpecialFrameBoundary => e
    -        case e: Expression if e.dataType != dt && Cast.canCast(e.dataType, dt) => Cast(e, dt)
    +      (boundary, dt) match {
    +        case (e: SpecialFrameBoundary, _) => e
    +        case (e: Expression, _: DateType) => e
    --- End diff --
    
    nit: just `e` instead of `e: Expression`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80154/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131876058
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -89,7 +89,11 @@ case class WindowSpecDefinition(
         elements.mkString("(", " ", ")")
       }
     
    -  private def isValidFrameType(ft: DataType): Boolean = orderSpec.head.dataType == ft
    +  private def isValidFrameType(ft: DataType): Boolean = (orderSpec.head.dataType, ft) match {
    +    case (DateType, IntegerType) => true
    --- End diff --
    
    Yea, we should allow this, but this is a bit corner case and not very straight-forward to implement, maybe we can leave this as a follow-up?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetwee...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18814#discussion_r131615511
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -89,7 +89,11 @@ case class WindowSpecDefinition(
         elements.mkString("(", " ", ")")
       }
     
    -  private def isValidFrameType(ft: DataType): Boolean = orderSpec.head.dataType == ft
    +  private def isValidFrameType(ft: DataType): Boolean = (orderSpec.head.dataType, ft) match {
    +    case (DateType, IntegerType) => true
    --- End diff --
    
    I think `CalendarIntervalType` is only used with `TimestampType`, cc @hvanhovell @ueshin 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18814
  
    **[Test build #80154 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80154/testReport)** for PR 18814 at commit [`7893e5d`](https://github.com/apache/spark/commit/7893e5dd7aa37daa6c176e7228a2a71938d78514).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org