You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jiangxb1987 <gi...@git.apache.org> on 2017/07/05 09:14:14 UTC

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

GitHub user jiangxb1987 opened a pull request:

    https://github.com/apache/spark/pull/18540

    [SPARK-19451][SQL] rangeBetween method should accept Long value as boundary

    ## What changes were proposed in this pull request?
    
    Long values can be passed to `rangeBetween` as range frame boundaries, but we silently convert it to Int values, this can cause wrong results and we should fix this.
    
    Further more, we should accept any legal literal values as range frame boundaries. In this PR, we make it possible for Long values, and make accepting other DataTypes really easy to add.
    
    ## How was this patch tested?
    
    Add new tests in `DataFrameWindowFunctionsSuite` and `TypeCoercionSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiangxb1987/spark rangeFrame

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18540.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18540
    
----
commit 52c52895cfe3cae58f8b029a350949cc153971a7
Author: Xingbo Jiang <xi...@databricks.com>
Date:   2017-07-05T09:02:51Z

    rangeBetween accept literal values.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130217250
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -174,28 +191,22 @@ class WindowSpec private[sql](
        */
       // Note: when updating the doc for this method, also update Window.rangeBetween.
       def rangeBetween(start: Long, end: Long): WindowSpec = {
    --- End diff --
    
    We'll get and compute empty frames.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r126680805
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -805,4 +806,24 @@ object TypeCoercion {
           Option(ret)
         }
       }
    +
    +  /**
    +   * Cast WindowFrame boundaries to the type they operate upon.
    +   */
    +  object WindowFrameCoercion extends Rule[LogicalPlan] {
    --- End diff --
    
    Hmmm that would be kind of weird. So a user will get type coercion in its select but not in the range clause.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    It looks pretty solid. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r129078561
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
    +
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
    +case class SpecifiedWindowFrame(
    +    frameType: FrameType,
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override lazy val children: Seq[Expression] = lower :: upper :: Nil
     
    -  override def toString: String = s"$value PRECEDING"
    -}
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
     
    -/** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
     
    -  override def toString: String = "CURRENT ROW"
    -}
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    -    case UnboundedFollowing => true
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    +      case (l: Expression, u: Expression) if l.dataType != u.dataType =>
    +        TypeCheckFailure(
    +          s"Window frame bounds '$lower' and '$upper' do no not have the same data type: " +
    +            s"'${l.dataType.catalogString}' <> '${u.dataType.catalogString}'")
    +      case (l: Expression, u: Expression) if isGreaterThan(l, u) =>
    +        TypeCheckFailure(
    +          "The lower bound of a window frame must less than or equal to the upper bound")
    +      case _ => TypeCheckSuccess
    +    }
       }
     
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    +  override def sql: String = {
    +    val lowerSql = boundarySql(lower, "PRECEDING")
    +    val upperSql = boundarySql(upper, "FOLLOWING")
    +    s"${frameType.sql} BETWEEN $lowerSql AND $upperSql"
       }
     
    -  override def toString: String = "UNBOUNDED FOLLOWING"
    -}
    +  def isUnbounded: Boolean = lower == Unbounded && upper == Unbounded
     
    -/**
    - * Represents a window frame.
    - */
    -sealed trait WindowFrame
    +  def isValueBound: Boolean = valueBoundary.nonEmpty
     
    -/** Used as a placeholder when a frame specification is not defined. */
    -case object UnspecifiedFrame extends WindowFrame
    +  def isOffset: Boolean = (lower, upper) match {
    +    case (l: Expression, u: Expression) => frameType == RowFrame && l == u
    +    case _ => false
    +  }
     
    -/** A specified Window Frame. */
    -case class SpecifiedWindowFrame(
    -    frameType: FrameType,
    -    frameStart: FrameBoundary,
    -    frameEnd: FrameBoundary) extends WindowFrame {
    -
    -  /** If this WindowFrame is valid or not. */
    -  def validate: Option[String] = (frameType, frameStart, frameEnd) match {
    -    case (_, UnboundedFollowing, _) =>
    -      Some(s"$UnboundedFollowing is not allowed as the start of a Window Frame.")
    -    case (_, _, UnboundedPreceding) =>
    -      Some(s"$UnboundedPreceding is not allowed as the end of a Window Frame.")
    -    // case (RowFrame, start, end) => ??? RowFrame specific rule
    -    // case (RangeFrame, start, end) => ??? RangeFrame specific rule
    -    case (_, start, end) =>
    -      if (start.notFollows(end)) {
    -        None
    -      } else {
    -        val reason =
    -          s"The end of this Window Frame $end is smaller than the start of " +
    -          s"this Window Frame $start."
    -        Some(reason)
    -      }
    +  private def boundarySql(expr: Expression, defaultDirection: String): String = expr match {
    +    case CurrentRow => "CURRENT ROW"
    +    case Unbounded => "UNBOUNDED " + defaultDirection
    +    case UnaryMinus(n) => n.sql + " PRECEDING"
    +    case e: Expression => e.sql + " FOLLOWING"
       }
     
    -  override def toString: String = frameType match {
    -    case RowFrame => s"ROWS BETWEEN $frameStart AND $frameEnd"
    -    case RangeFrame => s"RANGE BETWEEN $frameStart AND $frameEnd"
    +  private def isGreaterThan(l: Expression, r: Expression): Boolean = {
    +    GreaterThan(l, r).eval().asInstanceOf[Boolean]
    +  }
    +
    +  private def checkBoundary(b: Expression, location: String): TypeCheckResult = b match {
    +    case e: Expression if !e.foldable && !e.isInstanceOf[SpecialFrameBoundary] =>
    --- End diff --
    
    sorry I was wrong.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Thanks! Merging to master/2.2


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80006 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80006/testReport)** for PR 18540 at commit [`a1f91cd`](https://github.com/apache/spark/commit/a1f91cd7b0f10176b551cb168bf23d2eef68c15c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80021 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80021/testReport)** for PR 18540 at commit [`fbcea1e`](https://github.com/apache/spark/commit/fbcea1ed4b279e4527a96211fd024baa7860f40f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r129209847
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -43,58 +42,54 @@ case class WindowSpecDefinition(
         orderSpec: Seq[SortOrder],
         frameSpecification: WindowFrame) extends Expression with WindowSpec with Unevaluable {
     
    -  def validate: Option[String] = frameSpecification match {
    -    case UnspecifiedFrame =>
    -      Some("Found a UnspecifiedFrame. It should be converted to a SpecifiedWindowFrame " +
    -        "during analysis. Please file a bug report.")
    -    case frame: SpecifiedWindowFrame => frame.validate.orElse {
    -      def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
    -        if (orderSpec.length > 1)  {
    -          // It is not allowed to have a value-based PRECEDING and FOLLOWING
    -          // as the boundary of a Range Window Frame.
    -          Some("This Range Window Frame only accepts at most one ORDER BY expression.")
    -        } else if (orderSpec.nonEmpty && !orderSpec.head.dataType.isInstanceOf[NumericType]) {
    -          Some("The data type of the expression in the ORDER BY clause should be a numeric type.")
    -        } else {
    -          None
    -        }
    -      }
    -
    -      (frame.frameType, frame.frameStart, frame.frameEnd) match {
    -        case (RangeFrame, vp: ValuePreceding, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, vf: ValueFollowing, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vp: ValuePreceding) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vf: ValueFollowing) => checkValueBasedBoundaryForRangeFrame()
    -        case (_, _, _) => None
    -      }
    -    }
    -  }
    -
    -  override def children: Seq[Expression] = partitionSpec ++ orderSpec
    +  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ frameSpecification
     
       override lazy val resolved: Boolean =
         childrenResolved && checkInputDataTypes().isSuccess &&
           frameSpecification.isInstanceOf[SpecifiedWindowFrame]
     
       override def nullable: Boolean = true
       override def foldable: Boolean = false
    -  override def dataType: DataType = throw new UnsupportedOperationException
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
     
    -  override def sql: String = {
    -    val partition = if (partitionSpec.isEmpty) {
    -      ""
    -    } else {
    -      "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    frameSpecification match {
    +      case UnspecifiedFrame =>
    +        TypeCheckFailure(
    +          "Cannot use an UnspecifiedFrame. This should have been converted during analysis. " +
    +            "Please file a bug report.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && !f.isUnbounded &&
    --- End diff --
    
    I think this should be `!f.isValueBound`? basically `current row and current row` is not unbound but should be allowed here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r129002412
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala ---
    @@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
         // Range/Row
         val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
         val boundaries = Seq(
    -      ("10 preceding", ValuePreceding(10), CurrentRow),
    -      ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail during analysis
    -      ("unbounded preceding", UnboundedPreceding, CurrentRow),
    -      ("unbounded following", UnboundedFollowing, CurrentRow), // Will fail during analysis
    -      ("between unbounded preceding and current row", UnboundedPreceding, CurrentRow),
    +      ("10 preceding", -Literal(10), CurrentRow),
    +      ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
    +      ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // Will fail during analysis
    --- End diff --
    
    The lower boundary would be higher than the upper boundary, previously it would fail, but we have removed this check, should add it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130214990
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -805,4 +806,27 @@ object TypeCoercion {
           Option(ret)
         }
       }
    +
    +  /**
    +   * Cast WindowFrame boundaries to the type they operate upon.
    +   */
    +  object WindowFrameCoercion extends Rule[LogicalPlan] {
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
    +      case s @ WindowSpecDefinition(_, Seq(order), SpecifiedWindowFrame(RangeFrame, lower, upper))
    +        if order.resolved =>
    --- End diff --
    
    Nit: add two more spaces before `if`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130215695
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
       }
     
       /**
    -   * Create or resolve a [[FrameBoundary]]. Simple math expressions are allowed for Value
    -   * Preceding/Following boundaries. These expressions must be constant (foldable) and return an
    -   * integer value.
    +   * Create or resolve a frame boundary expressions.
        */
    -  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = withOrigin(ctx) {
    -    // We currently only allow foldable integers.
    -    def value: Int = {
    +  override def visitFrameBound(ctx: FrameBoundContext): Expression = withOrigin(ctx) {
    +    def value: Expression = {
           val e = expression(ctx.expression)
    -      validate(e.resolved && e.foldable && e.dataType == IntegerType,
    -        "Frame bound value must be a constant integer.",
    -        ctx)
    -      e.eval().asInstanceOf[Int]
    +      validate(e.resolved && e.foldable, "Frame bound value must be a literal.", ctx)
    +      e
         }
     
    -    // Create the FrameBoundary
         ctx.boundType.getType match {
           case SqlBaseParser.PRECEDING if ctx.UNBOUNDED != null =>
             UnboundedPreceding
           case SqlBaseParser.PRECEDING =>
    -        ValuePreceding(value)
    +        UnaryMinus(value)
    --- End diff --
    
    The same here. Do we allow users assign a negative value?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130215342
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -43,58 +42,54 @@ case class WindowSpecDefinition(
         orderSpec: Seq[SortOrder],
         frameSpecification: WindowFrame) extends Expression with WindowSpec with Unevaluable {
     
    -  def validate: Option[String] = frameSpecification match {
    -    case UnspecifiedFrame =>
    -      Some("Found a UnspecifiedFrame. It should be converted to a SpecifiedWindowFrame " +
    -        "during analysis. Please file a bug report.")
    -    case frame: SpecifiedWindowFrame => frame.validate.orElse {
    -      def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
    -        if (orderSpec.length > 1)  {
    -          // It is not allowed to have a value-based PRECEDING and FOLLOWING
    -          // as the boundary of a Range Window Frame.
    -          Some("This Range Window Frame only accepts at most one ORDER BY expression.")
    -        } else if (orderSpec.nonEmpty && !orderSpec.head.dataType.isInstanceOf[NumericType]) {
    -          Some("The data type of the expression in the ORDER BY clause should be a numeric type.")
    -        } else {
    -          None
    -        }
    -      }
    -
    -      (frame.frameType, frame.frameStart, frame.frameEnd) match {
    -        case (RangeFrame, vp: ValuePreceding, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, vf: ValueFollowing, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vp: ValuePreceding) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vf: ValueFollowing) => checkValueBasedBoundaryForRangeFrame()
    -        case (_, _, _) => None
    -      }
    -    }
    -  }
    -
    -  override def children: Seq[Expression] = partitionSpec ++ orderSpec
    +  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ frameSpecification
     
       override lazy val resolved: Boolean =
         childrenResolved && checkInputDataTypes().isSuccess &&
           frameSpecification.isInstanceOf[SpecifiedWindowFrame]
     
       override def nullable: Boolean = true
       override def foldable: Boolean = false
    -  override def dataType: DataType = throw new UnsupportedOperationException
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
     
    -  override def sql: String = {
    -    val partition = if (partitionSpec.isEmpty) {
    -      ""
    -    } else {
    -      "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    frameSpecification match {
    +      case UnspecifiedFrame =>
    +        TypeCheckFailure(
    +          "Cannot use an UnspecifiedFrame. This should have been converted during analysis. " +
    +            "Please file a bug report.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && !f.isUnbounded &&
    +        orderSpec.isEmpty =>
    +        TypeCheckFailure(
    +          "A range window frame cannot be used in an unordered window specification.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound &&
    +        orderSpec.size > 1 =>
    +        TypeCheckFailure(
    +          s"A range window frame with value boundaries cannot be used in a window specification " +
    +            s"with multiple order by expressions: ${orderSpec.mkString(",")}")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound &&
    +        !isValidFrameType(f.valueBoundary.head.dataType) =>
    +        TypeCheckFailure(
    +          s"The data type '${orderSpec.head.dataType}' used in the order specification does " +
    +            s"not match the data type '${f.valueBoundary.head.dataType}' which is used in the " +
    +            "range frame.")
    --- End diff --
    
    Just want to confirm whether we have at least four negatives test cases to respectively cover these cases?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130215526
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -351,20 +356,25 @@ abstract class OffsetWindowFunction
     
       override def nullable: Boolean = default == null || default.nullable || input.nullable
     
    -  override lazy val frame = {
    -    // This will be triggered by the Analyzer.
    -    val offsetValue = offset.eval() match {
    -      case o: Int => o
    -      case x => throw new AnalysisException(
    -        s"Offset expression must be a foldable integer expression: $x")
    -    }
    +  override lazy val frame: WindowFrame = {
         val boundary = direction match {
    -      case Ascending => ValueFollowing(offsetValue)
    -      case Descending => ValuePreceding(offsetValue)
    +      case Ascending => offset
    +      case Descending => UnaryMinus(offset)
         }
         SpecifiedWindowFrame(RowFrame, boundary, boundary)
       }
     
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    val check = super.checkInputDataTypes()
    +    if (check.isFailure) {
    +      check
    +    } else if (!offset.foldable) {
    +      TypeCheckFailure(s"Offset expression '$offset' must be a literal.")
    --- End diff --
    
    Having a test case?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79231/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80018/testReport)** for PR 18540 at commit [`7d05c3d`](https://github.com/apache/spark/commit/7d05c3d1bfc016603f37b7479dce4d60e6b490da).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130215469
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,161 +101,171 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    -
    -/**
    - * The trait used to represent the type of a Window Frame Boundary.
    - */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
     }
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override def children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "UNBOUNDED PRECEDING"
    +/** UNBOUNDED boundary. */
    +case object UnboundedPreceding extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED PRECEDING"
     }
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = s"$value PRECEDING"
    +case object UnboundedFollowing extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED FOLLOWING"
     }
     
     /** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "CURRENT ROW"
    -}
    -
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "UNBOUNDED FOLLOWING"
    +case object CurrentRow extends SpecialFrameBoundary {
    +  override def sql: String = "CURRENT ROW"
     }
     
     /**
      * Represents a window frame.
      */
    -sealed trait WindowFrame
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override def children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
    +}
     
     /** Used as a placeholder when a frame specification is not defined. */
     case object UnspecifiedFrame extends WindowFrame
     
    -/** A specified Window Frame. */
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
     case class SpecifiedWindowFrame(
         frameType: FrameType,
    -    frameStart: FrameBoundary,
    -    frameEnd: FrameBoundary) extends WindowFrame {
    -
    -  /** If this WindowFrame is valid or not. */
    -  def validate: Option[String] = (frameType, frameStart, frameEnd) match {
    -    case (_, UnboundedFollowing, _) =>
    -      Some(s"$UnboundedFollowing is not allowed as the start of a Window Frame.")
    -    case (_, _, UnboundedPreceding) =>
    -      Some(s"$UnboundedPreceding is not allowed as the end of a Window Frame.")
    -    // case (RowFrame, start, end) => ??? RowFrame specific rule
    -    // case (RangeFrame, start, end) => ??? RangeFrame specific rule
    -    case (_, start, end) =>
    -      if (start.notFollows(end)) {
    -        None
    -      } else {
    -        val reason =
    -          s"The end of this Window Frame $end is smaller than the start of " +
    -          s"this Window Frame $start."
    -        Some(reason)
    -      }
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
    +
    +  override def children: Seq[Expression] = lower :: upper :: Nil
    +
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
    +
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
    +
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
    +
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: Expression, u: Expression) if !isValidFrameBoundary(l, u) =>
    +        TypeCheckFailure(s"Window frame upper bound '$upper' does not followes the lower bound " +
    +          s"'$lower'.")
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    +      case (l: Expression, u: Expression) if l.dataType != u.dataType =>
    +        TypeCheckFailure(
    +          s"Window frame bounds '$lower' and '$upper' do no not have the same data type: " +
    +            s"'${l.dataType.catalogString}' <> '${u.dataType.catalogString}'")
    +      case (l: Expression, u: Expression) if isGreaterThan(l, u) =>
    +        TypeCheckFailure(
    +          "The lower bound of a window frame must less than or equal to the upper bound")
    --- End diff --
    
    Another question. Do we have test cases for the above three negative cases?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79905 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79905/testReport)** for PR 18540 at commit [`5c9a992`](https://github.com/apache/spark/commit/5c9a992d732f28bf8c139a236ff9151b40291e8b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130217194
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
       }
     
       /**
    -   * Create or resolve a [[FrameBoundary]]. Simple math expressions are allowed for Value
    -   * Preceding/Following boundaries. These expressions must be constant (foldable) and return an
    -   * integer value.
    +   * Create or resolve a frame boundary expressions.
        */
    -  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = withOrigin(ctx) {
    -    // We currently only allow foldable integers.
    -    def value: Int = {
    +  override def visitFrameBound(ctx: FrameBoundContext): Expression = withOrigin(ctx) {
    +    def value: Expression = {
           val e = expression(ctx.expression)
    -      validate(e.resolved && e.foldable && e.dataType == IntegerType,
    -        "Frame bound value must be a constant integer.",
    -        ctx)
    -      e.eval().asInstanceOf[Int]
    +      validate(e.resolved && e.foldable, "Frame bound value must be a literal.", ctx)
    --- End diff --
    
    added


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r126016260
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -805,4 +806,24 @@ object TypeCoercion {
           Option(ret)
         }
       }
    +
    +  /**
    +   * Cast WindowFrame boundaries to the type they operate upon.
    +   */
    +  object WindowFrameCoercion extends Rule[LogicalPlan] {
    --- End diff --
    
    do we really need this? can we just require strict types?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r128894965
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala ---
    @@ -436,21 +436,22 @@ class TreeNodeSuite extends SparkFunSuite {
             "bucketColumnNames" -> "[bucket]",
             "sortColumnNames" -> "[sort]"))
     
    -    // Converts FrameBoundary to JSON
    -    assertJSON(
    -      ValueFollowing(3),
    -      JObject(
    -        "product-class" -> classOf[ValueFollowing].getName,
    -        "value" -> 3))
    -
         // Converts WindowFrame to JSON
         assertJSON(
    -      SpecifiedWindowFrame(RowFrame, UnboundedFollowing, CurrentRow),
    -      JObject(
    -        "product-class" -> classOf[SpecifiedWindowFrame].getName,
    -        "frameType" -> JObject("object" -> JString(RowFrame.getClass.getName)),
    -        "frameStart" -> JObject("object" -> JString(UnboundedFollowing.getClass.getName)),
    -        "frameEnd" -> JObject("object" -> JString(CurrentRow.getClass.getName))))
    +      SpecifiedWindowFrame(RowFrame, Unbounded, CurrentRow),
    +      List(
    +        JObject(
    +          "class" -> classOf[SpecifiedWindowFrame].getName,
    +          "num-children" -> 2,
    +          "frameType" -> JObject("object" -> JString(RowFrame.getClass.getName)),
    +          "lower" -> 0,
    +          "upper" -> 1),
    --- End diff --
    
    why lower and upper is 0 and 1?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r129005122
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
    +
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
    +case class SpecifiedWindowFrame(
    +    frameType: FrameType,
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override lazy val children: Seq[Expression] = lower :: upper :: Nil
     
    -  override def toString: String = s"$value PRECEDING"
    -}
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
     
    -/** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
     
    -  override def toString: String = "CURRENT ROW"
    -}
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    -    case UnboundedFollowing => true
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    +      case (l: Expression, u: Expression) if l.dataType != u.dataType =>
    +        TypeCheckFailure(
    +          s"Window frame bounds '$lower' and '$upper' do no not have the same data type: " +
    +            s"'${l.dataType.catalogString}' <> '${u.dataType.catalogString}'")
    +      case (l: Expression, u: Expression) if isGreaterThan(l, u) =>
    +        TypeCheckFailure(
    +          "The lower bound of a window frame must less than or equal to the upper bound")
    +      case _ => TypeCheckSuccess
    +    }
       }
     
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    +  override def sql: String = {
    +    val lowerSql = boundarySql(lower, "PRECEDING")
    +    val upperSql = boundarySql(upper, "FOLLOWING")
    +    s"${frameType.sql} BETWEEN $lowerSql AND $upperSql"
       }
     
    -  override def toString: String = "UNBOUNDED FOLLOWING"
    -}
    +  def isUnbounded: Boolean = lower == Unbounded && upper == Unbounded
     
    -/**
    - * Represents a window frame.
    - */
    -sealed trait WindowFrame
    +  def isValueBound: Boolean = valueBoundary.nonEmpty
     
    -/** Used as a placeholder when a frame specification is not defined. */
    -case object UnspecifiedFrame extends WindowFrame
    +  def isOffset: Boolean = (lower, upper) match {
    +    case (l: Expression, u: Expression) => frameType == RowFrame && l == u
    +    case _ => false
    +  }
     
    -/** A specified Window Frame. */
    -case class SpecifiedWindowFrame(
    -    frameType: FrameType,
    -    frameStart: FrameBoundary,
    -    frameEnd: FrameBoundary) extends WindowFrame {
    -
    -  /** If this WindowFrame is valid or not. */
    -  def validate: Option[String] = (frameType, frameStart, frameEnd) match {
    -    case (_, UnboundedFollowing, _) =>
    -      Some(s"$UnboundedFollowing is not allowed as the start of a Window Frame.")
    -    case (_, _, UnboundedPreceding) =>
    -      Some(s"$UnboundedPreceding is not allowed as the end of a Window Frame.")
    -    // case (RowFrame, start, end) => ??? RowFrame specific rule
    -    // case (RangeFrame, start, end) => ??? RangeFrame specific rule
    -    case (_, start, end) =>
    -      if (start.notFollows(end)) {
    -        None
    -      } else {
    -        val reason =
    -          s"The end of this Window Frame $end is smaller than the start of " +
    -          s"this Window Frame $start."
    -        Some(reason)
    -      }
    +  private def boundarySql(expr: Expression, defaultDirection: String): String = expr match {
    +    case CurrentRow => "CURRENT ROW"
    +    case Unbounded => "UNBOUNDED " + defaultDirection
    +    case UnaryMinus(n) => n.sql + " PRECEDING"
    +    case e: Expression => e.sql + " FOLLOWING"
       }
     
    -  override def toString: String = frameType match {
    -    case RowFrame => s"ROWS BETWEEN $frameStart AND $frameEnd"
    -    case RangeFrame => s"RANGE BETWEEN $frameStart AND $frameEnd"
    +  private def isGreaterThan(l: Expression, r: Expression): Boolean = {
    +    GreaterThan(l, r).eval().asInstanceOf[Boolean]
    +  }
    +
    +  private def checkBoundary(b: Expression, location: String): TypeCheckResult = b match {
    +    case e: Expression if !e.foldable && !e.isInstanceOf[SpecialFrameBoundary] =>
    --- End diff --
    
    I don't think you should, with what are you going to replace the boundary during optimization?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130035366
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -43,57 +42,65 @@ case class WindowSpecDefinition(
         orderSpec: Seq[SortOrder],
         frameSpecification: WindowFrame) extends Expression with WindowSpec with Unevaluable {
     
    -  def validate: Option[String] = frameSpecification match {
    -    case UnspecifiedFrame =>
    -      Some("Found a UnspecifiedFrame. It should be converted to a SpecifiedWindowFrame " +
    -        "during analysis. Please file a bug report.")
    -    case frame: SpecifiedWindowFrame => frame.validate.orElse {
    -      def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
    -        if (orderSpec.length > 1)  {
    -          // It is not allowed to have a value-based PRECEDING and FOLLOWING
    -          // as the boundary of a Range Window Frame.
    -          Some("This Range Window Frame only accepts at most one ORDER BY expression.")
    -        } else if (orderSpec.nonEmpty && !orderSpec.head.dataType.isInstanceOf[NumericType]) {
    -          Some("The data type of the expression in the ORDER BY clause should be a numeric type.")
    -        } else {
    -          None
    -        }
    -      }
    -
    -      (frame.frameType, frame.frameStart, frame.frameEnd) match {
    -        case (RangeFrame, vp: ValuePreceding, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, vf: ValueFollowing, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vp: ValuePreceding) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vf: ValueFollowing) => checkValueBasedBoundaryForRangeFrame()
    -        case (_, _, _) => None
    -      }
    -    }
    -  }
    -
    -  override def children: Seq[Expression] = partitionSpec ++ orderSpec
    +  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ frameSpecification
     
       override lazy val resolved: Boolean =
         childrenResolved && checkInputDataTypes().isSuccess &&
           frameSpecification.isInstanceOf[SpecifiedWindowFrame]
     
       override def nullable: Boolean = true
       override def foldable: Boolean = false
    -  override def dataType: DataType = throw new UnsupportedOperationException
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
     
    -  override def sql: String = {
    -    val partition = if (partitionSpec.isEmpty) {
    -      ""
    -    } else {
    -      "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    frameSpecification match {
    +      case UnspecifiedFrame =>
    +        TypeCheckFailure(
    +          "Cannot use an UnspecifiedFrame. This should have been converted during analysis. " +
    +            "Please file a bug report.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && !f.isUnbounded &&
    +        orderSpec.isEmpty =>
    +        TypeCheckFailure(
    +          "A range window frame cannot be used in an unordered window specification.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound &&
    +        orderSpec.size > 1 =>
    +        TypeCheckFailure(
    +          s"A range window frame with value boundaries cannot be used in a window specification " +
    +            s"with multiple order by expressions: ${orderSpec.mkString(",")}")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound &&
    +        !isValidFrameType(f.valueBoundary.head.dataType) =>
    +        TypeCheckFailure(
    +          s"The data type '${orderSpec.head.dataType}' used in the order specification does " +
    +            s"not match the data type '${f.valueBoundary.head.dataType}' which is used in the " +
    +            "range frame.")
    +      case f: SpecifiedWindowFrame if !isValidFrameBoundary(f.lower, f.upper) =>
    +        TypeCheckFailure(s"The upper bound of the window frame is '${f.upper.sql}', which is " +
    +          s"smaller than the lower bound '${f.lower.sql}'.")
    +      case _ => TypeCheckSuccess
         }
    +  }
     
    -    val order = if (orderSpec.isEmpty) {
    -      ""
    -    } else {
    -      "ORDER BY " + orderSpec.map(_.sql).mkString(", ") + " "
    +  override def sql: String = {
    +    def toSql(exprs: Seq[Expression], prefix: String): Seq[String] = {
    +      Seq(exprs).filter(_.nonEmpty).map(_.map(_.sql).mkString(prefix, ", ", ""))
         }
     
    -    s"($partition$order${frameSpecification.toString})"
    +    val elements =
    +      toSql(partitionSpec, "PARTITION BY ") ++
    +        toSql(orderSpec, "ORDER BY ") ++
    +        Seq(frameSpecification.sql)
    +    elements.mkString("(", " ", ")")
    +  }
    +
    +  private def isValidFrameType(ft: DataType): Boolean = orderSpec.head.dataType == ft
    +
    +  private def isValidFrameBoundary(lower: Expression, upper: Expression): Boolean = {
    +    (lower, upper) match {
    +      case (UnboundedFollowing, _) => false
    +      case (_, UnboundedPreceding) => false
    +      case (l: Expression, u: SpecialFrameBoundary) => !u.notFollows(l)
    --- End diff --
    
    do we need to consider `(u: SpecialFrameBoundary, l: Expression)`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79841/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80020/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r129005758
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala ---
    @@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
         // Range/Row
         val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
         val boundaries = Seq(
    -      ("10 preceding", ValuePreceding(10), CurrentRow),
    -      ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail during analysis
    -      ("unbounded preceding", UnboundedPreceding, CurrentRow),
    -      ("unbounded following", UnboundedFollowing, CurrentRow), // Will fail during analysis
    -      ("between unbounded preceding and current row", UnboundedPreceding, CurrentRow),
    +      ("10 preceding", -Literal(10), CurrentRow),
    +      ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
    +      ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // Will fail during analysis
    +      ("unbounded preceding", Unbounded, CurrentRow),
    +      ("unbounded following", Unbounded, CurrentRow), // Will fail during analysis
    --- End diff --
    
    Well the idea was that the unboundedness was tied to the location in which it was used, so for example unbounded in the first position would mean unbounded preceding. However this is completely opposite to how we interpret literal bounds, it might be better to reintroduce special boundaries for unbounded preceding and unbounded following.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r128894061
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -43,58 +42,54 @@ case class WindowSpecDefinition(
         orderSpec: Seq[SortOrder],
         frameSpecification: WindowFrame) extends Expression with WindowSpec with Unevaluable {
     
    -  def validate: Option[String] = frameSpecification match {
    -    case UnspecifiedFrame =>
    -      Some("Found a UnspecifiedFrame. It should be converted to a SpecifiedWindowFrame " +
    -        "during analysis. Please file a bug report.")
    -    case frame: SpecifiedWindowFrame => frame.validate.orElse {
    -      def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
    -        if (orderSpec.length > 1)  {
    -          // It is not allowed to have a value-based PRECEDING and FOLLOWING
    -          // as the boundary of a Range Window Frame.
    -          Some("This Range Window Frame only accepts at most one ORDER BY expression.")
    -        } else if (orderSpec.nonEmpty && !orderSpec.head.dataType.isInstanceOf[NumericType]) {
    -          Some("The data type of the expression in the ORDER BY clause should be a numeric type.")
    -        } else {
    -          None
    -        }
    -      }
    -
    -      (frame.frameType, frame.frameStart, frame.frameEnd) match {
    -        case (RangeFrame, vp: ValuePreceding, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, vf: ValueFollowing, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vp: ValuePreceding) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vf: ValueFollowing) => checkValueBasedBoundaryForRangeFrame()
    -        case (_, _, _) => None
    -      }
    -    }
    -  }
    -
    -  override def children: Seq[Expression] = partitionSpec ++ orderSpec
    +  override def children: Seq[Expression] = partitionSpec ++ orderSpec ++ Seq(frameSpecification)
    --- End diff --
    
    nit: `partitionSpec ++ orderSpec :+ frameSpecification`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79231 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79231/testReport)** for PR 18540 at commit [`1135053`](https://github.com/apache/spark/commit/11350536255543535b3ec156eeca60b996d0a239).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130224334
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
       }
     
       /**
    -   * Create or resolve a [[FrameBoundary]]. Simple math expressions are allowed for Value
    -   * Preceding/Following boundaries. These expressions must be constant (foldable) and return an
    -   * integer value.
    +   * Create or resolve a frame boundary expressions.
        */
    -  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = withOrigin(ctx) {
    -    // We currently only allow foldable integers.
    -    def value: Int = {
    +  override def visitFrameBound(ctx: FrameBoundContext): Expression = withOrigin(ctx) {
    +    def value: Expression = {
           val e = expression(ctx.expression)
    -      validate(e.resolved && e.foldable && e.dataType == IntegerType,
    -        "Frame bound value must be a constant integer.",
    -        ctx)
    -      e.eval().asInstanceOf[Int]
    +      validate(e.resolved && e.foldable, "Frame bound value must be a literal.", ctx)
    +      e
         }
     
    -    // Create the FrameBoundary
         ctx.boundType.getType match {
           case SqlBaseParser.PRECEDING if ctx.UNBOUNDED != null =>
             UnboundedPreceding
           case SqlBaseParser.PRECEDING =>
    -        ValuePreceding(value)
    +        UnaryMinus(value)
           case SqlBaseParser.CURRENT =>
             CurrentRow
           case SqlBaseParser.FOLLOWING if ctx.UNBOUNDED != null =>
             UnboundedFollowing
           case SqlBaseParser.FOLLOWING =>
    -        ValueFollowing(value)
    +        value
    --- End diff --
    
    It sounds like we already allowed it in the previous release. Thus, we need to follow what we have now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130214797
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -109,10 +109,8 @@ trait CheckAnalysis extends PredicateHelper {
                 failAnalysis(s"Distinct window functions are not supported: $w")
     
               case w @ WindowExpression(_: OffsetWindowFunction, WindowSpecDefinition(_, order,
    -               SpecifiedWindowFrame(frame,
    -                 FrameBoundary(l),
    -                 FrameBoundary(h))))
    -             if order.isEmpty || frame != RowFrame || l != h =>
    +               frame: SpecifiedWindowFrame))
    --- End diff --
    
    Nit: Cutting in the middle looks weird. Please keep them in the same line.
    `WindowSpecDefinition(_, order, frame: SpecifiedWindowFrame))`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130073011
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,161 +113,176 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    -
    -/**
    - * The trait used to represent the type of a Window Frame Boundary.
    - */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
     }
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
    +
    +  def notFollows(other: Expression): Boolean
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** UNBOUNDED boundary. */
    +case object UnboundedPreceding extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED PRECEDING"
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    +  override def notFollows(other: Expression): Boolean = true
     }
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    +case object UnboundedFollowing extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED FOLLOWING"
    +
    +  override def notFollows(other: Expression): Boolean = other match {
         case UnboundedFollowing => true
    +    case _ => false
       }
    -
    -  override def toString: String = s"$value PRECEDING"
     }
     
     /** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "CURRENT ROW"
    -}
    +case object CurrentRow extends SpecialFrameBoundary {
    +  override def sql: String = "CURRENT ROW"
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    +  override def notFollows(other: Expression): Boolean = other match {
         case UnboundedFollowing => true
    +    case _ => false
       }
    -
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "UNBOUNDED FOLLOWING"
     }
     
     /**
      * Represents a window frame.
      */
    -sealed trait WindowFrame
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    --- End diff --
    
    nit: `def` instead of `lazy val`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130078786
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
       }
     
       /**
    -   * Create or resolve a [[FrameBoundary]]. Simple math expressions are allowed for Value
    -   * Preceding/Following boundaries. These expressions must be constant (foldable) and return an
    -   * integer value.
    +   * Create or resolve a frame boundary expressions.
        */
    -  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = withOrigin(ctx) {
    -    // We currently only allow foldable integers.
    -    def value: Int = {
    +  override def visitFrameBound(ctx: FrameBoundContext): Expression = withOrigin(ctx) {
    +    def value: Expression = {
           val e = expression(ctx.expression)
    -      validate(e.resolved && e.foldable && e.dataType == IntegerType,
    -        "Frame bound value must be a constant integer.",
    -        ctx)
    -      e.eval().asInstanceOf[Int]
    +      validate(e.resolved && e.foldable, "Frame bound value must be a literal.", ctx)
    --- End diff --
    
    sure


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80012 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80012/testReport)** for PR 18540 at commit [`9abdb5e`](https://github.com/apache/spark/commit/9abdb5eee7aab766fe73ca00749efa2a16328882).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130073540
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,161 +113,176 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    -
    -/**
    - * The trait used to represent the type of a Window Frame Boundary.
    - */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
     }
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
    +
    +  def notFollows(other: Expression): Boolean
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** UNBOUNDED boundary. */
    +case object UnboundedPreceding extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED PRECEDING"
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    +  override def notFollows(other: Expression): Boolean = true
     }
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    +case object UnboundedFollowing extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED FOLLOWING"
    +
    +  override def notFollows(other: Expression): Boolean = other match {
         case UnboundedFollowing => true
    +    case _ => false
       }
    -
    -  override def toString: String = s"$value PRECEDING"
     }
     
     /** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "CURRENT ROW"
    -}
    +case object CurrentRow extends SpecialFrameBoundary {
    +  override def sql: String = "CURRENT ROW"
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    +  override def notFollows(other: Expression): Boolean = other match {
         case UnboundedFollowing => true
    +    case _ => false
       }
    -
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "UNBOUNDED FOLLOWING"
     }
     
     /**
      * Represents a window frame.
      */
    -sealed trait WindowFrame
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
    +}
     
     /** Used as a placeholder when a frame specification is not defined. */
     case object UnspecifiedFrame extends WindowFrame
     
    -/** A specified Window Frame. */
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
     case class SpecifiedWindowFrame(
         frameType: FrameType,
    -    frameStart: FrameBoundary,
    -    frameEnd: FrameBoundary) extends WindowFrame {
    -
    -  /** If this WindowFrame is valid or not. */
    -  def validate: Option[String] = (frameType, frameStart, frameEnd) match {
    -    case (_, UnboundedFollowing, _) =>
    -      Some(s"$UnboundedFollowing is not allowed as the start of a Window Frame.")
    -    case (_, _, UnboundedPreceding) =>
    -      Some(s"$UnboundedPreceding is not allowed as the end of a Window Frame.")
    -    // case (RowFrame, start, end) => ??? RowFrame specific rule
    -    // case (RangeFrame, start, end) => ??? RangeFrame specific rule
    -    case (_, start, end) =>
    -      if (start.notFollows(end)) {
    -        None
    -      } else {
    -        val reason =
    -          s"The end of this Window Frame $end is smaller than the start of " +
    -          s"this Window Frame $start."
    -        Some(reason)
    -      }
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
    +
    +  override lazy val children: Seq[Expression] = lower :: upper :: Nil
    +
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
    +
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
    +
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
    +
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    +      case (l: Expression, u: Expression) if l.dataType != u.dataType =>
    +        TypeCheckFailure(
    +          s"Window frame bounds '$lower' and '$upper' do no not have the same data type: " +
    +            s"'${l.dataType.catalogString}' <> '${u.dataType.catalogString}'")
    +      case (l: Expression, u: Expression) if isGreaterThan(l, u) =>
    +        TypeCheckFailure(
    +          "The lower bound of a window frame must less than or equal to the upper bound")
    +      case _ => TypeCheckSuccess
    +    }
       }
     
    -  override def toString: String = frameType match {
    -    case RowFrame => s"ROWS BETWEEN $frameStart AND $frameEnd"
    -    case RangeFrame => s"RANGE BETWEEN $frameStart AND $frameEnd"
    +  override def sql: String = {
    +    val lowerSql = boundarySql(lower)
    +    val upperSql = boundarySql(upper)
    +    s"${frameType.sql} BETWEEN $lowerSql AND $upperSql"
    +  }
    +
    +  def isUnbounded: Boolean = lower == UnboundedPreceding && upper == UnboundedFollowing
    +
    +  def isValueBound: Boolean = valueBoundary.nonEmpty
    +
    +  def isOffset: Boolean = (lower, upper) match {
    +    case (l: Expression, u: Expression) => frameType == RowFrame && l == u
    +    case _ => false
    +  }
    +
    +  private def boundarySql(expr: Expression): String = expr match {
    +    case e: SpecialFrameBoundary => e.sql
    +    case UnaryMinus(n) => n.sql + " PRECEDING"
    +    case e: Expression => e.sql + " FOLLOWING"
    +  }
    +
    +  private def isGreaterThan(l: Expression, r: Expression): Boolean = {
    +    GreaterThan(l, r).eval().asInstanceOf[Boolean]
    +  }
    +
    +  private def checkBoundary(b: Expression, location: String): TypeCheckResult = b match {
    --- End diff --
    
    nit:
    ```
    b match {
      case _: SpecialFrameBoundary => success
      case _ if !b.foldable => fail
      case _ if !frameType.inputType.acceptsType(e.dataType) => fail
      case _ => success
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r128894916
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala ---
    @@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
         // Range/Row
         val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
         val boundaries = Seq(
    -      ("10 preceding", ValuePreceding(10), CurrentRow),
    -      ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail during analysis
    -      ("unbounded preceding", UnboundedPreceding, CurrentRow),
    -      ("unbounded following", UnboundedFollowing, CurrentRow), // Will fail during analysis
    -      ("between unbounded preceding and current row", UnboundedPreceding, CurrentRow),
    +      ("10 preceding", -Literal(10), CurrentRow),
    +      ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
    +      ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // Will fail during analysis
    --- End diff --
    
    why this will fail during analysis?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r129005233
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala ---
    @@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
         // Range/Row
         val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
         val boundaries = Seq(
    -      ("10 preceding", ValuePreceding(10), CurrentRow),
    -      ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail during analysis
    -      ("unbounded preceding", UnboundedPreceding, CurrentRow),
    -      ("unbounded following", UnboundedFollowing, CurrentRow), // Will fail during analysis
    -      ("between unbounded preceding and current row", UnboundedPreceding, CurrentRow),
    +      ("10 preceding", -Literal(10), CurrentRow),
    +      ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
    +      ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // Will fail during analysis
    --- End diff --
    
    Do you think that will improve the UX?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80021 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80021/testReport)** for PR 18540 at commit [`fbcea1e`](https://github.com/apache/spark/commit/fbcea1ed4b279e4527a96211fd024baa7860f40f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130217065
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,161 +101,171 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    -
    -/**
    - * The trait used to represent the type of a Window Frame Boundary.
    - */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
     }
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override def children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "UNBOUNDED PRECEDING"
    +/** UNBOUNDED boundary. */
    +case object UnboundedPreceding extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED PRECEDING"
     }
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = s"$value PRECEDING"
    +case object UnboundedFollowing extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED FOLLOWING"
     }
     
     /** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "CURRENT ROW"
    -}
    -
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "UNBOUNDED FOLLOWING"
    +case object CurrentRow extends SpecialFrameBoundary {
    +  override def sql: String = "CURRENT ROW"
     }
     
     /**
      * Represents a window frame.
      */
    -sealed trait WindowFrame
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override def children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
    +}
     
     /** Used as a placeholder when a frame specification is not defined. */
     case object UnspecifiedFrame extends WindowFrame
     
    -/** A specified Window Frame. */
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
     case class SpecifiedWindowFrame(
         frameType: FrameType,
    -    frameStart: FrameBoundary,
    -    frameEnd: FrameBoundary) extends WindowFrame {
    -
    -  /** If this WindowFrame is valid or not. */
    -  def validate: Option[String] = (frameType, frameStart, frameEnd) match {
    -    case (_, UnboundedFollowing, _) =>
    -      Some(s"$UnboundedFollowing is not allowed as the start of a Window Frame.")
    -    case (_, _, UnboundedPreceding) =>
    -      Some(s"$UnboundedPreceding is not allowed as the end of a Window Frame.")
    -    // case (RowFrame, start, end) => ??? RowFrame specific rule
    -    // case (RangeFrame, start, end) => ??? RangeFrame specific rule
    -    case (_, start, end) =>
    -      if (start.notFollows(end)) {
    -        None
    -      } else {
    -        val reason =
    -          s"The end of this Window Frame $end is smaller than the start of " +
    -          s"this Window Frame $start."
    -        Some(reason)
    -      }
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
    +
    +  override def children: Seq[Expression] = lower :: upper :: Nil
    +
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
    +
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
    +
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
    +
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: Expression, u: Expression) if !isValidFrameBoundary(l, u) =>
    +        TypeCheckFailure(s"Window frame upper bound '$upper' does not followes the lower bound " +
    +          s"'$lower'.")
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    +      case (l: Expression, u: Expression) if l.dataType != u.dataType =>
    +        TypeCheckFailure(
    +          s"Window frame bounds '$lower' and '$upper' do no not have the same data type: " +
    +            s"'${l.dataType.catalogString}' <> '${u.dataType.catalogString}'")
    +      case (l: Expression, u: Expression) if isGreaterThan(l, u) =>
    +        TypeCheckFailure(
    +          "The lower bound of a window frame must less than or equal to the upper bound")
    --- End diff --
    
    The second is defensive check, added test sql for the rest.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80041/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79835 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79835/testReport)** for PR 18540 at commit [`427753d`](https://github.com/apache/spark/commit/427753d27a15835d07cba0f47b35693efa6175f0).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r125688475
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
    @@ -151,6 +151,48 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
             Row(2.0d), Row(2.0d)))
       }
     
    +  test("row between should accept integer values as boundary") {
    --- End diff --
    
    Will do this tomorrow.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130076316
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
       }
     
       /**
    -   * Create or resolve a [[FrameBoundary]]. Simple math expressions are allowed for Value
    -   * Preceding/Following boundaries. These expressions must be constant (foldable) and return an
    -   * integer value.
    +   * Create or resolve a frame boundary expressions.
        */
    -  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = withOrigin(ctx) {
    -    // We currently only allow foldable integers.
    -    def value: Int = {
    +  override def visitFrameBound(ctx: FrameBoundContext): Expression = withOrigin(ctx) {
    +    def value: Expression = {
           val e = expression(ctx.expression)
    -      validate(e.resolved && e.foldable && e.dataType == IntegerType,
    -        "Frame bound value must be a constant integer.",
    -        ctx)
    -      e.eval().asInstanceOf[Int]
    +      validate(e.resolved && e.foldable, "Frame bound value must be a literal.", ctx)
    --- End diff --
    
    How about keep this so we can fail earlier?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130035488
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -43,58 +42,54 @@ case class WindowSpecDefinition(
         orderSpec: Seq[SortOrder],
         frameSpecification: WindowFrame) extends Expression with WindowSpec with Unevaluable {
     
    -  def validate: Option[String] = frameSpecification match {
    -    case UnspecifiedFrame =>
    -      Some("Found a UnspecifiedFrame. It should be converted to a SpecifiedWindowFrame " +
    -        "during analysis. Please file a bug report.")
    -    case frame: SpecifiedWindowFrame => frame.validate.orElse {
    -      def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
    -        if (orderSpec.length > 1)  {
    -          // It is not allowed to have a value-based PRECEDING and FOLLOWING
    -          // as the boundary of a Range Window Frame.
    -          Some("This Range Window Frame only accepts at most one ORDER BY expression.")
    -        } else if (orderSpec.nonEmpty && !orderSpec.head.dataType.isInstanceOf[NumericType]) {
    -          Some("The data type of the expression in the ORDER BY clause should be a numeric type.")
    -        } else {
    -          None
    -        }
    -      }
    -
    -      (frame.frameType, frame.frameStart, frame.frameEnd) match {
    -        case (RangeFrame, vp: ValuePreceding, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, vf: ValueFollowing, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vp: ValuePreceding) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vf: ValueFollowing) => checkValueBasedBoundaryForRangeFrame()
    -        case (_, _, _) => None
    -      }
    -    }
    -  }
    -
    -  override def children: Seq[Expression] = partitionSpec ++ orderSpec
    +  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ frameSpecification
     
       override lazy val resolved: Boolean =
         childrenResolved && checkInputDataTypes().isSuccess &&
           frameSpecification.isInstanceOf[SpecifiedWindowFrame]
     
       override def nullable: Boolean = true
       override def foldable: Boolean = false
    -  override def dataType: DataType = throw new UnsupportedOperationException
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
     
    -  override def sql: String = {
    -    val partition = if (partitionSpec.isEmpty) {
    -      ""
    -    } else {
    -      "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    frameSpecification match {
    +      case UnspecifiedFrame =>
    +        TypeCheckFailure(
    +          "Cannot use an UnspecifiedFrame. This should have been converted during analysis. " +
    +            "Please file a bug report.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && !f.isUnbounded &&
    --- End diff --
    
    Sorry I was wrong


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r128894779
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
    +
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
    +case class SpecifiedWindowFrame(
    +    frameType: FrameType,
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override lazy val children: Seq[Expression] = lower :: upper :: Nil
     
    -  override def toString: String = s"$value PRECEDING"
    -}
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
     
    -/** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
     
    -  override def toString: String = "CURRENT ROW"
    -}
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    -    case UnboundedFollowing => true
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    +      case (l: Expression, u: Expression) if l.dataType != u.dataType =>
    +        TypeCheckFailure(
    +          s"Window frame bounds '$lower' and '$upper' do no not have the same data type: " +
    +            s"'${l.dataType.catalogString}' <> '${u.dataType.catalogString}'")
    +      case (l: Expression, u: Expression) if isGreaterThan(l, u) =>
    +        TypeCheckFailure(
    +          "The lower bound of a window frame must less than or equal to the upper bound")
    +      case _ => TypeCheckSuccess
    +    }
       }
     
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    +  override def sql: String = {
    +    val lowerSql = boundarySql(lower, "PRECEDING")
    +    val upperSql = boundarySql(upper, "FOLLOWING")
    +    s"${frameType.sql} BETWEEN $lowerSql AND $upperSql"
       }
     
    -  override def toString: String = "UNBOUNDED FOLLOWING"
    -}
    +  def isUnbounded: Boolean = lower == Unbounded && upper == Unbounded
     
    -/**
    - * Represents a window frame.
    - */
    -sealed trait WindowFrame
    +  def isValueBound: Boolean = valueBoundary.nonEmpty
    --- End diff --
    
    nit: `def isValueBound: Boolean = !isUnbounded`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r125900833
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -174,28 +191,22 @@ class WindowSpec private[sql](
        */
       // Note: when updating the doc for this method, also update Window.rangeBetween.
       def rangeBetween(start: Long, end: Long): WindowSpec = {
    --- End diff --
    
    Maybe we can use `Literal(0)` to represent `CurrentRow`? And a sufficient large number(like `Literal(Long.MaxValue)`) for `Unbounded`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    cc @cloud-fan @gatorsmile @gengliangwang 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r128947347
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
    +
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
    +case class SpecifiedWindowFrame(
    +    frameType: FrameType,
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override lazy val children: Seq[Expression] = lower :: upper :: Nil
     
    -  override def toString: String = s"$value PRECEDING"
    -}
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
     
    -/** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
     
    -  override def toString: String = "CURRENT ROW"
    -}
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    -    case UnboundedFollowing => true
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    +      case (l: Expression, u: Expression) if l.dataType != u.dataType =>
    +        TypeCheckFailure(
    +          s"Window frame bounds '$lower' and '$upper' do no not have the same data type: " +
    +            s"'${l.dataType.catalogString}' <> '${u.dataType.catalogString}'")
    +      case (l: Expression, u: Expression) if isGreaterThan(l, u) =>
    +        TypeCheckFailure(
    +          "The lower bound of a window frame must less than or equal to the upper bound")
    +      case _ => TypeCheckSuccess
    +    }
       }
     
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    +  override def sql: String = {
    +    val lowerSql = boundarySql(lower, "PRECEDING")
    +    val upperSql = boundarySql(upper, "FOLLOWING")
    +    s"${frameType.sql} BETWEEN $lowerSql AND $upperSql"
       }
     
    -  override def toString: String = "UNBOUNDED FOLLOWING"
    -}
    +  def isUnbounded: Boolean = lower == Unbounded && upper == Unbounded
     
    -/**
    - * Represents a window frame.
    - */
    -sealed trait WindowFrame
    +  def isValueBound: Boolean = valueBoundary.nonEmpty
    --- End diff --
    
    yea we can have `ROWS CURRENT ROW`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79841 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79841/testReport)** for PR 18540 at commit [`43b2399`](https://github.com/apache/spark/commit/43b23993af1b7abf20088ad0b63893afebe9a00d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79905/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80012/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r128894924
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala ---
    @@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
         // Range/Row
         val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
         val boundaries = Seq(
    -      ("10 preceding", ValuePreceding(10), CurrentRow),
    -      ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail during analysis
    -      ("unbounded preceding", UnboundedPreceding, CurrentRow),
    -      ("unbounded following", UnboundedFollowing, CurrentRow), // Will fail during analysis
    -      ("between unbounded preceding and current row", UnboundedPreceding, CurrentRow),
    +      ("10 preceding", -Literal(10), CurrentRow),
    +      ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
    +      ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // Will fail during analysis
    +      ("unbounded preceding", Unbounded, CurrentRow),
    +      ("unbounded following", Unbounded, CurrentRow), // Will fail during analysis
    --- End diff --
    
    ditto, why?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80020 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80020/testReport)** for PR 18540 at commit [`9b8a19b`](https://github.com/apache/spark/commit/9b8a19b44e11525aa875df9ae3ffaed9477d1b43).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130036535
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,161 +113,176 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    -
    -/**
    - * The trait used to represent the type of a Window Frame Boundary.
    - */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
     }
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
    +
    +  def notFollows(other: Expression): Boolean
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** UNBOUNDED boundary. */
    +case object UnboundedPreceding extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED PRECEDING"
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    +  override def notFollows(other: Expression): Boolean = true
     }
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    +case object UnboundedFollowing extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED FOLLOWING"
    +
    +  override def notFollows(other: Expression): Boolean = other match {
         case UnboundedFollowing => true
    +    case _ => false
       }
    -
    -  override def toString: String = s"$value PRECEDING"
     }
     
     /** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "CURRENT ROW"
    -}
    +case object CurrentRow extends SpecialFrameBoundary {
    +  override def sql: String = "CURRENT ROW"
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    +  override def notFollows(other: Expression): Boolean = other match {
         case UnboundedFollowing => true
    +    case _ => false
       }
    -
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "UNBOUNDED FOLLOWING"
     }
     
     /**
      * Represents a window frame.
      */
    -sealed trait WindowFrame
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
    +}
     
     /** Used as a placeholder when a frame specification is not defined. */
     case object UnspecifiedFrame extends WindowFrame
     
    -/** A specified Window Frame. */
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
     case class SpecifiedWindowFrame(
         frameType: FrameType,
    -    frameStart: FrameBoundary,
    -    frameEnd: FrameBoundary) extends WindowFrame {
    -
    -  /** If this WindowFrame is valid or not. */
    -  def validate: Option[String] = (frameType, frameStart, frameEnd) match {
    -    case (_, UnboundedFollowing, _) =>
    -      Some(s"$UnboundedFollowing is not allowed as the start of a Window Frame.")
    -    case (_, _, UnboundedPreceding) =>
    -      Some(s"$UnboundedPreceding is not allowed as the end of a Window Frame.")
    -    // case (RowFrame, start, end) => ??? RowFrame specific rule
    -    // case (RangeFrame, start, end) => ??? RangeFrame specific rule
    -    case (_, start, end) =>
    -      if (start.notFollows(end)) {
    -        None
    -      } else {
    -        val reason =
    -          s"The end of this Window Frame $end is smaller than the start of " +
    -          s"this Window Frame $start."
    -        Some(reason)
    -      }
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
    +
    +  override lazy val children: Seq[Expression] = lower :: upper :: Nil
    +
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
    +
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
    +
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
    +
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    --- End diff --
    
    shall we move [this checking logic](https://github.com/apache/spark/pull/18540/files#diff-4a8f00ca33a80744965463dcc6662c75R76) to here too?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80006/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79210/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79531/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r125735576
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -174,28 +191,22 @@ class WindowSpec private[sql](
        */
       // Note: when updating the doc for this method, also update Window.rangeBetween.
       def rangeBetween(start: Long, end: Long): WindowSpec = {
    --- End diff --
    
    It might be the easiest to make it take a `Column`, so `rangeBetween(begin: Column, end: Column)`, only downside to this is that we need some way to express the special boundaries (`current row`, `unbounded`). Also cc @rxin


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r127451425
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +105,164 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    -}
    +sealed trait SpecialFrameBoundary
    +
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    --- End diff --
    
    can we do this in `WindowFrameCoercion`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r127449331
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +105,164 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    -}
    +sealed trait SpecialFrameBoundary
    +
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    --- End diff --
    
    I tried that. The problem is that you will need to make them have a proper data type. I tried to make them `case object .. {}` with data type null, but I ended with these replaced with a null literal.
    
    All I am saying that this will require a little bit more coding. Since you need to resolve the data type of the boundary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r126016128
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -174,28 +191,22 @@ class WindowSpec private[sql](
        */
       // Note: when updating the doc for this method, also update Window.rangeBetween.
       def rangeBetween(start: Long, end: Long): WindowSpec = {
    --- End diff --
    
    i was trying to avoid introducing a special value, but maybe you can do that.
    
    How important is it to fix this?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79210 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79210/testReport)** for PR 18540 at commit [`52c5289`](https://github.com/apache/spark/commit/52c52895cfe3cae58f8b029a350949cc153971a7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r125602934
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -174,28 +191,22 @@ class WindowSpec private[sql](
        */
       // Note: when updating the doc for this method, also update Window.rangeBetween.
       def rangeBetween(start: Long, end: Long): WindowSpec = {
    --- End diff --
    
    We should also create API's that allow for other types of literals, at least one for `CalendarInterval`s.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18540


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r125599886
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +105,161 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    -}
    +sealed trait SpecialFrameBoundary
    +
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/** A specified Window Frame. */
    +case class SpecifiedWindowFrame(
    +    frameType: FrameType,
    +    lower: AnyRef,
    --- End diff --
    
    I made `lower` and `upper` `AnyRef`, this was to allow the use of both of (foldable) `Expressions` and `SpecialFrameBoundary`. This works rather well with things like constant folding. The reason for not making `SpecialFrameBoundary` an `Expression` is that this cannot have a type (unless you make it a case class I suppose) and that it showed some weird behavior during analysis/optimization.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80012/testReport)** for PR 18540 at commit [`9abdb5e`](https://github.com/apache/spark/commit/9abdb5eee7aab766fe73ca00749efa2a16328882).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130216990
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -43,58 +42,54 @@ case class WindowSpecDefinition(
         orderSpec: Seq[SortOrder],
         frameSpecification: WindowFrame) extends Expression with WindowSpec with Unevaluable {
     
    -  def validate: Option[String] = frameSpecification match {
    -    case UnspecifiedFrame =>
    -      Some("Found a UnspecifiedFrame. It should be converted to a SpecifiedWindowFrame " +
    -        "during analysis. Please file a bug report.")
    -    case frame: SpecifiedWindowFrame => frame.validate.orElse {
    -      def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
    -        if (orderSpec.length > 1)  {
    -          // It is not allowed to have a value-based PRECEDING and FOLLOWING
    -          // as the boundary of a Range Window Frame.
    -          Some("This Range Window Frame only accepts at most one ORDER BY expression.")
    -        } else if (orderSpec.nonEmpty && !orderSpec.head.dataType.isInstanceOf[NumericType]) {
    -          Some("The data type of the expression in the ORDER BY clause should be a numeric type.")
    -        } else {
    -          None
    -        }
    -      }
    -
    -      (frame.frameType, frame.frameStart, frame.frameEnd) match {
    -        case (RangeFrame, vp: ValuePreceding, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, vf: ValueFollowing, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vp: ValuePreceding) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vf: ValueFollowing) => checkValueBasedBoundaryForRangeFrame()
    -        case (_, _, _) => None
    -      }
    -    }
    -  }
    -
    -  override def children: Seq[Expression] = partitionSpec ++ orderSpec
    +  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ frameSpecification
     
       override lazy val resolved: Boolean =
         childrenResolved && checkInputDataTypes().isSuccess &&
           frameSpecification.isInstanceOf[SpecifiedWindowFrame]
     
       override def nullable: Boolean = true
       override def foldable: Boolean = false
    -  override def dataType: DataType = throw new UnsupportedOperationException
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
     
    -  override def sql: String = {
    -    val partition = if (partitionSpec.isEmpty) {
    -      ""
    -    } else {
    -      "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    frameSpecification match {
    +      case UnspecifiedFrame =>
    +        TypeCheckFailure(
    +          "Cannot use an UnspecifiedFrame. This should have been converted during analysis. " +
    +            "Please file a bug report.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && !f.isUnbounded &&
    +        orderSpec.isEmpty =>
    +        TypeCheckFailure(
    +          "A range window frame cannot be used in an unordered window specification.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound &&
    +        orderSpec.size > 1 =>
    +        TypeCheckFailure(
    +          s"A range window frame with value boundaries cannot be used in a window specification " +
    +            s"with multiple order by expressions: ${orderSpec.mkString(",")}")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound &&
    +        !isValidFrameType(f.valueBoundary.head.dataType) =>
    +        TypeCheckFailure(
    +          s"The data type '${orderSpec.head.dataType}' used in the order specification does " +
    +            s"not match the data type '${f.valueBoundary.head.dataType}' which is used in the " +
    +            "range frame.")
    --- End diff --
    
    The first case is for defensive guard, I'll add test sql for the two negative cases related to orderBy in RangeFrame.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79905 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79905/testReport)** for PR 18540 at commit [`5c9a992`](https://github.com/apache/spark/commit/5c9a992d732f28bf8c139a236ff9151b40291e8b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r125901045
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
    @@ -151,6 +151,48 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
             Row(2.0d), Row(2.0d)))
       }
     
    +  test("row between should accept integer values as boundary") {
    --- End diff --
    
    We can only add the test cases after we have finalized the API change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130074867
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
       }
     
       /**
    -   * Create or resolve a [[FrameBoundary]]. Simple math expressions are allowed for Value
    -   * Preceding/Following boundaries. These expressions must be constant (foldable) and return an
    -   * integer value.
    +   * Create or resolve a frame boundary expressions.
        */
    -  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = withOrigin(ctx) {
    -    // We currently only allow foldable integers.
    -    def value: Int = {
    +  override def visitFrameBound(ctx: FrameBoundContext): Expression = withOrigin(ctx) {
    +    def value: Expression = {
           val e = expression(ctx.expression)
    -      validate(e.resolved && e.foldable && e.dataType == IntegerType,
    -        "Frame bound value must be a constant integer.",
    -        ctx)
    -      e.eval().asInstanceOf[Int]
    +      validate(e.resolved && e.foldable, "Frame bound value must be a literal.", ctx)
    --- End diff --
    
    is it necessary? I think analyzer can detect and report this failure too?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r128894802
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
    +
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
    +case class SpecifiedWindowFrame(
    +    frameType: FrameType,
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override lazy val children: Seq[Expression] = lower :: upper :: Nil
     
    -  override def toString: String = s"$value PRECEDING"
    -}
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
     
    -/** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
     
    -  override def toString: String = "CURRENT ROW"
    -}
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    -    case UnboundedFollowing => true
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    +      case (l: Expression, u: Expression) if l.dataType != u.dataType =>
    +        TypeCheckFailure(
    +          s"Window frame bounds '$lower' and '$upper' do no not have the same data type: " +
    +            s"'${l.dataType.catalogString}' <> '${u.dataType.catalogString}'")
    +      case (l: Expression, u: Expression) if isGreaterThan(l, u) =>
    +        TypeCheckFailure(
    +          "The lower bound of a window frame must less than or equal to the upper bound")
    +      case _ => TypeCheckSuccess
    +    }
       }
     
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    +  override def sql: String = {
    +    val lowerSql = boundarySql(lower, "PRECEDING")
    +    val upperSql = boundarySql(upper, "FOLLOWING")
    +    s"${frameType.sql} BETWEEN $lowerSql AND $upperSql"
       }
     
    -  override def toString: String = "UNBOUNDED FOLLOWING"
    -}
    +  def isUnbounded: Boolean = lower == Unbounded && upper == Unbounded
     
    -/**
    - * Represents a window frame.
    - */
    -sealed trait WindowFrame
    +  def isValueBound: Boolean = valueBoundary.nonEmpty
     
    -/** Used as a placeholder when a frame specification is not defined. */
    -case object UnspecifiedFrame extends WindowFrame
    +  def isOffset: Boolean = (lower, upper) match {
    +    case (l: Expression, u: Expression) => frameType == RowFrame && l == u
    +    case _ => false
    +  }
     
    -/** A specified Window Frame. */
    -case class SpecifiedWindowFrame(
    -    frameType: FrameType,
    -    frameStart: FrameBoundary,
    -    frameEnd: FrameBoundary) extends WindowFrame {
    -
    -  /** If this WindowFrame is valid or not. */
    -  def validate: Option[String] = (frameType, frameStart, frameEnd) match {
    -    case (_, UnboundedFollowing, _) =>
    -      Some(s"$UnboundedFollowing is not allowed as the start of a Window Frame.")
    -    case (_, _, UnboundedPreceding) =>
    -      Some(s"$UnboundedPreceding is not allowed as the end of a Window Frame.")
    -    // case (RowFrame, start, end) => ??? RowFrame specific rule
    -    // case (RangeFrame, start, end) => ??? RangeFrame specific rule
    -    case (_, start, end) =>
    -      if (start.notFollows(end)) {
    -        None
    -      } else {
    -        val reason =
    -          s"The end of this Window Frame $end is smaller than the start of " +
    -          s"this Window Frame $start."
    -        Some(reason)
    -      }
    +  private def boundarySql(expr: Expression, defaultDirection: String): String = expr match {
    +    case CurrentRow => "CURRENT ROW"
    +    case Unbounded => "UNBOUNDED " + defaultDirection
    +    case UnaryMinus(n) => n.sql + " PRECEDING"
    +    case e: Expression => e.sql + " FOLLOWING"
       }
     
    -  override def toString: String = frameType match {
    -    case RowFrame => s"ROWS BETWEEN $frameStart AND $frameEnd"
    -    case RangeFrame => s"RANGE BETWEEN $frameStart AND $frameEnd"
    +  private def isGreaterThan(l: Expression, r: Expression): Boolean = {
    +    GreaterThan(l, r).eval().asInstanceOf[Boolean]
    +  }
    +
    +  private def checkBoundary(b: Expression, location: String): TypeCheckResult = b match {
    +    case e: Expression if !e.foldable && !e.isInstanceOf[SpecialFrameBoundary] =>
    --- End diff --
    
    nit: we can mark `SpecialFrameBoundary` as foldable.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80018/testReport)** for PR 18540 at commit [`7d05c3d`](https://github.com/apache/spark/commit/7d05c3d1bfc016603f37b7479dce4d60e6b490da).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r127143764
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -43,57 +42,57 @@ case class WindowSpecDefinition(
         orderSpec: Seq[SortOrder],
         frameSpecification: WindowFrame) extends Expression with WindowSpec with Unevaluable {
     
    -  def validate: Option[String] = frameSpecification match {
    -    case UnspecifiedFrame =>
    -      Some("Found a UnspecifiedFrame. It should be converted to a SpecifiedWindowFrame " +
    -        "during analysis. Please file a bug report.")
    -    case frame: SpecifiedWindowFrame => frame.validate.orElse {
    -      def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
    -        if (orderSpec.length > 1)  {
    -          // It is not allowed to have a value-based PRECEDING and FOLLOWING
    -          // as the boundary of a Range Window Frame.
    -          Some("This Range Window Frame only accepts at most one ORDER BY expression.")
    -        } else if (orderSpec.nonEmpty && !orderSpec.head.dataType.isInstanceOf[NumericType]) {
    -          Some("The data type of the expression in the ORDER BY clause should be a numeric type.")
    -        } else {
    -          None
    -        }
    -      }
    -
    -      (frame.frameType, frame.frameStart, frame.frameEnd) match {
    -        case (RangeFrame, vp: ValuePreceding, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, vf: ValueFollowing, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vp: ValuePreceding) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vf: ValueFollowing) => checkValueBasedBoundaryForRangeFrame()
    -        case (_, _, _) => None
    -      }
    -    }
    -  }
    -
    -  override def children: Seq[Expression] = partitionSpec ++ orderSpec
    +  override def children: Seq[Expression] = partitionSpec ++ orderSpec ++ Seq(frameSpecification)
     
       override lazy val resolved: Boolean =
         childrenResolved && checkInputDataTypes().isSuccess &&
           frameSpecification.isInstanceOf[SpecifiedWindowFrame]
     
       override def nullable: Boolean = true
       override def foldable: Boolean = false
    -  override def dataType: DataType = throw new UnsupportedOperationException
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
     
    -  override def sql: String = {
    -    val partition = if (partitionSpec.isEmpty) {
    -      ""
    -    } else {
    -      "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    frameSpecification match {
    +      case UnspecifiedFrame =>
    +        TypeCheckFailure(
    +          "Cannot use an UnspecifiedFrame. This should have been converted during analysis. " +
    +            "Please file a bug report.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && !f.isUnbounded
    +        && orderSpec.isEmpty =>
    +        TypeCheckFailure(
    +          "A range window frame cannot be used in an unordered window specification.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound
    +        && orderSpec.size > 1 =>
    +        TypeCheckFailure(
    +          s"A range window frame with value boundaries cannot be used in a window specification " +
    +            s"with multiple order by expressions: ${orderSpec.mkString(",")}")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound
    +        && !isValidFrameType(f.children.head.dataType) =>
    +        TypeCheckFailure(
    +          s"The data type '${orderSpec.head.dataType}' used in the order specification does " +
    +            s"not match the data type '${f.children.head.dataType}' which is used in the " +
    +            "range frame.")
    +      case _ => TypeCheckSuccess
         }
    +  }
     
    -    val order = if (orderSpec.isEmpty) {
    -      ""
    -    } else {
    -      "ORDER BY " + orderSpec.map(_.sql).mkString(", ") + " "
    +  override def sql: String = {
    +    def toSql(exprs: Seq[Expression], prefix: String): Seq[String] = {
    +      Seq(exprs).filter(_.nonEmpty).map(_.map(_.sql).mkString(prefix, ", ", ""))
         }
     
    -    s"($partition$order${frameSpecification.toString})"
    +    val elements =
    +      toSql(partitionSpec, "PARTITION BY ") ++
    +        toSql(orderSpec, "ORDER BY ") ++
    +        Seq(frameSpecification.sql)
    +    elements.mkString("(", " ", ")")
    +  }
    +
    +  private def isValidFrameType(ft: DataType): Boolean = (orderSpec.head.dataType, ft) match {
    +    case (DateType, IntegerType) => true
    +    case (TimestampType, CalendarIntervalType) => true
    --- End diff --
    
    shall we support `DateType`, `TimestampType` in follow-up PR? Let's focus on refactor/cleanup in this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130214960
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -805,4 +806,27 @@ object TypeCoercion {
           Option(ret)
         }
       }
    +
    +  /**
    +   * Cast WindowFrame boundaries to the type they operate upon.
    +   */
    +  object WindowFrameCoercion extends Rule[LogicalPlan] {
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
    +      case s @ WindowSpecDefinition(_, Seq(order), SpecifiedWindowFrame(RangeFrame, lower, upper))
    +        if order.resolved =>
    +        s.copy(frameSpecification = SpecifiedWindowFrame(
    +          RangeFrame,
    +          createBoundaryCast(lower, order.dataType),
    +          createBoundaryCast(upper, order.dataType)))
    +    }
    +
    +    private def createBoundaryCast(boundary: Expression, dt: DataType): Expression = {
    +      boundary match {
    +        case e: Expression if e.dataType != dt && Cast.canCast(e.dataType, dt) &&
    +          !e.isInstanceOf[SpecialFrameBoundary] =>
    --- End diff --
    
    Splitting `if` in the middle looks weird. How about?
    ```Scala
            case e: SpecialFrameBoundary => e
            case e: Expression if e.dataType != dt && Cast.canCast(e.dataType, dt) => Cast(e, dt)
            case _ => boundary
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130036232
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -43,57 +42,65 @@ case class WindowSpecDefinition(
         orderSpec: Seq[SortOrder],
         frameSpecification: WindowFrame) extends Expression with WindowSpec with Unevaluable {
     
    -  def validate: Option[String] = frameSpecification match {
    -    case UnspecifiedFrame =>
    -      Some("Found a UnspecifiedFrame. It should be converted to a SpecifiedWindowFrame " +
    -        "during analysis. Please file a bug report.")
    -    case frame: SpecifiedWindowFrame => frame.validate.orElse {
    -      def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
    -        if (orderSpec.length > 1)  {
    -          // It is not allowed to have a value-based PRECEDING and FOLLOWING
    -          // as the boundary of a Range Window Frame.
    -          Some("This Range Window Frame only accepts at most one ORDER BY expression.")
    -        } else if (orderSpec.nonEmpty && !orderSpec.head.dataType.isInstanceOf[NumericType]) {
    -          Some("The data type of the expression in the ORDER BY clause should be a numeric type.")
    -        } else {
    -          None
    -        }
    -      }
    -
    -      (frame.frameType, frame.frameStart, frame.frameEnd) match {
    -        case (RangeFrame, vp: ValuePreceding, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, vf: ValueFollowing, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vp: ValuePreceding) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vf: ValueFollowing) => checkValueBasedBoundaryForRangeFrame()
    -        case (_, _, _) => None
    -      }
    -    }
    -  }
    -
    -  override def children: Seq[Expression] = partitionSpec ++ orderSpec
    +  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ frameSpecification
     
       override lazy val resolved: Boolean =
         childrenResolved && checkInputDataTypes().isSuccess &&
           frameSpecification.isInstanceOf[SpecifiedWindowFrame]
     
       override def nullable: Boolean = true
       override def foldable: Boolean = false
    -  override def dataType: DataType = throw new UnsupportedOperationException
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
     
    -  override def sql: String = {
    -    val partition = if (partitionSpec.isEmpty) {
    -      ""
    -    } else {
    -      "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    frameSpecification match {
    +      case UnspecifiedFrame =>
    +        TypeCheckFailure(
    +          "Cannot use an UnspecifiedFrame. This should have been converted during analysis. " +
    +            "Please file a bug report.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && !f.isUnbounded &&
    +        orderSpec.isEmpty =>
    +        TypeCheckFailure(
    +          "A range window frame cannot be used in an unordered window specification.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound &&
    +        orderSpec.size > 1 =>
    +        TypeCheckFailure(
    +          s"A range window frame with value boundaries cannot be used in a window specification " +
    +            s"with multiple order by expressions: ${orderSpec.mkString(",")}")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound &&
    +        !isValidFrameType(f.valueBoundary.head.dataType) =>
    +        TypeCheckFailure(
    +          s"The data type '${orderSpec.head.dataType}' used in the order specification does " +
    +            s"not match the data type '${f.valueBoundary.head.dataType}' which is used in the " +
    +            "range frame.")
    +      case f: SpecifiedWindowFrame if !isValidFrameBoundary(f.lower, f.upper) =>
    +        TypeCheckFailure(s"The upper bound of the window frame is '${f.upper.sql}', which is " +
    +          s"smaller than the lower bound '${f.lower.sql}'.")
    +      case _ => TypeCheckSuccess
         }
    +  }
     
    -    val order = if (orderSpec.isEmpty) {
    -      ""
    -    } else {
    -      "ORDER BY " + orderSpec.map(_.sql).mkString(", ") + " "
    +  override def sql: String = {
    +    def toSql(exprs: Seq[Expression], prefix: String): Seq[String] = {
    +      Seq(exprs).filter(_.nonEmpty).map(_.map(_.sql).mkString(prefix, ", ", ""))
         }
     
    -    s"($partition$order${frameSpecification.toString})"
    +    val elements =
    +      toSql(partitionSpec, "PARTITION BY ") ++
    +        toSql(orderSpec, "ORDER BY ") ++
    +        Seq(frameSpecification.sql)
    +    elements.mkString("(", " ", ")")
    +  }
    +
    +  private def isValidFrameType(ft: DataType): Boolean = orderSpec.head.dataType == ft
    +
    +  private def isValidFrameBoundary(lower: Expression, upper: Expression): Boolean = {
    +    (lower, upper) match {
    +      case (UnboundedFollowing, _) => false
    +      case (_, UnboundedPreceding) => false
    +      case (l: Expression, u: SpecialFrameBoundary) => !u.notFollows(l)
    --- End diff --
    
    AFAIK, `UnboundedPreceding` is always valid, and `UnboundedFollowing` has been handled. As for `CurrentRow`, it's hard to compare it with an `Expression`, because we are considering to add literal support for rangeFrame, so `l.eval()` could be Int、Decimal、Date or any other stuff.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130215418
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,161 +101,171 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    -
    -/**
    - * The trait used to represent the type of a Window Frame Boundary.
    - */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    --- End diff --
    
    uh, we also support `CalendarInterval`. Do we have a test case to verify it works on `CalendarInterval`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80018/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130216635
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -121,15 +119,10 @@ trait CheckAnalysis extends PredicateHelper {
                 // function.
                 e match {
                   case _: AggregateExpression | _: OffsetWindowFunction | _: AggregateWindowFunction =>
    +                w
                   case _ =>
                     failAnalysis(s"Expression '$e' not supported within a window function.")
                 }
    -            // Make sure the window specification is valid.
    -            s.validate match {
    --- End diff --
    
    yea


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    LGTM except some minor comments


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r128894791
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
    +
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
    +case class SpecifiedWindowFrame(
    +    frameType: FrameType,
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override lazy val children: Seq[Expression] = lower :: upper :: Nil
     
    -  override def toString: String = s"$value PRECEDING"
    -}
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
     
    -/** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
     
    -  override def toString: String = "CURRENT ROW"
    -}
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    -    case UnboundedFollowing => true
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    +      case (l: Expression, u: Expression) if l.dataType != u.dataType =>
    +        TypeCheckFailure(
    +          s"Window frame bounds '$lower' and '$upper' do no not have the same data type: " +
    +            s"'${l.dataType.catalogString}' <> '${u.dataType.catalogString}'")
    +      case (l: Expression, u: Expression) if isGreaterThan(l, u) =>
    +        TypeCheckFailure(
    +          "The lower bound of a window frame must less than or equal to the upper bound")
    +      case _ => TypeCheckSuccess
    +    }
       }
     
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    +  override def sql: String = {
    +    val lowerSql = boundarySql(lower, "PRECEDING")
    +    val upperSql = boundarySql(upper, "FOLLOWING")
    +    s"${frameType.sql} BETWEEN $lowerSql AND $upperSql"
       }
     
    -  override def toString: String = "UNBOUNDED FOLLOWING"
    -}
    +  def isUnbounded: Boolean = lower == Unbounded && upper == Unbounded
     
    -/**
    - * Represents a window frame.
    - */
    -sealed trait WindowFrame
    +  def isValueBound: Boolean = valueBoundary.nonEmpty
    --- End diff --
    
    oh, is it possible that both `lower` and `upper` are current row?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r127453702
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +105,164 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    -}
    +sealed trait SpecialFrameBoundary
    +
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    --- End diff --
    
    Sure.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79231 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79231/testReport)** for PR 18540 at commit [`1135053`](https://github.com/apache/spark/commit/11350536255543535b3ec156eeca60b996d0a239).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r127142649
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +105,164 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    -}
    +sealed trait SpecialFrameBoundary
    +
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    --- End diff --
    
    can we make `SpecialFrameBoundary` an expression?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130217244
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
       }
     
       /**
    -   * Create or resolve a [[FrameBoundary]]. Simple math expressions are allowed for Value
    -   * Preceding/Following boundaries. These expressions must be constant (foldable) and return an
    -   * integer value.
    +   * Create or resolve a frame boundary expressions.
        */
    -  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = withOrigin(ctx) {
    -    // We currently only allow foldable integers.
    -    def value: Int = {
    +  override def visitFrameBound(ctx: FrameBoundContext): Expression = withOrigin(ctx) {
    +    def value: Expression = {
           val e = expression(ctx.expression)
    -      validate(e.resolved && e.foldable && e.dataType == IntegerType,
    -        "Frame bound value must be a constant integer.",
    -        ctx)
    -      e.eval().asInstanceOf[Int]
    +      validate(e.resolved && e.foldable, "Frame bound value must be a literal.", ctx)
    +      e
         }
     
    -    // Create the FrameBoundary
         ctx.boundType.getType match {
           case SqlBaseParser.PRECEDING if ctx.UNBOUNDED != null =>
             UnboundedPreceding
           case SqlBaseParser.PRECEDING =>
    -        ValuePreceding(value)
    +        UnaryMinus(value)
           case SqlBaseParser.CURRENT =>
             CurrentRow
           case SqlBaseParser.FOLLOWING if ctx.UNBOUNDED != null =>
             UnboundedFollowing
           case SqlBaseParser.FOLLOWING =>
    -        ValueFollowing(value)
    +        value
    --- End diff --
    
    May I ask how should we parse it into an `unsigned-integer`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130073180
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,161 +113,176 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    -
    -/**
    - * The trait used to represent the type of a Window Frame Boundary.
    - */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
     }
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
    +
    +  def notFollows(other: Expression): Boolean
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** UNBOUNDED boundary. */
    +case object UnboundedPreceding extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED PRECEDING"
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    +  override def notFollows(other: Expression): Boolean = true
     }
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    +case object UnboundedFollowing extends SpecialFrameBoundary {
    +  override def sql: String = "UNBOUNDED FOLLOWING"
    +
    +  override def notFollows(other: Expression): Boolean = other match {
         case UnboundedFollowing => true
    +    case _ => false
       }
    -
    -  override def toString: String = s"$value PRECEDING"
     }
     
     /** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "CURRENT ROW"
    -}
    +case object CurrentRow extends SpecialFrameBoundary {
    +  override def sql: String = "CURRENT ROW"
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    +  override def notFollows(other: Expression): Boolean = other match {
         case UnboundedFollowing => true
    +    case _ => false
       }
    -
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    -  }
    -
    -  override def toString: String = "UNBOUNDED FOLLOWING"
     }
     
     /**
      * Represents a window frame.
      */
    -sealed trait WindowFrame
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
    +}
     
     /** Used as a placeholder when a frame specification is not defined. */
     case object UnspecifiedFrame extends WindowFrame
     
    -/** A specified Window Frame. */
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
     case class SpecifiedWindowFrame(
         frameType: FrameType,
    -    frameStart: FrameBoundary,
    -    frameEnd: FrameBoundary) extends WindowFrame {
    -
    -  /** If this WindowFrame is valid or not. */
    -  def validate: Option[String] = (frameType, frameStart, frameEnd) match {
    -    case (_, UnboundedFollowing, _) =>
    -      Some(s"$UnboundedFollowing is not allowed as the start of a Window Frame.")
    -    case (_, _, UnboundedPreceding) =>
    -      Some(s"$UnboundedPreceding is not allowed as the end of a Window Frame.")
    -    // case (RowFrame, start, end) => ??? RowFrame specific rule
    -    // case (RangeFrame, start, end) => ??? RangeFrame specific rule
    -    case (_, start, end) =>
    -      if (start.notFollows(end)) {
    -        None
    -      } else {
    -        val reason =
    -          s"The end of this Window Frame $end is smaller than the start of " +
    -          s"this Window Frame $start."
    -        Some(reason)
    -      }
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
    +
    +  override lazy val children: Seq[Expression] = lower :: upper :: Nil
    --- End diff --
    
    nit: `def` instead of `lazy val`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80020 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80020/testReport)** for PR 18540 at commit [`9b8a19b`](https://github.com/apache/spark/commit/9b8a19b44e11525aa875df9ae3ffaed9477d1b43).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80041 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80041/testReport)** for PR 18540 at commit [`38b04df`](https://github.com/apache/spark/commit/38b04df792caf298da63be565d1b46a4eae93d98).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130214846
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -121,15 +119,10 @@ trait CheckAnalysis extends PredicateHelper {
                 // function.
                 e match {
                   case _: AggregateExpression | _: OffsetWindowFunction | _: AggregateWindowFunction =>
    +                w
                   case _ =>
                     failAnalysis(s"Expression '$e' not supported within a window function.")
                 }
    -            // Make sure the window specification is valid.
    -            s.validate match {
    --- End diff --
    
    The verification is moved to `checkInputDataTypes`, right? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    You still missed resolving [this comment](https://github.com/apache/spark/pull/18540#discussion_r130073540)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r125603187
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
    @@ -151,6 +151,48 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
             Row(2.0d), Row(2.0d)))
       }
     
    +  test("row between should accept integer values as boundary") {
    --- End diff --
    
    We should also add tests for dates/doubles/timestamps.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79841/testReport)** for PR 18540 at commit [`43b2399`](https://github.com/apache/spark/commit/43b23993af1b7abf20088ad0b63893afebe9a00d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r125688053
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -174,28 +191,22 @@ class WindowSpec private[sql](
        */
       // Note: when updating the doc for this method, also update Window.rangeBetween.
       def rangeBetween(start: Long, end: Long): WindowSpec = {
    --- End diff --
    
    I totally agree that's what we definitely should do, but I'd suggest we address this in a follow-up work, and focus on resolving the overflow issue on Long frame boundaries in `rangeBetween` in this PR.
    
    One major concern is the `WindowSpec` API is marked `Stable`, so I'm wondering what's the proper procedure to make a change to this interface?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r125602833
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala ---
    @@ -109,46 +109,54 @@ case class WindowExec(
        *
        * This method uses Code Generation. It can only be used on the executor side.
        *
    -   * @param frameType to evaluate. This can either be Row or Range based.
    -   * @param offset with respect to the row.
    +   * @param frame to evaluate. This can either be a Row or Range frame.
    +   * @param bound with respect to the row.
        * @return a bound ordering object.
        */
    -  private[this] def createBoundOrdering(frameType: FrameType, offset: Int): BoundOrdering = {
    -    frameType match {
    -      case RangeFrame =>
    -        val (exprs, current, bound) = if (offset == 0) {
    -          // Use the entire order expression when the offset is 0.
    -          val exprs = orderSpec.map(_.child)
    -          val buildProjection = () => newMutableProjection(exprs, child.output)
    -          (orderSpec, buildProjection(), buildProjection())
    -        } else if (orderSpec.size == 1) {
    -          // Use only the first order expression when the offset is non-null.
    -          val sortExpr = orderSpec.head
    -          val expr = sortExpr.child
    -          // Create the projection which returns the current 'value'.
    -          val current = newMutableProjection(expr :: Nil, child.output)
    -          // Flip the sign of the offset when processing the order is descending
    -          val boundOffset = sortExpr.direction match {
    -            case Descending => -offset
    -            case Ascending => offset
    -          }
    -          // Create the projection which returns the current 'value' modified by adding the offset.
    -          val boundExpr = Add(expr, Cast(Literal.create(boundOffset, IntegerType), expr.dataType))
    -          val bound = newMutableProjection(boundExpr :: Nil, child.output)
    -          (sortExpr :: Nil, current, bound)
    -        } else {
    -          sys.error("Non-Zero range offsets are not supported for windows " +
    -            "with multiple order expressions.")
    +  private[this] def createBoundOrdering(frame: FrameType, bound: AnyRef): BoundOrdering = {
    +    (frame, bound) match {
    +      case (RowFrame, CurrentRow) =>
    +        RowBoundOrdering(0)
    +
    +      case (RowFrame, IntegerLiteral(offset)) =>
    +        RowBoundOrdering(offset)
    +
    +      case (RangeFrame, CurrentRow) =>
    +        val ordering = newOrdering(orderSpec, child.output)
    +        RangeBoundOrdering(ordering, IdentityProjection, IdentityProjection)
    +
    +      case (RangeFrame, offset: Expression) if orderSpec.size == 1 =>
    +        // Use only the first order expression when the offset is non-null.
    +        val sortExpr = orderSpec.head
    +        val expr = sortExpr.child
    +
    +        // Create the projection which returns the current 'value'.
    +        val current = newMutableProjection(expr :: Nil, child.output)
    +
    +        // Flip the sign of the offset when processing the order is descending
    +        val boundOffset = sortExpr.direction match {
    +          case Descending => UnaryMinus(offset)
    +          case Ascending => offset
    +        }
    +
    +        // Create the projection which returns the current 'value' modified by adding the offset.
    +        val boundExpr = (expr.dataType, boundOffset.dataType) match {
    +          case (DateType, IntegerType) => DateAdd(expr, boundOffset)
    --- End diff --
    
    Both `DateType` and `TimestampType` expressions are going to need a time zone. I was wondering if we can use a `GMT` because these are just offset calculation? cc @ueshin 
    
    If we can't then we either need to thread through the session local timezone, or it might be easier to put the entire offset calculation in the frame.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130215803
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -174,28 +191,22 @@ class WindowSpec private[sql](
        */
       // Note: when updating the doc for this method, also update Window.rangeBetween.
       def rangeBetween(start: Long, end: Long): WindowSpec = {
    --- End diff --
    
    What happens if `start` is larger than `end`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r129004992
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
    +}
     
     /**
    - * The trait used to represent the type of a Window Frame Boundary.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = NullType
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    +/** UNBOUNDED boundary. */
    +case object Unbounded extends SpecialFrameBoundary
    +
    +/** CURRENT ROW boundary. */
    +case object CurrentRow extends SpecialFrameBoundary
    +
     /**
    - * Extractor for making working with frame boundaries easier.
    + * Represents a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait WindowFrame extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
    +  override def foldable: Boolean = false
    +  override def nullable: Boolean = false
     }
     
    -/** UNBOUNDED PRECEDING boundary. */
    -case object UnboundedPreceding extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => true
    -    case vp: ValuePreceding => true
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +/** Used as a placeholder when a frame specification is not defined. */
    +case object UnspecifiedFrame extends WindowFrame
     
    -  override def toString: String = "UNBOUNDED PRECEDING"
    -}
    +/**
    + * A specified Window Frame. The val lower/uppper can be either a foldable [[Expression]] or a
    + * [[SpecialFrameBoundary]].
    + */
    +case class SpecifiedWindowFrame(
    +    frameType: FrameType,
    +    lower: Expression,
    +    upper: Expression)
    +  extends WindowFrame {
     
    -/** <value> PRECEDING boundary. */
    -case class ValuePreceding(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case ValuePreceding(anotherValue) => value >= anotherValue
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override lazy val children: Seq[Expression] = lower :: upper :: Nil
     
    -  override def toString: String = s"$value PRECEDING"
    -}
    +  lazy val valueBoundary: Seq[Expression] =
    +    children.filterNot(_.isInstanceOf[SpecialFrameBoundary])
     
    -/** CURRENT ROW boundary. */
    -case object CurrentRow extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => true
    -    case vf: ValueFollowing => true
    -    case UnboundedFollowing => true
    -  }
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    // Check lower value.
    +    val lowerCheck = checkBoundary(lower, "lower")
    +    if (lowerCheck.isFailure) {
    +      return lowerCheck
    +    }
     
    -  override def toString: String = "CURRENT ROW"
    -}
    +    // Check upper value.
    +    val upperCheck = checkBoundary(upper, "upper")
    +    if (upperCheck.isFailure) {
    +      return upperCheck
    +    }
     
    -/** <value> FOLLOWING boundary. */
    -case class ValueFollowing(value: Int) extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case ValueFollowing(anotherValue) => value <= anotherValue
    -    case UnboundedFollowing => true
    +    // Check combination (of expressions).
    +    (lower, upper) match {
    +      case (l: SpecialFrameBoundary, _) => TypeCheckSuccess
    +      case (_, u: SpecialFrameBoundary) => TypeCheckSuccess
    +      case (l: Expression, u: Expression) if l.dataType != u.dataType =>
    +        TypeCheckFailure(
    +          s"Window frame bounds '$lower' and '$upper' do no not have the same data type: " +
    +            s"'${l.dataType.catalogString}' <> '${u.dataType.catalogString}'")
    +      case (l: Expression, u: Expression) if isGreaterThan(l, u) =>
    +        TypeCheckFailure(
    +          "The lower bound of a window frame must less than or equal to the upper bound")
    +      case _ => TypeCheckSuccess
    +    }
       }
     
    -  override def toString: String = s"$value FOLLOWING"
    -}
    -
    -/** UNBOUNDED FOLLOWING boundary. */
    -case object UnboundedFollowing extends FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean = other match {
    -    case UnboundedPreceding => false
    -    case vp: ValuePreceding => false
    -    case CurrentRow => false
    -    case vf: ValueFollowing => false
    -    case UnboundedFollowing => true
    +  override def sql: String = {
    +    val lowerSql = boundarySql(lower, "PRECEDING")
    +    val upperSql = boundarySql(upper, "FOLLOWING")
    +    s"${frameType.sql} BETWEEN $lowerSql AND $upperSql"
       }
     
    -  override def toString: String = "UNBOUNDED FOLLOWING"
    -}
    +  def isUnbounded: Boolean = lower == Unbounded && upper == Unbounded
     
    -/**
    - * Represents a window frame.
    - */
    -sealed trait WindowFrame
    +  def isValueBound: Boolean = valueBoundary.nonEmpty
    --- End diff --
    
    Having `rows between current row and current row` is kinda dumb, the aggregate should contain only 1 value. However `range between current row and current row` can be very useful because you can aggregate over all the observations with the same ordering value.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r125598577
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -43,57 +42,57 @@ case class WindowSpecDefinition(
         orderSpec: Seq[SortOrder],
         frameSpecification: WindowFrame) extends Expression with WindowSpec with Unevaluable {
     
    -  def validate: Option[String] = frameSpecification match {
    -    case UnspecifiedFrame =>
    -      Some("Found a UnspecifiedFrame. It should be converted to a SpecifiedWindowFrame " +
    -        "during analysis. Please file a bug report.")
    -    case frame: SpecifiedWindowFrame => frame.validate.orElse {
    -      def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
    -        if (orderSpec.length > 1)  {
    -          // It is not allowed to have a value-based PRECEDING and FOLLOWING
    -          // as the boundary of a Range Window Frame.
    -          Some("This Range Window Frame only accepts at most one ORDER BY expression.")
    -        } else if (orderSpec.nonEmpty && !orderSpec.head.dataType.isInstanceOf[NumericType]) {
    -          Some("The data type of the expression in the ORDER BY clause should be a numeric type.")
    -        } else {
    -          None
    -        }
    -      }
    -
    -      (frame.frameType, frame.frameStart, frame.frameEnd) match {
    -        case (RangeFrame, vp: ValuePreceding, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, vf: ValueFollowing, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vp: ValuePreceding) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vf: ValueFollowing) => checkValueBasedBoundaryForRangeFrame()
    -        case (_, _, _) => None
    -      }
    -    }
    -  }
    -
    -  override def children: Seq[Expression] = partitionSpec ++ orderSpec
    +  override def children: Seq[Expression] = partitionSpec ++ orderSpec ++ Seq(frameSpecification)
     
       override lazy val resolved: Boolean =
         childrenResolved && checkInputDataTypes().isSuccess &&
           frameSpecification.isInstanceOf[SpecifiedWindowFrame]
     
       override def nullable: Boolean = true
       override def foldable: Boolean = false
    -  override def dataType: DataType = throw new UnsupportedOperationException
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
     
    -  override def sql: String = {
    -    val partition = if (partitionSpec.isEmpty) {
    -      ""
    -    } else {
    -      "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    frameSpecification match {
    +      case UnspecifiedFrame =>
    +        TypeCheckFailure(
    +          "Cann't use an UnspecifiedFrame. This should have been converted during analysis. " +
    --- End diff --
    
    NIT: typo `Cannot`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79210 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79210/testReport)** for PR 18540 at commit [`52c5289`](https://github.com/apache/spark/commit/52c52895cfe3cae58f8b029a350949cc153971a7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed trait FrameType `
      * `sealed trait WindowFrame extends Expression with Unevaluable `
      * `case class SpecifiedWindowFrame(`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130215692
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
       }
     
       /**
    -   * Create or resolve a [[FrameBoundary]]. Simple math expressions are allowed for Value
    -   * Preceding/Following boundaries. These expressions must be constant (foldable) and return an
    -   * integer value.
    +   * Create or resolve a frame boundary expressions.
        */
    -  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = withOrigin(ctx) {
    -    // We currently only allow foldable integers.
    -    def value: Int = {
    +  override def visitFrameBound(ctx: FrameBoundContext): Expression = withOrigin(ctx) {
    +    def value: Expression = {
           val e = expression(ctx.expression)
    -      validate(e.resolved && e.foldable && e.dataType == IntegerType,
    -        "Frame bound value must be a constant integer.",
    -        ctx)
    -      e.eval().asInstanceOf[Int]
    +      validate(e.resolved && e.foldable, "Frame bound value must be a literal.", ctx)
    +      e
         }
     
    -    // Create the FrameBoundary
         ctx.boundType.getType match {
           case SqlBaseParser.PRECEDING if ctx.UNBOUNDED != null =>
             UnboundedPreceding
           case SqlBaseParser.PRECEDING =>
    -        ValuePreceding(value)
    +        UnaryMinus(value)
           case SqlBaseParser.CURRENT =>
             CurrentRow
           case SqlBaseParser.FOLLOWING if ctx.UNBOUNDED != null =>
             UnboundedFollowing
           case SqlBaseParser.FOLLOWING =>
    -        ValueFollowing(value)
    +        value
    --- End diff --
    
    It should be an `unsigned-integer` based on ANSI SQL


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79531 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79531/testReport)** for PR 18540 at commit [`a761f63`](https://github.com/apache/spark/commit/a761f632b0473c415535f68b4cbd1f5bb543ba94).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r129002630
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala ---
    @@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
         // Range/Row
         val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
         val boundaries = Seq(
    -      ("10 preceding", ValuePreceding(10), CurrentRow),
    -      ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail during analysis
    -      ("unbounded preceding", UnboundedPreceding, CurrentRow),
    -      ("unbounded following", UnboundedFollowing, CurrentRow), // Will fail during analysis
    -      ("between unbounded preceding and current row", UnboundedPreceding, CurrentRow),
    +      ("10 preceding", -Literal(10), CurrentRow),
    +      ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
    +      ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // Will fail during analysis
    +      ("unbounded preceding", Unbounded, CurrentRow),
    +      ("unbounded following", Unbounded, CurrentRow), // Will fail during analysis
    --- End diff --
    
    In fact this is problematic, we would generate the same result for both `unbounded preceding` and `unbounded following`. @hvanhovell any idea on resolving this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79835 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79835/testReport)** for PR 18540 at commit [`427753d`](https://github.com/apache/spark/commit/427753d27a15835d07cba0f47b35693efa6175f0).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed trait SpecialFrameBoundary extends Expression with Unevaluable `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79835/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130217171
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -351,20 +356,25 @@ abstract class OffsetWindowFunction
     
       override def nullable: Boolean = default == null || default.nullable || input.nullable
     
    -  override lazy val frame = {
    -    // This will be triggered by the Analyzer.
    -    val offsetValue = offset.eval() match {
    -      case o: Int => o
    -      case x => throw new AnalysisException(
    -        s"Offset expression must be a foldable integer expression: $x")
    -    }
    +  override lazy val frame: WindowFrame = {
         val boundary = direction match {
    -      case Ascending => ValueFollowing(offsetValue)
    -      case Descending => ValuePreceding(offsetValue)
    +      case Ascending => offset
    +      case Descending => UnaryMinus(offset)
         }
         SpecifiedWindowFrame(RowFrame, boundary, boundary)
       }
     
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    val check = super.checkInputDataTypes()
    +    if (check.isFailure) {
    +      check
    +    } else if (!offset.foldable) {
    +      TypeCheckFailure(s"Offset expression '$offset' must be a literal.")
    --- End diff --
    
    Currently it's only used by lead()/lag() functions, that both checked the input types, so we're not able to test this from sql.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r129000692
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala ---
    @@ -436,21 +436,22 @@ class TreeNodeSuite extends SparkFunSuite {
             "bucketColumnNames" -> "[bucket]",
             "sortColumnNames" -> "[sort]"))
     
    -    // Converts FrameBoundary to JSON
    -    assertJSON(
    -      ValueFollowing(3),
    -      JObject(
    -        "product-class" -> classOf[ValueFollowing].getName,
    -        "value" -> 3))
    -
         // Converts WindowFrame to JSON
         assertJSON(
    -      SpecifiedWindowFrame(RowFrame, UnboundedFollowing, CurrentRow),
    -      JObject(
    -        "product-class" -> classOf[SpecifiedWindowFrame].getName,
    -        "frameType" -> JObject("object" -> JString(RowFrame.getClass.getName)),
    -        "frameStart" -> JObject("object" -> JString(UnboundedFollowing.getClass.getName)),
    -        "frameEnd" -> JObject("object" -> JString(CurrentRow.getClass.getName))))
    +      SpecifiedWindowFrame(RowFrame, Unbounded, CurrentRow),
    +      List(
    +        JObject(
    +          "class" -> classOf[SpecifiedWindowFrame].getName,
    +          "num-children" -> 2,
    +          "frameType" -> JObject("object" -> JString(RowFrame.getClass.getName)),
    +          "lower" -> 0,
    +          "upper" -> 1),
    --- End diff --
    
    After this PR, `SpecialFrameBoundary` and `WindowFrame` are made `Expression`s, thus they are `TreeNode`s, so the field values are made value index in the TreeNode.children.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r128894905
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
    @@ -1109,6 +1109,31 @@ class TypeCoercionSuite extends AnalysisTest {
           EqualTo(Literal(Array(1, 2)), Literal("123")),
           EqualTo(Literal(Array(1, 2)), Literal("123")))
       }
    +
    +  test("cast WindowFrame boundaries to the type they operate upon") {
    +    // Can cast frame boundaries to order dataType.
    +    ruleTest(WindowFrameCoercion,
    +      windowSpec(
    +        Seq(UnresolvedAttribute("a")),
    +        Seq(SortOrder(Literal(1L), Ascending)),
    +        SpecifiedWindowFrame(RangeFrame, Literal(3), Literal(2147483648L))),
    +      windowSpec(
    +        Seq(UnresolvedAttribute("a")),
    +        Seq(SortOrder(Literal(1L), Ascending)),
    +        SpecifiedWindowFrame(RangeFrame, Cast(3, LongType), Literal(2147483648L)))
    +    )
    +    // Cannot cast frame boundaries to order dataType.
    +    ruleTest(WindowFrameCoercion,
    +      windowSpec(
    +        Seq(UnresolvedAttribute("a")),
    +        Seq(SortOrder(Literal.default(DateType), Ascending)),
    +        SpecifiedWindowFrame(RangeFrame, Literal(10.0), Literal(2147483648L))),
    +      windowSpec(
    +        Seq(UnresolvedAttribute("a")),
    +        Seq(SortOrder(Literal.default(DateType), Ascending)),
    +        SpecifiedWindowFrame(RangeFrame, Literal(10.0), Literal(2147483648L)))
    +    )
    --- End diff --
    
    can we add some more test cases with special window frame boundary?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #79531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79531/testReport)** for PR 18540 at commit [`a761f63`](https://github.com/apache/spark/commit/a761f632b0473c415535f68b4cbd1f5bb543ba94).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130073049
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -106,161 +113,176 @@ case class WindowSpecReference(name: String) extends WindowSpec
     /**
      * The trait used to represent the type of a Window Frame.
      */
    -sealed trait FrameType
    +sealed trait FrameType {
    +  def inputType: AbstractDataType
    +  def sql: String
    +}
     
     /**
    - * RowFrame treats rows in a partition individually. When a [[ValuePreceding]]
    - * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is considered
    - * as a physical offset.
    + * RowFrame treats rows in a partition individually. Values used in a row frame are considered
    + * to be physical offsets.
      * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 3-row frame,
      * from the row that precedes the current row to the row that follows the current row.
      */
    -case object RowFrame extends FrameType
    +case object RowFrame extends FrameType {
    +  override def inputType: AbstractDataType = IntegerType
    +  override def sql: String = "ROWS"
    +}
     
     /**
    - * RangeFrame treats rows in a partition as groups of peers.
    - * All rows having the same `ORDER BY` ordering are considered as peers.
    - * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its [[FrameBoundary]],
    - * the value is considered as a logical offset.
    + * RangeFrame treats rows in a partition as groups of peers. All rows having the same `ORDER BY`
    + * ordering are considered as peers. Values used in a range frame are considered to be logical
    + * offsets.
      * For example, assuming the value of the current row's `ORDER BY` expression `expr` is `v`,
      * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame containing rows whose values
      * `expr` are in the range of [v-1, v+1].
      *
      * If `ORDER BY` clause is not defined, all rows in the partition are considered as peers
      * of the current row.
      */
    -case object RangeFrame extends FrameType
    -
    -/**
    - * The trait used to represent the type of a Window Frame Boundary.
    - */
    -sealed trait FrameBoundary {
    -  def notFollows(other: FrameBoundary): Boolean
    +case object RangeFrame extends FrameType {
    +  override def inputType: AbstractDataType = TypeCollection.NumericAndInterval
    +  override def sql: String = "RANGE"
     }
     
     /**
    - * Extractor for making working with frame boundaries easier.
    + * The trait used to represent special boundaries used in a window frame.
      */
    -object FrameBoundary {
    -  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
    -  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
    -    case CurrentRow => Some(0)
    -    case ValuePreceding(offset) => Some(-offset)
    -    case ValueFollowing(offset) => Some(offset)
    -    case _ => None
    -  }
    +sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
    +  override lazy val children: Seq[Expression] = Nil
    --- End diff --
    
    nit: `def` instead of `lazy val`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80006 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80006/testReport)** for PR 18540 at commit [`a1f91cd`](https://github.com/apache/spark/commit/a1f91cd7b0f10176b551cb168bf23d2eef68c15c).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80021/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    LGTM, pending tests


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130215242
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -43,58 +42,54 @@ case class WindowSpecDefinition(
         orderSpec: Seq[SortOrder],
         frameSpecification: WindowFrame) extends Expression with WindowSpec with Unevaluable {
     
    -  def validate: Option[String] = frameSpecification match {
    -    case UnspecifiedFrame =>
    -      Some("Found a UnspecifiedFrame. It should be converted to a SpecifiedWindowFrame " +
    -        "during analysis. Please file a bug report.")
    -    case frame: SpecifiedWindowFrame => frame.validate.orElse {
    -      def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
    -        if (orderSpec.length > 1)  {
    -          // It is not allowed to have a value-based PRECEDING and FOLLOWING
    -          // as the boundary of a Range Window Frame.
    -          Some("This Range Window Frame only accepts at most one ORDER BY expression.")
    -        } else if (orderSpec.nonEmpty && !orderSpec.head.dataType.isInstanceOf[NumericType]) {
    -          Some("The data type of the expression in the ORDER BY clause should be a numeric type.")
    -        } else {
    -          None
    -        }
    -      }
    -
    -      (frame.frameType, frame.frameStart, frame.frameEnd) match {
    -        case (RangeFrame, vp: ValuePreceding, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, vf: ValueFollowing, _) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vp: ValuePreceding) => checkValueBasedBoundaryForRangeFrame()
    -        case (RangeFrame, _, vf: ValueFollowing) => checkValueBasedBoundaryForRangeFrame()
    -        case (_, _, _) => None
    -      }
    -    }
    -  }
    -
    -  override def children: Seq[Expression] = partitionSpec ++ orderSpec
    +  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ frameSpecification
     
       override lazy val resolved: Boolean =
         childrenResolved && checkInputDataTypes().isSuccess &&
           frameSpecification.isInstanceOf[SpecifiedWindowFrame]
     
       override def nullable: Boolean = true
       override def foldable: Boolean = false
    -  override def dataType: DataType = throw new UnsupportedOperationException
    +  override def dataType: DataType = throw new UnsupportedOperationException("dataType")
     
    -  override def sql: String = {
    -    val partition = if (partitionSpec.isEmpty) {
    -      ""
    -    } else {
    -      "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
    +  override def checkInputDataTypes(): TypeCheckResult = {
    +    frameSpecification match {
    +      case UnspecifiedFrame =>
    +        TypeCheckFailure(
    +          "Cannot use an UnspecifiedFrame. This should have been converted during analysis. " +
    +            "Please file a bug report.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && !f.isUnbounded &&
    +        orderSpec.isEmpty =>
    +        TypeCheckFailure(
    +          "A range window frame cannot be used in an unordered window specification.")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound &&
    +        orderSpec.size > 1 =>
    +        TypeCheckFailure(
    +          s"A range window frame with value boundaries cannot be used in a window specification " +
    +            s"with multiple order by expressions: ${orderSpec.mkString(",")}")
    +      case f: SpecifiedWindowFrame if f.frameType == RangeFrame && f.isValueBound &&
    +        !isValidFrameType(f.valueBoundary.head.dataType) =>
    --- End diff --
    
    Personally, I do not like many long `if` conditions. Let us add extra two spaces in line 71, 66, and 62.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r130215611
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
       }
     
       /**
    -   * Create or resolve a [[FrameBoundary]]. Simple math expressions are allowed for Value
    -   * Preceding/Following boundaries. These expressions must be constant (foldable) and return an
    -   * integer value.
    +   * Create or resolve a frame boundary expressions.
        */
    -  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = withOrigin(ctx) {
    -    // We currently only allow foldable integers.
    -    def value: Int = {
    +  override def visitFrameBound(ctx: FrameBoundContext): Expression = withOrigin(ctx) {
    +    def value: Expression = {
           val e = expression(ctx.expression)
    -      validate(e.resolved && e.foldable && e.dataType == IntegerType,
    -        "Frame bound value must be a constant integer.",
    -        ctx)
    -      e.eval().asInstanceOf[Int]
    +      validate(e.resolved && e.foldable, "Frame bound value must be a literal.", ctx)
    --- End diff --
    
    Any test case?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18540#discussion_r126187356
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
    @@ -174,28 +191,22 @@ class WindowSpec private[sql](
        */
       // Note: when updating the doc for this method, also update Window.rangeBetween.
       def rangeBetween(start: Long, end: Long): WindowSpec = {
    --- End diff --
    
    Let's rule it out of the scope of this PR and address this in a follow-up.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18540
  
    **[Test build #80041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80041/testReport)** for PR 18540 at commit [`38b04df`](https://github.com/apache/spark/commit/38b04df792caf298da63be565d1b46a4eae93d98).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org