Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2018/05/31 18:14:53 UTC

[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/21470

    [SPARK-24443][SQL] comparison should accept structurally-equal types

    ## What changes were proposed in this pull request?
    
    When comparing struct type values, it's hard to make the field names match. Simple queries like
    `SELECT (a, b) = (1, 2) FROM t` may fail because the field names differ.
    
    In Postgres, when comparing struct type values, it does safe type coercion and ignores field name differences:
    ```
    # create table t(i smallint, j smallint);
    CREATE TABLE
    
    # select Row(i, j) = Row(1, 1) from t;                              
     ?column? 
    ----------
     t
    (1 row)
    
    # select Row(i, j) < Row(1, 1) from t;
     ?column? 
    ----------
     f
    (1 row)
    
    # select Row(i, j) = Row(j, i) from t;
     ?column? 
    ----------
     t
    (1 row)
    ```
    
    This PR follows Postgres and accepts structurally-equal types in comparisons.
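
    For illustration, a minimal sketch of the kind of query this change targets (the SparkSession,
    table setup and `named_struct` phrasing below are illustrative, not part of the patch):
    ```
    // Assumes a SparkSession named `spark`.
    spark.sql("CREATE TABLE t(i SMALLINT, j SMALLINT) USING parquet")

    // Field names differ (i, j vs a, b) and the field types need widening (smallint vs int);
    // the intent of this patch is that such a comparison still resolves, as it does in Postgres.
    spark.sql("SELECT named_struct('i', i, 'j', j) = named_struct('a', 1, 'b', 1) FROM t").show()
    ```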
    
    TODO:
    * check Hive and Presto.
    * think about array/map types. Postgres doesn't apply type coercion to array/map elements in comparisons.
    
    ## How was this patch tested?
    
    new test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark equal

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21470.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21470
    
----
commit 3555e0fd1a8003af0e6a7694aab8999698aab9c4
Author: Wenchen Fan <we...@...>
Date:   2018-05-31T18:00:42Z

    comparison should accept structurally-equal types

----


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by eric-maynard <gi...@git.apache.org>.
Github user eric-maynard commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192820148
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -803,18 +803,60 @@ object TypeCoercion {
               e.copy(left = Cast(e.left, TimestampType))
             }
     
    -      case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
    -        findTightestCommonType(left.dataType, right.dataType).map { commonType =>
    -          if (b.inputType.acceptsType(commonType)) {
    -            // If the expression accepts the tightest common type, cast to that.
    -            val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
    -            val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
    -            b.withNewChildren(Seq(newLeft, newRight))
    -          } else {
    -            // Otherwise, don't do anything with the expression.
    -            b
    -          }
    -        }.getOrElse(b)  // If there is no applicable conversion, leave expression unchanged.
    +      case b @ BinaryOperator(left, right)
    +          if !BinaryOperator.sameType(left.dataType, right.dataType) =>
    +        (left.dataType, right.dataType) match {
    +          case (StructType(fields1), StructType(fields2)) =>
    +            val commonTypes = scala.collection.mutable.ArrayBuffer.empty[DataType]
    +            val len = fields1.length
    +            var i = 0
    +            var continue = fields1.length == fields2.length
    +            while (i < len && continue) {
    --- End diff --
    
    This loop could be refactored functionally, e.g.
    ```
    val commonTypes = (fields1 zip fields2).map(f => findTightestCommonType(f._1.dataType, f._2.dataType))
    if (commonTypes.forall(_.isDefined)) {
     . . .
    ```
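
    A self-contained sketch of that idea (stand-alone code for illustration only; the
    `findTightestCommonType` here is a stub, not the real `TypeCoercion` implementation):
    ```
    import org.apache.spark.sql.types._

    // Illustrative stub standing in for TypeCoercion.findTightestCommonType.
    def findTightestCommonType(t1: DataType, t2: DataType): Option[DataType] = (t1, t2) match {
      case (l, r) if l == r => Some(l)
      case (ShortType, IntegerType) | (IntegerType, ShortType) => Some(IntegerType)
      case _ => None
    }

    // Per-field common types, or None if the field counts differ or any pair has no common type.
    def commonFieldTypes(
        fields1: Array[StructField],
        fields2: Array[StructField]): Option[Seq[DataType]] = {
      if (fields1.length != fields2.length) {
        None
      } else {
        val maybeTypes = fields1.zip(fields2).map {
          case (f1, f2) => findTightestCommonType(f1.dataType, f2.dataType)
        }
        if (maybeTypes.forall(_.isDefined)) Some(maybeTypes.flatten.toSeq) else None
      }
    }
    ```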


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192274662
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -803,18 +803,60 @@ object TypeCoercion {
               e.copy(left = Cast(e.left, TimestampType))
             }
     
    -      case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
    -        findTightestCommonType(left.dataType, right.dataType).map { commonType =>
    -          if (b.inputType.acceptsType(commonType)) {
    -            // If the expression accepts the tightest common type, cast to that.
    -            val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
    -            val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
    -            b.withNewChildren(Seq(newLeft, newRight))
    -          } else {
    -            // Otherwise, don't do anything with the expression.
    -            b
    -          }
    -        }.getOrElse(b)  // If there is no applicable conversion, leave expression unchanged.
    +      case b @ BinaryOperator(left, right)
    +          if !BinaryOperator.sameType(left.dataType, right.dataType) =>
    --- End diff --
    
    `findTightestCommonType` doesn't accept struct types with different field names, so the other code is needed.
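
    For example (illustrative types, not from the patch), a pair like this is structurally equal
    but is not accepted as a whole:
    ```
    import org.apache.spark.sql.types._

    // Same shape, different field names: findTightestCommonType(left, right) gives None,
    // which is why the new branch walks the fields one by one instead.
    val left = StructType(Seq(StructField("i", IntegerType)))
    val right = StructType(Seq(StructField("j", IntegerType)))
    ```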


---



[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21470
  
    cc @gatorsmile @dbtsai @dongjoon-hyun @viirya 


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192266906
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -803,18 +803,60 @@ object TypeCoercion {
               e.copy(left = Cast(e.left, TimestampType))
             }
     
    -      case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
    -        findTightestCommonType(left.dataType, right.dataType).map { commonType =>
    -          if (b.inputType.acceptsType(commonType)) {
    -            // If the expression accepts the tightest common type, cast to that.
    -            val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
    -            val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
    -            b.withNewChildren(Seq(newLeft, newRight))
    -          } else {
    -            // Otherwise, don't do anything with the expression.
    -            b
    -          }
    -        }.getOrElse(b)  // If there is no applicable conversion, leave expression unchanged.
    +      case b @ BinaryOperator(left, right)
    +          if !BinaryOperator.sameType(left.dataType, right.dataType) =>
    --- End diff --
    
    How about just changing this line? The other changes in this file could be done later.


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192266617
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala ---
    @@ -62,8 +62,9 @@ trait HashJoin {
       }
     
       protected lazy val (buildKeys, streamedKeys) = {
    -    require(leftKeys.map(_.dataType) == rightKeys.map(_.dataType),
    -      "Join keys from two sides should have same types")
    +    require(leftKeys.map(_.dataType).zip(rightKeys.map(_.dataType)).forall {
    +      case (l, r) => BinaryOperator.sameType(l, r)
    --- End diff --
    
    Do we have a test case to cover this?


---



[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21470
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21470
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21470
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3737/
    Test PASSed.


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192314292
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -803,18 +803,60 @@ object TypeCoercion {
               e.copy(left = Cast(e.left, TimestampType))
             }
     
    -      case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
    -        findTightestCommonType(left.dataType, right.dataType).map { commonType =>
    -          if (b.inputType.acceptsType(commonType)) {
    -            // If the expression accepts the tightest common type, cast to that.
    -            val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
    -            val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
    -            b.withNewChildren(Seq(newLeft, newRight))
    -          } else {
    -            // Otherwise, don't do anything with the expression.
    -            b
    -          }
    -        }.getOrElse(b)  // If there is no applicable conversion, leave expression unchanged.
    +      case b @ BinaryOperator(left, right)
    +          if !BinaryOperator.sameType(left.dataType, right.dataType) =>
    +        (left.dataType, right.dataType) match {
    +          case (StructType(fields1), StructType(fields2)) =>
    +            val commonTypes = scala.collection.mutable.ArrayBuffer.empty[DataType]
    +            val len = fields1.length
    +            var i = 0
    +            var continue = fields1.length == fields2.length
    +            while (i < len && continue) {
    +              val commonType = findTightestCommonType(fields1(i).dataType, fields2(i).dataType)
    --- End diff --
    
    What about nested structs?
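
    For example (illustrative types only), a field whose type is itself a struct with different
    inner field names would make this per-field call return `None`, even though the two sides are
    structurally equal:
    ```
    // fields1(0).dataType and fields2(0).dataType are structs whose inner field names differ,
    // so findTightestCommonType on that pair gives None and the comparison is left uncast.
    StructType(Seq(StructField("a", StructType(Seq(StructField("x", IntegerType))))))
    StructType(Seq(StructField("a", StructType(Seq(StructField("y", IntegerType))))))
    ```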


---



[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21470
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91355/
    Test PASSed.


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192314729
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -803,18 +803,60 @@ object TypeCoercion {
               e.copy(left = Cast(e.left, TimestampType))
             }
     
    -      case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
    -        findTightestCommonType(left.dataType, right.dataType).map { commonType =>
    -          if (b.inputType.acceptsType(commonType)) {
    -            // If the expression accepts the tightest common type, cast to that.
    -            val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
    -            val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
    -            b.withNewChildren(Seq(newLeft, newRight))
    -          } else {
    -            // Otherwise, don't do anything with the expression.
    -            b
    -          }
    -        }.getOrElse(b)  // If there is no applicable conversion, leave expression unchanged.
    +      case b @ BinaryOperator(left, right)
    +          if !BinaryOperator.sameType(left.dataType, right.dataType) =>
    +        (left.dataType, right.dataType) match {
    +          case (StructType(fields1), StructType(fields2)) =>
    +            val commonTypes = scala.collection.mutable.ArrayBuffer.empty[DataType]
    +            val len = fields1.length
    +            var i = 0
    +            var continue = fields1.length == fields2.length
    +            while (i < len && continue) {
    +              val commonType = findTightestCommonType(fields1(i).dataType, fields2(i).dataType)
    +              if (commonType.isDefined) {
    +                commonTypes += commonType.get
    +              } else {
    +                continue = false
    +              }
    +              i += 1
    +            }
    +
    +            if (continue) {
    +              val newLeftST = new StructType(fields1.zip(commonTypes).map {
    +                case (f, commonType) => f.copy(dataType = commonType)
    +              })
    +              val newLeft = if (left.dataType == newLeftST) left else Cast(left, newLeftST)
    +
    +              val newRightST = new StructType(fields2.zip(commonTypes).map {
    +                case (f, commonType) => f.copy(dataType = commonType)
    +              })
    +              val newRight = if (right.dataType == newRightST) right else Cast(right, newRightST)
    +
    +              if (b.inputType.acceptsType(newLeftST) && b.inputType.acceptsType(newRightST)) {
    --- End diff --
    
    Is it possible that `b` accepts only one side (e.g., only `newLeftST`) but not the other?


---



[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21470
  
    **[Test build #91355 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91355/testReport)** for PR 21470 at commit [`3555e0f`](https://github.com/apache/spark/commit/3555e0fd1a8003af0e6a7694aab8999698aab9c4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192274753
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -803,18 +803,60 @@ object TypeCoercion {
               e.copy(left = Cast(e.left, TimestampType))
             }
     
    -      case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
    -        findTightestCommonType(left.dataType, right.dataType).map { commonType =>
    -          if (b.inputType.acceptsType(commonType)) {
    -            // If the expression accepts the tightest common type, cast to that.
    -            val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
    -            val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
    -            b.withNewChildren(Seq(newLeft, newRight))
    -          } else {
    -            // Otherwise, don't do anything with the expression.
    -            b
    -          }
    -        }.getOrElse(b)  // If there is no applicable conversion, leave expression unchanged.
    +      case b @ BinaryOperator(left, right)
    +          if !BinaryOperator.sameType(left.dataType, right.dataType) =>
    +        (left.dataType, right.dataType) match {
    +          case (StructType(fields1), StructType(fields2)) =>
    +            val commonTypes = scala.collection.mutable.ArrayBuffer.empty[DataType]
    +            val len = fields1.length
    +            var i = 0
    +            var continue = fields1.length == fields2.length
    +            while (i < len && continue) {
    +              val commonType = findTightestCommonType(fields1(i).dataType, fields2(i).dataType)
    --- End diff --
    
    If we don't want to support type coercion, we can change this line to use `fields1(i).dataType == fields2(i).dataType`.


---



[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21470
  
    **[Test build #91355 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91355/testReport)** for PR 21470 at commit [`3555e0f`](https://github.com/apache/spark/commit/3555e0fd1a8003af0e6a7694aab8999698aab9c4).


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by eric-maynard <gi...@git.apache.org>.
Github user eric-maynard commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192819128
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -803,18 +803,60 @@ object TypeCoercion {
               e.copy(left = Cast(e.left, TimestampType))
             }
     
    -      case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
    -        findTightestCommonType(left.dataType, right.dataType).map { commonType =>
    -          if (b.inputType.acceptsType(commonType)) {
    -            // If the expression accepts the tightest common type, cast to that.
    -            val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
    -            val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
    -            b.withNewChildren(Seq(newLeft, newRight))
    -          } else {
    -            // Otherwise, don't do anything with the expression.
    -            b
    -          }
    -        }.getOrElse(b)  // If there is no applicable conversion, leave expression unchanged.
    +      case b @ BinaryOperator(left, right)
    +          if !BinaryOperator.sameType(left.dataType, right.dataType) =>
    +        (left.dataType, right.dataType) match {
    +          case (StructType(fields1), StructType(fields2)) =>
    +            val commonTypes = scala.collection.mutable.ArrayBuffer.empty[DataType]
    +            val len = fields1.length
    +            var i = 0
    +            var continue = fields1.length == fields2.length
    +            while (i < len && continue) {
    +              val commonType = findTightestCommonType(fields1(i).dataType, fields2(i).dataType)
    +              if (commonType.isDefined) {
    +                commonTypes += commonType.get
    +              } else {
    +                continue = false
    +              }
    +              i += 1
    +            }
    +
    +            if (continue) {
    +              val newLeftST = new StructType(fields1.zip(commonTypes).map {
    +                case (f, commonType) => f.copy(dataType = commonType)
    +              })
    +              val newLeft = if (left.dataType == newLeftST) left else Cast(left, newLeftST)
    +
    +              val newRightST = new StructType(fields2.zip(commonTypes).map {
    +                case (f, commonType) => f.copy(dataType = commonType)
    +              })
    +              val newRight = if (right.dataType == newRightST) right else Cast(right, newRightST)
    +
    +              if (b.inputType.acceptsType(newLeftST) && b.inputType.acceptsType(newRightST)) {
    +                b.withNewChildren(Seq(newLeft, newRight))
    +              } else {
    +                // type not acceptable, don't do anything with the expression.
    +                b
    +              }
    +            } else {
    +              // left struct type and right struct type have different number of fields, or some
    +              // fields don't have a common type, don't do anything with the expression.
    +              b
    +            }
    +
    +          case _ =>
    +            findTightestCommonType(left.dataType, right.dataType).map { commonType =>
    +              if (b.inputType.acceptsType(commonType)) {
    +                // If the expression accepts the tightest common type, cast to that.
    +                val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
    --- End diff --
    
    This ternary operation seems to crop up a few times in this PR. Maybe we can push it out into a method?
    ```
    private def castIfNeeded(e: Expression, possibleType: DataType): Expression = {
      if (e.dataType == possibleType) e else Cast(e, possibleType)
    }
    ```
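
    The casts here and in the struct branch above could then become, e.g. (illustrative):
    ```
    val newLeft = castIfNeeded(left, commonType)
    val newRight = castIfNeeded(right, commonType)
    ```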


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192309419
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -2696,16 +2687,6 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
             sql("SELECT struct(1 a) EXCEPT (SELECT struct(2 A))")
           }.message
           assert(m2.contains("Except can only be performed on tables with the compatible column types"))
    -
    -      withTable("t", "S") {
    -        sql("CREATE TABLE t(c struct<f:int>) USING parquet")
    -        sql("CREATE TABLE S(C struct<F:int>) USING parquet")
    -        checkAnswer(sql("SELECT * FROM t, S WHERE t.c.f = S.C.F"), Seq.empty)
    -        val m = intercept[AnalysisException] {
    -          sql("SELECT * FROM t, S WHERE c = C")
    -        }.message
    -        assert(m.contains("cannot resolve '(t.`c` = S.`C`)' due to data type mismatch"))
    --- End diff --
    
    Thank you for pinging me, @cloud-fan .
    
    Since this removal is a real behavior change rather than just new test coverage in `comparator.sql`, could you add documentation for it?


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192192426
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -2675,15 +2675,6 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
           sql("SELECT struct(1 a) UNION ALL (SELECT struct(2 A))")
           sql("SELECT struct(1 a) EXCEPT (SELECT struct(2 A))")
    -
    -      withTable("t", "S") {
    -        sql("CREATE TABLE t(c struct<f:int>) USING parquet")
    -        sql("CREATE TABLE S(C struct<F:int>) USING parquet")
    -        Seq(("c", "C"), ("C", "c"), ("c.f", "C.F"), ("C.F", "c.f")).foreach {
    -          case (left, right) =>
    -            checkAnswer(sql(s"SELECT * FROM t, S WHERE t.$left = S.$right"), Seq.empty)
    --- End diff --
    
    This test case was meant to test set operations, but here it tests a filter. The new tests should already cover it.


---



[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21470#discussion_r192274906
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala ---
    @@ -62,8 +62,9 @@ trait HashJoin {
       }
     
       protected lazy val (buildKeys, streamedKeys) = {
    -    require(leftKeys.map(_.dataType) == rightKeys.map(_.dataType),
    -      "Join keys from two sides should have same types")
    +    require(leftKeys.map(_.dataType).zip(rightKeys.map(_.dataType)).forall {
    +      case (l, r) => BinaryOperator.sameType(l, r)
    --- End diff --
    
    I hit a test failure before changing this. This kind of check (assert, require) can only be hit when there is a bug.


---
