You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2018/05/31 18:14:53 UTC
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/21470
[SPARK-24443][SQL] comparison should accept structurally-equal types
## What changes were proposed in this pull request?
When comparing struct type values, it's a little hard to make the field names same. Simple queries like
`SELECT (a, b) = (1, 2) FROM t` may fail because the field names do not match.
In Postgres, when comparing struct type values, it will do safe type coercion and ignore field name difference
```
# create table t(i smallint, j smallint);
CREATE TABLE
# select Row(i, j) = Row(1, 1) from t;
?column?
----------
t
(1 row)
# select Row(i, j) < Row(1, 1) from t;
?column?
----------
f
(1 row)
# select Row(i, j) = Row(j, i) from t;
?column?
----------
t
(1 row)
```
This PR follows Postgres and accept structurally-equal types in comparison
TODO:
* check Hive and Presto.
* think about array/map type. Postgres doesn't support type coercion for elements in array/map when comparison.
## How was this patch tested?
new test cases.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark equal
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21470.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21470
----
commit 3555e0fd1a8003af0e6a7694aab8999698aab9c4
Author: Wenchen Fan <we...@...>
Date: 2018-05-31T18:00:42Z
comparison should accept structurally-equal types
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by eric-maynard <gi...@git.apache.org>.
Github user eric-maynard commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192820148
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -803,18 +803,60 @@ object TypeCoercion {
e.copy(left = Cast(e.left, TimestampType))
}
- case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
- findTightestCommonType(left.dataType, right.dataType).map { commonType =>
- if (b.inputType.acceptsType(commonType)) {
- // If the expression accepts the tightest common type, cast to that.
- val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
- val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
- b.withNewChildren(Seq(newLeft, newRight))
- } else {
- // Otherwise, don't do anything with the expression.
- b
- }
- }.getOrElse(b) // If there is no applicable conversion, leave expression unchanged.
+ case b @ BinaryOperator(left, right)
+ if !BinaryOperator.sameType(left.dataType, right.dataType) =>
+ (left.dataType, right.dataType) match {
+ case (StructType(fields1), StructType(fields2)) =>
+ val commonTypes = scala.collection.mutable.ArrayBuffer.empty[DataType]
+ val len = fields1.length
+ var i = 0
+ var continue = fields1.length == fields2.length
+ while (i < len && continue) {
--- End diff --
This loop could be refactored functionally, e.g.
```
val commonTypes = (fields1 zip fields2).map(f => findTightestCommonType(f._1, f._2))
if (commonTypes.forall(_.isDefined)) {
. . .
```
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192274662
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -803,18 +803,60 @@ object TypeCoercion {
e.copy(left = Cast(e.left, TimestampType))
}
- case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
- findTightestCommonType(left.dataType, right.dataType).map { commonType =>
- if (b.inputType.acceptsType(commonType)) {
- // If the expression accepts the tightest common type, cast to that.
- val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
- val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
- b.withNewChildren(Seq(newLeft, newRight))
- } else {
- // Otherwise, don't do anything with the expression.
- b
- }
- }.getOrElse(b) // If there is no applicable conversion, leave expression unchanged.
+ case b @ BinaryOperator(left, right)
+ if !BinaryOperator.sameType(left.dataType, right.dataType) =>
--- End diff --
`findTightestCommonType` doesn't accept struct type with different filed names, so the other code are needed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/21470
cc @gatorsmile @dbtsai @dongjoon-hyun @viirya
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192266906
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -803,18 +803,60 @@ object TypeCoercion {
e.copy(left = Cast(e.left, TimestampType))
}
- case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
- findTightestCommonType(left.dataType, right.dataType).map { commonType =>
- if (b.inputType.acceptsType(commonType)) {
- // If the expression accepts the tightest common type, cast to that.
- val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
- val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
- b.withNewChildren(Seq(newLeft, newRight))
- } else {
- // Otherwise, don't do anything with the expression.
- b
- }
- }.getOrElse(b) // If there is no applicable conversion, leave expression unchanged.
+ case b @ BinaryOperator(left, right)
+ if !BinaryOperator.sameType(left.dataType, right.dataType) =>
--- End diff --
How about just change this line? The other changes in this file can be done later?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192266617
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala ---
@@ -62,8 +62,9 @@ trait HashJoin {
}
protected lazy val (buildKeys, streamedKeys) = {
- require(leftKeys.map(_.dataType) == rightKeys.map(_.dataType),
- "Join keys from two sides should have same types")
+ require(leftKeys.map(_.dataType).zip(rightKeys.map(_.dataType)).forall {
+ case (l, r) => BinaryOperator.sameType(l, r)
--- End diff --
Do we have a test case to cover this?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21470
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21470
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21470
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3737/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192314292
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -803,18 +803,60 @@ object TypeCoercion {
e.copy(left = Cast(e.left, TimestampType))
}
- case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
- findTightestCommonType(left.dataType, right.dataType).map { commonType =>
- if (b.inputType.acceptsType(commonType)) {
- // If the expression accepts the tightest common type, cast to that.
- val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
- val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
- b.withNewChildren(Seq(newLeft, newRight))
- } else {
- // Otherwise, don't do anything with the expression.
- b
- }
- }.getOrElse(b) // If there is no applicable conversion, leave expression unchanged.
+ case b @ BinaryOperator(left, right)
+ if !BinaryOperator.sameType(left.dataType, right.dataType) =>
+ (left.dataType, right.dataType) match {
+ case (StructType(fields1), StructType(fields2)) =>
+ val commonTypes = scala.collection.mutable.ArrayBuffer.empty[DataType]
+ val len = fields1.length
+ var i = 0
+ var continue = fields1.length == fields2.length
+ while (i < len && continue) {
+ val commonType = findTightestCommonType(fields1(i).dataType, fields2(i).dataType)
--- End diff --
What about nested structs?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21470
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91355/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192314729
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -803,18 +803,60 @@ object TypeCoercion {
e.copy(left = Cast(e.left, TimestampType))
}
- case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
- findTightestCommonType(left.dataType, right.dataType).map { commonType =>
- if (b.inputType.acceptsType(commonType)) {
- // If the expression accepts the tightest common type, cast to that.
- val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
- val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
- b.withNewChildren(Seq(newLeft, newRight))
- } else {
- // Otherwise, don't do anything with the expression.
- b
- }
- }.getOrElse(b) // If there is no applicable conversion, leave expression unchanged.
+ case b @ BinaryOperator(left, right)
+ if !BinaryOperator.sameType(left.dataType, right.dataType) =>
+ (left.dataType, right.dataType) match {
+ case (StructType(fields1), StructType(fields2)) =>
+ val commonTypes = scala.collection.mutable.ArrayBuffer.empty[DataType]
+ val len = fields1.length
+ var i = 0
+ var continue = fields1.length == fields2.length
+ while (i < len && continue) {
+ val commonType = findTightestCommonType(fields1(i).dataType, fields2(i).dataType)
+ if (commonType.isDefined) {
+ commonTypes += commonType.get
+ } else {
+ continue = false
+ }
+ i += 1
+ }
+
+ if (continue) {
+ val newLeftST = new StructType(fields1.zip(commonTypes).map {
+ case (f, commonType) => f.copy(dataType = commonType)
+ })
+ val newLeft = if (left.dataType == newLeftST) left else Cast(left, newLeftST)
+
+ val newRightST = new StructType(fields2.zip(commonTypes).map {
+ case (f, commonType) => f.copy(dataType = commonType)
+ })
+ val newRight = if (right.dataType == newRightST) right else Cast(right, newRightST)
+
+ if (b.inputType.acceptsType(newLeftST) && b.inputType.acceptsType(newRightST)) {
--- End diff --
Is it possible `b` only accepts one side (e.g., only `newLeftST`) but doesn't accept other side?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21470
**[Test build #91355 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91355/testReport)** for PR 21470 at commit [`3555e0f`](https://github.com/apache/spark/commit/3555e0fd1a8003af0e6a7694aab8999698aab9c4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192274753
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -803,18 +803,60 @@ object TypeCoercion {
e.copy(left = Cast(e.left, TimestampType))
}
- case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
- findTightestCommonType(left.dataType, right.dataType).map { commonType =>
- if (b.inputType.acceptsType(commonType)) {
- // If the expression accepts the tightest common type, cast to that.
- val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
- val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
- b.withNewChildren(Seq(newLeft, newRight))
- } else {
- // Otherwise, don't do anything with the expression.
- b
- }
- }.getOrElse(b) // If there is no applicable conversion, leave expression unchanged.
+ case b @ BinaryOperator(left, right)
+ if !BinaryOperator.sameType(left.dataType, right.dataType) =>
+ (left.dataType, right.dataType) match {
+ case (StructType(fields1), StructType(fields2)) =>
+ val commonTypes = scala.collection.mutable.ArrayBuffer.empty[DataType]
+ val len = fields1.length
+ var i = 0
+ var continue = fields1.length == fields2.length
+ while (i < len && continue) {
+ val commonType = findTightestCommonType(fields1(i).dataType, fields2(i).dataType)
--- End diff --
if we don't want to support type coercion, we can change this line to use `fields1(i).dataType == fields2(i).dataType`
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21470: [SPARK-24443][SQL] comparison should accept structurally...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21470
**[Test build #91355 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91355/testReport)** for PR 21470 at commit [`3555e0f`](https://github.com/apache/spark/commit/3555e0fd1a8003af0e6a7694aab8999698aab9c4).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by eric-maynard <gi...@git.apache.org>.
Github user eric-maynard commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192819128
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -803,18 +803,60 @@ object TypeCoercion {
e.copy(left = Cast(e.left, TimestampType))
}
- case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
- findTightestCommonType(left.dataType, right.dataType).map { commonType =>
- if (b.inputType.acceptsType(commonType)) {
- // If the expression accepts the tightest common type, cast to that.
- val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
- val newRight = if (right.dataType == commonType) right else Cast(right, commonType)
- b.withNewChildren(Seq(newLeft, newRight))
- } else {
- // Otherwise, don't do anything with the expression.
- b
- }
- }.getOrElse(b) // If there is no applicable conversion, leave expression unchanged.
+ case b @ BinaryOperator(left, right)
+ if !BinaryOperator.sameType(left.dataType, right.dataType) =>
+ (left.dataType, right.dataType) match {
+ case (StructType(fields1), StructType(fields2)) =>
+ val commonTypes = scala.collection.mutable.ArrayBuffer.empty[DataType]
+ val len = fields1.length
+ var i = 0
+ var continue = fields1.length == fields2.length
+ while (i < len && continue) {
+ val commonType = findTightestCommonType(fields1(i).dataType, fields2(i).dataType)
+ if (commonType.isDefined) {
+ commonTypes += commonType.get
+ } else {
+ continue = false
+ }
+ i += 1
+ }
+
+ if (continue) {
+ val newLeftST = new StructType(fields1.zip(commonTypes).map {
+ case (f, commonType) => f.copy(dataType = commonType)
+ })
+ val newLeft = if (left.dataType == newLeftST) left else Cast(left, newLeftST)
+
+ val newRightST = new StructType(fields2.zip(commonTypes).map {
+ case (f, commonType) => f.copy(dataType = commonType)
+ })
+ val newRight = if (right.dataType == newRightST) right else Cast(right, newRightST)
+
+ if (b.inputType.acceptsType(newLeftST) && b.inputType.acceptsType(newRightST)) {
+ b.withNewChildren(Seq(newLeft, newRight))
+ } else {
+ // type not acceptable, don't do anything with the expression.
+ b
+ }
+ } else {
+ // left struct type and right struct type have different number of fields, or some
+ // fields don't have a common type, don't do anything with the expression.
+ b
+ }
+
+ case _ =>
+ findTightestCommonType(left.dataType, right.dataType).map { commonType =>
+ if (b.inputType.acceptsType(commonType)) {
+ // If the expression accepts the tightest common type, cast to that.
+ val newLeft = if (left.dataType == commonType) left else Cast(left, commonType)
--- End diff --
This ternary operation seems to crop up a few times in this PR. Maybe we can push it out into a method?
```
private def castIfNeeded(e: Expression, possibleType: DataType): Expression = {
if (e.dataType == possibleType) data else Cast(e, possibleType)
}
```
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192309419
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -2696,16 +2687,6 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
sql("SELECT struct(1 a) EXCEPT (SELECT struct(2 A))")
}.message
assert(m2.contains("Except can only be performed on tables with the compatible column types"))
-
- withTable("t", "S") {
- sql("CREATE TABLE t(c struct<f:int>) USING parquet")
- sql("CREATE TABLE S(C struct<F:int>) USING parquet")
- checkAnswer(sql("SELECT * FROM t, S WHERE t.c.f = S.C.F"), Seq.empty)
- val m = intercept[AnalysisException] {
- sql("SELECT * FROM t, S WHERE c = C")
- }.message
- assert(m.contains("cannot resolve '(t.`c` = S.`C`)' due to data type mismatch"))
--- End diff --
Thank you for pinging me, @cloud-fan .
Since this removal is a real behavior change instead of new test coverage of `comparator.sql`, could you add a documentation for this?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192192426
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -2675,15 +2675,6 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
sql("SELECT struct(1 a) UNION ALL (SELECT struct(2 A))")
sql("SELECT struct(1 a) EXCEPT (SELECT struct(2 A))")
-
- withTable("t", "S") {
- sql("CREATE TABLE t(c struct<f:int>) USING parquet")
- sql("CREATE TABLE S(C struct<F:int>) USING parquet")
- Seq(("c", "C"), ("C", "c"), ("c.f", "C.F"), ("C.F", "c.f")).foreach {
- case (left, right) =>
- checkAnswer(sql(s"SELECT * FROM t, S WHERE t.$left = S.$right"), Seq.empty)
--- End diff --
this test case wants to test set operation, but here it's testing filter. the new tests should've covered it.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21470: [SPARK-24443][SQL] comparison should accept struc...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/21470#discussion_r192274906
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala ---
@@ -62,8 +62,9 @@ trait HashJoin {
}
protected lazy val (buildKeys, streamedKeys) = {
- require(leftKeys.map(_.dataType) == rightKeys.map(_.dataType),
- "Join keys from two sides should have same types")
+ require(leftKeys.map(_.dataType).zip(rightKeys.map(_.dataType)).forall {
+ case (l, r) => BinaryOperator.sameType(l, r)
--- End diff --
I hit a test failure before changing this. This kind of check(assert, require) can only be hitten when there is a bug.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org