Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/07 04:04:04 UTC

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24442: [SPARK-27547][SQL] Fix DataFrame self-join problems

dongjoon-hyun commented on a change in pull request #24442: [SPARK-27547][SQL] Fix DataFrame self-join problems
URL: https://github.com/apache/spark/pull/24442#discussion_r281453665
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala
 ##########
 @@ -144,11 +144,11 @@ class Column(val expr: Expression) extends Logging {
   override def toString: String = toPrettySQL(expr)
 
   override def equals(that: Any): Boolean = that match {
-    case that: Column => that.expr.equals(this.expr)
+    case that: Column => that.expr.semanticEquals(this.expr)
 
 Review comment:
   @cloud-fan , @gatorsmile , @rxin .
   
   Could you split this `Column` change into another JIRA issue?
   
   This looks required but orthogonal to the self-join fix. In addition, although this PR provides a new configuration, `spark.sql.analyzer.resolveDatasetColumnReference`, we cannot undo this behavior change in the `Column` class. The following shows the behavior change, which might affect new optimizers.
   
   ```scala
   scala> spark.version
   res0: String = 2.4.3
   
   scala> rand(0).equals(rand(0))
   res1: Boolean = true
   
   scala> ($"a" + 1 + 2 + 3).equals($"a" + 3 + 2 + 1)
   res2: Boolean = false
   ```
   
   ```scala
   scala> spark.version  // This PR
   res0: String = 3.0.0-SNAPSHOT
   
   scala> rand(0).equals(rand(0))
   res1: Boolean = false
   
   scala> ($"a" + 1 + 2 + 3).equals($"a" + 3 + 2 + 1)
   res2: Boolean = true
   ```
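   
   To make the flip in the two transcripts above concrete, here is a minimal, self-contained sketch of structural versus semantic comparison. It is a toy model, not Spark's implementation: `Expr`, `Attr`, `Lit`, `Rand`, `Add`, and `ToyEquality` are hypothetical names introduced only for illustration, and sorting operands by `toString` is an assumed stand-in for whatever ordering a real canonicalization step uses. The idea it illustrates is that a semantic check refuses to match non-deterministic expressions and otherwise compares canonical forms.
   
   ```scala
   // Toy model only: these types are hypothetical and are NOT Spark's Catalyst classes.
   sealed trait Expr {
     // True if evaluating the expression twice is guaranteed to give the same result.
     def deterministic: Boolean
   }
   case class Attr(name: String) extends Expr { val deterministic = true }
   case class Lit(value: Int) extends Expr { val deterministic = true }
   case class Rand(seed: Long) extends Expr { val deterministic = false }
   case class Add(left: Expr, right: Expr) extends Expr {
     def deterministic: Boolean = left.deterministic && right.deterministic
   }
   
   object ToyEquality {
     // Collect the operands of a chain of nested additions.
     private def operands(e: Expr): List[Expr] = e match {
       case Add(l, r) => operands(l) ++ operands(r)
       case other     => List(other)
     }
   
     // Canonical form: flatten additions, order the operands, rebuild left-deep.
     // Sorting by toString is an assumption made for this sketch.
     def canonicalize(e: Expr): Expr = e match {
       case _: Add =>
         operands(e).sortBy(_.toString).reduceLeft[Expr]((l, r) => Add(l, r))
       case other => other
     }
   
     // Structural comparison: plain case-class equality on the tree.
     def structuralEquals(a: Expr, b: Expr): Boolean = a == b
   
     // Semantic comparison: non-deterministic expressions never match;
     // deterministic ones match when their canonical forms are equal.
     def semanticEquals(a: Expr, b: Expr): Boolean =
       a.deterministic && b.deterministic && canonicalize(a) == canonicalize(b)
   }
   
   object ToyEqualityDemo {
     def main(args: Array[String]): Unit = {
       val x = Add(Add(Add(Attr("a"), Lit(1)), Lit(2)), Lit(3)) // a + 1 + 2 + 3
       val y = Add(Add(Add(Attr("a"), Lit(3)), Lit(2)), Lit(1)) // a + 3 + 2 + 1
   
       println(ToyEquality.structuralEquals(Rand(0L), Rand(0L))) // true
       println(ToyEquality.semanticEquals(Rand(0L), Rand(0L)))   // false
       println(ToyEquality.structuralEquals(x, y))               // false
       println(ToyEquality.semanticEquals(x, y))                 // true
     }
   }
   ```
   
   Under these assumptions the toy model reproduces both flips: the two `Rand(0L)` trees are structurally identical but not semantically equal, while the reordered additions differ structurally but share a canonical form.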
