
Posted to reviews@spark.apache.org by mgaido91 <gi...@git.apache.org> on 2018/06/01 11:36:38 UTC

[GitHub] spark issue #21449: [SPARK-24385][SQL] Resolve self-join condition ambiguity...

Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21449
  
    Thanks for your comment @cloud-fan. I understand your point. That is quite a tricky problem, since we would probably also need to know the "DAG" of the dataframes in order to make the right decision.
    
    But although this change is related to that problem, I think it is different and has a much smaller scope. Indeed, while we could use the metadata information in many places, in this patch it is used only in the self-join case, when there is ambiguity about which column to take. The behavior in any other case is unchanged.
    
    So after this patch, the behavior when resolving a column using `col` is unchanged. The only place where the dataset of provenance is checked is in self-joins. The goal here is only to support cases that were throwing exceptions when trying to resolve the right column.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org