Posted to issues@spark.apache.org by "Saurabh Santhosh (JIRA)" <ji...@apache.org> on 2016/04/27 08:50:12 UTC

[jira] [Commented] (SPARK-11072) simplify self join handling

    [ https://issues.apache.org/jira/browse/SPARK-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259653#comment-15259653 ] 

Saurabh Santhosh commented on SPARK-11072:
------------------------------------------

Will this resolve https://issues.apache.org/jira/browse/SPARK-14948?
Could you please add a test case covering this scenario for future releases?

> simplify self join handling
> ---------------------------
>
>                 Key: SPARK-11072
>                 URL: https://issues.apache.org/jira/browse/SPARK-11072
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Wenchen Fan
>
> A self-join is a diamond problem that confuses our analyzer. Our current solution is to create new instances of the leaf nodes in the right subtree of the join node and update all attribute references there. The plan then no longer contains a diamond, which fixes the problem.
> However, our execution engine can handle a diamond plan; we only need to distinguish the output of the left side from the output of the right side. So we can simplify self-join handling by introducing a new plan node, `NewOutput`, that gives the plan different output attributes.
> The extra `NewOutput` layer is quite cheap and can be completely removed when we have local nodes.
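
The idea in the description above can be illustrated with a toy model. This is only a sketch in plain Python, not Spark's Catalyst code: the classes `Attribute`, `Relation`, and `Join` and the helper `has_ambiguous_output` are hypothetical, and `NewOutput` borrows the name proposed in the issue while its behavior here is an assumption based on the description (same child plan, fresh output attribute ids).

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List

# Hypothetical source of unique attribute ids for this toy model.
_ids = count(1)


@dataclass(frozen=True)
class Attribute:
    name: str
    expr_id: int


def fresh(name: str) -> Attribute:
    """Create an attribute with a new, unique id."""
    return Attribute(name, next(_ids))


@dataclass
class Relation:
    """Leaf node: a table with a fixed list of output attributes."""
    name: str
    output: List[Attribute]


@dataclass
class NewOutput:
    """Assumed behavior: reuse the child plan, expose fresh attribute ids."""
    child: object
    output: List[Attribute] = field(init=False)

    def __post_init__(self) -> None:
        self.output = [fresh(a.name) for a in self.child.output]


@dataclass
class Join:
    left: object
    right: object

    @property
    def output(self) -> List[Attribute]:
        return self.left.output + self.right.output


def has_ambiguous_output(plan) -> bool:
    """True if two output attributes share the same id."""
    ids = [a.expr_id for a in plan.output]
    return len(ids) != len(set(ids))


t = Relation("t", [fresh("id"), fresh("value")])

# Naive self-join: both sides expose the very same attributes (the diamond).
naive = Join(t, t)

# Wrapped self-join: the right side shares the child but has new ids.
wrapped = Join(t, NewOutput(t))
```

In this model the naive join has duplicate attribute ids in its output, while the wrapped join does not, even though both sides still point at the same `Relation` object. The leaf node is never copied; only the thin wrapper allocates new ids, which is the sense in which the extra layer is cheap.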


