Posted to issues@spark.apache.org by "Matt Cheah (JIRA)" <ji...@apache.org> on 2015/05/21 02:54:59 UTC

[jira] [Comment Edited] (SPARK-7611) Support HashJoin if the join condition uses eqNullSafe/<=>

    [ https://issues.apache.org/jira/browse/SPARK-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553448#comment-14553448 ] 

Matt Cheah edited comment on SPARK-7611 at 5/21/15 12:54 AM:
-------------------------------------------------------------

For context, was there a specific decision earlier in the project's lifecycle to avoid supporting the eqNullSafe optimization for Hash Join? Is there any particular reason not to implement it? I'll start working on this, but I want to make sure it is the right thing to do in the grand scheme of things.


was (Author: mcheah):
For context, was there a specific decision to avoid supporting eqNullSafe optimization on Hash Join?

> Support HashJoin if the join condition uses eqNullSafe/<=>
> ----------------------------------------------------------
>
>                 Key: SPARK-7611
>                 URL: https://issues.apache.org/jira/browse/SPARK-7611
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.3.1
>            Reporter: David Tolnay
>
> Currently ExtractEquiJoinKeys only looks for EqualTo, not EqualNullSafe. So if your join condition uses eqNullSafe/<=> instead of equalTo/===, you end up with the CartesianProduct strategy instead of HashJoin.
> This requires many changes under org.apache.spark.sql.execution.joins, where code assumes rows can only join "if (!key.anyNull)".
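
To illustrate the difference described in the issue above, here is a minimal sketch in plain Python of a build/probe hash join. It is a toy model, not Spark code: the function hash_join and its null_safe flag are hypothetical names chosen for this example. With null_safe=False, rows with a null key are skipped, which mirrors the "if (!key.anyNull)" guard quoted above. With null_safe=True, a null key matches another null key, as <=> would.

```python
from collections import defaultdict


def hash_join(left, right, null_safe=False):
    """Join two lists of (key, value) pairs using a build/probe hash join.

    null_safe=False models plain equality: a None key never matches.
    null_safe=True models null-safe equality: None matches None.
    This is an illustrative sketch, not Spark's implementation.
    """
    # Build phase: index the right side by join key.
    table = defaultdict(list)
    for key, value in right:
        if key is None and not null_safe:
            continue  # analogous to skipping rows where the key has a null
        table[key].append(value)

    # Probe phase: look up each left row in the hash table.
    joined = []
    for key, value in left:
        if key is None and not null_safe:
            continue
        for match in table.get(key, []):
            joined.append((key, value, match))
    return joined


left = [(1, "a"), (None, "b")]
right = [(1, "x"), (None, "y"), (2, "z")]

plain = hash_join(left, right)
safe = hash_join(left, right, null_safe=True)
```

In this sketch the only change needed for null-safe matching is to stop filtering out null keys in both phases, since None is an ordinary hashable key in a Python dict. Whether the equivalent change in org.apache.spark.sql.execution.joins is as small is not something this example shows.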


