Posted to reviews@spark.apache.org by rxin <gi...@git.apache.org> on 2016/07/05 20:51:04 UTC

[GitHub] spark pull request #14057: [SPARK-15425][SQL] Disallow cross joins, even if ...

GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/14057

    [SPARK-15425][SQL] Disallow cross joins, even if it is broadcast nested loop - WIP

    ## What changes were proposed in this pull request?
    WIP
    
    ## How was this patch tested?
    WIP


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark SPARK-15425

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14057.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14057
    
----
commit 3affa5b2f04104ba108523ee360aac6cb0aac001
Author: Reynold Xin <rx...@databricks.com>
Date:   2016-07-05T20:50:13Z

    Disallow cross joins

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14057: [SPARK-15425][SQL] Disallow cross joins, even if it is b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14057
  
    **[Test build #61788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61788/consoleFull)** for PR 14057 at commit [`3affa5b`](https://github.com/apache/spark/commit/3affa5b2f04104ba108523ee360aac6cb0aac001).




[GitHub] spark issue #14057: [SPARK-15425][SQL] Disallow cross joins, even if it is b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14057
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #14057: [SPARK-15425][SQL] Disallow cross joins, even if it is b...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/14057
  
    This breaks null-aware anti joins in most cases.
    
    For example:
    `select * from a where a.id not in (select id from b)`
    
    Gets planned as:
    ```
    == Parsed Logical Plan ==
    'Project [*]
    +- 'Filter NOT 'a.id IN (list#16)
       :  +- 'SubqueryAlias list#16
       :     +- 'Project ['id]
       :        +- 'UnresolvedRelation `b`
       +- 'UnresolvedRelation `a`
    
    == Analyzed Logical Plan ==
    id: bigint
    Project [id#0L]
    +- Filter NOT predicate-subquery#16 [(id#0L = id#6L)]
       :  +- SubqueryAlias predicate-subquery#16 [(id#0L = id#6L)]
       :     +- Project [id#6L]
       :        +- SubqueryAlias b
       :           +- Range (5, 10, splits=8)
       +- SubqueryAlias a
          +- Range (0, 10, splits=8)
    
    == Optimized Logical Plan ==
    Join LeftAnti, (isnull((id#0L = id#6L)) || (id#0L = id#6L))
    :- Range (0, 10, splits=8)
    +- Range (5, 10, splits=8)
    
    == Physical Plan ==
    BroadcastNestedLoopJoin BuildRight, LeftAnti, (isnull((id#0L = id#6L)) || (id#0L = id#6L)), true
    :- *Range (0, 10, splits=8)
    +- BroadcastExchange IdentityBroadcastMode
       +- *Range (5, 10, splits=8)
    ```
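
To make the plan above concrete, here is a minimal Python sketch of the null-aware anti-join semantics it encodes. This is not Spark code: the names `sql_eq` and `null_aware_left_anti` are hypothetical, and `None` stands in for SQL NULL. The join condition `isnull(a.id = b.id) || (a.id = b.id)` is not a plain equality, so there are no equi-join keys to hash or sort on, which is presumably why the planner falls back to a nested loop join here.

```python
from typing import Iterable, List, Optional


def sql_eq(x: Optional[int], y: Optional[int]) -> Optional[bool]:
    """SQL three-valued equality; None models NULL (unknown)."""
    if x is None or y is None:
        return None
    return x == y


def null_aware_left_anti(
    left: Iterable[Optional[int]], right: Iterable[Optional[int]]
) -> List[Optional[int]]:
    """Keep a left row only if no right row satisfies
    isnull(a = b) || (a = b), mirroring the LeftAnti condition in the plan."""
    right_rows = list(right)
    kept = []
    for a in left:
        # A right row "matches" when the comparison is true OR unknown.
        matched = any(
            sql_eq(a, b) is None or sql_eq(a, b) is True for b in right_rows
        )
        if not matched:
            kept.append(a)
    return kept


# Same ranges as in the plan: a = Range(0, 10), b = Range(5, 10).
print(null_aware_left_anti(range(0, 10), range(5, 10)))  # [0, 1, 2, 3, 4]

# A single NULL on the right side makes NOT IN reject every left row.
print(null_aware_left_anti(range(0, 10), [5, None]))  # []
```

The sketch is a deliberately naive nested loop, comparing every left row against every right row; it only illustrates the semantics and says nothing about how Spark actually executes the join.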




[GitHub] spark issue #14057: [SPARK-15425][SQL] Disallow cross joins, even if it is b...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/14057
  
    cc @hvanhovell @davies: is this safe to do, or are we casting too wide a net?





[GitHub] spark pull request #14057: [SPARK-15425][SQL] Disallow cross joins, even if ...

Posted by rxin <gi...@git.apache.org>.
Github user rxin closed the pull request at:

    https://github.com/apache/spark/pull/14057




[GitHub] spark issue #14057: [SPARK-15425][SQL] Disallow cross joins, even if it is b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14057
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61798/
    Test FAILed.




[GitHub] spark issue #14057: [SPARK-15425][SQL] Disallow cross joins, even if it is b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14057
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61788/
    Test FAILed.




[GitHub] spark issue #14057: [SPARK-15425][SQL] Disallow cross joins, even if it is b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14057
  
    **[Test build #61798 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61798/consoleFull)** for PR 14057 at commit [`9a64a2f`](https://github.com/apache/spark/commit/9a64a2f5ebd8cac2c6e5c3bf359b13131a501ffc).




[GitHub] spark issue #14057: [SPARK-15425][SQL] Disallow cross joins, even if it is b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14057
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #14057: [SPARK-15425][SQL] Disallow cross joins, even if it is b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14057
  
    **[Test build #61788 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61788/consoleFull)** for PR 14057 at commit [`3affa5b`](https://github.com/apache/spark/commit/3affa5b2f04104ba108523ee360aac6cb0aac001).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14057: [SPARK-15425][SQL] Disallow cross joins, even if it is b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14057
  
    **[Test build #61798 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61798/consoleFull)** for PR 14057 at commit [`9a64a2f`](https://github.com/apache/spark/commit/9a64a2f5ebd8cac2c6e5c3bf359b13131a501ffc).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.

