
Posted to reviews@spark.apache.org by davies <gi...@git.apache.org> on 2016/05/01 07:59:47 UTC

[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

GitHub user davies opened a pull request:

    https://github.com/apache/spark/pull/12820

    [SPARK-14781] [SQL] support nested predicate subquery

    ## What changes were proposed in this pull request?
    
    To support nested predicate subqueries, this PR introduces an internal join type, LeftSemiPlus, which emits all rows from the left side plus an additional column that indicates whether any row from the right side matched (it is not null-aware right now). This additional column can be used to replace the subquery in a Filter.
    
    In theory, all predicate subqueries could use this join type, but it is slower than LeftSemi and LeftAnti, so it is only used for nested subqueries (subqueries inside OR).
    
    For example, the following SQL:
    ```sql
    SELECT a FROM t  WHERE EXISTS (select 0) OR EXISTS (select 1)
    ``` 
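    
    The idea described above can be sketched on plain Scala collections. This is only an illustration, not the PR's implementation: `existenceJoin` and `filterWithOr` are hypothetical names, rows are plain `Int`s rather than Spark rows, and the subqueries are assumed to be correlated on simple equality.
    
    ```scala
    // Hypothetical, simplified model of an existence join (not Spark code).
    object ExistenceJoinSketch {
      // Emit every left row together with a flag that tells whether
      // at least one right row satisfies the condition.
      def existenceJoin[L, R](left: Seq[L], right: Seq[R])(cond: (L, R) => Boolean): Seq[(L, Boolean)] =
        left.map(l => (l, right.exists(r => cond(l, r))))
    
      // Models: SELECT a FROM t WHERE EXISTS (sub1 matching a) OR EXISTS (sub2 matching a)
      def filterWithOr(t: Seq[Int], sub1: Seq[Int], sub2: Seq[Int]): Seq[Int] = {
        // First join adds the flag for the first subquery: (a, exists1)
        val withFirst = existenceJoin(t, sub1)(_ == _)
        // Second join adds the flag for the second subquery: ((a, exists1), exists2)
        val withBoth = existenceJoin(withFirst, sub2)((l, r) => l._1 == r)
        // The Filter now only refers to the two flag columns.
        withBoth.collect { case ((a, exists1), exists2) if exists1 || exists2 => a }
      }
    }
    ```
    
    Under these assumptions, `ExistenceJoinSketch.filterWithOr(Seq(1, 2, 3, 4), Seq(1), Seq(3))` keeps the rows 1 and 3, because each of them matches one of the two subqueries.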
    
    This PR also fixes a bug where predicate subqueries were pushed down through a join (they should not be).
    
    Nested null-aware subqueries are still not supported, for example `a > 3 OR b NOT IN (select bb from t)`.
    
    TODO: add more tests for subquery
    
    ## How was this patch tested?
    
    Added unit tests.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davies/spark or_exists

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12820.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12820
    
----
commit 0c9f26c943e70894d9ad18b8dac2792b5d6fd92b
Author: Davies Liu <da...@databricks.com>
Date:   2016-05-01T07:48:50Z

    support nested predicate subquery

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216318074
  
    **[Test build #57541 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57541/consoleFull)** for PR 12820 at commit [`d7a3c8f`](https://github.com/apache/spark/commit/d7a3c8fd644faf0a5093fc25db6df06ae6401c37).



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216347529
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/12820



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216347200
  
    **[Test build #57544 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57544/consoleFull)** for PR 12820 at commit [`1bb9f60`](https://github.com/apache/spark/commit/1bb9f6072c5a97190ec75e09b6647c8bda903285).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61771928
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastNestedLoopJoinExec.scala ---
    @@ -197,14 +198,42 @@ case class BroadcastNestedLoopJoinExec(
         }
       }
     
    +  private def leftSemiPlusJoin(relation: Broadcast[Array[InternalRow]]): RDD[InternalRow] = {
    +    assert(buildSide == BuildRight)
    +    streamed.execute().mapPartitionsInternal { streamedIter =>
    +      val buildRows = relation.value
    +      val joinedRow = new JoinedRow
    +      val exists = joinType.asInstanceOf[ExistenceJoin].exists
    +
    +      if (condition.isDefined) {
    +        val resultRow = new GenericMutableRow(Array[Any](null))
    +        if (exists.nullable) {
    +          sys.error("Null-aware join is not supported")
    +        } else {
    +          streamedIter.map { row =>
    +            val result = buildRows.exists(r => boundCondition(joinedRow(row, r)))
    +            resultRow.setBoolean(0, result)
    +            joinedRow(row, resultRow)
    +          }
    +        }
    +      } else {
    +        val resultRow = new GenericMutableRow(Array[Any](buildRows.nonEmpty))
    +        streamedIter.map { row =>
    +          joinedRow(row, resultRow)
    +        }
    +      }
    +    }
    +  }
    +
       /**
        * The implementation for these joins:
        *
        *   LeftOuter with BuildLeft
        *   RightOuter with BuildRight
        *   FullOuter
        *   LeftSemi with BuildLeft
    -   *   Anti with BuildLeft
    +   *   LeftAnti with BuildLeft
    +   *   LeftSemiPlus with BuildLeft
    --- End diff --
    
    Nit: Existence



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216323628
  
    **[Test build #57544 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57544/consoleFull)** for PR 12820 at commit [`1bb9f60`](https://github.com/apache/spark/commit/1bb9f6072c5a97190ec75e09b6647c8bda903285).



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216341487
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216121348
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57508/



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216151171
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216024637
  
    Cool!



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61688335
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1530,6 +1546,17 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
               // Note that will almost certainly be planned as a Broadcast Nested Loop join. Use EXISTS
               // if performance matters to you.
               Join(p, sub, LeftAnti, Option(Or(anyNull, condition)))
    +        case (p, predicate) =>
    +          var joined = p
    +          val hasNot = predicate.find(_.isInstanceOf[Not]).isDefined
    --- End diff --
    
    Where is `hasNot` used?



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61706110
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala ---
    @@ -105,6 +105,13 @@ object PredicateSubquery {
           case _ => false
         }.isDefined
       }
    +  def hasNullAwarePredicate(e: Expression): Boolean = {
    +    e.find(_.isInstanceOf[Not]).isDefined &&
    --- End diff --
    
    This could produce false positives (the join might not be null-aware, but we treat it as null-aware); that's OK for now.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61705603
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1200,9 +1209,16 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
                 reduceLeftOption(And).map(Filter(_, left)).getOrElse(left)
               val newRight = rightFilterConditions.
                 reduceLeftOption(And).map(Filter(_, right)).getOrElse(right)
    -          val newJoinCond = (commonFilterCondition ++ joinCondition).reduceLeftOption(And)
    +          val (newJoinConditions, others) =
    +            commonFilterCondition.partition(e => !PredicateSubquery.hasPredicateSubquery(e))
    +          val newJoinCond = (newJoinConditions ++ joinCondition).reduceLeftOption(And)
     
    -          Join(newLeft, newRight, Inner, newJoinCond)
    +          val join = Join(newLeft, newRight, Inner, newJoinCond)
    +          if (others.nonEmpty) {
    --- End diff --
    
    It's only repeated twice; I think that's OK.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61771927
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ---
    @@ -407,4 +408,67 @@ case class BroadcastHashJoinExec(
            """.stripMargin
         }
       }
    +
    +  /**
    +   * Generates the code for left semi join.
    --- End diff --
    
    NIT: existence join



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216342894
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57543/



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61688622
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -99,16 +99,17 @@ abstract class Optimizer(sessionCatalog: SessionCatalog, conf: CatalystConf)
           EliminateSorts,
           SimplifyCasts,
           SimplifyCaseConversionExpressions,
    -      EliminateSerialization,
    -      RewritePredicateSubquery) ::
    +      EliminateSerialization) ::
         Batch("Decimal Optimizations", fixedPoint,
           DecimalAggregates) ::
         Batch("Typed Filter Optimization", fixedPoint,
           EmbedSerializerInFilter) ::
         Batch("LocalRelation", fixedPoint,
           ConvertToLocalRelation) ::
         Batch("OptimizeCodegen", Once,
    -      OptimizeCodegen(conf)) :: Nil
    +      OptimizeCodegen(conf)) ::
    +    Batch("RewriteSubquery", Once,
    +      RewritePredicateSubquery) :: Nil
    --- End diff --
    
    We sometimes add a top-level project to make sure all attributes are unique, which is a tiny bit of overhead. Shouldn't we add `ColumnPruning`/`CollapseProject` to this batch?



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61689199
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1530,6 +1546,17 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
               // Note that will almost certainly be planned as a Broadcast Nested Loop join. Use EXISTS
               // if performance matters to you.
               Join(p, sub, LeftAnti, Option(Or(anyNull, condition)))
    +        case (p, predicate) =>
    --- End diff --
    
    This won't plan these joins outside of filters, right? So this does not work yet:
    ```SQL
    select a.*, a.value in (select value from b) as in_b from a
    ```



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61688776
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1077,7 +1078,14 @@ object ReorderJoin extends Rule[LogicalPlan] with PredicateHelper {
       def createOrderedJoin(input: Seq[LogicalPlan], conditions: Seq[Expression]): LogicalPlan = {
         assert(input.size >= 2)
         if (input.size == 2) {
    -      Join(input(0), input(1), Inner, conditions.reduceLeftOption(And))
    +      val (joinConditions, others) = conditions.partition(
    --- End diff --
    
    NIT: It might be easier to flip the names and call `PredicateSubquery.hasPredicateSubquery` directly


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61689405
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -115,8 +115,8 @@ trait CheckAnalysis extends PredicateHelper {
               case f @ Filter(condition, child) =>
                 splitConjunctivePredicates(condition).foreach {
                   case _: PredicateSubquery | Not(_: PredicateSubquery) =>
    -              case e if PredicateSubquery.hasPredicateSubquery(e) =>
    -                failAnalysis(s"Predicate sub-queries cannot be used in nested conditions: $e")
    +              case e if PredicateSubquery.hasNullAwarePredicate(e) =>
    +                failAnalysis(s"Null-aware sub-queries cannot be used in nested conditions: $e")
    --- End diff --
    
    What is the problem here? Can't we use the same condition we currently use for NAAJ (null-aware anti join)?
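    
    For context, the optimizer diff quoted elsewhere in this thread builds the null-aware anti join as `Join(p, sub, LeftAnti, Option(Or(anyNull, condition)))`. A minimal sketch of what that condition does for `b NOT IN (select bb ...)` follows. It is an illustration on plain collections, not Spark code: `leftAnti` and `notIn` are hypothetical names, and `None` stands in for SQL NULL.
    
    ```scala
    // Hypothetical model of a null-aware anti join for NOT IN (not Spark code).
    object NullAwareAntiJoinSketch {
      // Anti join: keep the left rows for which no right row satisfies the condition.
      def leftAnti[L, R](left: Seq[L], right: Seq[R])(cond: (L, R) => Boolean): Seq[L] =
        left.filterNot(l => right.exists(r => cond(l, r)))
    
      // Models: WHERE b NOT IN (select bb from ...), with None as SQL NULL.
      def notIn(left: Seq[Option[Int]], right: Seq[Option[Int]]): Seq[Option[Int]] =
        leftAnti(left, right) { (b, bb) =>
          val anyNull = b.isEmpty || bb.isEmpty   // the comparison would be NULL
          val condition = b.isDefined && b == bb  // the comparison is true
          anyNull || condition                    // either case rejects the left row
        }
    }
    ```
    
    With this condition, a left row survives only when every comparison against the right side is definitely false, which matches how a WHERE clause treats an unknown `NOT IN` result.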


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216150675
  
    **[Test build #57518 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57518/consoleFull)** for PR 12820 at commit [`a34b172`](https://github.com/apache/spark/commit/a34b1720cd9b7161b0e564bfa981e22789657e7a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216122727
  
    **[Test build #57518 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57518/consoleFull)** for PR 12820 at commit [`a34b172`](https://github.com/apache/spark/commit/a34b1720cd9b7161b0e564bfa981e22789657e7a).



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61688481
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala ---
    @@ -105,6 +105,13 @@ object PredicateSubquery {
           case _ => false
         }.isDefined
       }
    +  def hasNullAwarePredicate(e: Expression): Boolean = {
    +    e.find(_.isInstanceOf[Not]).isDefined &&
    --- End diff --
    
    You are looking for a NOT expression with a nested NULL-aware predicate here, right? This will also match other expressions, e.g. an OR with a NOT on the left-hand side and a NULL-aware predicate on the right-hand side.
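    
    The difference can be illustrated with a toy expression tree. This is a sketch under assumptions, not the PR's code: the `Expr` types, `containsNode`, `looseCheck` and `strictCheck` are all hypothetical, and `strictCheck` is only one possible reading of the stricter test described in this comment.
    
    ```scala
    // Hypothetical toy expression tree (not Catalyst's Expression).
    object NullAwareCheckSketch {
      sealed trait Expr { def children: Seq[Expr] }
      case class Attr(name: String) extends Expr { def children: Seq[Expr] = Nil }
      case class Not(child: Expr) extends Expr { def children: Seq[Expr] = Seq(child) }
      case class Or(left: Expr, right: Expr) extends Expr { def children: Seq[Expr] = Seq(left, right) }
      case class InSubquery(value: Expr) extends Expr { def children: Seq[Expr] = Seq(value) }
    
      // True if the node itself or any descendant satisfies p.
      def containsNode(e: Expr)(p: Expr => Boolean): Boolean =
        p(e) || e.children.exists(c => containsNode(c)(p))
    
      // Loose check: a Not anywhere AND an IN-subquery anywhere in the tree.
      def looseCheck(e: Expr): Boolean =
        containsNode(e)(_.isInstanceOf[Not]) && containsNode(e)(_.isInstanceOf[InSubquery])
    
      // Stricter check: an IN-subquery that sits somewhere below a Not.
      def strictCheck(e: Expr): Boolean = containsNode(e) {
        case Not(child) => containsNode(child)(_.isInstanceOf[InSubquery])
        case _ => false
      }
    }
    ```
    
    For `Or(Not(Attr("x")), InSubquery(Attr("b")))` the loose check returns true while the strict check returns false, which is the kind of false positive discussed here.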



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216027929
  
    **[Test build #57474 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57474/consoleFull)** for PR 12820 at commit [`0c9f26c`](https://github.com/apache/spark/commit/0c9f26c943e70894d9ad18b8dac2792b5d6fd92b).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class LeftSemiPlus(exists: Attribute) extends JoinType `



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216078850
  
    It'd be better to name it ExistenceJoin. It took me a while to grasp what LeftSemiPlus means.




[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61688786
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1200,9 +1209,16 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
                 reduceLeftOption(And).map(Filter(_, left)).getOrElse(left)
               val newRight = rightFilterConditions.
                 reduceLeftOption(And).map(Filter(_, right)).getOrElse(right)
    -          val newJoinCond = (commonFilterCondition ++ joinCondition).reduceLeftOption(And)
    +          val (newJoinConditions, others) =
    +            commonFilterCondition.partition(e => !PredicateSubquery.hasPredicateSubquery(e))
    +          val newJoinCond = (newJoinConditions ++ joinCondition).reduceLeftOption(And)
     
    -          Join(newLeft, newRight, Inner, newJoinCond)
    +          val join = Join(newLeft, newRight, Inner, newJoinCond)
    +          if (others.nonEmpty) {
    --- End diff --
    
    2nd time you need this. Almost warrants an inner method.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216347531
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57544/
    Test PASSed.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216027952
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57474/
    Test FAILed.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61790324
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1200,9 +1209,16 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
                 reduceLeftOption(And).map(Filter(_, left)).getOrElse(left)
               val newRight = rightFilterConditions.
                 reduceLeftOption(And).map(Filter(_, right)).getOrElse(right)
    -          val newJoinCond = (commonFilterCondition ++ joinCondition).reduceLeftOption(And)
    +          val (newJoinConditions, others) =
    +            commonFilterCondition.partition(e => !PredicateSubquery.hasPredicateSubquery(e))
    +          val newJoinCond = (newJoinConditions ++ joinCondition).reduceLeftOption(And)
     
    -          Join(newLeft, newRight, Inner, newJoinCond)
    +          val join = Join(newLeft, newRight, Inner, newJoinCond)
    +          if (others.nonEmpty) {
    --- End diff --
    
    nvm - two different rules.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216121254
  
    **[Test build #57508 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57508/consoleFull)** for PR 12820 at commit [`0439346`](https://github.com/apache/spark/commit/04393464f7b315b7effbb5b067609c2313e6d228).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class ExistenceJoin(exists: Attribute) extends JoinType `



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216319527
  
    **[Test build #57543 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57543/consoleFull)** for PR 12820 at commit [`023ba92`](https://github.com/apache/spark/commit/023ba92697f8a6e3adbe9088f411cf04585a3474).



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61689163
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/joinTypes.scala ---
    @@ -69,6 +70,12 @@ case object LeftAnti extends JoinType {
       override def sql: String = "LEFT ANTI"
     }
     
    +case class LeftSemiPlus(exists: Attribute) extends JoinType {
    --- End diff --
    
    We could just add this join type...



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216323690
  
    @hvanhovell Does this look good to you? Can we merge this one first (before yours)?



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216341225
  
    **[Test build #57541 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57541/consoleFull)** for PR 12820 at commit [`d7a3c8f`](https://github.com/apache/spark/commit/d7a3c8fd644faf0a5093fc25db6df06ae6401c37).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61702404
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1530,6 +1546,17 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
               // Note that will almost certainly be planned as a Broadcast Nested Loop join. Use EXISTS
               // if performance matters to you.
               Join(p, sub, LeftAnti, Option(Or(anyNull, condition)))
    +        case (p, predicate) =>
    --- End diff --
    
    Yes. If that's needed, we could support it in a follow-up PR.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216324170
  
    LGTM



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216345269
  
    Merging this into master and 2.0, thanks!



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216112602
  
    @rxin That's true; right now only BroadcastHashJoin is codegened. Another thing is that ExistenceJoin will be null-aware, but LeftSemi/LeftAnti could still be simple.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216342641
  
    **[Test build #57543 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57543/consoleFull)** for PR 12820 at commit [`023ba92`](https://github.com/apache/spark/commit/023ba92697f8a6e3adbe9088f411cf04585a3474).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61688397
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala ---
    @@ -105,6 +105,13 @@ object PredicateSubquery {
           case _ => false
         }.isDefined
       }
    +  def hasNullAwarePredicate(e: Expression): Boolean = {
    +    e.find(_.isInstanceOf[Not]).isDefined &&
    +      e.find {
    +        case p: PredicateSubquery if p.nullable => true
    --- End diff --
    
    `p.nullable` is always false. Do you mean `nullAware`?



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216024398
  
    cc @hvanhovell 



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216342891
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61770335
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1530,6 +1546,17 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
               // Note that will almost certainly be planned as a Broadcast Nested Loop join. Use EXISTS
               // if performance matters to you.
               Join(p, sub, LeftAnti, Option(Or(anyNull, condition)))
    +        case (p, predicate) =>
    --- End diff --
    
    Follow-up PR works.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216121347
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216341489
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57541/
    Test PASSed.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216027951
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216324244
  
    Let's merge this one first.



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216112318
  
    **[Test build #57508 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57508/consoleFull)** for PR 12820 at commit [`0439346`](https://github.com/apache/spark/commit/04393464f7b315b7effbb5b067609c2313e6d228).



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216024454
  
    **[Test build #57474 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57474/consoleFull)** for PR 12820 at commit [`0c9f26c`](https://github.com/apache/spark/commit/0c9f26c943e70894d9ad18b8dac2792b5d6fd92b).



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216080357
  
    BTW one interesting thing: if we can implement whole-stage codegen for ExistenceJoin, then technically we can implement LeftSemi / Anti by just adding a filter on top without losing performance, right?
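
A minimal sketch of the idea in this comment, using plain Scala collections rather than Spark's planner: an existence join keeps every left row and attaches a boolean flag, and LeftSemi / LeftAnti fall out as filters on that flag. The object and method names (`ExistenceJoinSketch`, `existenceJoin`, `leftSemi`, `leftAnti`) are hypothetical, and the sketch ignores null-aware semantics and performance entirely.

```scala
// Illustration only: none of these names are Spark APIs.
object ExistenceJoinSketch {
  // Emit every left row exactly once, paired with a flag that says
  // whether at least one right row satisfies the join condition.
  def existenceJoin[L, R](left: Seq[L], right: Seq[R])(cond: (L, R) => Boolean): Seq[(L, Boolean)] =
    left.map(l => (l, right.exists(r => cond(l, r))))

  // LeftSemi: keep the left rows whose flag is true, then drop the flag.
  def leftSemi[L, R](left: Seq[L], right: Seq[R])(cond: (L, R) => Boolean): Seq[L] =
    existenceJoin(left, right)(cond).collect { case (l, true) => l }

  // LeftAnti: keep the left rows whose flag is false, then drop the flag.
  def leftAnti[L, R](left: Seq[L], right: Seq[R])(cond: (L, R) => Boolean): Seq[L] =
    existenceJoin(left, right)(cond).collect { case (l, false) => l }
}
```

Because the flag is an ordinary column, it can also sit inside a larger predicate: a condition such as `x > 2 OR EXISTS (...)` becomes a filter of the form `row > 2 || flag` over the output of `existenceJoin`.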



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12820#discussion_r61705446
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1077,7 +1078,14 @@ object ReorderJoin extends Rule[LogicalPlan] with PredicateHelper {
       def createOrderedJoin(input: Seq[LogicalPlan], conditions: Seq[Expression]): LogicalPlan = {
         assert(input.size >= 2)
         if (input.size == 2) {
    -      Join(input(0), input(1), Inner, conditions.reduceLeftOption(And))
    +      val (joinConditions, others) = conditions.partition(
    --- End diff --
    
    It's weird to see `others` come before `joinConditions`, so I wrote it this way.
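
The ordering point can be sketched with plain collections. `partition` returns the elements that satisfy the predicate first and the rest second, so negating the subquery check is what puts the join-safe conditions in the first slot. `Cond` and its `hasSubquery` field are hypothetical stand-ins for a Catalyst expression and `PredicateSubquery.hasPredicateSubquery`; this is not the code from the PR.

```scala
// Illustration only: Cond is a made-up stand-in for a Catalyst Expression.
object PartitionOrderSketch {
  final case class Cond(text: String, hasSubquery: Boolean)

  // Returns (joinConditions, others).
  def split(conditions: Seq[Cond]): (Seq[Cond], Seq[Cond]) = {
    // partition yields (elements where p holds, elements where p fails),
    // so the negated check keeps subquery-free conditions in the first slot.
    val (joinConditions, others) = conditions.partition(c => !c.hasSubquery)
    (joinConditions, others)
  }
}
```

Without the negation the same call would bind the subquery-bearing conditions first, which is the `(others, joinConditions)` order the comment describes as reading oddly.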



[GitHub] spark pull request: [SPARK-14781] [SQL] support nested predicate s...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12820#issuecomment-216151188
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57518/
    Test PASSed.

