Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2018/01/31 00:54:24 UTC

[GitHub] spark pull request #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter w...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/20444

    [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter when the right's Filter contains the references that are not in the left output

    ## What changes were proposed in this pull request?
    This PR fixes the `ReplaceExceptWithFilter` rule for the case where the right side's Filter references attributes that are not in the left side's output.
    
    Before this PR, the rule failed with an error like:
    ```
    java.util.NoSuchElementException: key not found: a
      at scala.collection.MapLike$class.default(MapLike.scala:228)
      at scala.collection.AbstractMap.default(Map.scala:59)
      at scala.collection.MapLike$class.apply(MapLike.scala:141)
      at scala.collection.AbstractMap.apply(Map.scala:59)
    ```
    
    After this PR, `ReplaceExceptWithFilter` has no effect in this case and the `Except` is left unchanged.
    
    ## How was this patch tested?
    Added tests
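
The `key not found: a` failure in the description above is what `Map.apply` does when a name is missing. The following is a minimal sketch of that failure mode and of the kind of guard the patch adds. It uses plain Scala collections, not Catalyst classes; the object name, the method names, and the `b#1`-style strings are made up for illustration.

```scala
// Toy stand-in for the rule's name-to-attribute lookup; this is not Spark code.
object AttributeLookupSketch {
  // Hypothetical left-side output: column name -> resolved attribute label.
  val attributeNameMap: Map[String, String] = Map("b" -> "b#1", "c" -> "c#2")

  // Unguarded lookup, in the spirit of the pre-fix code:
  // Map.apply throws NoSuchElementException for a name that is not a key.
  def unguarded(names: Seq[String]): Seq[String] =
    names.map(attributeNameMap)

  // Guarded lookup, in the spirit of the fix:
  // rewrite only when every referenced name is present, otherwise give up.
  def guarded(names: Seq[String]): Option[Seq[String]] =
    if (names.forall(attributeNameMap.contains)) Some(names.map(attributeNameMap))
    else None
}
```

Under these assumptions, `guarded(Seq("a", "b"))` yields `None` because `a` is not in the left output, while `unguarded(Seq("a"))` throws.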

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark fixReplaceExceptWithFilter

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20444.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20444
    
----
commit 41bd3e00d1c31a36cd4565d41dc8e28165e5afe6
Author: gatorsmile <ga...@...>
Date:   2018-01-31T00:50:30Z

    fix.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter when the...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20444
  
    Thanks! Merged to master/2.3


---



[GitHub] spark pull request #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter w...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20444#discussion_r164948893
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceExceptWithFilter.scala ---
    @@ -46,18 +46,27 @@ object ReplaceExceptWithFilter extends Rule[LogicalPlan] {
         }
     
         plan.transform {
    -      case Except(left, right) if isEligible(left, right) =>
    -        Distinct(Filter(Not(transformCondition(left, skipProject(right))), left))
    +      case e @ Except(left, right) if isEligible(left, right) =>
    +        val newCondition = transformCondition(left, skipProject(right))
    +        newCondition.map { c =>
    +          Distinct(Filter(Not(c), left))
    +        }.getOrElse {
    +          e
    +        }
         }
       }
     
    -  private def transformCondition(left: LogicalPlan, right: LogicalPlan): Expression = {
    +  private def transformCondition(left: LogicalPlan, right: LogicalPlan): Option[Expression] = {
         val filterCondition =
           InferFiltersFromConstraints(combineFilters(right)).asInstanceOf[Filter].condition
     
         val attributeNameMap: Map[String, Attribute] = left.output.map(x => (x.name, x)).toMap
     
    -    filterCondition.transform { case a : AttributeReference => attributeNameMap(a.name) }
    +    if (filterCondition.references.forall(r => attributeNameMap.contains(r.name))) {
    +      Some(filterCondition.transform { case a: AttributeReference => attributeNameMap(a.name) })
    --- End diff --
    
    Yes. There are multiple cases here that we could improve. But making the fix more complicated would only lengthen the review, and this issue blocks the 2.3 RC, so I would like to fix it conservatively.
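
The conservative shape of the change in the diff above (return an `Option` from the condition rewrite and fall back to the original `Except` on `None`) can be sketched on a toy plan model. Everything here is an assumption made for illustration: the case classes only borrow names from Catalyst, conditions are plain strings, and eligibility checks such as `isEligible` are left out.

```scala
// Toy model of the rewrite; these case classes are NOT Spark's Catalyst classes.
object ReplaceExceptSketch {
  sealed trait Plan { def output: Set[String] }
  case class Relation(output: Set[String]) extends Plan
  // `references` lists the column names that `condition` mentions.
  case class Filter(condition: String, references: Set[String], child: Plan) extends Plan {
    def output: Set[String] = child.output
  }
  case class Project(columns: Set[String], child: Plan) extends Plan {
    def output: Set[String] = columns
  }
  case class Distinct(child: Plan) extends Plan {
    def output: Set[String] = child.output
  }
  case class Except(left: Plan, right: Plan) extends Plan {
    def output: Set[String] = left.output
  }

  // Negate the right side's condition only if the left output covers all its references.
  def transformCondition(left: Plan, right: Filter): Option[String] =
    if (right.references.subsetOf(left.output)) Some(s"NOT (${right.condition})")
    else None

  // Rewrite Except into Distinct(Filter(...)) when possible; otherwise keep the plan as-is.
  def rewrite(plan: Plan): Plan = plan match {
    case e @ Except(left, f: Filter) =>
      transformCondition(left, f)
        .map(c => Distinct(Filter(c, f.references, left)))
        .getOrElse(e)
    case other => other
  }
}
```

With a left side that projects only `b`, a right-side filter on `b` is rewritten, while a right-side filter on `a` leaves the `Except` untouched instead of throwing.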


---



[GitHub] spark pull request #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter w...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20444#discussion_r164937374
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceExceptWithFilter.scala ---
    @@ -46,18 +46,27 @@ object ReplaceExceptWithFilter extends Rule[LogicalPlan] {
         }
     
         plan.transform {
    -      case Except(left, right) if isEligible(left, right) =>
    -        Distinct(Filter(Not(transformCondition(left, skipProject(right))), left))
    +      case e @ Except(left, right) if isEligible(left, right) =>
    +        val newCondition = transformCondition(left, skipProject(right))
    +        newCondition.map { c =>
    +          Distinct(Filter(Not(c), left))
    +        }.getOrElse {
    +          e
    +        }
         }
       }
     
    -  private def transformCondition(left: LogicalPlan, right: LogicalPlan): Expression = {
    +  private def transformCondition(left: LogicalPlan, right: LogicalPlan): Option[Expression] = {
         val filterCondition =
           InferFiltersFromConstraints(combineFilters(right)).asInstanceOf[Filter].condition
     
         val attributeNameMap: Map[String, Attribute] = left.output.map(x => (x.name, x)).toMap
     
    -    filterCondition.transform { case a : AttributeReference => attributeNameMap(a.name) }
    +    if (filterCondition.references.forall(r => attributeNameMap.contains(r.name))) {
    +      Some(filterCondition.transform { case a: AttributeReference => attributeNameMap(a.name) })
    --- End diff --
    
    Actually, it may still be possible to add the `Filter` on the child of the left's projection, where it can be applied. But for now this fix LGTM.
    
    We may also need to update the doc of `ReplaceExceptWithFilter` to add this constraint.


---



[GitHub] spark issue #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter when the...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20444
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86851/


---



[GitHub] spark issue #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter when the...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20444
  
    **[Test build #86851 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86851/testReport)** for PR 20444 at commit [`41bd3e0`](https://github.com/apache/spark/commit/41bd3e00d1c31a36cd4565d41dc8e28165e5afe6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter when the...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20444
  
    **[Test build #86851 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86851/testReport)** for PR 20444 at commit [`41bd3e0`](https://github.com/apache/spark/commit/41bd3e00d1c31a36cd4565d41dc8e28165e5afe6).


---



[GitHub] spark pull request #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter w...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20444


---



[GitHub] spark issue #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter when the...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20444
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter when the...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/20444
  
    Yeah, I see. LGTM.
    
    On Wed, Jan 31, 2018, 1:03 PM Xiao Li <no...@github.com> wrote:
    
    > *@gatorsmile* commented on this pull request.
    > ------------------------------
    >
    > In
    > sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceExceptWithFilter.scala
    > <https://github.com/apache/spark/pull/20444#discussion_r164948893>:
    >
    > >      val filterCondition =
    >        InferFiltersFromConstraints(combineFilters(right)).asInstanceOf[Filter].condition
    >
    >      val attributeNameMap: Map[String, Attribute] = left.output.map(x => (x.name, x)).toMap
    >
    > -    filterCondition.transform { case a : AttributeReference => attributeNameMap(a.name) }
    > +    if (filterCondition.references.forall(r => attributeNameMap.contains(r.name))) {
    > +      Some(filterCondition.transform { case a: AttributeReference => attributeNameMap(a.name) })
    >
    > Yes. There are multiple potential cases we can improve for this case. If
    > we make it more complicated, it just takes a longer time to review the
    > work. This blocks the 2.3 RC. Thus, I would like to fix it in a
    > conservative way.



---



[GitHub] spark issue #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter when the...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20444
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/407/


---



[GitHub] spark issue #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter when the...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20444
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #20444: [SPARK-23274] [SQL] Fix ReplaceExceptWithFilter when the...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20444
  
    LGTM


---
