You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by mgaido91 <gi...@git.apache.org> on 2017/10/13 16:00:18 UTC

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/19494

    [SPARK-22249][SQL] isin with empty list throws exception on cached DataFrame

    ## What changes were proposed in this pull request?
    
    As pointed out in the JIRA, there is a bug which causes an exception to be thrown if `isin` is called with an empty list on a cached DataFrame. The PR fixes it.
    
    ## How was this patch tested?
    
    Added UT.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-22249

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19494.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19494
    
----
commit 549ff15f36adbaf8b00628afe9b1f522f5dbcb6d
Author: Marco Gaido <ma...@gmail.com>
Date:   2017-10-13T15:53:29Z

    [SPARK-22249][SQL] isin with empty list throws exception on cached DataFrame

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r145218694
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -102,6 +102,7 @@ case class InMemoryTableScanExec(
         case IsNull(a: Attribute) => statsFor(a).nullCount > 0
         case IsNotNull(a: Attribute) => statsFor(a).count - statsFor(a).nullCount > 0
     
    +    case In(_: AttributeReference, list: Seq[Expression]) if list.isEmpty => Literal.FalseLiteral
    --- End diff --
    
    This should be moved to optimizer rules 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r145251914
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -102,6 +102,7 @@ case class InMemoryTableScanExec(
         case IsNull(a: Attribute) => statsFor(a).nullCount > 0
         case IsNotNull(a: Attribute) => statsFor(a).count - statsFor(a).nullCount > 0
     
    +    case In(_: AttributeReference, list: Seq[Expression]) if list.isEmpty => Literal.FalseLiteral
    --- End diff --
    
    since this is only for filters, does it make any difference null or false?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    @srowen do you mean replacing `contains` with `exists`? If so, might you please explain me why `exists` is a better option? Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    I created the followup PR: https://github.com/apache/spark/pull/19522. Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r144713232
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -104,7 +104,8 @@ case class InMemoryTableScanExec(
     
         case In(a: AttributeReference, list: Seq[Expression]) if list.forall(_.isInstanceOf[Literal]) =>
           list.map(l => statsFor(a).lowerBound <= l.asInstanceOf[Literal] &&
    -        l.asInstanceOf[Literal] <= statsFor(a).upperBound).reduce(_ || _)
    +        l.asInstanceOf[Literal] <= statsFor(a).upperBound)
    --- End diff --
    
    I think this is a bit more elegant and concise as code style, but your idea can be more efficient (though I am not sure how much overhead is introduced by a `False` evaluation). Then I am updating this PR according to your suggestion, thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    **[Test build #82789 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82789/testReport)** for PR 19494 at commit [`f5bc105`](https://github.com/apache/spark/commit/f5bc10582f60bf130af1fefed5b9dc9fb3e748d9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r145252308
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarQuerySuite.scala ---
    @@ -429,4 +429,19 @@ class InMemoryColumnarQuerySuite extends QueryTest with SharedSQLContext {
           checkAnswer(agg_without_cache, agg_with_cache)
         }
       }
    +
    +  test("SPARK-22249: IN should work also with cached DataFrame") {
    --- End diff --
    
    do you mean this test case is not enough and I should add a test case to check the proper behavior of the optimizer after the change?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r144621674
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -104,7 +104,8 @@ case class InMemoryTableScanExec(
     
         case In(a: AttributeReference, list: Seq[Expression]) if list.forall(_.isInstanceOf[Literal]) =>
           list.map(l => statsFor(a).lowerBound <= l.asInstanceOf[Literal] &&
    -        l.asInstanceOf[Literal] <= statsFor(a).upperBound).reduce(_ || _)
    +        l.asInstanceOf[Literal] <= statsFor(a).upperBound)
    --- End diff --
    
    This still looks more complex and less efficient than 
    
    ```
    list.exists(l => statsFor(a).lowerBound <= l.asInstanceOf[Literal] && l.asInstanceOf[Literal] <= statsFor(a).upperBound)
    ```
    
    or better
    
    ```
    val stats = statsFor(a)
    list.exists { l =>
      val literal = l.asInstanceOf[Literal]
      stats.lowerBound <= literal && literal <= stats.upperBound
    }
    ```
    
    The point being that you should be able to short-circuit evaluation here.
    Or have I missed something basic like that these aren't Booleans?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    I will, thanks for your suggestions @gatorsmile.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r145226891
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -102,6 +102,7 @@ case class InMemoryTableScanExec(
         case IsNull(a: Attribute) => statsFor(a).nullCount > 0
         case IsNotNull(a: Attribute) => statsFor(a).count - statsFor(a).nullCount > 0
     
    +    case In(_: AttributeReference, list: Seq[Expression]) if list.isEmpty => Literal.FalseLiteral
    --- End diff --
    
    BTW, this is wrong. If the left value is null, it should return null instead of false.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19494


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r144622642
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -104,7 +104,8 @@ case class InMemoryTableScanExec(
     
         case In(a: AttributeReference, list: Seq[Expression]) if list.forall(_.isInstanceOf[Literal]) =>
           list.map(l => statsFor(a).lowerBound <= l.asInstanceOf[Literal] &&
    -        l.asInstanceOf[Literal] <= statsFor(a).upperBound).reduce(_ || _)
    +        l.asInstanceOf[Literal] <= statsFor(a).upperBound)
    --- End diff --
    
    Exactly, they aren't Booleans. This they are `Expression`s. Thus I can't think of a different (more efficient) way of doing this.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r145219348
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarQuerySuite.scala ---
    @@ -429,4 +429,19 @@ class InMemoryColumnarQuerySuite extends QueryTest with SharedSQLContext {
           checkAnswer(agg_without_cache, agg_with_cache)
         }
       }
    +
    +  test("SPARK-22249: IN should work also with cached DataFrame") {
    --- End diff --
    
    This test case is just an end-to-end test. This test will still pass if the optimizer has a change. We also need a unit test case. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r145218611
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -102,6 +102,7 @@ case class InMemoryTableScanExec(
         case IsNull(a: Attribute) => statsFor(a).nullCount > 0
         case IsNotNull(a: Attribute) => statsFor(a).count - statsFor(a).nullCount > 0
     
    +    case In(_: AttributeReference, list: Seq[Expression]) if list.isEmpty => Literal.FalseLiteral
         case In(a: AttributeReference, list: Seq[Expression]) if list.forall(_.isInstanceOf[Literal]) =>
    --- End diff --
    
    Please change it to 
    > if list.forall(_.isInstanceOf[Literal]) && list.nonEmpty()


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    Sorry for my late review. @mgaido91 Could you submit a follow-up PR to address my comments. Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r144690815
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -104,7 +104,8 @@ case class InMemoryTableScanExec(
     
         case In(a: AttributeReference, list: Seq[Expression]) if list.forall(_.isInstanceOf[Literal]) =>
           list.map(l => statsFor(a).lowerBound <= l.asInstanceOf[Literal] &&
    -        l.asInstanceOf[Literal] <= statsFor(a).upperBound).reduce(_ || _)
    +        l.asInstanceOf[Literal] <= statsFor(a).upperBound)
    --- End diff --
    
    It was a mistake, sorry. It returned always `false`.
    I see what you mean, but in this piece of code we are only building the `Expression` and we are not evaluating it. Thus it is not possible to short-circuit, because the `Expression` must be built entirely.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r145229914
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -102,6 +102,7 @@ case class InMemoryTableScanExec(
         case IsNull(a: Attribute) => statsFor(a).nullCount > 0
         case IsNotNull(a: Attribute) => statsFor(a).count - statsFor(a).nullCount > 0
     
    +    case In(_: AttributeReference, list: Seq[Expression]) if list.isEmpty => Literal.FalseLiteral
    --- End diff --
    
    Ah, really. OK, yeah we need to change this. It can be reverted too.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    I will not revert this PR but please submit the fix ASAP. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r145637774
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -102,6 +102,7 @@ case class InMemoryTableScanExec(
         case IsNull(a: Attribute) => statsFor(a).nullCount > 0
         case IsNotNull(a: Attribute) => statsFor(a).count - statsFor(a).nullCount > 0
     
    +    case In(_: AttributeReference, list: Seq[Expression]) if list.isEmpty => Literal.FalseLiteral
    --- End diff --
    
    Under this special context of filtering partitions, this `In` with empty list will result in a false literal in the end, no matter the attribute is null or not. We don't possibly have some expressions like `IsNull(In(a, Nil))` as the filter predicate for now.
    



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r144689325
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -104,7 +104,8 @@ case class InMemoryTableScanExec(
     
         case In(a: AttributeReference, list: Seq[Expression]) if list.forall(_.isInstanceOf[Literal]) =>
           list.map(l => statsFor(a).lowerBound <= l.asInstanceOf[Literal] &&
    -        l.asInstanceOf[Literal] <= statsFor(a).upperBound).reduce(_ || _)
    +        l.asInstanceOf[Literal] <= statsFor(a).upperBound)
    --- End diff --
    
    I see. How does `.contains(true)` work then? or did that not work?
    I suppose all I mean is that we should write something that works on an empty list (returns false?) and also short-circuits (stops when anything is true). Is that possible?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82780/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    Merged to master/2.2


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    **[Test build #82789 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82789/testReport)** for PR 19494 at commit [`f5bc105`](https://github.com/apache/spark/commit/f5bc10582f60bf130af1fefed5b9dc9fb3e748d9).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    **[Test build #82780 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82780/testReport)** for PR 19494 at commit [`f5bc105`](https://github.com/apache/spark/commit/f5bc10582f60bf130af1fefed5b9dc9fb3e748d9).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19494: [SPARK-22249][SQL] isin with empty list throws ex...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19494#discussion_r144711572
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -104,7 +104,8 @@ case class InMemoryTableScanExec(
     
         case In(a: AttributeReference, list: Seq[Expression]) if list.forall(_.isInstanceOf[Literal]) =>
           list.map(l => statsFor(a).lowerBound <= l.asInstanceOf[Literal] &&
    -        l.asInstanceOf[Literal] <= statsFor(a).upperBound).reduce(_ || _)
    +        l.asInstanceOf[Literal] <= statsFor(a).upperBound)
    --- End diff --
    
    OK I understand. Rather than build an extra term into the expression every time, why not special-case an empty list? return Literal(false) if empty, otherwise return the same thing as before.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    @srowen I also updated the UT to check all the possible cases.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82789/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19494: [SPARK-22249][SQL] isin with empty list throws exception...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19494
  
    **[Test build #82780 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82780/testReport)** for PR 19494 at commit [`f5bc105`](https://github.com/apache/spark/commit/f5bc10582f60bf130af1fefed5b9dc9fb3e748d9).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org