Posted to reviews@spark.apache.org by mbautin <gi...@git.apache.org> on 2015/10/27 21:21:50 UTC

[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

GitHub user mbautin opened a pull request:

    https://github.com/apache/spark/pull/9308

    [SPARK-10707] [SQL] Fix nullability computation in union output

    Conflicts:


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mbautin/spark SPARK-10707

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9308.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9308
    
----
commit f444ae86b00e3a2cb7c73e7240fd881a7507a1a0
Author: Mikhail Bautin <mb...@gmail.com>
Date:   2015-09-18T19:49:26Z

    Fix nullability computation in union output
    
    Conflicts:
    	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
    	sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151687160
  
    **[Test build #44480 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44480/consoleFull)** for PR 9308 at commit [`6958a49`](https://github.com/apache/spark/commit/6958a49aeae446008c90c59f8c09b11fdb54d397).




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151693023
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151689553
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151686171
  
    Merged build started.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151686154
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151654155
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151632980
  
    **[Test build #44462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44462/consoleFull)** for PR 9308 at commit [`f444ae8`](https://github.com/apache/spark/commit/f444ae86b00e3a2cb7c73e7240fd881a7507a1a0).




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151689517
  
    **[Test build #44480 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44480/consoleFull)** for PR 9308 at commit [`6958a49`](https://github.com/apache/spark/commit/6958a49aeae446008c90c59f8c09b11fdb54d397).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `case class Except(left: LogicalPlan, right: LogicalPlan) extends SetOperation(left, right)`




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151656521
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44466/
    Test FAILed.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151654182
  
    Merged build started.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151653099
  
    Merged build started.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151634169
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44462/
    Test FAILed.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151632648
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9308#discussion_r43270151
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1870,4 +1870,19 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
           assert(sampled.count() == sampledOdd.count() + sampledEven.count())
         }
       }
    +
    +  test("SPARK-10707: nullability should be correctly propagated through set operations") {
    +    withTempTable("src") {
    +      Seq((1, 1)).toDF("key", "value").registerTempTable("src")
    +      checkAnswer(
    +        sql("""SELECT count(v) FROM (
    +              |  SELECT v FROM (
    +              |    SELECT 'foo' AS v FROM src UNION ALL
    +              |    SELECT NULL AS v FROM src
    +              |  ) my_union WHERE isnull(v)
    +              |) my_subview""".stripMargin),
    +        Seq(Row(0)))
    --- End diff --
    
    @cloud-fan You seem to have neglected the WHERE clause.  The `count(v)` is over only one record, `null`.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9308#discussion_r43336393
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1870,4 +1870,19 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
           assert(sampled.count() == sampledOdd.count() + sampledEven.count())
         }
       }
    +
    +  test("SPARK-10707: nullability should be correctly propagated through set operations") {
    +    withTempTable("src") {
    +      Seq((1, 1)).toDF("key", "value").registerTempTable("src")
    +      checkAnswer(
    +        sql("""SELECT count(v) FROM (
    +              |  SELECT v FROM (
    +              |    SELECT 'foo' AS v FROM src UNION ALL
    +              |    SELECT NULL AS v FROM src
    +              |  ) my_union WHERE isnull(v)
    +              |) my_subview""".stripMargin),
    +        Seq(Row(0)))
    --- End diff --
    
    OK, but shouldn't the result of `count(v)` over a null record also be 1?




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151656520
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151655957
  
    **[Test build #44468 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44468/consoleFull)** for PR 9308 at commit [`b27bc75`](https://github.com/apache/spark/commit/b27bc75bc476d53056c3a2e01709ace1b6ab3ccb).




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9308#discussion_r43340250
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1870,4 +1870,19 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
           assert(sampled.count() == sampledOdd.count() + sampledEven.count())
         }
       }
    +
    +  test("SPARK-10707: nullability should be correctly propagated through set operations") {
    +    withTempTable("src") {
    +      Seq((1, 1)).toDF("key", "value").registerTempTable("src")
    +      checkAnswer(
    +        sql("""SELECT count(v) FROM (
    +              |  SELECT v FROM (
    +              |    SELECT 'foo' AS v FROM src UNION ALL
    +              |    SELECT NULL AS v FROM src
    +              |  ) my_union WHERE isnull(v)
    +              |) my_subview""".stripMargin),
    +        Seq(Row(0)))
    --- End diff --
    
    No, `count` of a specific column should return the number of non-null entries in that column.
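
The distinction between `count(v)` and `count(*)` can be illustrated without Spark. The following is a minimal sketch in plain Scala, assuming a nullable string column is modelled as `Seq[Option[String]]`; the object `CountSemantics` and its helpers `countColumn` and `countStar` are hypothetical names used only for illustration and are not part of Spark.

```scala
// Hypothetical model: a nullable column is a Seq[Option[A]], where None stands for SQL NULL.
object CountSemantics {
  // COUNT(col): counts only the non-null entries of the column.
  def countColumn[A](column: Seq[Option[A]]): Long =
    column.count(_.isDefined).toLong

  // COUNT(*): counts rows, whether or not the column value is null.
  def countStar[A](column: Seq[Option[A]]): Long =
    column.size.toLong

  def main(args: Array[String]): Unit = {
    // SELECT 'foo' AS v ... UNION ALL SELECT NULL AS v ...
    val myUnion: Seq[Option[String]] = Seq(Some("foo"), None)
    // ... WHERE isnull(v): only the NULL row survives the filter.
    val mySubview = myUnion.filter(_.isEmpty)

    println(countColumn(mySubview)) // count(v)
    println(countStar(mySubview))   // count(*)
  }
}
```

Under this model the filtered view holds a single row whose `v` is null, so the column count and the row count differ, which is consistent with the `Row(0)` expected by the test in the diff.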




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151714713
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44482/
    Test PASSed.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151656413
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151632695
  
    Merged build started.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151656414
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44468/
    Test FAILed.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151694743
  
    **[Test build #44482 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44482/consoleFull)** for PR 9308 at commit [`319d105`](https://github.com/apache/spark/commit/319d10515ace5cb72a747518ee74957279a06d0b).




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151693036
  
    Merged build started.




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9308#discussion_r43228226
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1870,4 +1870,19 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
           assert(sampled.count() == sampledOdd.count() + sampledEven.count())
         }
       }
    +
    +  test("SPARK-10707: nullability should be correctly propagated through set operations") {
    +    withTempTable("src") {
    +      Seq((1, 1)).toDF("key", "value").registerTempTable("src")
    +      checkAnswer(
    +        sql("""SELECT count(v) FROM (
    +              |  SELECT v FROM (
    +              |    SELECT 'foo' AS v FROM src UNION ALL
    +              |    SELECT NULL AS v FROM src
    +              |  ) my_union WHERE isnull(v)
    +              |) my_subview""".stripMargin),
    +        Seq(Row(0)))
    --- End diff --
    
    Why is the result 0? The `my_union` table should have 2 records, `foo` and `null`, and `my_subview` should have 1 record, `foo`, so shouldn't `count(v)` be 1?




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151714652
  
    **[Test build #44482 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44482/consoleFull)** for PR 9308 at commit [`319d105`](https://github.com/apache/spark/commit/319d10515ace5cb72a747518ee74957279a06d0b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `case class Except(left: LogicalPlan, right: LogicalPlan) extends SetOperation(left, right)`




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9308#discussion_r43353616
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1870,4 +1870,19 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
           assert(sampled.count() == sampledOdd.count() + sampledEven.count())
         }
       }
    +
    +  test("SPARK-10707: nullability should be correctly propagated through set operations") {
    +    withTempTable("src") {
    +      Seq((1, 1)).toDF("key", "value").registerTempTable("src")
    +      checkAnswer(
    +        sql("""SELECT count(v) FROM (
    +              |  SELECT v FROM (
    +              |    SELECT 'foo' AS v FROM src UNION ALL
    +              |    SELECT NULL AS v FROM src
    +              |  ) my_union WHERE isnull(v)
    +              |) my_subview""".stripMargin),
    +        Seq(Row(0)))
    --- End diff --
    
    Spark SQL does not give you the correct answer now:
    ```
    scala> Seq((1, 1)).toDF("key", "value").registerTempTable("src")
    
    scala> sqlContext.sql("SELECT count(v) FROM (SELECT v FROM (SELECT 'foo' AS v FROM src UNION ALL SELECT NULL AS v FROM src ) my_union WHERE isnull(v) ) my_subview")
    res1: org.apache.spark.sql.DataFrame = [_c0: bigint]
    
    scala> res1.collect()
    res2: Array[org.apache.spark.sql.Row] = Array([1])
    ```
    
    And as Mikhail explained in the JIRA, Spark SQL actually gives a different answer when the left and right sides of the UNION are swapped, even though that should be a symmetric operation. 
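
Why the answer can depend on operand order can be sketched in plain Scala. This is a simplified model, not the Catalyst code: `Attr`, `UnionNullability`, `leftOnlyOutput` and `mergedOutput` are hypothetical names, and the assumption that the pre-fix union output simply reused the left child's attributes is an inference that the thread itself does not spell out.

```scala
// Hypothetical stand-in for a plan output attribute: a name plus a nullability flag.
final case class Attr(name: String, nullable: Boolean)

object UnionNullability {
  // Assumed pre-fix behaviour: the union output is the left child's output as is,
  // so a nullable right-hand column is reported as non-nullable.
  def leftOnlyOutput(left: Seq[Attr], right: Seq[Attr]): Seq[Attr] = left

  // Symmetric alternative: a union column is nullable if it is nullable on either side.
  def mergedOutput(left: Seq[Attr], right: Seq[Attr]): Seq[Attr] =
    left.zip(right).map { case (l, r) => l.copy(nullable = l.nullable || r.nullable) }

  def main(args: Array[String]): Unit = {
    val fooSide  = Seq(Attr("v", nullable = false)) // SELECT 'foo' AS v
    val nullSide = Seq(Attr("v", nullable = true))  // SELECT NULL AS v

    // Asymmetric: the reported nullability changes when the operands are swapped.
    println(leftOnlyOutput(fooSide, nullSide))
    println(leftOnlyOutput(nullSide, fooSide))

    // Symmetric: both operand orders report a nullable column.
    println(mergedOutput(fooSide, nullSide))
    println(mergedOutput(nullSide, fooSide))
  }
}
```

If an optimizer trusts a non-nullable flag, it may simplify `isnull(v)` to a constant false, which is one plausible way a wrong nullability on the union output could lead to a wrong query result.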




[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9308#discussion_r43369777
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1870,4 +1870,19 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
           assert(sampled.count() == sampledOdd.count() + sampledEven.count())
         }
       }
    +
    +  test("SPARK-10707: nullability should be correctly propagated through set operations") {
    +    withTempTable("src") {
    +      Seq((1, 1)).toDF("key", "value").registerTempTable("src")
    +      checkAnswer(
    +        sql("""SELECT count(v) FROM (
    +              |  SELECT v FROM (
    +              |    SELECT 'foo' AS v FROM src UNION ALL
    +              |    SELECT NULL AS v FROM src
    +              |  ) my_union WHERE isnull(v)
    +              |) my_subview""".stripMargin),
    +        Seq(Row(0)))
    --- End diff --
    
    This test doesn't look clear to me. How about:
    ```
    checkAnswer(
      sql(
        """
          |SELECT a FROM (
          |  SELECT ISNULL(v) AS a, RAND() FROM (
          |    SELECT 'foo' AS v UNION ALL SELECT null AS v
          |  ) my_union
          |) my_view
        """.stripMargin),
      Row(false) :: Row(true) :: Nil)
    ```
    
    Using `RAND()` stops column pruning for `Union`, so we can show that `isnull(v)` returns the wrong result because the `Union` output has the wrong nullability.
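
The reasoning in this comment can be sketched without Spark. The model below assumes that an `isnull` check is folded to `false` whenever the schema claims the column is non-nullable. `Column`, `isNull`, and `evaluate` are hypothetical names used only for illustration, not Catalyst APIs.

```scala
// Toy model only: shows how a wrong nullable flag can change an isnull result.
object IsNullFoldingSketch {
  // A column reduced to its name and the nullability the schema reports.
  final case class Column(name: String, nullable: Boolean)

  // Hypothetical evaluator: if the schema says the column is non-nullable,
  // the check yields false without inspecting the value.
  def isNull(col: Column, value: Option[String]): Boolean =
    col.nullable && value.isEmpty

  // The two rows produced by: SELECT 'foo' AS v UNION ALL SELECT null AS v
  val unionRows: Seq[Option[String]] = Seq(Some("foo"), None)

  def evaluate(col: Column): Seq[Boolean] = unionRows.map(isNull(col, _))

  def main(args: Array[String]): Unit = {
    println(evaluate(Column("v", nullable = false))) // schema wrongly claims non-nullable
    println(evaluate(Column("v", nullable = true)))  // schema correctly claims nullable
  }
}
```

In this model only the correctly nullable column yields the `false`, `true` pair that the proposed `checkAnswer` expects.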


[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151689554
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44480/
    Test FAILed.


[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9308#discussion_r43227830
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1870,4 +1870,19 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
           assert(sampled.count() == sampledOdd.count() + sampledEven.count())
         }
       }
    +
    +  test("SPARK-10707: nullability should be correctly propagated through set operations") {
    +    withTempTable("src") {
    +      Seq((1, 1)).toDF("key", "value").registerTempTable("src")
    +      checkAnswer(
    +        sql("""SELECT count(v) FROM (
    +              |  SELECT v FROM (
    +              |    SELECT 'foo' AS v FROM src UNION ALL
    --- End diff --
    
    Spark SQL supports `SELECT xxx` without a `FROM` clause, so we don't need to create the `src` table.


[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151634166
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151656408
  
    **[Test build #44468 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44468/consoleFull)** for PR 9308 at commit [`b27bc75`](https://github.com/apache/spark/commit/b27bc75bc476d53056c3a2e01709ace1b6ab3ccb).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `case class Except(left: LogicalPlan, right: LogicalPlan) extends SetOperation(left, right)`


[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151634164
  
    **[Test build #44462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44462/consoleFull)** for PR 9308 at commit [`f444ae8`](https://github.com/apache/spark/commit/f444ae86b00e3a2cb7c73e7240fd881a7507a1a0).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151653082
  
     Merged build triggered.


[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9308#issuecomment-151714712
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-10707] [SQL] Fix nullability computatio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9308#discussion_r43345851
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1870,4 +1870,19 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
           assert(sampled.count() == sampledOdd.count() + sampledEven.count())
         }
       }
    +
    +  test("SPARK-10707: nullability should be correctly propagated through set operations") {
    +    withTempTable("src") {
    +      Seq((1, 1)).toDF("key", "value").registerTempTable("src")
    +      checkAnswer(
    +        sql("""SELECT count(v) FROM (
    +              |  SELECT v FROM (
    +              |    SELECT 'foo' AS v FROM src UNION ALL
    +              |    SELECT NULL AS v FROM src
    +              |  ) my_union WHERE isnull(v)
    +              |) my_subview""".stripMargin),
    +        Seq(Row(0)))
    --- End diff --
    
    Then what does this test prove? Without this fix, `my_union` is not nullable and `isnull(v)` will always be false, so `my_subview` will be empty and the result of `count(v)` will also be 0. Did I miss something here?

