Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2018/10/15 15:38:13 UTC

[GitHub] spark pull request #22728: [SPARK-25736][SQL][TEST] add tests to verify the ...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/22728

    [SPARK-25736][SQL][TEST] add tests to verify the behavior of multi-column count

    ## What changes were proposed in this pull request?
    
    AFAIK multi-column count is not widely supported by mainstream databases (Postgres doesn't support it), and the SQL standard doesn't clearly define it.
    
    Since Spark supports it, we should clearly document the current behavior and add tests to verify it.
    
    ## How was this patch tested?
    
    N/A
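
A minimal sketch of the behavior being documented, modelled in plain Python rather than run against Spark. It assumes the semantics given in Spark's function description for `count`: `count(expr1, expr2, ...)` counts the rows in which all supplied expressions are non-null, and `count(DISTINCT expr1, expr2, ...)` counts the distinct combinations among those rows. The helper names `count_all_non_null` and `count_distinct_all_non_null` are hypothetical; the rows are the `testData` values from the PR's `count.sql`.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[Optional[int], ...]

# Same rows as the testData view in count.sql; None stands in for SQL NULL.
TEST_DATA: List[Row] = [
    (1, 1), (1, 2), (2, 1), (1, 1), (None, 2), (1, None), (None, None),
]


def count_all_non_null(rows: Sequence[Row], cols: Sequence[int]) -> int:
    """Model of count(c1, c2, ...): rows where every listed column is non-null."""
    return sum(1 for row in rows if all(row[c] is not None for c in cols))


def count_distinct_all_non_null(rows: Sequence[Row], cols: Sequence[int]) -> int:
    """Model of count(DISTINCT c1, c2, ...): distinct fully non-null combinations."""
    seen = {
        tuple(row[c] for c in cols)
        for row in rows
        if all(row[c] is not None for c in cols)
    }
    return len(seen)


if __name__ == "__main__":
    # Columns a and b are positions 0 and 1 of each row.
    print(count_all_non_null(TEST_DATA, [0, 1]))           # models count(a, b)
    print(count_distinct_all_non_null(TEST_DATA, [0, 1]))  # models count(DISTINCT a, b)
```

Under this model the argument order does not matter, so `count(a, b)` and `count(b, a)` agree; the authoritative expected values are the ones recorded in the PR's golden result file, not this sketch.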


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark doc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22728.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22728
    
----
commit 62b4b84f135c2f71ecc8192deabec3d694b6bbc9
Author: Wenchen Fan <we...@...>
Date:   2018-10-15T15:25:14Z

    add tests to verify the behavior of count for corner cases

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    (From https://github.com/apache/spark/pull/22773#issuecomment-432917994) @gatorsmile and @cloud-fan, let's say this will break `DESCRIBE FUNCTION EXTENDED`. Should we update the migration guide as well?


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97420/
    Test PASSed.


---



[GitHub] spark pull request #22728: [SPARK-25736][SQL][TEST] add tests to verify the ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22728


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    thanks for your work @cloud-fan !


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    **[Test build #97401 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97401/testReport)** for PR 22728 at commit [`708d7fd`](https://github.com/apache/spark/commit/708d7fd33b6164080d5e6fa28e0cbd783411e2e1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    **[Test build #97420 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97420/testReport)** for PR 22728 at commit [`e3aaa90`](https://github.com/apache/spark/commit/e3aaa90bd3b5f12a892f11be67ac26326c3b18ce).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    **[Test build #97401 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97401/testReport)** for PR 22728 at commit [`708d7fd`](https://github.com/apache/spark/commit/708d7fd33b6164080d5e6fa28e0cbd783411e2e1).


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    **[Test build #97400 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97400/testReport)** for PR 22728 at commit [`62b4b84`](https://github.com/apache/spark/commit/62b4b84f135c2f71ecc8192deabec3d694b6bbc9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97400/
    Test PASSed.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    FYI, I tried both Hive and Presto; neither of them supports multi-column count.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3992/
    Test PASSed.


---



[GitHub] spark pull request #22728: [SPARK-25736][SQL][TEST] add tests to verify the ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22728#discussion_r225298840
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/count.sql ---
    @@ -0,0 +1,21 @@
    +-- Test data.
    +CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES
    +(1, 1), (1, 2), (2, 1), (1, 1), (null, 2), (1, null), (null, null)
    +AS testData(a, b);
    +
    +-- count with single expression
    +SELECT count(a), count(b), count(a + b), count((a, b)) FROM testData;
    +
    +-- distinct count with single expression
    +SELECT
    +  count(DISTINCT a),
    +  count(DISTINCT b),
    +  count(DISTINCT (a + b)),
    +  count(DISTINCT (a, b))
    +FROM testData;
    +
    +-- count with multiple expressions
    +SELECT count(a, b), count(b, a), count(testData.*) FROM testData;
    --- End diff --
    
    Please also include `count(*)`


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    cc @gatorsmile @mgaido91 @viirya 


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97401/
    Test PASSed.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3993/
    Test PASSed.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    LGTM


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Yea, it is definitely good to add documentation and tests for the current behavior.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    **[Test build #97400 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97400/testReport)** for PR 22728 at commit [`62b4b84`](https://github.com/apache/spark/commit/62b4b84f135c2f71ecc8192deabec3d694b6bbc9).


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Merged to master and branch-2.4.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    This is indeed the behavior I'd expect. Good to add tests to enforce it. Did you check other RDBMSs apart from Postgres?


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    **[Test build #97420 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97420/testReport)** for PR 22728 at commit [`e3aaa90`](https://github.com/apache/spark/commit/e3aaa90bd3b5f12a892f11be67ac26326c3b18ce).


---



[GitHub] spark pull request #22728: [SPARK-25736][SQL][TEST] add tests to verify the ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22728#discussion_r225299298
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/count.sql ---
    @@ -0,0 +1,21 @@
    +-- Test data.
    +CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES
    +(1, 1), (1, 2), (2, 1), (1, 1), (null, 2), (1, null), (null, null)
    +AS testData(a, b);
    +
    +-- count with single expression
    +SELECT count(a), count(b), count(a + b), count((a, b)) FROM testData;
    +
    +-- distinct count with single expression
    +SELECT
    +  count(DISTINCT a),
    +  count(DISTINCT b),
    +  count(DISTINCT (a + b)),
    +  count(DISTINCT (a, b))
    +FROM testData;
    +
    +-- count with multiple expressions
    +SELECT count(a, b), count(b, a), count(testData.*) FROM testData;
    +
    +-- distinct count with multiple expressions
    +SELECT count(DISTINCT a, b), count(DISTINCT b, a), count(DISTINCT testData.*) FROM testData;
    --- End diff --
    
    Also include `count(DISTINCT *)` 


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    BTW MySQL doesn't support `count(a, b)` but supports `count(distinct a, b)`; the result is the same as Spark's.


---



[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22728
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4010/
    Test PASSed.


---
