Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2017/02/20 18:00:52 UTC

[GitHub] spark pull request #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table ...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/17004

    [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading and Writing Testing Without Hive Support

    ### What changes were proposed in this pull request?
    Bucketed table reading and writing do not need Hive support, so we can move the test cases from `sql/hive` to `sql/core`. After this PR, test coverage improves: bucketed table reading and writing can be tested both with and without Hive support.
    
    ### How was this patch tested?
    N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark mvTestCaseForBuckets

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17004.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17004
    
----
commit 980e08ef0c4da375d13c4bfe1e919543f988b222
Author: Xiao Li <ga...@gmail.com>
Date:   2017-02-19T17:03:44Z

    move

commit 0b1a41ba0ca251215bf9de1cee48ca2a206f1539
Author: Xiao Li <ga...@gmail.com>
Date:   2017-02-19T17:04:27Z

    Merge remote-tracking branch 'upstream/master' into mvTestCaseForBuckets

commit 7a35f16e0f6f51cd2cf9db0171658d047c4a711b
Author: Xiao Li <ga...@gmail.com>
Date:   2017-02-19T17:47:48Z

    add a new test suite

commit 18ff98c1e56a6245ef46af021cdabd942686dff5
Author: Xiao Li <ga...@gmail.com>
Date:   2017-02-20T17:23:27Z

    Merge remote-tracking branch 'upstream/master' into mvTestCaseForBuckets

commit 4c94cef6516af85a8ecc18896e75c936c43e450d
Author: Xiao Li <ga...@gmail.com>
Date:   2017-02-20T17:49:32Z

    move BucketedRead

commit e8d0fc26eadf59a34c8362dcd5f191ec9935eda6
Author: Xiao Li <ga...@gmail.com>
Date:   2017-02-20T17:50:53Z

    style fix.

commit 444aae8ee300f40b104abc0bdd192a3a4c7064b5
Author: Xiao Li <ga...@gmail.com>
Date:   2017-02-20T17:51:13Z

    style fix.

commit 3ecf1872550b72d5eb21e607e23fb07f503ba2f2
Author: Xiao Li <ga...@gmail.com>
Date:   2017-02-20T17:55:50Z

    style fix.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    cc @cloud-fan @tejasapatil 



[GitHub] spark pull request #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table ...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17004



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73239/
    Test PASSed.



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    **[Test build #73239 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73239/testReport)** for PR 17004 at commit [`3ecf187`](https://github.com/apache/spark/commit/3ecf1872550b72d5eb21e607e23fb07f503ba2f2).



[GitHub] spark pull request #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table ...

Posted by tejasapatil <gi...@git.apache.org>.

Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17004#discussion_r102298105
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala ---
    @@ -20,19 +20,29 @@ package org.apache.spark.sql.sources
     import java.io.File
     import java.net.URI
     
    -import org.apache.spark.SparkException
     import org.apache.spark.sql.{AnalysisException, QueryTest}
     import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
     import org.apache.spark.sql.catalyst.plans.physical.HashPartitioning
     import org.apache.spark.sql.execution.datasources.BucketingUtils
     import org.apache.spark.sql.functions._
    -import org.apache.spark.sql.hive.test.TestHiveSingleton
     import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.test.SQLTestUtils
    +import org.apache.spark.sql.internal.StaticSQLConf.CATALOG_IMPLEMENTATION
    +import org.apache.spark.sql.test.{SharedSQLContext, SQLTestUtils}
     
    -class BucketedWriteSuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
    +class BucketedWriteWithoutHiveSupportSuite extends BucketedWriteSuite with SharedSQLContext {
    +  protected override def beforeAll(): Unit = {
    +    super.beforeAll()
    +    assume(spark.sparkContext.conf.get(CATALOG_IMPLEMENTATION) == "in-memory")
    +  }
    +
    +  override protected def fileFormatsToTest: Seq[String] = Seq("parquet", "json")
    --- End diff --
    
    Curious: why is `orc` not in this list?



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    **[Test build #73180 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73180/testReport)** for PR 17004 at commit [`3ecf187`](https://github.com/apache/spark/commit/3ecf1872550b72d5eb21e607e23fb07f503ba2f2).



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    **[Test build #73180 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73180/testReport)** for PR 17004 at commit [`3ecf187`](https://github.com/apache/spark/commit/3ecf1872550b72d5eb21e607e23fb07f503ba2f2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by tejasapatil <gi...@git.apache.org>.

Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    LGTM with a minor nit



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    **[Test build #73239 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73239/testReport)** for PR 17004 at commit [`3ecf187`](https://github.com/apache/spark/commit/3ecf1872550b72d5eb21e607e23fb07f503ba2f2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73180/
    Test PASSed.



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    retest this please



[GitHub] spark pull request #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table ...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17004#discussion_r102300022
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala ---
    @@ -20,19 +20,29 @@ package org.apache.spark.sql.sources
     import java.io.File
     import java.net.URI
     
    -import org.apache.spark.SparkException
     import org.apache.spark.sql.{AnalysisException, QueryTest}
     import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
     import org.apache.spark.sql.catalyst.plans.physical.HashPartitioning
     import org.apache.spark.sql.execution.datasources.BucketingUtils
     import org.apache.spark.sql.functions._
    -import org.apache.spark.sql.hive.test.TestHiveSingleton
     import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.test.SQLTestUtils
    +import org.apache.spark.sql.internal.StaticSQLConf.CATALOG_IMPLEMENTATION
    +import org.apache.spark.sql.test.{SharedSQLContext, SQLTestUtils}
     
    -class BucketedWriteSuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
    +class BucketedWriteWithoutHiveSupportSuite extends BucketedWriteSuite with SharedSQLContext {
    +  protected override def beforeAll(): Unit = {
    +    super.beforeAll()
    +    assume(spark.sparkContext.conf.get(CATALOG_IMPLEMENTATION) == "in-memory")
    +  }
    +
    +  override protected def fileFormatsToTest: Seq[String] = Seq("parquet", "json")
    --- End diff --
    
    `orc` is not available in the `sql/core` package. :(
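
The exchange above turns on one design point: the shared test bodies appear to live in a base suite, and each concrete suite overrides `fileFormatsToTest` to list only the formats its module can load. A minimal, Spark-free sketch of that layout follows. Only `fileFormatsToTest` and the `"parquet"`, `"json"` list come from the diff; the class names, the `plannedChecks` helper, and the Hive-side format list are hypothetical and exist only for illustration.

```scala
// Hypothetical sketch of the suite layout discussed above. It does not use
// Spark or ScalaTest; it only shows how an overridable format list lets one
// set of test bodies serve two modules.
abstract class BucketedWriteChecks {
  // Each concrete suite declares which file formats it can exercise.
  protected def fileFormatsToTest: Seq[String]

  // Stand-in for the shared test bodies: one named check per format.
  def plannedChecks: Seq[String] =
    fileFormatsToTest.map(format => s"write bucketed table as $format")
}

// Mirrors the sql/core suite in the diff: ORC is left out because, per the
// reply above, it is not available in that package.
class WithoutHiveSupportChecks extends BucketedWriteChecks {
  override protected def fileFormatsToTest: Seq[String] = Seq("parquet", "json")
}

// Assumed counterpart for sql/hive; this format list is a guess, not taken
// from the pull request.
class WithHiveSupportChecks extends BucketedWriteChecks {
  override protected def fileFormatsToTest: Seq[String] = Seq("parquet", "orc")
}

object SuiteLayoutDemo {
  def main(args: Array[String]): Unit = {
    // Print the checks each variant would run.
    new WithoutHiveSupportChecks().plannedChecks.foreach(println)
    new WithHiveSupportChecks().plannedChecks.foreach(println)
  }
}
```

With this shape, adding a format to one module later would mean changing a single override rather than duplicating test bodies.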



[GitHub] spark issue #17004: [SPARK-19670] [SQL] [TEST] Enable Bucketed Table Reading...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17004
  
    Thanks! Merging to master.

