Posted to reviews@spark.apache.org by jiangxb1987 <gi...@git.apache.org> on 2017/10/16 07:06:42 UTC

[GitHub] spark pull request #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to fi...

GitHub user jiangxb1987 opened a pull request:

    https://github.com/apache/spark/pull/19504

    [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to filter out empty split in HadoopRDD

    ## What changes were proposed in this pull request?
    
    Rename the config `spark.files.ignoreEmptySplits` to `spark.hadoopRDD.ignoreEmptySplits` and make it internal.
    
    ## How was this patch tested?
    
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiangxb1987/spark partitionsplit

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19504.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19504
    
----
commit bcb3dbd178650f3c6bf54af32b0b1029b89286dd
Author: Xingbo Jiang <xi...@databricks.com>
Date:   2017-10-16T07:02:49Z

    update config spark.hadoopRDD.ignoreEmptySplits

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to filter ou...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19504
  
    **[Test build #82787 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82787/testReport)** for PR 19504 at commit [`bcb3dbd`](https://github.com/apache/spark/commit/bcb3dbd178650f3c6bf54af32b0b1029b89286dd).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to fi...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19504


---



[GitHub] spark pull request #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to fi...

Posted by liutang123 <gi...@git.apache.org>.
Github user liutang123 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19504#discussion_r144823226
  
    --- Diff: core/src/test/scala/org/apache/spark/FileSuite.scala ---
    @@ -549,9 +551,11 @@ class FileSuite extends SparkFunSuite with LocalSparkContext {
           expectedPartitionNum = 2)
       }
     
    -  test("spark.files.ignoreEmptySplits work correctly (new Hadoop API)") {
    +  test("spark.hadoopRDD.ignoreEmptySplits work correctly (new Hadoop API)") {
         val conf = new SparkConf()
    -    conf.setAppName("test").setMaster("local").set(IGNORE_EMPTY_SPLITS, true)
    +      .setAppName("test")
    +      .setMaster("local")
    +      .set(HADOOP_RDD_IGNORE_EMPTY_SPLITS, true)
         sc = new SparkContext(conf)
     
         def testIgnoreEmptySplits(
    --- End diff --
    
    ```
    testIgnoreEmptySplits(
      data = Array.empty[Tuple2[String, String]],
      actualPartitionNum = 1,
      expectedPartitionNum = 0)
    ```
    =>
    ```
    testIgnoreEmptySplits(
      data = Array.empty[(String, String)],
      actualPartitionNum = 1,
      expectedPartitionNum = 0)
    ```
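
The suggestion above changes only the notation: in Scala, `(String, String)` is syntactic sugar for `Tuple2[String, String]`, so the test behaves the same either way. A minimal standalone sketch of that equivalence follows; the object and value names (`TupleSugarDemo`, `explicitForm`, `sugaredForm`, `roundTrip`, `typesAreEqual`) are hypothetical and are not part of the Spark code base.

```scala
// Illustrative sketch only; all names here are hypothetical.
object TupleSugarDemo {
  // The explicit form used in the original test call.
  val explicitForm: Array[Tuple2[String, String]] =
    Array.empty[Tuple2[String, String]]

  // The sugared form proposed in the review comment.
  val sugaredForm: Array[(String, String)] =
    Array.empty[(String, String)]

  // This assignment compiles only because both annotations name the same type.
  val roundTrip: Array[Tuple2[String, String]] = sugaredForm

  // Compile-time evidence that the two type expressions are equal;
  // the line would fail to compile if they were different types.
  val typesAreEqual: Boolean =
    implicitly[Tuple2[String, String] =:= (String, String)] != null

  def main(args: Array[String]): Unit = {
    println(s"both arrays empty: ${explicitForm.isEmpty && sugaredForm.isEmpty}")
    println(s"types are equal: $typesAreEqual")
  }
}
```

Because the two spellings denote one type, the choice between them is purely a readability preference.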


---



[GitHub] spark issue #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to filter ou...

Posted by liutang123 <gi...@git.apache.org>.
Github user liutang123 commented on the issue:

    https://github.com/apache/spark/pull/19504
  
    It looks better.


---



[GitHub] spark issue #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to filter ou...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19504
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to filter ou...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19504
  
    cc @liutang123 @HyukjinKwon @gatorsmile @cloud-fan 


---



[GitHub] spark issue #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to filter ou...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19504
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82787/


---



[GitHub] spark issue #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to filter ou...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19504
  
    **[Test build #82787 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82787/testReport)** for PR 19504 at commit [`bcb3dbd`](https://github.com/apache/spark/commit/bcb3dbd178650f3c6bf54af32b0b1029b89286dd).


---



[GitHub] spark issue #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to filter ou...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19504
  
    LGTM, merging to master!


---
