You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by ferdonline <gi...@git.apache.org> on 2018/01/15 14:04:48 UTC

[GitHub] spark pull request #20269: [SPARK-23029] [DOCS] Specifying default units of ...

GitHub user ferdonline opened a pull request:

    https://github.com/apache/spark/pull/20269

    [SPARK-23029] [DOCS] Specifying default units of configuration entries

    ## What changes were proposed in this pull request?
    This PR completes the docs, specifying the default units assumed in configuration entries of type size.
    This is crucial since unit-less values are accepted and the user might assume the base unit is bytes, which in most cases it is not, leading to hard-to-debug problems.
    
    ## How was this patch tested?
    This patch updates only documentation only.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ferdonline/spark docs_units

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20269.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20269
    
----
commit 889426b191f1f542012e0a5c9f0a121f64a2b46e
Author: Fernando Pereira <fe...@...>
Date:   2018-01-15T13:56:23Z

    Specifying default units of configuration entries

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20269: [SPARK-23029] [DOCS] Specifying default units of ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20269


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20269: [SPARK-23029] [DOCS] Specifying default units of ...

Posted by Ngone51 <gi...@git.apache.org>.
Github user Ngone51 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20269#discussion_r238135789
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala ---
    @@ -38,10 +38,13 @@ package object config {
         ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
     
       private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
    +    .doc("Amount of memory to use for the driver process, in MiB unless otherwise specified.")
         .bytesConf(ByteUnit.MiB)
         .createWithDefaultString("1g")
     
       private[spark] val DRIVER_MEMORY_OVERHEAD = ConfigBuilder("spark.driver.memoryOverhead")
    +    .doc("The amount of off-heap memory to be allocated per driver in cluster mode, " +
    --- End diff --
    
    Hi, @ferdonline , can you explain why this is  **off-heap** memory ?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    Merged to master/2.3


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20269: [SPARK-23029] [DOCS] Specifying default units of ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20269#discussion_r161609127
  
    --- Diff: docs/configuration.md ---
    @@ -58,6 +58,8 @@ The following format is accepted:
         1t or 1tb (tebibytes = 1024 gibibytes)
         1p or 1pb (pebibytes = 1024 tebibytes)
     
    +Without specification the unit depends on the configuration entry where KiB are typically assumed.
    --- End diff --
    
    Just looking at the properties that use bytesConf(), there are as many in MiB. And, really the default is just bytes unless otherwise specified. If you say anything here, maybe just 
    
    "While numbers without units are generally interpreted as bytes, a few are interpreted as KiB or MiB when no units are specified, for historical reasons. See documentation of individual configuration properties. Specifying units is desirable where possible."


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    **[Test build #4057 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4057/testReport)** for PR 20269 at commit [`bf8e55e`](https://github.com/apache/spark/commit/bf8e55e0978749171ec21babbfa155484897590c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    **[Test build #4054 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4054/testReport)** for PR 20269 at commit [`9d92235`](https://github.com/apache/spark/commit/9d9223570f399f2eb3ebc37e886f2cfbcad0b68d).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20269: [SPARK-23029] [DOCS] Specifying default units of ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20269#discussion_r161608661
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala ---
    @@ -419,7 +419,7 @@ package object config {
     
       private[spark] val SHUFFLE_FILE_BUFFER_SIZE =
         ConfigBuilder("spark.shuffle.file.buffer")
    -      .doc("Size of the in-memory buffer for each shuffle file output stream. " +
    +      .doc("Size (in KiB) of the in-memory buffer for each shuffle file output stream. " +
    --- End diff --
    
    Really, "in KiB unless otherwise specified"?
    
    Same for the next property below. These two are the only two that aren't in bytes by default, and have a description already. It would be handy to add a blurb about this to all of the "MiB" default properties above this too, for consistency.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20269: [SPARK-23029] [DOCS] Specifying default units of ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20269#discussion_r161609423
  
    --- Diff: docs/configuration.md ---
    @@ -150,6 +152,7 @@ of the most common options to set are:
       <td>
         Amount of memory to use for the driver process, i.e. where SparkContext is initialized.
         (e.g. <code>1g</code>, <code>2g</code>).
    +    Default unit: MiB
    --- End diff --
    
    Everywhere the default isn't bytes, a clause like ", in MiB unless otherwise specified", seems cleanest. There are 9 such properties as far as I can tell. 
    
    Although it would be complete to say "in bytes" for all other properties, probably not necessary.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by ferdonline <gi...@git.apache.org>.
Github user ferdonline commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    Hi. Thanks for your review. Sounds good, I will go around and add a "unit blurb" to them.
    I wrote "Default unit: X" to keep it the shortest and very obvious, but I agree to have nicer english in the html docs.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    **[Test build #4054 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4054/testReport)** for PR 20269 at commit [`9d92235`](https://github.com/apache/spark/commit/9d9223570f399f2eb3ebc37e886f2cfbcad0b68d).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    **[Test build #4058 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4058/testReport)** for PR 20269 at commit [`bf8e55e`](https://github.com/apache/spark/commit/bf8e55e0978749171ec21babbfa155484897590c).
     * This patch passes all tests.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    **[Test build #4058 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4058/testReport)** for PR 20269 at commit [`bf8e55e`](https://github.com/apache/spark/commit/bf8e55e0978749171ec21babbfa155484897590c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by ferdonline <gi...@git.apache.org>.
Github user ferdonline commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20269
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org