
Posted to reviews@spark.apache.org by vanzin <gi...@git.apache.org> on 2014/08/08 01:32:07 UTC

[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

GitHub user vanzin opened a pull request:

    https://github.com/apache/spark/pull/1843

    [SPARK-2889] Create Hadoop config objects consistently.

    Different places in the code were instantiating Configuration / YarnConfiguration objects in different ways. This could lead to confusion for people who actually expected "spark.hadoop.*" options to end up in the configs used by Spark code, since that would only happen for the SparkContext's config.
    
    This change modifies most places to use SparkHadoopUtil to initialize configs, and makes that method do the translation that previously was only done inside SparkContext.
    
    The places that were not changed fall in one of the following categories:
    - Test code where this doesn't really matter
    - Places deep in the code where plumbing SparkConf would be too difficult for very little gain
    - Default values for arguments - since the caller can provide their own config in that case
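    
    As a rough illustration of the translation described above, here is a minimal sketch. It assumes plain Scala Maps as stand-ins for SparkConf and Hadoop's Configuration so it runs without Spark or Hadoop on the classpath; `HadoopConfSketch` and `translate` are hypothetical names, not part of this patch.
    
    ```scala
    // Hypothetical stand-in: plain Maps replace SparkConf and Hadoop's Configuration
    // so the sketch runs without Spark or Hadoop on the classpath.
    object HadoopConfSketch {
      private val Prefix = "spark.hadoop."
    
      // Copy every "spark.hadoop.foo" entry into the Hadoop-side settings as "foo".
      // Entries without the prefix are ignored; prefixed entries override defaults.
      def translate(
          sparkSettings: Map[String, String],
          hadoopDefaults: Map[String, String] = Map.empty): Map[String, String] = {
        val overrides = sparkSettings.collect {
          case (key, value) if key.startsWith(Prefix) =>
            key.substring(Prefix.length) -> value
        }
        hadoopDefaults ++ overrides
      }
    
      def main(args: Array[String]): Unit = {
        val sparkSettings = Map(
          "spark.app.name" -> "demo",
          "spark.hadoop.fs.defaultFS" -> "hdfs://namenode:8020")
        // Only the prefixed entry is carried over, with the prefix removed.
        println(translate(sparkSettings))
      }
    }
    ```
    
    Routing every call site through one helper like this is what makes the "spark.hadoop.*" behavior consistent, instead of depending on which code path created the config.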

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-2889

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1843.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1843
    
----
commit 1e7003ff01778f1a3be0f006fc721495ce13a0e2
Author: Marcelo Vanzin <va...@cloudera.com>
Date:   2014-08-07T16:12:17Z

    Replace explicit Configuration instantiation with SparkHadoopUtil.
    
    This is the basic grunt work; code doesn't fully compile yet, since
    I'll do some of the more questionable changes in separate commits.

commit b8ab1737c8230481a7797e5b174d07eea9f880d6
Author: Marcelo Vanzin <va...@cloudera.com>
Date:   2014-08-07T17:12:34Z

    Update Utils API to take a Configuration argument.
    
    Instead of using "new Configuration()" where a configuration is
    needed, let the caller provide a context-appropriate config
    object.

commit f16cadd2e4c0426d6aca1e125403c1427cb2d0c4
Author: Marcelo Vanzin <va...@cloudera.com>
Date:   2014-08-07T17:17:50Z

    Initialize config in SparkHadoopUtil.
    
    This is sort of hackish, since it doesn't account for any customization
    someone might make to SparkConf before they actually start executing spark
    code. Instead, this will only consider options available in the
    system properties when creating the hadoop conf.

commit 3f2676052937d193b3415b7c7aeeb4a6dad8eeba
Author: Marcelo Vanzin <va...@cloudera.com>
Date:   2014-08-07T17:22:24Z

    Compilation fix.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53914169
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19485/consoleFull) for   PR 1843 at commit [`52daf35`](https://github.com/apache/spark/commit/52daf357c698a7e3c03a75a715105da81e4c9114).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class ExecutorClassLoader(conf: SparkConf, classUri: String, parent: ClassLoader,`




[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by vanzin <gi...@git.apache.org>.

Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1843#discussion_r16722717
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
    @@ -68,7 +68,26 @@ class SparkHadoopUtil extends Logging {
        * Return an appropriate (subclass) of Configuration. Creating config can initializes some Hadoop
        * subsystems.
        */
    -  def newConfiguration(): Configuration = new Configuration()
    +  def newConfiguration(conf: SparkConf): Configuration = {
    --- End diff --
    
    I know the whole "deploy" package is excluded from mima checks (because I added the exclude at @pwendell's request). How is it documented that these packages are "private", if at all? Do we need explicit annotations in that case?
    
    (http://spark.apache.org/docs/1.0.0/api/scala/#package does not list the package, so maybe that's it?)



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-52945636
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19060/consoleFull) for   PR 1843 at commit [`0ac3fdf`](https://github.com/apache/spark/commit/0ac3fdfff8cc683e656561adc3a4375f5b110127).
     * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53906725
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19485/consoleFull) for   PR 1843 at commit [`52daf35`](https://github.com/apache/spark/commit/52daf357c698a7e3c03a75a715105da81e4c9114).
     * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by mateiz <gi...@git.apache.org>.

Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1843#discussion_r16751825
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
    @@ -68,7 +68,26 @@ class SparkHadoopUtil extends Logging {
        * Return an appropriate (subclass) of Configuration. Creating config can initializes some Hadoop
        * subsystems.
        */
    -  def newConfiguration(): Configuration = new Configuration()
    +  def newConfiguration(conf: SparkConf): Configuration = {
    --- End diff --
    
    It's the same as the rest of the codebase -- everything that is "private" should be marked `private[spark]`. Things that we need to make public for advanced developers are `@DeveloperApi`. In this case, this thing has been public so we can't remove it, but we could at least mark it to tell people not to depend on it.
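    
    The two visibility levels mentioned here could look roughly like the sketch below. It is only an illustration under stated assumptions: the `DeveloperApi` class is a hypothetical stand-in for Spark's annotation, an enclosing object named `spark` plays the role of the real package, and `InternalHelper` and `HadoopUtilSketch` are made-up names.
    
    ```scala
    import scala.annotation.StaticAnnotation
    
    // Hypothetical stand-in for Spark's @DeveloperApi marker annotation.
    class DeveloperApi extends StaticAnnotation
    
    // An enclosing object named `spark` plays the role of the org.apache.spark package.
    object spark {
      // Visible only to code nested inside `spark`.
      private[spark] class InternalHelper {
        def describe: String = "internal"
      }
    
      // Public, but flagged so that users know not to depend on it.
      @DeveloperApi
      class HadoopUtilSketch {
        def describe: String = new InternalHelper().describe + " via developer API"
      }
    }
    ```
    
    Code outside `spark` can still instantiate `spark.HadoopUtilSketch`, but a reference to `spark.InternalHelper` from outside would be rejected by the compiler.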



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53451443
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19215/consoleFull) for   PR 1843 at commit [`3d345cb`](https://github.com/apache/spark/commit/3d345cba145cdaaa5555ddea23708f7c825f4640).
     * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by vanzin <gi...@git.apache.org>.

Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53638348
  
    Jenkins, test this please.



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-51635063
  
    QA tests have started for PR 1843. This patch merges cleanly.
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18210/consoleFull



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by mateiz <gi...@git.apache.org>.

Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1843#discussion_r16693018
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/SparkTachyonHdfsLR.scala ---
    @@ -52,8 +52,8 @@ object SparkTachyonHdfsLR {
     
       def main(args: Array[String]) {
         val inputPath = args(0)
    -    val conf = SparkHadoopUtil.get.newConfiguration()
         val sparkConf = new SparkConf().setAppName("SparkTachyonHdfsLR")
    +    val conf = SparkHadoopUtil.get.newConfiguration(sparkConf)
    --- End diff --
    
    Same here, let's not use this in examples



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53521392
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19269/consoleFull) for   PR 1843 at commit [`51e71cf`](https://github.com/apache/spark/commit/51e71cf0d7943fe927fbbd7f47a921cd46571afd).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `"$FWDIR"/bin/spark-submit --class $CLASS "$`
      * `class ExecutorClassLoader(conf: SparkConf, classUri: String, parent: ClassLoader,`
      * `"$FWDIR"/bin/spark-submit --class $CLASS "$`




[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by vanzin <gi...@git.apache.org>.

Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1843#discussion_r16753695
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
    @@ -68,7 +68,26 @@ class SparkHadoopUtil extends Logging {
        * Return an appropriate (subclass) of Configuration. Creating config can initializes some Hadoop
        * subsystems.
        */
    -  def newConfiguration(): Configuration = new Configuration()
    +  def newConfiguration(conf: SparkConf): Configuration = {
    --- End diff --
    
    ok, I added the annotation.



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-51546541
  
    QA tests have started for PR 1843. This patch merges cleanly.
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18155/consoleFull



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-52954706
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19060/consoleFull) for   PR 1843 at commit [`0ac3fdf`](https://github.com/apache/spark/commit/0ac3fdfff8cc683e656561adc3a4375f5b110127).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class ExecutorClassLoader(conf: SparkConf, classUri: String, parent: ClassLoader,`




[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by vanzin <gi...@git.apache.org>.

Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-52944713
  
    Jenkins, test this please.



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53460874
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19215/consoleFull) for   PR 1843 at commit [`3d345cb`](https://github.com/apache/spark/commit/3d345cba145cdaaa5555ddea23708f7c825f4640).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class ExecutorClassLoader(conf: SparkConf, classUri: String, parent: ClassLoader,`




[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53671595
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19371/consoleFull) for   PR 1843 at commit [`f179013`](https://github.com/apache/spark/commit/f179013d4dbb94238e0b282511c248d4d8f2d7a9).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class ExecutorClassLoader(conf: SparkConf, classUri: String, parent: ClassLoader,`




[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by mateiz <gi...@git.apache.org>.

Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53971505
  
    Thanks Marcelo! I've merged this in.



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by vanzin <gi...@git.apache.org>.

Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-51634739
  
    python errors only, unrelated?



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-51550183
  
    QA results for PR 1843:
    - This patch FAILED unit tests.
    - This patch merges cleanly
    - This patch adds the following public classes (experimental):
      class ExecutorClassLoader(conf: SparkConf, classUri: String, parent: ClassLoader,
    
    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18155/consoleFull



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-51641985
  
    QA results for PR 1843:
    - This patch PASSES unit tests.
    - This patch merges cleanly
    - This patch adds the following public classes (experimental):
      class ExecutorClassLoader(conf: SparkConf, classUri: String, parent: ClassLoader,
    
    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18210/consoleFull



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53517960
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19269/consoleFull) for   PR 1843 at commit [`51e71cf`](https://github.com/apache/spark/commit/51e71cf0d7943fe927fbbd7f47a921cd46571afd).
     * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by vanzin <gi...@git.apache.org>.

Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-51546457
  
    BTW: I'd like to add a couple of simple tests for the YarnSparkHadoopUtil class, but #1724 adds the test suite for that class and I'll wait until that PR is merged before adding the tests.



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by vanzin <gi...@git.apache.org>.

Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-51634750
  
    Jenkins, retest this please.



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/1843



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by mateiz <gi...@git.apache.org>.

Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53667741
  
    add to whitelist and test this please



[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by mateiz <gi...@git.apache.org>.

Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53613113
  
    @vanzin unfortunately this no longer merges cleanly, probably due to your YARN change. Mind rebasing it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by mateiz <gi...@git.apache.org>.

Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1843#discussion_r16692981
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
    @@ -68,7 +68,26 @@ class SparkHadoopUtil extends Logging {
        * Return an appropriate (subclass) of Configuration. Creating config can initializes some Hadoop
        * subsystems.
        */
    -  def newConfiguration(): Configuration = new Configuration()
    +  def newConfiguration(conf: SparkConf): Configuration = {
    --- End diff --
    
    This is technically a breaking API change, we can't just do it like this. We have to add the old version.
    
    Also, somewhat worryingly, I don't think SparkHadoopUtil was meant to be a public API, so it's weird that it gets used in our examples. We should probably mark it as `@DeveloperApi` and make sure that the examples don't use it.
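    
    One way to keep the old signature while adding the new one is an overload that delegates, sketched below. This is an assumption about the shape of the fix rather than the code that was merged; `FakeSparkConf`, `FakeConfiguration`, and `HadoopUtilSketch` are hypothetical stand-ins so the example compiles without Spark or Hadoop.
    
    ```scala
    // Hypothetical stand-ins so the sketch compiles without Spark or Hadoop.
    final case class FakeSparkConf(settings: Map[String, String] = Map.empty)
    final case class FakeConfiguration(entries: Map[String, String])
    
    class HadoopUtilSketch {
      private val Prefix = "spark.hadoop."
    
      // Old zero-argument signature, kept so existing callers still compile and link.
      // It delegates to the new overload with an empty set of Spark settings.
      def newConfiguration(): FakeConfiguration = newConfiguration(FakeSparkConf())
    
      // New signature: builds the config from the caller's Spark settings,
      // carrying over only the "spark.hadoop.*" entries with the prefix removed.
      def newConfiguration(conf: FakeSparkConf): FakeConfiguration = {
        val entries = conf.settings.collect {
          case (key, value) if key.startsWith(Prefix) =>
            key.substring(Prefix.length) -> value
        }
        FakeConfiguration(entries)
      }
    }
    ```
    
    Using a separate overload rather than a default argument keeps a real zero-argument method in the compiled class, which is what binary-compatibility checks look for.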


[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53668068
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19371/consoleFull) for PR 1843 at commit [`f179013`](https://github.com/apache/spark/commit/f179013d4dbb94238e0b282511c248d4d8f2d7a9).
     * This patch merges cleanly.


[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by vanzin <gi...@git.apache.org>.

Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-53906219
  
    Jenkins, test this please.


[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by mateiz <gi...@git.apache.org>.

Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1843#discussion_r16751924
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
    @@ -68,7 +68,26 @@ class SparkHadoopUtil extends Logging {
        * Return an appropriate (subclass) of Configuration. Creating config can initializes some Hadoop
        * subsystems.
        */
    -  def newConfiguration(): Configuration = new Configuration()
    +  def newConfiguration(conf: SparkConf): Configuration = {
    --- End diff --
    
    BTW in this case you should mark this class and all its methods as `@DeveloperApi`.


[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by vanzin <gi...@git.apache.org>.

Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1843#issuecomment-52855755
  
    Jenkins, test this please.


[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

Posted by mateiz <gi...@git.apache.org>.

Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1843#discussion_r16693014
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/SparkHdfsLR.scala ---
    @@ -70,7 +70,7 @@ object SparkHdfsLR {
     
         val sparkConf = new SparkConf().setAppName("SparkHdfsLR")
         val inputPath = args(0)
    -    val conf = SparkHadoopUtil.get.newConfiguration()
    +    val conf = SparkHadoopUtil.get.newConfiguration(sparkConf)
    --- End diff --
    
    IMO this should not even be using SparkHadoopUtil, as mentioned above; it should just create a `new Configuration`. It's okay if it doesn't get the `spark.hadoop.*` properties; after all, you're doing this before initializing a SparkContext.

