You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by tgravescs <gi...@git.apache.org> on 2014/04/09 02:29:00 UTC

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

GitHub user tgravescs opened a pull request:

    https://github.com/apache/spark/pull/362

    SPARK-1408 Modify Spark on Yarn to point to the history server when app ...

    ...finishes
    
    Note this is dependent on https://github.com/apache/spark/pull/204 to have a working history server, but there are no code dependencies. 
    
    This also fixes SPARK-1288 yarn stable finishApplicationMaster incomplete. Since I was in there I made the diagnostic message be passed properly.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tgravescs/spark SPARK-1408

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/362.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #362
    
----
commit f5d53734225448abe3c34e3f109face4c0fea0fd
Author: Thomas Graves <tg...@apache.org>
Date:   2014-04-09T00:26:02Z

    SPARK-1408 Modify Spark on Yarn to point to the history server when app finishes

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40752469
  
     Merged build triggered. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40748111
  
    One small typo, but other than that this LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/362#discussion_r11541495
  
    --- Diff: yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
    @@ -347,8 +347,8 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,
     
           logInfo("finishApplicationMaster with " + status)
           if (registered) {
    -        // Set tracking URL to empty since we don't have a history server.
    -        amClient.unregisterApplicationMaster(status, "" /* appMessage */ , "" /* appTrackingUrl */)
    +        val trackingUrl = sparkConf.get("spark.historyserver.address", "")
    --- End diff --
    
    Ah you're right. Somehow I thought they were in the same file.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/362#discussion_r11536569
  
    --- Diff: docs/configuration.md ---
    @@ -627,6 +627,13 @@ Apart from these, the following properties are also available, and may be useful
         Number of cores to allocate for each task.
       </td>
     </tr>
    +<tr>
    +  <td>spark.historyserver.address</td>
    +  <td>localhost:18080</td>
    +  <td>
    +    The URL of the Spark history server. The history server is optional. 
    +  </td>
    +</tr>
    --- End diff --
    
    Maybe I should make this a yarn specific config for now.  I put it here thinking perhaps down the road it might be useful for spark core.  thoughts? I'll fix the description on it. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-39919675
  
    Merged build finished. All automated tests passed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/362


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/362#discussion_r11536291
  
    --- Diff: yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
    @@ -347,8 +347,8 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,
     
           logInfo("finishApplicationMaster with " + status)
           if (registered) {
    -        // Set tracking URL to empty since we don't have a history server.
    -        amClient.unregisterApplicationMaster(status, "" /* appMessage */ , "" /* appTrackingUrl */)
    +        val trackingUrl = sparkConf.get("spark.historyserver.address", "")
    --- End diff --
    
    What do you mean the line is a duplicate? Its only used once in this file. the spark.historyserver.address isn't used anywhere else in spark.  There are 2 different ApplicationMaster files. One for yarn-alpha and one for yarn-stable but currently they have no shared base class.    We could move it up but since its only being used here and this function is only called once I don't see the need.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40761471
  
    Merged build finished. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by lumiataCI <gi...@git.apache.org>.
Github user lumiataCI commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-166291832
  
    Test Message


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/362#discussion_r11747228
  
    --- Diff: docs/running-on-yarn.md ---
    @@ -42,6 +42,7 @@ System Properties:
     * `spark.yarn.preserve.staging.files`, set to true to preserve the staged files(spark jar, app jar, distributed cache files) at the end of the job rather then delete them.
     * `spark.yarn.scheduler.heartbeat.interval-ms`, the interval in ms in which the Spark application master heartbeats into the YARN ResourceManager. Default is 5 seconds. 
     * `spark.yarn.max.executor.failures`, the maximum number of executor failures before failing the application. Default is the number of executors requested times 2 with minimum of 3.
    +* `spark.yarn.historyServer.address`, the address of the Spark history server (i.e. host.com:18080). The address should not contain a scheme (http://). Defaults to not being set since the history server is an optional service. This address is given to the Yarn ResourceManager when the Spark application finishes to link the application from the ResourceManaer UI to the Spark history server UI. 
    --- End diff --
    
    ResourceManager* (last sentence)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-39917701
  
    Merged build started. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40747761
  
    @andrewor14  any additional comments?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40752477
  
    Merged build started. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/362#discussion_r11497629
  
    --- Diff: docs/configuration.md ---
    @@ -627,6 +627,13 @@ Apart from these, the following properties are also available, and may be useful
         Number of cores to allocate for each task.
       </td>
     </tr>
    +<tr>
    +  <td>spark.historyserver.address</td>
    +  <td>localhost:18080</td>
    +  <td>
    +    The URL of the Spark history server. The history server is optional. 
    +  </td>
    +</tr>
    --- End diff --
    
    Also, from this description it's not clear to me when I should set this. If I'm trying to start a history server independently, I might get confused about this parameter and try to use it to change the port. On the other hand, if I'm using YARN and I want to point my AM to the history server, I won't know what this parameter actually does until I grep for it in the code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40386037
  
    Merged build started. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-39917694
  
     Merged build triggered. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-39919676
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13913/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/362#discussion_r11497447
  
    --- Diff: docs/configuration.md ---
    @@ -627,6 +627,13 @@ Apart from these, the following properties are also available, and may be useful
         Number of cores to allocate for each task.
       </td>
     </tr>
    +<tr>
    +  <td>spark.historyserver.address</td>
    +  <td>localhost:18080</td>
    +  <td>
    +    The URL of the Spark history server. The history server is optional. 
    +  </td>
    +</tr>
    --- End diff --
    
    It seems from your other changes that the default is "" instead of localhost:18080. Also, nit: `spark.historyServer.address`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40765918
  
    thanks, I committed this to master and branch-1.0


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40390542
  
    Merged build finished. All automated tests passed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/362#discussion_r11497757
  
    --- Diff: yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
    @@ -366,8 +366,7 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,
             finishReq.setAppAttemptId(appAttemptId)
             finishReq.setFinishApplicationStatus(status)
             finishReq.setDiagnostics(diagnostics)
    -        // Set tracking url to empty since we don't have a history server.
    -        finishReq.setTrackingUrl("")
    +        finishReq.setTrackingUrl(sparkConf.get("spark.historyserver.address", ""))
    --- End diff --
    
    Maybe move the `trackingUrl` as a private val somewhere at the top for reuse? Right now this line is duplicated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/362#discussion_r11497766
  
    --- Diff: yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
    @@ -347,8 +347,8 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,
     
           logInfo("finishApplicationMaster with " + status)
           if (registered) {
    -        // Set tracking URL to empty since we don't have a history server.
    -        amClient.unregisterApplicationMaster(status, "" /* appMessage */ , "" /* appTrackingUrl */)
    +        val trackingUrl = sparkConf.get("spark.historyserver.address", "")
    --- End diff --
    
    Maybe move the `trackingUrl` as a private val somewhere at the top for reuse? Right now this line is duplicated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40390543
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14113/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40761472
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14210/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-1408 Modify Spark on Yarn to point to th...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/362#issuecomment-40386021
  
     Merged build triggered. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---