Posted to reviews@spark.apache.org by Cangyl <gi...@git.apache.org> on 2018/10/29 12:11:46 UTC

[GitHub] spark pull request #22876: [SPARK-25869] [YARN] the original diagnostics is ...

GitHub user Cangyl opened a pull request:

    https://github.com/apache/spark/pull/22876

    [SPARK-25869] [YARN] the original diagnostics is missing when job failed ma…

    
    When submitting Spark-on-YARN jobs, the diagnostics message may be missing when the job has failed maxAppAttempts times.
    For more details, please see
    https://issues.apache.org/jira/browse/SPARK-25869


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Cangyl/spark SPARK-25869

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22876.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22876
    
----
commit afce7a0986ae04610eb2a9cfb9d8c50fe8de69c1
Author: 仓业亮 10202330 <ca...@...>
Date:   2018-10-29T11:59:49Z

    Spark on YARN: the original diagnostics is missing when job failed maxAppAttempts times

----


---


[GitHub] spark issue #22876: [SPARK-25869] [YARN] the original diagnostics is missing...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22876
  
    Can one of the admins verify this patch?


---


[GitHub] spark issue #22876: [SPARK-25869] [YARN] the original diagnostics is missing...

Posted by ouyangxiaochen <gi...@git.apache.org>.
Github user ouyangxiaochen commented on the issue:

    https://github.com/apache/spark/pull/22876
  
    @srowen  CC 


---


[GitHub] spark pull request #22876: [SPARK-25869] [YARN] the original diagnostics is ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22876#discussion_r230053125
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
    @@ -293,6 +293,9 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments) extends
             }
     
             if (!unregistered) {
    +          logInfo("Waiting for " + sparkConf.get("spark.yarn.report.interval", "1000").toInt +"ms to unregister am," +
    --- End diff --
    
    Use interpolation. No need to call `.toInt`. am -> AM or Application master. msg -> message; these should be complete sentences. You get the config twice here. Is this the default used elsewhere?
    
    I don't know this code well and don't think a Thread.sleep is a great way to coordinate.
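    
    A minimal sketch of that interpolation suggestion (illustrative only, not the patch author's code; `sparkConf` and `logInfo` are already in scope inside ApplicationMaster, and the variable name is made up):
    
        // Read the config once and build the message with string interpolation;
        // .toInt is not needed just to log the value.
        val reportInterval = sparkConf.get("spark.yarn.report.interval", "1000")
        logInfo(s"Waiting $reportInterval ms before unregistering the ApplicationMaster " +
          "so that the original failure diagnostics can be reported.")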


---


[GitHub] spark issue #22876: [SPARK-25869] [YARN] the original diagnostics is missing...

Posted by Cangyl <gi...@git.apache.org>.
Github user Cangyl commented on the issue:

    https://github.com/apache/spark/pull/22876
  
    What changes were proposed in this pull request?
    When setting the YARN configs as below:
    yarn.scheduler.minimum-allocation-mb 50mb
    yarn.scheduler.increment-allocation-mb 50mb
    and submitting a Spark-on-YARN job using the command below:
    
    In ApplicationMaster.scala, when a Spark application fails in its last attempt, it tries to unregister itself from the YARN ResourceManager. Normally, within ${spark.yarn.report.interval}, the unregister event is sent to the ResourceManager before the failed-container event, and it overwrites the container's failure diagnostics with "Shutdown hook called before final status was reported." This is the problem code in ApplicationMaster.scala:
        if (!unregistered) {
          // we only want to unregister if we don't want the RM to retry
          if (finalStatus == FinalApplicationStatus.SUCCEEDED || isLastAttempt) {
            unregister(finalStatus, finalMsg)
            cleanupStagingDir()
          }
        }
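    
    A rough sketch of the change under discussion, i.e. the block above with the proposed wait added (reconstructed from the quoted diff, not the actual patch; the sleep-based coordination is exactly what the review above questions, and the config name and default are taken from the quoted diff):
    
        if (!unregistered) {
          // we only want to unregister if we don't want the RM to retry
          if (finalStatus == FinalApplicationStatus.SUCCEEDED || isLastAttempt) {
            // Hypothetical addition: wait one report interval so the RM can record the
            // failed container's diagnostics before the AM's final message replaces them.
            val reportIntervalMs = sparkConf.get("spark.yarn.report.interval", "1000").toLong
            Thread.sleep(reportIntervalMs)
            unregister(finalStatus, finalMsg)
            cleanupStagingDir()
          }
        }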
    
    



---

