Posted to reviews@spark.apache.org by ajithme <gi...@git.apache.org> on 2018/08/16 13:06:18 UTC

[GitHub] spark pull request #22120: [SPARK-25131]Event logs missing applicationAttemp...

GitHub user ajithme opened a pull request:

    https://github.com/apache/spark/pull/22120

    [SPARK-25131]Event logs missing applicationAttemptId for SparkListenerApplicationStart

    When master=yarn and deploy-mode=client, event logs do not contain an applicationAttemptId for SparkListenerApplicationStart. The cause is in org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend#start, which calls bindToYarn(client.submitApplication(), None) and therefore sets appAttemptId to None. However, the appAttemptId is available once waitForApplication() returns, so it can be set at that point.

    I have tested and verified this manually.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ajithme/spark missingAttemptId

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22120.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22120
    
----
commit cc7625c06609c1092ca26d6d6b4f522b9f844710
Author: Ajith <aj...@...>
Date:   2018-08-16T12:54:06Z

    Set application attempt id in yarn client mode

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22120: [SPARK-25131]Event logs missing applicationAttemptId for...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/22120
  
    Is this really necessary? It will always be "1", since client-mode apps are not re-tried (the YARN AM might be, but the driver is not). That makes it not really useful.


---



[GitHub] spark issue #22120: [SPARK-25131]Event logs missing applicationAttemptId for...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/22120
  
    It is not that the change is trivial, it's that I don't see the point of it. There is no concept of "attempts" in Spark client mode. So why fill in this information at all?
    
    It's an "option" in the event for a reason: not all modes support re-attempts. If some listener is confused by that, it's a problem with that listener.
    
    Sorry but I don't think this should go in.
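
The point about the attempt id being optional can be illustrated with a minimal sketch of listener-side handling. This is not Spark code: `AppStart` is a hypothetical, simplified stand-in for the application-start event, and `AttemptLabel.describe` is an illustrative helper. The only assumption carried over from the thread is that the attempt id is an `Option` that may be empty in client mode.

```scala
// Hypothetical, simplified stand-in for the application-start event.
// It is NOT the Spark class; it only mirrors the idea that the attempt id is optional.
final case class AppStart(
    appName: String,
    appId: Option[String],
    appAttemptId: Option[String])

object AttemptLabel {
  // Build a display label without assuming that an attempt id is present.
  def describe(event: AppStart): String = {
    val id = event.appId.getOrElse("<unknown-app>")
    event.appAttemptId match {
      case Some(attempt) => s"$id (attempt $attempt)" // e.g. a mode that reports attempts
      case None          => s"$id (no attempt id)"    // e.g. a mode without re-attempts
    }
  }
}
```

A listener written this way treats a missing attempt id as a normal case rather than an error, so it behaves the same regardless of whether the deploy mode reports attempts.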


---



[GitHub] spark issue #22120: [SPARK-25131]Event logs missing applicationAttemptId for...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22120
  
    Can one of the admins verify this patch?


---





[GitHub] spark pull request #22120: [SPARK-25131]Event logs missing applicationAttemp...

Posted by sujith71955 <gi...@git.apache.org>.
Github user sujith71955 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22120#discussion_r210599272
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala ---
    @@ -62,6 +62,10 @@ private[spark] class YarnClientSchedulerBackend(
         super.start()
         waitForApplication()
     
    +    // set the attemptId as its available now
    +    this.attemptId = Option(client.getApplicationReport(this.appId.get)
    +      .getCurrentApplicationAttemptId())
    +
    --- End diff --
    
    Extra space


---





[GitHub] spark pull request #22120: [SPARK-25131]Event logs missing applicationAttemp...

Posted by ajithme <gi...@git.apache.org>.
Github user ajithme closed the pull request at:

    https://github.com/apache/spark/pull/22120


---



[GitHub] spark issue #22120: [SPARK-25131]Event logs missing applicationAttemptId for...

Posted by ajithme <gi...@git.apache.org>.
Github user ajithme commented on the issue:

    https://github.com/apache/spark/pull/22120
  
    @vanzin I agree it's a trivial change. I just wanted the output to be consistent with yarn cluster mode. This affects not only event logs but also custom SparkListener implementations: in onApplicationStart, it may be confusing that appAttemptId is empty in client mode but an actual number in cluster mode.


---
