Posted to issues@spark.apache.org by "Chester (JIRA)" <ji...@apache.org> on 2014/10/12 05:29:33 UTC

[jira] [Created] (SPARK-3913) Spark Yarn Client API change to expose Yarn Resource Capacity and Yarn Application Listener

Chester created SPARK-3913:
------------------------------

             Summary: Spark Yarn Client API change to expose Yarn Resource Capacity and Yarn Application Listener
                 Key: SPARK-3913
                 URL: https://issues.apache.org/jira/browse/SPARK-3913
             Project: Spark
          Issue Type: Improvement
          Components: YARN
            Reporter: Chester


When running Spark in YARN deployment mode, we have two issues:

1) We don't know the YARN maximum capacity (memory and cores) before we specify the number of executors and the memory for the Spark driver and executors. If we ask for too much, the job can exceed the limit and get killed. 
   It would be better to let the application know the YARN resource capacity ahead of time, so the Spark config can be adjusted dynamically (see the first sketch below). 
  
2) Once the job has started, we would like some feedback from the YARN application. Currently, the Spark client blocks the call and only returns when the job has finished, failed, or been killed. 
   If the job runs for a few hours, we have no idea how far it has gone: the progress, resource usage, tracking URL, etc. (see the second sketch below). 
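
For the first issue, here is a minimal sketch of how a caller could query the maximum container capacity from the ResourceManager, using the Hadoop YarnClient API directly (Hadoop 2.x signatures assumed; this is not the existing Spark client API, just an illustration of what the change would expose):

    import org.apache.hadoop.yarn.client.api.YarnClient
    import org.apache.hadoop.yarn.conf.YarnConfiguration

    object YarnCapacityCheck {
      def main(args: Array[String]): Unit = {
        val yarnClient = YarnClient.createYarnClient()
        yarnClient.init(new YarnConfiguration())
        yarnClient.start()

        // Asking the RM for a new application also returns the maximum
        // resource capability of a single container in the cluster.
        val newAppResponse = yarnClient.createApplication().getNewApplicationResponse()
        val maxCap = newAppResponse.getMaximumResourceCapability()

        println(s"Max container memory (MB): ${maxCap.getMemory}")
        println(s"Max container vcores:      ${maxCap.getVirtualCores}")

        yarnClient.stop()
      }
    }

With these numbers in hand, the caller can clamp spark.executor.memory and spark.executor.cores before submission, instead of being killed by YARN afterwards.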
   
I will create a pull request to address these two problems.  
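
For the second issue, a rough sketch of a non-blocking monitor that polls the ResourceManager and reports back through a callback. The YarnAppListener trait and its method names here are hypothetical, only to illustrate the shape of the API change; the YarnClient calls are the standard Hadoop 2.x ones:

    import org.apache.hadoop.yarn.api.records.{ApplicationId, ApplicationReport, YarnApplicationState}
    import org.apache.hadoop.yarn.client.api.YarnClient

    // Hypothetical listener interface; the names are illustrative only.
    trait YarnAppListener {
      def onProgress(report: ApplicationReport): Unit
      def onFinished(report: ApplicationReport): Unit
    }

    object AppMonitor {
      // Poll the RM for the application report instead of blocking until completion.
      def monitor(yarnClient: YarnClient, appId: ApplicationId,
                  listener: YarnAppListener): Unit = {
        var done = false
        while (!done) {
          val report = yarnClient.getApplicationReport(appId)
          report.getYarnApplicationState match {
            case YarnApplicationState.FINISHED |
                 YarnApplicationState.FAILED |
                 YarnApplicationState.KILLED =>
              listener.onFinished(report)
              done = true
            case _ =>
              // Progress (0.0 to 1.0) and the tracking URL are available mid-flight.
              listener.onProgress(report)
              Thread.sleep(1000)
          }
        }
      }
    }

The caller would supply a listener that logs report.getProgress and report.getTrackingUrl, so a job that runs for hours is no longer a black box.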
 
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org