Posted to reviews@spark.apache.org by xiaowen147 <gi...@git.apache.org> on 2015/09/30 04:48:37 UTC

[GitHub] spark pull request: spark on yarn support priority option

GitHub user xiaowen147 opened a pull request:

    https://github.com/apache/spark/pull/8943

    spark on yarn support priority option

    https://issues.apache.org/jira/browse/SPARK-10879

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xiaowen147/spark yarn_priority

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8943.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8943
    
----
commit 556fe01b3d15fca76b9f86142c3f1fef9b130f24
Author: xiaowen147 <xi...@gmail.com>
Date:   2015-09-30T02:40:31Z

    spark on yarn support priority option

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/8943




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the pull request:

    https://github.com/apache/spark/pull/8943#issuecomment-144298144
  
    From my understanding, priority support on YARN is still in progress, at least for the capacity scheduler, right? Also, do we need to specify a priority for each `ContainerRequest`?




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/8943#issuecomment-144490406
  
    The YARN API parts of this are done, and the setPriority method on ApplicationSubmissionContext has been there forever (at least since YARN 2.2, and I think before). It just never did anything.
    So I don't see an issue with adding this now. Priorities are at the application level, not the container level.
    
    I personally would like to see --priority either removed in favor of the config, or at least renamed to something like queuePriority. Right now it's only used by YARN, and you can just as easily set it with --conf spark.yarn.priority=10, so I would rather not see the submit option.




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8943#issuecomment-144264696
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the pull request:

    https://github.com/apache/spark/pull/8943#issuecomment-144580262
  
    Agreed on making it an undocumented configuration.
    
    BTW, my question is whether we need to take care of the priority in `ContainerRequest`. It is currently set by `RM_REQUEST_PRIORITY`; if we specify both the application priority and the container priority, which one takes precedence?




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8943#discussion_r40827232
  
    --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -165,6 +165,7 @@ private[spark] class Client(
         val appContext = newApp.getApplicationSubmissionContext
         appContext.setApplicationName(args.appName)
         appContext.setQueue(args.amQueue)
    +    appContext.setPriority(Priority.newInstance(args.priority))
    --- End diff --
    
    I believe they are going to have a default priority per queue setting, so if we hardcode the default to 0 it might not match the queue's default. I would rather leave this unset if not explicitly specified.
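
The "leave it unset unless explicitly specified" idea above could look roughly like the sketch below. This is not the PR's code: `Priority` and `SubmissionContext` are hypothetical stand-ins for the YARN record types so that the snippet runs without Hadoop on the classpath, and `applyPriority` is an illustrative helper name.

```scala
// Hypothetical stand-ins for the YARN types; NOT the real hadoop-yarn classes.
final case class Priority(value: Int)

final class SubmissionContext {
  // None means "never set", so the cluster or queue default can still apply.
  private var priority: Option[Priority] = None

  def setPriority(p: Priority): Unit = {
    priority = Some(p)
  }

  def getPriority: Option[Priority] = priority
}

object PrioritySketch {
  // Call setPriority only when the user supplied a value;
  // no hardcoded default such as 0 is ever written to the context.
  def applyPriority(ctx: SubmissionContext, configured: Option[Int]): Unit = {
    configured.foreach(p => ctx.setPriority(Priority(p)))
  }
}
```

Modelling the setting as an `Option[Int]` keeps "not configured" distinct from "configured as 0", which is the distinction this comment asks for.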




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/8943#issuecomment-144391128
  
    Echoing Saisai's comment, if this work is still not completely finished on YARN, it might be better to not expose a command line option in Spark just yet; just make it a non-documented config option.




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8943#discussion_r40790889
  
    --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala ---
    @@ -86,6 +86,7 @@ private[spark] class YarnClientSchedulerBackend(
             ("--executor-cores", "SPARK_WORKER_CORES", "spark.executor.cores"),
             ("--executor-cores", "SPARK_EXECUTOR_CORES", "spark.executor.cores"),
             ("--queue", "SPARK_YARN_QUEUE", "spark.yarn.queue"),
    +        ("--priority", "SPARK_YARN_PRIORITY", "spark.yarn.priority"),
    --- End diff --
    
    I'd really prefer not to add this. There's no need; just set the value in `SparkConf` and read it in `Client.scala`, which is what this line you're adding in SparkSubmit already does:
    
        OptionAssigner(args.priority, YARN, CLIENT, sysProp = "spark.yarn.priority"),
    
    Just make it work for both deploy modes.
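
As a rough sketch of "set the value in the conf and read it in one place", the snippet below reads `spark.yarn.priority` the same way regardless of deploy mode. It is an illustration under stated assumptions, not the PR's implementation: a plain `Map[String, String]` stands in for `SparkConf`, and `PriorityConf` and `read` are hypothetical names.

```scala
import scala.util.Try

object PriorityConf {
  // Config key taken from the review thread.
  val Key = "spark.yarn.priority"

  // Returns Some(n) when the key is present and parses as an Int.
  // Assumption for this sketch: a missing or malformed value yields None,
  // so the caller leaves the priority unset instead of failing.
  def read(conf: Map[String, String]): Option[Int] =
    conf.get(Key).flatMap(v => Try(v.trim.toInt).toOption)
}
```

Because the lookup depends only on the configuration, client and cluster mode would share the same code path and no extra command-line mapping is needed. Real code might prefer to reject a malformed value rather than ignore it.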




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/8943#issuecomment-145873474
  
    Container priority is different. Container priority is honored right now, and it's for the application to say one container must be allocated before another. An example is MapReduce, where maps may be higher priority because the reducer can't run until all the maps are finished. At this point I don't see this as useful to Spark, since all the executors have the same importance.
    
    @xiaowen147   can you update based on the feedback?




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/8943#issuecomment-148527839
  
    @xiaowen147 did you mean that you won't be updating this PR? Could you close it in that case?




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by xiaowen147 <gi...@git.apache.org>.
Github user xiaowen147 commented on the pull request:

    https://github.com/apache/spark/pull/8943#issuecomment-146474166
  
    @tgravescs @jerryshao @vanzin Thank you for your suggestions. Because YARN does not support this feature yet, it won't work even if I update the PR, so I am considering closing this issue for now.
    You may refer to https://issues.apache.org/jira/browse/YARN-1963.




[GitHub] spark pull request: [SPARK-10879] spark on yarn support priority o...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/8943#issuecomment-146542075
  
    @xiaowen147 can you clarify, do you just mean it's not fully implemented yet?

