Posted to reviews@spark.apache.org by lianhuiwang <gi...@git.apache.org> on 2014/07/22 15:02:30 UTC

[GitHub] spark pull request: use config spark.scheduler.priority for specif...

GitHub user lianhuiwang opened a pull request:

    https://github.com/apache/spark/pull/1528

    use config spark.scheduler.priority for specifying TaskSet's priority on DAGScheduler

    https://issues.apache.org/jira/browse/SPARK-2618

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lianhuiwang/spark config-scheduler-priority

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1528.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1528
    
----
commit f2b597022b4fc4023c238e5b5a9824946f84f84e
Author: lianhuiwang <li...@gmail.com>
Date:   2014-05-23T14:02:57Z

    bugfix worker DriverStateChanged state should match DriverState.FAILED

commit 480ce949a83c0d854078b38f5665f3369cf759eb
Author: lianhuiwang <li...@gmail.com>
Date:   2014-05-24T15:24:37Z

    address aarondav comments

commit 8bbfe76dd8c8af815fa8404eb9a7922e58f938f7
Author: lianhuiwang <li...@gmail.com>
Date:   2014-06-10T16:01:36Z

    Merge remote-tracking branch 'upstream/master'

commit eacf9339a8c062cf3f28343a4f8157d214d25b00
Author: lianhuiwang <li...@gmail.com>
Date:   2014-07-13T14:13:03Z

    Merge remote-tracking branch 'upstream/master'

commit 44a3f50c689849228c42d072bdd355781dbacec6
Author: unknown <ad...@taguswang-pc1.tencent.com>
Date:   2014-07-16T14:22:18Z

    Merge remote-tracking branch 'upstream/master'

commit 20f81fa7e8af0108656100ddee8361865d644c1c
Author: lianhuiwang <li...@gmail.com>
Date:   2014-07-17T13:07:48Z

    Merge remote-tracking branch 'upstream/master'

commit 66371a16ebb4f1736442d7aae54e7eedf67294c0
Author: lianhuiwang <li...@gmail.com>
Date:   2014-07-22T12:10:10Z

    Merge remote-tracking branch 'spark/master'

commit 1e1e30eebe6eb97563dd50dfded25bea24ecda95
Author: lianhuiwang <li...@gmail.com>
Date:   2014-07-22T12:57:15Z

    use config spark.scheduler.priority for specifying TaskSet's priority on DAGScheduler

----


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by lianhuiwang <gi...@git.apache.org>.
Github user lianhuiwang commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49846832
  
    @markhamstra thank you. How about the latest code?


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49848091
  
    This looks like a clean implementation, but you still need to open a JIRA issue to explain why you want this; then edit the description of this PR to reference that JIRA. https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-ContributingCode


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49744970
  
    also, is this preemptive or non-preemptive?
    
    according to my understanding of the code, it's non-preemptive, so a high-priority TaskSet can easily be delayed when there are many long-running but low-priority TaskSets


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49916553
  
    Yeah, I'm wondering whether the actual problem is that creating and using scheduler pools with different weights is unclear or too difficult, and whether, if we could resolve those issues, the need for this PR would disappear.


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49756330
  
    QA tests have started for PR 1528. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16969/consoleFull


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49847118
  
    QA tests have started for PR 1528. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17026/consoleFull


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1528#discussion_r15270158
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/SchedulingAlgorithm.scala ---
    @@ -17,6 +17,8 @@
     
     package org.apache.spark.scheduler
     
    +import scala.math.Ordering.Implicits._
    --- End diff --
    
    Pulling in these implicits can have unintended consequences; that's why in my previous comment I kept the scope of the import as small as possible.


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49863644
  
    I mean, if we just want to prioritize some jobs, why not assign them to a pool with a higher weight?


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by lianhuiwang <gi...@git.apache.org>.
Github user lianhuiwang commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-50125169
  
    the current implementation of scheduling is very ugly, so I cannot find a clean place to add this config to implement job priority. Can anyone help me?


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49834400
  
    QA results for PR 1528:
    - This patch PASSES unit tests.
    - This patch merges cleanly.
    - This patch adds no public classes.
    
    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17013/consoleFull


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by lianhuiwang <gi...@git.apache.org>.
Github user lianhuiwang commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49833109
  
    @markhamstra thank you. I updated the patch. Do you have more comments?


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1528#discussion_r15270239
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSet.scala ---
    @@ -27,9 +27,18 @@ private[spark] class TaskSet(
         val tasks: Array[Task[_]],
         val stageId: Int,
         val attempt: Int,
    -    val priority: Int,
    +    val jobId: Int,
         val properties: Properties) {
         val id: String = stageId + "." + attempt
    +    val DEFAULT_PRIORITY: Int = 0
    +
    +  def priority:Int = {
    +    if(properties != null){
    +      properties.getProperty("spark.scheduler.priority", "0").toInt
    +    }else{
    +      DEFAULT_PRIORITY
    +    }
    +  }
    --- End diff --
    
    Why the change from `val` to `def`?  I believe `val priority = if ... else ...` will work fine here.


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49855837
  
    QA results for PR 1528:
    - This patch FAILED unit tests.
    - This patch merges cleanly.
    - This patch adds no public classes.
    
    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17026/consoleFull


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49744177
  
    so it's actually another type of scheduling instead of FIFO/FAIR?


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49848309
  
    Sorry, it looks like you already have SPARK-2618, so change the title of this PR to include that.


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-50106263
  
    We shouldn't expose these types of hooks into the scheduler internals. The TaskSet, for instance, is an implementation detail we don't want to be part of a public API, and the priority is an internal concept.
    
    The public API of Spark for scheduling policies is the Fair Scheduler. Many different types of policies can be achieved within Fair Scheduling, including having a high priority pool to which tasks are submitted.
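
For reference, here is a minimal sketch of the fair-scheduler route described above, using the documented `spark.scheduler.mode`, `spark.scheduler.allocation.file`, and `spark.scheduler.pool` settings; the pool name, weight, file path, and local master are illustrative, not taken from this thread:

```scala
// Sketch only: the "urgent" pool name, weight, and file path are illustrative.
import org.apache.spark.{SparkConf, SparkContext}

object HighWeightPoolSketch {
  // Example contents for a hypothetical conf/fairscheduler.xml: a pool whose
  // weight gives it roughly 10x the share of a default-weight (1) pool.
  val fairSchedulerXml: String =
    """<?xml version="1.0"?>
      |<allocations>
      |  <pool name="urgent">
      |    <schedulingMode>FAIR</schedulingMode>
      |    <weight>10</weight>
      |    <minShare>1</minShare>
      |  </pool>
      |</allocations>
      |""".stripMargin

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("high-weight-pool-sketch")
      .setMaster("local[2]")                                          // local master just for the sketch
      .set("spark.scheduler.mode", "FAIR")                            // enable fair scheduling
      .set("spark.scheduler.allocation.file", "conf/fairscheduler.xml")
    val sc = new SparkContext(conf)

    // Jobs submitted from this thread after this call go to the "urgent" pool.
    sc.setLocalProperty("spark.scheduler.pool", "urgent")
    sc.parallelize(1 to 100).count()

    sc.stop()
  }
}
```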


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by lianhuiwang <gi...@git.apache.org>.
Github user lianhuiwang commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49849632
  
    @markhamstra @pwendell I have updated SPARK-2618, please take a look. Thanks.


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by lianhuiwang <gi...@git.apache.org>.
Github user lianhuiwang commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49862687
  
    I don't think priority is useful for the FAIR scheduler. On YARN, scheduler priority works with FIFO and not with FAIR, so I think a Spark application's scheduler modes should behave the same way as YARN's. We can see YARN's FAIR policy here:
    https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49833080
  
    QA tests have started for PR 1528. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17014/consoleFull


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1528#discussion_r15248491
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/SchedulingAlgorithm.scala ---
    @@ -32,11 +32,21 @@ private[spark] class FIFOSchedulingAlgorithm extends SchedulingAlgorithm {
         val priority2 = s2.priority
         var res = math.signum(priority1 - priority2)
         if (res == 0) {
    -      val stageId1 = s1.stageId
    -      val stageId2 = s2.stageId
    -      res = math.signum(stageId1 - stageId2)
    +      val jobId1 = s1.jobId
    +      val jobId2 = s2.jobId
    +      res = math.signum(jobId1 - jobId2)
    +      if (res == 0) {
    +        val stageId1 = s1.stageId
    +        val stageId2 = s2.stageId
    +        res = math.signum(stageId1 - stageId2)
    +      }
    +      if (res < 0) {
    +        true
    +      } else {
    +        false
    +      }
    --- End diff --
    
    This `if..else` doesn't actually do anything.  This whole comparator is needlessly complex.  If the intent is lexicographical ordering with higher priority ahead of lower priority, lower jobId ahead of higher jobId, and lower stageId ahead of higher stageId, then this should be sufficient:
    ```scala
    private[spark] class FIFOSchedulingAlgorithm extends SchedulingAlgorithm {
      override def comparator(s1: Schedulable, s2: Schedulable): Boolean = {
        import scala.math.Ordering.Implicits._
        (s1.priority, s2.jobId, s2.stageId) > (s2.priority, s1.jobId, s1.stageId)
      }
    }
    ```
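
A quick standalone check of the tuple-based comparator suggested above, using a hypothetical `Item` case class in place of `Schedulable` (illustrative values only):

```scala
// Hypothetical Item stands in for Schedulable; values are illustrative.
object ComparatorSketch {
  case class Item(priority: Int, jobId: Int, stageId: Int)

  // Same tuple comparison as suggested above: higher priority first, then
  // lower jobId, then lower stageId. The import is kept method-local.
  def before(s1: Item, s2: Item): Boolean = {
    import scala.math.Ordering.Implicits._
    (s1.priority, s2.jobId, s2.stageId) > (s2.priority, s1.jobId, s1.stageId)
  }

  def main(args: Array[String]): Unit = {
    println(before(Item(priority = 5, jobId = 3, stageId = 7),
                   Item(priority = 1, jobId = 1, stageId = 1)))  // true: higher priority wins
    println(before(Item(priority = 1, jobId = 2, stageId = 0),
                   Item(priority = 1, jobId = 3, stageId = 0)))  // true: same priority, lower jobId wins
    println(before(Item(priority = 1, jobId = 2, stageId = 4),
                   Item(priority = 1, jobId = 2, stageId = 3)))  // false: higher stageId loses
  }
}
```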


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49830053
  
    QA tests have started for PR 1528. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17013/consoleFull


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49838099
  
    QA results for PR 1528:
    - This patch PASSES unit tests.
    - This patch merges cleanly.
    - This patch adds no public classes.
    
    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17014/consoleFull


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1528#discussion_r15246213
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -778,8 +778,10 @@ class DAGScheduler(
           logInfo("Submitting " + tasks.size + " missing tasks from " + stage + " (" + stage.rdd + ")")
           myPending ++= tasks
           logDebug("New pending tasks: " + myPending)
    +      val priority:String = properties.getProperty("spark.scheduler.priority", "0")
           taskScheduler.submitTasks(
    -        new TaskSet(tasks.toArray, stage.id, stage.newAttemptId(), stage.jobId, properties))
    +        new TaskSet(tasks.toArray, stage.id, stage.newAttemptId(), stage.jobId, priority.toInt,
    +          properties))
    --- End diff --
    
    `properties` is already being passed to the `TaskSet` ctor, so I'd prefer that extraction of `priority` happen there or elsewhere instead of doing `properties.getProperty` here and adding another parameter to the `TaskSet` ctor. 


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by lianhuiwang <gi...@git.apache.org>.
Github user lianhuiwang commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49864527
  
    Maybe I misunderstand you. With FAIR, Schedulable.weight can replace priority. Do you mean that with FAIR we could expose a weight config to the user, for example spark.scheduler.weight? If that's right, I think we can achieve it.


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-54101464
  
    Hey @lianhuiwang, I'd prefer to close this issue and take the discussion about scheduling to the user list if you are not sure how to configure the scheduler to do what you want. Exposing internals like this to the user is not a great idea since these APIs will likely change in the future.


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49738262
  
    QA tests have started for PR 1528. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16964/consoleFull


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1528#discussion_r15272274
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSet.scala ---
    @@ -27,9 +27,18 @@ private[spark] class TaskSet(
         val tasks: Array[Task[_]],
         val stageId: Int,
         val attempt: Int,
    -    val priority: Int,
    +    val jobId: Int,
         val properties: Properties) {
         val id: String = stageId + "." + attempt
    +    val DEFAULT_PRIORITY: Int = 0
    +
    +  val priority:Int = {
    +    if(properties != null){
    +      properties.getProperty("spark.scheduler.priority", "0").toInt
    +    }else{
    +      DEFAULT_PRIORITY
    +    }
    +  }
    --- End diff --
    
    Is the style checker ok with `val priority = if (...) {...` instead of `val priority = { if (...) {...`?  If it is, I'd rather do without the extra `{}`.  You can also drop the `: Int` from `val DEFAULT_PRIORITY` and `val priority` -- the types are obvious without the annotations.  Also, I'm not sure that DEFAULT_PRIORITY really gains you anything -- I'd be fine with just `if (...) {...} else 0`.  And make sure you follow the style guide for spacing with parens and braces.
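
A small standalone sketch of the shape suggested above (hypothetical `PrioritySketch` object; whether Spark's style checker accepts the brace-less initializer is left open, as the reviewer notes):

```scala
import java.util.Properties

object PrioritySketch {
  // Stand-in for the Properties passed to the TaskSet constructor.
  val properties: Properties = {
    val p = new Properties()
    p.setProperty("spark.scheduler.priority", "5")
    p
  }

  // The suggested form: a plain `val` initialized directly from the `if`
  // expression, an inline default of 0 instead of a DEFAULT_PRIORITY constant,
  // and Spark-style spacing around parens and braces.
  val priority =
    if (properties != null) {
      properties.getProperty("spark.scheduler.priority", "0").toInt
    } else {
      0
    }

  def main(args: Array[String]): Unit = {
    println(priority)  // 5
  }
}
```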


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49735914
  
    QA tests have started for PR 1528. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16963/consoleFull


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/1528


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by lianhuiwang <gi...@git.apache.org>.
Github user lianhuiwang commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49756393
  
    It adds a user-defined priority to FIFO. If the user does not configure a priority, it works as before. It is non-preemptive: when there are free executors, tasks from a higher-priority TaskSet are submitted before tasks from a lower-priority TaskSet.
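
A usage sketch of what the proposal would look like from the application side (hypothetical object name; `spark.scheduler.priority` only exists with this patch applied, and the local master and values are illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PrioritySchedulingSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("priority-sketch").setMaster("local[2]"))

    // Jobs submitted without the property keep the default "0", so existing
    // applications behave exactly as before the patch.
    sc.parallelize(1 to 100).count()

    // Tag jobs submitted from this thread with an explicit priority; the value
    // travels with the job's local properties down to its TaskSets. Which
    // numeric direction is scheduled first depends on the comparator in the
    // patch under review, and running tasks are never preempted.
    sc.setLocalProperty("spark.scheduler.priority", "10")
    sc.parallelize(1 to 100).count()

    sc.stop()
  }
}
```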


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49862183
  
    Hmm, I'm wondering whether we can achieve the same goal with the FAIR scheduler... my own answer is yes. @markhamstra, your thoughts?


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by lianhuiwang <gi...@git.apache.org>.
Github user lianhuiwang commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49830100
  
    @markhamstra @CodingCat thank you for the comments. I updated the patch, please review again.


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1528#issuecomment-49735992
  
    QA results for PR 1528:
    - This patch FAILED unit tests.
    - This patch merges cleanly.
    - This patch adds no public classes.
    
    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16963/consoleFull


[GitHub] spark pull request: use config spark.scheduler.priority for specif...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1528#discussion_r15246517
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -778,8 +778,10 @@ class DAGScheduler(
           logInfo("Submitting " + tasks.size + " missing tasks from " + stage + " (" + stage.rdd + ")")
           myPending ++= tasks
           logDebug("New pending tasks: " + myPending)
    +      val priority:String = properties.getProperty("spark.scheduler.priority", "0")
           taskScheduler.submitTasks(
    -        new TaskSet(tasks.toArray, stage.id, stage.newAttemptId(), stage.jobId, properties))
    +        new TaskSet(tasks.toArray, stage.id, stage.newAttemptId(), stage.jobId, priority.toInt,
    +          properties))
    --- End diff --
    
    Agree; the DAGScheduler already knows too much about task-level things.

