Posted to reviews@spark.apache.org by zsxwing <gi...@git.apache.org> on 2018/02/03 00:29:53 UTC

[GitHub] spark pull request #20493: [SPARK-23326][WEBUI]schedulerDelay should return ...

GitHub user zsxwing opened a pull request:

    https://github.com/apache/spark/pull/20493

    [SPARK-23326][WEBUI]schedulerDelay should return 0 when the task is running

    ## What changes were proposed in this pull request?
    
    When a task is still running, metrics such as `executorRunTime` are not yet available, so `schedulerDelay` ends up almost the same as `duration`, which is confusing.
    
    This PR makes `schedulerDelay` return 0 while the task is running, which is the same behavior as 2.2.
    
    ## How was this patch tested?
    
    `AppStatusUtilsSuite.schedulerDelay`
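
A minimal sketch of the behavior described above, not the actual patch. `SketchTask` and `SketchMetrics` are hypothetical stand-ins for Spark's `TaskData` and `TaskMetrics`, and the delay formula is an assumption (duration minus the measured phases, ignoring result-fetch time):

```scala
// Hypothetical, simplified stand-ins for TaskData / TaskMetrics (not the real Spark classes).
final case class SketchMetrics(
    executorDeserializeTime: Long,
    executorRunTime: Long,
    resultSerializationTime: Long)

final case class SketchTask(
    status: String,
    duration: Option[Long],
    metrics: Option[SketchMetrics])

object SchedulerDelaySketch {
  // Terminal states, mirroring TASK_FINISHED_STATES in the diff under review.
  private val FinishedStates = Set("FAILED", "KILLED", "SUCCESS")

  def schedulerDelay(task: SketchTask): Long = {
    if (FinishedStates.contains(task.status) &&
        task.metrics.isDefined && task.duration.isDefined) {
      val m = task.metrics.get
      // Assumption: the delay is whatever remains of the duration after the
      // measured phases; result-fetch time is left out of this sketch.
      math.max(0L, task.duration.get - m.executorRunTime -
        m.executorDeserializeTime - m.resultSerializationTime)
    } else {
      // Running (or otherwise unfinished) tasks report no scheduler delay.
      0L
    }
  }
}
```

Under these assumptions, a running task with `duration = Some(100L)` and all-zero metrics yields 0. Without the status check, the same inputs would yield 100, which is the confusing value the description refers to.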

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zsxwing/spark SPARK-23326

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20493.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20493
    
----
commit 7889fb0e5e4515ade35c2a07703017e16ee6194a
Author: Shixiong Zhu <zs...@...>
Date:   2018-02-03T00:25:34Z

    schedulerDelay should return 0 when the task is running

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20493: [SPARK-23326][WEBUI]schedulerDelay should return ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20493


---



[GitHub] spark pull request #20493: [SPARK-23326][WEBUI]schedulerDelay should return ...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20493#discussion_r166068708
  
    --- Diff: core/src/test/scala/org/apache/spark/status/AppStatusUtilsSuite.scala ---
    @@ -0,0 +1,89 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.Date
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.status.api.v1.{TaskData, TaskMetrics}
    +
    +class AppStatusUtilsSuite extends SparkFunSuite {
    +
    +  test("schedulerDelay") {
    +    val runningTask = new TaskData(
    --- End diff --
    
    +1


---



[GitHub] spark pull request #20493: [SPARK-23326][WEBUI]schedulerDelay should return ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20493#discussion_r165819565
  
    --- Diff: core/src/test/scala/org/apache/spark/status/AppStatusUtilsSuite.scala ---
    @@ -0,0 +1,89 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.Date
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.status.api.v1.{TaskData, TaskMetrics}
    +
    +class AppStatusUtilsSuite extends SparkFunSuite {
    +
    +  test("schedulerDelay") {
    +    val runningTask = new TaskData(
    --- End diff --
    
    Can we make this test case more concise and easier to read by removing the duplication?
    For the purpose of this test case, what about the following pattern?
    ```
    Seq(("RUNNING", 0L), ("SUCCESS", 3L)).foreach { case (status, schedulerDelay) =>
      // the code from `finishedTask`
    }
    ```
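
The table-driven pattern suggested above could look roughly like the following. This is a sketch only: `MiniTask` is a hypothetical stand-in for `TaskData`, `duration = 100L` is an assumed value chosen so the finished case works out to 3, and the delay formula is a simplification that ignores result-fetch time.

```scala
// Hypothetical stand-in for TaskData; only the fields this check needs.
final case class MiniTask(
    status: String,
    duration: Long,
    runTime: Long,
    deserializeTime: Long,
    serializeTime: Long)

object TableDrivenSketch {
  private val finished = Set("FAILED", "KILLED", "SUCCESS")

  // Simplified delay: zero unless the task has reached a terminal state.
  def schedulerDelay(t: MiniTask): Long =
    if (finished.contains(t.status)) {
      math.max(0L, t.duration - t.runTime - t.deserializeTime - t.serializeTime)
    } else {
      0L
    }

  // One row per case: (status, expected delay). All rows share the same metrics.
  val cases: Seq[(String, Long)] = Seq(("RUNNING", 0L), ("SUCCESS", 3L))

  // Returns one Boolean per row; true means the row matched its expectation.
  def run(): Seq[Boolean] = cases.map { case (status, expected) =>
    val task = MiniTask(status, duration = 100L, runTime = 90L,
      deserializeTime = 5L, serializeTime = 2L)
    schedulerDelay(task) == expected
  }
}
```

The trade-off is that every row shares one set of metric values, so the running case no longer looks like a task whose metrics have not been reported yet.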


---



[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20493
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87015/
    Test PASSed.


---



[GitHub] spark pull request #20493: [SPARK-23326][WEBUI]schedulerDelay should return ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20493#discussion_r166181600
  
    --- Diff: core/src/test/scala/org/apache/spark/status/AppStatusUtilsSuite.scala ---
    @@ -0,0 +1,89 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.Date
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.status.api.v1.{TaskData, TaskMetrics}
    +
    +class AppStatusUtilsSuite extends SparkFunSuite {
    +
    +  test("schedulerDelay") {
    +    val runningTask = new TaskData(
    --- End diff --
    
    +1


---



[GitHub] spark pull request #20493: [SPARK-23326][WEBUI]schedulerDelay should return ...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20493#discussion_r166064864
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusUtils.scala ---
    @@ -17,16 +17,23 @@
     
     package org.apache.spark.status
     
    -import org.apache.spark.status.api.v1.{TaskData, TaskMetrics}
    +import org.apache.spark.status.api.v1.TaskData
     
     private[spark] object AppStatusUtils {
     
    +  private val TASK_FINISHED_STATES = Set("FAILED", "KILLED", "SUCCESS")
    +
    +  private def isTaskFinished(task: TaskData): Boolean = {
    +    TASK_FINISHED_STATES.contains(task.status)
    +  }
    +
       def schedulerDelay(task: TaskData): Long = {
    -    if (task.taskMetrics.isDefined && task.duration.isDefined) {
    +    if (isTaskFinished(task) && task.taskMetrics.isDefined && task.duration.isDefined) {
    --- End diff --
    
    `task.duration.isDefined` should be redundant now, right?
    
    (I remember the duration didn't use to be set for running tasks, so this code worked, but apparently it changed while I worked on these changes...)


---



[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20493
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20493
  
    **[Test build #87015 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87015/testReport)** for PR 20493 at commit [`7889fb0`](https://github.com/apache/spark/commit/7889fb0e5e4515ade35c2a07703017e16ee6194a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20493
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/551/
    Test PASSed.


---



[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20493
  
    LGTM


---



[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20493
  
    **[Test build #87015 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87015/testReport)** for PR 20493 at commit [`7889fb0`](https://github.com/apache/spark/commit/7889fb0e5e4515ade35c2a07703017e16ee6194a).


---



[GitHub] spark pull request #20493: [SPARK-23326][WEBUI]schedulerDelay should return ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20493#discussion_r166197254
  
    --- Diff: core/src/test/scala/org/apache/spark/status/AppStatusUtilsSuite.scala ---
    @@ -0,0 +1,89 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.Date
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.status.api.v1.{TaskData, TaskMetrics}
    +
    +class AppStatusUtilsSuite extends SparkFunSuite {
    +
    +  test("schedulerDelay") {
    +    val runningTask = new TaskData(
    --- End diff --
    
    Actually, many values differ between these two code blocks:
    ```
     +        executorDeserializeTime = 5L,
     +        executorDeserializeCpuTime = 3L,
     +        executorRunTime = 90L,
     +        executorCpuTime = 10L,
     +        resultSize = 100L,
     +        jvmGcTime = 10L,
     +        resultSerializationTime = 2L,
    ```
    I think it's OK to keep the code as it is.


---



[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20493
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #20493: [SPARK-23326][WEBUI]schedulerDelay should return ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20493#discussion_r166181455
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusUtils.scala ---
    @@ -17,16 +17,23 @@
     
     package org.apache.spark.status
     
    -import org.apache.spark.status.api.v1.{TaskData, TaskMetrics}
    +import org.apache.spark.status.api.v1.TaskData
     
     private[spark] object AppStatusUtils {
     
    +  private val TASK_FINISHED_STATES = Set("FAILED", "KILLED", "SUCCESS")
    +
    +  private def isTaskFinished(task: TaskData): Boolean = {
    +    TASK_FINISHED_STATES.contains(task.status)
    +  }
    +
       def schedulerDelay(task: TaskData): Long = {
    -    if (task.taskMetrics.isDefined && task.duration.isDefined) {
    +    if (isTaskFinished(task) && task.taskMetrics.isDefined && task.duration.isDefined) {
    --- End diff --
    
    Logically, `duration` should be set for running tasks, to indicate how long a task has been running.
    
    I feel it's safer to keep `task.duration.isDefined`, as we call `task.duration.get` below.
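
One way to sidestep the question of whether the `isDefined` check is redundant is to avoid calling `.get` at all. The following is a sketch under assumptions, not what the PR does: `Task` and `Metrics` are hypothetical stand-ins for `TaskData` and `TaskMetrics`, and the delay formula ignores result-fetch time.

```scala
// Hypothetical stand-ins for TaskMetrics / TaskData (not the real Spark classes).
final case class Metrics(runTime: Long, deserializeTime: Long, serializeTime: Long)
final case class Task(status: String, duration: Option[Long], metrics: Option[Metrics])

object OptionSafeSketch {
  private val finished = Set("FAILED", "KILLED", "SUCCESS")

  def schedulerDelay(task: Task): Long = {
    // The for-comprehension yields None if the task is unfinished or if
    // either Option is empty, so no unchecked .get is needed.
    val delay = for {
      d <- task.duration if finished.contains(task.status)
      m <- task.metrics
    } yield math.max(0L, d - m.runTime - m.deserializeTime - m.serializeTime)
    delay.getOrElse(0L)
  }
}
```

With this shape, a missing `duration` falls through to 0 without an explicit guard, so the code stays safe even if the invariants around when `duration` is set change again.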


---



[GitHub] spark pull request #20493: [SPARK-23326][WEBUI]schedulerDelay should return ...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20493#discussion_r166197592
  
    --- Diff: core/src/test/scala/org/apache/spark/status/AppStatusUtilsSuite.scala ---
    @@ -0,0 +1,89 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.Date
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.status.api.v1.{TaskData, TaskMetrics}
    +
    +class AppStatusUtilsSuite extends SparkFunSuite {
    +
    +  test("schedulerDelay") {
    +    val runningTask = new TaskData(
    --- End diff --
    
    Yeah, I'm inclined to keep it as is, since the separate values are more realistic.


---



[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20493
  
    Thanks, merging to master/2.3!


---
