Posted to reviews@spark.apache.org by JoshRosen <gi...@git.apache.org> on 2014/12/30 23:07:32 UTC

[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/3849

    [SPARK-4014] Change TaskContext.attemptId to return attempt number instead of task ID

    This patch modifies `TaskContext.attemptId` to return an attempt number, which conveys how many times a task has been attempted, instead of a taskId, which uniquely identifies a particular task attempt within a particular SparkContext.  Prior to this change, it was impossible to determine whether a task was being re-attempted (or was a speculative copy), which made it difficult to write unit tests for tasks that fail on early attempts or speculative tasks that complete faster than original tasks.
    
    I've introduced a new `TaskContext.taskId` field which returns the old value.
    
    Most of this patch is fairly straightforward, but there is a bit of trickiness related to Mesos tasks: since there's no field in MesosTaskInfo to encode the attemptId, I packed it into the `data` field alongside the task binary.
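
    Here's a minimal sketch of that packing scheme (the helper name is illustrative, not the actual Spark code; the real change lives in MesosSchedulerBackend and MesosExecutorBackend):

    ```
    import java.nio.ByteBuffer

    // Prepend the 4-byte attempt number to the serialized task bytes.
    def packAttemptAndTask(attemptNumber: Int, serializedTask: ByteBuffer): ByteBuffer = {
      val buffer = ByteBuffer.allocate(4 + serializedTask.limit())
      buffer.putInt(attemptNumber)  // first 4 bytes: the attempt number
      buffer.put(serializedTask)    // remainder: the task binary
      buffer.rewind()               // reset position so the receiver reads from the start
      buffer
    }
    ```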

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark SPARK-4014

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3849.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3849
    
----
commit fd515a56e9e66a2f7ab70187fbd463046f1cb330
Author: Josh Rosen <jo...@databricks.com>
Date:   2014-12-30T22:00:46Z

    Add failing test for SPARK-4014

commit 1e7a933f41936e134e07c32aaf47bbcf8938167c
Author: Josh Rosen <jo...@databricks.com>
Date:   2014-12-30T22:01:21Z

    [SPARK-4014] Change TaskContext.attemptId to return attempt number instead of task ID.

commit 9d8d4d115449f48ce1714497e16704d4f12442d8
Author: Josh Rosen <jo...@databricks.com>
Date:   2014-12-30T22:06:19Z

    Doc typo

----




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68581193
  
    Maybe that test was flaky; let's see if it passes again (it'll retest since I pushed a commit to fix a merge conflict).
    
    I've updated this patch to not modify `attemptId` but to introduce `attemptNumber` and deprecate `attemptId`.  I think it will be confusing to have `attemptId` have different behavior in different branches, especially since it seems like functionality that might be nice to rely on when writing certain types of regression tests.  Since this patch doesn't change any behavior, I'd like to backport it to maintenance branches so that we can rely on it in test code.  If we decide to do that, the committer should update the MiMa exclusions on cherry-pick.
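
    For reviewers, here's a simplified sketch of the resulting public surface under this approach (not the verbatim source, just the shape):

    ```
    abstract class TaskContext extends Serializable {
      /** An ID that is unique to this task attempt within this SparkContext. */
      def taskAttemptId(): Long

      /** How many times this task has been attempted; the first attempt is numbered 0. */
      def attemptNumber(): Int

      /** Kept for binary compatibility. */
      @deprecated("use attemptNumber or taskAttemptId instead", "1.3.0")
      def attemptId(): Long
    }
    ```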




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by tnachen <gi...@git.apache.org>.
Github user tnachen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69810067
  
    Thanks for adding the wrapper; the Mesos portions look good to me.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68407800
  
    Ah I see - I didn't see the doc. I'm more on the fence in this case (because there was a doc that created a specification). So I guess I'm fine either way.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22367585
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -635,7 +635,7 @@ class DAGScheduler(
           val rdd = job.finalStage.rdd
           val split = rdd.partitions(job.partitions(0))
           val taskContext =
    -        new TaskContextImpl(job.finalStage.id, job.partitions(0), 0, true)
    +        new TaskContextImpl(job.finalStage.id, job.partitions(0), 0, 0, true)
    --- End diff --
    
    Good call; I didn't do this for the test code, but this line is in DAGScheduler so it should use named arguments.
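
    Concretely, something like this (argument names per this patch's TaskContextImpl constructor; treat it as a sketch):

    ```
    val taskContext = new TaskContextImpl(
      stageId = job.finalStage.id,
      partitionId = job.partitions(0),
      taskAttemptId = 0,
      attemptNumber = 0,
      runningLocally = true)
    ```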




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22367584
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
    @@ -180,7 +189,7 @@ private[spark] class Executor(
     
             // Run the actual task and measure its runtime.
             taskStart = System.currentTimeMillis()
    -        val value = task.run(taskId.toInt)
    +        val value = task.run(taskId = taskId.toLong, attemptId = attemptId.toLong)
    --- End diff --
    
    the toLong's are not necessary, are they?




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69865129
  
    Noo!  This became merge-conflicted.  Let me bring it up to date...




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68408128
  
    Hm - we probably just shouldn't backport it; again, I think users might be depending on the old behavior, and putting it in a patch release is a bit iffy.





[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68404316
  
    /cc @koeninger, who raised this issue on the mailing list, and @yhuai, who filed the original JIRA issue.
    
    Also, /cc @pwendell, @andrewor14, and @tdas for review.
    
    Technically, this changes the behavior of `attemptId`; I'd argue that the old behavior was a bug, but we should consider whether we need to preserve it and introduce a new method which returns the attempt number.  I don't think there's anything particularly useful that you can do with the taskId, but maybe I'm mistaken.
    
    Also, "attempt number" would be a better name than "attempt ID", but I think we're kind of stuck with attemptId for binary compatibility reasons.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22367485
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -635,7 +635,7 @@ class DAGScheduler(
           val rdd = job.finalStage.rdd
           val split = rdd.partitions(job.partitions(0))
           val taskContext =
    -        new TaskContextImpl(job.finalStage.id, job.partitions(0), 0, true)
    +        new TaskContextImpl(job.finalStage.id, job.partitions(0), 0, 0, true)
    --- End diff --
    
    can we use named arguments for the two zeros and the true?




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68407820
  
    Annoyingly, it looks like ScalaDoc doesn't display Javadoc annotations, so in the Scala documentation for 1.2.0 the TaskContext class appears to have lost all documentation, even though it still shows up in the Java docs:
    
    - http://spark.apache.org/docs/1.2.0/api/scala/index.html#org.apache.spark.TaskContext
    - https://spark.apache.org/docs/1.2.0/api/java/org/apache/spark/TaskContext.html
    
    I don't know how the docs for `attemptId` managed to get lost during the TaskContext stabilization patch, but this suggests that we need a better review checklist for public APIs which mandates full Scaladoc / Javadoc for all public interfaces.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22367053
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/MesosExecutorBackend.scala ---
    @@ -77,10 +77,13 @@ private[spark] class MesosExecutorBackend
     
       override def launchTask(d: ExecutorDriver, taskInfo: TaskInfo) {
         val taskId = taskInfo.getTaskId.getValue.toLong
    +    val attemptIdPlusData = taskInfo.getData.asReadOnlyByteBuffer()
    --- End diff --
    
    And here's the corresponding receive-side of the Mesos trickiness.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68406594
  
    So personally I don't think we should change the semantics of `attemptId` because this has been exposed to user applications and they could silently break if we modify the meaning of the field (my original JIRA referred to an internal use of this). What it means right now is "a global GUID over all attempts" - that is a bit of an awkward definition, but I don't think it's fair to call this a bug - it was just a weird definition.
    
    So I'd be in favor of deprecating this in favor of `taskAttemptId` (a new field) and say that it was renamed to avoid confusion. Then we can add another field, `attemptCount` or `attemptNum` or something to convey the more intuitive thing.
    
    It will be slightly awkward, but if anyone reads the docs it should be obvious. In fact, we should probably spruce up the docs here for things like `partitionID` which right now are probably not super clear to users.
    
     





[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69823967
  
      [Test build #25483 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25483/consoleFull) for   PR 3849 at commit [`5cfff05`](https://github.com/apache/spark/commit/5cfff05c4cc3c6eba94b3b3f1c34830dfb5fb958).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69646530
  
      [Test build #25425 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25425/consoleFull) for   PR 3849 at commit [`38574d4`](https://github.com/apache/spark/commit/38574d4ecbad9d11af2a648ab35bd6aa044c8cf2).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22367512
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
    @@ -44,8 +44,9 @@ import org.apache.spark.util.Utils
      */
     private[spark] abstract class Task[T](val stageId: Int, var partitionId: Int) extends Serializable {
     
    -  final def run(attemptId: Long): T = {
    -    context = new TaskContextImpl(stageId, partitionId, attemptId, runningLocally = false)
    +  final def run(taskId: Long, attemptId: Long): T = {
    --- End diff --
    
    would be great to add javadoc here




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68408000
  
    One potential consideration is backporting: if we agree that `attemptId`'s meaning should be changed to reflect the 1.1.1 documentation, then we should make the same fix in `branch-1.1`, which will require a separate patch because that branch uses a different implementation of TaskContext.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69871290
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25505/
    Test PASSed.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22367027
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala ---
    @@ -290,13 +291,22 @@ private[spark] class MesosSchedulerBackend(
           .setType(Value.Type.SCALAR)
           .setScalar(Value.Scalar.newBuilder().setValue(scheduler.CPUS_PER_TASK).build())
           .build()
    +    // Encode the attemptId as part of the data payload, since there's not a MesosTaskInfo field
    +    // to hold it:
    +    val data = {
    --- End diff --
    
    Here's the Mesos trickiness that I alluded to in the PR description.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68426611
  
      [Test build #24945 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24945/consoleFull) for   PR 3849 at commit [`8c387ce`](https://github.com/apache/spark/commit/8c387ce76850caf2163dd82dc5c79b079788a921).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69865505
  
      [Test build #25505 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25505/consoleFull) for   PR 3849 at commit [`89d03e0`](https://github.com/apache/spark/commit/89d03e03929e1c0d971a258b90f6c9fdbd6ed297).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68583690
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25004/
    Test PASSed.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68453172
  
    Hmm, looks like this change somehow broke a PySpark Streaming test:
    
    ```
    ======================================================================
    FAIL: Basic operation test for DStream.mapPartitions.
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "pyspark/streaming/tests.py", line 228, in test_mapPartitions
        self._test_func(rdds, func, expected)
      File "pyspark/streaming/tests.py", line 114, in _test_func
        self.assertEqual(expected, result)
    AssertionError: [[3, 7], [11, 15], [19, 23]] != []
    ```




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22375124
  
    --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskContextSuite.scala ---
    @@ -63,6 +63,33 @@ class TaskContextSuite extends FunSuite with BeforeAndAfter with LocalSparkConte
     
         verify(listener, times(1)).onTaskCompletion(any())
       }
    +
    +  test("TaskContext.attemptNumber should return attempt number, not task id (SPARK-4014)") {
    +    sc = new SparkContext("local-cluster[2,1,512]", "test")
    --- End diff --
    
    Just realized that I can change the `maxFailures` property on `local` instead of having to use `local-cluster`.  Let me make that change, since it's a better practice and will speed up the tests.
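
    For reference, roughly this (the `local[N, F]` master string sets the thread count and maxFailures directly; the exact values here are illustrative):

    ```
    sc = new SparkContext("local[2, 2]", "test")  // 2 threads, up to 2 attempts per task
    ```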




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68811124
  
      [Test build #25075 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25075/consoleFull) for   PR 3849 at commit [`1d43aa6`](https://github.com/apache/spark/commit/1d43aa618ccba4cdc8f12f46aa1119cb4f82dbd6).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22830201
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala ---
    @@ -290,13 +291,22 @@ private[spark] class MesosSchedulerBackend(
           .setType(Value.Type.SCALAR)
           .setScalar(Value.Scalar.newBuilder().setValue(scheduler.CPUS_PER_TASK).build())
           .build()
    +    // Encode the attemptId as part of the data payload, since there's not a MesosTaskInfo field
    +    // to hold it:
    +    val data: ByteString = {
    +      val serializedTask = task.serializedTask
    +      val dataBuffer = ByteBuffer.allocate(4 + serializedTask.limit())
    +      dataBuffer.putInt(task.attemptNumber)
    +      dataBuffer.put(serializedTask)
    --- End diff --
    
    That's a good idea; putting the serialization / deserialization code in the same wrapper class will make it much easier to verify that it's correct / test it separately.  I'll push a new commit to do this.
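
    Roughly what I have in mind (hypothetical names; the byte layout matches the diff above):

    ```
    import java.nio.ByteBuffer
    import com.google.protobuf.ByteString

    /** Owns both directions of the encoding, so the layout is defined in one place. */
    private[spark] case class TaskData(attemptNumber: Int, serializedTask: ByteBuffer) {
      def toByteString: ByteString = {
        val buffer = ByteBuffer.allocate(4 + serializedTask.limit())
        buffer.putInt(attemptNumber)
        buffer.put(serializedTask)
        buffer.rewind()
        ByteString.copyFrom(buffer)
      }
    }

    private[spark] object TaskData {
      def fromByteString(data: ByteString): TaskData = {
        val buffer = data.asReadOnlyByteBuffer()
        TaskData(attemptNumber = buffer.getInt(), serializedTask = buffer.slice())
      }
    }
    ```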




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68429438
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24945/
    Test FAILed.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22372823
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
    @@ -180,7 +189,7 @@ private[spark] class Executor(
     
             // Run the actual task and measure its runtime.
             taskStart = System.currentTimeMillis()
    -        val value = task.run(taskId.toInt)
    +        val value = task.run(taskId = taskId.toLong, attemptId = attemptId.toLong)
    --- End diff --
    
    It turns out that `attemptId` was a long but we only expected it to hold values that could fit in an int, hence this weird `.toLong` in a few places, plus a few `% Int.MaxValue` modulus calls in various places; I've cleaned this up.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69646539
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25425/
    Test PASSed.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68412414
  
    Hmm, looks like this failed MiMa checks due to the addition of a new method to a public interface:
    
    ```
    [info] spark-core: found 1 potential binary incompatibilities (filtered 185)
    [error]  * abstract method taskId()Long in class org.apache.spark.TaskContext does not have a correspondent in old version
    [error]    filter with: ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.TaskContext.taskId")
    ```
    
    I'll add an exclusion.
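
    (For anyone unfamiliar: that means adding the filter that MiMa suggests to `project/MimaExcludes.scala`, roughly like this; the exact surrounding structure varies by branch.)

    ```
    ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.TaskContext.taskId")
    ```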




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68432871
  
      [Test build #24955 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24955/consoleFull) for   PR 3849 at commit [`0b10526`](https://github.com/apache/spark/commit/0b1052611f7fd7f71232ba3f2c0505e5711080e1).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68418485
  
      [Test build #24923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24923/consoleFull) for   PR 3849 at commit [`cbe4d76`](https://github.com/apache/spark/commit/cbe4d76a7968c467514c290fc56ece713225970a).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68410774
  
      [Test build #24908 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24908/consoleFull) for   PR 3849 at commit [`9d8d4d1`](https://github.com/apache/spark/commit/9d8d4d115449f48ce1714497e16704d4f12442d8).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Sort(`





[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68404370
  
      [Test build #24908 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24908/consoleFull) for   PR 3849 at commit [`9d8d4d1`](https://github.com/apache/spark/commit/9d8d4d115449f48ce1714497e16704d4f12442d8).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/3849




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68818892
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25075/
    Test PASSed.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69978436
  
    I'm going to merge this into `master` (1.3.0) since it's a blocker for some tests that I want to write.  I'll look into backporting this into maintenance branches, too, since that would allow me to backport regression tests that use the new methods.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by koeninger <gi...@git.apache.org>.
Github user koeninger commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68407388
  
    The flip side is that it's already documented as doing the "right" thing:
    
    http://spark.apache.org/docs/1.1.1/api/scala/index.html#org.apache.spark.TaskContext
    
    val attemptId: Long
    
    the number of attempts to execute this task
    
    On Tue, Dec 30, 2014 at 4:38 PM, Patrick Wendell <no...@github.com>
    wrote:
    
    > So personally I don't think we should change the semantics of attemptId
    > because this has been exposed to user applications and they could silently
    > break if we modify the meaning of the field (my original JIRA referred to
    > an internal use of this). What it means right now is "a global GUID over
    > all attempts" - that is a bit of an awkward definition, but I don't think
    > it's fair to call this a bug - it was just a weird definition.
    >
    > So I'd be in favor of deprecating this in favor of taskAttemptId (a new
    > field) and say that it was renamed to avoid confusion. Then we can add
    > another field, attemptCount or attemptNum or something to convey the more
    > intuitive thing.
    >
    > It will be slightly awkward, but if anyone reads the docs it should be
    > obvious. In fact, we should probably spruce up the docs here for things
    > like partitionID which right now are probably not super clear to users.
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/spark/pull/3849#issuecomment-68406594>.
    >




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69823981
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25483/
    Test PASSed.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22367524
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala ---
    @@ -290,13 +291,22 @@ private[spark] class MesosSchedulerBackend(
           .setType(Value.Type.SCALAR)
           .setScalar(Value.Scalar.newBuilder().setValue(scheduler.CPUS_PER_TASK).build())
           .build()
    +    // Encode the attemptId as part of the data payload, since there's not a MesosTaskInfo field
    +    // to hold it:
    +    val data = {
    --- End diff --
    
    can we type data explicitly?




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68421619
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24923/
    Test FAILed.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69630266
  
      [Test build #25425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25425/consoleFull) for   PR 3849 at commit [`38574d4`](https://github.com/apache/spark/commit/38574d4ecbad9d11af2a648ab35bd6aa044c8cf2).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68421615
  
      [Test build #24923 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24923/consoleFull) for   PR 3849 at commit [`cbe4d76`](https://github.com/apache/spark/commit/cbe4d76a7968c467514c290fc56ece713225970a).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by tnachen <gi...@git.apache.org>.
Github user tnachen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22829776
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala ---
    @@ -290,13 +291,22 @@ private[spark] class MesosSchedulerBackend(
           .setType(Value.Type.SCALAR)
           .setScalar(Value.Scalar.newBuilder().setValue(scheduler.CPUS_PER_TASK).build())
           .build()
    +    // Encode the attemptId as part of the data payload, since there's not a MesosTaskInfo field
    +    // to hold it:
    +    val data: ByteString = {
    +      val serializedTask = task.serializedTask
    +      val dataBuffer = ByteBuffer.allocate(4 + serializedTask.limit())
    +      dataBuffer.putInt(task.attemptNumber)
    +      dataBuffer.put(serializedTask)
    --- End diff --
    
    Can we introduce another wrapper here instead? I imagine we will be adding more fields to serialize to Mesos executors, and it's a lot easier to maintain a struct than positions with types and offsets.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22367616
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
    @@ -44,8 +44,9 @@ import org.apache.spark.util.Utils
      */
     private[spark] abstract class Task[T](val stageId: Int, var partitionId: Int) extends Serializable {
     
    -  final def run(attemptId: Long): T = {
    -    context = new TaskContextImpl(stageId, partitionId, attemptId, runningLocally = false)
    +  final def run(taskId: Long, attemptId: Long): T = {
    --- End diff --
    
    Are the descriptions in TaskContext okay?  If so, I'll just copy those for the parameters.
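
    i.e., roughly this (a sketch, with wording adapted from the TaskContext descriptions):

    ```
    /**
     * Called by the executor to run this task.
     *
     * @param taskId an ID that is unique to this task attempt within its SparkContext
     * @param attemptId how many times this task has been attempted (0 for the first attempt)
     */
    final def run(taskId: Long, attemptId: Long): T = {
      // body unchanged from the diff above
      ???
    }
    ```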




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68429433
  
      [Test build #24945 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24945/consoleFull) for   PR 3849 at commit [`8c387ce`](https://github.com/apache/spark/commit/8c387ce76850caf2163dd82dc5c79b079788a921).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68408407
  
    Should we at least leave a documentation note to inform users about the difference in behavior?  I'm worried that someone will look at the 1.2 docs, write some code which relies on the correct behavior, then be surprised if they run it on an older release.




[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68818889
  
      [Test build #25075 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25075/consoleFull) for   PR 3849 at commit [`1d43aa6`](https://github.com/apache/spark/commit/1d43aa618ccba4cdc8f12f46aa1119cb4f82dbd6).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68437209
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24955/
    Test FAILed.


[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69871285
  
      [Test build #25505 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25505/consoleFull) for   PR 3849 at commit [`89d03e0`](https://github.com/apache/spark/commit/89d03e03929e1c0d971a258b90f6c9fdbd6ed297).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68813324
  
    @pwendell I've updated this based on our previous discussions; please take another look.


[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69808717
  
      [Test build #25483 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25483/consoleFull) for   PR 3849 at commit [`5cfff05`](https://github.com/apache/spark/commit/5cfff05c4cc3c6eba94b3b3f1c34830dfb5fb958).
     * This patch merges cleanly.


[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68437206
  
      [Test build #24955 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24955/consoleFull) for   PR 3849 at commit [`0b10526`](https://github.com/apache/spark/commit/0b1052611f7fd7f71232ba3f2c0505e5711080e1).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by koeninger <gi...@git.apache.org>.
Github user koeninger commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68405794
  
    Thanks for this.  Most of the uses of attemptId I've seen look like they were assuming it meant the 0-based attempt number.
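
    A minimal sketch of that kind of user code, assuming the post-patch API
    (`TaskContext.get()` plus the new `attemptNumber`); the function name is
    hypothetical:

        import org.apache.spark.TaskContext

        def processPartition(iter: Iterator[Int]): Iterator[Int] = {
          val ctx = TaskContext.get()
          // 0-based attempt number: 0 on the first try, > 0 on re-attempts
          // and speculative copies -- which is what callers typically want,
          // e.g. to clean up partial output before retrying side effects.
          if (ctx.attemptNumber() > 0) {
            println(s"Re-running partition ${ctx.partitionId()}, " +
              s"attempt ${ctx.attemptNumber()}")
          }
          iter
        }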


[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3849#discussion_r22367600
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
    @@ -180,7 +189,7 @@ private[spark] class Executor(
     
             // Run the actual task and measure its runtime.
             taskStart = System.currentTimeMillis()
    -        val value = task.run(taskId.toInt)
    +        val value = task.run(taskId = taskId.toLong, attemptId = attemptId.toLong)
    --- End diff --
    
    Good catch; this was a carryover from the old code.
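
    As an aside, the old `taskId.toInt` shown above is the sort of
    conversion that silently truncates, since Scala's `Long.toInt` keeps
    only the low 32 bits. A self-contained illustration (not code from the
    patch):

        object TruncationDemo {
          def main(args: Array[String]): Unit = {
            // A task ID this large is only plausible in a very long-lived
            // SparkContext, but the truncation itself is unconditional.
            val taskId: Long = (1L << 32) + 7
            println(taskId.toInt)  // prints 7 -- the high 32 bits are gone
          }
        }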


[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68583689
  
      [Test build #25004 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25004/consoleFull) for   PR 3849 at commit [`eee6a45`](https://github.com/apache/spark/commit/eee6a4569d926d3ada2ca259ddb04906392688ae).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68581165
  
      [Test build #25004 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25004/consoleFull) for   PR 3849 at commit [`eee6a45`](https://github.com/apache/spark/commit/eee6a4569d926d3ada2ca259ddb04906392688ae).
     * This patch merges cleanly.


[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-69486253
  
    @joshrosen LGTM regarding the renaming.


[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3849#issuecomment-68410782
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24908/
    Test FAILed.

