You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by squito <gi...@git.apache.org> on 2015/10/22 05:20:05 UTC

[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

GitHub user squito opened a pull request:

    https://github.com/apache/spark/pull/9214

    [SPARK-8029][core][wip] first successful shuffle task always wins

    Shuffle writers now write to temp files, and when they are done, they atomically move those files into the final location *if those files don't already exist*.  This way, if one executor ends up executing more than one task to generate shuffle output for one partition, the first successful one "wins", and all others are ignored.
    
    TODO
    - [ ] make sure I'm using the right compression / temp block sizes, per SPARK-3426
    - [ ] run some fault-injection tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/squito/spark SPARK-8029_first_wins

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9214.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9214
    
----
commit 6140e426f045967e107451336005887e144f6e39
Author: Imran Rashid <ir...@cloudera.com>
Date:   2015-10-21T19:26:26Z

    ShuffleWriters write to temp file, then go through
    ShuffleOutputCoordinator to atomically move w/ "first one wins"

commit 5854ac8a68474b595c9f02d895f2bb3c2eb59c5a
Author: Imran Rashid <ir...@cloudera.com>
Date:   2015-10-22T03:17:17Z

    assorted cleanup

commit c3e4456788e4f6a10d07f5ff47eb4d6a8d19f543
Author: Imran Rashid <ir...@cloudera.com>
Date:   2015-10-22T03:19:07Z

    style

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151623844
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44451/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150638306
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150626639
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151586556
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44443/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153831594
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153830448
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45038/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43666157
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/ShuffleOutputCoordinator.scala ---
    @@ -0,0 +1,105 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.shuffle
    +
    +import java.io.{FileOutputStream, FileInputStream, File}
    +
    +import com.google.common.annotations.VisibleForTesting
    +
    +import org.apache.spark.storage.ShuffleMapStatusBlockId
    +import org.apache.spark.{SparkEnv, Logging}
    +import org.apache.spark.scheduler.MapStatus
    +import org.apache.spark.serializer.SerializerInstance
    +
    +/**
    + * Ensures that on each executor, there are no conflicting writes to the same shuffle files.  It
    + * implements "first write wins", by atomically moving all shuffle files into their final location,
    + * only if the files did not already exist. See SPARK-8029
    + */
    +private[spark] object ShuffleOutputCoordinator extends Logging {
    +
    +  /**
    +   * If any of the destination files do not exist, then move all of the temporary files to their
    +   * destinations, and return (true, the given MapStatus).  If all destination files exist, then
    +   * delete all temporary files, and return (false, the MapStatus from previously committed shuffle
    +   * output).
    +   *
    +   * Note that this will write to all destination files.  If the tmp file is missing, then a
    +   * zero-length destination file will be created.  This is so the ShuffleOutputCoordinator can work
    +   * even when there is a non-determinstic data, where the output exists in one attempt, but is
    +   * empty in another attempt.
    +   *
    +   * @param tmpToDest  Seq of (temporary, destination) file pairs
    +   * @param mapStatus the [[MapStatus]] for the output already written to the the temporary files
    +   * @return pair of: (1) true iff the set of temporary files was moved to the destination and (2)
    +   *         the MapStatus of the committed attempt.
    +   *
    +   */
    +  def commitOutputs(
    +      shuffleId: Int,
    +      partitionId: Int,
    +      tmpToDest: Seq[(File, File)],
    +      mapStatus: MapStatus,
    +      sparkEnv: SparkEnv): (Boolean, MapStatus) = synchronized {
    +    val mapStatusFile = sparkEnv.blockManager.diskBlockManager.getFile(
    +      ShuffleMapStatusBlockId(shuffleId, partitionId))
    +    val ser = sparkEnv.serializer.newInstance()
    +    commitOutputs(shuffleId, partitionId, tmpToDest, mapStatus, mapStatusFile, ser)
    +  }
    +
    +  @VisibleForTesting
    +  def commitOutputs(
    +      shuffleId: Int,
    +      partitionId: Int,
    +      tmpToDest: Seq[(File, File)],
    +      mapStatus: MapStatus,
    +      mapStatusFile: File,
    +      serializer: SerializerInstance): (Boolean, MapStatus) = synchronized {
    +    val destAlreadyExists = tmpToDest.forall{_._2.exists()} && mapStatusFile.exists()
    +    if (!destAlreadyExists) {
    +      tmpToDest.foreach { case (tmp, dest) =>
    +        // If *some* of the destination files exist, but not all of them, then its not clear
    --- End diff --
    
    nit: it's


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43952983
  
    --- Diff: core/src/test/scala/org/apache/spark/ShuffleSuite.scala ---
    @@ -317,6 +309,156 @@ abstract class ShuffleSuite extends SparkFunSuite with Matchers with LocalSparkC
         assert(metrics.bytesWritten === metrics.byresRead)
         assert(metrics.bytesWritten > 0)
       }
    +
    +  /**
    +   * ShuffleOutputCoordinator requires that a given shuffle always generate the same set of files on
    +   * all attempts, even if the data is non-deterministic.  We test the extreme case where the data
    +   * is completely missing in one case
    +   */
    +  test("same set of shuffle files regardless of data") {
    +
    +    def shuffleAndGetShuffleFiles(data: Seq[Int]): Map[String, Long] = {
    +      var tempDir: File = null
    +      try {
    +        tempDir = Utils.createTempDir()
    +        conf.set("spark.local.dir", tempDir.getAbsolutePath)
    +        sc = new SparkContext("local", "test", conf)
    +        val rdd = sc.parallelize(data, 10).map(x => (x, x))
    +        val shuffledRdd = new ShuffledRDD[Int, Int, Int](rdd, new HashPartitioner(4))
    +        shuffledRdd.count()
    +        val shuffleFiles =
    +          FileUtils.listFiles(tempDir, TrueFileFilter.INSTANCE, TrueFileFilter.INSTANCE)
    +        sc.stop()
    +        shuffleFiles.asScala.map { file: File =>
    +          file.getName -> file.length()
    +        }.toMap
    +      } finally {
    +        conf.remove("spark.local.dir")
    +        Utils.deleteRecursively(tempDir)
    +      }
    +    }
    +
    +    val shuffleFilesWithData = shuffleAndGetShuffleFiles(0 until 100)
    +    val shuffleFilesNoData = shuffleAndGetShuffleFiles(Seq[Int]())
    +    assert(shuffleFilesNoData.keySet === shuffleFilesWithData.keySet)
    +    // make sure our test is doing what it is supposed to -- at least some of the
    +    // "no data" files are empty
    +    assert(shuffleFilesNoData.filter{ case (name, size) =>
    +      size == 0L && shuffleFilesWithData(name) != 0L
    +    }.nonEmpty)
    +  }
    +
    +  test("multiple simultaneous attempts for one task (SPARK-8029)") {
    +    sc = new SparkContext("local", "test", conf)
    +    val mapStatusFile = sc.env.blockManager.diskBlockManager.getFile(ShuffleMapStatusBlockId(0, 0))
    +    val mapTrackerMaster = sc.env.mapOutputTracker.asInstanceOf[MapOutputTrackerMaster]
    +    val manager = sc.env.shuffleManager
    +
    +    val taskMemoryManager = new TaskMemoryManager(sc.env.memoryManager, 0L)
    +    val metricsSystem = sc.env.metricsSystem
    +    val shuffleMapRdd = new MyRDD(sc, 1, Nil)
    +    val shuffleDep = new ShuffleDependency(shuffleMapRdd, new HashPartitioner(1))
    +    val shuffleHandle = manager.registerShuffle(0, 1, shuffleDep)
    +
    +    // first attempt -- its successful
    +    val writer1 = manager.getWriter[Int, Int](shuffleHandle, 0,
    +      new TaskContextImpl(0, 0, 0L, 0, taskMemoryManager, metricsSystem,
    +        InternalAccumulator.create(sc)))
    +    val data1 = (1 to 10).map { x => x -> x}
    +
    +    // second attempt -- also successful.  We'll write out different data,
    +    // just to simulate the fact that the records may get written differently
    +    // depending on what gets spilled, what gets combined, etc.
    +    val writer2 = manager.getWriter[Int, Int](shuffleHandle, 0,
    +      new TaskContextImpl(0, 0, 1L, 0, taskMemoryManager, metricsSystem,
    +        InternalAccumulator.create(sc)))
    +    val data2 = (11 to 20).map { x => x -> x}
    +
    +    // interleave writes of both attempts -- we want to test that both attempts can occur
    +    // simultaneously, and everything is still OK
    +
    +    def writeAndClose(
    +      writer: ShuffleWriter[Int, Int])(
    --- End diff --
    
    Indentation.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151550003
  
    **[Test build #44432 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44432/consoleFull)** for PR 9214 at commit [`eabf978`](https://github.com/apache/spark/commit/eabf97838517d5354685f48a4e0abc5fec329df3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r42904139
  
    --- Diff: core/src/test/scala/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriterSuite.scala ---
    @@ -121,12 +140,16 @@ class BypassMergeSortShuffleWriterSuite extends SparkFunSuite with BeforeAndAfte
           taskContext,
           conf
         )
    -    writer.write(Iterator.empty)
    +    val files = writer.write(Iterator.empty)
         writer.stop( /* success = */ true)
    +    assert(ShuffleOutputCoordinator.commitOutputs(0, 0, files))
         assert(writer.getPartitionLengths.sum === 0)
    -    assert(outputFile.exists())
    -    assert(outputFile.length() === 0)
    -    assert(temporaryFilesCreated.isEmpty)
    +    if (outputFile.exists()) {
    +      assert(outputFile.length() === 0)
    +      assert(temporaryFilesCreated.size === 2)
    +    } else {
    +      assert(temporaryFilesCreated.size === 1)
    +    }
    --- End diff --
    
    @JoshRosen I'm pretty sure that `BypassMergeSortShuffleWriter` actually wasn't writing any data file if there was an empty iterator.  This test was passing just b/c the test setup created the file w/ `File.createTempFile`.  So the change here brings it into line with the implementation (either write no file at all, or write a zero-length file) but given my comment on non-determinism I will change this to always write a zero length file.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150460966
  
    cc @JoshRosen 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43666391
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/ShuffleOutputCoordinator.scala ---
    @@ -0,0 +1,105 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.shuffle
    +
    +import java.io.{FileOutputStream, FileInputStream, File}
    +
    +import com.google.common.annotations.VisibleForTesting
    +
    +import org.apache.spark.storage.ShuffleMapStatusBlockId
    +import org.apache.spark.{SparkEnv, Logging}
    +import org.apache.spark.scheduler.MapStatus
    +import org.apache.spark.serializer.SerializerInstance
    +
    +/**
    + * Ensures that on each executor, there are no conflicting writes to the same shuffle files.  It
    + * implements "first write wins", by atomically moving all shuffle files into their final location,
    + * only if the files did not already exist. See SPARK-8029
    + */
    +private[spark] object ShuffleOutputCoordinator extends Logging {
    +
    +  /**
    +   * If any of the destination files do not exist, then move all of the temporary files to their
    +   * destinations, and return (true, the given MapStatus).  If all destination files exist, then
    +   * delete all temporary files, and return (false, the MapStatus from previously committed shuffle
    +   * output).
    +   *
    +   * Note that this will write to all destination files.  If the tmp file is missing, then a
    +   * zero-length destination file will be created.  This is so the ShuffleOutputCoordinator can work
    +   * even when there is a non-determinstic data, where the output exists in one attempt, but is
    +   * empty in another attempt.
    +   *
    +   * @param tmpToDest  Seq of (temporary, destination) file pairs
    +   * @param mapStatus the [[MapStatus]] for the output already written to the the temporary files
    +   * @return pair of: (1) true iff the set of temporary files was moved to the destination and (2)
    +   *         the MapStatus of the committed attempt.
    +   *
    +   */
    +  def commitOutputs(
    +      shuffleId: Int,
    +      partitionId: Int,
    +      tmpToDest: Seq[(File, File)],
    +      mapStatus: MapStatus,
    +      sparkEnv: SparkEnv): (Boolean, MapStatus) = synchronized {
    +    val mapStatusFile = sparkEnv.blockManager.diskBlockManager.getFile(
    +      ShuffleMapStatusBlockId(shuffleId, partitionId))
    +    val ser = sparkEnv.serializer.newInstance()
    +    commitOutputs(shuffleId, partitionId, tmpToDest, mapStatus, mapStatusFile, ser)
    +  }
    +
    +  @VisibleForTesting
    +  def commitOutputs(
    +      shuffleId: Int,
    +      partitionId: Int,
    +      tmpToDest: Seq[(File, File)],
    +      mapStatus: MapStatus,
    +      mapStatusFile: File,
    +      serializer: SerializerInstance): (Boolean, MapStatus) = synchronized {
    +    val destAlreadyExists = tmpToDest.forall{_._2.exists()} && mapStatusFile.exists()
    +    if (!destAlreadyExists) {
    +      tmpToDest.foreach { case (tmp, dest) =>
    +        // If *some* of the destination files exist, but not all of them, then its not clear
    +        // what to do.  There could be a task already reading from this dest file when we delete
    +        // it -- but then again, something in that taskset would be doomed to fail in any case when
    +        // it got to the missing files.  Better to just put consistent output into place
    +        if (dest.exists()) {
    +          dest.delete()
    +        }
    +        if (tmp.exists()) {
    +          tmp.renameTo(dest)
    +        } else {
    +          // we always create the destination files, so this works correctly even when the
    +          // input data is non-deterministic (potentially empty in one iteration, and non-empty
    +          // in another)
    +          dest.createNewFile()
    +        }
    +      }
    +      val out = serializer.serializeStream(new FileOutputStream(mapStatusFile))
    +      out.writeObject(mapStatus)
    +      out.close()
    +      (true, mapStatus)
    +    } else {
    +      logInfo(s"shuffle output for shuffle $shuffleId, partition $partitionId already exists, " +
    +        s"not overwriting.  Another task must have created this shuffle output.")
    +      tmpToDest.foreach{ case (tmp, _) => tmp.delete()}
    +      val in = serializer.deserializeStream(new FileInputStream(mapStatusFile))
    +      val readStatus = in.readObject[MapStatus]
    --- End diff --
    
    you need a `try..finally` here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150627404
  
    **[Test build #44231 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44231/consoleFull)** for PR 9214 at commit [`89063dd`](https://github.com/apache/spark/commit/89063dd3654066e076e0e4f13250d25c414e88c3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43664703
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java ---
    @@ -121,13 +125,22 @@ public BypassMergeSortShuffleWriter(
       }
     
       @Override
    -  public void write(Iterator<Product2<K, V>> records) throws IOException {
    +  public Seq<Tuple2<File, File>> write(Iterator<Product2<K, V>> records) throws IOException {
    --- End diff --
    
    Could you add a javadoc explaining what the return value is? It's particularly cryptic because it's uses tuples; maybe it would be better to create a helper type where the fields have proper names.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151617398
  
    I think this is ready to go now (pretty sure the last commit will fix the test failure, I just got a little antsy with `git push`)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43664936
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockResolver.scala ---
    @@ -132,6 +134,15 @@ private[spark] class FileShuffleBlockResolver(conf: SparkConf)
                 logWarning(s"Error deleting ${file.getPath()}")
               }
             }
    +        for (mapId <- state.completedMapTasks.asScala) {
    +          val mapStatusFile =
    +            blockManager.diskBlockManager.getFile(ShuffleMapStatusBlockId(shuffleId, mapId))
    +          if (mapStatusFile.exists()) {
    +            if (!mapStatusFile.delete()) {
    --- End diff --
    
    nit: you could merge the two `if`s.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150096063
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151358057
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44396/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150096333
  
    **[Test build #44124 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44124/consoleFull)** for PR 9214 at commit [`c3e4456`](https://github.com/apache/spark/commit/c3e4456788e4f6a10d07f5ff47eb4d6a8d19f543).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151583032
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150297106
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150624419
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150339352
  
    **[Test build #44168 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44168/consoleFull)** for PR 9214 at commit [`550e198`](https://github.com/apache/spark/commit/550e1983ee170928e816642101b5171b517cbd4c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150669422
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150638330
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153966914
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150626674
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150668648
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153823674
  
    Build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153832884
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153965282
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153830435
  
    **[Test build #45038 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45038/consoleFull)** for PR 9214 at commit [`5c8b247`](https://github.com/apache/spark/commit/5c8b24741982a2e14f5bb2f9e65625ee7364ef01).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151349668
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155654822
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153858346
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45040/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153831931
  
    **[Test build #45040 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45040/consoleFull)** for PR 9214 at commit [`4d66df1`](https://github.com/apache/spark/commit/4d66df1ac9381cdecb5b0a045046e048c5333a87).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153964949
  
    Hi @JoshRosen
    
    > What's worse, I think that this locking scheme would have to coordinate across processes, since we'd need to make sure that the external shuffle service acquires the proper read locks.
    
    > I've noticed that this current patch does not employ any locks for reading, but I don't think that's currently a problem:
    
    > * The only case where we would need a lock is to prevent the case where we read the sort-shuffle index file and then have the data file replaced by a concurrent write.
    > * Since sort-shuffle only creates one data file, that file will never be overwritten once created.
    > 
    > Does that logic sound right?
    
    Hmm, perhaps, I'd like to be totally clear:  (a) assuming that there is no disk error which leads to a file going completely missing, then no files ever get replaced (so the data file can't get replaced after you've opened the index file for reading).  (b) if you do consider vanishing files -- then it is possible for the data file to get overwritten if you lose your index file or your mapstatus file.  So the worst case scenario is (i) reader open the index file, read the offset (ii) index file vanishes (iii) conflicting attempts finishes, notices the missing index file, writes an index file and overwrites the data file (iv) reader opens the data file, but it reads the wrong location.  And on windows, you'll have a problem if the conflicting attempt finishes any time the index or data file is open for reading, a much wider window.
    
    With non-deterministic data, there is another more subtle issue -- even if the second attempt commits its output in between any downstream reads, you can have the downstream stage see a mixture of output for one task.  Eg. stage A has two attempts for task 1 (A.1.1 & A.1.2).  In stage B, task 1 reads A.1.1, and task 2 reads A.1.2.  Might not seem like a big deal, but I doubt it is what a user would expect, it certainly doesn't fit my mental model I started with as a user, and there may be cases where it matters.  You might have some random data generating process, then some transformations which split the data up by keys -- but where you expect some relationship to hold between your keys.  Maybe some paired sampling process or something.  If you read output from different attempts, that relationship could be violated.  (Though as I'm writing this I'm realizing that *none* of the proposed solutions would help with this.)
    
    > Concurrent writes can occur in a few ways:
    > * They might occur if we cancelled / killed a task and then launched a new attempt of the same task. This could happen because task cancellation is asynchronous.
    
    actually, we do *not* currently kill running tasks.  See (SPARK-2666)[https://issues.apache.org/jira/browse/SPARK-2666].  At one point I think there was some discussion about intentionally not canceling tasks, since they might do useful work, but I think now there is consensus they should be cancelled, it just hasn't been done.  I suppose since its just for efficiency, not correctness, it hasn't been focused on.  It *should* be easy, but scheduler changes are always more painful than they seem ...
    
    Adding task cancellation would certainly make the issue here more rare, but it could still occur.
    
    > * They used to be able to occur if there were multiple concurrent attempts for the same stage, but I believe that this cause has been ruled out by scheduler changes.
    
    not exactly.  The scheduler prevents concurrent *non-zombie* attempts for the same stage.  But since tasks aren't cancelled on zombification, you've still got concurrent attempts, with one non-zombie and an unlimited number of zombie attempts.
    
    > In sort-based shuffle, the entire map output is stored in a single file, so an individual map output can't be "partially lost."
    
    Well, you've still got the index and data file, and one is useless without the other, so I think the output *can* be partially lost.  (and with this change, there would be a mapstatus file as well.)
    
    > I have an alternative proposal:
    > ...
    > * Once the master has marked an executor as dead, prevent that executor from re-registering: the executor should kill itself instead.
    
    This sounds like a *big* change in behavior to me.  I agree that allowing executors to re-register does add complexity, and it certainly has confused me as a user in the past.  But unless there is no way around it, it seems like the type of change we'd want to make configurable till we understood the full ramifications.  And in the meantime, we'd still want to get correctness.  Eg., I think another way you can lose an executor now is if it gets overwhelmed serving too many shuffle requests -- if one of those requests times out, you get a fetch failure, executor gets removed.  Of course, killing the executor shouldn't hurt correctness, I'm just trying to point out that there might be more cases the user needs to deal with for tuning things after that kind of change.  Also, removing an executor can actually have a *huge* effect on performance -- if you run some big 5 stage pipeline for 10 hours, and you there is some transient failure in the 10th hour, you will lose some shuffle ou
 tput for each of your 5 stages and it may result in a long delay.  Even if there aren't many tasks to execute, you'll still need to run the 5 stages serially.
    
    > By the way: this argument also uncovers a bug in our logic for deciding when to enable the OutputCommitCoordinator: due to the ability for tasks to keep running on "failed/dead" executors, jobs which do not use speculation are still vulnerable to output commit races in certain rare failure cases (I'll file a JIRA to fix this).
    
    yikes, good call.  Especially since we don't actually cancel tasks in zombie stage attempts, this might not be that rare.
    
    > would the proposal that I outlined above be a sufficient fix for SPARK-8029?
    
    yes, I do think it would be sufficient.  I think you'd want to include task cancellation, SPARK-2666, if you were to kill an executor on duplicate attempts, otherwise it would be far too common.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151377803
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44399/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153965884
  
    **[Test build #45098 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45098/consoleFull)** for PR 9214 at commit [`c206fc5`](https://github.com/apache/spark/commit/c206fc59239f8d0f90aa018aa3dfb0c309a174d0).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153987620
  
    **[Test build #45100 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45100/consoleFull)** for PR 9214 at commit [`c0edff1`](https://github.com/apache/spark/commit/c0edff1cd33ed7aa74ecbde770f835ca9b51fc08).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151586554
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43953044
  
    --- Diff: core/src/test/scala/org/apache/spark/shuffle/ShuffleOutputCoordinatorSuite.scala ---
    @@ -0,0 +1,136 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.shuffle
    +
    +import java.io.{File, FileInputStream, FileOutputStream}
    +
    +import org.scalatest.BeforeAndAfterEach
    +
    +import org.apache.spark.{SparkConf, SparkFunSuite}
    +import org.apache.spark.scheduler.MapStatus
    +import org.apache.spark.serializer.{JavaSerializer, SerializerInstance}
    +import org.apache.spark.storage.BlockManagerId
    +import org.apache.spark.util.Utils
    +
    +class ShuffleOutputCoordinatorSuite extends SparkFunSuite with BeforeAndAfterEach {
    +
    +  var tempDir: File = _
    +  var mapStatusFile: File = _
    +  // use the "port" as a way to distinguish mapstatuses, just for the test
    +  def mapStatus(id: Int): MapStatus = MapStatus(BlockManagerId("1", "a.b.c", id), Array(0L, 1L))
    +  def ser: SerializerInstance = new JavaSerializer(new SparkConf()).newInstance()
    +
    +  override def beforeEach(): Unit = {
    +    tempDir = Utils.createTempDir()
    +    mapStatusFile = File.createTempFile("shuffle", ".mapstatus", tempDir)
    +    mapStatusFile.delete()
    +  }
    +
    +  override def afterEach(): Unit = {
    +    Utils.deleteRecursively(tempDir)
    +  }
    +
    +  private def writeFile(filename: String, data: Int): File = {
    +    val f = new File(tempDir, filename)
    +    val out = new FileOutputStream(f)
    +    out.write(data)
    +    out.close()
    +    f
    +  }
    +
    +  private def verifyFiles(successfulAttempt: Int): Unit = {
    +    (0 until 3).foreach { idx =>
    +      val exp = successfulAttempt* 3 + idx
    +      val file = new File(tempDir, s"d$idx")
    +      withClue(s"checking dest file $file") {
    +        assert(file.length === 1)
    +        val in = new FileInputStream(file)
    +        assert(in.read() === exp)
    +        in.close()
    +
    +      }
    +    }
    +  }
    +
    +  private def generateAttempt(attempt: Int): Seq[TmpDestShuffleFile] = {
    +    (0 until 3).map { idx =>
    +      val j = attempt * 3 + idx
    +      TmpDestShuffleFile(writeFile(s"t$j", j), new File(tempDir, s"d$idx"))
    +    }
    +  }
    +
    +  private def commit(files: Seq[TmpDestShuffleFile], id: Int): (Boolean, MapStatus) = {
    --- End diff --
    
    `id` -> `mapId`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151353522
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44392/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153965886
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45098/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150592921
  
    this isn't quite ready yet ... still working through test failures.  I think the remaining changes are to the tests, but need to work through those and then some cleanup ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150340507
  
    **[Test build #44168 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44168/consoleFull)** for PR 9214 at commit [`550e198`](https://github.com/apache/spark/commit/550e1983ee170928e816642101b5171b517cbd4c).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153987943
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153823677
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45020/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9214


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153867031
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45041/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150695485
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43670009
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
    @@ -125,6 +125,20 @@ private[spark] class DiskBlockManager(blockManager: BlockManager, conf: SparkCon
       }
     
       /**
    +   * Produces a unique block id and File suitable for storing shuffled data files, which are
    +   * uncompressed, before they are moved to their final location by the
    +   * [[org.apache.spark.shuffle.ShuffleOutputCoordinator]]
    +   */
    +  def createUncompressedTempShuffleBlock(): (TempUncompressedShuffleBlockId, File) = {
    +    var blockId = new TempUncompressedShuffleBlockId(UUID.randomUUID())
    +    while (getFile(blockId).exists()) {
    +      blockId = new TempUncompressedShuffleBlockId(UUID.randomUUID())
    +    }
    +    (blockId, getFile(blockId))
    +
    --- End diff --
    
    nit: stray empty line


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153777311
  
    **[Test build #45020 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45020/consoleFull)** for PR 9214 at commit [`cfdfd2c`](https://github.com/apache/spark/commit/cfdfd2c7aca33a40e687ed6c93859a72d8f8e280).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150669424
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44235/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150235777
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151586272
  
    **[Test build #44443 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44443/consoleFull)** for PR 9214 at commit [`e141d82`](https://github.com/apache/spark/commit/e141d82b86e48215a5cebf5e10a7cd733d75ea78).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150236398
  
    **[Test build #44149 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44149/consoleFull)** for PR 9214 at commit [`1c12ca5`](https://github.com/apache/spark/commit/1c12ca57f22c276f6dab03c45966b32967db51fa).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151353521
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153827670
  
    **[Test build #45036 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45036/consoleFull)** for PR 9214 at commit [`86f468a`](https://github.com/apache/spark/commit/86f468ab6d13fba9c0ae1fbe8289e2d05ea5bcef).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153827195
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150096052
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153866871
  
    **[Test build #45041 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45041/consoleFull)** for PR 9214 at commit [`e59df41`](https://github.com/apache/spark/commit/e59df413f206b7c727d276dca0bbd6de1180ad63).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n  * `public class JavaAssociationRulesExample `\n  * `public class JavaPrefixSpanExample `\n  * `public class JavaSimpleFPGrowth `\n  * `class StreamInterceptor implements TransportFrameDecoder.Interceptor `\n  * `public final class ChunkFetchSuccess extends ResponseWithBody `\n  * `public abstract class ResponseWithBody implements ResponseMessage `\n  * `public final class StreamFailure implements ResponseMessage `\n  * `public final class StreamRequest implements RequestMessage `\n  * `public final class StreamResponse extends ResponseWithBody `\n  * `public class TransportFrameDecoder extends ChannelInboundHandlerAdapter `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153775585
  
    thanks for the review @vanzin , I've just updated this


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153966901
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151511446
  
    **[Test build #44433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44433/consoleFull)** for PR 9214 at commit [`5bbeec3`](https://github.com/apache/spark/commit/5bbeec3f8243f9caa94096e03c88d060ad5fcf13).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150335956
  
     Build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151500358
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150335984
  
    Build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151616015
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44448/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153930717
  
    I've thought about this some more and there are a few things that I still find unclear. To recap my understanding:
    
    - We need to guard against both concurrent writes to the same shuffle file and concurrent reads and writes to the shuffle file.
    - Concurrent writes can occur in a few ways:
      - They can't / shouldn't occur via speculation, since the DAGScheduler will not schedule speculative tasks on the same executor as original tasks.
      - They might occur if we cancelled / killed a task and then launched a new attempt of the same task. This could happen because task cancellation is asynchronous.
      - They used to be able to occur if there were multiple concurrent attempts for the same stage, but I believe that this cause has been ruled out by scheduler changes.
    - A shuffle file read that is concurrent with a write to the same file can only occur if map output was lost and recomputations were triggered.
      - In sort-based shuffle, the entire map output is stored in a single file, so an individual map output can't be "partially lost."
      -  In hash-based shuffle, it would be possible for a portion of a map's output to be lost in response to a disk failure.
    
    In the discussion above, it seems like executors that re-register after being marked as dead are cited as one possible source of read/write concurrency problems, too.
    
    I have an alternative proposal:
    
    - Given that concurrent writer are not expected to happen, we should be able to add some strong assertions / checks inside of Executor to be sure of this. We can maintain a set of `(mapId, shuffleId)` pairs to track which map outputs are currently being computed and can kill the entire executor if we see a duplicate attempt being launched.
    - Since the concurrent read and write problem can only occur if map output is _partially_ lost, handle the loss of the shuffle files themselves by immediately killing the executor. The argument here is that a disk failure or filesystem problem means that we should just hard-stop the executor.
    - Once the master has marked an executor as dead, prevent that executor from re-registering: the executor should kill itself instead. There is some driver-side state associated with each executor that is cleared when an executor is marked as dead and having to reason about the re-creation of this state when re-registering an executor (which has its own internal state) adds complexity and introduces the potential for bugs.
    
      The only opposing argument that I can think of is performance concerns; my rebuttal:
    
        - If executor disconnect and re-registration is a super-common occurrence then you should increase the heartbeat timeout.
        - If increasing the heartbeat timeout is undesirable because it means that tasks are not rescheduled soon enough _and_ it's also the case that dead executors frequently return, then users should enable speculation. By the way: this argument also uncovers a bug in our logic for deciding when to enable the `OutputCommitCoordinator`: due to the ability for tasks to keep running on "failed/dead" executors, jobs which do not use speculation are still vulnerable to output commit races in certain rare failure cases (I'll file a JIRA to fix this).
    
    Summary: in response to rare failure modes, such as disk loss or concurrent writes which are never supposed to happen, we should consider hard-killing executors. It should always be safe to kill an executor; killing executors should only cause performance issues, not correctness issues, and thus we should be able to rely on the ability to kill an executor as a catch-all safety net for handling rare error scenarios or bugs.
    
    Ignoring the question of implementation complexity, would the proposal that I outlined above be a sufficient fix for SPARK-8029?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153965638
  
    **[Test build #45098 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45098/consoleFull)** for PR 9214 at commit [`c206fc5`](https://github.com/apache/spark/commit/c206fc59239f8d0f90aa018aa3dfb0c309a174d0).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150241169
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44149/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151377734
  
    **[Test build #44399 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44399/consoleFull)** for PR 9214 at commit [`4b7c71a`](https://github.com/apache/spark/commit/4b7c71a938d69be93baecb8ce320a2151b7a4658).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43079006
  
    --- Diff: core/src/test/scala/org/apache/spark/SortShuffleSuite.scala ---
    @@ -83,8 +83,8 @@ class SortShuffleSuite extends ShuffleSuite with BeforeAndAfterAll {
         shuffledRdd.count()
         // Ensure that the shuffle actually created files that will need to be cleaned up
         val filesCreatedByShuffle = getAllFiles -- filesBeforeShuffle
    -    filesCreatedByShuffle.map(_.getName) should be
    -    Set("shuffle_0_0_0.data", "shuffle_0_0_0.index")
    --- End diff --
    
    this wasn't actually testing anything earlier.  It was getting parsed as two entirely different statements, due to some wrinkle in the way the `be` scalatest construct work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155656408
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45589/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153830236
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151502468
  
    **[Test build #44432 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44432/consoleFull)** for PR 9214 at commit [`eabf978`](https://github.com/apache/spark/commit/eabf97838517d5354685f48a4e0abc5fec329df3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151358049
  
    **[Test build #44396 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44396/consoleFull)** for PR 9214 at commit [`3f5af9c`](https://github.com/apache/spark/commit/3f5af9c25fddbdb8e31c77eddc7861d27286c845).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43922417
  
    --- Diff: core/src/test/scala/org/apache/spark/SortShuffleSuite.scala ---
    @@ -83,8 +83,8 @@ class SortShuffleSuite extends ShuffleSuite with BeforeAndAfterAll {
         shuffledRdd.count()
         // Ensure that the shuffle actually created files that will need to be cleaned up
         val filesCreatedByShuffle = getAllFiles -- filesBeforeShuffle
    -    filesCreatedByShuffle.map(_.getName) should be
    -    Set("shuffle_0_0_0.data", "shuffle_0_0_0.index")
    --- End diff --
    
    :frowning:  This is why we should just ban the use of `ShouldMatchers`...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150235752
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151584092
  
    **[Test build #44451 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44451/consoleFull)** for PR 9214 at commit [`dc076b8`](https://github.com/apache/spark/commit/dc076b83f892723fe614e80b42da562b5ebf8d9f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153858343
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153830804
  
    I talked to Imran offline, and if the assumption that all attempts generate the same set of files holds, then this should be ok. It feels a little weird, because if the attempt outputs are non-deterministic, this change would be adding yet another source of non-determinism (i.e. you could mix outputs of different stage attempts in the subsequent shuffle), but if people feel the smaller diff is worth that, then so be it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153968076
  
    **[Test build #45100 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45100/consoleFull)** for PR 9214 at commit [`c0edff1`](https://github.com/apache/spark/commit/c0edff1cd33ed7aa74ecbde770f835ca9b51fc08).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150241165
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150340520
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155656406
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151508392
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150268718
  
    **[Test build #44155 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44155/consoleFull)** for PR 9214 at commit [`3d96747`](https://github.com/apache/spark/commit/3d96747cd1cc76057bc1f581eb0ba6f25a3faa52).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43078906
  
    --- Diff: core/src/test/scala/org/apache/spark/ShuffleSuite.scala ---
    @@ -88,33 +92,16 @@ abstract class ShuffleSuite extends SparkFunSuite with Matchers with LocalSparkC
       }
     
       test("zero sized blocks") {
    -    // Use a local cluster with 2 processes to make sure there are both local and remote blocks
    -    sc = new SparkContext("local-cluster[2,1,1024]", "test", conf)
    -
    -    // 201 partitions (greater than "spark.shuffle.sort.bypassMergeThreshold") from 4 keys
    -    val NUM_BLOCKS = 201
    -    val a = sc.parallelize(1 to 4, NUM_BLOCKS)
    -    val b = a.map(x => (x, x*2))
    -
         // NOTE: The default Java serializer doesn't create zero-sized blocks.
         //       So, use Kryo
    -    val c = new ShuffledRDD[Int, Int, Int](b, new HashPartitioner(NUM_BLOCKS))
    -      .setSerializer(new KryoSerializer(conf))
    -
    -    val shuffleId = c.dependencies.head.asInstanceOf[ShuffleDependency[_, _, _]].shuffleId
    -    assert(c.count === 4)
    -
    -    val blockSizes = (0 until NUM_BLOCKS).flatMap { id =>
    -      val statuses = SparkEnv.get.mapOutputTracker.getMapSizesByExecutorId(shuffleId, id)
    -      statuses.flatMap(_._2.map(_._2))
    -    }
    -    val nonEmptyBlocks = blockSizes.filter(x => x > 0)
    -
    -    // We should have at most 4 non-zero sized partitions
    -    assert(nonEmptyBlocks.size <= 4)
    +    testZeroSizedBlocks(Some(new KryoSerializer(conf)))
       }
     
       test("zero sized blocks without kryo") {
    +    testZeroSizedBlocks(None)
    +  }
    +
    +  def testZeroSizedBlocks(serOpt: Option[Serializer]): Unit = {
    --- End diff --
    
    unrelated to this change, but "zero sized blocks" and "zero sized blocks without kryo" were almost identical so I made a helper,


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150695348
  
    **[Test build #44251 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44251/consoleFull)** for PR 9214 at commit [`2089e12`](https://github.com/apache/spark/commit/2089e12d93ff00f0d18e0ec3cbf23322a6974831).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151616014
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151363088
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151579574
  
    **[Test build #44448 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44448/consoleFull)** for PR 9214 at commit [`4df7955`](https://github.com/apache/spark/commit/4df7955db9c46b1549b3c0f4e238f5be7970c337).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151209206
  
    Maybe just go for version 2) above then, it seems like the simplest one.
    
    Regarding re-engineering vs not, the problem is that if you're trying to do a bug fix, you should introduce the least complexity possible. With fault tolerance in particular it's possible to imagine lots of conditions that don't really happen. For example, what if network messages get corrupted? What if DRAM gets corrupted? You just need to pick a failure model (e.g. do we trust the filesystem to be reliable or not) that fits your observations and make sure things are correct within that model.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150650304
  
    **[Test build #44229 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44229/consoleFull)** for PR 9214 at commit [`4a19702`](https://github.com/apache/spark/commit/4a19702bf1de0af003c2cbc58bf2ec61c79d17b9).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150267827
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43078843
  
    --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
    @@ -639,7 +639,6 @@ private[spark] class ExternalSorter[K, V, C](
        * called by the SortShuffleWriter.
        *
        * @param blockId block ID to write to. The index file will be blockId.name + ".index".
    -   * @param context a TaskContext for a running Spark task, for us to update shuffle metrics.
    --- End diff --
    
    this got removed in [SPARK-10984](https://github.com/apache/spark/commit/85e654c5ec87e666a8845bfd77185c1ea57b268a), just cleaning up the comment


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151550216
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153830284
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153867025
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150667835
  
    @JoshRosen @rxin I'm going to see if I can do a little cleanup still, but the functionality is there now.  However, I did realize there are some open questions in the logic, particularly wrt to non-deterministic data.
    
    (a) its possible for shuffle files generated in one attempt to not be generated in another attempt.  Eg., if the data is empty, a sort based shuffle may not generate any output file.  Similarly, the hash based shuffle may not generate a file if the output for a particular (map,reduce) pair is empty.  Furthermore, based on the [test case for SPARK-4085](https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/ShuffleSuite.scala#L266) if one shuffle output file is missing, we should consider the entire shuffle output missing and regenerate.  You could imagine some degenerate cases, eg., attempt 1 creates outputs a & b, attempt 2 creates outputs b & c, and so attempt 2 overwrites attempt 1, even though its not really the first attempt.
    
    (b) what should the MapStatus output of the uncommitted attempt be?  With deterministic data it doesn't *really* matter, at least to fix SPARK-8029.  The output sizes will be approximately the same and that is really good enough.  But with non-determinstic data, its possible that attempt1 gets committed, but then attempt2 creates a MapStatus with some zero-sized blocks where attempt1 had non-zero sized blocks.  The map status from attempt2 is used, so the shuffle-fetch side decides to completely skip reading those blocks.
    
    Some possible solutions are:
    
    1) undefined behavior on non-deterministic data
    2) Shuffle writers always create the same output files, even if they are zero-length.  This could really hurt the performance of the HashShuffle with lots of partitions, but then again, its already unusable with lots of partitions so it probably doesn't matter.  I assume the effect on sort shuffle is negligible.  And then add some new marker MapStatus which gets returned when the shuffle output is not committed.
    3) maybe the test for SPARK-4085 could be changed, and then we can change `ShuffleOutputCoordinator` so that it considers the destination pre-existing if *any* of the files are already there.  You still can have some weird cases where attempt 1 creates outputs a & b, and attempt 2 creates c & d, but it doesn't matter if you commit files c & d also.  You'd need the same marker MapStatus for the non-committed attempts.
    
    I assume (1) is not an option.  I'll implement (2) but wanted to see if you have any thoughts.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155655292
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151358056
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151363072
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155672852
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151623653
  
    **[Test build #44451 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44451/consoleFull)** for PR 9214 at commit [`dc076b8`](https://github.com/apache/spark/commit/dc076b83f892723fe614e80b42da562b5ebf8d9f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43665840
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/ShuffleOutputCoordinator.scala ---
    @@ -0,0 +1,105 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.shuffle
    +
    +import java.io.{FileOutputStream, FileInputStream, File}
    +
    +import com.google.common.annotations.VisibleForTesting
    +
    +import org.apache.spark.storage.ShuffleMapStatusBlockId
    +import org.apache.spark.{SparkEnv, Logging}
    +import org.apache.spark.scheduler.MapStatus
    +import org.apache.spark.serializer.SerializerInstance
    +
    +/**
    + * Ensures that on each executor, there are no conflicting writes to the same shuffle files.  It
    + * implements "first write wins", by atomically moving all shuffle files into their final location,
    + * only if the files did not already exist. See SPARK-8029
    + */
    +private[spark] object ShuffleOutputCoordinator extends Logging {
    +
    +  /**
    +   * If any of the destination files do not exist, then move all of the temporary files to their
    +   * destinations, and return (true, the given MapStatus).  If all destination files exist, then
    +   * delete all temporary files, and return (false, the MapStatus from previously committed shuffle
    +   * output).
    +   *
    +   * Note that this will write to all destination files.  If the tmp file is missing, then a
    +   * zero-length destination file will be created.  This is so the ShuffleOutputCoordinator can work
    +   * even when there is a non-determinstic data, where the output exists in one attempt, but is
    +   * empty in another attempt.
    +   *
    +   * @param tmpToDest  Seq of (temporary, destination) file pairs
    +   * @param mapStatus the [[MapStatus]] for the output already written to the the temporary files
    +   * @return pair of: (1) true iff the set of temporary files was moved to the destination and (2)
    +   *         the MapStatus of the committed attempt.
    +   *
    --- End diff --
    
    nit: stray empty line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150654523
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44231/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150650374
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44229/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150640491
  
    **[Test build #44235 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44235/consoleFull)** for PR 9214 at commit [`4145651`](https://github.com/apache/spark/commit/4145651724eb99fb440cc8509df154d8f8095b47).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151583060
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150654521
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153827253
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151355724
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151550219
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44432/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43664109
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java ---
    @@ -121,13 +125,22 @@ public BypassMergeSortShuffleWriter(
       }
     
       @Override
    -  public void write(Iterator<Product2<K, V>> records) throws IOException {
    +  public Seq<Tuple2<File, File>> write(Iterator<Product2<K, V>> records) throws IOException {
         assert (partitionWriters == null);
    +    final File indexFile = shuffleBlockResolver.getIndexFile(shuffleId, mapId);
    +    final File dataFile = shuffleBlockResolver.getDataFile(shuffleId, mapId);
         if (!records.hasNext()) {
           partitionLengths = new long[numPartitions];
    -      shuffleBlockResolver.writeIndexFile(shuffleId, mapId, partitionLengths);
    +      final File tmpIndexFile = shuffleBlockResolver.writeIndexFile(shuffleId, mapId, partitionLengths);
           mapStatus = MapStatus$.MODULE$.apply(blockManager.shuffleServerId(), partitionLengths);
    -      return;
    +      // create empty data file so we always commit same set of shuffle output files, even if
    +      // data is non-deterministic
    +      final File tmpDataFile = blockManager.diskBlockManager().createTempShuffleBlock()._2();
    +      tmpDataFile.createNewFile();
    --- End diff --
    
    Check the return value? This method doesn't throw on error.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151548207
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150337320
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155143926
  
    @JoshRosen @rxin @mateiz so, what is the path forward here?  Josh is correct that this doesn't really correctly handle the case of missing files.  But meanwhile, spark is still not fault tolerant.  The way I see it, our options are:
    
    1) Prevent executors from re-registering, as Josh has suggested above.  Almost certainly out of scope for 1.6, and I am skeptical that it would even make 1.7.
    2) use this approach despite the flaw when files go missing.  It may work even if a file disappears, or it may not.  But in other cases, it is still a huge improvement.
    3) Don't even try to handle the case of missing shuffle files, the same way we won't handle corrupted shuffle files.  So the ShuffleOutputCoordinator logic would change so that it would not commit outputs if *any* output file already existed.  This would be a behavior change, specifically a reversal of SPARK-4085, but would be a consistent model of fault-tolerance.
    4) Use one shuffle output per attempt. https://github.com/apache/spark/pull/6648


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150654384
  
    **[Test build #44231 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44231/consoleFull)** for PR 9214 at commit [`89063dd`](https://github.com/apache/spark/commit/89063dd3654066e076e0e4f13250d25c414e88c3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151557972
  
    **[Test build #44433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44433/consoleFull)** for PR 9214 at commit [`5bbeec3`](https://github.com/apache/spark/commit/5bbeec3f8243f9caa94096e03c88d060ad5fcf13).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150096891
  
    **[Test build #44124 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44124/consoleFull)** for PR 9214 at commit [`c3e4456`](https://github.com/apache/spark/commit/c3e4456788e4f6a10d07f5ff47eb4d6a8d19f543).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151623842
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151357523
  
    **[Test build #44396 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44396/consoleFull)** for PR 9214 at commit [`3f5af9c`](https://github.com/apache/spark/commit/3f5af9c25fddbdb8e31c77eddc7861d27286c845).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43665022
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala ---
    @@ -72,16 +72,23 @@ private[spark] class IndexShuffleBlockResolver(conf: SparkConf) extends ShuffleB
             logWarning(s"Error deleting index ${file.getPath()}")
           }
         }
    +
    +    file = blockManager.diskBlockManager.getFile(ShuffleMapStatusBlockId(shuffleId, mapId))
    +    if (file.exists()) {
    +      if (!file.delete()) {
    --- End diff --
    
    nit: same


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153909914
  
    > I think that with "last task wins" you'd still need a lock when opening files for reading & writing to make sure you don't open one task's index file and another task's data file. (a lot of work can happen between opening the data file for writing and opening the index file for writing with the current code, but that can be changed.)
    
    What's worse, I think that this locking scheme would have to coordinate across processes, since we'd need to make sure that the external shuffle service acquires the proper read locks.
    
    I've noticed that this current patch does not employ any locks for reading, but I don't think that's currently a problem:
    
    - The only case where we would need a lock is to prevent the case where we read the sort-shuffle index file and then have the data file replaced by a concurrent write.
    - Since sort-shuffle only creates one data file, that file will never be overwritten once created.
    
    Does that logic sound right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151353519
  
    **[Test build #44392 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44392/consoleFull)** for PR 9214 at commit [`5d11eca`](https://github.com/apache/spark/commit/5d11eca843ab19fc4d1b83d50a677bd6f2b6f0d8).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155655277
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43155130
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java ---
    @@ -155,10 +170,16 @@ public void write(Iterator<Product2<K, V>> records) throws IOException {
           writer.commitAndClose();
         }
     
    -    partitionLengths =
    -      writePartitionedFile(shuffleBlockResolver.getDataFile(shuffleId, mapId));
    -    shuffleBlockResolver.writeIndexFile(shuffleId, mapId, partitionLengths);
    +    final File tmpDataFile = blockManager.diskBlockManager().createTempShuffleBlock()._2();
    +
    +    partitionLengths = writePartitionedFile(tmpDataFile);
    --- End diff --
    
    the type of tmp block used here doesn't matter, since [`writePartitionedFile`](https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java#L182) doesn't go through `blockManager.getDiskWriter`.  However, I'm a bit confused on this worked before.  The [original partition files might be compressed](https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java#L141), and those bytes just get copied to the final data file, though data files seem like they are [never compressed](https://github.com/apache/spark/blob/360ed832f5213b805ac28cf1d2828be09480f2d6/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L1165)?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153776274
  
    Build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151551083
  
    **[Test build #44443 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44443/consoleFull)** for PR 9214 at commit [`e141d82`](https://github.com/apache/spark/commit/e141d82b86e48215a5cebf5e10a7cd733d75ea78).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153776240
  
     Build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151355775
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43666862
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/ShuffleOutputCoordinator.scala ---
    @@ -0,0 +1,105 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.shuffle
    +
    +import java.io.{FileOutputStream, FileInputStream, File}
    +
    +import com.google.common.annotations.VisibleForTesting
    +
    +import org.apache.spark.storage.ShuffleMapStatusBlockId
    +import org.apache.spark.{SparkEnv, Logging}
    +import org.apache.spark.scheduler.MapStatus
    +import org.apache.spark.serializer.SerializerInstance
    +
    +/**
    + * Ensures that on each executor, there are no conflicting writes to the same shuffle files.  It
    + * implements "first write wins", by atomically moving all shuffle files into their final location,
    + * only if the files did not already exist. See SPARK-8029
    + */
    +private[spark] object ShuffleOutputCoordinator extends Logging {
    +
    +  /**
    +   * If any of the destination files do not exist, then move all of the temporary files to their
    +   * destinations, and return (true, the given MapStatus).  If all destination files exist, then
    +   * delete all temporary files, and return (false, the MapStatus from previously committed shuffle
    +   * output).
    +   *
    +   * Note that this will write to all destination files.  If the tmp file is missing, then a
    +   * zero-length destination file will be created.  This is so the ShuffleOutputCoordinator can work
    +   * even when there is a non-determinstic data, where the output exists in one attempt, but is
    +   * empty in another attempt.
    +   *
    +   * @param tmpToDest  Seq of (temporary, destination) file pairs
    +   * @param mapStatus the [[MapStatus]] for the output already written to the the temporary files
    +   * @return pair of: (1) true iff the set of temporary files was moved to the destination and (2)
    +   *         the MapStatus of the committed attempt.
    +   *
    +   */
    +  def commitOutputs(
    +      shuffleId: Int,
    +      partitionId: Int,
    +      tmpToDest: Seq[(File, File)],
    +      mapStatus: MapStatus,
    +      sparkEnv: SparkEnv): (Boolean, MapStatus) = synchronized {
    +    val mapStatusFile = sparkEnv.blockManager.diskBlockManager.getFile(
    +      ShuffleMapStatusBlockId(shuffleId, partitionId))
    +    val ser = sparkEnv.serializer.newInstance()
    +    commitOutputs(shuffleId, partitionId, tmpToDest, mapStatus, mapStatusFile, ser)
    +  }
    +
    +  @VisibleForTesting
    +  def commitOutputs(
    +      shuffleId: Int,
    +      partitionId: Int,
    +      tmpToDest: Seq[(File, File)],
    +      mapStatus: MapStatus,
    +      mapStatusFile: File,
    +      serializer: SerializerInstance): (Boolean, MapStatus) = synchronized {
    +    val destAlreadyExists = tmpToDest.forall{_._2.exists()} && mapStatusFile.exists()
    +    if (!destAlreadyExists) {
    +      tmpToDest.foreach { case (tmp, dest) =>
    +        // If *some* of the destination files exist, but not all of them, then its not clear
    +        // what to do.  There could be a task already reading from this dest file when we delete
    --- End diff --
    
    So, this justification feels a little weird to me. Let's say you have t1 and t2.
    
    * if t1 finishes and writes "n" files, and t2 finishes and writes "n + 1" files, then you'll get the output of t2
    * if instead t2 also creates "n" files, you'll get the output of t1 instead.
    
    Or am I misunderstanding something?
    
    Also, the "if a task is already reading from the file" case will probably make this whole block of code fail on Windows; I believe `File.delete()` will fail if the file is open, unlike on POSIX fses.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150297110
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44155/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153858105
  
    **[Test build #45040 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45040/consoleFull)** for PR 9214 at commit [`4d66df1`](https://github.com/apache/spark/commit/4d66df1ac9381cdecb5b0a045046e048c5333a87).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n  * `public class JavaAssociationRulesExample `\n  * `public class JavaPrefixSpanExample `\n  * `public class JavaSimpleFPGrowth `\n  * `class StreamInterceptor implements TransportFrameDecoder.Interceptor `\n  * `public final class ChunkFetchSuccess extends ResponseWithBody `\n  * `public abstract class ResponseWithBody implements ResponseMessage `\n  * `public final class StreamFailure implements ResponseMessage `\n  * `public final class StreamRequest implements RequestMessage `\n  * `public final class StreamResponse extends ResponseWithBody `\n  * `public class TransportFrameDecoder extends ChannelInboundHandlerAdapter `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150096897
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44124/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150296997
  
    **[Test build #44155 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44155/consoleFull)** for PR 9214 at commit [`3d96747`](https://github.com/apache/spark/commit/3d96747cd1cc76057bc1f581eb0ba6f25a3faa52).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153965296
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151353260
  
    **[Test build #44392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44392/consoleFull)** for PR 9214 at commit [`5d11eca`](https://github.com/apache/spark/commit/5d11eca843ab19fc4d1b83d50a677bd6f2b6f0d8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153987949
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45100/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9214#discussion_r43960203
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java ---
    @@ -155,10 +170,16 @@ public void write(Iterator<Product2<K, V>> records) throws IOException {
           writer.commitAndClose();
         }
     
    -    partitionLengths =
    -      writePartitionedFile(shuffleBlockResolver.getDataFile(shuffleId, mapId));
    -    shuffleBlockResolver.writeIndexFile(shuffleId, mapId, partitionLengths);
    +    final File tmpDataFile = blockManager.diskBlockManager().createTempShuffleBlock()._2();
    +
    +    partitionLengths = writePartitionedFile(tmpDataFile);
    --- End diff --
    
    If compression is employed, then each region of the single "partitioned" file is compressed separately. So the concatenation here should be correct.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150337350
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153965885
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150241029
  
    **[Test build #44149 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44149/consoleFull)** for PR 9214 at commit [`1c12ca5`](https://github.com/apache/spark/commit/1c12ca57f22c276f6dab03c45966b32967db51fa).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151377802
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150668921
  
    **[Test build #44251 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44251/consoleFull)** for PR 9214 at commit [`2089e12`](https://github.com/apache/spark/commit/2089e12d93ff00f0d18e0ec3cbf23322a6974831).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151500338
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150624389
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150339913
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44167/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153830446
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151363337
  
    **[Test build #44399 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44399/consoleFull)** for PR 9214 at commit [`4b7c71a`](https://github.com/apache/spark/commit/4b7c71a938d69be93baecb8ce320a2151b7a4658).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150695486
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44251/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150704989
  
    Well, as I've done some more testing and taken a closer look at the some of the corner cases, I don't think its easy to get this working.  The major problem comes from when executors are lost but they come back.  Say executor 1 generates some shuffle output, then for whatever reason the driver thinks its lost the executor.  The driver unregisters the shuffle output, then queues up some tasks to regenerate that shuffle output.  Meanwhile the executor comes back, and thanks to locality preferences its likely to get some of those same tasks again.  But the `ShuffleOutputCoordinator` will ignore the output of those tasks, since those files already exist.  If we don't send some map status back when the task completes, the driver will still think that some shuffle output needs to be generated, b/c its already removed the original shuffle output.
    
    We could send back the map status based on the new (but uncommitted) shuffle output, but then it could be wrong about the zero-sized blocks if the data is non-deterministic.  We could have the `ShuffleOutputCoordinator` remember the map status of all output it commits, so that it always returns the mapstatus of the committed output -- this way the second attempt would return the map status for the committed outputs of the first attempt.  This adds complexity and memory overhead, though.
    
    And, that still wouldn't solve the problem if the shuffle output files got corrupted somehow. You'd get a fetch failure, the executor would get removed, and then it would come right back.  The same tasks would get scheduled on it, but it would never replace the shuffle output.  You'd just spin till you get your 4 failures, and the job would fail.
    
    To get around this, the driver could add some logic to prevent executors from re-registering, but to me that sounds like a much bigger change that needs to be thought through very carefully.
    
    In my experience, it is actually much more common that an executor gets removed and then comes back, rather than a node being permanently unreachable.  There is just a long GC or something, the executor gets removed, but then it re-registers w/ the master.  (And there are unit tests in "local" mode which simulate the executor getting removed, but of course its the same executor which comes right back.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150626295
  
    **[Test build #44229 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44229/consoleFull)** for PR 9214 at commit [`4a19702`](https://github.com/apache/spark/commit/4a19702bf1de0af003c2cbc58bf2ec61c79d17b9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153831566
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150096894
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150668613
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150708764
  
    Hey so I'm curious about two things here:
    1) If we just always replaced the output with a new one using a file rename, would we actually have a problem? I think that any thread that has a file open will still be reading from the old version of the file if you do a rename. You should double-check this, but I don't think it will switch mid-file. That might mean the "last task wins" strategy works.
    2) Otherwise, what I would do is store the status in a separate file, similar to the .index file we have for sort-based shuffle. There's no memory overhead and it's easy to read it back again when we're given a map task and we see that an output block for it already exists.
    
    Regarding shuffle files getting corrupted somehow, I think this is super unlikely and I haven't seen many systems try to defend against this. If this were an issue, we'd also have to worry about data cached with DISK_ONLY being corrupted, etc. I think this is considered in systems like HDFS because they store a huge amount of data for a very long time, but I don't think it's a major problem in Spark, and we can always add checksums later if we see it happen.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151558463
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153835197
  
    **[Test build #45041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45041/consoleFull)** for PR 9214 at commit [`e59df41`](https://github.com/apache/spark/commit/e59df413f206b7c727d276dca0bbd6de1180ad63).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151196831
  
    Hi @mateiz, thanks for taking a look
    
    1) Do you know if that works reliably on all platforms?  Josh had suggested that trick as well during our brainstorm earlier, but we weren't sure if we could rely on it.  I think it doesn't work on windows, though I haven't tested and might be remembering wrong.  I'm definitely not an expert on this area though, happy to defer.  I did try this and it worked on my mac, anyway: https://gist.github.com/squito/222a28f04a6517aafba2
    I think that with "last task wins" you'd still need a lock when opening files for reading & writing to make sure you don't open one task's index file and another task's data file.  (a lot of work can happen between opening the data file for writing and opening the index file for writing with the current code, but that can be changed.)
    
    2) Yeah, that is a great point.  I will change this PR to store the map status in a file, that'll be a straightforward fix.
    
    I suppose you are right that if a DISK_ONLY file gets corrupted, your entire spark app is also doomed.  But that seems unfortunate to me, not something that we want to emulate.  We already see that users are running really long spark apps, on ever increasing cluster sizes (consider https://github.com/apache/spark/pull/9246).  Maybe its very uncommon so its not worth re-engineering things just for that.  But I do still feel that even though having one output per attempt is a slightly bigger change right now, the changes are mostly plumbing, and it makes for something that is safer and much easier to reason to about.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151508411
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153823493
  
    **[Test build #45020 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45020/consoleFull)** for PR 9214 at commit [`cfdfd2c`](https://github.com/apache/spark/commit/cfdfd2c7aca33a40e687ed6c93859a72d8f8e280).
     * This patch passes all tests.
     * This patch **does not merge cleanly**.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153859461
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45036/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150267857
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153859036
  
    **[Test build #45036 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45036/consoleFull)** for PR 9214 at commit [`86f468a`](https://github.com/apache/spark/commit/86f468ab6d13fba9c0ae1fbe8289e2d05ea5bcef).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n  * `public class JavaAssociationRulesExample `\n  * `public class JavaPrefixSpanExample `\n  * `public class JavaSimpleFPGrowth `\n  * `class StreamInterceptor implements TransportFrameDecoder.Interceptor `\n  * `public final class ChunkFetchSuccess extends ResponseWithBody `\n  * `public abstract class ResponseWithBody implements ResponseMessage `\n  * `public final class StreamFailure implements ResponseMessage `\n  * `public final class StreamRequest implements RequestMessage `\n  * `public final class StreamResponse extends ResponseWithBody `\n  * `public class TransportFrameDecoder extends ChannelInboundHandlerAdapter `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155672824
  
    **[Test build #45592 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45592/consoleFull)** for PR 9214 at commit [`80e037d`](https://github.com/apache/spark/commit/80e037dcb731a918d516eb02c12440d8ba3e71a7).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n  * `public class JavaLBFGSExample `\n  * `class LDA @Since(\"1.6.0\") (`\n  * `  case class Metadata(`\n  * `      require(className == expectedClassName, s\"Error loading metadata: Expected class name\" +`\n  * `class SlidingRDDPartition[T](val idx: Int, val prev: Partition, val tail: Seq[T], val offset: Int)`\n  * `class SlidingRDD[T: ClassTag](@transient val parent: RDD[T], val windowSize: Int, val step: Int)`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151576310
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153832840
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151349717
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151558468
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44433/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151548167
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155672854
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45592/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150650371
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150339911
  
    Build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150669271
  
    **[Test build #44235 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44235/consoleFull)** for PR 9214 at commit [`4145651`](https://github.com/apache/spark/commit/4145651724eb99fb440cc8509df154d8f8095b47).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-150340521
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44168/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151576276
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155655968
  
    **[Test build #45592 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45592/consoleFull)** for PR 9214 at commit [`80e037d`](https://github.com/apache/spark/commit/80e037dcb731a918d516eb02c12440d8ba3e71a7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-153859458
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core][wip] first successful shuff...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-151615905
  
    **[Test build #44448 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44448/consoleFull)** for PR 9214 at commit [`4df7955`](https://github.com/apache/spark/commit/4df7955db9c46b1549b3c0f4e238f5be7970c337).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class ShuffleMapStatusBlockId(shuffleId: Int, mapId: Int) extends BlockId `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-8029][core] first successful shuffle ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9214#issuecomment-155654812
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org