Posted to reviews@spark.apache.org by JoshRosen <gi...@git.apache.org> on 2014/10/07 01:33:40 UTC

[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/2684

    [SPARK-2546] Clone JobConf for each task (branch-1.0 / 1.1 backport)

    This patch attempts to fix SPARK-2546 in `branch-1.0` and `branch-1.1`.  The underlying problem is that thread-safety issues in Hadoop Configuration objects may cause Spark tasks to get stuck in infinite loops.  The approach taken here is to clone a new copy of the JobConf for each task rather than sharing a single copy between tasks.  Note that there are still Configuration thread-safety issues that may affect the driver, but these seem much less likely to occur in practice and will be more complex to fix (see discussion on the SPARK-2546 ticket).
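
    The per-task cloning idea described above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: `java.util.Properties` stands in for a Hadoop configuration object, and `PerTaskClone`, `cloneFor`, `runTasks`, and `cloneLock` are hypothetical names invented for the example.

```scala
import java.util.Properties
import java.util.concurrent.{Callable, Executors, TimeUnit}

object PerTaskClone {
  // Hypothetical lock guarding the copy step, so that only one thread
  // at a time reads the shared object while it is being cloned.
  private val cloneLock = new Object

  // Take a private copy of the shared settings for one task.
  def cloneFor(shared: Properties): Properties = cloneLock.synchronized {
    shared.clone().asInstanceOf[Properties]
  }

  // Run numTasks tasks on a small thread pool; each task mutates only its own copy.
  def runTasks(shared: Properties, numTasks: Int): Seq[String] = {
    val pool = Executors.newFixedThreadPool(4)
    try {
      val futures = (0 until numTasks).map { i =>
        pool.submit(new Callable[String] {
          override def call(): String = {
            val local = cloneFor(shared)          // private copy for this task
            local.setProperty("task.id", i.toString)
            local.getProperty("task.id")
          }
        })
      }
      futures.map(_.get(10, TimeUnit.SECONDS))
    } finally {
      pool.shutdown()
    }
  }
}
```

    Because every task writes only to its own copy, the shared object is never mutated concurrently; the cost is one copy per task.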

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark jobconf-fix-backport

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2684.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2684
    
----
commit dd25697c490e40f644b544c975afff49e107ace6
Author: Josh Rosen <jo...@apache.org>
Date:   2014-10-06T23:26:29Z

    [SPARK-2546] [1.0 / 1.1 backport] Clone JobConf for each task.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58278639
  
    Looks like Jenkins is being flaky today, since we've been seeing a lot of these "git fetch failed" errors.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-59407880
  
    Does anyone have additional feedback on this?  I have a test at https://gist.github.com/JoshRosen/287630864ac9803fe59f that demonstrates a (different) set of Configuration thread-safety symptoms that this patch fixes.
    
    When merging this, please also cherry-pick into `master` and `branch-1.0` (I opened this against `branch-1.1` because I originally thought that we might explore a different solution for Spark 1.2+).




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58133026
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/275/consoleFull) for   PR 2684 at commit [`dd25697`](https://github.com/apache/spark/commit/dd25697c490e40f644b544c975afff49e107ace6).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2684#discussion_r18497896
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
    @@ -132,24 +132,12 @@ class HadoopRDD[K, V](
       // Returns a JobConf that will be used on slaves to obtain input splits for Hadoop reads.
       protected def getJobConf(): JobConf = {
         val conf: Configuration = broadcastedConf.value.value
    -    if (conf.isInstanceOf[JobConf]) {
    -      // A user-broadcasted JobConf was provided to the HadoopRDD, so always use it.
    -      conf.asInstanceOf[JobConf]
    -    } else if (HadoopRDD.containsCachedMetadata(jobConfCacheKey)) {
    -      // getJobConf() has been called previously, so there is already a local cache of the JobConf
    -      // needed by this RDD.
    -      HadoopRDD.getCachedMetadata(jobConfCacheKey).asInstanceOf[JobConf]
    -    } else {
    -      // Create a JobConf that will be cached and used across this RDD's getJobConf() calls in the
    -      // local process. The local cache is accessed through HadoopRDD.putCachedMetadata().
    -      // The caching helps minimize GC, since a JobConf can contain ~10KB of temporary objects.
    -      // Synchronize to prevent ConcurrentModificationException (Spark-1097, Hadoop-10456).
    -      HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK.synchronized {
    -        val newJobConf = new JobConf(conf)
    +    HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK.synchronized {
    +      val newJobConf = new JobConf(conf)
    --- End diff --
    
    Does this actually clone the internal map? Or does it just create pointers to the supplied `conf`? If it just creates pointers it seems like it might end up having the same synchronization issues.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-59585737
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21866/consoleFull) for   PR 2684 at commit [`f14f259`](https://github.com/apache/spark/commit/f14f25981f1b922f1a8d07dfd80774a78daec368).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-59585743
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21866/
    Test PASSed.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by mingyukim <gi...@git.apache.org>.
Github user mingyukim commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2684#discussion_r18554195
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
    @@ -132,24 +132,12 @@ class HadoopRDD[K, V](
       // Returns a JobConf that will be used on slaves to obtain input splits for Hadoop reads.
       protected def getJobConf(): JobConf = {
         val conf: Configuration = broadcastedConf.value.value
    -    if (conf.isInstanceOf[JobConf]) {
    -      // A user-broadcasted JobConf was provided to the HadoopRDD, so always use it.
    -      conf.asInstanceOf[JobConf]
    -    } else if (HadoopRDD.containsCachedMetadata(jobConfCacheKey)) {
    --- End diff --
    
    jobConfCacheKey doesn't seem to be used anymore. Should that be removed?




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-59642012
  
    I'm going to merge this and cherry-pick it into all maintenance branches.  We'll probably turn on cloning by default in 1.2 and we'll be sure to clearly document this configuration option in the 1.0.3 and 1.1.1 release notes.  Thanks to everyone who helped test this!




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2684#discussion_r18498021
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
    @@ -132,24 +132,12 @@ class HadoopRDD[K, V](
       // Returns a JobConf that will be used on slaves to obtain input splits for Hadoop reads.
       protected def getJobConf(): JobConf = {
         val conf: Configuration = broadcastedConf.value.value
    -    if (conf.isInstanceOf[JobConf]) {
    -      // A user-broadcasted JobConf was provided to the HadoopRDD, so always use it.
    -      conf.asInstanceOf[JobConf]
    -    } else if (HadoopRDD.containsCachedMetadata(jobConfCacheKey)) {
    -      // getJobConf() has been called previously, so there is already a local cache of the JobConf
    -      // needed by this RDD.
    -      HadoopRDD.getCachedMetadata(jobConfCacheKey).asInstanceOf[JobConf]
    -    } else {
    -      // Create a JobConf that will be cached and used across this RDD's getJobConf() calls in the
    -      // local process. The local cache is accessed through HadoopRDD.putCachedMetadata().
    -      // The caching helps minimize GC, since a JobConf can contain ~10KB of temporary objects.
    -      // Synchronize to prevent ConcurrentModificationException (Spark-1097, Hadoop-10456).
    -      HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK.synchronized {
    -        val newJobConf = new JobConf(conf)
    +    HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK.synchronized {
    +      val newJobConf = new JobConf(conf)
    --- End diff --
    
    JobConf seems to implement this constructor by calling the superclass's constructor.
    
    Take a look at the `git blame` for Configuration:
    
    https://github.com/apache/hadoop/blame/release-2.5.1/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L662
    
    It looks like this constructor performs proper copying and has done so for a while (since 2009 or 2010).
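
    To make the "clone versus pointers" distinction concrete, here is a small sketch. It does not use Hadoop: `ToyConf` is a hypothetical stand-in backed by `java.util.Properties`, and its `copy()` method is only assumed to behave like the copying constructor discussed above.

```scala
import java.util.Properties

object CopyVsAlias {
  // Hypothetical stand-in for a configuration class; this is not Hadoop's Configuration.
  final class ToyConf(private val props: Properties = new Properties()) {
    // Copy: clone the backing table so later writes do not cross over.
    def copy(): ToyConf = new ToyConf(props.clone().asInstanceOf[Properties])

    def set(key: String, value: String): Unit = props.setProperty(key, value)

    def get(key: String): Option[String] = Option(props.getProperty(key))
  }
}
```

    A second reference to the same `ToyConf` (an alias) sees every write made through the first, while the result of `copy()` does not. Only the second case avoids sharing mutable state between tasks.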




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58128361
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/275/consoleFull) for   PR 2684 at commit [`dd25697`](https://github.com/apache/spark/commit/dd25697c490e40f644b544c975afff49e107ace6).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58577460
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21545/consoleFull) for   PR 2684 at commit [`b562451`](https://github.com/apache/spark/commit/b562451f142078321a102fef4f48190acc822e03).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by frydawg524 <gi...@git.apache.org>.
Github user frydawg524 commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-59568666
  
    We were able to verify this fix on 1.0.2 by running a test benchmark job 6 times before and after the patch. 
    3/6 tests failed pre-patch and 0/6 failed post-patch. 
    
    We verified by checking the number of output part files for each job.
    For jobs that failed, when we hit the deadlock, we saw speculation kill and re-attempt the task.
    After doing this N times, the task failed and threw `java.io.IOException: Failed to save output of task`
    Ultimately, this led to the job missing some indeterminate number of the output part files (the ones that failed to commit).
     
    After patching, we verified that for our benchmark jobs none of the part files were missing. 
    
    During benchmarking, we noticed an 8.69% decrease in performance as measured by the average job time from 5 runs, which is at acceptable levels for us.
    
    Let me know if you need any more details. 
    
    Thanks Josh! 




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2684




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by frydawg524 <gi...@git.apache.org>.
Github user frydawg524 commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-59578918
  
    @JoshRosen, 
    Awesome! Thanks for helping out with this. I'll make sure that this gets broadcasted to my team. 
    
    Zach




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by ash211 <gi...@git.apache.org>.
Github user ash211 commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-59581783
  
    More flavor on the perf numbers: we ran 6 jobs in a row before and after (starting up a new driver for each job), discarded the first run, and took the average of the remaining five.
    
    Pre-patch the times were ~1m50s, post-patch they were ~2m1s.
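
    As a rough cross-check of these rounded times, the sketch below computes the relative change two ways. The helper names (`PerfDelta`, `seconds`, `timeIncrease`, `throughputDecrease`) are made up for this example, and since the inputs are approximate, neither result is expected to match the 8.69% figure reported elsewhere in this thread, which presumably came from the unrounded averages.

```scala
object PerfDelta {
  // Convert a minutes/seconds reading into seconds.
  def seconds(min: Int, sec: Int): Int = min * 60 + sec

  // Increase in job time relative to the pre-patch time.
  def timeIncrease(before: Double, after: Double): Double =
    (after - before) / before

  // Decrease in throughput (jobs per unit time) relative to pre-patch.
  def throughputDecrease(before: Double, after: Double): Double =
    1.0 - before / after
}
```

    With 110 s before and 121 s after, the job time goes up by 10% of the baseline, while throughput drops by 11/121 (about 9.1%). Which of the two a "decrease in performance" refers to is not stated in the thread.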




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58566208
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21545/consoleFull) for   PR 2684 at commit [`b562451`](https://github.com/apache/spark/commit/b562451f142078321a102fef4f48190acc822e03).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58116938
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21352/consoleFull) for   PR 2684 at commit [`dd25697`](https://github.com/apache/spark/commit/dd25697c490e40f644b544c975afff49e107ace6).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-59578207
  
    @frydawg524 Thanks for testing this out!  I'm glad to hear that it solves the bug.
    
    I just pushed a new commit which adds a configuration option (`spark.hadoop.cloneConf`) for controlling whether to clone the configuration (as in the patch you tested) or share a single configuration object across all tasks (the old code).  The reasoning for this is that releasing 1.1.1 and 1.0.3 patches that cause measurable performance regressions will upset users who weren't affected by this issue.  In 1.2, we may revisit this by seeing if we can find ways to make the cloning process faster.
    
    I also plan to open an upstream ticket with Hadoop.  That won't solve the problem for Spark users who might be stuck using older Hadoop versions (so we still need our own workaround), but it would be nice to see this eventually get fixed upstream.
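
    The clone-or-share switch described above might look roughly like this. Only the key name `spark.hadoop.cloneConf` comes from this thread; everything else is an assumption made for illustration: a plain `Map` stands in for Spark's settings, `java.util.Properties` stands in for the Hadoop configuration, the option is assumed to be off unless set to `true`, and `CloneConfFlag`, `shouldClone`, and `confForTask` are hypothetical names.

```scala
import java.util.Properties

object CloneConfFlag {
  // Option name taken from the comment above.
  val CloneKey = "spark.hadoop.cloneConf"

  // Assumption: cloning is off unless the option is explicitly "true".
  def shouldClone(settings: Map[String, String]): Boolean =
    settings.get(CloneKey).exists(_.trim.equalsIgnoreCase("true"))

  // Return a private copy when cloning is enabled, else the shared object (the old behavior).
  def confForTask(settings: Map[String, String], shared: Properties): Properties =
    if (shouldClone(settings)) {
      shared.synchronized { shared.clone().asInstanceOf[Properties] }
    } else {
      shared
    }
}
```

    Keeping the old shared-object path as the default means users who never hit the bug pay no copying cost, which matches the reasoning given for adding the option.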




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58122578
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21352/consoleFull) for   PR 2684 at commit [`dd25697`](https://github.com/apache/spark/commit/dd25697c490e40f644b544c975afff49e107ace6).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-59578047
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21866/consoleFull) for   PR 2684 at commit [`f14f259`](https://github.com/apache/spark/commit/f14f25981f1b922f1a8d07dfd80774a78daec368).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2684#discussion_r18554911
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
    @@ -132,24 +132,12 @@ class HadoopRDD[K, V](
       // Returns a JobConf that will be used on slaves to obtain input splits for Hadoop reads.
       protected def getJobConf(): JobConf = {
         val conf: Configuration = broadcastedConf.value.value
    -    if (conf.isInstanceOf[JobConf]) {
    -      // A user-broadcasted JobConf was provided to the HadoopRDD, so always use it.
    -      conf.asInstanceOf[JobConf]
    -    } else if (HadoopRDD.containsCachedMetadata(jobConfCacheKey)) {
    --- End diff --
    
    You're right; good catch.  I'll remove it.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58122584
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21352/
    Test FAILed.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58565432
  
    Jenkins, retest this please (testing the new Jenkins).




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58277013
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21419/
    Test FAILed.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58142790
  
    By the way, I checked and this patch cleanly cherry-picks into `branch-1.0`.




[GitHub] spark pull request: [SPARK-2546] Clone JobConf for each task (bran...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2684#issuecomment-58577472
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21545/
    Test PASSed.

