You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by steveloughran <gi...@git.apache.org> on 2017/05/25 15:58:00 UTC

[GitHub] spark pull request #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol...

GitHub user steveloughran opened a pull request:

    https://github.com/apache/spark/pull/18111

    [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fail meaningfully if FileOutputCommitter.getWorkPath==null

    ## What changes were proposed in this pull request?
    
    Handles the situation where a `FileOutputCommitter.getWorkPath()` returns `null` by a `require()` call and a message which explains the problem and includes the `toString` value of the committer for better diagnostics.
    
    The situation occurs if the committer being passed in is a job committer, not a task committer, that is: it was initalised with a `JobAttemptContext` not a `TaskAttemptContext`.
    
    The existing code does an  `Option(workPath.toString).getOrElse(path)` which *may* be an attempt to handle the null path case. If so, it isn't, because its the `.toString()` call which is failing. If people do think that code should be resilient to null work paths, that line could be changed. However, it may hide the underlying problem: the committer is misconfigured.
    
    It may be a rare-occurence today, but it is more likely with modified subclasses of `FileOutputCommitter`, as well as possible
    with some ongoing work of mine in Hadoop to better support commitment to cloud storage infrastructures.
    
    ## How was this patch tested?
    
    Manually. The before & after stack traces are on the JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/steveloughran/spark cloud/SPARK-20886-committer-NPE

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18111.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18111
    
----
commit 02eb7bf0ee6b81841f22e3c46d822eaebb28e85c
Author: Steve Loughran <st...@hortonworks.com>
Date:   2017-05-25T15:46:50Z

    SPARK-20886 HadoopMapReduceCommitProtocol to fail with message if FileOutputCommitter.getWorkPath==null
    Add a requirement.
    The existing code does an Option.getWorkpath.toString() which *may* be an attempt to handle the null path case. If so, it isn't, because its the .toString() which is failing.
    
    Change-Id: Idddf9813761e7008425542f96903bce12bedd978

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    It looks good to me too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to han...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81237/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to han...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    OK. LGTM. I think this anyway should issue warnings:
    
    ```
    WARN org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Output Path is null in setupJob()
    WARN org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Output Path is null in commitJob()
    ```
    
    and I assume we prefer working permissively rather than failing fast. At least, it looks that's what `FileOutputCommitter` allows even if it skips moving the output.
    
    Please make a followup if I misunderstood and anything is wrong. Will merge this one in few days. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to han...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    I believe this patch implements the original design goal: if a committer doesn't have a working path supplied by `getWorkingPath()` then it downgrades. 
    
    It might be worthwhile doing a larger fault-injecting committer at some point.  have indirectly the ability to inject faults, but its a long way from the spark codebase, when really a thin shim is all that's needed. In particular, if a fault injecting ParquetOutputCommitter would test a big chunk of the codebase whose error handling codepaths are more likely to be tested in production than by jenkins. Are people interested (separate JIRA, obviously)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    Not really. I thought about how I could do it, but essentially you do need to do things underneath the commit protocol, either in the Hadoop codebase (me) or in a test which somehow misconfigured things. It's most likely to surface in custom subclasses of `FileOutputCommitter`, so a test would probably consist of a run with a new committer set to always return null on `getWorkingPath`, verifying the changed assertion went from NPE to an IllegalArgException. 
    
    That's a lot of complexity a test for what is essentially: making explicit a requirement of the code, namely, that `getWorkPath.toString()` requires `getWorkPath` to be non-null
    
    I've included the before/after stack traces in the JIRA.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    Is there anything else which needs to be one here, or is it matter of finding the right reviewer?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18111#discussion_r118530138
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
    @@ -73,7 +73,10 @@ class HadoopMapReduceCommitProtocol(jobId: String, path: String)
     
         val stagingDir: String = committer match {
           // For FileOutputCommitter it has its own staging path called "work path".
    -      case f: FileOutputCommitter => Option(f.getWorkPath.toString).getOrElse(path)
    +      case f: FileOutputCommitter =>
    +        val workPath = f.getWorkPath
    +        require(workPath != null, s"Committer has no workpath $f")
    +        Option(workPath.toString).getOrElse(path)
    --- End diff --
    
    If this was intended to be resilient, it'd have to become `Option(workpath).getOrElse(path).toString()`, though that would actually hide the problem I'd managed to create: hand in a job committer to a task.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77373/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18111#discussion_r133751724
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
    @@ -73,7 +73,10 @@ class HadoopMapReduceCommitProtocol(jobId: String, path: String)
     
         val stagingDir: String = committer match {
           // For FileOutputCommitter it has its own staging path called "work path".
    -      case f: FileOutputCommitter => Option(f.getWorkPath.toString).getOrElse(path)
    +      case f: FileOutputCommitter =>
    +        val workPath = f.getWorkPath
    +        require(workPath != null, s"Committer has no workpath $f")
    +        Option(workPath.toString).getOrElse(path)
    --- End diff --
    
    {{workPath.toString()}} triggers an NPE, which was the reason for the stack trace.
    
    Now, what's this code trying to do? find a directory to put stuff. If the committer is a subclass of `FileOutputFormat`,  then its getWorkPath method can be called, which *should* return a non-null path. 
    
    If the requirement of the code is "get the workpath or return "path" if its null", maybe the code should really be
    
    ```scala
    Option(f.getWorkPath).getOrElse(path).toString
    ```
    That'd return the workPath if non null, falling back to the `path` variable, and then call `toString` on the returned object. I'm not sure off the top of my head if Scala type inference likes that though. It may be less elegant and more reliable to have
    
    ```scala
    val workPath = f.getWorkPath
    if (workPath != null) workPath.toString else path;
    ```
    +maybe a bonus paranoid check for workPath being "".
    
    I think that would actually get closer to the original goal of the code: always return a path, even of the committer doesn't have one
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to han...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18111#discussion_r125544559
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
    @@ -73,7 +73,10 @@ class HadoopMapReduceCommitProtocol(jobId: String, path: String)
     
         val stagingDir: String = committer match {
           // For FileOutputCommitter it has its own staging path called "work path".
    -      case f: FileOutputCommitter => Option(f.getWorkPath.toString).getOrElse(path)
    +      case f: FileOutputCommitter =>
    +        val workPath = f.getWorkPath
    +        require(workPath != null, s"Committer has no workpath $f")
    +        Option(workPath.toString).getOrElse(path)
    --- End diff --
    
    if `workPath` is not null, will `workPath.toString` return null?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18111#discussion_r131562253
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
    @@ -73,7 +73,10 @@ class HadoopMapReduceCommitProtocol(jobId: String, path: String)
     
         val stagingDir: String = committer match {
           // For FileOutputCommitter it has its own staging path called "work path".
    -      case f: FileOutputCommitter => Option(f.getWorkPath.toString).getOrElse(path)
    +      case f: FileOutputCommitter =>
    +        val workPath = f.getWorkPath
    +        require(workPath != null, s"Committer has no workpath $f")
    +        Option(workPath.toString).getOrElse(path)
    --- End diff --
    
    I wonder the answer to this question ^ actually .. Wouldn't `Option(...).getOrElse(path)` be unnecessary?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    Could you also provide a test case to trigger the issue?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to han...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to han...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    Merged to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    **[Test build #77373 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77373/testReport)** for PR 18111 at commit [`02eb7bf`](https://github.com/apache/spark/commit/02eb7bf0ee6b81841f22e3c46d822eaebb28e85c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    **[Test build #80807 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80807/testReport)** for PR 18111 at commit [`f017261`](https://github.com/apache/spark/commit/f01726145d4190d92f8cb4f707535958df671f40).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to han...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    **[Test build #81237 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81237/testReport)** for PR 18111 at commit [`f017261`](https://github.com/apache/spark/commit/f01726145d4190d92f8cb4f707535958df671f40).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18111


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    **[Test build #80807 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80807/testReport)** for PR 18111 at commit [`f017261`](https://github.com/apache/spark/commit/f01726145d4190d92f8cb4f707535958df671f40).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80807/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    **[Test build #77373 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77373/testReport)** for PR 18111 at commit [`02eb7bf`](https://github.com/apache/spark/commit/02eb7bf0ee6b81841f22e3c46d822eaebb28e85c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18111#discussion_r133761245
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
    @@ -73,7 +73,10 @@ class HadoopMapReduceCommitProtocol(jobId: String, path: String)
     
         val stagingDir: String = committer match {
           // For FileOutputCommitter it has its own staging path called "work path".
    -      case f: FileOutputCommitter => Option(f.getWorkPath.toString).getOrElse(path)
    +      case f: FileOutputCommitter =>
    +        val workPath = f.getWorkPath
    +        require(workPath != null, s"Committer has no workpath $f")
    +        Option(workPath.toString).getOrElse(path)
    --- End diff --
    
    Oh, maybe we could do this:
    
    ```scala
    Option(f.getWorkPath).map(_.toString).getOrElse(path)
    ```
    
    if I understood correctly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    @steveloughran, mind fixing few words in the title and deacription accordingly? Let me leave this to @cloud-fan for a final look.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fai...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to han...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    **[Test build #81237 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81237/testReport)** for PR 18111 at commit [`f017261`](https://github.com/apache/spark/commit/f01726145d4190d92f8cb4f707535958df671f40).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to han...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/18111
  
    thx


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org