Posted to reviews@spark.apache.org by lpiepiora <gi...@git.apache.org> on 2015/10/27 17:47:17 UTC

[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

GitHub user lpiepiora opened a pull request:

    https://github.com/apache/spark/pull/9306

    [SPARK-11353][IO] Update jets3t version to 0.9.4

    This PR updates the jets3t dependency to 0.9.4 to fix an error that is thrown when code tries to write to an S3 bucket located in Frankfurt.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lpiepiora/spark 11353-fix-write-to-s3-aws4-hmac-sha256

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9306.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9306
    
----
commit 34a28e15720f24a77e7d618a9836f552f58e62f3
Author: Lukasz Piepiora <lu...@roq.ad>
Date:   2015-10-27T16:41:55Z

    [SPARK-11353][IO] Update jets3t version to 0.9.4

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by lpiepiora <gi...@git.apache.org>.
Github user lpiepiora commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-153152500
  
    Yes, thanks - I'm able to reproduce it. Now, I'm trying to figure out what's wrong.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-152343082
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by lpiepiora <gi...@git.apache.org>.
Github user lpiepiora commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-151666795
  
    It seems that JetS3t bumped its version of HttpClient to 4.5, which conflicts with HtmlUnit, which actually tries to read private fields of HttpClientBuilder. Since the field name changed (from `sslcontext` to `sslContext`), the patch fails.
    
    I've tested it by building Spark locally, and we should be good by sticking to Spark's provided version of HttpClient, so I'll exclude this (and HttpCore) transitive dependency from the jets3t library.
    
    Besides, I've looked through the Hadoop sources and compared them against the [diff between 0.9.3 and 0.9.4](https://bitbucket.org/jmurty/jets3t/branches/compare/Release-0.9.4%0DRelease-0.9.3#diff). The files affected by the change between 0.9.3 and 0.9.4 don't seem to touch any public members used directly by Hadoop code.
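
The reflection failure described in this comment can be illustrated with a minimal Java sketch. It is not the HtmlUnit or HttpClient code: `BuilderOld`, `BuilderNew`, and `FieldRenameDemo` are hypothetical stand-ins, and the only detail taken from the thread is the rename from `sslcontext` to `sslContext`. The point is that a lookup of a private field by its exact name stops working as soon as the field is renamed, even though the public API is unchanged.

```java
import java.lang.reflect.Field;

// Hypothetical stand-ins for two releases of a builder class.
// These are NOT the real HttpClientBuilder; only the field rename mirrors the thread.
class BuilderOld { private Object sslcontext = "ctx"; }
class BuilderNew { private Object sslContext = "ctx"; }

class FieldRenameDemo {
    // Reads a private field by its exact name, as a library that reaches
    // into another library's internals might do.
    static Object readPrivate(Object target, String fieldName)
            throws ReflectiveOperationException {
        Field f = target.getClass().getDeclaredField(fieldName);
        f.setAccessible(true);
        return f.get(target);
    }

    // Returns true when the named private field can be read from the target.
    static boolean canRead(Object target, String fieldName) {
        try {
            readPrivate(target, fieldName);
            return true;
        } catch (ReflectiveOperationException e) {
            // NoSuchFieldException lands here once the field has been renamed.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canRead(new BuilderOld(), "sslcontext")); // old layout
        System.out.println(canRead(new BuilderNew(), "sslcontext")); // after the rename
    }
}
```

Because the lookup is by string name, the compiler cannot catch the mismatch; it only surfaces at runtime, which is consistent with the failure showing up in a test suite rather than at build time.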




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-152310270
  
    Jenkins, retest this please.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-151576358
  
    LGTM too. HADOOP-9623 updated the jets3t version to 0.9.0 and it went into Hadoop 2.3.0, so Hadoop 2.3.0 or later should be fine. The only noteworthy thing from the jets3t release notes was:
    `NOTE: Anyone who has implemented their own JetS3t service implemented the JetS3tRequestAuthorizer will need to adjust their code due to API changes.`
    A quick grep through the Spark code revealed no reference to JetS3tRequestAuthorizer, so I think we should be OK.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by lpiepiora <gi...@git.apache.org>.
Github user lpiepiora commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-163237036
  
    @steveloughran yes, that's exactly what happened to me in this PR. I wanted to fix it but, as you've said, in general this just yields more problems on multiple levels.
    
    However, s3a is not a breeze either (even in newer Hadoop 2.7+ versions), especially with Frankfurt buckets, which support only AWS Signature V4.
    
    I'll close this PR anyway, because I think this is not the right way either (even though this jets3t update was a minor one, it upgraded transitive dependencies, which yielded multiple issues).




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-163253041
  
    > However s3a is not a breeze either (even in newer Hadoop 2.7+ versions), especially with Frankfurt buckets, which support only AWS Signature V4.
    
    Really? I thought that worked. I know HADOOP-12537 mentioned it, but I didn't think STS credentials were mandatory. As usual: file a JIRA.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-151566275
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-151573372
  
    I think this is OK since it's a maintenance release, but are there any other changes across the two releases that might be an issue?




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-151574080
  
    **[Test build #1958 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1958/consoleFull)** for PR 9306 at commit [`34a28e1`](https://github.com/apache/spark/commit/34a28e15720f24a77e7d618a9836f552f58e62f3).




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by Kinghack <gi...@git.apache.org>.
Github user Kinghack commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-208981407
  
    Since the latest version of jets3t does not build into the latest Spark, is it currently possible to access S3 files from regions that support only AWS4-HMAC-SHA256?




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-152343026
  
    **[Test build #44634 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44634/consoleFull)** for PR 9306 at commit [`e1c9c09`](https://github.com/apache/spark/commit/e1c9c093b8b60caa1f66012dcce9b9c652ddabf4).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-152311050
  
    Merged build started.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-152521127
  
    **[Test build #1961 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1961/consoleFull)** for PR 9306 at commit [`e1c9c09`](https://github.com/apache/spark/commit/e1c9c093b8b60caa1f66012dcce9b9c652ddabf4).




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by lpiepiora <gi...@git.apache.org>.
Github user lpiepiora commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-151756584
  
    Jenkins, retest this please




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-151614467
  
    **[Test build #1958 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1958/consoleFull)** for PR 9306 at commit [`34a28e1`](https://github.com/apache/spark/commit/34a28e15720f24a77e7d618a9836f552f58e62f3).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-152554955
  
    **[Test build #1961 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1961/consoleFull)** for PR 9306 at commit [`e1c9c09`](https://github.com/apache/spark/commit/e1c9c093b8b60caa1f66012dcce9b9c652ddabf4).
     * This patch **fails Spark unit tests**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-152343083
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44634/
    Test FAILed.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-152310983
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-152589453
  
    @lpiepiora OK, I believe Jenkins: this is a real failure. It is the `UISeleniumSuite`. Something about the output of a UI has changed, possibly an error. Are you able to reproduce that?




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by lpiepiora <gi...@git.apache.org>.
Github user lpiepiora closed the pull request at:

    https://github.com/apache/spark/pull/9306




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-152312434
  
    **[Test build #44634 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44634/consoleFull)** for PR 9306 at commit [`e1c9c09`](https://github.com/apache/spark/commit/e1c9c093b8b60caa1f66012dcce9b9c652ddabf4).




[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the pull request:

    https://github.com/apache/spark/pull/9306#issuecomment-163222684
  
    Moving Hadoop to jets3t 0.9.0 [HADOOP-9623](https://issues.apache.org/jira/browse/HADOOP-9623) was what could be described as "an accidental disaster"; the patch swallowed exceptions "which should never happen", resulting in [HADOOP-10589](https://issues.apache.org/jira/browse/HADOOP-10589): a seek(0) on a 0-byte file NPE-ing. (Trivia: it was fixed by probably the only piece of co-recursive code in core Hadoop.)
    
    One issue with 0.9.0 is that the `close()` call on an input stream reads _all remaining bytes on the resource_ [HADOOP-12376](https://issues.apache.org/jira/browse/HADOOP-12376). This hurts; moving up to 0.9.4 may fix it. From the Hadoop core perspective, the move to 0.9.0 broke enough things that we are scared to go near the s3n code again; all future work is in s3a.
    
    To summarise: this may break s3n if not shaded, but you should be encouraging people to use s3a on Hadoop 2.7+ anyway.
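
The cost of a `close()` that reads all remaining bytes, as mentioned in this comment, can be shown with a small Java sketch. It is an illustration only and is not jets3t or Hadoop code: `DrainOnCloseStream` and `DrainDemo` are hypothetical names, and an in-memory byte array stands in for a remote object. The sketch assumes the caller wants only a few leading bytes and then closes the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical wrapper: close() drains whatever is left in the underlying
// stream before closing it, and records how many bytes that took.
class DrainOnCloseStream extends FilterInputStream {
    long drainedOnClose = 0;

    DrainOnCloseStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() throws IOException {
        byte[] buf = new byte[8192];
        int n;
        // Read to end-of-stream; for a remote object this would pull the
        // rest of the payload over the network.
        while ((n = in.read(buf)) != -1) {
            drainedOnClose += n;
        }
        super.close();
    }
}

class DrainDemo {
    // Reads bytesWanted bytes from an objectSize-byte source, closes the
    // stream, and returns how many bytes close() had to drain.
    static long bytesDrained(int objectSize, int bytesWanted) {
        try {
            DrainOnCloseStream s =
                new DrainOnCloseStream(new ByteArrayInputStream(new byte[objectSize]));
            s.read(new byte[bytesWanted]);
            s.close();
            return s.drainedOnClose;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Reading 10 bytes of a 1,000,000-byte source, then closing.
        System.out.println(bytesDrained(1_000_000, 10));
    }
}
```

With this pattern the work done by `close()` grows with the size of the unread remainder, not with the amount the caller actually consumed, which is why it hurts for readers that open large objects and read only a small part.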

