You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by srowen <gi...@git.apache.org> on 2018/08/11 21:42:31 UTC

[GitHub] spark pull request #22081: [SPARK-23654][BUILD] remove jets3t as a dependenc...

GitHub user srowen opened a pull request:

    https://github.com/apache/spark/pull/22081

    [SPARK-23654][BUILD] remove jets3t as a dependency of spark

    ## What changes were proposed in this pull request?
    
    Remove jets3t dependency, and bouncy castle which it brings in; update licenses and deps
    Note this just takes over https://github.com/apache/spark/pull/21146 
    
    ## How was this patch tested?
    
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srowen/spark SPARK-23654

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22081.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22081
    
----
commit a222cb84fa7317e39e134c8233f3bebe596bf088
Author: Sean Owen <sr...@...>
Date:   2018-08-11T21:41:38Z

    Remove jets3t dependency, and bouncy castle which it brings in; update licenses and deps

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    CC @steveloughran 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    **[Test build #94849 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94849/testReport)** for PR 22081 at commit [`d0334c1`](https://github.com/apache/spark/commit/d0334c18aa9081715e97cb705c75f9250300df9d).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    **[Test build #4243 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4243/testReport)** for PR 22081 at commit [`a222cb8`](https://github.com/apache/spark/commit/a222cb84fa7317e39e134c8233f3bebe596bf088).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22081: [SPARK-23654][BUILD] remove jets3t as a dependenc...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22081#discussion_r209454764
  
    --- Diff: pom.xml ---
    @@ -984,24 +987,15 @@
               </exclusion>
             </exclusions>
           </dependency>
    -      <!-- See SPARK-1556 for info on this dependency: -->
    +      <!-- See SPARK-23654 for info on this dependency;
    +      It is used to keep javax.activation at v1.1.1 after dropping
    +      jets3t as a dependency.
    +       -->
           <dependency>
    -        <groupId>net.java.dev.jets3t</groupId>
    -        <artifactId>jets3t</artifactId>
    -        <version>${jets3t.version}</version>
    +        <groupId>javax.activation</groupId>
    +        <artifactId>activation</artifactId>
    +        <version>1.1.1</version>
    --- End diff --
    
    Yeah really the change is to remove jets3t; I think Steve also thought it necessary to add back javax.activation that it brought in but wasn't otherwise depended-on by Hadoop. That is I think this patches a gap in the Hadoop pom?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Merged to master


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2245/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    making a test-time option is a reasonable idea -getting the unlimited JCE on the test machines (they don't right now) would remove the need for this


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    **[Test build #4243 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4243/testReport)** for PR 22081 at commit [`a222cb8`](https://github.com/apache/spark/commit/a222cb84fa7317e39e134c8233f3bebe596bf088).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    No, the SDKs dont pull in bouncy-castle. Checked via mvnrepo
    
    * [core sdk](http://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-core/1.11.271) pulls in jackson & httpclient; historically v. fussy about httpclient versions
    * [kinesis SDK](http://mvnrepository.com/artifact/com.amazonaws/amazon-kinesis-client/1.8.10): adds protobuf-2.6, guava 18.0, others.
    
    I've checked with Shane: the jenkins systems do not have the unlimited javax crypto in, so suspect that bouncy-castle is just needed for testing


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Hm @steveloughran looks like the Kinesis tests fail reliably. That makes me suspicious that jets3t is needed, given that's the AWS dependency here. WDYT?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Makes sense, but then I wonder how the tests work? we need a test dependency on it?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94632/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22081: [SPARK-23654][BUILD] remove jets3t as a dependenc...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22081#discussion_r209707448
  
    --- Diff: pom.xml ---
    @@ -984,24 +987,15 @@
               </exclusion>
             </exclusions>
           </dependency>
    -      <!-- See SPARK-1556 for info on this dependency: -->
    +      <!-- See SPARK-23654 for info on this dependency;
    +      It is used to keep javax.activation at v1.1.1 after dropping
    +      jets3t as a dependency.
    +       -->
           <dependency>
    -        <groupId>net.java.dev.jets3t</groupId>
    -        <artifactId>jets3t</artifactId>
    -        <version>${jets3t.version}</version>
    +        <groupId>javax.activation</groupId>
    +        <artifactId>activation</artifactId>
    +        <version>1.1.1</version>
    --- End diff --
    
    the reason for the activation is that it still comes in from somewhere and the version pulled in drops from 1.1.1 to 1.1; meaning it'd be an accidental downgrade of a JAR. I don't know exactly what uses javax.activation: it is one of those historical artifacts whose main role, mapping mime types, is potentially used somewhere important. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    I've just pushed up my PR which is ~ in sync with this one; I'll close that one now and this can be the one to use. 
    
    Assume: kinesis uses bouncy castle somewhere.  There's some hints in the AWS docs
    
    [Encrypt and Decrypt Amazon Kinesis Records Using AWS KMS](https://aws.amazon.com/blogs/big-data/encrypt-and-decrypt-amazon-kinesis-records-using-aws-kms/) covers end-to-end encryption of Kinesis records. For this you need the AWS encryption SDK, whose docs [say you need bouncy castle](https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/java.html).
    
    And it looks like the AWS encryption SDK does explicitly [depend on bouncy castle](http://mvnrepository.com/artifact/com.amazonaws/aws-encryption-sdk-java/1.3.5).
    
    Imagine if *somehow* the removal of bouncy castle as a java crypto provider was stopping that round trip working with some of the encrypt/decrypt not happening. In which case adding bouncy castle should fix things. It worked before because jets3t in spark-core added bouncy castle, and the last bouncy-castle version update made it in sync with kinesis (and broke jets3t, but nobody has noticed...)
    
    But
    * There's no refs to javax.crypto, the aws crypto libs or calls to the class `KinesisEncryptionUtils`referenced in the blog post in the spark kinesis module (it's not in the latest SDKs either(
    * There's no build-time dependency on the aws-sdk encryption, which would transitively pull in the bouncy castle stuff.
    * Looking through the aws-sdk-bundle: no refs to javax.crypto in the kinesis code; encryption refs limited to the PUT request where you can request server-side encryption with a given KMS key. 
    * Nor is there any `com.aws.encryptionsdk` in that bundle, or shaded bouncy castle (which is good, as otherwise I'd have to deal with the fact that some ASF projects were shipping a shaded version of it unknowingly)
    
    It could just be a strong java crypto provided is needed, and in the absence of the unlimited java crypto JAR in the JDK lib dir (where it's needed for kerberos to work), bouncy-castle needs to be on the CP.
    
    What to do?
    
    1. you can remove jets3t independent of the bouncy castle changes, because Kinesis isn't going to be using jets3t. The aws-s3 module significantly supercedes the jets3t client's functionality, and is the only one you'd expect the other parts of the AWS SDK to pick up. 
    1. the bouncy-castle dependency could be upgraded to a later version in the kinesis module(s) alone, and explicitly added to kinesis-asl.
    1. Someone needs to do some experiments with what happens to the test suite with/without the full JCE and bouncy castle, maybe including more details on whats not matching up in the round trip tests
    1 Maybe including some new test which somehow explores what encryption algorithms/keys you get with/without the BC  and JCE-unlimited JARs
    
    
    
    
    
    
    



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    **[Test build #94632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94632/testReport)** for PR 22081 at commit [`a222cb8`](https://github.com/apache/spark/commit/a222cb84fa7317e39e134c8233f3bebe596bf088).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    **[Test build #94632 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94632/testReport)** for PR 22081 at commit [`a222cb8`](https://github.com/apache/spark/commit/a222cb84fa7317e39e134c8233f3bebe596bf088).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    **[Test build #94849 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94849/testReport)** for PR 22081 at commit [`d0334c1`](https://github.com/apache/spark/commit/d0334c18aa9081715e97cb705c75f9250300df9d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Thanks. Two less JARs on the CP to keep up to date —what more can anyone want?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94849/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    I see. I guess I'm trying to figure out whether it's reasonable or not to pull in bouncy castle -- just in the Kinesis module I guess -- on behalf of the user then?
    
    That's the default conservative thing to do, but having just gone through the ECCN process, I realize it's not trivial for us to redistribute bouncy castle. Worth trimming if it's really about all the same to users.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22081: [SPARK-23654][BUILD] remove jets3t as a dependenc...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22081#discussion_r209443568
  
    --- Diff: pom.xml ---
    @@ -984,24 +987,15 @@
               </exclusion>
             </exclusions>
           </dependency>
    -      <!-- See SPARK-1556 for info on this dependency: -->
    +      <!-- See SPARK-23654 for info on this dependency;
    +      It is used to keep javax.activation at v1.1.1 after dropping
    +      jets3t as a dependency.
    +       -->
           <dependency>
    -        <groupId>net.java.dev.jets3t</groupId>
    -        <artifactId>jets3t</artifactId>
    -        <version>${jets3t.version}</version>
    +        <groupId>javax.activation</groupId>
    +        <artifactId>activation</artifactId>
    +        <version>1.1.1</version>
    --- End diff --
    
    this changes from `jets3t.version>0.9.4`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2090/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22081: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    Hm, I wonder, does the (newer) Kinesis SDK pull in bouncy castle? that's fine if so, that would make sense. If https://github.com/apache/spark/pull/22099 works, then we'll see if this then passes.
    
    I suppose it's possible it's not actually related, and the Kinesis bit hasn't worked for a while because it's just never changed and so not tested, and needed an SDK update for some other reason. Who knows.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22081: [SPARK-23654][BUILD] remove jets3t as a dependenc...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22081


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org