Posted to reviews@spark.apache.org by steveloughran <gi...@git.apache.org> on 2018/08/14 06:03:25 UTC

[GitHub] spark pull request #22099: [SPARK-25111][BUILD] increment kinesis client/pro...

GitHub user steveloughran opened a pull request:

    https://github.com/apache/spark/pull/22099

    [SPARK-25111][BUILD] increment kinesis client/producer & aws-sdk versions

    ## What changes were proposed in this pull request?
    
    Increment the kinesis client, producer and transitive AWS SDK versions to a more recent release.
    
    This is to help with the move off bouncy castle in #21146 and #22081; the goal is that moving up to the new SDK will allow a JVM with unlimited JCE but without bouncy castle to work with Kinesis endpoints.
    
    Why this specific set of artifacts? It syncs up with the 1.11.271 AWS SDK used by Hadoop 3.0.3, Hadoop 3.1 and Hadoop 3.1.1; that's been stable for the uses there (S3, STS, DynamoDB).
    
    ## How was this patch tested?
    
    Running all the external/kinesis-asl tests via Maven with Java 8u121 & unlimited JCE, without bouncy castle (#21146); default endpoint of us-west-2. Without this SDK update I was getting HTTP cert validation errors; with it they went away.
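    
    For anyone repeating this setup, here is a minimal sketch of how the JVM's crypto configuration could be confirmed before running the tests. It is not part of this patch: `CryptoEnvCheck` and its method names are made up for illustration, and only standard JDK calls (`Cipher.getMaxAllowedKeyLength`, `Security.getProvider`) are used. It assumes bouncy castle registers under its usual provider name `BC`; a jar can also sit on the classpath without being registered as a provider, which this check would not see.
    
    ```java
    import java.security.NoSuchAlgorithmException;
    import java.security.Security;
    import javax.crypto.Cipher;
    
    /** Reports the crypto setup of the running JVM. Names here are illustrative only. */
    final class CryptoEnvCheck {
    
        /** Largest AES key size in bits permitted by the installed JCE policy. */
        static int maxAesKeyBits() {
            try {
                return Cipher.getMaxAllowedKeyLength("AES");
            } catch (NoSuchAlgorithmException e) {
                // AES is mandatory on every Java SE platform, so this is not expected.
                throw new IllegalStateException("AES not available", e);
            }
        }
    
        /** True when the policy permits AES-256, i.e. unlimited-strength JCE is in effect. */
        static boolean hasUnlimitedJce() {
            return maxAesKeyBits() >= 256;
        }
    
        /** True when a security provider named "BC" is registered (assumed bouncy castle name). */
        static boolean bouncyCastleRegistered() {
            return Security.getProvider("BC") != null;
        }
    
        public static void main(String[] args) {
            System.out.println("java.version           = " + System.getProperty("java.version"));
            System.out.println("max AES key bits       = " + maxAesKeyBits());
            System.out.println("unlimited JCE          = " + hasUnlimitedJce());
            System.out.println("BC provider registered = " + bouncyCastleRegistered());
        }
    }
    ```
    
    With the limited policy files the AES limit is reported as 128; with unlimited JCE it is `Integer.MAX_VALUE`.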
    
    # This PR is not ready without 
    
    * Jenkins test runs to see what it is happy with
    * more testing: repeated runs, another endpoint
    * looking at the new deprecation warnings and selectively addressing them (the AWS SDKs are pretty aggressive about deprecation, but sometimes they increase the complexity of the client code or block some codepaths off completely)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/steveloughran/spark cloud/SPARK-25111-kinesis

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22099.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22099
    
----
commit e79e5b9c0bbdf24dcc3cda30dc2c1a70d12b02aa
Author: Steve Loughran <st...@...>
Date:   2018-08-14T05:38:02Z

    [SPARK-25111] increment kinesis client/producer lib versions & aws-sdk to match.
    
    Change-Id: Ic2d12a07d273bd1b6fc4c681075070f22ed1e44c

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94776/
    Test PASSed.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    **[Test build #4246 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4246/testReport)** for PR 22099 at commit [`e79e5b9`](https://github.com/apache/spark/commit/e79e5b9c0bbdf24dcc3cda30dc2c1a70d12b02aa).


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Local Kinesis tests with `-Phadoop-3.1`, `-Phadoop-2.7` and `-Phadoop-3.1 -Dhadoop.version=3.1.1` are all working here (with bouncycastle, unlimited JCE in JVM).
    
    I'm updating the #21146 PR with this patch to see what happens with the combination in Jenkins of no bouncycastle, updated Kinesis. 
    
    The test run failure here was `org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite.offset recovery from kafka`; it's hard to see how it relates to this change.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    **[Test build #94776 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94776/testReport)** for PR 22099 at commit [`e79e5b9`](https://github.com/apache/spark/commit/e79e5b9c0bbdf24dcc3cda30dc2c1a70d12b02aa).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2166/
    Test PASSed.


---



[GitHub] spark pull request #22099: [SPARK-25111][BUILD] increment kinesis client/pro...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22099


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    **[Test build #4247 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4247/testReport)** for PR 22099 at commit [`e79e5b9`](https://github.com/apache/spark/commit/e79e5b9c0bbdf24dcc3cda30dc2c1a70d12b02aa).


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94728/
    Test FAILed.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    thanks


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    **[Test build #94776 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94776/testReport)** for PR 22099 at commit [`e79e5b9`](https://github.com/apache/spark/commit/e79e5b9c0bbdf24dcc3cda30dc2c1a70d12b02aa).


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Retest this please.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    **[Test build #94728 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94728/testReport)** for PR 22099 at commit [`e79e5b9`](https://github.com/apache/spark/commit/e79e5b9c0bbdf24dcc3cda30dc2c1a70d12b02aa).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    To be clear you think this passed because it still uses jets3t and that still brings in BC? Then we can maybe merge this and rebase the other change to find out. This update won't have changed that situation with strong crypto being required right?


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    **[Test build #4246 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4246/testReport)** for PR 22099 at commit [`e79e5b9`](https://github.com/apache/spark/commit/e79e5b9c0bbdf24dcc3cda30dc2c1a70d12b02aa).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    > To be clear you think this passed because it still uses jets3t and that still brings in BC? 
    
    correct
    
    > Then we can maybe merge this and rebase the other change to find out. 
    
    correct
    
    > This update won't have changed that situation with strong crypto being required right?
    
    Don't know. What it did do was stop my local test runs without bouncy castle from failing with certificate validation errors.
    
    This patch is a good thing to do anyway, because it's good to stay somewhat current with the AWS releases (more chance of issues being addressed, reduced cost of future migrations). So it can be merged in and then the problem of getting #22081's test run to work addressed after. 
    
    
    I reopened #21146 & applied this patch to it, to see what Jenkins did there. The overall test runs come out as failing; it's hard to point to any related cause, but the Kinesis ones do all pass: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94769/testReport/org.apache.spark.streaming.kinesis/
    
    I'm going to close that one again to avoid confusion about which of the "remove jets3t" patches people should be looking at; once the kinesis update is merged in you'll need to retest your #22081 PR and let's see what Jenkins says there
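    
    On the question of whether BC is still being pulled in by another dependency, one way to check is to ask the JVM where a class was loaded from. The following is a rough sketch, not something from this patch: `ClassOrigin` is a hypothetical helper, and it is only meaningful when run with the same classpath as the tests. `org.bouncycastle.jce.provider.BouncyCastleProvider` is the provider class that bouncy castle ships.
    
    ```java
    import java.security.CodeSource;
    import java.util.Optional;
    
    /** Finds the jar or directory that supplies a class. Helper name is illustrative only. */
    final class ClassOrigin {
    
        /** Returns where the class was loaded from, or empty if it is not on the classpath. */
        static Optional<String> locate(String className) {
            try {
                // initialize=false: look the class up without running its static initializers
                Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
                CodeSource src = cls.getProtectionDomain().getCodeSource();
                // Classes from the JDK's bootstrap loader have no CodeSource
                return Optional.of(src == null ? "<bootstrap>" : String.valueOf(src.getLocation()));
            } catch (ClassNotFoundException | LinkageError e) {
                return Optional.empty();
            }
        }
    
        public static void main(String[] args) {
            String name = args.length > 0
                    ? args[0]
                    : "org.bouncycastle.jce.provider.BouncyCastleProvider";
            System.out.println(name + " -> " + locate(name).orElse("not on classpath"));
        }
    }
    ```
    
    If the printed location is a jar that only arrives via jets3t, that would support the explanation above; if nothing is found, BC is not what made the run pass.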


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    **[Test build #94728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94728/testReport)** for PR 22099 at commit [`e79e5b9`](https://github.com/apache/spark/commit/e79e5b9c0bbdf24dcc3cda30dc2c1a70d12b02aa).


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2195/
    Test PASSed.


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    As noted in #22146, stripping off bouncy castle and upgrading the SDK worked. But a local test run of just this patch brought up the same error seen in #22081:
    
    ```
    WithoutAggregationKinesisStreamSuite:
    - KinesisUtils API
    - RDD generation
    - basic operation
    - custom message handling *** FAILED ***
      The code passed to eventually never returned normally. Attempted 20 times over 2.092846262916667 minutes. Last failure message: collected.synchronized[Boolean](KinesisStreamTests.this.convertToEqualizer[scala.collection.mutable.HashSet[Int]](collected).===(modData.toSet[Int])(scalactic.this.Equality.default[scala.collection.mutable.HashSet[Int]])) was false
      Data received does not match data sent. (KinesisStreamSuite.scala:230)
    - Kinesis read with custom configurations
    - split and merge shards in a stream
    - failure recovery *** FAILED ***
      The code passed to eventually never returned normally. Attempted 105 times over 2.0055098129 minutes. Last failure message: isCheckpointPresent was true, but 0 was not greater than 10. (KinesisStreamSuite.scala:398)
    ```
    That wasn't a full clean build, so let's see what Jenkins says and do some more test runs tomorrow. It could just be that this is all exposing some flakiness in the test case. At the very least, some more details on the failure might be good.
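    
    For context on the "never returned normally. Attempted 20 times over ... minutes" messages: they come from ScalaTest's `eventually`, which re-runs a block until it passes or a timeout expires. The Java sketch below shows that polling pattern in its simplest form. It is not ScalaTest's implementation; `RetryUntil` and its `Result` type are hypothetical names. It illustrates why a slow or throttled endpoint shows up as a timeout with an attempt count rather than as a specific error.
    
    ```java
    import java.time.Duration;
    import java.util.function.BooleanSupplier;
    
    /** Polls a condition until it holds or a timeout expires. Names are illustrative only. */
    final class RetryUntil {
    
        /** Outcome of polling: whether the condition ever held, and how many attempts were made. */
        static final class Result {
            final boolean succeeded;
            final int attempts;
    
            Result(boolean succeeded, int attempts) {
                this.succeeded = succeeded;
                this.attempts = attempts;
            }
        }
    
        static Result eventually(Duration timeout, Duration interval, BooleanSupplier condition) {
            long start = System.nanoTime();
            int attempts = 0;
            while (true) {
                attempts++;
                if (condition.getAsBoolean()) {
                    return new Result(true, attempts);
                }
                if (System.nanoTime() - start >= timeout.toNanos()) {
                    // Give up: the caller only learns the attempt count, not the root cause
                    return new Result(false, attempts);
                }
                try {
                    Thread.sleep(interval.toMillis());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return new Result(false, attempts);
                }
            }
        }
    
        public static void main(String[] args) {
            // A condition that never holds, standing in for "data received matches data sent"
            Result r = eventually(Duration.ofMillis(200), Duration.ofMillis(20), () -> false);
            System.out.println("succeeded=" + r.succeeded + " after " + r.attempts + " attempts");
        }
    }
    ```
    
    Because the final message only carries the last failed assertion, logging what was actually received on each attempt is what would give the "more details on the failure" asked for above.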


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    @srowen @budde @ajfabbri


---



[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/producer &...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22099
  
    Merged to master


---
